Hi,

I am a little bit confused about how to standardize numerical variables and encode binary variables (non-symmetric) to put all coefficients on the same scale.

From Gelman_2007, I understood that for symmetric binary variables with 0/1 encoding (the default dummy encoding in R; mean 0.5, sd = 0.5), he suggests dividing the numerical variables by 2 sd so that the coefficients for both types of variables are comparable (the rescaled numerical variables then have sd = 0.5).

In Gelman_2008, he states:

• Binary inputs are shifted to have a mean of 0 and to differ by 1 in their lower and upper conditions. (For example, if a population is 10% African-American and 90% other, we would define the centered “African-American” variable to take on the values 0.9 and −0.1.)

• Other inputs are shifted to have a mean of 0 and scaled to have a standard deviation of 0.5. This scaling puts continuous variables on the same scale as symmetric binary inputs (which, taking on the values ±0.5, have standard deviation 0.5).

If we follow the first advice, a non-symmetric binary input ends up with values such as 0.9 and -0.1: mean 0 and a difference of 1, but sd != 0.5 (for a 10%/90% split, sd = sqrt(0.1 * 0.9) = 0.3). For a symmetric binary input, shifting the 0/1 encoding gives 0.5/-0.5, which differ by 1 and have mean 0 and sd = 0.5.

Following the second advice scales numeric variables to mean 0 and sd = 0.5, which matches the scale of the shifted binary inputs only if the latter are symmetric.
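To make the mismatch concrete, here is a small Python sketch of how I read the two rules. The function names (`center_binary`, `scale_continuous`) and the toy data are my own and not from either paper, and I assume the population standard deviation throughout.

```python
from statistics import fmean, pstdev

def center_binary(x):
    """Shift a 0/1 input to mean 0; the two levels still differ by 1."""
    m = fmean(x)
    return [v - m for v in x]

def scale_continuous(x):
    """Shift to mean 0 and divide by 2 sd, so the result has sd 0.5."""
    m, s = fmean(x), pstdev(x)
    return [(v - m) / (2 * s) for v in x]

# Hypothetical toy data: a 10%/90% binary input, a 50%/50% binary input,
# and an arbitrary continuous input.
skewed = [1] * 10 + [0] * 90
symmetric = [1] * 50 + [0] * 50
continuous = [float(i) for i in range(100)]

sd_skewed = pstdev(center_binary(skewed))              # sqrt(0.1 * 0.9) = 0.3
sd_symmetric = pstdev(center_binary(symmetric))        # 0.5
sd_continuous = pstdev(scale_continuous(continuous))   # 0.5 by construction

print(sd_skewed, sd_symmetric, sd_continuous)
```

So under this reading only the symmetric binary input lands on the same sd as the rescaled continuous input.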

If we also divide the shifted binary variables by 2 sd, they end up with mean 0 and sd = 0.5, but the difference between the two levels is generally no longer 1.
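A quick check of this in Python, again only a sketch: `scale_by_two_sd` is a hypothetical helper name, the 10%/90% data is made up, and I assume the population standard deviation.

```python
from statistics import fmean, pstdev

def scale_by_two_sd(x):
    """Center at the mean and divide by twice the (population) sd."""
    m, s = fmean(x), pstdev(x)
    return [(v - m) / (2 * s) for v in x]

# Hypothetical 10%/90% binary input, as in the quoted example.
skewed = [1] * 10 + [0] * 90
z = scale_by_two_sd(skewed)

sd_z = pstdev(z)          # 0.5 by construction
gap = max(z) - min(z)     # 1 / (2 * 0.3), roughly 1.67, not 1

print(sd_z, gap)
```

For a 0/1 input with proportion p of ones, the gap after this rescaling is 1 / (2 * sqrt(p * (1 - p))), which equals 1 only when p = 0.5.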

I have trouble understanding what the advice is for non-symmetric binary variables.

Any help is appreciated.