How to set autoscale = FALSE for one parameter, while keeping autoscale = TRUE for all other parameters


I have been trying to run a Bayesian conditional logistic regression in my PhD project using the rstanarm package. Based on the literature, we have some prior knowledge about the exposure variable but not about the confounding variables. So we decided to specify an informative prior for the exposure and use the rstanarm default priors (weakly informative, in contrast to uniform priors) for the confounding variables. In the model, we therefore want autoscale = FALSE for the prior on the exposure coefficient, while keeping autoscale = TRUE for all other coefficients. The code I used to try to implement this is as follows (a linear regression model on the mtcars data as an example):

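library(rstanarm)

# MES and SD are placeholders for the informative prior location and scale
# of the first coefficient (wt)
prior <- normal(location = c(MES, 0, 0), scale = c(SD, 2.5, 2.5), autoscale = c(FALSE, TRUE, TRUE))
mtcars$mpg10 <- mtcars$mpg / 10
fit <- stan_glm(mpg10 ~ wt + cyl + am,
                prior = prior,
                data = mtcars,
                algorithm = "sampling")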

The output is as follows:

[image: screenshot of the output, not reproduced here]

Sadly, it appears that when the prior is specified this way, rstanarm does not autoscale the priors for the coefficients.

Does anyone have any experience with this? Any comments/suggestions would be highly appreciated!


After posting the question, I tried another way:

First, use stan_glm to obtain the default autoscaled priors from the data;
Second, extract the autoscaled prior scales and pass them to a new stan_glm model.

The code is as below:

fit_ini <- stan_glm(mpg10 ~ wt + cyl + am,
                    data = mtcars,
                    algorithm = "sampling")

prior_summary(fit_ini)

# the scales 0.84 and 3.02 for the last two parameters are the autoscaled
# values extracted from prior_summary(fit_ini)
prior <- normal(location = c(1, 0, 0), scale = c(2, 0.84, 3.02), autoscale = FALSE)
post <- stan_glm(mpg10 ~ wt + cyl + am,
                 data = mtcars,
                 prior = prior,
                 algorithm = "sampling")

prior_summary(post)

I am not sure if this makes sense! :(

I don’t know the answer for rstanarm, but @bgoodri or @Jonah should be able to sort you out.

Also, I changed the topic to interfaces:rstanarm in the hope of the right people seeing this question.

Thanks very much for your help!

Hi @bgoodri or @jonah, may I ask for your help with this question? Many thanks in advance!

@zhangguoqianggu did you figure out how to do this? I have the exact same scenario – where I’m needing to set a prior on my exposure variable, but I want to use the default for all other covariates in the model. Would be so helpful if you have any insight!

Unfortunately I don’t think there is a way to apply autoscale to only a single element right now. The way the defaults are computed is described in the rstanarm vignette “Prior Distributions for rstanarm Models”, so I guess you could set autoscale = FALSE but then set the prior SDs to the values you would have obtained if autoscale = TRUE.
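For the mtcars example earlier in the thread, that suggestion might look roughly like the sketch below. It assumes the Gaussian-outcome rule from the rstanarm priors vignette (prior SD = 2.5 * sd(y) / sd(x)); for non-Gaussian families such as logistic regression the documented rescaling differs, so treat this as an illustration rather than a general recipe. The names `auto_sd` and `prior_sd` are only illustrative.

```r
# Outcome and predictors from the question's example model
# mpg10 ~ wt + cyl + am (all predictors numeric here).
y <- mtcars$mpg / 10
x <- mtcars[, c("wt", "cyl", "am")]

# Scales that autoscaling would give for a Gaussian outcome,
# assuming the rule 2.5 * sd(y) / sd(x).
auto_sd <- 2.5 * sd(y) / sapply(x, sd)
round(auto_sd, 2)
#>   wt  cyl   am
#> 1.54 0.84 3.02

# Replace only the exposure's scale with the informative value
# (2 is the value used for wt earlier in the thread).
prior_sd <- auto_sd
prior_sd["wt"] <- 2

# prior_sd could then be passed with autoscaling switched off, e.g.
# normal(location = c(1, 0, 0), scale = prior_sd, autoscale = FALSE)
```

The values for cyl and am (0.84 and 3.02) match the ones read off prior_summary(fit_ini) above, which suggests that the two-step approach and the hand computation agree for this model.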

Thank you so much @Jonah !! Any chance you could help me with how to set the priors for specific variables? My code is below – I only want the prior=normal(0,0.6024096) to be on the exposure variable, but I can’t figure out how to specify that. I see how to set the intercept to the default, but not sure how to do that for all other covariates. Any help would be so appreciated!!

depression_stan2 <- stan_lmer(CESD_depressed.x ~ Treatment + CESD_depressed.y + f_name + (1 | Key),
                              data = endline_vars100, iter = 1000, chains = 4, cores = 1,
                              prior = normal(0, 0.6024096), prior_intercept = normal(autoscale = TRUE))

No problem. I’m not sure which variable in your model is the exposure and I’m not sure whether any of the variables in your formula are factor variables, so I can’t give you the exact code you need. But I’ll give you an example that you should be able to adapt pretty easily to fit your needs.

Suppose you have this data:

# the mtcars dataset comes with R
df <- mtcars[, c("mpg", "cyl", "disp", "wt", "gear")]
df$cyl <- as.factor(mtcars$cyl)
df$gear <- as.factor(mtcars$gear)
head(df)
                   mpg cyl disp    wt gear
Mazda RX4         21.0   6  160 2.620    4
Mazda RX4 Wag     21.0   6  160 2.875    4
Datsun 710        22.8   4  108 2.320    4
Hornet 4 Drive    21.4   6  258 3.215    3
Hornet Sportabout 18.7   8  360 3.440    3
Valiant           18.1   6  225 3.460    3

And suppose you want to fit this model

fit <- stan_lmer(mpg ~ wt + disp + cyl + (1|gear), data = df)

and you want to use the default prior for everything except for normal(0,0.6024096) for the coefficient on the variable called wt.

To derive the default prior that would be used for the other variables if autoscaling were turned on, we can use the formula in the “Prior Distributions for rstanarm Models” vignette, which says that (for a Gaussian outcome like this one) the prior SD is 2.5 * sd(y) / sd(x). There are various ways to compute this, but here’s a shortcut using model.matrix. This helps if you have factor variables, since model.matrix automatically converts them to columns of indicator variables, as rstanarm and other packages do internally:

# SD of each column of the design matrix (factors become indicator columns)
sd_x <- apply(model.matrix(~ wt + disp + cyl, data = df), 2, sd)
# remove the first element of sd_x because model.matrix includes
# a column for the intercept, but the prior for the intercept
# is specified separately
sd_x <- sd_x[-1]
sd_y <- sd(df$mpg)
prior_sd <- 2.5 * sd_y / sd_x
prior_sd["wt"] <- 0.6024096 # replace with the SD you want to use

Now you can specify prior = normal(0, prior_sd, autoscale = FALSE) when fitting the model:

fit <- stan_lmer(
  mpg ~ wt + disp + cyl + (1|gear), 
  data = df,
  prior = normal(0, prior_sd, autoscale = FALSE)
)

Now if I check prior_summary(fit), the part of the output for the regression coefficients says:

Coefficients
 ~ normal(location = [0,0,0,...], scale = [ 0.60, 0.12,35.87,...])

And you can see that for the first variable, which is wt, the prior SD (scale) is 0.60 (it’s really 0.6024096, but the printed value is rounded to 2 digits). The other SDs are the same as the ones you would get if you used the default prior.
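If you need this for several models, the steps above could be wrapped in a small helper. This is only a sketch: `autoscaled_prior_sd()` and its arguments are hypothetical names, not part of rstanarm, and it assumes the same Gaussian-outcome rule (2.5 * sd(y) / sd(x)) used above.

```r
# Hypothetical helper (not an rstanarm function): default-style prior SDs
# for a Gaussian outcome, with overrides given by design-matrix column name.
autoscaled_prior_sd <- function(rhs, data, y, override = numeric(0)) {
  X <- model.matrix(rhs, data = data)
  # drop the intercept column; its prior is specified separately
  X <- X[, colnames(X) != "(Intercept)", drop = FALSE]
  prior_sd <- 2.5 * sd(y) / apply(X, 2, sd)

  # fail loudly if an override names a column that does not exist
  unknown <- setdiff(names(override), colnames(X))
  if (length(unknown) > 0) {
    stop("No such column(s): ", paste(unknown, collapse = ", "))
  }
  prior_sd[names(override)] <- override
  prior_sd
}

df <- mtcars[, c("mpg", "cyl", "disp", "wt", "gear")]
df$cyl <- as.factor(df$cyl)

prior_sd <- autoscaled_prior_sd(~ wt + disp + cyl, df, df$mpg,
                                override = c(wt = 0.6024096))
round(prior_sd[c("wt", "disp", "cyl6")], 2)
#>    wt  disp  cyl6
#>  0.60  0.12 35.87
```

The result can be passed as normal(0, prior_sd, autoscale = FALSE), as in the stan_lmer call above. Matching overrides by column name rather than by position makes it harder to put the informative SD on the wrong coefficient, but the order of the terms in `rhs` still has to match the order of the fixed effects in the model formula.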

Hope that helps!