It depends which parameters you want to give the Cauchy prior to. If you have a stan_lmer() formula like y ~ x1 + x2 + ... + (1|group), then there will be coefficients on x1, x2, etc., as well as varying intercepts by level of group. If you want to put a cauchy(0,1) prior on each of the coefficients on the x variables, then your code is correct, but you should change autoscale=TRUE to autoscale=FALSE.

We used to also recommend the Cauchy but have started moving away from that due to the potential for pathological behavior in the tails. It’s not wrong to use the Cauchy, but if you want something with fatter tails than the normal distribution but better behaved than a Cauchy then a t-distribution lets you pick the degrees of freedom to make it more like the normal or more like the Cauchy (it’s identical to the normal with df=Inf and the Cauchy with df=1). The student_t() function in rstanarm lets you use that if you’re interested.
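For example, a t prior with a moderate degrees-of-freedom setting can be specified like this (a sketch; the df = 4 and scale = 2.5 values here are illustrative choices, not recommendations):

```r
library(rstanarm)

# A t prior partway between normal (df = Inf) and Cauchy (df = 1).
# df = 4 and scale = 2.5 are illustrative, not recommended defaults.
my_prior <- student_t(df = 4, location = 0, scale = 2.5, autoscale = FALSE)
```

You would then pass this via the prior argument of stan_lmer(), just like the Cauchy.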


What I was worried about was whether stan_lmer() “knew” to do something different for the continuous and categorical variables. And, in addition, whether it “knew” to do something different because the categorical variable was sum coded rather than dummy coded?

What rstanarm will do is let R compute the “model matrix” and then put the specified priors on whatever columns that matrix ends up having. So for example, if you had a formula like this:

# using an actual dataset mtcars that comes with R for demonstration
mpg ~ wt + as.factor(cyl) + (1|gear)

then the prior argument in stan_lmer() will be applied to the columns of

model.matrix(~ wt + as.factor(cyl), data = mtcars)

In this case that results in a column for wt (since wt is continuous), and the factor variable cyl gets split up into dummy variables.
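For the mtcars example above, you can print the column names of the model matrix to see exactly which terms will get a prior:

```r
# Inspect the columns the priors will be applied to
# (mtcars ships with R; cyl takes the values 4, 6, and 8)
mm <- model.matrix(~ wt + as.factor(cyl), data = mtcars)
colnames(mm)
# "(Intercept)" "wt" "as.factor(cyl)6" "as.factor(cyl)8"
```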

rstanarm will just take this design/model matrix (ignoring the intercept column, which gets a separate prior) and apply the specified prior to each of the columns. (You can also pass vectors to the arguments of cauchy(), in which case it will use a different location and scale for each column if you want.) So you can check exactly which variables end up getting the priors by looking at the model matrix before you run the model. In your case it may look different because of how you’ve coded the variables, but the same principle applies. Does that make sense?
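A hypothetical sketch of the vector version, using the mtcars formula above (the particular scales c(1, 2.5, 2.5) are made up for illustration; the order follows the non-intercept columns of the model matrix: wt, cyl6, cyl8):

```r
library(rstanarm)

# One location/scale per non-intercept column of the model matrix.
# These scale values are illustrative only.
fixed_prior <- cauchy(location = c(0, 0, 0),
                      scale    = c(1, 2.5, 2.5),
                      autoscale = FALSE)
```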

I haven’t used sum coding much myself. Is there something in particular it should be doing differently you’d expect it to do with the priors in that case?

Thanks very much! So, I gather from your answer that I can supply rstanarm with a prior, and it will make sure it is made equivalently appropriate for a continuous and categorical variable?


Yeah, essentially the coefficients for a sum coded variable end up being half the size of dummy coded ones, because they are computed relative to the grand mean. So that makes me suspect I should be halving the prior for sum coded variables, but I’m not sure. And I’m also not sure how that affects things for the continuous variables.

It’s not so much that rstanarm will make sure the prior is appropriate for continuous and categorical variables. Rather, rstanarm assumes that the specified prior is intended for the coefficients on the columns of the design matrix produced by model.matrix(), and model.matrix() can handle both kinds of predictors. So it’s really R itself, via model.matrix(), and not anything we do specifically in rstanarm, that makes it work for continuous and categorical predictors. But if I misunderstood what you meant then let me know and I can try explaining again.

As for the priors for sum coded variables, I’m not sure either, but I think the best way to figure it out yourself if you can’t find the answer is to do some simulations. For example, you can generate some simulated data and then fit it with rstanarm, so that you can compare what happens to the estimates when you use sum coding vs. when you don’t while using the same prior. Then you can also try adjusting the prior and seeing what effect that has on the estimates. If you’re using simulated data then you can try whatever you want without compromising your actual analysis (i.e., you don’t have to worry about issues related to choosing your prior based on your data).
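A quick sketch of the coding comparison itself, using lm() on simulated data just to show the halving (with a balanced two-level factor, the sum coded coefficient is exactly half the magnitude of the dummy coded one; the sign depends on which level gets +1):

```r
set.seed(123)
n <- 200
x <- factor(rep(c("a", "b"), each = n / 2))
y <- rnorm(n, mean = ifelse(x == "b", 1, 0))
sim <- data.frame(x = x, y = y)

# Dummy (treatment) coding: coefficient = full group difference
b_dummy <- coef(lm(y ~ x, data = sim))["xb"]

# Sum coding: coefficient = half the group difference (sign flips
# because level "a" is coded +1 under contr.sum)
contrasts(sim$x) <- contr.sum(2)
b_sum <- coef(lm(y ~ x, data = sim))["x1"]
```

You could then repeat the same comparison with stan_lmer() under different priors to see how the prior scale interacts with the coding.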

Thanks again for the help! After all that, would someone mind taking a look at my code? I’ve created sample data and code below.

What I intend to do is use a default prior on my coefficients, and then to compute Bayes Factors for those coefficients.

Rouder and Morey (2012) say: “When using the Cauchy prior, s describes the interquartile range of a priori plausible standardized slopes. We find that s = 1 is a good default, and it specifies that the interquartile range of standardized slopes is from -1 to 1.”

So this is the prior I’m trying to apply, to a continuous and a binary variable. And then I am trying to compute a Bayes Factor for each predictor. Am I doing this correctly? Thanks a million in advance.

# Load packages
library(rstanarm)
library(bayestestR)

# Create data
d <- data.frame(
  ppt  = as.factor(rep(1:10, 10)),
  item = as.factor(rep(1:10, each = 10)),
  iv1  = as.factor(rep(0:1, 500)),
  iv2  = rnorm(1000),
  dv   = rnorm(1000)
)

# Scale continuous predictor (as.numeric() drops the matrix
# attributes that scale() attaches to the result)
d$iv2 <- as.numeric(scale(d$iv2))

# Create prior
fixed_prior <- cauchy(0, 1, autoscale = FALSE)

# Run model
m <- stan_lmer(dv ~ iv1 + iv2 + (1 + iv2 | ppt) + (1 | item),
               data = d, prior = fixed_prior)

# Compute Bayes factors
bayesfactor_parameters(m)