Priors in Negative Binomial Multilevel with random intercept

Hi all!

I’m new to the forum and new to rstanarm and need some help. I have searched the web and forum, but most questions concern a normal distribution, rather than a negative binomial.

I have the following model:

stan_glmer(y ~ x1 + (1|g1) + (1|g2), family = neg_binomial_2(link = “log”), data = data)

My questions concern the priors for the coefficient and the two random intercepts.

  1. Why is there a prior on my coefficient, although it is not random?
    The output of prior_summary gives me: Coefficients ~ normal …, so I guess that the coefficient of x1 follows a normal distribution? Since the variance is quite low, I assume I can view this coefficient as “fixed”. Am I right?

  2. Where can I see what default priors are chosen for my random intercepts?
    I think that I can find them in the Covariance matrix, but I don’t know how to extract them for each intercept. I think that I miss some mathematical derivation wrt the negative binomial distribution, which would help me to understand everything better.

  3. Why is the dispersion parameter named “reciprocal_dispersion”? Is this related to the derivation from a Poisson distribution?

I would be happy, if you can share me some links to the documentation, where I can find answers or if you could explain it to me here!


Hey there! Welcome to the Stan Forum!

  1. The regression coefficient is random. That’s more or less the the whole point of Bayesian stats, or at least the (or one of the) main difference to frequentist statistics. In Bayesian inference parameters are random variables – otherwise priors would not make sense, right? Not sure why you think it would not be random? Maybe you can elaborate more about what you meant here?


No. I think rstanarm adjusts the default prior to make sense in light of the variation in the independent varaible (in your case x1) and will be weakly informative, i.e. fairly wide. First, note that with a NegBin model are working on the log-scale with your regression coefficients so you’d expect smaller effect sizes a priori. Second, also note that even with moderate amount of data the prior on the coefficients will not have a super dramatic effect (usually!) in such regression models – so you can’t really say that the regression coefficient is fixed, especially not, when default priors are applied.

  1. Yes, you can see it in the prior_summary output. They way rstanarm does this (by default) is by specifying classes (loosely speaking) to apply to the model. For example regression coefficients all get the same (not literally, you still get one prior for each coefficients, but they are the same) prior which are individually then adjusted for the dependent variables variation (see here, which applies to NegBin regression as well). Same thing with random intercepts. They all get the same default prior, usually something like
 ~ decov(reg. = 1, conc. = 1, shape = 1, scale = 1)

Here is a link that describes how these are constructed. Note, that this is by default the same in a Normal model as in any GLM (Poisson, Logit, NegBin, …). Note that in your specific case only the scale is important as you don’t have varying slopes for your groups.

  1. Check out the parameterization here, where \phi is the dispersion parameter. rstanarm let’s you define a prior for the reciprocal of the dispersion parameter, i.e. 1/\phi.

Hope this helps!

1 Like

Hey Max,

thanks for the quick answer and the welcome!

  1. Is it not possible to have random and not random (fixed) coefficients and intercepts? I am thinking of the classical school example. I read an explanation, where three models were proposed: varying intercept, varying coefficient and varying intercept-and-coefficient. That is why I assumed that it is possible in my case to choose varying intercepts and a fixed coefficient. And that is also the reason why I am confused about the prior, which appears in my code.
    Do you think that it is not possible in Stan to have this kind of model and all variables are random (-> varying intercepts-and-coefficient model)?

  2. I reread the instruction on how the priors for the group intercepts are constructed and this helped a lot!!
    I have just two question left. They are quite detailed. Skip them, if you don’t have the time for that… I’m already happy about your previous help!

2a Why is the scale only important in my case?
For my formula and with two distinctions in group g1 and three in g3, Zb would be this:
Zb = (1 1 1 1 1) (g11 g12 g21 g22 g23)^T
and the prior:
(g11 g12 g21 g22 g23) ~ Normal(0, Sigma),
where Sigma is the 5x5 matrix, containing the variances of and covariances between the intercepts. The scale only concerns the variances of the intercepts. Why can I ignore the covariances?

2b If my independent variable x is also varying by some group (e.g. g3), it would be actually a part of Z, right?

  1. Thanks, that link explained it completely!

Hey there!

Sorry, I think my writing was not very clear… Of course it is possible to have random and fixed effects in your model. However, I meant that the fixed effects (actually a pretty meh term, see for example Why I don’t use the term “fixed and random effects” | Statistical Modeling, Causal Inference, and Social Science) are random variables in a Bayesian regression (they have a prior distribution, which could also be very diffuse). So in your regression model the x1 is a fixed effect (or better: a completely pooled effect), but that does not mean its prior has small/zero variance. It means that it doesn’t vary by (any) group(s).

Yes. But I think it’s a bit hard to think about those models in terms of y = Xbeta + Zb + e… because Zb has a not so trivial structure.

Also, I guess you could do it like this:

but in this case \Sigma would not just be some covariance matrix. g11 g12 would have the same variance and g21 g22 g23 would have the same variance. Therefore you can not have correlations on just the intercepts within groups. You only have correlation structures when you add varying coefficients (and correlations will be between the coefficients/intercept, not groups).

I can really recommend going through Chapter 12 (especially 12.5) of Data Analysis Using Regression and Multilevel/Hierarchical Models by Gelman/Hill. They present a bunch of useful ways to think about hierarchical models. And with those formulations it’ll also become more clear what the priors do.

Sorry, I could not give you “the” solution. All this being said, the default priors of rstanarm are almost always reasonable. So your analysis is not “wrong”. Also, you might want to check out the brms package. It’s probably not as efficient in running your model, but more flexible and the whole prior thing is a bit more intuitive (I would say).