Equation for negative binomial model with random effects on the intercept

I would like to express a generic version of a negative binomial model as an equation. The model, fit in brms, is as follows:

modNB <- brm(
  y | weights(weightObs) ~ x1 + x2 + x3 + (1 | cluster),
  data   = dat,
  prior  = c(set_prior("normal(0.5, 1)", class = "b")),
  family = negbinomial()
)

My attempt at representing this as an equation is as follows:

$$
\begin{aligned}
y_{c} &\sim \text{NegBin}(r_{c}, \pi_{c}), \\
\text{logit}(\pi_{c}) &= \log\left(\frac{\pi_{c}}{1-\pi_{c}}\right) = \frac{1}{1+\exp(-x^{T}_{c}\beta)}, \\
\beta &\sim \text{Normal}(0.5, \alpha^{-1}_{cv}), \\
\alpha_{cv} &\sim \text{Gamma}(e_{0}, 1/f_{c0}), \\
f_{c0} &\sim \text{Normal}(\mu, \sigma^{2}), \\
r_{c} &\sim \text{Gamma}(g_{0}, 1/h), \\
h &\sim \text{Gamma}(i_{0}, 1/j_{0})
\end{aligned}
$$

My questions are then:

  • are the weights added to the vector of independent variables and their coefficients, as $w x^{T}_{c}\beta$ (and likewise $w y_{c}$)?
  • how do I represent random effects on the intercept? Above, I expressed it as an estimated rate parameter ($1/f_{c0}$) with a normal distribution with mean $\mu$ and variance $\sigma^{2}$.

The equation can be rendered using this website: https://www.latex4technics.com/

Edit by @Max_Mantei to properly display \LaTeX.



Sorry, I don’t really have time to answer your questions, but I wanted to let you know that you can actually use \LaTeX here: double dollar signs $$ ... (content on its own lines) ... $$ for display blocks, and single dollar signs $ ... $ for inline math.

One quick thought: I’m pretty sure that brms uses the alternative parameterization of the Negative Binomial (or its special Log and/or GLM versions, which you will also find in the Stan functions reference). It’s probably a lot easier to write the model down this way.
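For illustration (not from the thread, and with made-up numbers), the alternative parameterization $\text{NB2}(\mu, \phi)$ maps onto the classic $(r, p)$ parameterization with $r = \phi$ and $p = \phi/(\phi + \mu)$, giving mean $\mu$ and variance $\mu + \mu^2/\phi$ — a quick check with scipy:

```python
from scipy.stats import nbinom

mu, phi = 5.0, 2.0  # arbitrary mean and dispersion for illustration

# NB2(mu, phi) corresponds to scipy's nbinom with n = phi "successes"
# and success probability p = phi / (phi + mu)
dist = nbinom(n=phi, p=phi / (phi + mu))

mean = dist.mean()  # recovers mu
var = dist.var()    # recovers mu + mu**2 / phi
```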



Thanks, Max!

That looks much better. It’s good to know the magic of $$ for the future too.

That parameterisation is much more succinct. To be honest, I would struggle to generalise it into a regression equation though. I would prefer to keep something closer to my original equation to match other model equations that I’ve presented.



The weights are not used to manipulate $x^T\beta$. Instead, the log-pmf is multiplied by the weight ($w_i \log \text{NBpmf}(y_i \mid r_i, \pi_i)$) before it is added to the log posterior (target += in the Stan code).
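A minimal sketch of this (illustrative only, using scipy's $(r, p)$ parameterization and hypothetical data) — each observation's log-pmf is scaled by its weight before summing:

```python
import numpy as np
from scipy.stats import nbinom

y = np.array([0, 3, 7])        # hypothetical counts
w = np.array([1.0, 2.0, 0.5])  # observation weights
r, p = 2.0, 0.4                # shared NB parameters for simplicity

# each observation contributes w_i * log NBpmf(y_i | r, p)
weighted_loglik = np.sum(w * nbinom.logpmf(y, r, p))
```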

I am not sure I understand your question about the intercept.
Still, if one describes a non-hierarchical model with
y = \alpha+x^T\beta,
where \alpha is the intercept, then a random intercept model could be written as
y = \alpha_F + \alpha_c + x^T\beta_F,
where the subscripts ._F and ._c stand for fixed and random effect coefficients, respectively.


Thanks, Guido.

The weights make perfect sense. I’ll describe that in words rather than in math.

My question about the intercept stemmed from the third line of the equation ($\beta \sim$). I assumed that the $\alpha_{v}$ in the equation was the intercept, but it seems it’s a ‘precision parameter’ based on pages 3351 and 3352 of this paper. Having the distribution of the beta coefficients depend on the intercept doesn’t make sense - I should’ve picked that up earlier.
I think this revised equation makes sense, right?

$$
\begin{aligned}
y_{c} &\sim \text{NegBin}(r_{c}, \pi_{c}), \\
\text{logit}(\pi_{c}) &= \log\left(\frac{\pi_{c}}{1-\pi_{c}}\right) = \alpha + \frac{1}{1+\exp(-x^{T}_{c}\beta)}, \\
\beta &\sim \text{Normal}(0, \phi^{-1}_{v}), \\
\alpha_{v} &\sim \text{Gamma}(e_{0}, 1/f_{c0}), \\
r_{c} &\sim \text{Gamma}(g_{0}, 1/k), \\
k &\sim \text{Gamma}(l_{0}, 1/m_{0}), \\
\alpha &\sim \text{Normal}(\mu, \sigma^{2})
\end{aligned}
$$

Hi! I had a second look. Actually, I struggle to follow your equations intuitively, because I always work with the alternative parameterization.

I believe, it is actually much easier!

$$
\begin{aligned}
y &\sim \text{NegativeBinomial2}(\mu, \phi), \\
\log(\mu) &= \alpha + X\beta, \\
\alpha &= \alpha_0 + \alpha_c, \\
\alpha_0 &\sim \text{Student-t}(3, \ldots, 10) \hspace{1em} \text{...you have to check for the specific mean used by brms}, \\
\alpha_c &\sim \text{Normal}(0, \sigma_c), \\
\sigma_c &\sim \text{Student-t}_+(3, 0, 10), \\
\beta &\sim \text{Normal}(0.5, 1), \\
\phi &\sim \text{Gamma}(0.01, 0.01)
\end{aligned}
$$

This is ignoring the weights. Note that most of these are the default priors, which you can change.

You can check that this is pretty much exactly what brms is doing by calling make_stancode instead of brm:

model {
  // initialize linear predictor term
  vector[N] mu = Intercept + Xc * b;
  for (n in 1:N) {
    // add more terms to the linear predictor
    mu[n] += r_1_1[J_1[n]] * Z_1_1[n];
  }
  // priors including all constants
  target += normal_lpdf(b | 0.5, 1);
  target += student_t_lpdf(Intercept | 3, -2, 10);
  target += gamma_lpdf(shape | 0.01, 0.01);
  target += student_t_lpdf(sd_1 | 3, 0, 10)
    - 1 * student_t_lccdf(0 | 3, 0, 10);
  target += normal_lpdf(z_1[1] | 0, 1);
  // likelihood including all constants
  if (!prior_only) {
    for (n in 1:N) {
      target += weights[n] * neg_binomial_2_log_lpmf(Y[n] | mu[n], shape);
    }
  }
}

Hope this helps! :) If you want to stick with your parameterization, you should definitely take into account the default priors brms uses.



I am not sure that I follow your equation, but here is what I understand:

$y_c \sim NegBin(r_c, \pi_c)$ suggests to me that you are estimating a model with cluster-specific precision/over-dispersion parameters ($r_c$) and cluster-specific success probabilities ($\pi_c$). In my understanding, only the latter falls under what one describes as random effects. The brms model in your initial post has only random intercepts, but not cluster-specific precision/over-dispersion parameters (though this can be done in brms).

I think your equation that begins with logit is not correct. As @Max_Mantei points out above, brms uses a log link function. Intercepts and random effects need to go inside the log() part. The NegBin regression is just like a linear regression, except that the linear predictor ($l = \alpha + x^T\beta$) models the expected count on the log scale ($\log(\mu) = l$), and we estimate a precision/over-dispersion parameter instead of an error variance.
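As a quick illustration (made-up numbers, not from the thread), the log link means the expected count is the exponentiated linear predictor, so it is always positive:

```python
import numpy as np

alpha = 0.5                   # hypothetical intercept
beta = np.array([0.2, -0.1])  # hypothetical coefficients
x = np.array([1.0, 3.0])      # one observation's covariates

l = alpha + x @ beta          # linear predictor; log(mu) = l
mu = np.exp(l)                # expected count, always positive
```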

I am also not 100% sure about your priors for following reasons:

  • what are $\alpha_v$, $\phi$, $f_{c0}$, etc.?
  • brms does not use gamma priors, but normal priors on regression coefficients and Student-t priors on intercepts (not sure about the over-dispersion parameter, but I would guess it’s a normal prior plus a link function; edit: it’s the log link https://rdrr.io/cran/brms/man/brmsfamily.html).

That is much clearer!

Yes, I see what you mean about the subscripts on the over-dispersion parameter. The logit and those other parameters were from the reference I linked to earlier, which relates to their specific case.

Now I’m motivated to go through the Stan documentation in more detail. As I understand it z_1[1] is the \alpha_0 term. I’ll read, play with data and understand it better.



In the code @Max_Mantei posted, Intercept in vector[N] mu = Intercept + Xc * b is the fixed-effect intercept and r_1_1 are the random intercept coefficients (J_1 is an index variable indicating to which group each observation belongs; Z_1_1 contains only ones here, but this will be different for random-slope models).
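A small sketch (illustrative names and numbers, mirroring the generated Stan code above but using Python's 0-based indexing) of how the random intercept is added per observation:

```python
import numpy as np

Intercept = 0.5                       # fixed-effect intercept
b = np.array([0.2])                   # fixed-effect slope(s)
Xc = np.array([[1.0], [2.0], [3.0]])  # centered design matrix, N = 3
r = np.array([-0.3, 0.4])             # random intercepts, one per group
J = np.array([0, 1, 0])               # group index for each observation
Z = np.ones(3)                        # random-intercept "design": all ones

# mu = Intercept + Xc * b, then mu[n] += r[J[n]] * Z[n], vectorized:
mu = Intercept + Xc @ b + r[J] * Z
```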