BRMS and modelling means in grouped data - how do I adjust for differences in sample size?


The standard error of a mean (sigma/sqrt(n)) declines with the sample size, so with grouped data I would want to specify the relative precision of each observation. Is there a way of doing this in brms?

I know that I can specify weights, but this multiplies the likelihood contribution of each datapoint and I’m not sure that would be appropriate. At least in a frequentist OLS context, repeating each mean n times would give very misleading standard errors.
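To make the concern about repeated means concrete, here is a small base R sketch. The group sizes, the predictor values and the true parameters (intercept 1, slope 0.5, sigma 2) are all made up for illustration. It compares a weighted fit on the group means with an unweighted fit where each mean is repeated n times.

```r
set.seed(1)

# Hypothetical grouped data: one mean per group, with known group sizes
n_i <- c(5, 10, 20, 40, 80, 160)
x   <- c(-2, -1, 0, 1, 2, 3)

# With a true sigma of 2, the sd of each group mean is 2 / sqrt(n_i)
ybar  <- 1 + 0.5 * x + rnorm(length(n_i), sd = 2 / sqrt(n_i))
means <- data.frame(ybar = ybar, x = x, n = n_i)

# (a) Precision weights: each mean is weighted by its sample size
fit_w <- lm(ybar ~ x, data = means, weights = n)

# (b) Each mean repeated n times and treated as n independent rows
rep_dat <- means[rep(seq_len(nrow(means)), means$n), ]
fit_rep <- lm(ybar ~ x, data = rep_dat)

se_w   <- coef(summary(fit_w))["x", "Std. Error"]
se_rep <- coef(summary(fit_rep))["x", "Std. Error"]

# Same point estimates, but the repeated-rows fit reports a smaller SE
print(rbind(weighted = coef(fit_w), repeated = coef(fit_rep)))
print(c(se_weighted = se_w, se_repeated = se_rep))
```

Both fits give the same coefficients, but the repeated-rows fit behaves as if it had sum(n) observations rather than 6 group means, so its residual degrees of freedom are inflated and its standard errors shrink accordingly.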

I also considered the yi | se(sei) syntax on the left hand side of the formula, but this requires me to know sigma as well - while I only know the relative precision of the mean from each group.


Not sure if this helps, but here is how you can specify group specific variances in brms.

Let's assume you have the following variables: outcome y, predictor x, and grouping variable grp.

Then you should be able to write a model as follows:

brm(bf(y ~ x,
       sigma ~ grp),
    data = dat,  # dat: your data frame containing y, x and grp
    family = gaussian())

See also the first example under the heading “A simple distributional model” here.

In the example above, sigma works because this is the name brms uses for the residual standard deviation of the gaussian family. If you use a different family with another dispersion parameter, you can also estimate that in a group-specific manner, but you’ll of course have to choose the correct name (e.g. shape for a negative binomial model, or phi for a beta model).


Please use the “interfaces - brms” tag for brms questions; otherwise I will likely overlook them.

The suggestion of Guido is already pretty close. I would try something like

bform <- bf(y ~ ..., sigma ~ 1 + offset(-log(sqrt(n))))
brm(bform, ...)

This will model sigma on the log scale as

\log(\sigma_i) = \beta_0 -\log(\sqrt{n_i})

so that

\sigma_i = \exp(\beta_0 - \log(\sqrt{n_i})) = \frac{\exp(\beta_0)}{\exp(\log(\sqrt{n_i}))} = \frac{\exp(\beta_0)}{\sqrt{n_i}}
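To check that the offset behaves as the algebra above says, here is a minimal simulation sketch. Everything in it is assumed for illustration: the data frame dat, the group sizes, and the true values (intercept 1, slope 0.5, individual-level sigma_true = 2). It also assumes a working brms and Stan installation. If the model is doing what we want, exp(sigma_Intercept) should recover the standard deviation of the individual observations, not of the means.

```r
library(brms)

set.seed(123)

# Hypothetical grouped data: one row per group mean, n = group size
n_groups   <- 50
sigma_true <- 2
dat <- data.frame(
  x = rnorm(n_groups),
  n = sample(5:200, n_groups, replace = TRUE)
)

# Each observed mean has standard deviation sigma_true / sqrt(n)
dat$y <- 1 + 0.5 * dat$x + rnorm(n_groups, sd = sigma_true / sqrt(dat$n))

# log(sigma_i) = beta_0 - log(sqrt(n_i)), i.e. sigma_i = exp(beta_0) / sqrt(n_i)
bform <- bf(y ~ x, sigma ~ 1 + offset(-log(sqrt(n))))
fit   <- brm(bform, data = dat, family = gaussian())

# sigma_Intercept is beta_0 on the log scale; exponentiating the interval
# bounds gives a 95% interval for the individual-level sd
ci <- exp(fixef(fit)["sigma_Intercept", c("Q2.5", "Q97.5")])
print(ci)
```

With only 50 group means the interval will be fairly wide, but it should sit around sigma_true rather than around the much smaller sd of the means.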


Thanks for the replies - the offset solution is exactly what I was looking for!