Random effects in brms with few observations per group - meta-analysis

Hi everyone,

In the literature, I see an increasing use of hierarchical models for a type of meta-analysis that fits a single model to the raw data from multiple studies. In such models, a random effect is often included to represent grouping structure in the data. For example, within one study, data might be collected at one or more sites. If the data were included in the model at the site level (i.e. one row per site), then a random effect could be added for study ID.

I am working on a similar type of model and testing whether or not to include study ID as a random effect. In my data, many studies have data for only 1 site (so there is only 1 row per study), while others have between 2 and 18 sites.

I am only really interested in the fixed effects in the model. I am not planning on making inferences about the random effects; I only want to account for the fact that data from several sites came from the same study, using the same method, team, etc.

Including the random effect term in the model results in many high Pareto-k values, but I have read that LOO may not be reliable for mixed-effects models.

Is including a random effect for study even appropriate given that many studies only include data for 1 site?

Many thanks in advance.

IMO, yes, random effects are appropriate even if only a minority of the studies include more than one site. As long as your diagnostics are good (Rhat, ESS, …) you’re good to go.


Appropriateness doesn’t depend on the number of observations per group.

The Pareto-k diagnostic is specific to the Pareto smoothed importance sampling (PSIS) approximation of LOO. PSIS-LOO is likely to give high Pareto-k values when leaving out an observation removes all the observations that inform a specific “random effect” parameter, which is exactly what happens in LOO for any group with just one observation. LOO itself can be useful for hierarchical models, although sometimes leave-one-group-out is preferred. You can reduce the number of high Pareto-k values by using quadrature to integrate over the group-specific parameter (the “random effect”). See the CV-FAQ entry “Can cross-validation be used for hierarchical / multilevel models?”.
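To make the idea of integrating over the group-specific parameter concrete, here is a minimal numerical sketch in plain Python (not brms code). It assumes a Gaussian observation model with a Gaussian group-level intercept, and all parameter values are hypothetical stand-ins for a single posterior draw. A simple trapezoidal rule stands in for whatever quadrature scheme one would actually use. For this particular model the integral also has a closed form, which is used only to check the numerical result.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and sd."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def integrated_density(y, mu, sigma, tau, n_points=2001, width=8.0):
    """Approximate the integral over u of N(y | mu + u, sigma) * N(u | 0, tau).

    Uses the trapezoidal rule on an evenly spaced grid covering
    +/- `width` group-level standard deviations.
    """
    lo, hi = -width * tau, width * tau
    step = (hi - lo) / (n_points - 1)
    total = 0.0
    for i in range(n_points):
        u = lo + i * step
        # Endpoints get half weight in the trapezoidal rule.
        weight = 0.5 if i in (0, n_points - 1) else 1.0
        total += weight * normal_pdf(y, mu + u, sigma) * normal_pdf(u, 0.0, tau)
    return total * step


# Hypothetical values for one posterior draw (assumptions for illustration):
mu = 0.3      # fixed-effect linear predictor for this row
sigma = 1.0   # residual standard deviation
tau = 0.5     # standard deviation of the study-level intercepts
y = 1.2       # the single observation from a one-site study

numeric = integrated_density(y, mu, sigma, tau)

# For this Gaussian-Gaussian case the marginal is N(y | mu, sqrt(sigma^2 + tau^2)).
closed_form = normal_pdf(y, mu, math.sqrt(sigma**2 + tau**2))

print(numeric, closed_form)
```

The integrated density no longer depends on the study's own intercept, which is the parameter that a single left-out observation would otherwise dominate. In a real analysis this quantity would presumably be computed for every posterior draw and every observation, and the resulting log-likelihood values passed to PSIS-LOO in place of the conditional ones.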
