Multi-level meta-analysis when only means, sds & N are available

Hi folks,

I’m sure this is going to be head-smackingly easy once I see the answer, but I’m having trouble thinking through how to write a generative model for this scenario: I have means, standard deviations, and Ns from measurements made across multiple experiments. If I had the raw data, I’d model this as a hierarchical (a.k.a. multi-level) model in the outcome ~ (1|experiment) style, possibly modelling experiment variances as samples from a meta-distribution (i.e. “partial pooling”) in addition to the more traditional modelling of experiment means as samples from a meta-distribution. However, I’m having trouble working out how to express the likelihood when I don’t have the raw data but instead have just the means, SDs, and Ns. Any tips?
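To make the generative story concrete, here’s a quick simulation sketch of the raw data I imagine sitting behind the summaries (Python/NumPy; all names and meta-distribution values are just placeholders of mine):

```python
import numpy as np

rng = np.random.default_rng(0)

n_experiments = 20
# meta-distribution of experiment means (the usual partial pooling)
mu_grand, tau_mu = 0.0, 1.0
# meta-distribution of experiment log-SDs (the extra pooling I'm after)
log_sigma_grand, tau_log_sigma = 0.0, 0.3

true_means = rng.normal(mu_grand, tau_mu, n_experiments)
true_sds = np.exp(rng.normal(log_sigma_grand, tau_log_sigma, n_experiments))
Ns = rng.integers(10, 100, n_experiments)

# all I actually get to see are these per-experiment summaries:
summaries = [
    (y.mean(), y.std(ddof=1), n)
    for m, s, n in zip(true_means, true_sds, Ns)
    for y in [rng.normal(m, s, n)]
]
```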


Hi Mike,

Have you seen Matti Vuorre’s example?


Thanks for the pointer! Here’s a simplified version of the Stan code that brms generates from that example:

data {
  int<lower=1> N;  // number of observations
  vector[N] Y;  // response variable
  vector<lower=0>[N] se;  // known sampling error
}
parameters {
  real Intercept;  // temporary intercept for centered predictors
  real<lower=0> sd_1;  // group-level standard deviation
  vector[N] z;  // standardized group-level effects
}
model {
  vector[N] mu = Intercept + sd_1 * z;
  // priors including all constants
  target += normal_lpdf(Intercept | 0, 1);  // modified from brms default; expects standardized data
  target += weibull_lpdf(sd_1 | 2, 3);  // modified from brms default; expects standardized data
  target += normal_lpdf(z | 0, 1);
  // likelihood including all constants
  target += normal_lpdf(Y | mu, se);
}

This seems to implement the assumption that the per-experiment variability (data variable se) is known (i.e. measured without any kind of error), which seems a little strong as an assumption, no? Any idea how I’d go about expressing the observed se values as samples from a latent distribution in the same way a given experiment’s mean is modeled as such?
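To convince myself this really is a strong assumption, here’s a quick NumPy check (my own sketch, not from the example) of how noisy a sample SD from K observations actually is:

```python
import numpy as np

rng = np.random.default_rng(1)
K, sigma = 10, 2.0

# sampling distribution of the sample SD for normal data
sample_sds = np.array(
    [rng.normal(0.0, sigma, K).std(ddof=1) for _ in range(10_000)]
)

# the observed SD scatters quite a bit around the true sigma;
# for normal data, sd(log s) is roughly 1 / sqrt(2 * (K - 1))
print(sample_sds.std())
print(np.log(sample_sds).std(), 1 / np.sqrt(2 * (K - 1)))
```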

Hm, I might have it:

data {
  int<lower=2> N;  // number of experiments
  array[N] int<lower=2> K;  // number of observations per experiment
  vector[N] obs_mean;  // observed mean from each experiment (priors assume that these have been standardized to a mean of 0 and sd of 1)
  vector<lower=0>[N] obs_sd;  // observed standard deviation in each experiment
}
parameters {
  // centrality parameters
  real centrality_intercept;
  real<lower=0> centrality_sd;
  vector[N] centrality_z;
  // variability parameters
  real variability_intercept;
  real<lower=0> variability_sd;
  vector[N] variability_z;
}
model {
  vector[N] true_experiment_mean = centrality_intercept + centrality_sd * centrality_z;
  vector[N] true_log_experiment_sd = variability_intercept + variability_sd * variability_z;
  // priors for centrality
  target += normal_lpdf(centrality_intercept | 0, 1);  // expects standardized data
  target += weibull_lpdf(centrality_sd | 2, 1);  // expects standardized data
  target += normal_lpdf(centrality_z | 0, 1);
  // priors for variability (these need tweaking!)
  target += normal_lpdf(variability_intercept | 0, 1);  // expects standardized data
  target += weibull_lpdf(variability_sd | 2, 1);  // expects standardized data
  target += normal_lpdf(variability_z | 0, 1);
  // likelihood
  target += normal_lpdf(log(obs_sd) | variability_intercept, variability_sd);
  target += normal_lpdf(obs_mean | true_experiment_mean, exp(true_log_experiment_sd) ./ sqrt(to_vector(K)));
}

So, in addition to modelling the mean of each experiment as sampled from a latent normal distribution, I also model the variability of each experiment as sampled from a latent log-normal distribution, and in the last line I account for different experiments’ sample sizes. I think I need to work out the priors on the variability parameters, but generally does this look right? (tagging @matti in case they have input)
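As a sanity check on that last likelihood line (my own quick simulation, separate from the model): the spread of an observed mean across replications should indeed be the true SD divided by sqrt(K):

```python
import numpy as np

rng = np.random.default_rng(2)
mu, sigma, K = 1.5, 2.0, 25

# simulate many replications of one experiment and look at the
# spread of the observed means
means = np.array([rng.normal(mu, sigma, K).mean() for _ in range(10_000)])
print(means.std(), sigma / np.sqrt(K))  # both close to 0.4
```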

Sorry I’m no use here. I have a vague feeling though that there was something in brms to allow relaxing that assumption.

We have a couple examples of this problem in chapter 5 of BDA (including the notorious 8 schools example)!

Does the chapter discuss the extension I mention above, whereby we do not assume that the observed standard error is the true standard error? As far as I understand, that assumption is made by the 8 schools Stan example in the getting-started guide (which is also equivalent, apart from the flat priors, to the model @matti’s post fits via brms and that I extracted/simplified above).

@paul.buerkner Any thoughts?

You mean basically modeling the uncertainty in the SEs as well? There have been a few attempts to do this in the literature (in a non-Bayesian way), with little or no improvement compared to other, more standard approaches. I would think one could implement such a model in Stan directly, but I wouldn’t expect it to change the results much.

Thanks for your input. I’ll try some simulation-based calibration (SBC) checks of my attempt above.
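For anyone following along, here’s the SBC idea in miniature (a toy conjugate normal–normal model in NumPy so the posterior is analytic; for the Stan model above the draws would come from MCMC instead):

```python
import numpy as np

rng = np.random.default_rng(3)

# toy model: mu ~ normal(0, 1); y[1:K] ~ normal(mu, 1)
K, n_sims, n_draws = 5, 1000, 99
ranks = []
for _ in range(n_sims):
    mu_true = rng.normal(0.0, 1.0)              # draw truth from the prior
    y = rng.normal(mu_true, 1.0, K)             # simulate data
    post_var = 1.0 / (1.0 + K)                  # conjugate normal-normal update
    post_mean = post_var * y.sum()
    draws = rng.normal(post_mean, np.sqrt(post_var), n_draws)
    ranks.append(int((draws < mu_true).sum()))  # rank of truth among draws

# if model + inference are self-consistent, the ranks are uniform on 0..n_draws
counts, _ = np.histogram(ranks, bins=10, range=(0, n_draws + 1))
print(counts)  # roughly flat, around 100 per bin
```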


That chapter does not cover that extension. I think it would be simple to do in Stan.