Bad estimates (possible misspecification of multilevel model)


Hi, I’ve been trying to find an error in my model for a few days, and I’m finally at the point where I need some help from the community. I’m trying to fit a simple multilevel model with a single group-level predictor Z and a single individual-level predictor X. This is how I specify the model:

bayes_model = "
data {
  int<lower=1> J;                 // number of clusters
  int<lower=1> N;                 // number of individuals
  int<lower=1,upper=J> clust[N];  // cluster for each individual
  vector[N] y;                    // vector of responses
  vector[N] X;                    // vector of individual covariates
  vector[J] Z;                    // vector of cluster characteristics
}
parameters {
  real<lower=0> sigma_clust;      // cluster-level SD
  real<lower=0> sigma_ind;        // individual-level SD
  real gamma0;                    // group-level intercept
  real a[J];                      // random effects
  real gamma1;                    // group-level slope
  real beta;                      // individual slope
}
transformed parameters {
  vector[J] alpha;
  vector[N] y_hat;

  for (i in 1:J) {
    alpha[i] = a[i] + Z[i] * gamma1;
  }
  for (i in 1:N) {
    y_hat[i] = alpha[clust[i]] + X[i] * beta;
  }
}
model {
  // priors
  sigma_clust ~ inv_gamma(0.001, 0.001);
  sigma_ind ~ inv_gamma(0.001, 0.001);
  gamma0 ~ normal(0, 25);
  gamma1 ~ normal(0, 25);
  beta ~ normal(0, 25);

  a ~ normal(gamma0, sigma_clust);
  y ~ normal(y_hat, sigma_ind);
}
"
When I use simulated data with true parameter values gamma0 = 0.0, gamma1 = 0.1, beta = 0.2, I get completely wrong estimates for gamma0 (far from 0) and gamma1, while the lmer function in R is much more accurate. I’ve tried different settings for the number of iterations, maximum tree depth, and adapt_delta, with the same results. Is there some obvious mistake, or maybe a way to reparametrize the model? Thanks a lot.


It sounds like you’re comfortable with the lme syntax and are curious about how these sorts of models might get implemented in a Stan model?

I’m not an expert on this stuff. The language is kinda foreign to me, but you might try brms. It handles the lme-style models, and there are functions make_stancode and make_standata that you can use to see the actual Stan model it’s using. This might be a way to track down the differences between whatever models you’re fitting.

You can then compare optimizing vs. sampling to see the difference between the MLE and the sampled posterior.


We’ve been trying to discourage priors like inv_gamma(0.001, 0.001), as they can push estimates heavily out into the tails.
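For instance, the inverse-gamma priors on the two SDs could be swapped for weakly informative half-normal priors (the scale of 1 here is an arbitrary illustration; pick it to match the expected scale of your data):

```stan
// In the model block, replacing the inv_gamma priors.
// Because sigma_clust and sigma_ind are declared <lower=0>,
// these normal priors are effectively half-normals.
sigma_clust ~ normal(0, 1);
sigma_ind ~ normal(0, 1);
```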

lmer is giving you a max marginal likelihood answer, whereas Stan’s giving you a Bayesian posterior. With Bayes, you want to check posterior interval coverage, not the point estimate (though the point estimates from posterior means should be as good or better than those from lme4).

You really need the non-centered parameterization to make this fly. See the manual.
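A minimal sketch of a non-centered version of the model above: the random effects are declared as standardized deviates (called a_raw here, an illustrative name) and rescaled in transformed parameters, so the sampler doesn’t have to navigate the funnel between a and sigma_clust directly:

```stan
parameters {
  // ... sigma_clust, sigma_ind, gamma0, gamma1, beta as before ...
  vector[J] a_raw;    // standardized random effects
}
transformed parameters {
  // implies a ~ normal(gamma0, sigma_clust)
  vector[J] a = gamma0 + sigma_clust * a_raw;
  vector[J] alpha;
  vector[N] y_hat;
  for (i in 1:J) {
    alpha[i] = a[i] + Z[i] * gamma1;
  }
  for (i in 1:N) {
    y_hat[i] = alpha[clust[i]] + X[i] * beta;
  }
}
model {
  a_raw ~ normal(0, 1);   // replaces a ~ normal(gamma0, sigma_clust)
  y ~ normal(y_hat, sigma_ind);
  // priors on gamma0, gamma1, beta, and the sigmas as before
}
```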