# Mistake in the specificaiton of a simple linear hierarchical model

Dear Stan community,

I have a simple varying-intercepts model that I usually fit in R like:

`glmer(y ~ x1 + x2 + x3 + x4 + x5 + (1 | groups), data = database, family=gaussian)`

It is my understanding that a simple translation of that to Stan would be like:

``````data{
int<lower=1> N;
int<lower=1> J;
real y[N];
real x1[N];
int<lower=0,upper=1> x2[N];
real x3[N];
real x4[N];
int<lower=0,upper=1> x5[N];
int<lower=1> groups[N];
}

parameters{
real fixed_a;
real b_x1;
real b_x2;
real b_x3;
real b_x4;
real b_x5;
vector[J] varying_a;
real<lower=0> sigma;
}

model{
vector[N] mu;

sigma ~ cauchy( 0 , 2 );

fixed_a ~ normal( 0 , 10 );
varying_a ~ normal( 0 , 10 );
b_x1 ~ normal( 0 , 10 );
b_x2 ~ normal( 0 , 10 );
b_x3 ~ normal( 0 , 10 );
b_x4 ~ normal( 0 , 10 );
b_x5 ~ normal( 0 , 10 );

for ( i in 1:N ) {
mu[i] = fixed_a + varying_a[groups[i]] + b_x1 * x1[i] + b_x2 * x2[i] + b_x3 * x3[i] + b_x4 * x4[i] + b_x5 * x5[i];
}

y ~ normal(mu, sigma);
}
``````

However, to my surprise, the results of “b_x4” and “b_x5” are very different from the “glmer” estimation:

``````> summary(glmer(y ~ x1 + x2 + x3 + x4 + x5 + (1 | groups), data = stan.db, family=gaussian))
Linear mixed model fit by REML ['lmerMod']
Formula: y ~ x1 + x2 + x3 + x4 + x5 + (1 | groups)
Data: stan.db

REML criterion at convergence: -1158.2

Scaled residuals:
Min      1Q  Median      3Q     Max
-6.4678 -0.3357  0.0847  0.4586  2.6398

Random effects:
Groups   Name        Variance Std.Dev.
groups   (Intercept) 0.006730 0.08204
Residual             0.004145 0.06438
Number of obs: 513, groups:  groups, 72

Fixed effects:
Estimate Std. Error t value
(Intercept)  2.548721   0.033433   76.23
x1          -0.016822   0.003816   -4.41
x2           0.042977   0.016200    2.65
x3          -0.006540   0.003601   -1.82
x4          -0.115440   0.032494   -3.55
x5          -0.015309   0.027933   -0.55
``````

And:

``````results <- extract(model.stanfit)
> mean(results\$b_x4)
 0.1604067
> mean(results\$b_x5)
 -0.5242973
``````

Am I doing something wrong when specifying the Stan model? I send attached the data, a stan model file and a R script to reproduce all the above.

varying-intercepts.stan (792 Bytes)
script.r (1.6 KB)
data.csv (43.5 KB)

If you want to stay closer to the `glmer` specification you probably want

``````varying_a ~ normal(0, sigma_a);
sigma_a ~ cauchy(0, 2); // or whatever you see appropriate.
``````

The `brms` and `rstanarm` packages let you specify these kind of models in Stan with `mer` style formulas. This could make your life a bit easier.

1 Like