# Having trouble fitting a simple varying intercept model

I’m trying to learn Stan and generated a simple synthetic data set. The data set is generated with the following R code:

``````library(dplyr)
library(rstan)
set.seed(12445)
n_cats <- 10
n <- 1000
alpha_mu <- 10
alpha_tau <- 2
sigma <- 0.2
df <- data.frame(cat = sample(1:n_cats, n, replace = TRUE))
alpha <- data.frame(cat = 1:n_cats,
alpha = rnorm(n_cats, alpha_mu, alpha_tau))

df1 <- left_join(df, alpha, by = "cat") %>%
mutate(y = alpha + rnorm(n, 0, sigma)) %>%
tbl_df()

data <- list(n_cats = n_cats,
n_samples = n,
cats = df\$cat,
y = df1\$y)
``````

Essentially, I’m generating a mean (alpha) for every category from a reference distribution. The y values are generated by sampling from a distribution with that mean and a constant variance. In other words, this is the data generating process:

y ~ N(alpha_i, sigma)
alpha_i ~ N(alpha_mu, alpha_tau)

Fitting this model in lmer, recovers the correct parameters.

``````library(lme4)
lmer(y ~ (1 | cat), df1) %>% coef()
``````

Output from this is:

``````   (Intercept)
1     9.504471
2    10.239418
3    13.213756
4     6.875253
5    13.899031
6     7.953558
7     9.806968
8     8.089552
9    10.960454
10    8.809423
``````

This is very close to the correct parameters.

I then tried to fit this using stan. The stan program is shown below:

``````data {
int<lower=1> n_cats;
int<lower=0> n_samples;
int<lower=1> cats[n_samples];
real y[n_samples];
}
parameters {
real<lower=9, upper=11> alpha_mu;
real<lower=1, upper=3> alpha_tau;
real alpha[n_cats];
real<lower=0, upper=1.0> sigma;
}
model {
for (i in 1:n_cats) {
alpha[i] ~ normal(alpha_mu, alpha_tau);
}

for (i in 1:n_samples) {
y ~ normal(alpha[cats[i]], sigma);
}
}
``````

But when I look at the coefficients, all the alpha_i are equal to the grand mean.

``````                 mean se_mean
alpha_mu         9.95    0.01
alpha_tau        1.14    0.00
alpha         9.96    0.00
alpha         9.96    0.00
alpha         9.96    0.00
alpha         9.96    0.00
alpha         9.96    0.00
alpha         9.96    0.00
alpha         9.96    0.00
alpha         9.96    0.00
alpha         9.96    0.00
alpha        9.96    0.00
sigma            1.00    0.00
lp__      -2368724.67    0.10
``````

What am I doing wrong here?

`y ~ normal(alpha[cats[i]], sigma);`

Should be:

`y[i] ~ normal(alpha[cats[i]], sigma);`

I think. See if that changes anything.

Doh! Of course. Yup that worked. Thanks!

1 Like

`y ~ normal(alpha[cats], sigma);` without the `for` loop should work as well (that’s the vectorized way to write the likelihood).

3 Likes