Hi all,

I am new to Stan, and could use some help replicating a hierarchical model that I have been estimating in Stata using the *mixed* command. The issue is that after trying to replicate the results in Stan, I seem to get some very different results on some variables in the regression, but not on others.

**What I am trying to do**:

I am trying to fit panel data. I have several observations per individual per year, and individuals over several years. I would furthermore like to explicitly model the sampling setup, so that I want to model a random intercept at the region level. This results in a 4-level hierarchy. I use a within-estimator à la Mundlak 1978 Econometrica for the time-changing effects. In addition, I want to account for characteristics that are time-invariant in the same regression. In Stata, I fit the model by using something like the following code:

`mixed y X Z || region: || respondent: || resp_year: `

,

where X is a matrix of time-demeaned variables and their means, and Z a matrix of time-invariant characteristics. The term resp_year consists of a sequential number by respondent and year, to reflect the hierarchical nesting of the observations.

**The problem I am having:**

I have troubles replicating this model in Stan. At first, I tried running it in R using lme4:lmer and lnme:lme. Here I also got different results from Stata. However, looking around the web and trying several things it turns out that these programmes give me results that are equivalent to those obtained running the Stata command above with the *relm* option, that is they seem to use an algorithm that resembles Stata’s restricted maximum likelihood. The results from my Stan model, however, are very different from both, with coefficients that are almost one order of magnitude smaller than in Stata, and much less also than in lme4. Ideally, I would like to replicate the Stata results using the full maximum likelihood estimator.

Interestingly, the results are almost identical for Stata, R, and Stan in terms of the time-varying part in *X*. It is only for the time-invariant part in *Z* that the differences emerge. This may be hinting at a mis-specification of the error/hierarchical structure, but I am at a loss what this may be.

**The Stan model I am using:**

Here is a minimal example of the Stan model:

```
model{
vector[N] mu;
sigma ~ cauchy( 0 , 0.25 );
sigma_region ~ cauchy( 0 , 1 );
sigma_respondents ~ cauchy( 0 , 1 );
sigma_years ~ cauchy( 0 , 1 );
omega ~ normal( 0 , sigma_region );
theta ~ normal( 0 , sigma_years );
rho ~ normal( 0 , sigma_respondents );
a ~ normal( 0 , 0.5 );
beta_tilde ~ normal( 0 , 0.25 );
mu = a + Q*beta_tilde + omega[region] + rho[respondent] + theta[resp_year];
y ~ normal( mu , sigma );
}
```

The matrix *Q* is the Q matrix from a QR decomposition of the design matrix, which lumps together both the time-varing and time invariant part above for convenience. The QR decomposition follows the manual:

```
transformed data{
real s = sqrt(N - 1.0);
matrix[M,M] R = qr_thin_R(X)/s;
matrix[N,M] Q = qr_thin_Q(X)*s;
matrix[M,M] R_inv = inverse(R);
}
```

Any ideas what is going wrong in my Stan model and how I could get it to imitate the results from my Stata model?

Max