Hi all, I am trying to build a model for my research, which is

y_{i,t} = x_{i,t}\beta + v_i + \eta_t + e_{i,t}

where:

e_{i,t}|\sigma \sim N(0,\sigma);

y_{i,t} | \beta, v_i,\eta_t,\sigma \sim N(x_{i,t}\beta+v_i+\eta_t, \sigma);

v_i | \sigma_v \sim N(0, \sigma_v);

\eta_t | \phi,\ theta, \sigma_\eta \sim ARMA(1,1).

And following is my code for r stan:

```
data{
int<lower=0> T;
int<lower=0> N;
matrix[N,T] x;
matrix[N,T] y;
}
parameters{
real beta;
real<lower=-1, upper=1> phi;
real theta;
real<lower=0> sig2;
real<lower=0> sig2eta;
real<lower=0> sig2v;
vector[N] v;
vector[T] eta;
}
model{
vector[T] a; // error for time t
for(i in 1:N){
for(t in 1:T){
y[i,t] ~ normal(x[i,t]*beta+v[i]+eta[t], sqrt(sig2));
}
}
sig2 ~ uniform(0,5);
for(i in 1:N){ //v[i] ~ normal(0,sig2v), so for each state i, v changes
v[i] ~ normal(0,sqrt(sig2v));
}
sig2v ~ uniform(0,5);
//ARMA(1,1) part
a[1] = eta[1]; //this is ARMA(1,1) part and assume eta[0]=0, a[0]=0, mean coeff=0.??? what is eta[1]?
for (t in 2:T) {
a[t] = eta[t] - phi * eta[t-1] - theta * a[t-1]; //a[t]=eta[i,t]-eta.hat[i,t|t-1]
}
phi ~ uniform(-1,1);
sig2eta ~ uniform(0,5);
a ~ normal(0, sqrt(sig2eta));
}
```

I have a question about the ARMA(1,1) part, is a[1]=eta[1] correct? there are 116 divergent transitions and the largest R-hat is 1.6, Bulk Effective Samples Size (ESS) is too low. Any suggestions are welcome! Thanks.