Help with multilevel model. Chains do not mix


#1

Hi there,

since my multilevel model does not converge properly in jags, I am trying to run it with Stan. However, although the model runs, the chains do not mix at all. The different chains barely vary at all and I am trying to figure out why.

The model is the following:

y_{i} \sim Poisson(\theta_{i}),\, \, \, \text{for}\, \, i = 1,...,n

\theta_{i} = e^{\alpha_{c[i]} + \beta_{c[i]}^{1}x_{i}^{1} + \beta^{2}x_{i}^{2} + \beta_{c[i]}^{3}x_{i}^{1}x_{i}^{2} + \epsilon_{i}},\, \, \, \text{for}\, \, i = 1,...,n,

\begin{pmatrix}\alpha_{c}\\\beta_{c}^{1}\\\beta_{c}^{3}\end{pmatrix} \sim N\begin{pmatrix}\begin{pmatrix}\mu_{c}^{\alpha}\\\mu_{c}^{\beta^{1}}\\\mu_{c}^{\beta^{3}}\end{pmatrix}, \begin{pmatrix}\sigma_{\alpha}^{2}&\rho_{12}\sigma_{\alpha}\sigma_{\beta^{1}}&\rho_{13}\sigma_{\alpha}\sigma_{\beta^{3}}\\\rho_{21}\sigma_{\alpha}\sigma_{\beta^{1}}&\sigma_{\beta^{1}}^{2}&\rho_{23}\sigma_{\beta^{1}}\sigma_{\beta^{3}}\\\rho_{31}\sigma_{\alpha}\sigma_{\beta^{3}}&\rho_{32}\sigma_{\beta^{1}}\sigma_{\beta^{3}}&\sigma_{\beta^{3}}^{2}\end{pmatrix}\end{pmatrix}, \, \, \, \mbox{for}\, \, c = 1,...,C,

\mu_{c}^{\alpha} = \gamma_{0}^{\alpha} + \gamma_{1}^{\alpha}x_{c}^{3}\,,\, \, \, \mbox{for}\, \, c = 1,...,C,

\begin{equation} \mu_{c}^{\beta^{1}} = \mu_{c}^{\beta^{3}} = \gamma_{0}^{\beta} + \gamma_{1}^{\beta}x_{c}^{3}\,,\, \, \, \mbox{for}\, \, c = 1,...,C. \end{equation}


The current stan code for the model is:

data {
  int N;
  int ncoa; // Number of groups
  // group vector
  int coa[N];
  
  // DV
  int<lower=0> y[N];
  
  // Individual level variables
 real x1[N];
 real x2[N];
  
  // group level variable
  real x3[ncoa];
  matrix[3,3] W;
  real df;
}
parameters {
  
// individual level parameters
  real beta2;

  real epsilon[N];
  real<lower=0> sigma_epsilon;
 real a0_raw;
 real gamma_a_raw;
	real b0_raw;
	real gamma_b_raw;
	real c0_raw;
	real gamma_c_raw;
 
	matrix[ncoa,3] B_raw;
	matrix[3,3] Tau_B_raw;
	real xi_a;
	real xi_b;
	real xi_c;
  
}
transformed parameters {

  real lp[N];
  vector[ncoa] beta3;
  vector[ncoa] beta0;
  vector[ncoa] beta1;

	// paramters at group stage
	matrix[ncoa,3] B_raw_hat;
	matrix[3,3] Sigma_B_raw;
	matrix[3,3] rho_B;
	real sigma_a;
	real sigma_b;
	real sigma_c;
	
	
	real a0;
	real gamma_a;
	real b0;
	real gamma_b;
	real c0;
	real _c;
  
// group linear predictors  
for(j in 1:ncoa){
	beta0[j] = xi_a * B_raw[j,1];
	beta1[j] = xi_b * B_raw[j,2];
	beta3[j] = xi_c * B_raw[j,3];
    B_raw_hat[j,1] = a0_raw + gamma_a_raw * x3[j];
    B_raw_hat[j,2] = b0_raw + gamma_b_raw * x3[j];
    B_raw_hat[j,3] = c0_raw + gamma_c_raw * x3[j];	
}
a0 = xi_a * a0_raw;
gamma_a = xi_a * gamma_a_raw;
b0 = xi_b * b0_raw;
gamma_b = xi_b * gamma_b_raw;
c0 = xi_c * c0_raw;
gamma_c = xi_c * gamma_c_raw;

Sigma_B_raw[,] = inverse(Tau_B_raw[,]);
for(k in 1:3){
  for(k_prime in 1:3){
    rho_B[k,k_prime] = Sigma_B_raw[k,k_prime]/
      sqrt(Sigma_B_raw[k,k]*Sigma_B_raw[k_prime,k_prime]);
  }
}
sigma_a = xi_a*sqrt(Sigma_B_raw[1,1]);
sigma_b = xi_b*sqrt(Sigma_B_raw[2,2]);
sigma_c = xi_c*sqrt(Sigma_B_raw[3,3]);


// Linear predictor
for(i in 1:N){
  lp[i] = exp(a[coa[i]] + 
  beta1[coa[i]] * x1[i] + 
  beta2 * x2[i] + 
  beta3[coa[i]] * x2[i] * x3[i] + 
  epsilon[i]);
}
}
model {

// priors
x2 ~ normal(0,100);

a0_raw ~ normal(0,100);
b0_raw ~ normal(0,100);
c0_raw ~ normal(0,100);
gamma_a_raw ~ normal(0,100);
gamma_b_raw ~ normal(0,100);
gamma_c_raw ~ normal(0,100);

xi_a ~ uniform(0,100);
xi_b ~ uniform(0,100);
xi_c ~ uniform(0,100);

sigma_epsilon ~ uniform(0,100);

for(p in 1:ncoa){
B_raw[p,] ~ multi_normal(B_raw_hat[p,], Tau_B_raw[,]);
}

Tau_B_raw[,] ~ wishart(df,W[,]);

y ~ poisson(lp);

for(z in 1:N){
epsilon[z] ~ normal(0, sigma_epsilon);
}
}

Initial values for Tau_b_raw are drawn from a wishart distribution. The matrix W is simply a 3 times 3 identity matrix and df declares that thereare 4 degres of freedom for the wishart distribution.

I am very new at Stan and I would be very happy if somebody could give me a hand. Thanks in advance for your guidance and your patience.


#2

What is epsilon?


#3

Epsilon accounts for overdispersion as described by Gelman and Hill (2007, p. 325). The model is an overdispersed poisson regression.