Draws from the Posterior Do Not Vary

Hello, I am fitting a sequential IRT model with RStan but am running into problems with the MCMC results: the values of my posterior draws do not change across iterations (i.e., the trace plot is a flat line).

I have tried simulating datasets with more observations, changing adapt_delta to 0.99, and increasing the number of iterations, but none of these seems to help. I wonder what might be wrong in my setup or code.
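For reference, this is roughly how I call the sampler (the model file name and the data object are placeholders):

library(rstan)

# Settings I have tried so far: higher adapt_delta and more iterations.
fit <- stan(
  file    = "sequential_irt.stan",   # the model shown below
  data    = stan_data,               # list with I, J, N, P, ii, jj, y, R
  chains  = 4,
  iter    = 4000,                    # increased from the default 2000
  control = list(adapt_delta = 0.99)
)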

The model is:
\Pr(U_{ij} = 1) = (1 - F(\theta_j - b_{ij}^E))\cdot(1 - F(\theta_j - b_i^R))
\Pr(U_{ij} = 2) = (1 - F(\theta_j - b_{ij}^E))\cdot F(\theta_j - b_i^R)
\Pr(U_{ij} =3)= F(\theta_j - b_{ij}^E)
where b_{ij}^E = \beta^\top X_{ij} + c_i^E


The Stan code I wrote is:

data {
  int<lower=1> I;               // # questions
  int<lower=1> J;               // # subjects
  int<lower=1> N;               // # observations
  int<lower=1> P;               // # covariates
  int<lower=1, upper=I> ii[N];  // question list
  int<lower=1, upper=J> jj[N];  // subject list
  int<lower=1, upper=3> y[N];   // correctness
  matrix[N,P] R;                // covariates
}
parameters {
  vector<lower=0>[I] be;      // baseline b^E for each question
  vector<lower=0>[I] br;      // b^R for each question
  vector<lower=0>[J] theta;   // ability for each subject
  vector[P] delta;            // parameters of covariates
}
transformed parameters {
  vector[N] be_ij;
  vector[N] influence;
  influence = R * delta;
  // Question-level difficulty shifted by the covariate effect for each observation
  for (n in 1:N) {
    be_ij[n] = be[ii[n]] + influence[n];
  }
}
model {
  real pe;
  real pr;
  vector[3] cat_prob;
  be ~ normal(0,10);
  br ~ normal(0,10);
  theta ~ normal(0,1);
  delta ~ normal(0, 10);
  for (n in 1:N) {
    pe = Phi(theta[jj[n]] - be_ij[n]);   // F(theta_j - b_ij^E)
    pr = Phi(theta[jj[n]] - br[ii[n]]);  // F(theta_j - b_i^R)
    cat_prob[3] = pe;                    // Pr(U = 3)
    cat_prob[2] = (1 - pe) * pr;         // Pr(U = 2)
    cat_prob[1] = (1 - pe) * (1 - pr);   // Pr(U = 1)
    y[n] ~ categorical(cat_prob);
  }
}

I have been dealing with this issue for two weeks. Any help would be appreciated! Thank you very much!

  • Please run it with only one chain first; I do not think the draws are actually constant.
    Traces with very different scales should not be plotted in the same panel (see the R sketch after this list).

  • The model may be problematic: since \theta_j - b_i^R = (\theta_j + \alpha) - (b_i^R + \alpha) for any \alpha \in \mathbb{R}, the solution is not unique (the model is not identified), and I suspect this also causes the non-convergent MCMC sampling (a short derivation follows this list).

  • Should cat_prob be a simplex instead of a vector?
    The documentation page (the TeX document) says in one place that the parameter of categorical is a simplex, but in the last part it is declared as a vector, so I am confused.
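A minimal R sketch of the single-chain check from the first bullet (the file and object names are placeholders):

library(rstan)

# Fit with a single chain so that overlaid chains cannot hide small variation.
fit1 <- stan(file = "sequential_irt.stan", data = stan_data, chains = 1, iter = 2000)

# Draw each parameter's trace in its own panel instead of overlaying
# parameters with very different scales in one plot.
traceplot(fit1, pars = c("theta[1]", "be[1]", "br[1]", "delta[1]"))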

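To spell out the identification issue from the second bullet: shifting the ability and both difficulty locations by the same constant \alpha leaves every category probability unchanged, for example

\Pr(U_{ij} = 3) = F\big((\theta_j + \alpha) - (b_{ij}^E + \alpha)\big) = F(\theta_j - b_{ij}^E),

and similarly for categories 1 and 2, so the likelihood is flat along this direction and only the priors keep the posterior from drifting.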
I suggest the following .stan file, but I am not sure it is correct.
I changed four parts:

  1. I changed cat_prob from a vector to a simplex (one per observation) and moved it into the transformed parameters block.
  2. I also changed the last category so that the three probabilities sum to exactly 1:
    cat_prob[3] = pe;
    cat_prob[2] = (1 - pe) * pr;
    cat_prob[1] = 1 - pe - (1 - pe) * pr;
  3. I moved the declarations in the model block into the transformed parameters block.
  4. I also constrained pe and pr to lie in [0, 1] by declaring them with bounds:
    vector<lower=0, upper=1>[N] pe;
    vector<lower=0, upper=1>[N] pr;

When I was writing my own Stan files, I encountered this phenomenon quite often as well.
Anyway, I am not sure whether these changes remove the odd sampling behavior.

.stan file

data {
  int<lower=1> I;               // # questions
  int<lower=1> J;               // # subjects
  int<lower=1> N;               // # observations
  int<lower=1> P;               // # covariates
  int<lower=1, upper=I> ii[N];  // question list
  int<lower=1, upper=J> jj[N];  // subject list
  int<lower=1, upper=3> y[N];   // correctness
  matrix[N,P] R;                // covariates
}


parameters {
  vector<lower=0>[I] be;      // baseline b^E for each question
  vector<lower=0>[I] br;      // b^R for each question
  vector<lower=0>[J] theta;   // ability for each subject
  vector[P] delta;            // parameters of covariates
}

transformed parameters {
  vector[N] be_ij;
  vector[N] influence;
  vector<lower=0, upper=1>[N] pe;
  vector<lower=0, upper=1>[N] pr;
  simplex[3] cat_prob[N];     // one probability simplex per observation

  influence = R * delta;
  for (n in 1:N) {
    be_ij[n] = be[ii[n]] + influence[n];
  }

  for (n in 1:N) {
    pe[n] = Phi(theta[jj[n]] - be_ij[n]);
    pr[n] = Phi(theta[jj[n]] - br[ii[n]]);
    cat_prob[n][3] = pe[n];
    cat_prob[n][2] = (1 - pe[n]) * pr[n];
    cat_prob[n][1] = 1 - pe[n] - (1 - pe[n]) * pr[n];
  }
}

model {
  be ~ normal(0, 10);
  br ~ normal(0, 10);
  theta ~ normal(0, 1);
  delta ~ normal(0, 10);
  for (n in 1:N) {
    y[n] ~ categorical(cat_prob[n]);
  }
}

Thank you very much for the help!

You are absolutely correct! If I run only one chain, the posterior draws do vary, but the variation is rather small.

I definitely agree with you. To address this issue, I followed some standard practices: I set \theta_j \sim N(0,1) and constrained all parameters except delta to be positive. Do you think there are additional constraints I need to add?

Thank you so much for the modified script! I will try experimenting with the new one.

Thanks again!