I am having trouble recovering known parameters from simulated data using Stan. I am translating the model to Stan from a Gibbs sampler hand-rolled in R. The Gibbs sampler works but is slow. I am using the same weakly informative priors in Stan as in the Gibbs sampler.
A wikisurvey is a kind of survey in which each question is a pairwise comparison between two items. I use K for the number of items. There are K*(K-1) possible (ordered) comparisons, but each respondent answers only a small subset of them. I use J for the number of respondents.
The object of inference in this model is the K x J ‘opinion matrix’ theta. Each entry theta[k, j] represents respondent j’s opinion of item k. Phi(theta[1, j] - theta[2, j]) is the probability that respondent j prefers item 1 to item 2.
A second parameter, mu[K], allows for partial pooling of information about each item across respondents and gives the overall ranking of the items. The model is hierarchical: theta[k, j] ~ normal(mu[k], sigma).
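To make the data-generating process concrete, here is a rough R sketch of the model as described above (the constants and variable names are illustrative placeholders; the attached simulate_wikisurvey_unvectorized.R is the actual simulation script):

```r
set.seed(1)
K <- 10      # items
J <- 50      # respondents
V <- 1000    # responses
sigma <- 1   # fixed scale of theta around mu

mu <- c(0, rnorm(K - 1, 0, 2))   # item-level means, with mu[1] pinned to 0
# opinion matrix: theta[k, j] ~ normal(mu[k], sigma)
theta <- matrix(rnorm(K * J, mean = mu, sd = sigma), nrow = K, ncol = J)

jj <- sample(1:J, V, replace = TRUE)   # respondent for each response
ll <- integer(V)
rr <- integer(V)
for (v in 1:V) {
  pair <- sample(1:K, 2)   # two distinct items shown left and right
  ll[v] <- pair[1]
  rr[v] <- pair[2]
}
# respondent jj[v] chooses the left item with probability Phi(theta_left - theta_right)
p_left <- pnorm(theta[cbind(ll, jj)] - theta[cbind(rr, jj)])
y_vec <- rbinom(V, 1, p_left)
```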
I am aware of two reasons the model is difficult to fit:
- Most entries of theta are uninformed by the data, since respondents do not generally see every item.
- The mapping between theta and y seems difficult to sample from.
Below is my Stan code. An R file with code to run the model on simulated data is attached: simulate_wikisurvey_unvectorized.R (2.2 KB)
Running the model gives me maximum treedepth warnings. I tried raising max_treedepth to 20, but the warnings persisted, which I take to mean the problem is with the model rather than the sampler settings.
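For reference, raising the limit looks roughly like this with the rstan interface (the file name and data list name below are placeholders):

```r
library(rstan)

# stan_data holds J, V, K, ll, rr, jj, y_vec, p_tau_2_0, p_mu_0, and sigma
fit <- stan(
  file = "wikisurvey.stan",            # placeholder name for the model shown below
  data = stan_data,
  control = list(max_treedepth = 20)   # default is 10
)
```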
I have also tried a couple of variations of the model, including parameterizing sigma, sampling theta once for each response, and a complicated (possibly buggy) vectorization scheme that eliminated the loops in the model. These attempts were fruitless. The code below is the version that most closely mirrors the data-generating process.
Thank you for your help!
data {
  // number of respondents
  int<lower=1> J;
  // number of responses
  int<lower=1> V;
  // number of items
  int<lower=1> K;
  // the next four vectors describe each response
  // the item shown on the left
  int<lower=1, upper=K> ll[V];
  // the item shown on the right
  int<lower=1, upper=K> rr[V];
  // respondent
  int<lower=1, upper=J> jj[V];
  // y = 1 if the left item is chosen
  int<lower=0, upper=1> y_vec[V];
  // hyperprior for the variance of mu. set to 4
  real<lower=0> p_tau_2_0;
  // hyperprior for the mean of mu. set to 0
  real p_mu_0;
  // for now I treat sigma as fixed data. set to 1
  real<lower=0> sigma;
}
transformed data {
  vector[K] mu_0;
  vector[K] tau_2_0;
  for (k in 1:K) {
    mu_0[k] = p_mu_0;
    tau_2_0[k] = p_tau_2_0;
  }
  // pin an arbitrary mu (here mu[1]) to approximately 0 via a tiny prior scale
  tau_2_0[1] = 0.000001;
}
parameters {
  matrix[K, J] theta;
  vector[K] mu;
}
model {
  mu ~ normal(mu_0, tau_2_0);
  // each row of theta is drawn around the corresponding item mean mu[k]
  for (k in 1:K) {
    theta[k] ~ normal(mu[k], sigma);
  }
  // respondent jj[v] chooses the left item with probability
  // Phi(theta[ll[v], jj[v]] - theta[rr[v], jj[v]])
  for (v in 1:V) {
    y_vec[v] ~ bernoulli(Phi_approx(theta[ll[v], jj[v]] - theta[rr[v], jj[v]]));
  }
}