Good morning,
This is another post in my efforts to really understand how to specify hierarchical models in Stan. This post is related to linked to previous posts on hierarchical modeling and prior specification. (You can find the most recent discussion here: Options for Priors on Random Effects with Non-Centered Parameterizations. Previous versions of the code can be seen in this post: Cross-classified Hierarchical Model, Heterogenous Variances
The model code that I have works well. I am able to recover mean parameters, and the biggest challenge(s) I have are around the recovery of variances within the model - which is to be expected given the model structure.
Based on suggestions from great folks like @betanalpha and @bgoodri, I have attempted to specify hierarchical priors on the variances (actually the standard deviations) in my model. This has, however, lead to two less-than-deal outcomes that I think might be related and are related to how I have specified priors/hyperpriors:
- The code runs MUCH more slowly than when I specified the priors directly; and
- I am getting the following message when I run my model for several chains:
Rejecting initial value: Log probability evaluates to log(0), i.e. negative infinity. Stan can't start sampling from this initial value.
My code is below. The specification of the hyperpriors in the model statement (and the corresponding declaration/definition of those parameters) is the only newly added piece of code which has lead to the slowing down of my code and the initial evaluation problems.
It may very well be that I have specified the hyperpriors incorrectly and there is a simple way to improve performance. It also might be that the model is more difficult to estimate. That said, this group has been great, and any suggestions would be greatly appreciated.
data {
int<lower=1> N; // number of observations
int<lower=1> J; // number of students
int<lower=1> K; // number of items
int<lower=1> P; // number of families
int<lower=1> M; // number of item models
int<lower=1> Q; // number of templates
int<lower=1> X; // length of multten array
int<lower=1> Y; // length of threedig array
int<lower=1> A; // length of vv array
int<lower=1> B; // length of hh array
int peeps[N]; // student giving response n
int<lower=1,upper=K> item[N]; // item for response n
int<lower=0,upper=1> score[N]; // correctness for response n
int<lower=0,upper=J> pid[J]; // Person ID number
int<lower=0,upper=K> iid[K]; // Item ID number
int<lower=0,upper=P> fid[P]; // family ID number
int<lower=0,upper=M> mid[M]; // item model ID number
int<lower=0,upper=Q> tid[Q]; // template ID number
int<lower=1,upper=P> parent[K]; //indexes items to families
int<lower=1,upper=M> mm[P]; //indexes families to item models
int<lower=1,upper=Q> tt[M]; //indexes item models to templates
int multten[X]; // Array of indices for families - numbers are some multiple of ten
int threedig[Y]; // Array of indices for families - numbers are maximum of three digits
int vv[A]; //Array of indices for imodels - display format is verbal
int hh[B]; //Array of indices for imodels - display format is horizontal
}
parameters {
vector[J] uj;
real<lower=0> ub_si;
real<lower=0> ub_fr;
real<lower=0> ub_mr;
real<lower=0> ub_st;
real<lower=0> sigma_item;
real<lower=0> fam_resid;
real<lower=0> mod_resid;
real<lower=0> sigma_temp; //when level 4 is treated as random, large sample condition
vector[K] betai_offset;
vector[P] fammu_offset;
vector[M] modmu_offset;
vector[Q] tempmu_offset;
// vector[Q] template_mu; //when template means are treated as fixed, small sample (empirical) condition
real grand_mean; //when level 4 is treated as random, large sample condition
real disp_horiz;
real disp_verb;
real char_m10; //fixed effects of content characteristics
real char_d3; //fixed effects of content characteristics
}
transformed parameters{
vector[N] eta;
vector[K] betai;
vector[P] family_mu;
vector[M] model_mu;
vector[Q] template_mu;
// when the fourth level is treated as random
template_mu = grand_mean + tempmu_offset*sigma_temp;
// varying item family diffculty across item families within models
// decomposition of family means into a model mean and fixed effects
model_mu = template_mu[tt] + modmu_offset*mod_resid;
model_mu[vv] = model_mu[vv] + disp_verb;
model_mu[hh] = model_mu[hh] + disp_horiz;
// varying item family diffculty across item families within models
// decomposition of family means into a model mean and fixed effects
family_mu = model_mu[mm] + fammu_offset*fam_resid;
family_mu[multten] = family_mu[multten] + char_m10;
family_mu[threedig] = family_mu[threedig] + char_d3;
// item difficulties parameterized as random, with parent-specific means
betai = family_mu[parent] + betai_offset*sigma_item;
//log odds of a correct probability
eta = uj[peeps]-betai[item];
}
model {
//hyperpriors on standard deviations
ub_si ~ uniform(0,2.5);
ub_fr ~ uniform(0,2.5);
ub_mr ~ uniform(0,2.5);
ub_st ~ uniform(0,2.5); //when level 4 is treated as random
//prior on standard normal deviates for non-centered parameterization at levels 3, 4
betai_offset ~ normal(0,1);
fammu_offset ~ normal(0,1);
modmu_offset ~ normal(0,1);
tempmu_offset ~ normal(0,1);
//prior on random variable theta to scale the item difficulty parameters
uj ~ normal(0,1);
//priors on standard deviations
sigma_item ~ normal(0,ub_si);
fam_resid ~ normal(0,ub_fr);
mod_resid ~ normal(0,ub_mr);
sigma_temp ~ normal(0,ub_st); //when level 4 is treated as random
//likelihood function
score ~ bernoulli_logit(eta);
}