Hello,
I’m working on model comparison with K-fold CV, but I cannot figure out how to handle some varying intercepts. A little background: I’m modeling conflict with parents among depressed adolescents, with various forms of parent psychopathology as predictors. I also model the outcome variable via a 2pl IRT model. Below is the Stan code for the model without predictors, as set up for K-fold CV. As far as I can tell it’s working as it should. In simulation it recovers parameters reasonably well. I could share a simulated dataset if someone is curious.
As the unit of observation is the individual parent, I have included a varying intercept per adolescent. That means the size of the clusters is between 1 and 2. The problem I can’t figure out is whether or not to include the varying intercepts in the calculation of the pointwise log-likelihood of the held-out data (as I am currently doing). When one of these tiny clusters is held out from fitting the model, their varying intercept posterior is simply samples from the prior, after all. Wouldn’t that make the elpd unreasonably low, as the varying intercept without any data is injecting noise into the predictions?
As I’m writing this, I’m starting to think that the varying intercepts are only important for the fitting process, to handle the non-independent observations when estimating, and that checking model fit to the held-out data could be done without the varying intercepts. But I’m a clinical psychologist with pretty sparse formal matemathical/statistical training, so these are just my intuitions. While I’ve had a pretty steep learning curve lately, I don’t trust my intuition just yet.
Another thing I’ve asked myself is whether I should make the varying intercepts conditional on there being two parents observed, as well. But that doesn’t really change anything as far as the K-fold CV problem is concerned.
Erling
data{
int<lower=1> N;
int n_id;
int<lower=1> id[N];
int<lower=0, upper=1> mother[N];
int cbq_items;
// index vector for cases with no missing data on CBQ
int ind_obs_cbq[N];
int nobs_cbq;
// index vector for cases with single items missing on CBQ
int ind_mis_cbq[N];
int nmis_cbq;
// index vector for cases with no response to CBQ
int ind_no_cbq[N];
int nno_cbq;
// matrix of observed complete CBQ data
int<lower=0, upper=1> cbq_obs[nobs_cbq,cbq_items];
// matrix of observed CBQ data with missing items
int<lower=-1, upper=1> cbq_mis[nmis_cbq,cbq_items];
int kfold_n; //number of cases the model is fitted to in this fold
int kfold_ind[kfold_n]; //index of cases used for fitting model
int fold; //current fold
}
parameters{
real a;
vector[n_id] a_var;
real<lower=0> sigma;
vector[cbq_items] cbqbeta;
vector[cbq_items] cbqbetaM;
vector<lower=0>[cbq_items] cbqalpha;
vector[nobs_cbq] cbqtheta_obs;
vector[nmis_cbq] cbqtheta_mis;
vector[nno_cbq] cbqtheta_no;
real<lower=3> nu;
}
model{
vector[N] y_raw;
vector[N] var_a_raw = a_var[id];
vector[kfold_n] y;
vector[kfold_n] var_a;
for(i in 1:N){
if (ind_obs_cbq[i]>0)
y_raw[i] = cbqtheta_obs[ind_obs_cbq[i]];
else if (ind_mis_cbq[i]>0)
y_raw[i] = cbqtheta_mis[ind_mis_cbq[i]];
else
y_raw[i] = cbqtheta_no[ind_no_cbq[i]];}
y = y_raw[kfold_ind];
var_a = var_a_raw[kfold_ind];
// priors
a ~ normal(0,10);
a_var ~ normal(a,3);
sigma ~ cauchy(0,1);
cbqtheta_obs ~ normal(0,2);
cbqtheta_mis ~ normal(0,2);
cbqtheta_no ~ normal(0,2);
// allowing for item difficulty to vary by parent gender
cbqbeta ~ normal(0,2);
cbqbetaM ~ normal(cbqbeta, 1);
cbqalpha ~ gamma(2,0.5);
nu ~ gamma(2,0.1);
// irt model
for(i in 1:nobs_cbq){
for (j in 1:cbq_items){
cbq_obs[i,j] ~ bernoulli_logit(cbqalpha[j]*(cbqtheta_obs[i] - cbqbeta[j] + cbqbetaM[j] * mother[i]));
}}
for(i in 1:nmis_cbq){
for(j in 1:cbq_items){
if(cbq_mis[i,j]>-1){
cbq_mis[i,j] ~ bernoulli_logit(cbqalpha[j]*(cbqtheta_mis[i] - cbqbeta[j] + cbqbetaM[j] * mother[i]));
}}}
// likelihood
y ~ student_t(nu, a + var_a, sigma);
}
generated quantities{
vector[N] y;
vector[N] var_a = a_var[id];
vector[N] log_lik;
for(i in 1:N){
if (ind_obs_cbq[i]>0)
y[i] = cbqtheta_obs[ind_obs_cbq[i]];
else if (ind_mis_cbq[i]>0)
y[i] = cbqtheta_mis[ind_mis_cbq[i]];
else
y[i] = cbqtheta_no[ind_no_cbq[i]];}
for (n in 1:N) log_lik[n] = student_t_lpdf(y[n] | nu, a + var_a[n], sigma);
}