Hi
I’m working with a dataset from Statistical Rethinking, the Trolley dataset. I’m following the book’s approach for this problem, but I’m using Stan directly instead of the “rethinking” package. I’m regressing an ordinal outcome on an ordinal predictor using the model below. “R” is an integer from 1 to 7 indicating how morally permissible the participant found a certain action, and “E” is an integer from 1 to 8 representing that participant’s education level.

Data prep

dat <- list(
  R = R,                 # response: moral permissibility, 1..7
  E = E,                 # predictor: education level, 1..8
  alpha = rep( 2 , 7 )   # Dirichlet prior concentration for delta
)

STAN model

data{
  int R[9930];
  int E[9930];
  vector[7] alpha;
}
parameters{
  ordered[6] kappa;
  real bE;
  simplex[7] delta;
}
transformed parameters{
  vector[9930] delta_sum;
  vector[8] delta_j = append_row(0, delta);
  vector[9930] phi;
  for ( i in 1:9930 ) {
    delta_sum[i] = sum(delta_j[1:E[i]]);
  }
  phi = bE * delta_sum;
}
model{
  delta ~ dirichlet( alpha );
  bE ~ normal( 0 , 1 );
  kappa ~ normal( 0 , 1.5 );
  R ~ ordered_logistic( phi , kappa );
}

The model runs well with no issues. However, I found that some delta_sum entries show NaN values in the summary. These NaN values all occur in cases where E = 1. Why is that? What am I doing wrong?

I have a second question. Suppose I have reason to believe the association between R and E might change depending on the participant’s religion, country, state, etc. How could/should I add varying effects for religion/country/state?
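For concreteness, here is a sketch of what I had in mind for, say, country: a non-centered varying slope on bE, assuming a hypothetical integer index cid (1..N_country) that I would add to the data list. Is something along these lines reasonable?

```stan
data{
  int R[9930];
  int E[9930];
  int cid[9930];           // hypothetical country index, 1..N_country
  int N_country;
  vector[7] alpha;
}
parameters{
  ordered[6] kappa;
  real bE;                       // population-average slope
  vector[N_country] z;           // standardized country deviations
  real<lower=0> sigma_country;   // scale of country deviations
  simplex[7] delta;
}
transformed parameters{
  vector[8] delta_j = append_row(0, delta);
  vector[N_country] bE_c = bE + z * sigma_country;  // non-centered slopes
  vector[9930] phi;
  for ( i in 1:9930 ) {
    phi[i] = bE_c[cid[i]] * sum(delta_j[1:E[i]]);
  }
}
model{
  delta ~ dirichlet( alpha );
  bE ~ normal( 0 , 1 );
  z ~ normal( 0 , 1 );
  sigma_country ~ exponential( 1 );
  kappa ~ normal( 0 , 1.5 );
  R ~ ordered_logistic( phi , kappa );
}
```

The non-centered form (z standardized, scaled by sigma_country) is what I’ve seen recommended for hierarchical models like this, but I’m not sure it’s the right structure here.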

Thanks

I’m attaching the data in case it helps: data.csv (38.8 KB)

Thanks. These are some of the NaN values I was referring to.

print(fit,pars=c("delta_sum[3961]","delta_sum[3962]"))
Inference for Stan model: model.
4 chains, each with iter=2000; warmup=1000; thin=1;
post-warmup draws per chain=1000, total post-warmup draws=4000.
                mean se_mean sd 2.5% 25% 50% 75% 97.5% n_eff Rhat
delta_sum[3961]    0     NaN  0    0   0   0   0     0   NaN  NaN
delta_sum[3962]    0     NaN  0    0   0   0   0     0   NaN  NaN
Samples were drawn using NUTS(diag_e) at Tue Jul 28 10:13:28 2020.
For each parameter, n_eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor on split chains (at
convergence, Rhat=1).

After looking at this some more, I think it makes sense that “delta_sum” is zero when E=1: since delta_j[1] = 0, sum(delta_j[1:1]) = 0. The NaN values are then just in the summary statistics (se_mean, n_eff, Rhat are undefined when every draw is exactly 0). Consistently, when E=8, delta_sum = sum(delta) = 1 because delta is a simplex:

Inference for Stan model: model.
4 chains, each with iter=2000; warmup=1000; thin=1;
post-warmup draws per chain=1000, total post-warmup draws=4000.
               mean se_mean sd 2.5% 25% 50% 75% 97.5% n_eff Rhat
delta_sum[601]    1       0  0    1   1   1   1     1    85    1
Samples were drawn using NUTS(diag_e) at Thu Jul 30 14:55:59 2020.
For each parameter, n_eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor on split chains (at
convergence, Rhat=1).

I also went back to “Statistical Rethinking” and noticed that McElreath wrote that “the first level (E=1) will be absorbed into the intercept” which seems to support the above conclusion.

I still have one more question, which is how to turn this into a hierarchical model (see first post).

Question to the moderators: should I open a separate question for this?