Hi, I’m learning Stan using the McElreath Statistical Rethinking textbook and R package and I’ve run into some difficulty with which I’d be grateful to receive some help. I initially asked my question on the Rethinking message board and was directed here, as no one there could replicate the issue I was having.
I’m trying to run the following model from one of the lectures (2023 lecture 12, available on YouTube), and all the chains fail. Best I can tell, it has something to do with how the model is trying to estimate X & Y simultaneously.
Y model:
Y \sim \text{Bernoulli}(p)
\text{logit}(p) = \alpha_g + \beta_{XY}X + \beta_{ZY}Z_g + \beta_{UY} U_g
\alpha = \bar{\alpha} + Z * \tau
X model:
X \sim \text{Normal}(\mu, \sigma)
\mu = \alpha_X + \beta_{UX}U_g
U \sim \text{Normal}(0,1)
Priors:
Z_g \sim \text{Normal}(0, 1)
\alpha_X \sim \text{Normal}(0, 1)
\beta_{XY} \sim \text{Normal}(0, 1)
\beta_{UY} \sim \text{Normal}(0, 1)
\beta_{ZY} \sim \text{Normal}(0, 1)
\beta_{UX} \sim \text{Exponential}(1)
\bar{\alpha} \sim \text{Normal}(0, 1)
\tau \sim \text{Exponential}(1)
\sigma \sim \text{Exponential}(1)
The model uses the following simulated data. I think all the simulations can be run outside of the Rethinking package except rbern(), so I rewrote that.
library(rethinking)
N_groups <- 30
N_id <- 200
a0 <- (-2)
bZY <- (-0.5)
g <- sample(1:N_groups,size=N_id,replace=TRUE) # sample into groups
Ug <- rnorm(N_groups,1.5) # group confounds
X <- rnorm(N_id, Ug[g] ) # individual varying trait
Z <- rnorm(N_groups) # group varying trait (observed)
Y <- rbinom(N_id, size=1, p=plogis( a0 + X + Ug[g] + bZY*Z[g]))
#Y <- rbern(N_id, p=inv_logit( a0 + X + Ug[g] + bZY*Z[g] ) ) # Rethinking version
dat <- list(Y=Y,X=X,g=g,Ng=N_groups,Z=Z)
Here is the Stan code for the model:
functions{
real reducer(
array[] int Y,
int start , int end ,
vector Xbar,
int Ng,
vector Z,
vector a,
vector X,
array[] int g,
vector u,
vector z,
real bzy,
real buy,
real bxy,
real aX,
real bux,
real abar,
real tau,
real sigma ) {
vector[size(Y)] p;
vector[size(Y)] mu;
for ( i in 1:size(Y) ) {
mu[i] = aX + bux * u[g[start+i-1]];
}
return normal_lpdf( X | mu , sigma );
for ( i in 1:size(Y) ) {
p[i] = a[g[start+i-1]] + bxy * X[start+i-1] + bzy * Z[g[start+i-1]] + buy * u[g[start+i-1]];
p[i] = inv_logit(p[i]);
}
return bernoulli_lpmf( Y | p );
}
}
data{
vector[30] Xbar;
int Ng;
array[200] int Y;
vector[30] Z;
vector[200] X;
array[200] int g;
}
parameters{
vector[Ng] u;
vector[30] z;
real bzy;
real buy;
real bxy;
real aX;
real<lower=0> bux;
real abar;
real<lower=0> tau;
real<lower=0> sigma;
}
transformed parameters{
vector[Ng] a;
a = abar + z * tau;
}
model{
sigma ~ exponential( 1 );
tau ~ exponential( 1 );
abar ~ normal( 0 , 1 );
bux ~ exponential( 1 );
aX ~ normal( 0 , 1 );
bxy ~ normal( 0 , 1 );
buy ~ normal( 0 , 1 );
bzy ~ normal( 0 , 1 );
z ~ normal( 0 , 1 );
u ~ normal( 0 , 1 );
target += reduce_sum( reducer , Y , 1 ,
Xbar,
Ng,
Z,
a,
X,
g,
u,
z,
bzy,
buy,
bxy,
aX,
bux,
abar,
tau,
sigma );
}
For reference, the Rethinking code looks like this:
mru <- ulam(
alist(
# Y model
Y ~ bernoulli(p),
logit(p) <- a[g] + bxy*X + bzy*Z[g] + buy*u[g],
transpars> vector[Ng]:a <<- abar + z*tau,
# X model
X ~ normal(mu,sigma),
mu <- aX + bux*u[g],
vector[Ng]:u ~ normal(0,1),
# priors
z[g] ~ dnorm(0,1),
c(aX,bxy,buy,bzy) ~ dnorm(0,1),
bux ~ dexp(1),
abar ~ dnorm(0,1),
tau ~ dexp(1),
sigma ~ dexp(1)
) , data=dat , chains=4 , cores=14 , threads=4, sample=TRUE )
I’ve tried uninstalling & reinstalling cmdstan, cmdstanr, rethinking, and even as a last resort R, RStudio, and Rtools. Nothing helped. One last note: When I reinstalled cmdstanr, old Stan models that used to work stopped working. I had to ensure that my computer downgraded from C++20 to C++17.
Any help would be greatly appreciated!