Hey folks!
I’m a Stan novice, so I wanted to check whether I had coded a multilevel model with an interaction term at level 1 correctly. I’m using cmdstanr. If any advice or resources to refer to would be very very welcome!
I’m running an experiment where I’ve recruited participants from three sources. The basic elements of the experiment are participants are randomly assigned to one of two conditions, complete a self-report measure of sexual attractiveness, and respond about how likely they are to engage in a certain behavior. I want to test the effects of experimental condition, sexual attractiveness, and their interaction on their behavioral intentions. Because of the three recruitment sources, I want to run a multilevel model (assuming some dependency in these variables attributable to the recruitment source).
I’ve prepared the data as a matrix (using this handy tutorial) where columns represent the intercept, binary experimental condition (either 0 or 1), a continuous self-report measure, and their interaction (so condition*the measure), and rows represent individual participants.
The model I’ve put together is this:
data {
int n; //number of observations
int s; //number of samples
int k; // number of predictors + intercept, so: 1 intercept and 3 slopes
array[n] int id; //participant id
matrix[n, k] x; // matrix of predictors
array[n] real y; //outcome vector
}
parameters {
matrix[k, s] zp; //matrix of intercepts and slopes
vector<lower = 0>[k] sigma_s; //variance of slopes and intercepts
vector[k] beta; //multivariate normal priors for sample intercepts and slopes
cholesky_factor_corr[k] lp; //correlation matrix of slopes and intercepts
real<lower = 0> sigma; //population sigma
}
transformed parameters{
matrix[k, s] z; //non-centered sample parameters. column 1 = sample intercepts, column 2-4 = sample slopes
z = diag_pre_multiply(sigma_s, lp)*zp;
}
model {
vector[n] mu; // mean
// priors
beta ~ normal(0, 1); //priors for intercepts and slopes
lp ~ lkj_corr_cholesky(2); //prior for their correlation
sigma_s ~ exponential(1); //sample variance
sigma ~ exponential(1); // population variance
to_vector(zp) ~ normal(0, 1); //sample parameters are drawn from a population distribution of intercepts and slopes
// likelihood
for(i in 1:n) {
mu[i] = beta[1] + z[1, id[i]] + //intercept
(beta[2] + z[2, id[i]])*x[i,2] + //effect of condition
(beta[3] + z[3, id[i]])*x[i,3] + //effect of desirability
(beta[4] + z[4, id[i]])*x[i,4]; //their interaction
}
y ~ normal(mu, sigma);
}
generated quantities {
matrix[k,k] omega; //the correlation matrix for intercepts and slopes
omega = multiply_lower_tri_self_transpose(lp);
}
I’ve run the model on simulated data, and it seems to mix fine. My biggest concern is how I’ve model the interaction, which is the substantive relationship of interest in this study.
Thanks heaps in advance!