Hello!
I am building a Stan model (using rstan) to fit ordinal data (ratings on a 1-8 scale) from an experiment with two testing sessions. The model includes the effects of two independent variables and their interaction. I used a hierarchical structure for each of the four model parameters (two IVs, the interaction, and the intercept): a parameter for each individual, plus a group-level distribution (mean and SD) for the whole group.
parameters {
  // Group-level parameter means
  real mu_mean_lex;
  real mu_mean_noise;
  real mu_mean_inter;
  real mu_mean_cept;
  // Group-level parameter SDs
  real<lower=0> mu_sd_lex;
  real<lower=0> mu_sd_noise;
  real<lower=0> mu_sd_inter;
  real<lower=0> mu_sd_cept;
  // Individual-level parameters
  vector[N] mu_i_lex;
  vector[N] mu_i_noise;
  vector[N] mu_i_inter;
  vector[N] mu_i_cept;
}
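In the model block (not shown in full here), each individual-level parameter is drawn from its group-level distribution in a centered parameterization, roughly like this (the hyperpriors on the group-level means and SDs are omitted):

```stan
// Centered hierarchical priors (sketch; hyperpriors omitted)
mu_i_lex   ~ normal(mu_mean_lex,   mu_sd_lex);
mu_i_noise ~ normal(mu_mean_noise, mu_sd_noise);
mu_i_inter ~ normal(mu_mean_inter, mu_sd_inter);
mu_i_cept  ~ normal(mu_mean_cept,  mu_sd_cept);
```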
The model is fit separately to the data X[i, n] from each of the two testing sessions, using the ordered_logistic() function:
model {
  ...
  // For each subject
  for (i in 1:N) {
    // For each trial
    for (n in 1:T_ctr) {
      row_vector[param_num] conds = [Cond[1,n], Cond[2,n], Cond[1,n] * Cond[2,n], 1];
      vector[param_num] params = [mu_i_lex[i], mu_i_noise[i], mu_i_inter[i], mu_i_cept[i]]';
      target += ordered_logistic_lpmf(X[i,n] | conds * params, c);
    }
  }
}
However, the posterior means of mu_i_inter showed only a low correlation across the two testing sessions, so I tried modeling the experiment with brms:
model.s1 <- brm(response ~ condition*ar + (condition*ar|participant), data = df_data1, family = cumulative(), iter = 5000, chains = 1)
Unlike in my Stan model, the brms-estimated interaction effect has a much higher correlation across the two sessions.
I used stancode(model.s1) to inspect the generated Stan code and found two main differences from my model.
The first is the implementation of the group-level effects. Instead of my hierarchical setup, the brms model combines population-level effects (Xc * b) and participant-level deviations in the linear predictor:
model {
  ...
  vector[N] mu = rep_vector(0.0, N);
  mu += Xc * b;
  for (n in 1:N) {
    // add more terms to the linear predictor
    mu[n] += r_1_1[J_1[n]] * Z_1_1[n] + r_1_2[J_1[n]] * Z_1_2[n]
             + r_1_3[J_1[n]] * Z_1_3[n] + r_1_4[J_1[n]] * Z_1_4[n];
  }
  for (n in 1:N) {
    target += ordered_logistic_lpmf(Y[n] | mu[n], Intercept);
  }
  ...
}
Here Xc is the centered design matrix of the experimental conditions and b the population-level (fixed-effect) coefficients; Z_1_1[n]-Z_1_4[n] are the participant-level predictor values, and r_1_1-r_1_4 the participant-level (random-effect) coefficients, indexed by the grouping variable J_1.
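If I read the generated transformed parameters block correctly, the r_1_* coefficients are built with a non-centered parameterization that also models the correlations among the four effects via the Cholesky factor L_1 — a sketch:

```stan
// Sketch of brms' non-centered, correlated participant-level effects:
// z_1 holds standard-normal draws, sd_1 the group-level SDs,
// L_1 the Cholesky factor of the correlation matrix.
matrix[N_1, M_1] r_1 = (diag_pre_multiply(sd_1, L_1) * z_1)';
vector[N_1] r_1_1 = r_1[, 1];
vector[N_1] r_1_2 = r_1[, 2];
vector[N_1] r_1_3 = r_1[, 3];
vector[N_1] r_1_4 = r_1[, 4];
```

This matches the std_normal_lpdf(to_vector(z_1)) line in the model block further below.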
Another difference is the lprior:
transformed parameters {
  real lprior = 0;  // prior contributions to the log posterior
  lprior += student_t_lpdf(Intercept | 3, 0, 2.5);
  lprior += student_t_lpdf(sd_1 | 3, 0, 2.5)
            - 4 * student_t_lccdf(0 | 3, 0, 2.5);
  lprior += lkj_corr_cholesky_lpdf(L_1 | 1);
}
model {
  if (!prior_only) {
    ...
    // model fitting using ordered_logistic()
  }
  target += lprior;
  target += std_normal_lpdf(to_vector(z_1));
}
So I am wondering: what is the function of lprior, and could the two implementation differences above explain the difference in the parameter estimates?