Assuming that you missed a plus sign between \beta_{2}x{2} and \beta_{3}x_{1}x{2} and assuming that the coding of the predictors x1 and x2 remains the same (+/- 0.5), the two parametrizations are mathematically the same, no? That is, the interpretation of \beta_{2} should also be the same (i.e. the main effect of x2).
So we have Version 1:
y_{i} \sim \text{Bernoulli}(\theta_{i}) \\
\text{logit}(\theta_{i}) = \alpha_{subject[i]} + \alpha_{item[i]} + \beta_{1} x1_{i} + \beta_{3} x1_{i} x2_{i} \\
\alpha_{item} ~\sim \text{Normal}(\alpha + 0.5 \beta_{2}, \sigma_{a}) \text{ for } i \text{ in } a_{1}, a_{2}, ..., a_{20} \\
\alpha_{item} ~\sim \text{Normal}(\alpha - 0.5 \beta_{2}, \sigma_{b}) \text{ for } i \text{ in } b_{1}, b_{2}, ..., b_{20} \\
And Version 2:
y_{i} \sim \text{Bernoulli}(\theta_{i}) \\
\text{logit}(\theta_{i}) = \alpha + \alpha_{subject[i]} + \alpha_{item[i]} + \beta_{1} x1_{i} + \beta_{2} x2_{i} + \beta_{3} x1_{i} x2_{i} \\
\alpha_{item} ~\sim \text{Normal}(0, \sigma_{a}) \text{ for } i \text{ in } a_{1}, a_{2}, ..., a_{20} \\
\alpha_{item} ~\sim \text{Normal}(0, \sigma_{b}) \text{ for } i \text{ in } b_{1}, b_{2}, ..., b_{20} \\
Running both of these gives essentially the same results. But actually it’s the first version that is both much faster and has higher numbers of effective parameters.
However, the estimates I get are not sensible and, apart from \sigma_{subject}, the uncertainty intervals are very wide. I’m running a simplified version of these models where I ignore x1.
Here’s the Stan code for (simplified) Version 1:
data{
int<lower=1> n_observation;
int<lower=1> n_subject;
int<lower=1> n_item;
int<lower=1, upper=n_subject> subject[n_observation];
int<lower=1, upper=n_item> item[n_observation];
int<lower=1, upper=2> x1_i[n_observation];
int<lower=0, upper=1> y[n_observation];
}
parameters{
real alpha;
vector[n_subject] alpha_subject;
real<lower=0> sigma_subject;
matrix[n_item, 2] alpha_item;
vector<lower=0>[2] sigma_item;
real beta_x1;
}
transformed parameters{
vector[n_observation] y_hat;
for (i in 1:n_observation)
y_hat[i] = alpha_subject[subject[i]] + alpha_item[item[i], x1_i[i]];
}
model{
y ~ bernoulli_logit(y_hat);
alpha ~ normal(1, 1);
alpha_subject ~ normal(0, sigma_subject);
sigma_subject ~ normal(0, 1);
alpha_item[, 1] ~ normal(alpha + 0.5*beta_x1, sigma_item[1]);
alpha_item[, 2] ~ normal(alpha - 0.5*beta_x1, sigma_item[2]);
sigma_item ~ normal(0, 3);
beta_x1 ~ normal(0, 1);
}
And for Version 2:
data{
int<lower=1> n_observation;
int<lower=1> n_subject;
int<lower=1> n_item;
int<lower=1, upper=n_subject> subject[n_observation];
int<lower=0, upper=n_item> item[n_observation];
real<lower=-0.5, upper=0.5> x1[n_observation];
int<lower=1, upper=2> x1_i[n_observation];
int<lower=0, upper=1> y[n_observation];
}
parameters{
real alpha;
vector[n_subject] alpha_subject;
real<lower=0> sigma_subject;
matrix[n_item, 2] alpha_item;
vector<lower=0>[2] sigma_item;
real beta_x1;
}
transformed parameters{
vector[n_observation] y_hat;
for (i in 1:n_observation)
y_hat[i] = alpha + alpha_subject[subject[i]] + alpha_item[item[i], x1_i[i]] +
beta_x1*x1[i];
}
model{
y ~ bernoulli_logit(y_hat);
alpha ~ normal(1, 1);
alpha_subject ~ normal(0, sigma_subject);
sigma_subject ~ normal(0, 1);
alpha_item[, 1] ~ normal(0, sigma_item[1]);
alpha_item[, 2] ~ normal(0, sigma_item[2]);
sigma_item ~ normal(0, 3);
beta_x1 ~ normal(0, 1);
}
And idea what the issue is?