Hello!
I’ve been using Stan for several years, but this is my first forum post. Thanks for the awesome tool.
I’m working with a fairly standard hierarchical linear regression problem. I have J groups, and am modeling each of those groups with its own intercept and variance. In addition, I am modeling the effect of group selection probability (in a standard PPS sample design) on each group mean. At the individual level there is the usual vector of P predictors.
To achieve this I came up with the following model, using a non-centered parameterization as suggested in this forum, the Stan User’s manual, and in @betanalpha’s case studies:
data {
int <lower = 0> N; // number of observations
int <lower=0> J; // number of groups
int <lower=0> R; // number of group predictors
matrix[J, R] X_group; // group predictor matrix
int <lower=0> P; // number of predictors
matrix[N, P] X_pop; // individual predictor matrix
int <lower=0> jj[N]; // group index j for observation n
real y[N]; // individual measurement
}
parameters {
real mu; // global mean
real<lower=0> alpha_sigma; // shape prior for sigma
real<lower=0> beta_sigma; // scale prior for sigma
real alpha_tilde[J]; // group alpha multiplier
real<lower=0> sigma[J]; // group population variance
vector[R] kappa; // group coefficients (in this case only group selection probability)
real<lower=0> tau; // between-group variance
vector[P] beta; // individual predictor coefficients
}
transformed parameters {
real alpha[J]; // group means
real mu_n[N]; // individual means
real sigma_n[N]; // individual variance
for (j in 1:J) {
alpha[j] = mu + tau * alpha_tilde[j] + X_group[j] * kappa;
}
for (n in 1:N) {
mu_n[n] = alpha[jj[n]] + X_pop[n] * beta;
sigma_n[n] = sigma[jj[n]];
}
}
model {
target += normal_lpdf(mu | 0, 0.2);
target += normal_lpdf(alpha_sigma | 0, 5);
target += normal_lpdf(beta_sigma | 0, 1);
target += inv_gamma_lpdf(sigma | alpha_sigma, beta_sigma);
target += normal_lpdf(tau | 0, 1);
target += normal_lpdf(alpha_tilde | 0, 1);
target += normal_lpdf(kappa | 0, 0.1);
target += normal_lpdf(beta | 0, 1);
target += normal_lpdf(y | mu_n, sigma_n);
}
generated quantities {
real y_new[N];
vector[N] log_lik;
for (n in 1:N) {
y_new[n] = normal_rng(mu_n[n], sigma_n[n]);
log_lik[n] = normal_lpdf(y[n] | mu_n[n], sigma_n[n]);
}
}
My basic problem is that the inference for kappa (the effect of selection probability on group mean) seems to track closely whatever I provide as the prior. This makes sense to me as I don’t actually expect the group selection probability to have a great deal of impact, but would like to include it considering it was part of the sample design (possibly this is a naive attitude - I am open to correction). In order to deal with this so far I have just made the prior on kappa fairly small (and could make it smaller still).
While I know that’s not cheating, exactly, I think that a better idea is to attempt to regularize the posterior for kappa by using a multivariate prior to model the correlation between the alpha_tilde (group-level multipliers) and kappa, but am having a difficult time figuring out how to do so with the non-centered parameterization.
Is it sufficient to add a 1 column vector to X_group and add something like the following:
transformed parameters {
real alpha[J];
real mu_n[N];
real sigma_n[N];
vector[2] gamma[J];
for (j in 1:J) {
gamma[j, 1] = tau * alpha_tilde[j];
gamma[j, 2] = kappa;
alpha[j] = X_group[j] * gamma;
}
for (n in 1:N) {
mu_n[n] = mu + alpha[jj[n]] + X_pop[n] * beta;
sigma_n[n] = sigma[jj[n]];
}
}
model {
gamma ~ multi_normal(mu_gamma, quad_form_diag(Omega, tau_gamma));
}
In reflection, though, it seems like this would essentially attempt to model the between-group variance twice, so maybe I should just abandon the decomposition of the group means if going to a multivariate parameterization? Or perhaps there is a better way to work this out? I am sure that I am just not fully understanding the section on multivariate reparameterization in the User’s Manual properly or how to apply it to my model, but possibly also the idea to regularize through a multivariate model on group effects is misguided entirely in this case.
Thanks for your help, and any additional feedback regarding possible model improvements are also appreciated.