Hello,
I am new to stan (and to bayesian statistics in general) and I am unsuccessfully trying to adapt the QR reparametrization as explained in
http://mc-stan.org/users/documentation/case-studies/qr_regression.html, and
in section 5.2 of the “Bayesian statistics using stan” manual, to the setting of a
simple multilevel model.
Assume a setting where we have:
- N observations
- a design matrix X of dimensions [N \times K], K being the number of predictors
- a number J of groups (in my application, these are different postcodes)
- A an integer vector of length N that for each observation encodes its group, one of 1:J.
Suppose we want to train a multilevel regression model using QR decomposition, so that each group in 1:J has its own regression coefficients vector \beta_j. My attempt to adapt the model from the aforementioned links looks as follows:
data {
int N; //number observations
int J; //nr of grups
int K; //nr of predictors incl. intercept
int<lower=1, upper=J> A[N]; //lookup y <-> J
vector[N] y; //response
matrix[N, K] X; //covariates
}
transformed data {
//QR decomposition
matrix[N, K] Q_ast_thin;
matrix[K, K] R_ast_thin;
matrix[K, K] R_ast_thin_inverse;
Q_ast_thin = qr_Q(X)[, 1:K] * sqrt(N - 1);
R_ast_thin = qr_R(X)[1:K, ] / sqrt(N - 1);
R_ast_thin_inverse = inverse(R_ast_thin);
}
parameters {
vector[K] trans_beta[J]; //coefficients on Q_ast_thin for each group; array of K-vects.
real<lower=0> sigma_y; //sd for the normal on y
}
transformed parameters {
vector[K] beta[J]; //betas on X
for (n in 1:J)
beta[n] = R_ast_thin_inverse * trans_beta[n];
}
model {
for (n in 1:J)
beta[n] ~ normal(0, 10);
for (n in 1:N)
y[n] ~ normal(Q_ast_thin[n] * trans_beta[A[n]], sigma_y);
}
So there. Now, in my mind this should work, but when I compile and sample from the model no chain converges and the Rhat are huge. I must then be doing something wrong in the above adaptation to the multilevel setting, but I can’t see what. Any ideas?
Thanks,
Riccardo.