QR reparametrization in multilevel models


I am new to stan (and to bayesian statistics in general) and I am unsuccessfully trying to adapt the QR reparametrization as explained in
http://mc-stan.org/users/documentation/case-studies/qr_regression.html, and
in section 5.2 of the “Bayesian statistics using stan” manual, to the setting of a
simple multilevel model.

Assume a setting where we have:

  • N observations
  • a design matrix X of dimensions [N \times K], K being the number of predictors
  • a number J of groups (in my application, these are different postcodes)
  • A an integer vector of length N that for each observation encodes its group, one of 1:J.

Suppose we want to train a multilevel regression model using QR decomposition, so that each group in 1:J has its own regression coefficients vector \beta_j. My attempt to adapt the model from the aforementioned links looks as follows:

data {
  int N; //number observations
  int J; //nr of grups
  int K; //nr of predictors incl. intercept
  int<lower=1, upper=J> A[N]; //lookup y <-> J
  vector[N] y; //response
  matrix[N, K] X; //covariates

transformed data {
  //QR decomposition
  matrix[N, K] Q_ast_thin;
  matrix[K, K] R_ast_thin;
  matrix[K, K] R_ast_thin_inverse;

  Q_ast_thin = qr_Q(X)[, 1:K] * sqrt(N - 1);
  R_ast_thin = qr_R(X)[1:K, ] / sqrt(N - 1);
  R_ast_thin_inverse = inverse(R_ast_thin);

parameters {
  vector[K] trans_beta[J]; //coefficients on Q_ast_thin for each group; array of K-vects.
  real<lower=0> sigma_y; //sd for the normal on y

transformed parameters {
  vector[K] beta[J]; //betas on X

  for (n in 1:J)
    beta[n] = R_ast_thin_inverse * trans_beta[n];

model {
  for (n in 1:J)
    beta[n] ~ normal(0, 10);

  for (n in 1:N)
   y[n] ~ normal(Q_ast_thin[n] * trans_beta[A[n]], sigma_y);

So there. Now, in my mind this should work, but when I compile and sample from the model no chain converges and the Rhat are huge. I must then be doing something wrong in the above adaptation to the multilevel setting, but I can’t see what. Any ideas?


I think if you are letting the coefficients vary by group, then you should be doing group-specific QR decompositions.

Hi @bgoodri

I see! so you mean that each group 1…J should have its own matrices \{Q_j, R_j \mid j\in J\}?
That seems to make a lot of sense, I will try that and see if it works!


Hi Riccardo, I am about to do a model that would be very similar to yours. I was wondering if you managed to adapt the RQ decomposition and how you did it.

Hey Tommaso,

Were you able to figure this out? I am also working on something similar to adapt QR reparameterization for multilevel modeling.