QR decomposition error

I am testing some QR decomposition code and got a strange error before any sampling was done.

Here is my code:

data {
  int<lower=1> N; // Number of observations
  int<lower=1> K; // Number of predictors
  int<lower=0> y[N]; // Outcome
  matrix[N, K] X; // Predictor matrix
}

transformed data {
  matrix[N, K] Q_ast;
  matrix[K, K] R_ast;
  matrix[K, K] R_ast_inverse;
  // thin and scale the QR decomposition
  Q_ast = qr_Q(X)[, 1:K] * sqrt(N - 1);
  R_ast = qr_R(X)[1:K, ] / sqrt(N - 1);
  R_ast_inverse = inverse(R_ast);
}

parameters {
  real w0; // Intercept
  vector[K] theta; // Coefficients on Q_ast
}

transformed parameters {
  vector[K] beta = R_ast_inverse * theta;
}

model {
  vector[N] log_lambda;
  // priors
  // non-informative prior for the intercept
  w0 ~ normal(0, 5);
  // weakly informative prior for the other variables in the data
  beta ~ normal(0, 1);
  // likelihood
  for (n in 1:N) log_lambda[n] = w0 + Q_ast[n] * theta;
  y ~ poisson_log(log_lambda);
}

Here is my error

Warning (non-fatal):
Left-hand side of sampling statement (~) may contain a non-linear transform of a parameter or local variable.
If it does, you need to include a target += statement with the log absolute determinant of the Jacobian of the transform.
Left-hand-side of sampling statement:
beta ~ normal(…)

I initially tried the generated quantities version of the QR decomposition from the Stan manual. I got the same error, but it did not produce the warning shown above.

Is there something I am missing?


Looks like you ran out of RAM. It is better to use the qr() function in R to do a thin QR decomposition when there are many predictors.
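
For anyone following along, here is a rough sketch of what doing the thin QR in R could look like. The background is that Stan's qr_Q() returns the full N x N orthogonal matrix before the [, 1:K] slice is taken, whereas qr.Q() in R returns the thin N x K factor by default. The data here are a simulated stand-in and the name stan_data is hypothetical. It also assumes the Stan program is changed to declare Q_ast and R_ast_inverse in its data block instead of computing them in transformed data.

```r
# Simulated stand-in data (an assumption for illustration);
# replace with the real predictor matrix and outcome.
set.seed(1)
N <- 1000
K <- 20
X <- matrix(rnorm(N * K), nrow = N, ncol = K)
y <- rpois(N, lambda = 1)

# qr.Q() returns the thin N x K factor by default (complete = FALSE),
# so the N x N matrix is never formed.
qr_X <- qr(X)

# qr() may reorder near-collinear columns; this sketch assumes X has
# full column rank, in which case no reordering takes place.
stopifnot(qr_X$rank == K)

# Same scaling as in the Stan program above.
Q_ast <- qr.Q(qr_X) * sqrt(N - 1)
R_ast <- qr.R(qr_X) / sqrt(N - 1)
R_ast_inverse <- solve(R_ast)

# Hypothetical data list for a Stan program that reads Q_ast and
# R_ast_inverse as data.
stan_data <- list(N = N, K = K, y = y,
                  Q_ast = Q_ast, R_ast_inverse = R_ast_inverse)
```

If X is rank deficient, the rank check fails and the pivoting in qr_X$pivot would need to be handled before mapping theta back to beta.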

Interesting, thanks Ben! So bad_alloc has to do with RAM. I will look into the R function and preprocess the data before passing it to Stan. I have run several Stan models before and have never gotten this error. Is running out of RAM a common problem with the QR decomposition?

Also, I wouldn’t say I have a lot of predictors (~20), but I do have a lot of observations (~800,000). Would QR still be useful in this scenario?

Yes, with enough data.

Yes. Substantively, it doesn’t really matter how many observations you have, but the decomposition does use more RAM with more observations.
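
To put rough numbers on the RAM point, here is a back-of-the-envelope calculation in R using the approximate sizes mentioned in this thread. It assumes 8 bytes per double and counts only a single copy of the Q matrix, so treat the results as lower bounds rather than exact memory usage.

```r
N <- 800000  # approximate number of observations from the post
K <- 20      # approximate number of predictors from the post
bytes_per_double <- 8

# Full (fat) QR: Q is N x N before any columns are dropped.
fat_Q_bytes <- N * N * bytes_per_double

# Thin QR: Q is only N x K.
thin_Q_bytes <- N * K * bytes_per_double

fat_Q_bytes / 1e12  # terabytes: 5.12
thin_Q_bytes / 1e6  # megabytes: 128
```

So the full Q alone would need about 5 TB, while the thin Q fits in roughly 128 MB, which is consistent with bad_alloc appearing only for the full decomposition.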