Syntax question regarding normal_id_glm()

Dear Stan community,

I have a very simple massive univariate linear regression model, for each individual in [1:N], it is just y \sim normal(x*\beta, \sigma), where y is a vector[M], x is a matrix[M,D], \beta is a vector[D], \sigma is just the scale. I can just make a for loop to fit N models, but it sounds a bit silly. Then I thought I may use normal_id_glm() to just build one model and fit once. The results should be identical. In this case, Y becomes a [M, N] matrix, x is same as before, \beta becomes a matrix[D, N], and \Sigma is a vector[N]. Note that since it is a massive univariate analysis, each individual is considered to be indenpend.

I wrote this single model as follows after trying different options regarding the way to define the matrix beta, it compiled, mod$check_syntax(pedantic = TRUE) returned no issue, and it fitted well when given data, but I don’t understand why it worked mainly due to the syntax. From what I understand, Y[:,n] is a vector[M], X is a matrix[M,D], then beta[n] should be a vector[D], however from what I read in Stan manual, if beta is a matrix, beta[n] should be a row_vector, which means it should take each row of beta matrix in the for loop, and I should have gotten error since D is the number of rows of beta matrix rather than N. Could you please take a look and explain why it worked? so I can understand the syntax better. For instance, I also tried to change parameter beta to matrix[D, N] beta;, I will get syntax errors, but just for learning purposes, imagine I have to use matrix[D, N] beta, how do I fix the syntax error?

Another question is I try to compare its speed (speed_{singlemodel}) to the case in which I just fit N univariate linear regression model (N*speed_{univariate}). They are quite similar. In the end, the only benefit of this single is that it is neat. I see somewhere we could use GPU to speed up normal_id_glm(), could you please post here a ref link where I can give it a try?

data {
  int<lower=0> M; // number of observations     
  int<lower=0> N; // number of individual   
  int<lower=0> D; // number of covariate
  matrix[M,N] Y; // data matrix 
  matrix[M,D] X; 
}

parameters {
  vector[D] beta[N]; // beta per individual per covariate
  vector<lower=0>[N] sigma; // sigme per individual
}

model {
  beta[N] ~ normal(0, 10);        
  sigma ~ normal(0, 10);       
  for (n in 1:N) {
    Y[:,n] ~ normal_id_glm(X, 0, beta[n], sigma[n]);
  }
}

Thanks a lot,
zcai

1 Like

I also found when increasing the number of individuals (i.e., N), speed_{singlemodel} becomes much slower than N*speed_{univariate}. In theory, they should be similar no matter the number N since each calculation in for loop (whether the loop is done inside or outside Stan model) is independent. This may imply that my model is not doing what it suppose to do, but what should be the right syntax?

I guess I figured out the reason, it is related to the definition of the matrix in Stan, vector[D] beta[N]; is a NxD matrix, not DxN. When accessing like beta[n] Stan will return a vector[D], not a row_vector[D], since in definition (vector[D] beta[N];) each beta[n] is a vector. Then everything is valid for normal_id_glm() synax (vector y | matrix x, real alpha, vector beta, real sigma).

Then if I force beta to use matrix definition matrix[N,D] beta;, the following code also works, the difference to the original code is that I have to take transpose for beta[n], since Stan returns a row_vector when accessing matrix m by m[n]. I can also declare beta as matrix[D,N] beta; and then I will need to replace beta[n]' in the for loop with beta[:,n]. Could you please confirm my understanding is right or not?

data {
  int<lower=0> M; // number of observations     
  int<lower=0> N; // number of individual   
  int<lower=0> D; // number of covariate
  matrix[M,N] Y; // data matrix 
  matrix[M,D] X; // design matrix
}

parameters {
  //vector[D] beta[N]; // beta per individual per covariate
  matrix[N,D] beta;
  vector<lower=0>[N] sigma; // sigme per individual
}

model {
  beta[:,D] ~ normal(0, 10);        
  sigma ~ normal(0, 10);       
  for (n in 1:N) {
    Y[:,n] ~ normal_id_glm(X, 0, beta[n]', sigma[n]);
  }
}

One thing I am not sure about is how to assign prior to beta. Are these three ways 1) beta[:,D] ~ normal(0, 10); , 2) beta[:,D] ~ normal(0, 10); and 3) beta[:,D] ~ normal(0, 10); equivalent? In my case, beta is independent among individuals and independent for each covariate.

I am worrying that the model above is not sampling for each column of Y independently as planned, then I could not find what’s wrong. should I use multi_normal with an identity correlation matrix to avoid for loop, so it may be faster?

Thanks