Vectorizing varying slopes models

Hello all,

I am trying to fit a varying slope model. I have 5 predictors.

Usually, when I am estimating a simple fixed effects model, I declare the independent variables as a matrix in the data block of Stan, matrix[N, K] X; where K is the number of predictors and N is the number of observations.

Then, in the parameters block, the coefficients to be estimated are declared as vector[K] beta;

Finally, in the model block, I usually set up the likelihood as:

for (i in 1:N) {
  Y[i] ~ student_t(nu, grand_mean + X[i] * beta, sigma);
}

However, I do not know how to put all the coefficients in vector form when estimating a varying slope model. I have quite a lot of predictors, so I really need it to be vectorized for efficiency.

Here is my code for the (not so efficient) varying slopes model.
If someone could just help me with a more efficient declaration of the model, I would be very grateful.
Thanks.

data {
  int<lower=0> N;                   // number of data items
  int<lower=0> J;                   // number of units
  int<lower=0> K;                   // number of predictors
  vector[N] Y;                      // dependent variable
  vector[N] X_IP;                   // independent variables
  vector[N] X_OP;
  vector[N] X_OTH;
  vector[N] X_CLIN;
  vector[N] X_OUT;

  int<lower=1,upper=J> dhb_id[N];   // unit ids
}

parameters {
  vector[5] beta;                   // grand slopes
  real grand_mean;
  real<lower=0> sigma;

  vector[J] DHB_specific_intercept;
  real<lower=0> sigma_DHB_specific_intercept;

  matrix[5, J] DHB_re;              // DHB-specific slope deviations
  vector<lower=0>[5] sigma_DHB_re;

  real<lower=0> nu;                 // degrees of freedom; must be positive
}
transformed parameters {

}

// The model to be estimated.
model {
  beta ~ normal(0, 5);
  grand_mean ~ normal(0, 5);

  sigma_DHB_specific_intercept ~ inv_gamma(2, 0.5);
  DHB_specific_intercept ~ normal(0, sigma_DHB_specific_intercept);

  for (k in 1:K) {
    sigma_DHB_re[k] ~ inv_gamma(2, 0.5);
    DHB_re[k] ~ normal(0, sigma_DHB_re[k]);
  }

  sigma ~ inv_gamma(2, 0.5);

  nu ~ gamma(2, 0.05);

  for (i in 1:N) {
    Y[i] ~ student_t(nu, grand_mean + DHB_specific_intercept[dhb_id[i]]
                     + (beta[1] + DHB_re[1, dhb_id[i]]) * X_IP[i]
                     + (beta[2] + DHB_re[2, dhb_id[i]]) * X_OP[i]
                     + (beta[3] + DHB_re[3, dhb_id[i]]) * X_OTH[i]
                     + (beta[4] + DHB_re[4, dhb_id[i]]) * X_CLIN[i]
                     + (beta[5] + DHB_re[5, dhb_id[i]]) * X_OUT[i],
                     sigma);
  }
}
generated quantities {

}

Just remove all the instances of [i] (and, since the slope terms are now length-N vectors, transpose them and switch from * to elementwise .* multiplication):


Y ~ student_t(nu, grand_mean + DHB_specific_intercept[dhb_id]
              + (beta[1] + DHB_re[1, dhb_id]') .* X_IP
              + (beta[2] + DHB_re[2, dhb_id]') .* X_OP
              + (beta[3] + DHB_re[3, dhb_id]') .* X_OTH
              + (beta[4] + DHB_re[4, dhb_id]') .* X_CLIN
              + (beta[5] + DHB_re[5, dhb_id]') .* X_OUT,
              sigma);

Hi Mike, thanks.

I was wondering if, instead of writing out beta[1], beta[2], …, beta[5] and DHB_re[1,dhb_id], …, DHB_re[5,dhb_id],

I could just do something like beta[k] and DHB_re[k,dhb_id]. I have 24 predictors, so writing beta[1], beta[2], …, beta[24] and DHB_re[1,dhb_id], …, DHB_re[24,dhb_id] would be very long and inefficient.

Any suggestions?
antony

Ah, yes, you can just pass in the predictors as a single matrix, then use the rows_dot_product() function to compute the expectation for each observation.
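
For concreteness, here is a minimal sketch of that approach, assuming you pass the predictors in as a single matrix[N, K] X and reshape DHB_re to matrix[J, K] (rows = DHBs, columns = predictors); the other names are carried over from your model and the priors are unchanged:

transformed parameters {
  // per-DHB coefficients = grand slopes + DHB-specific deviations
  matrix[J, K] beta_dhb = rep_matrix(beta', J) + DHB_re;
}
model {
  // ... priors as in your model ...
  Y ~ student_t(nu, grand_mean + DHB_specific_intercept[dhb_id]
                + rows_dot_product(X, beta_dhb[dhb_id]), sigma);
}

Here beta_dhb[dhb_id] picks out the row of coefficients for each observation's DHB, and rows_dot_product() dots each of those rows with the corresponding row of X, so all K slope terms collapse into one call.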

The “hwg” model at this repo has what I think is the same fundamental structure as your model (you can easily switch its normal observation-level structure to student_t), but it has some tweaks that will speed up inference when there is high redundancy in the predictor matrix.

The fact that you have 24 predictors leads me to infer that many of your predictors are observational (vs experimentally manipulated) variables, which usually means that your predictor matrix is likely to have a decent amount of correlation among the predictors, in which case you'd probably want to QR transform the predictor matrix. But that's just an operation to add to the transformed data block, and it is compatible with the efficiency tricks of the model I linked above.
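
If you do go the QR route, a minimal sketch of the extra transformed data block, following the thin-QR reparameterisation from the Stan user's guide (you would then model Q in place of X in the likelihood and map the fitted coefficients back to the original scale with R_inv):

transformed data {
  // thin QR decomposition of the predictor matrix
  matrix[N, K] Q = qr_thin_Q(X) * sqrt(N - 1);
  matrix[K, K] R = qr_thin_R(X) / sqrt(N - 1);
  matrix[K, K] R_inv = inverse(R);
}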


Hi Mike, thank you.

I have recoded it as per your advice.
Does this code look okay to you?

data {
  int<lower=0> N;                   // number of data items
  int<lower=0> K;                   // number of predictors
  int<lower=0> J;                   // number of groups
  int<lower=1,upper=J> dhb_id[N];
  matrix[N, K] X;
  vector[N] Y;
}

parameters {
  real alpha;

  matrix[J, K] beta;                // group-specific coefficients

  real<lower=0> sigma;

  real<lower=0> nu;                 // degrees of freedom; must be positive
}
transformed parameters {

}

model {
  to_vector(beta) ~ normal(0, 5);

  alpha ~ normal(0, 5);

  sigma ~ inv_gamma(2, 0.5);

  nu ~ gamma(2, 0.05);

  Y ~ student_t(nu, alpha + rows_dot_product(beta[dhb_id], X), sigma);
}
generated quantities {

}


Yup!
