Vectorizing varying slopes models

Hello all,

I am trying to fit a varying slope model. I have 5 predictors.

Usually, when I am estimating a simple fixed effects model, I declare the independent variables as a matrix in the data block of Stan, matrix[N, K] X; where K is the number of predictors and N is the number of observations.

Then, in the parameters block, the coefficients to be estimated are declared as vector[K] beta;

Finally, in the model block, I usually set up the likelihood as:

for (i in 1:N) {
  Y[i] ~ student_t(nu, grand_mean + X[i] * beta, sigma);
}

However, I do not know how to put all the coefficients in vector form when estimating a varying slope model. I have quite a lot of predictors, so I really need it to be vectorized for efficiency.

Here is my code for the (not so efficient) varying slopes model.
If someone could just help me with a more efficient declaration of the model, I would be very grateful.
Thanks.

data {
  int<lower=0> N;                   // number of data items
  int<lower=0> J;                   // number of units
  int<lower=0> K;                   // number of predictors
  vector[N] Y;                      // dependent variable
  vector[N] X_IP;                   // independent variables
  vector[N] X_OP;
  vector[N] X_OTH;
  vector[N] X_CLIN;
  vector[N] X_OUT;

  int<lower=1,upper=J> dhb_id[N];   // unit ids
}

parameters {
  vector[5] beta;                   // grand slopes
  real grand_mean;
  real<lower=0> sigma;

  vector[J] DHB_specific_intercept;
  real<lower=0> sigma_DHB_specific_intercept;

  matrix[5, J] DHB_re;              // DHB-specific slope deviations
  vector<lower=0>[5] sigma_DHB_re;

  real<lower=0> nu;                 // degrees of freedom; must be positive
}
transformed parameters {

}

// The model to be estimated.
model {
  beta ~ normal(0, 5);
  grand_mean ~ normal(0, 5);

  sigma_DHB_specific_intercept ~ inv_gamma(2, 0.5);
  DHB_specific_intercept ~ normal(0, sigma_DHB_specific_intercept);

  for (k in 1:K) {
    sigma_DHB_re[k] ~ inv_gamma(2, 0.5);
    DHB_re[k] ~ normal(0, sigma_DHB_re[k]);
  }

  sigma ~ inv_gamma(2, 0.5);

  nu ~ gamma(2, 0.05);

  for (i in 1:N) {
    Y[i] ~ student_t(nu, grand_mean + DHB_specific_intercept[dhb_id[i]]
                     + (beta[1] + DHB_re[1, dhb_id[i]]) * X_IP[i]
                     + (beta[2] + DHB_re[2, dhb_id[i]]) * X_OP[i]
                     + (beta[3] + DHB_re[3, dhb_id[i]]) * X_OTH[i]
                     + (beta[4] + DHB_re[4, dhb_id[i]]) * X_CLIN[i]
                     + (beta[5] + DHB_re[5, dhb_id[i]]) * X_OUT[i],
                     sigma);
  }
}
generated quantities {

}

Just remove all the instances of [i] (and, since the slope terms are now length-N vectors, transpose them and switch from * to elementwise .* multiplication):


Y ~ student_t(nu, grand_mean + DHB_specific_intercept[dhb_id]
              + (beta[1] + DHB_re[1, dhb_id]') .* X_IP
              + (beta[2] + DHB_re[2, dhb_id]') .* X_OP
              + (beta[3] + DHB_re[3, dhb_id]') .* X_OTH
              + (beta[4] + DHB_re[4, dhb_id]') .* X_CLIN
              + (beta[5] + DHB_re[5, dhb_id]') .* X_OUT,
              sigma);

Hi Mike, thanks.

I was wondering if, instead of writing out beta[1], beta[2], …, beta[5] and DHB_re[1,dhb_id], …, DHB_re[5,dhb_id],

I could just do something like beta[k] and DHB_re[k,dhb_id]. I have 24 predictors, so writing beta[1], beta[2], …, beta[24] and DHB_re[1,dhb_id], …, DHB_re[24,dhb_id] would be very long and inefficient.

Any suggestions?
antony

Ah, yes, you can just pass in the predictors as a single matrix, then use the rows_dot_product() function to compute the expectation for each observation.
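
For concreteness, here is a minimal sketch of that approach, assuming you pass the predictors in as a single matrix[N, K] X and reshape DHB_re to matrix[J, K] (rows = DHBs, columns = predictors); the other names are carried over from your model and the priors are unchanged:

transformed parameters {
  // per-DHB coefficients = grand slopes + DHB-specific deviations
  matrix[J, K] beta_dhb = rep_matrix(beta', J) + DHB_re;
}
model {
  // ... priors as in your model ...
  Y ~ student_t(nu, grand_mean + DHB_specific_intercept[dhb_id]
                + rows_dot_product(X, beta_dhb[dhb_id]), sigma);
}

Here beta_dhb[dhb_id] picks out the row of coefficients for each observation's DHB, and rows_dot_product() dots each of those rows with the corresponding row of X, so all K slope terms collapse into one call.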

The “hwg” model at this repo has what I think is the same fundamental structure as your model (you can easily switch its normal observation-level structure to student_t), but it has some tweaks that will speed up inference when there is high redundancy in the predictor matrix.

The fact that you have 24 predictors leads me to infer that many of your predictors are observational (vs experimentally manipulated) variables, which usually means that your predictor matrix is likely to have a decent amount of correlation among the predictors, in which case you'd probably want to QR transform the predictor matrix. But that's just an operation to add to the transformed data block, and it is compatible with the efficiency tricks of the model I linked above.
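
If you do go the QR route, a minimal sketch of the extra transformed data block, following the thin-QR reparameterisation from the Stan user's guide (you would then model Q in place of X in the likelihood and map the fitted coefficients back to the original scale with R_inv):

transformed data {
  // thin QR decomposition of the predictor matrix
  matrix[N, K] Q = qr_thin_Q(X) * sqrt(N - 1);
  matrix[K, K] R = qr_thin_R(X) / sqrt(N - 1);
  matrix[K, K] R_inv = inverse(R);
}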


Hi Mike, thank you.

I have recoded it as per your advice.
Does this code look okay to you?

data {
  int<lower=0> N;                   // number of data items
  int<lower=0> K;                   // number of predictors
  int<lower=0> J;                   // number of groups
  int<lower=1,upper=J> dhb_id[N];
  matrix[N, K] X;
  vector[N] Y;
}

parameters {
  real alpha;

  matrix[J, K] beta;                // group-specific coefficients

  real<lower=0> sigma;

  real<lower=0> nu;                 // degrees of freedom; must be positive
}
transformed parameters {

}

model {
  to_vector(beta) ~ normal(0, 5);

  alpha ~ normal(0, 5);

  sigma ~ inv_gamma(2, 0.5);

  nu ~ gamma(2, 0.05);

  Y ~ student_t(nu, alpha + rows_dot_product(beta[dhb_id], X), sigma);
}
generated quantities {

}


Yup!
