Speeding up a multi_normal model

Hello!

I’ve recently started using Stan and I’m loving it!

I developed a multivariate regression model, and it’s taking quite some time to run with 4 outcomes, 1500 observations, and 6 independent variables. I’ve already set up the code to benefit from matrix operations and the Cholesky decomposition.

However, in the final step, I need to use a for loop to properly state my model. Here is my Stan code:

data 
{
    int<lower=1> N; // Number of observations
    int<lower=1> J; // Number of dependent variables, i.e. the number of regression equations
    int<lower=0> K; // Number of independent variables
    matrix[N,K] x;  // Matrix of independent-variable data
    matrix[N,J] y;  // Matrix of dependent-variable data
}

parameters 
{
    matrix[K,J] beta;                 // regression coefficients
    vector<lower=0>[J] st_devs;       // residual standard deviations
    cholesky_factor_corr[J] L_corr;   // Cholesky factor of the residual correlation matrix
}

model 
{
    matrix[N,J] xbeta = x * beta;   // linear predictor for all observations at once

    for (j in 1:J)
        beta[, j] ~ normal(0, 10);

    st_devs ~ cauchy(0, 2.5);
    L_corr ~ lkj_corr_cholesky(1.0);

    // the loop I'd like to get rid of; note it also rebuilds the Cholesky
    // factor of the covariance matrix on every iteration
    for (i in 1:N)
        y[i] ~ multi_normal_cholesky(xbeta[i], diag_pre_multiply(st_devs, L_corr));
}

generated quantities
{
    corr_matrix[J] cor_mtx = multiply_lower_tri_self_transpose(L_corr);   // residual correlation matrix
    cov_matrix[J]  cov_mtx = quad_form_diag(cor_mtx, st_devs);            // residual covariance matrix
}

My question here is actually pretty simple: is there any way I could get rid of that for loop in the model block (the one with the multi_normal_cholesky statement)?

The multi_normal_cholesky distribution can’t take the whole xbeta matrix as is, but it is already vectorized to accept array inputs. So I thought about transforming the N×J xbeta matrix into an N-long array containing a J-long row_vector in each element. Something like this:

matrix[N,J] xbeta = x * beta;
row_vector[J] xbeta_new[N] = transform_into_array_of_vectors(xbeta);

I’m hoping there’s a magical function like this that doesn’t involve a for loop internally; otherwise it would defeat the purpose of speeding up the whole process.

Is there any way of doing this? I tried some of the “to_array” and “to_vector” functions, but Stan kept giving me errors.
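For reference, here’s roughly what I tried. As far as I can tell, both functions flatten the matrix into a single dimension rather than producing an array of J-long vectors, which I’m guessing is why the types never matched:

matrix[N,J] xbeta = x * beta;
vector[N*J] xbeta_flat = to_vector(xbeta);   // flattens to one long vector
real xbeta_arr[N*J] = to_array_1d(xbeta);    // flattens to a 1-D real array
// neither has the row_vector[J] xbeta_new[N] shape that the vectorized
// multi_normal_cholesky signature expects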

Thank you so much!

PS: Sorry for the poor choice of posting category. I wasn’t really sure where this question fits.


For x and y, you should just declare them as

row_vector[K] x[N];
row_vector[J] y[N];

but for the conditional mean, you have to do

row_vector[J] xbeta[N];
for (n in 1:N) xbeta[n] = x[n] * beta;
y ~ multi_normal_cholesky(xbeta, diag_pre_multiply(st_devs, L_corr));

which is all still much faster than calling multi_normal_cholesky N times.
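Putting it together, a minimal sketch of the whole program under that setup (same priors as yours; the prior on beta is written with to_vector, which is equivalent to your column-wise loop):

data 
{
    int<lower=1> N;
    int<lower=1> J;
    int<lower=0> K;
    row_vector[K] x[N];   // one row_vector of predictors per observation
    row_vector[J] y[N];   // one row_vector of outcomes per observation
}

parameters 
{
    matrix[K,J] beta;
    vector<lower=0>[J] st_devs;
    cholesky_factor_corr[J] L_corr;
}

model 
{
    row_vector[J] xbeta[N];
    for (n in 1:N)
        xbeta[n] = x[n] * beta;

    to_vector(beta) ~ normal(0, 10);   // equivalent to looping over columns
    st_devs ~ cauchy(0, 2.5);
    L_corr ~ lkj_corr_cholesky(1.0);

    y ~ multi_normal_cholesky(xbeta, diag_pre_multiply(st_devs, L_corr));   // one vectorized call
}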

Thanks for the heads up, @bgoodri!

Here are the speed-up results:

  • using my original formulation: sampling time = 1899 seconds
  • using the formulation you suggested: sampling time = 1716 seconds

Note that these timings are with n_jobs = 1 in the PyStan call.

I know this is even more memory intensive, but I’ve gotten some pretty great results using both matrix and array variables, like this:

In data block:

matrix[N,K] x;

In model block:

matrix[N,J] xbeta_temp = x * beta;
row_vector[J] xbeta[N];
for (n in 1:N)
    xbeta[n] = xbeta_temp[n];

So here I’m benefiting from the quick matrix multiplication and just doing a row-wise (from 1 to N) assignment.
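For completeness, here’s roughly what the relevant blocks look like now (the parameters and generated quantities blocks are unchanged; note that y has to stay an array of row_vectors for the vectorized sampling statement to work):

data 
{
    int<lower=1> N;
    int<lower=1> J;
    int<lower=0> K;
    matrix[N,K] x;        // keep x as a matrix for the fast multiply
    row_vector[J] y[N];   // y stays an array for the vectorized call
}

model 
{
    matrix[N,J] xbeta_temp = x * beta;   // one fast matrix multiplication
    row_vector[J] xbeta[N];
    for (n in 1:N)
        xbeta[n] = xbeta_temp[n];        // cheap row-wise copy into the array

    for (j in 1:J)
        beta[, j] ~ normal(0, 10);
    st_devs ~ cauchy(0, 2.5);
    L_corr ~ lkj_corr_cholesky(1.0);

    y ~ multi_normal_cholesky(xbeta, diag_pre_multiply(st_devs, L_corr));
}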


Exactly what you need to do: the matrix multiplication is faster, but the vectorized signatures are geared to arrays to avoid ambiguity about observation boundaries in multivariate distributions.