Running time problem

I am working on double hierarchical generalized model with data simulated.

The model I set was

y ~ normal(mean, variance)

mean = Xbeta + Zu1

where X and Z are design matrices, beta is coefficients for fixed effects and u1 is coefficients for random effects.

Similarly,

log(variance) = phi + Z*u2

phi is just a intercept term and u2 is coefficients for random effects. Before defining priors for parameters, I used the idea from Gelman (2008) that u1 and u2 can be

u1[i] = dummy1 * Xi1[i]
u2[i] = dummy2 * Xi2[i]

Each element of Xi1 has normal(0, var = tauXi1), each element of X2 has normal(0, var = 1/tauXi2). Indeed, each of u1 and u2 have normal(0, tauMu) and normal(0, tauSigma) where tauMu = absolute(dummy1)/sqrt(tauXi1) and tauSigma = absolute(dummy2)/sqrt(tauSigma).

I am sampling tauXi1 and tauXi2 from gamma(1.5, 37.5) and dummy1 and dummy2 from normal(0,1). For other parameters, I used non-informative priors normal(0, 100^2).

Here is my stan code:

data {
int<lower=0> N;
int<lower=0> Nx1;
int<lower=0> Nx2;
int<lower=0> Nz;
matrix[N,Nx1] x1;
matrix[N,Nx2] x2;
matrix<lower=0,upper=1>[N,Nz] z;
real y[N];
vector[Nz] mvnMu;
matrix<lower=0>[Nz,Nz] mvnDiag;
}
parameters {
vector[Nz] u1;
vector[Nz] u2;
vector[Nx1] beta;
vector[Nx2] phi;
real dummy1;
real dummy2;
real<lower=0> invtauXi1Sq;
real<lower=0> invtauXi2Sq;
}
transformed parameters {
vector[N] mu;
vector[N] logSigma;
vector[N] sigma;
real<lower=0> tauMu;
real<lower=0> tauSigma;
matrix<lower=0>[Nz,Nz] mvnSigma1;
matrix<lower=0>[Nz,Nz] mvnSigma2;

mu = multiply(x1, beta) + multiply(z, u1);
logSigma = multiply(x2, phi) + multiply(z, u2);

sigma = exp(logSigma);

tauMu = abs(dummy1)/sqrt(invtauXi1Sq);
tauSigma = abs(dummy2)/sqrt(invtauXi2Sq);

mvnSigma1 = multiply(pow(tauMu,2), mvnDiag);
mvnSigma2 = multiply(pow(tauSigma,2), mvnDiag);

}
model {
for (i in 1:N) {
y ~ normal(mu[i], sigma[i]);
}
beta ~ normal(0, 100);
phi ~ normal(0, 100);
dummy1 ~ normal(0,1);
dummy2 ~ normal(0,1);
invtauXi1Sq ~ gamma(1.5, 37.5);
invtauXi2Sq ~ gamma(1.5, 37.5);
u1 ~ multi_normal(mvnMu, mvnSigma1);
u2 ~ multi_normal(mvnMu, mvnSigma2);
}

mvnDiag is just an identity matrix and mvnMu is vector of zero.

This stan code takes too long to complete! Can someone help me to improve running time?

A few things to try:

Use constraints to force the variables to be positive rather than abs(), similarly, you can do the same for sigma, and stan will handle the transformation for you.

You can use vectorization and just write:
y ~ normal(mu, sigma);

(it looks like you have a bug as it is right now and you actually wanted y[i] ~ normal(mu[i], sigma[i]);)

If u1 and u2 are each uncorrelated, you can just use normal, rather than multinormal.

1 Like

To elaborate on what @aaronjg said, abs() is a terrible idea for positive parameters as it makes the posterior multimodal by construction when it doesn’t have to be.

Also, do what he’s suggesting for multinormal. Multiplying a scalar by a constant diagonal matrix is crushingly expensive and then goes through a matrix solver, which is a cubic operation which doesn’t even need to happen if you use the normal as @aaronjg suggests.

Thanks for suggestions! Now, my code is much faster.