Slow Stan Code for Hierarchical Multivariate Model


Hi everyone! I am trying to fit a hierarchical multivariate model. However, the model takes long long hours to run. Below are the model, Rstan code and simulation data. I hope to get some help. Thank you!

  • The model I am trying to fit is the following:

\left( \begin{array}{c} Y_i \\ \log s_i \\ \end{array} \right) \sim N_2 \biggl[ \left( \begin{array}{c} \theta_i \\log \sigma_i \\ \end{array} \right), \left( \begin{array}{cc} \sigma_i^2 & \rho_1\sigma_i \\\rho_1\sigma_i & 1 \\ \end{array} \right) \biggr]
Hierarchical model for \theta_i and \sigma^2_i,
\left( \begin{array}{c} \theta_i \\ \log \sigma_i \\ \end{array} \right) \sim N_2 \biggl[ \left( \begin{array}{c} \mu_\theta \\\mu_\sigma \\ \end{array} \right), \left( \begin{array}{cc} r_\theta^2 & \rho_2 r_{\theta}r_{\sigma} \\\rho_2 r_{\theta}r_{\sigma} & r_\sigma^2 \\ \end{array} \right) \biggr]

  • Rstan code
data {
      int<lower=1> N;
      vector[2] x[N];
    parameters {
      vector[2] mu;
      vector<lower=0>[2] lambda;
      vector[2] theta[N];
      real<lower=-1,upper=1> r1;
      real<lower=-1,upper=1> r2;
    transformed parameters {
      vector<lower=0>[2] sigma;
      cov_matrix[2] T;
      cov_matrix[2] M[N];
      sigma[1] = inv_sqrt(lambda[1]);
      sigma[2] = inv_sqrt(lambda[2]);
      T[1,1] = square(sigma[1]);
      T[1,2] = r2 * sigma[1] * sigma[2];
      T[2,1] = r2 * sigma[1] * sigma[2];
      T[2,2] = square(sigma[2]);
      for (i in 1:N){
      M[i][1,1] = square(exp(theta[i,2]));
      M[i][1,2] = r1 * exp(theta[i,2]);
      M[i][2,1] = r1 * exp(theta[i,2]);
      M[i][2,2] = 1;
    model {
    lambda[1] ~ gamma(1,1);
    lambda[2] ~ gamma(1,1);
    for (i in 1:N){
    theta[i] ~ multi_normal(mu, T);
    x[i] ~ multi_normal(theta[i], M[i]);}
  • Data generation
i <- 999
n <- 50
mu <-100
tau <- 50


data_simu=cbind(y,theta,sigma2, s2)
data <- data.frame(data_simu)[,c("theta","y","s2")]
x <- cbind(data$y,log(sqrt(data$s2)))


Not completely understanding your code, but here are some quick thoughts:

  • It seems like the lambda parameters represent precision of the normal distribution, with a gamma prior. Andrew and others seem to be of the opinion that this is not a great parametrization - working with sigma directly and putting a Half-normal prior on sigma might improve things.
  • exp(theta[i,2]) can be computed once and then reused
  • multi_normal_cholesky tends to be a faster version. For the T matrix, which is reused, you should get a speed up by calling cholesky_decompose once and then using the result. You could be able to do some algebra and compute the Cholesky factor of the covariance matrix directly from the parameters which would be useful for the M matrices.
  • The “folk theorem” states that poor performance usually signals some problems with the model. Does the model converge? Does it recover simulated data well? Are the Rhat and n_eff statistics reasonable?
  • lambda[1] ~ gamma(1,1); lambda[2] ~ gamma(1,1); could be vectorized as lambda ~ gamma(1,1);