Speeding up inference on ARHMM


I am playing around with an autoregressive HMM with continuous emissions and incomplete data. Unfortunately the simple code I have, mainly adapted from the HMM example in the docs, is hopelessly slow.

I would appreciate any feedback on how I might boost the performance. Any hints about useful parameter and data transformations would be great. I am new to Stan and to MCMC in general, so I lack expert intuition.

I am including a pystan script of the model and data generation.

pystan_arhmm_ge.py (2.5 KB)


I’m pasting the model in here (with auto-formatting from our emacs mode and reduced vertical space):

data {
  int<lower=1> T;            // num steps
  int<lower=1> K;            // num categories
  int<lower=1> D;            // dim of obs
  matrix[T, D] x;            // obs
  vector<lower=0>[K] alpha;  // init category prior
  vector[D] lambda;          // init emit prior
  cov_matrix[D] rho;
  vector<lower=0>[K] beta;   // transit prior
  int nu;                    // d.o.f. of prior over covariances
  cov_matrix[D] kappa;
}
parameters {
  simplex[K] pi;             // init category
  vector[D] imu[K];          // init emission
  cov_matrix[D] isigma[K];
  simplex[K] trans[K];       // transit probs
  matrix[D, D] mat[K];       // emit prob matrix
  cov_matrix[D] sigma[K];
}
model {
  pi ~ dirichlet(alpha);
  for (k in 1:K) {
    imu[k] ~ multi_normal(lambda, rho);
    isigma[k] ~ inv_wishart(nu, kappa);
  }
  for (k in 1:K)
    trans[k] ~ dirichlet(beta);
  for (k in 1:K)
    sigma[k] ~ inv_wishart(nu, kappa);
  {
    // forward algorithm computes log p(x | ...)
    real aux[K];
    real gamma[T, K];
    vector[D] mu;

    for (k in 1:K)
      gamma[1, k] = multi_normal_lpdf(x[1] | imu[k], isigma[k]) + log(pi[k]);
    for (t in 2:T) {
      for (k in 1:K) {
        for (j in 1:K)
          aux[j] = gamma[t - 1, j] + log(trans[j, k]);
        mu = mat[k] * x[t - 1]';
        gamma[t, k] = log_sum_exp(aux) + multi_normal_lpdf(x[t] | mu, sigma[k]);
      }
    }
    target += log_sum_exp(gamma[T]);
  }
}
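For reference, here is the same forward recursion in log space sketched in plain numpy (illustrative only, not part of the Stan model; the helper names `mvn_logpdf` and `logsumexp` are mine):

```python
import numpy as np

def mvn_logpdf(y, mu, cov):
    """Log density of a multivariate normal via a Cholesky solve."""
    d = len(mu)
    L = np.linalg.cholesky(cov)
    z = np.linalg.solve(L, y - mu)
    return (-0.5 * d * np.log(2 * np.pi)
            - np.log(np.diag(L)).sum()
            - 0.5 * z @ z)

def logsumexp(a):
    """Numerically stable log(sum(exp(a)))."""
    m = np.max(a)
    return m + np.log(np.sum(np.exp(a - m)))

def forward_loglik(x, pi, trans, imu, isigma, mat, sigma):
    """log p(x | ...) for the AR-HMM via the forward algorithm."""
    T, K = len(x), len(pi)
    log_gamma = np.empty((T, K))
    for k in range(K):
        log_gamma[0, k] = np.log(pi[k]) + mvn_logpdf(x[0], imu[k], isigma[k])
    for t in range(1, T):
        for k in range(K):
            # gamma[t-1, j] + log trans[j, k], marginalized over j
            aux = log_gamma[t - 1] + np.log(trans[:, k])
            mu = mat[k] @ x[t - 1]  # autoregressive emission mean
            log_gamma[t, k] = logsumexp(aux) + mvn_logpdf(x[t], mu, sigma[k])
    return logsumexp(log_gamma[-1])
```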

This will never be lightning fast given the multivariate normal emissions.

What you can do to speed things up is precompute and store things like Cholesky factors of the covariance matrices for the multivariate normals. For example, each sigma[k] should be Cholesky factored once into an L_sigma[k] that can then be reused for all T time steps. That’ll go a long way toward speeding things up.
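A sketch of the idea inside the model block (`L_sigma` and `L_isigma` are just illustrative names; the rest of the model stays as before):

```stan
model {
  // ... priors as before ...
  {
    matrix[D, D] L_sigma[K];
    matrix[D, D] L_isigma[K];
    for (k in 1:K) {
      L_sigma[k] = cholesky_decompose(sigma[k]);
      L_isigma[k] = cholesky_decompose(isigma[k]);
    }
    // then in the forward recursion use
    //   multi_normal_cholesky_lpdf(x[t] | mu, L_sigma[k])
    // so the factorization is done K times instead of T * K times
  }
}
```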

I’d also get rid of the inverse Wishart and use our decomposition into Cholesky factors of correlation matrices and scales (described in the manual chapter on regression).
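For one of the covariance matrices, that decomposition might look like the following sketch (the LKJ shape 2 and half-Cauchy scale priors are just common defaults, not a recommendation specific to this model):

```stan
parameters {
  cholesky_factor_corr[D] L_Omega[K];  // correlation Cholesky factors
  vector<lower=0>[D] tau[K];           // per-dimension scales
}
model {
  for (k in 1:K) {
    L_Omega[k] ~ lkj_corr_cholesky(2);
    tau[k] ~ cauchy(0, 2.5);
  }
  // diag_pre_multiply(tau[k], L_Omega[k]) is the Cholesky factor of
  // the implied covariance; pass it to multi_normal_cholesky_lpdf
}
```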

Then you want to work on vectorizing things like this:

for (k in 1:K)
  imu[k] ~ multi_normal(lambda, rho);

which only requires you to replace the loop with

imu ~ multi_normal(lambda, rho);

You’re probably then going to need the multivariate non-centered parameterization here to get decent mixing. The manual also describes that.
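For imu, a non-centered sketch might look like this (assuming the rho from the data block; `imu_raw` and `L_rho` are illustrative names):

```stan
transformed data {
  matrix[D, D] L_rho = cholesky_decompose(rho);  // data, so factor once
}
parameters {
  vector[D] imu_raw[K];  // standard-normal innovations
}
transformed parameters {
  vector[D] imu[K];
  for (k in 1:K)
    imu[k] = lambda + L_rho * imu_raw[k];
}
model {
  for (k in 1:K)
    imu_raw[k] ~ normal(0, 1);  // implies imu[k] ~ multi_normal(lambda, rho)
}
```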

This is a complicated model to speed up!