I’m pasting the model in here (with auto-formatting from our emacs mode and reduced vertical space):

```
data {
int<lower=1> T; // num steps
int<lower=1> K; // num categories
int<lower=1> D; // dim of obs
matrix[T, D] x; // obs
vector<lower=0>[K] alpha; // init category prior
vector[D] lambda; // init emit prior
cov_matrix[D] rho;
vector<lower=0>[K] beta; // transit prior
int nu; // prior over covariances
cov_matrix[D] kappa;
}
parameters {
simplex[K] pi; // init category
vector[D] imu[K]; // init emission
cov_matrix[D] isigma[K];
simplex[K] trans[K]; // transit probs
matrix[D, D] mat[K]; // emit prob matrix
cov_matrix[D] sigma[K];
}
model {
pi ~ dirichlet(alpha);
for (k in 1:K) {
imu[k] ~ multi_normal(lambda, rho);
isigma[k] ~ inv_wishart(nu, kappa);
}
for (k in 1:K)
trans[k] ~ dirichlet(beta);
for (k in 1:K)
sigma[k] ~ inv_wishart(nu, kappa);
{
// forward algorithm computes log p(x|...)
real aux[K];
real gamma[T, K];
vector[D] mu;
for (k in 1:K)
gamma[1, k] = multi_normal_lpdf(x[1,] | imu[k], isigma[k]) + log(pi[k]);
for (t in 2:T) {
for (k in 1:K) {
for (j in 1:K)
aux[j] = gamma[t - 1, j] + log(trans[j, k]);
mu = mat[k,,] * x[t - 1]';
gamma[t, k] = log_sum_exp(aux) + multi_normal_lpdf(x[t] | mu', sigma[k]);
}
}
target += log_sum_exp(gamma[T]);
}
}
```

This will never be lightning fast given the multivariate normal emissions.

What you can do to speed things up is try to precompute and store things like cholesky factors of covariance matrices for the multivariate normals. For example, each of the `sigma[k]`

should be Choleksy factored and you should get an `L_sigma[k]`

which can be reused for all `T`

. That’ll go a long way to speeding things up.

I’d also get rid of the inverse Wishart and use our decomposition into cholesky factors of correlation matrices and scales (described in the manual chapter on regression).

Then you want to work on vectorizing things like this:

```
for (k in 1:K) {
imu[k] ~ multi_normal(lambda, rho);
```

which only requires you to replace the loop with

```
imu ~ multi_normal(lambda, rho);
```

You’re probably then going to need the multivariate non-centered parameterization here to get decent mixing. The manual also describes that.

This is a complicated model to speed up!