# Vectorization in HMM model specification

I’m working on inference for an HMM with `cmdstan` and, following Bales’s example, I need to define the likelihood at all states to be used in the `hmm_marginal` function to integrate out the hidden states.
The data is defined as `row_vector[N] y;` and the parameters as

```stan
vector[Q] Mu;
real<lower=0.0> Sigma;
matrix[Q, N] log_omega;
```

and the likelihoods are then computed as

```stan
for (n in 1:N) {
  for (i in 1:Q) {
    log_omega[i, n] = normal_lpdf(y[n] | Mu[i], Sigma);
  }
}
```

The number of states `Q` and of observed points `N` could be quite large, so I was wondering if there’s a more efficient way of coding this step than a double loop.

In principle, yes, but you would need to write a custom likelihood function equivalent to `normal_lpdf` that returns a vector. The existing function is vectorized, but it sums all the log likelihoods and returns a scalar. This would work:

```stan
functions {
  // Elementwise normal log density over a row_vector of observations.
  // Note: the name must not end in _lpdf, since user-defined _lpdf
  // functions have to return a scalar, and only those allow the | syntax.
  row_vector normal_lpdf_vec(row_vector y, real mu, real sigma) {
    return -square((y - mu) / sigma) / 2 - log(sigma) - log2() / 2 - log(pi()) / 2;
  }
}

// other parts

transformed parameters {
  // other parts

  for (i in 1:Q) {
    log_omega[i, ] = normal_lpdf_vec(y, Mu[i], Sigma);
  }
}
```

But I doubt that would have a significant effect, as loops in C++ are quite efficient. For the example in the link you posted, I updated the code with such a vectorized version, and the times are identical. But that’s with small `N` and `Q`, so maybe it would help at larger sizes. Feel free to try it.

PS: some of the operations in `normal_lpdf_vec` are the same across loop iterations (because `sigma` and the constants do not change), so you could take these out and apply them only once to the final `log_omega` matrix. Maybe that speeds it up a bit more.
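To double-check the algebra behind that factoring, here is a small numpy sketch that compares the naive double loop with a version where only the `mu`-dependent term varies per state and the `sigma` term plus the constant are added once to the whole matrix. The array names and toy sizes are illustrative, not from the original model:

```python
import numpy as np

rng = np.random.default_rng(0)
Q, N = 4, 10  # small stand-ins for the number of states / observations
y = rng.normal(size=N)
mu = rng.normal(size=Q)
sigma = 1.3

# Naive double loop, mirroring the original Stan code
log_omega_loop = np.empty((Q, N))
for i in range(Q):
    for n in range(N):
        log_omega_loop[i, n] = (
            -((y[n] - mu[i]) / sigma) ** 2 / 2
            - np.log(sigma)
            - 0.5 * np.log(2 * np.pi)
        )

# Factored version: only the squared-deviation term depends on mu,
# so compute it for all states at once via broadcasting, then add the
# sigma term and the constant a single time to the whole matrix
log_omega_fact = -((y[None, :] - mu[:, None]) / sigma) ** 2 / 2
log_omega_fact += -np.log(sigma) - 0.5 * np.log(2 * np.pi)

assert np.allclose(log_omega_loop, log_omega_fact)
```

The same grouping carries over to the Stan code: keep only the squared-deviation term inside the loop over states, then add `-log(Sigma) - log2()/2 - log(pi())/2` to `log_omega` once.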


Thanks for working on this, @Ven_Popov
I tried the vectorised function with Q~40 and N~5000 but, as you anticipated, there is no significant improvement. I’ll rethink the approach and see if I can come up with a better way of proceeding.

thanks again!
Davide
