How to train a Hidden Markov Model with multiple sequences

Hello. I’m new to Stan and trying to implement a Hidden Markov Model that can be fit using multiple training sequences.

I found a similar question on another website, but I have no idea how to implement it in Stan:
Training a Hidden Markov Model, multiple training instances

Could anyone share some example code?
Thank you in advance.


There’s nothing special about handling multiple sequences. If you have two sets of independent measurements y1 and y2, then your posterior (likelihood times prior) is proportional to:

p(y_1 | \theta) p(y_2 | \theta) p(\theta)

So assuming we’re just dealing with a normal measurement model, when you go to code your model you’ll just have code for both terms:

model {
  y1 ~ normal(mu1, sigma1); // contribution from p(y1 | theta)
  y2 ~ normal(mu2, sigma2); // contribution from p(y2 | theta)
}

If you assume the observations are correlated it would be different.

So if you can code up a Stan model that handles any one HMM sequence, p(y_n | \theta) or however you want to write it, then your Stan code would look like:

model {
  target += log_prob_of_observation(1); // Contributions from sequence 1
}

And then multiple sequences would just be:

model {
  for(n in 1:N)
    target += log_prob_of_observation(n); // Contribution from sequence n
}

Anyway, when you’re writing a Stan model you’re just trying to compute the log of the likelihood times the prior. Step 1 in writing the model is writing down the likelihood, so if you can do that for your multiple-sequence HMM, then hopefully translating it to something Stan can work with won’t be impossible.
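For reference, here is that likelihood written out for an HMM, as a sketch. The notation is assumed here rather than taken from the posts above: z_{n,t} is the hidden state of sequence n at time t, T_n is the length of sequence n, and y_{n,t} is the corresponding observation. The first line marginalizes the hidden states of one sequence; the second is the log density that the `target +=` loop accumulates.

```latex
p(y_n \mid \theta) = \sum_{z_{n,1}, \dots, z_{n,T_n}} p(z_{n,1} \mid \theta) \prod_{t=2}^{T_n} p(z_{n,t} \mid z_{n,t-1}, \theta) \prod_{t=1}^{T_n} p(y_{n,t} \mid z_{n,t}, \theta)

\log p(\theta \mid y_1, \dots, y_N) = \log p(\theta) + \sum_{n=1}^{N} \log p(y_n \mid \theta) + \text{const}
```

The sum over hidden state paths is never enumerated in practice; the forward algorithm computes it in time linear in T_n.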


Hi @bbbales2, thank you very much for your kind reply.
I’m sorry that the way I asked caused a misunderstanding since my question was not about coding up a model like p(y_1 | \theta) p(y_2 | \theta) p(\theta).

Let me explain in more detail what I wanted to ask.
Let’s say we have a sensor that has three hidden states (y) and emits a value (X) every second depending on the state. Each time we run an experiment, we get one observation sequence.
From N independent experiments, we have obtained N observation sequences of different lengths, as follows:

First time experiment: y1 = [1, 1, 2, 3], X1 = [0.3, 0.3, 1.2, 2.5]
Second time: y2 = [1, 2, 3, 3, 3], X2 = [0.2, 1.0, 2.2, 2.6, 2.5]

N-th time: yN = [1, 2, 2], XN = [0.1, 1.0, 0.9]

What I’d like to do is train an HMM on these N observation sequences so that I can build an estimator of the hidden state from a sequence of sensor values.

I’m very sorry to bother you, but could you give me some example code that does the above?

\prod_{n=1}^N p(y_n | \theta) represents N independent measurements.

The likelihoods there are conditionally independent given \theta, but \theta could be shared across the different likelihood terms or not. So if you collect data from the same system N times, for instance, you’d probably assume the transition matrix is the same for every run.
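When the hidden states are recorded in the training runs, as in the sensor example above, the shared-\theta likelihood can be written directly, with no marginalization. The following is only a sketch under stated assumptions: normal emissions with one mean per state and a common standard deviation, placeholder priors, and sequences of different lengths stored end to end. The names K, len, T_total, Gamma, mu and sigma are made up for illustration. It uses the `array[...]` declaration syntax, so it needs a reasonably recent Stan (2.26 or later).

```stan
data {
  int<lower=1> K;                          // number of hidden states (3 for the sensor)
  int<lower=1> N;                          // number of training sequences
  array[N] int<lower=1> len;               // length of each sequence
  int<lower=1> T_total;                    // sum of len
  array[T_total] int<lower=1, upper=K> y;  // known states, sequences concatenated
  vector[T_total] X;                       // sensor values, same order as y
}
parameters {
  array[K] simplex[K] Gamma;  // Gamma[i] = transition probabilities out of state i
  vector[K] mu;               // emission mean per state (assumed normal emissions)
  real<lower=0> sigma;        // common emission sd (assumption)
}
model {
  int pos = 1;                // index of the first element of the current sequence
  mu ~ normal(0, 5);          // placeholder priors, adjust to the sensor's scale
  sigma ~ normal(0, 2);
  for (n in 1:N) {
    for (t in 1:len[n]) {
      int i = pos + t - 1;
      X[i] ~ normal(mu[y[i]], sigma);         // emission term
      if (t > 1)
        y[i] ~ categorical(Gamma[y[i - 1]]);  // transition term, never across sequences
    }
    pos += len[n];
  }
}
```

The transition matrix and emission parameters are shared by all N sequences, and the `t > 1` check keeps the last state of one sequence from being treated as the predecessor of the first state of the next. The distribution of the initial state is not modelled in this sketch.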

Nah nah, we should have examples for this stuff. There’s a new HMM interface that got added in 2.24. Use cmdstanr/cmdstanpy to get access to it.

Here’s a not-quite-finished case study for using HMMs: hmm-example.pdf (257.5 KB), and the Stan model here: hmm-example.stan (1.2 KB). If you find problems with this (quite likely) lemme know.
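For several sequences of different lengths, one way to use that interface is to call `hmm_marginal` once per sequence and add the results to `target`. The following is a rough sketch, not the contents of the attached hmm-example.stan. It assumes normal emissions with ordered means (to keep the state labels identifiable), placeholder priors, and sequences stored end to end in one vector; the names K, len, T_total, Gamma_rows, rho, mu and sigma are made up for illustration. `hmm_marginal(log_omega, Gamma, rho)` takes a K x T matrix of log emission densities, a K x K transition matrix whose rows sum to one, and the initial state distribution.

```stan
data {
  int<lower=1> K;              // number of hidden states
  int<lower=1> N;              // number of sequences
  array[N] int<lower=1> len;   // length of each sequence
  int<lower=1> T_total;        // sum of len
  vector[T_total] X;           // sensor values, sequences concatenated
}
parameters {
  array[K] simplex[K] Gamma_rows;  // rows of the transition matrix
  simplex[K] rho;                  // initial state distribution
  ordered[K] mu;                   // emission means (assumed normal emissions)
  real<lower=0> sigma;             // common emission sd (assumption)
}
model {
  matrix[K, K] Gamma;
  int pos = 1;                     // start of the current sequence within X
  for (k in 1:K)
    Gamma[k] = Gamma_rows[k]';
  mu ~ normal(0, 5);               // placeholder priors
  sigma ~ normal(0, 2);
  for (n in 1:N) {
    matrix[K, len[n]] log_omega;   // log p(X_t | state k) for this sequence
    for (t in 1:len[n])
      for (k in 1:K)
        log_omega[k, t] = normal_lpdf(X[pos + t - 1] | mu[k], sigma);
    target += hmm_marginal(log_omega, Gamma, rho);  // hidden states marginalized out
    pos += len[n];
  }
}
```

To estimate hidden states for a sequence after fitting, the companion functions `hmm_latent_rng` and `hmm_hidden_state_prob` take the same three arguments and can be called in `generated quantities`.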


Hi @bbbales2,

It would be very helpful if the Stan guide had examples for this stuff.

I’ve read the materials you gave me and I’ll give it a try.
Thank you again for your help, I appreciate it.
