Fitting Markov models with latent discrete states

Hi, here’s an example of the sort that we can’t fit in Stan:
It’s a hidden Markov model where each data point comes from one of two states, so with N data points there are 2^N possibilities so you can’t just use the mixture model formulation as in the Stan manual (that is, unless I’m missing some simplification that would allow this, as I guess is possible).

As I describe at the link, I don’t actually like this model for this particular application (hot and cold spells in sports), but (a) there are real problems with unobserved latent states (for example, Phil Price’s models of indoor energy use), and (b) in any case, people want to fit these models all the time. The Jags code for this particular model is here:

So what’s to be done? Yes, I understand that discrete models are in general in some sense impossible to fit. Nonetheless, people are fitting them all the time.

So I don’t want the response on this thread to be “don’t do it” or “this problem is impossible to solve.” It may be impossible to get finite-time algorithms that are provably close or whatever, but it’s not impossible to get reasonable fits.

This particular example is not ideal–as I said, I don’t really like the model for this sports example–but I think it’s a good enough example for us to think seriously about how to attack these problems in Stan.

Is this different from the HMM stuff that’s already in Stan ( It wasn’t super easy to read their code to see what’s going on.

I’m not having an easy time reading their code either. It’s been a while since I’ve gone through any nontrivial jags code.

Yes, I think it’s the same as the HMMs in the Stan manual. I guess the challenge is programming it up. Does this mean we should have a pre-written HMM function in Stan that would be easier for people to use?

That model’s easy to fit. See the manual. The forward algorithm provides an O(N * K^2) function for evaluating the likelihood where K is the number of discrete states and N is the length of the sequence. Same algorithm that everyone uses for EM for max likelihood estimates of HMMs.]

There’s a description in the manual. And on your blog as @bbbales2 pointed out. I put that up there the last time this issue came up and people said you couldn’t fit HMMs in Stan.

Yup, and they’re even fitting them with Stan. You just have to be careful of all the usual pitfalls of mixture models.

Probably. But there are two obstacles:

  1. The emission distribution somehow needs to be passed in as an argument like the system function for ODEs.

  2. In realistic applications, there are often predictors informing the transitions and the emissions in addition to underlying state.

1 Like

Damn . . . now I wish I could remember who told me he was fitting this sort of model, so I could point him to the Stan solution.

Is there a list where we have open tasks like the pre-written HMM function? Then I could post something encouraging someone to solve it for us.


You did write a blog post about it.

A hard-coded HMM function isn’t that hard to write. I could write that. The problem is how to do it with reasonable flexibility so that you can plug in different emission distributions. We could code it like the integrate ODE functions to take an emission probability density as an argument. This would be much easier if we had higher-order types, which is the next big push for the language after tuples, I think.

The real problem is that it’s not a very flexible approach. A lot of HMMs add in special sauce in terms of things like using predictors for transition probabilities or adding in extra predictors for emissions or in terms of adding longer-range dependencies than first order. I was looking at the R moveHMM package, and they hard code up a bunch of these.