Handling Permutation-Invariance in HMMs

To be completely unhelpful, I don’t think we have a strong enough understanding of the degeneracies inherent to Hidden Markov Models quite yet. Label switching is just one problem out of many, and the others are often far more insidious, remaining problematic even when the label switching can be eliminated.

One of the problems with trying to match chains to the Viterbi sequence is that it makes it very easy to overfit when there are more degeneracies than pure label switching: you take one chain that seems “close enough” and ignore many other model configurations that are also consistent with the data. In other words, filtering out chains by how similar they are to the Viterbi sequence works only when the posterior concentrates in a small neighborhood of sequences around Viterbi, and if that were true then you wouldn’t be seeing such chain-by-chain variation.
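To make the concentration point concrete, here is a minimal sketch that enumerates every hidden-state sequence of a tiny two-state HMM and reports how much posterior mass the single most probable sequence (the one the Viterbi algorithm returns) actually carries. Everything here is an assumption for illustration: the Gaussian observational models, the parameter values, and the eight made-up observations. It also holds the parameters fixed, which is a narrower setting than the chain-by-chain variation discussed above.

```python
import itertools
import numpy as np

def log_normal_pdf(y, mu, sigma):
    # Log density of a normal distribution.
    return -0.5 * np.log(2 * np.pi) - np.log(sigma) - 0.5 * ((y - mu) / sigma) ** 2

# Hypothetical two-state HMM with heavily overlapping observational models.
rho = np.array([0.5, 0.5])                  # initial state probabilities
Gamma = np.array([[0.8, 0.2], [0.2, 0.8]])  # transition matrix (rows sum to 1)
mu = np.array([0.0, 0.5])                   # state-specific locations
sigma = np.array([1.0, 1.0])                # state-specific scales
y = np.array([0.3, -0.2, 0.8, 0.1, 0.6, -0.4, 0.9, 0.2])  # made-up data

T, K = len(y), len(rho)

# Brute-force enumeration of all K**T hidden-state sequences.
seqs = list(itertools.product(range(K), repeat=T))
log_joint = np.empty(len(seqs))
for n, z in enumerate(seqs):
    lp = np.log(rho[z[0]]) + log_normal_pdf(y[0], mu[z[0]], sigma[z[0]])
    for t in range(1, T):
        lp += np.log(Gamma[z[t - 1], z[t]])
        lp += log_normal_pdf(y[t], mu[z[t]], sigma[z[t]])
    log_joint[n] = lp

# Normalize to get the posterior over sequences given the fixed parameters.
post = np.exp(log_joint - log_joint.max())
post /= post.sum()

viterbi_seq = seqs[np.argmax(post)]  # most probable joint state sequence
viterbi_mass = post.max()            # posterior mass it carries
print(viterbi_seq, viterbi_mass)
```

When `viterbi_mass` is small, most of the posterior sits on sequences other than the Viterbi one, so similarity to that single sequence says little about which configurations are consistent with the data.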

With that said, when trying to eliminate label switching one has to keep in mind that label switching in HMMs happens at the observational level. If the component observational models for each state are identical then there’s nothing that can distinguish between the latent states; it’s a fundamental flaw with the experimental design. Consequently, the key is to differentiate those observational processes somehow, for example with priors on the auxiliary parameters of those observational models that don’t overlap, so that each observational model captures a unique set of behaviors.
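A rough sketch of both halves of that argument, assuming Gaussian observational models and a hand-rolled forward algorithm: the marginal likelihood is unchanged when the state labels are permuted, while a prior with disjoint supports on the state locations assigns zero density to the relabeled configuration. The parameter values, the uniform prior bounds, and the helper names (`hmm_log_lik`, `log_prior_mu`) are all hypothetical.

```python
import numpy as np

def log_normal_pdf(y, mu, sigma):
    # Log density of a normal distribution, vectorized over states.
    return -0.5 * np.log(2 * np.pi) - np.log(sigma) - 0.5 * ((y - mu) / sigma) ** 2

def logsumexp(a, axis=0):
    # Numerically stable log(sum(exp(a))) along one axis.
    m = np.max(a, axis=axis, keepdims=True)
    return np.squeeze(m, axis=axis) + np.log(np.sum(np.exp(a - m), axis=axis))

def hmm_log_lik(y, rho, Gamma, mu, sigma):
    # Forward algorithm in log space, marginalizing over the hidden states.
    log_alpha = np.log(rho) + log_normal_pdf(y[0], mu, sigma)
    for t in range(1, len(y)):
        log_alpha = (logsumexp(log_alpha[:, None] + np.log(Gamma), axis=0)
                     + log_normal_pdf(y[t], mu, sigma))
    return logsumexp(log_alpha, axis=0)

def log_prior_mu(mu, lower, upper):
    # Independent uniform priors with disjoint supports, one per state.
    inside = np.all((mu > lower) & (mu < upper))
    return -np.sum(np.log(upper - lower)) if inside else -np.inf

rng = np.random.default_rng(0)
y = rng.normal(size=50)  # placeholder data; any data shows the invariance

rho = np.array([0.3, 0.7])
Gamma = np.array([[0.9, 0.1], [0.2, 0.8]])
mu = np.array([-1.0, 2.0])
sigma = np.array([1.0, 0.5])

# Relabel the states: swap state 0 and state 1 everywhere.
perm = [1, 0]
ll = hmm_log_lik(y, rho, Gamma, mu, sigma)
ll_perm = hmm_log_lik(y, rho[perm], Gamma[np.ix_(perm, perm)], mu[perm], sigma[perm])

# Non-overlapping prior supports: state 0 in (-5, 0), state 1 in (0, 5).
lower, upper = np.array([-5.0, 0.0]), np.array([0.0, 5.0])
lp = log_prior_mu(mu, lower, upper)
lp_perm = log_prior_mu(mu[perm], lower, upper)

print(ll, ll_perm)  # identical up to floating point
print(lp, lp_perm)  # finite versus -inf
```

The likelihood alone cannot tell the two labelings apart, so any distinction has to come from information that treats the observational models differently, which is what the disjoint supports do here.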

If the transitions are rigid enough then even occasional measurements that can distinguish between the states can be enough to identify everything at intermediate times, although a time-dependent transition matrix will make this tough.
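As an illustration of the rigid-transition case, the sketch below runs a forward-backward pass on a two-state chain in which only the first and last measurements can distinguish the states and every measurement in between is completely uninformative. The per-time likelihood table `lik`, the transition matrices, and the function name `smoothed_marginals` are assumptions made for the example, and the transition matrix is held constant in time.

```python
import numpy as np

def smoothed_marginals(rho, Gamma, lik):
    # Forward-backward pass returning P(z_t = k | all observations).
    # lik[t, k] is the likelihood of the observation at time t under state k.
    T, K = lik.shape
    alpha = np.zeros((T, K))
    beta = np.ones((T, K))
    alpha[0] = rho * lik[0]
    alpha[0] /= alpha[0].sum()
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ Gamma) * lik[t]
        alpha[t] /= alpha[t].sum()  # rescale to avoid underflow
    for t in range(T - 2, -1, -1):
        beta[t] = Gamma @ (lik[t + 1] * beta[t + 1])
        beta[t] /= beta[t].sum()
    post = alpha * beta
    return post / post.sum(axis=1, keepdims=True)

T = 10
rho = np.array([0.5, 0.5])

# Uninformative measurements everywhere except the two endpoints,
# which strongly favor state 0.
lik = np.ones((T, 2))
lik[0] = [0.99, 0.01]
lik[-1] = [0.99, 0.01]

rigid = np.array([[0.99, 0.01], [0.01, 0.99]])  # states rarely switch
loose = np.array([[0.5, 0.5], [0.5, 0.5]])      # states forget immediately

p_rigid = smoothed_marginals(rho, rigid, lik)
p_loose = smoothed_marginals(rho, loose, lik)

print(p_rigid[5])  # midpoint state is pinned down by the endpoints
print(p_loose[5])  # midpoint state is not identified at all
```

With the rigid matrix the endpoint information propagates through the uninformative stretch, whereas with the loose matrix the intermediate states stay at their prior probabilities.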