I’m currently working on training a Hidden Markov model on a collection of sequence data using unsupervised methods. Essentially, I have a collection of time series samples, each of which I assume is generated by a HMM with four hidden states, with each state having a Poisson emissions (and each emission having a different mean). However, I assume that, while the HMMs generating each sequence all have the same transition matrix, each one is allowed to have different Poisson emission means - so that they all have the same transitional probabilities, but different sequences are allowed to have different means in corresponding states.
I’d also like to be able to place some restrictions on the values taken by the transition matrix (i.e., force some of the transitions to be zero). I.e., I’d like the transition matrix to have form
[a b 0 0]
[0 c d 0]
[0 0 e f ]
[0 0 0 1]
and for the initial state probabilities to be [1 0 0 0] - i.e., the HMM starts in the first state always. I know that Baum-Welch does this if you set the initial value of the transition probability to zero, but so far, I’ve run into “initialization failed” issues when specifying zero probabilities for the initial state, and I’m not sure how to approach ensuring zero-probability transitions where I want them.
At the moment, I’ve adapted the Forward algorithm approach outlined in the Stan documentation to this problem. In its current state, it doesn’t work due to the zero initial probabilities that I’ve specified, but it works when I don’t specify any sort of initial probabilities - although the resulting parameters are not in line with what I’d like - a pretty uniform transition matrix, and approximately equal Poisson means for each sample. I’ve tried providing initial values for the transition matrix, such as
[0.900 0.033 0.033 0.034]
[0.050 0.800 0.100 0.050]
[0.050 0.050 0.800 0.100]
[0.033 0.033 0.034 0.900]
but this isn’t optimal, because I’d like to enforce that state 1 can only transition to states 1 and 2, state 2 can only transition to states 2 and 3, etc. Is there any way I can do this in Stan?
int<lower=1> K; // num states
int<lower=1> Q; // Length of value array
int<lower=1> S; // Number of sites
int<lower=0> s[S+1]; // Indices for each site's data
int<lower=0> u[Q]; // flattened values for all site
}
parameters {
simplex[K] theta[K]; // transit probs
vector<lower=0>[K] lambda[S]; // poisson means for each site
real<lower=0, upper=10> a[K]; // Gamma parameters
real<lower=0, upper=10> b[K]; // Gamma parameters
}
model {
a ~ uniform(0, 10);
b ~ uniform(0, 10);
for (ss in 1:S) {
for (k in 1:K)
lambda[ss, k] ~ gamma(a[k], b[k]);
{
real acc[K];
real gamma[s[ss+1]-s[ss], K];
int v[s[ss+1]-s[ss]] = u[s[ss]+1:s[ss+1]];
gamma[1, 1] = log(1); // Hoping that this forces the model into the first state
for (k in 2:K)
gamma[1, k] = log(0);
for (t in 2:size(v)) {
for (k in 1:K) {
for (j in 1:K) {
if (j == k || j == k + 1) {
acc[j] = gamma[t-1, j] + log(theta[j, k]) + poisson_lpmf(v[t] | lambda[ss, k]);
}
else {
acc[j] = negative_infinity();
}
}
gamma[t, k] = log_sum_exp(acc);
}
}
target += log_sum_exp(gamma[size(gamma)]);
}
}
}```