Hi,
I am trying to estimate multinomial hidden markov model (i.e., states and observations are all discrete) using this code by Luis Damiano.
The HMM problem that I am dealing with has 2 hidden states and 4 possible outcomes (observations).
When fitting the model using Stan, the posterior is multimodal as expected, likely because of the possible rotation.
One solution that I am thinking about is setting a constraint that the emission probability of observation 1 for State 1 is always higher than State 2.
In mathematical form, if we define B_{ij} as probability of observing j, j = 1,...4 for state i, i = 1,2, set a constraint that B_{11} > B_{21}.
However, I cannot use simple solutions like ordered
, since B is implemented as rows of simplexes like below:
parameters {
// ...
// Discrete observation model
simplex[L] phi_k[K]; // event probabilities
}
Instead, my alternative plan is manually exchanging two simplexes phi_k[1]
and phi_k[2]
when phi_k[1,1] < phi_k[2,1]
.
One problem with this approach is that I cannot directly access to phi_k
. Alternatively, I created ordered_phi_k
in transformed parameters
and used it in the place of phi_k
.
So, the original code was
transformed parameters {
vector[K] unalpha_tk[T];
{ // Forward algorithm log p(z_t = j | x_{1:t})
real accumulator[K];
for (j in 1:K) {
unalpha_tk[1][j] = log(p_1k[j]) + log(phi_k[j, x[1]]);
}
for (t in 2:T) {
for (j in 1:K) { // j = current (t)
for (i in 1:K) { // i = previous (t-1)
// Murphy (2012) Eq. 17.48
// belief state + transition prob + local evidence at t
accumulator[i] = unalpha_tk[t-1, i] + log(A_ij[i, j]) + log(phi_k[j, x[t]]);
}
unalpha_tk[t, j] = log_sum_exp(accumulator);
}
}
} // Forward
}
and I have modified it into
transformed parameters {
vector[K] unalpha_tk[T];
vector[L] ordered_phi_k[K];
{ // Forward algorithm log p(z_t = j | x_{1:t})
real accumulator[K];
// use ordered phi
if (phi_k[2,1] > phi_k[1,1]) {
ordered_phi_k[1] = phi_k[2];
ordered_phi_k[2] = phi_k[1];
} else {
ordered_phi_k[1] = phi_k[1];
ordered_phi_k[2] = phi_k[2];
}
for (j in 1:K) {
unalpha_tk[1][j] = log(p_1k[j]) + log(ordered_phi_k[j, x[1]]);
}
for (t in 2:T) {
for (j in 1:K) { // j = current (t)
for (i in 1:K) { // i = previous (t-1)
// Murphy (2012) Eq. 17.48
// belief state + transition prob + local evidence at t
accumulator[i] = unalpha_tk[t-1, i] + log(A_ij[i, j]) + log(ordered_phi_k[j, x[t]]);
}
unalpha_tk[t, j] = log_sum_exp(accumulator);
}
}
} // Forward
}
What was the result?
I have simulated a HMM sequence with 1000 observations, where A = [0.75, 0.25; 0.35, 0.65], and B = [0.5, 0.3, 0.1, 0.1; 0.1, 0.15, 0.4, 0.35].
The model reasonably recovers the true parameters:
Altogether, my question is,
is this approach justifiable in this specific constraint?
Thank you in advance,
Minho