Hidden Markov Model with Binomial outcome: How to infer a state-dependent intercept when the outcome is constant within a state

I am using a Hidden Markov Model to model a binary outcome over time (e.g., on a given day, does an individual check the number of Covid-19 infections or not). The model is set up essentially as proposed in this great case study. I ran this model assuming the existence of 3 latent states, reflecting an individual’s current propensity to check the number of infections.

What’s happening is that the state-dependent intercept of state 3 (mu[3]; the state of highest checking propensity) in the Binomial part of the model goes towards positive infinity. The problem the model is facing is that in state 3, the dependent variable always takes the value 1, i.e., the individual always checks the number of infections when she is in state 3.
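To see why this happens: when every observation attributed to a state equals 1, the Bernoulli log-likelihood for that state is strictly increasing in the intercept, so the likelihood alone has no finite maximum. A minimal plain-Python sketch (the sample size n = 50 is arbitrary, chosen just for illustration):

```python
import math

def loglik_all_ones(mu, n=50):
    # log-likelihood of n successes out of n trials under Bernoulli(inv_logit(mu)):
    # n * log(inv_logit(mu)) = -n * log(1 + exp(-mu))
    return -n * math.log1p(math.exp(-mu))

# The log-likelihood keeps rising towards 0 as mu grows; it never peaks.
for mu in [0.0, 2.0, 5.0, 10.0]:
    print(mu, loglik_all_ones(mu))
```

Without a prior (or with one that is effectively absent), the sampler will chase this ever-increasing likelihood towards infinity.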

Since the model cannot infer the intercept when the outcome is constant, how can I prevent the state-dependent intercept from drifting off to infinity?

I am showing the model parts that might be relevant to answer this question (I have simplified it a bit by not accounting for the heterogeneity across individuals):

data {
  int<lower = 1> N;                // number of observations
  int<lower = 0, upper = 1> y[N];  // binary outcome: infection check "yes" (1) vs. "no" (0)
  int<lower = 1> S;                // number of states
}
parameters {
  simplex[S] theta[S];             // rows of the transition matrix
  ordered[S] mu;                   // ordered state-dependent intercepts
}
model {

  // priors
  mu[1] ~ normal(-1, 1);
  mu[2] ~ normal(1, 1);
  mu[3] ~ normal(3, 1);

  // forward algorithm
  real acc[S];
  real gamma[N, S];
  for (k in 1:S)
    gamma[1, k] = bernoulli_logit_lpmf(y[1] | mu[k]);
  for (t in 2:N) {
    for (k in 1:S) {
      for (j in 1:S)
        acc[j] = gamma[t - 1, j] + log(theta[j, k]) + bernoulli_logit_lpmf(y[t] | mu[k]);
      gamma[t, k] = log_sum_exp(acc);
    }
  }
  target += log_sum_exp(gamma[N]);
}

This plot visualises the issue that for state 3, the observed output is always y = 1 and never y = 0.

And here is the posterior output after running my model on my real data (I have validated it using simulated data before and that worked pretty well):


I’m not an expert on hidden Markov models, but it looks to me like something has gone awry here. The issue you are dealing with feels a lot like complete separation in logistic regression. In Bayesian analysis, a common solution is to regularize the coefficient (in this case mu[3]) via the prior.

What feels weird to me is that you have a prior on mu[3], but your output still lets mu[3] fly off towards infinity. This shouldn’t happen. By the time mu[3] reaches 10, the probability of success (i.e., inverse_logit(mu[3])) is within about 5e-05 of one. At this point, unless you have millions of data points, there is no way for the likelihood to suggest with any confidence that mu[3] should be higher than it is. The prior, on the other hand, has some very strong opinions about how high mu[3] can get. For example, it sees mu[3] = 20 as something like 1e-50 times less likely than mu[3] = 10.
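Both numbers in that argument are easy to check numerically; a quick plain-Python sketch (no Stan required):

```python
import math

def inv_logit(x):
    return 1.0 / (1.0 + math.exp(-x))

# At mu[3] = 10 the success probability is essentially saturated:
gap = 1.0 - inv_logit(10.0)          # distance from 1, roughly 4.5e-05
print(gap)

def normal_lpdf(x, m, s):
    # log density of normal(m, s) at x
    return -0.5 * ((x - m) / s) ** 2 - math.log(s * math.sqrt(2.0 * math.pi))

# How much less likely is mu[3] = 20 than mu[3] = 10 under normal(3, 1)?
log_ratio = normal_lpdf(20.0, 3.0, 1.0) - normal_lpdf(10.0, 3.0, 1.0)
print(math.exp(log_ratio))           # on the order of 1e-52
```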

So there’s no way that your model should place noteworthy posterior probability at high values of mu[3], where “high” depends on your sample size but probably doesn’t exceed 7 or 8. The fact that your model is seeing values as high as infinity (more or less) suggests that you somehow failed to apply the prior to mu[3] when you ran the code. Is it possible, for example, that you originally coded a two-state model and then updated the script for a three-state model without remembering to add the prior on mu[3]? I realize that there’s a prior on mu[3] in the code you provided above…


Thanks Jacob, this is helpful. I will double check if the prior was applied correctly to mu[3] in this estimation.

Do you have a recommendation for the choice of the prior? Would it make more sense to use something like mu[3] ~ normal(0, 1), mu[3] ~ normal(0, 5), or mu[3] ~ normal(3, 1) in this case, or something else entirely?

I’m afraid I don’t know enough about hidden Markov models (or about your research area and your data) to provide a good suggestion for the prior. I will say that by putting a prior of normal(3, 1) on mu[3] you are probably ensuring that there will be at least one state (state 3) where the probability of success is quite high: on the probability scale, the +/- two-sigma interval of that prior corresponds to [.73, .99]. Whether this is a good idea or not I cannot say.


I agree with @jsocolar on most of the prior recommendations. If unsure, you can always do a prior predictive check (i.e., use the prior to simulate new datasets, as discussed e.g. in the Visualisation paper or the Bayesian workflow paper).
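As a minimal version of such a check for just the intercept, you can push draws from each candidate prior through the inverse logit and look at the checking probabilities they imply. A plain-Python sketch (the three candidate priors come from the question above; the seed and the 10,000 draws are arbitrary):

```python
import math
import random

def inv_logit(x):
    return 1.0 / (1.0 + math.exp(-x))

random.seed(1)

# Candidate priors for mu[3] from the discussion above: (mean, sd)
priors = {"normal(0,1)": (0.0, 1.0),
          "normal(0,5)": (0.0, 5.0),
          "normal(3,1)": (3.0, 1.0)}

for name, (m, s) in priors.items():
    draws = sorted(inv_logit(random.gauss(m, s)) for _ in range(10000))
    lo, hi = draws[250], draws[9750]  # central 95% interval
    print(f"mu[3] ~ {name}: implied P(check) 95% interval ~ [{lo:.2f}, {hi:.2f}]")
```

A wide prior like normal(0, 5) piles most of the implied probabilities near 0 and 1, which is often not what one intends; this kind of check makes such behaviour visible before fitting.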

I would however ask you to also consider other options. The simplest one is to follow the data: maybe you can change the meaning of state 3 to be “always checks” and remove mu[3] altogether? Further, I find it slightly implausible that HMMs are actually a good model for this type of data - why would there be only three separate probabilities for checking the number of cases? Wouldn’t a smooth model (e.g. a state-space model or a spline/GP model) make more sense?
Alternatively, you could also model the process with only two states (always checks / never checks) and allow the transition probabilities to change over time.
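If you go the “state 3 always checks” route, the change to the likelihood is small: that state’s emission probability becomes exactly 1 for y = 1 (log-probability 0) and 0 for y = 0, and mu[3] disappears. A plain-Python sketch of the forward algorithm with such a degenerate state, mirroring the Stan code in the question (flat initial state weights; the function names, transition matrix, and intercept values are all made up for illustration):

```python
import math

def log_sum_exp(xs):
    m = max(xs)
    return m if m == -math.inf else m + math.log(sum(math.exp(x - m) for x in xs))

def bernoulli_logit_lpmf(y, mu):
    # log P(y | inv_logit(mu)), computed stably
    return -math.log1p(math.exp(-mu)) if y == 1 else -math.log1p(math.exp(mu))

def hmm_loglik(y, theta, mu, always_state=None):
    """Forward-algorithm log-likelihood. If always_state is given, that
    state emits y = 1 with probability 1, so it needs no intercept."""
    S = len(theta)
    def emit(y_t, k):
        if k == always_state:
            return 0.0 if y_t == 1 else -math.inf
        return bernoulli_logit_lpmf(y_t, mu[k])
    gamma = [emit(y[0], k) for k in range(S)]
    for t in range(1, len(y)):
        gamma = [log_sum_exp([gamma[j] + math.log(theta[j][k]) for j in range(S)])
                 + emit(y[t], k) for k in range(S)]
    return log_sum_exp(gamma)

# Hypothetical transition matrix; state 3 (index 2) has no intercept at all
theta = [[0.80, 0.15, 0.05],
         [0.10, 0.80, 0.10],
         [0.05, 0.15, 0.80]]
print(hmm_loglik([1, 0, 1, 1, 1], theta, [-1.0, 1.0, None], always_state=2))
```

In Stan the same idea amounts to declaring only S - 1 intercepts and replacing the bernoulli_logit_lpmf term for the top state with `y[t] == 1 ? 0 : negative_infinity()`.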

Best of luck with your analysis!


Thank you so much, that was very helpful!

The suggestion that a smooth model would make more sense is probably right - however, I do not plan to leave the HMM route for now. Your alternative (2 states, time-varying transition probabilities) is what I’m using at the moment - thank you!