Gaussian HMM does not converge and has large Rhat

I’m trying to create a hidden Markov model with Gaussian emissions and fit it with MCMC.

The observation sequence is generated as follows:

p_transition <- matrix(data = c(0.2, 0.4, 0.1, 0.6, 0.2, 0.5, 0.2, 0.4, 0.4), nrow = 3)
p_init <- c(0.24561404, 0.40350877, 0.35087719)

Markov_sequence_length <- 1000

init_state <- which(c(rmultinom(1, 1, p_init)) == 1)

Markov_sequence <- c(init_state)

for(i in 2:Markov_sequence_length){  # init_state is already the first state, so generate N - 1 more
  
  last_state <- tail(Markov_sequence, 1)
  
  state_sample <- which(c(rmultinom(1, 1, p_transition[last_state, ])) == 1)
  
  
  Markov_sequence <- append(Markov_sequence, state_sample)
}

mus <- c(0.5, 0, -2)
sigmas <- c(1, 0.5, 3)

obs <- c()

for(m in 1:Markov_sequence_length){
  
  new_val <- rnorm(1, mean = mus[Markov_sequence[m]], sd = sigmas[Markov_sequence[m]])
  obs <- append(obs, new_val)
  
}
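A quick sanity check on the simulator (my addition, not in the original post): with a long enough chain, the empirical transition frequencies should be close to p_transition.

```r
set.seed(1)
p_transition <- matrix(c(0.2, 0.4, 0.1, 0.6, 0.2, 0.5, 0.2, 0.4, 0.4), nrow = 3)
p_init <- c(0.24561404, 0.40350877, 0.35087719)
n <- 100000

# simulate a long chain with the same mechanism as above
chain <- numeric(n)
chain[1] <- which(rmultinom(1, 1, p_init) == 1)
for (i in 2:n) {
  chain[i] <- which(rmultinom(1, 1, p_transition[chain[i - 1], ]) == 1)
}

# rows: current state, columns: next state, normalised to row frequencies
emp <- prop.table(table(chain[-n], chain[-1]), margin = 1)
max(abs(emp - p_transition))   # should be close to 0
```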

The model in Stan is defined as follows:

data {
  int<lower = 0> N;
  int<lower = 1> Z;
  real y[N];
}

parameters {
  
  real mu[Z];
  real<lower = 0> mu_tau[Z];
  
  real<lower=0> sigma[Z];
  
  simplex[Z] p_transition[Z];
  simplex[Z] p_init; 
}

model {
  
  // priors
  mu_tau ~ scaled_inv_chi_square(1, 0.05);
  mu ~ normal(0, mu_tau);
  
  sigma ~ scaled_inv_chi_square(1, 0.05);
  
  // forward algorithm
  {
    real acc[Z];
    real alpha[N, Z];
    
    for (z in 1:Z){
      
      alpha[1, z] = normal_lpdf(y[1] | mu[z], sigma[z]) + log(p_init[z]);
      
    }
    
    for (t in 2:N){
      
      for (z in 1:Z){       // current state
        
        for (j in 1:Z){     // previous state
          
          acc[j] = alpha[t - 1, j] + log(p_transition[j, z]) + normal_lpdf(y[t] | mu[z], sigma[z]);
          
        }
        
        alpha[t, z] = log_sum_exp(acc);
        
      }
    }
    
    target += log_sum_exp(alpha[N]);
  }
}
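As a numerical check of the forward recursion (my addition, plain base R mirroring the Stan code): for a short sequence the marginal likelihood can also be computed by brute-force summation over all Z^N state paths, and the two log-likelihoods should agree.

```r
log_sum_exp <- function(x) {
  m <- max(x)
  m + log(sum(exp(x - m)))
}

# forward algorithm, same recursion as in the Stan model block
forward_loglik <- function(y, mu, sigma, P, p_init) {
  N <- length(y); Z <- length(mu)
  alpha <- matrix(NA_real_, N, Z)
  for (z in 1:Z)
    alpha[1, z] <- dnorm(y[1], mu[z], sigma[z], log = TRUE) + log(p_init[z])
  for (t in 2:N) {
    for (z in 1:Z) {
      acc <- alpha[t - 1, ] + log(P[, z]) + dnorm(y[t], mu[z], sigma[z], log = TRUE)
      alpha[t, z] <- log_sum_exp(acc)
    }
  }
  log_sum_exp(alpha[N, ])
}

# brute force: sum the joint density over every possible state path
brute_loglik <- function(y, mu, sigma, P, p_init) {
  N <- length(y); Z <- length(mu)
  paths <- as.matrix(expand.grid(rep(list(1:Z), N)))
  lp <- apply(paths, 1, function(s) {
    l <- log(p_init[s[1]]) + dnorm(y[1], mu[s[1]], sigma[s[1]], log = TRUE)
    for (t in 2:N)
      l <- l + log(P[s[t - 1], s[t]]) + dnorm(y[t], mu[s[t]], sigma[s[t]], log = TRUE)
    l
  })
  log_sum_exp(lp)
}

P <- matrix(c(0.2, 0.4, 0.1, 0.6, 0.2, 0.5, 0.2, 0.4, 0.4), nrow = 3)
p_init <- c(0.24561404, 0.40350877, 0.35087719)
mus <- c(0.5, 0, -2); sigmas <- c(1, 0.5, 3)
y <- c(0.3, -1.8, 0.1, 2.0)    # tiny test sequence so 3^4 paths is feasible
all.equal(forward_loglik(y, mus, sigmas, P, p_init),
          brute_loglik(y, mus, sigmas, P, p_init))
```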

Then, the model is fitted in R:

stan_data <- list(N = length(obs),
                  Z = 3,
                  y = obs)

fit <- stan("Wired_HMM.stan",
            data = stan_data,
            iter = 10000,
            control = list(adapt_delta = 0.9),
            chains = 4)

It gives the results:

       mean se_mean   sd  2.5% 97.5% n_eff Rhat
mu[1] -0.37    0.58 0.85 -2.09  0.57     2 4.02
mu[2] -0.75    0.73 1.07 -2.26  0.64     2 3.85
mu[3] -0.36    0.57 0.82 -2.05  0.57     2 5.26

The 95% intervals do include my predefined mus (0.5, 0, -2), but they are very wide and the Rhats are far from 1. Did I mess up the HMM somewhere? How can I fix the model?

Since n_eff is estimated to be 2, this hints that the posterior has multiple modes and the chains are ending up in different ones. HMMs are likely to have multiple modes. Check the traceplots; see, e.g., mcmc_trace in bayesplot.
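A common remedy for the label switching behind such multimodal chains (my suggestion, not from the reply above) is to break the symmetry between the states, e.g. by declaring the means as ordered:

```stan
parameters {
  ordered[Z] mu;                 // mu[1] < mu[2] < ... removes the label-switching symmetry
  real<lower = 0> mu_tau[Z];
  real<lower = 0> sigma[Z];
  simplex[Z] p_transition[Z];
  simplex[Z] p_init;
}
```

The rest of the model can stay as is; the ordering just pins each state to one mode of the posterior.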

Thanks! By the way, does my forward algorithm look correct?

I’ve not done HMMs myself, so I can’t say. I also recommend checking the built-in HMM functions (introduced two years ago):

Oops, forgot to paste the link: HMM Example
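For reference, a sketch of what using those built-ins would look like (based on my reading of the Stan 2.24+ function reference, untested here; the hand-written forward loop in the model block is replaced by hmm_marginal):

```stan
model {
  // ... same priors as before ...
  matrix[Z, N] log_omega;   // log emission densities: state z, observation n
  matrix[Z, Z] Gamma;       // Gamma[i, j] = P(state j at t | state i at t - 1)
  for (n in 1:N)
    for (z in 1:Z)
      log_omega[z, n] = normal_lpdf(y[n] | mu[z], sigma[z]);
  for (z in 1:Z)
    Gamma[z] = p_transition[z]';
  target += hmm_marginal(log_omega, Gamma, p_init);
}
```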

Thanks