HMM gets stuck from the very beginning of sampling

Dear Stan programmers,

I’m running a series of HMMs with different likelihood specifications; none of them has a complex structure. The basic one works, but the versions with Poisson and tobit likelihoods always get stuck from the very beginning of sampling. Additionally, when I added in-sample and out-of-sample predictions to the generated quantities block, the basic one also got stuck. Compilation succeeds with no errors reported, and I’ve tried both rstan and cmdstanpy. The data are not big.

For example, below is the Poisson one:


data {
  int<lower=1> S;                            // Number of states
  int<lower=1> I;                            // Number of individuals
  array[I] int<lower=1> T;                   // Number of calibration periods for each individual
  array[I, max(T)] int<lower=0> Y;           // Observed behavior for calibration periods
  array[I] int<lower=1> T_test;              // Number of testing periods for each individual
  array[I, max(T_test)] int<lower=0> Y_test; // Observed behavior for testing periods
}

parameters {
  // initial state probs
  vector[S-1] rho_raw;                       // Raw parameters for initial state probabilities

  // transition matrix
  row_vector[S*(S-1)] phi_raw;               // Raw parameters for transition probabilities (population baseline)
  matrix[S*(S-1), I] z_q;                    // Indep std normals for transition probabilities
  cholesky_factor_corr[S*(S-1)] L_corr_Sigma_q;   // Cholesky decomposition of correlation matrix for transition parameters
  vector<lower=0, upper=pi()/2>[S*(S-1)] tau_unif;  // Prior scale for transition parameters

  // emission
  vector<lower=0>[S] beta_0_raw;             // Raw positive increments for the ordered baseline emission means
  array[I] real<lower=0> beta_1;                           // Individual-random-effect scale part of emission rate
  real<lower=0> sigma_beta_1;                // Population standard deviation for scale part of emission rate: beta_1
}

transformed parameters {
  simplex[S] pi;                             // Initial state probabilities
  vector<lower=0>[S*(S-1)] tau = 2.5 * tan(tau_unif); // Standard deviation for transition parameters
  matrix[I, S*(S-1)] theta;                  // Individual transition parameters
  array[I] matrix[S, S] Q;                   // Transition matrices for each individual
  vector[S] beta_0;                          // Ordered state-dependent population baseline emission rate
  array[I] vector[S] lambda;                 // Individual-specific random effect for emission rate

  // Compute initial state probabilities
  pi = softmax(append_row(rho_raw, 0));

  // Compute individual transition parameters and transition matrices
  theta = rep_matrix(phi_raw, I) + (diag_pre_multiply(tau, L_corr_Sigma_q) * z_q)';  // matrix addition
        // rep_matrix(phi_raw, I) returns a matrix[I, S*(S-1)]

  for (i in 1:I) {
    for (k in 1:S) {
      vector[S] ttheta = append_row(to_vector(theta[i, ((S-1) * (k-1) + 1):((S-1) * k)]), 0);
      Q[i][k] = to_row_vector(softmax(ttheta));  // Convert to row_vector and assign
    }
  }

  // Compute ordered baseline emission means
  beta_0[1] = beta_0_raw[1];
  for (s in 2:S) {
    beta_0[s] = beta_0[s-1] + beta_0_raw[s];
  }

  // Compute individual-specific emission means
  for (i in 1:I) {
    lambda[i] = beta_0 * beta_1[i];                // Vector-scalar product: lambda[i, k] = beta_0[k] * beta_1[i]
  }
}

model {
  // Priors
  rho_raw ~ normal(0, 5);                              // Prior for initial state parameters
  to_vector(z_q) ~ std_normal();                       // Implies theta ~ MVN(phi_raw, Sigma_q)
  to_vector(phi_raw) ~ normal(0, 5);                   // Prior for transition matrix parameters
  L_corr_Sigma_q ~ lkj_corr_cholesky(2);               // LKJ prior for Cholesky factor

  beta_0_raw ~ lognormal(0, 3);                        // Prior for emission means
  sigma_beta_1 ~ cauchy(0, 2.5);                       // Population standard deviation for beta_1
  beta_1 ~ lognormal(0, sigma_beta_1);                 // Individual-specific scale part for emission rate

  // Likelihood (Forward algorithm for calibration period)
  for (i in 1:I) {
    array[T[i]] vector[S] gamma;
    array[S] real acc;

    // Initialize the first time step
    for (k in 1:S) {
      gamma[1, k] = log(pi[k]) + poisson_lpmf(Y[i, 1] | lambda[i, k]);
    }

    // Forward algorithm over time steps (up to T[i])
    for (t in 2:T[i]) {
      for (k in 1:S) {
        for (j in 1:S) {
          acc[j] = gamma[t-1, j] + log(Q[i, j, k]) + poisson_lpmf(Y[i, t] | lambda[i, k]);
        }
        gamma[t, k] = log_sum_exp(acc);
      }
    }

    target += log_sum_exp(gamma[T[i]]);
  }
}

generated quantities {
  simplex[S] pi_generated = pi;
  matrix[S*(S-1), S*(S-1)] Sigma_q;
  array[I] matrix[S, S] Q_generated = Q;
  vector[I] log_lik;
  vector[I] log_lik_test;
  vector[I] log_p_y_star;
  array[I, max(T)] int y_star;  // Viterbi state sequences, padded to length max(T)

  // Reconstruct covariance matrix
  Sigma_q = quad_form_diag(L_corr_Sigma_q, tau);

  for (i in 1:I) {
    array[T[i]] vector[S] gamma;
    array[T[i]] vector[S] best_logp_calib;
    array[T[i], S] int bacS_ptr_calib;
    array[T_test[i]] vector[S] gamma_test;
    array[S] real acc;

    // Forward algorithm for calibration periods (log-likelihood)
    for (k in 1:S) {
      gamma[1, k] = log(pi[k]) + poisson_lpmf(Y[i, 1] | lambda[i, k]);
      best_logp_calib[1,k] = gamma[1, k];
    }

    for (t in 2:T[i]) {
      for (k in 1:S) {
        best_logp_calib[t, k] = negative_infinity();
        for (j in 1:S) {
          real logp = best_logp_calib[t-1, j] + log(Q[i, j, k]) + poisson_lpmf(Y[i, t] | lambda[i, k]);
          acc[j] = gamma[t-1, j] + log(Q[i, j, k]) + poisson_lpmf(Y[i, t] | lambda[i, k]);
          if (logp > best_logp_calib[t, k]) {
            bacS_ptr_calib[t, k] = j;
            best_logp_calib[t, k] = logp;
          }
        }
        gamma[t, k] = log_sum_exp(acc);
      }
    }
    log_p_y_star[i] = max(best_logp_calib[T[i]]);  // Log probability of the most likely state sequence (Viterbi)

    for (k in 1:S)
      if (best_logp_calib[T[i], k] == log_p_y_star[i])
        y_star[i, T[i]] = k;  // Most likely state (Viterbi) at time T[i]

    for (t in 1:(T[i] - 1))
      y_star[i, T[i] - t] = bacS_ptr_calib[T[i] - t + 1, y_star[i, T[i] - t + 1]];

    // Ensure y_star values are within bounds (1 to S)
    for (t in 1:T[i]) {
      if (y_star[i, t] < 1) {
        y_star[i, t] = 1;
      } else if (y_star[i, t] > S) {
        y_star[i, t] = S;
      }
    }

    // Optionally set invalid entries in y_star to a default value (e.g., 0)
    for (t in (T[i] + 1):max(T)) {
      y_star[i, t] = 0;
    }

    log_lik[i] = log_sum_exp(gamma[T[i]]);

    // Forward algorithm for testing periods (log-likelihood)
    for (k in 1:S) {
      for (j in 1:S) {
        acc[j] = gamma[T[i], j] + log(Q[i, j, k]) + poisson_lpmf(Y_test[i, 1] | lambda[i, k]);
      }
      gamma_test[1, k] = log_sum_exp(acc);
    }
    for (t in 2:T_test[i]) {
      for (k_test in 1:S) {
        for (j in 1:S) {
          acc[j] = gamma_test[t - 1, j] + log(Q[i, j, k_test]) + poisson_lpmf(Y_test[i, t] | lambda[i, k_test]);
        }
        gamma_test[t, k_test] = log_sum_exp(acc);
      }
    }
    log_lik_test[i] = log_sum_exp(gamma_test[T_test[i]]);
  }
}

Thank you very much in advance.

Best,
Fay

Iterative operations are very expensive in terms of runtime, especially when nested like this. Have you tried setting Stan to report on every warm-up/sampling iteration (e.g. `refresh = 1` in rstan or cmdstanpy, instead of the default of every 100) to see whether it’s just running very slowly rather than actually stuck?
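
One concrete saving in those nested loops: the emission term `poisson_lpmf(Y[i, t] | lambda[i, k])` does not depend on `j`, so it can be added once after the `log_sum_exp` instead of inside the innermost loop, since log_sum_exp over j of (a_j + c) equals c + log_sum_exp over j of a_j. That cuts the lpmf evaluations per time step from S² to S. A minimal pure-Python sketch of that identity, with made-up toy numbers rather than anything from your actual model:

```python
import math

def log_sum_exp(xs):
    # Numerically stable log(sum(exp(x) for x in xs)).
    m = max(xs)
    return m + math.log(sum(math.exp(x - m) for x in xs))

# Toy forward-step quantities: gamma[t-1, j] + log(Q[i, j, k]) for j = 1..S.
prev = [-1.2, -0.7, -2.3]
# Emission log-probability for state k; the same constant for every j.
emit = -0.9

inside = log_sum_exp([a + emit for a in prev])   # emission added inside the inner loop
hoisted = log_sum_exp(prev) + emit               # emission added once, after log_sum_exp

assert abs(inside - hoisted) < 1e-12
```

The same hoisting applies to the forward loops in your generated quantities block, and precomputing `log(Q[i])` once per individual instead of calling `log()` inside every (t, k, j) iteration saves more still.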