Constraints fail on transformed parameter block in hierarchical HMM model


Hi All,
I am trying to expand Hidden Markov Models to fit to typical movement data. The starting point is based on earlier discussions between @bgoodri and @Ben_Lambert. One extension that I am trying is to fit one hierarchical model to several animals, allowing the parameters that govern movement (ie step length and turning angle) to vary between the individuals.
The issue that I am running into is that initialization fail despite setting constraints to parameters in the transformed parameter block.
Here is the model I tried: hmm_covariates_hier.stan (5.1 KB)

And some code to generate data:


n_obs <- 200
ss <- simData(nbAnimals = 5,nbStates = 2,stepDist = "exp",
              angleDist = "wrpcauchy",stepPar = c(2,0.1),anglePar = c(0,1,0.15,0.75),
              obsPerAnimal = n_obs)
data <- list(N=(n_obs-2)*5,turn=ss$angle[!$angle)],

m <- stan("hmm_covariates_hier.stan",data=data,control=list(adapt_delta=0.99))
#this result in many initial error:

Error evaluating the log probability at the initial value.
validate transformed params: rho_ind[3] is -4.38754, but must be greater than or equal to 0
Rejecting initial value:
Error evaluating the log probability at the initial value.
validate transformed params: mu_ind[5] is -4.66398, but must be greater than or equal to -3.14159

Initialization between (-2, 2) failed after 100 attempts. 
Try specifying initial values, reducing ranges of constrained values, or reparameterizing the model.

I tried to play a bit around setting an initialization function, but this did not solve anything.
Am I missing something?


So far all that’s missing is the error message.


Thanks for your fast answer, updated the original post.


So I can’t open a .Stan file on my phone easily but that’s telling you that you have a (transformed?) parameter with a constraint that is not being met. Are you missing an explanation transformation in an assignment?


Ugh, exp not explanation


It looks like both rho_ind, alpha_ind, and mu_ind are all multiplied by unbounded parameters (error), which would cause them to take values outside of the constraints specified by their declarations:

parameters {
  matrix[S,K] error[3];

transformed parameters {
  matrix<lower=-pi(),upper=pi()>[S,K] mu_ind;
  matrix<lower=0>[S,K] rho_ind;
  matrix<lower=0>[S,K] alpha_ind;
  mu_ind = rep_matrix(mu + rep_vector(sigma_mu, K), S)' .* error[1,,];
  rho_ind = rep_matrix(rho + rep_vector(sigma_rho, K), S)' .* error[2,,];
  alpha_ind = rep_matrix(alpha + rep_vector(sigma_alpha, K), S)' .* error[3,,];
model {
  to_vector(error[1,,]) ~ normal(0,1);
  to_vector(error[2,,]) ~ normal(0,1);
  to_vector(error[3,,]) ~ normal(0,1);

So, as @sakrejda suggests, maybe using some kind of link function or transformation to satisfy the lower and upper bounds of support would be a good solution.


Ah yes, good point. Here is how I re-wrote the model:

transformed parameters {
  mu_ind = sigma_mu * error[1,,]; //mu_ind is N(0,sigma_mu)
  mu_ind = (2 * pi() / (max(mu_ind) - min(mu_ind))) * (mu_ind - max(mu_ind)) + pi(); //re-scaling between   -pi and pi
  mu_ind = mu_ind + rep_matrix(mu,S)'; //mu_ind is N(mu,sigma_mu) and scaled between -pi ad pi

  rho_ind = rep_matrix(rho + rep_vector(sigma_rho, K), S)' .* exp(error[2,,]);

  alpha_ind = rep_matrix(alpha + rep_vector(sigma_alpha, K), S)' .* exp(error[3,,]);

The model runs despite some initialization failures for the mu_ind parameter, I guess that I am somehow doing the math wrong there.

Now the issue is that the chains show all kind of bad behaviors (no mixing, non-stationarity for some of them, divergence, issue with treedepth …). So I wonder if this is an issue with this parametrization, or if this particular type of model is tricky and I should feed in more (simulated) data.


If it’s not a supervised HMM, you run into all the same problems as with other mixture models. You might want to take a look at Michael Betancourt’s mixture case study.

Initialization failure winds up happening either due to missing constraints (some values should have non-zero density but don’t) or due to numerical issues.