Hi all,

I’m working on a hidden markov model to analyze multi-state capture-recapture data when there is uncertainty in the assignment of an individual’s state upon capture. In the ecological literature these are referred to as “multi-event” models. The transition and emission matrices are parametrized in terms of demographic rates and detection probabilities.

More concretely, my animals can be in three states: juveniles, adults or dead (coded 1,2,3 resp.). The observations take place in two periods each year, during the breeding season, and outside the breeding season. Thus there are four possible observable events for an individual in a given year: observed during the breeding season only, observed outside the breeding season only, observed in both periods, or neither (coded 1,2,3,4 resp.). The data consists of a capture history matrix, with each row corresponding to an individual, each column to a year, and each entry giving the individual’s observation code in that year.

Here’s my model, with the forward algorithm to implemented as described in the Stan User Manual’s page on HMMs:

```
data {
int<lower=2> T; // number of years
int<lower=1> N; // number of individuals
int<lower=1,upper=4> y[N,T]; // capture histories
}
parameters {
// survival probabilities
real<lower=0,upper=1> psi1; // juveniles
real<lower=0,upper=1> psi2; // adults
// recruitment probability
real<lower=0,upper=1> beta;
// detection probabilities
real<lower=0,upper=1> q1; // juveniles outside breeding season
real<lower=0,upper=1> q2; // adults outside br. season
real<lower=0,upper=1> p; // adults during br. season
}
transformed parameters {
simplex[3] theta[3]; // transition probabilities
simplex[4] phi[3]; // emission probabilities
// define theta
theta[1] = [psi1*(1-beta), psi1*beta, 1-psi1]';
theta[2] = [ 0, psi2, 1-psi2]';
theta[3] = [ 0, 0, 1]';
//define phi
phi[1] = [ 0, q1, 0, 1-q1]';
phi[2] = [p*(1-q2), (1-p)*q2, p*q2, (1-p)*(1-q2)]';
phi[3] = [ 0, 0, 0, 1]';
}
model {
// default priors
// likelihood
for (n in 1:N) { // loop over individuals
matrix[T, 3] gamma;
vector[3] acc;
// initialise forward algorithm at t = 1
for (k in 1:3) {
gamma[1, k] = log(phi[k,y[n,1]]);
}
// recursion
for (t in 2:T) {
for (k in 1:3) { // k is hidden state at t
for (j in 1:3) { // j is hidden state at t-1
acc[j] = gamma[t-1,j] + log(theta[j, k])+ log(phi[k, y[n,t]]);
}
gamma[t, k] = log_sum_exp(acc);
}
}
target += log_sum_exp(gamma[T]);
}
}
```

And here is the R code I used to simulate data under the model:

```
### Set values for parameters and transformed parameters ###
# parameters
psi1 <- 0.77
psi2 <- 0.9
beta <- 0.33
q1 <- 0.7
q2 <- 0.8
p <-0.8
# transition matrix
theta <- matrix(c(psi1*(1-beta), psi1*beta, 1-psi1,
0, psi2, 1-psi2,
0, 0, 1), byrow=T, nrow=3)
# emission matrix
phi <- matrix(c( 0, q1, 0, 1-q1,
p*(1-q2), (1-p)*q2, p*q2, (1-p)*(1-q2),
0, 0, 0, 1), byrow=T, nrow=3)
### Simulate data ###
# Assume the individuals are all detected as juveniles at time 1
T <- 10
N <- 100
# Simulate hidden states
z <- matrix(NA,nrow=N,ncol=T)
z[,1] <- rep(1,N)
for (n in 1:N){
for (t in 1:(T-1)){
z[n,t+1] <- sample(1:3,1,prob=theta[z[n,t],])
}
}
# Simulate observations
y <- matrix(4, N,T)
for (n in 1:N){
for (t in 1:T){
y[n,t] <- sample(1:4,1,prob=phi[z[n,t],])
}
}
# Remove individuals that are never observed
observed <- which(apply(y,1,min)!=4)
y <- y[observed,]
z <- z[observed,]
# Fit model
N <- dim(y)[1]
T <- dim(y)[2]
input_data <- list(T=T,N=N,y=y)
fit <- stan("model.stan",data=input_data)
```

The model fails to initialize: the initial values are repeatedly rejected, with the error “Gradient evaluated at initial value is not finite.” I don’t understand why?

All the parameters are appropriately constrained, and I think all the simplices in the transformed parameters block are valid. Setting “inits=0” in the call to stan() does not help.

Perhaps the issue relates to log(0) values that occur in the forward algorithm, e.g. for log(theta[2,1])? These negative infinity values should behave the right way as far as the calculation of the likelihood goes, but perhaps they are a problem for the gradient?

Any insights or suggestions would be much appreciated!