I’m attempting to model animal movement based on a regular time series of GPS and accelerometer measurements from GPS-collared terrestrial animals. I have a total of 21 tracks (track being a regular time-series without any gaps representing a single deployment on a single animal). I’m basing a lot of my code on @vianeylb and Theo Michelot’s “Intro to Animal Movement Modelling…”, with some differences. For example, at this stage I am modelling each track with stationary parameters. That is for each animal I have single constant transition probability matrix per track and for the emission probabilities of each state I have a single constant parameter set for each distribution of the observed quantities. My observed quantities are 1) an “activity” measurement based on a summary of the accelerometer data which is bounded between 0 and 1 and is modeled as Beta distributed; and 2) step length which the distance traveled in meters and is bounded to be positive and is modeled as Gamma distributed. I am only modelling two states.
So this is a pretty simple model at this point. For each track, I am estimating 10 parameters: the constant probability of remaining in state 1 or in state 2 (2 parameters), the logit of the mean of each Beta distribution for each state (2 parameters), the log of the precision of each Beta distribution (2 parameters) (Note: I’m following Cribari-Neto and Zeilis for the Beta specification), the log the Gamma mean (2 parameters) and the log the Gamma standard deviation (2 parameters). However, I am anticipating a hierarchical model for the 21 tracks. For example, I will eventually have a hyper Beta logit-mean distribution and each individual track’s Beta logit-mean would be drawn from this hyper distribution.
I have modeled code that is running well for fitting each track on it’s own. By virtue of using an ordered
type for the Beta logit mean, and the fact that my data are nicely behaved and seem pretty easy to separate, I’m not really running into any identifiability issues as outlined in @betanalpha’s Identifying Mixture Models vignette. Only the Beta logit mean is of the ordered
type for two reasons, 1) for almost every animals there is pretty strong signal of two modes in the activity measurement and 2) while shorter step-lengths tend to be associated with smaller activity values because of GPS error I do not have a step-length observation for each point in the time-series and I will sometimes have large values of step-length even with small activity values.
However, when I attempt to fit what is essentially the same model but run for multiple individuals all in one shot, I run into identifiability issues where one or more chains gets stuck on a different solution. For example, I’ve attached the traceplot for an example I ran where I am attempting to fit a model to four individual tracks all in one go using 3 chains.
For track 1 (beta_mu_logit[1,1]
and beta_mu_logit[1,2]
) chains 2 and 3 converge on the same solution and this solution agrees with the estimates I get when I run my single-track model for this track. Chain 1 on the otherhand is getting stuck on a solution that finds basically no difference in this parameter between the two states. Weirdly, it does not seem to be respecting the ordered
type for this chain: the c_summary
for chain 1 give the following means:
parameter mean
beta_mu_logit[1,1] -1.5542648
beta_mu_logit[1,2] -1.5529320
But for track 2, (beta_mu_logit[1,1]
and beta_mu_logit[1,2]
) only chain 3 finds a solution that agrees with my estimates from the single track model. Chains 1 and 2 once again hone in a solution that fits essentially no difference between the states. For track 3, it’s chain 2 that is off, but chain 1 and 3 are correct, and finally for track 4 it’s chain 3 that is off. In all cases the chain that is misbehaving is essentially fitting a constant value for each of the two states.
I’m wondering if I’m running to an issue by using an array of ordered
type in my multiple individual model. I’m confident that the log-likelihood math for each individual track is identical between the two model. At this point I’m fitting everything separately so information shouldn’t be shared between tracks (I’d like to go there, but I need to solve this issue first!). What can I try here?
Below is model code comparing the “single-track” model to the multiple individual model. Unfortunately, I cannot share the data at this point.
data block
Single Track:
data {
int<lower=0> T; // length of complete time series
int<lower=0, upper = T> T_obs_stp; // number of observed step lengths
int<lower=1, upper = T> index_obs_stp[T_obs_stp];
vector<lower = 0>[T_obs_stp] obs_steps;
vector<lower = 0, upper = 1>[T] activity;
}
Multiple Track:
data {
int<lower = 1> M; //number of individuals
int<lower=0> T; // length of complete time series
int<lower = 1, upper = M> ID[T]; //track identifier
int<lower=0, upper = T> T_obs_stp; // number of observed step lengths
int<lower=1, upper = T> index_obs_stp[T_obs_stp];
vector<lower = 0>[T_obs_stp] obs_steps;
vector<lower = 0, upper = 1>[T] activity;
}
parameter block
Single Track:
parameters {
ordered[2] beta_mu_logit; // logit mean of beta dist
vector[2] beta_phi_log; // log precision of beta dist
ordered[2] gamma_mu_log; // log mean of gamma
ordered[2] gamma_sigma_log; // log SD of gamma
vector<lower = 0, upper = 1>[2] maintain; // probability of remaining in current state
}
Multiple Track:
parameters {
ordered[2] beta_mu_logit[M]; // logit mean of beta dist
vector[2] beta_phi_log[M]; // log precision of beta dist
ordered[2] gamma_mu_log[M]; // log mean of gamma
ordered[2] gamma_sigma_log[M]; // log SD of gamma
vector<lower = 0, upper = 1>[2] maintain[M]; // probability of remaining in current state
}
transformed parameters block
Single Track:
transformed parameters {
vector<lower = 0>[2] alpha;
vector<lower = 0>[2] beta;
vector<lower = 0>[2] shape;
vector<lower = 0>[2] rate;
matrix[2, 2] trans_mat;
simplex[2] statdist; //stationary distribution
for(k in 1:2){
real beta_mean;
real beta_phi;
real gamma_mu;
real gamma_sigma;
beta_mean = inv_logit(beta_mu_logit[k]);
beta_phi = exp(beta_phi_log[k]);
gamma_mu = exp(gamma_mu_log[k]);
gamma_sigma = exp(gamma_sigma_log[k]);
alpha[k] = beta_mean * beta_phi;
beta[k] = (1 - beta_mean) * beta_phi;
shape[k] = square(gamma_mu)/square(gamma_sigma);
rate[k] = gamma_mu/square(gamma_sigma);
}
trans_mat[1, 1] = maintain[1];
trans_mat[1, 2] = 1 - maintain[1];
trans_mat[2, 1] = 1 - maintain[2];
trans_mat[2, 2] = maintain[2];
statdist = to_vector(rep_row_vector(1, 2) /
(diag_matrix(rep_vector(1, 2)) -
trans_mat + rep_matrix(1, 2, 2)));
}
Multiple Track:
transformed parameters {
vector<lower = 0>[2] alpha[M];
vector<lower = 0>[2] beta[M];
vector<lower = 0>[2] shape[M];
vector<lower = 0>[2] rate[M];
matrix[2, 2] trans_mat[M]; //matrix of row simplex gammas
simplex[2] statdist[M]; //stationary distribution
for (m in 1:M){
for(k in 1:2){
real beta_mean;
real beta_phi;
real gamma_mu;
real gamma_sigma;
beta_mean = inv_logit(beta_mu_logit[m, k]);
beta_phi = exp(beta_phi_log[m, k]);
alpha[m, k] = beta_mean * beta_phi;
beta[m, k] = (1 - beta_mean) * beta_phi;
gamma_mu = exp(gamma_mu_log[m, k]);
gamma_sigma = exp(gamma_sigma_log[m, k]);
shape[m, k] = square(gamma_mu)/square(gamma_sigma);
rate[m, k] = gamma_mu/square(gamma_sigma);
}
}
for (m in 1:M){
trans_mat[m, 1, 1] = maintain[m, 1];
trans_mat[m, 1, 2] = 1 - maintain[m, 1];
trans_mat[m, 2, 1] = 1 - maintain[m, 2];
trans_mat[m, 2, 2] = maintain[m, 2];
}
for (m in 1:M)
statdist[m] = to_vector(rep_row_vector(1, 2) /
(diag_matrix(rep_vector(1, 2)) -
trans_mat[m] + rep_matrix(1, 2, 2)));
}
model block
Single Track:
model {
vector[2] logp;
vector[2] logp_1;
vector[2] log_trans_mat_tr[2];
int obs_index_now;
//priors
beta_mu_logit[1] ~ normal(-3, 0.5);
beta_mu_logit[2] ~ normal(0, 1);
beta_phi_log[1] ~ normal(1, 1);
beta_phi_log[2] ~ normal(0, 1);
gamma_mu_log[1] ~ normal(3, 1);
gamma_mu_log[2] ~ normal(5, 1);
gamma_sigma_log[1] ~ normal(3, 1);
gamma_sigma_log[2] ~ normal(5, 1);
for(i in 1:2)
for(j in 1:2)
log_trans_mat_tr[j, i] = log(trans_mat[i, j]);
if (index_obs_stp[1] == 1) {
for (k in 1:2)
logp[k] = log(statdist[k]) + gamma_lpdf(obs_steps[1] | shape[k], rate[k]) +
beta_lpdf(activity[1] | alpha[k], beta[k]);
obs_index_now = 2;
} else {
for (k in 1:2)
logp[k] = log(statdist[k]) + beta_lpdf(activity[1] | alpha[k], beta[k]);
obs_index_now = 1;
}
for (t in 2:T){
if (index_obs_stp[obs_index_now] == t){
for (k in 1:2)
logp_1[k] = log_sum_exp(log_trans_mat_tr[k] + logp) +
gamma_lpdf(obs_steps[obs_index_now] | shape[k], rate[k]) +
beta_lpdf(activity[t] | alpha[k], beta[k]);
obs_index_now += 1;
} else {
for (k in 1:2)
logp_1[k] = log_sum_exp(log_trans_mat_tr[k] + logp) +
beta_lpdf(activity[t] | alpha[k], beta[k]);
}
logp = logp_1;
}
target += log_sum_exp(logp);
}
Multiple Track:
model {
vector[2] logp;
vector[2] logp_temp;
matrix[2, 2] log_trans_mat_transpose[M];
int obs_index_now = 1;
for (m in 1:M){
beta_mu_logit[m, 1] ~ normal(-3, 0.5);
beta_mu_logit[m, 2] ~ normal(0, 1);
beta_phi_log[m, 1] ~ normal(1, 1);
beta_phi_log[m, 2] ~ normal(0, 1);
gamma_mu_log[m, 1] ~ normal(3, 1);
gamma_mu_log[m, 2] ~ normal(5, 1);
gamma_sigma_log[m, 1] ~ normal(3, 1);
gamma_sigma_log[m, 2] ~ normal(5, 1);
}
for (m in 1:M)
for (i in 1:2)
for (j in 1:2)
log_trans_mat_transpose[m, j, i] = log(trans_mat[m, i, j]);
for(t in 1:T){
//initialise forward variable if first obs per track
if (t == 1 || ID[t] != ID[t - 1]){
for (k in 1:2)
logp[k] = log(statdist[ID[t], k]) + beta_lpdf(activity[t] | alpha[ID[t], k], beta[ID[t], k]);
if (index_obs_stp[obs_index_now] == t) {
for (k in 1:2)
logp[k] += gamma_lpdf(obs_steps[obs_index_now] | shape[ID[t], k], rate[ID[t], k]);
obs_index_now += 1;
}
} else {
for (k in 1:2)
logp_temp[k] = log_sum_exp(log_trans_mat_transpose[ID[t], k]' + logp) +
beta_lpdf(activity[t] | alpha[ID[t], k], beta[ID[t], k]);
if (index_obs_stp[obs_index_now] == t) {
for (k in 1:2)
logp_temp[k] += gamma_lpdf(obs_steps[obs_index_now] | shape[ID[t], k], rate[ID[t], k]);
obs_index_now += 1;
}
logp = logp_temp;
}
if (t == T || ID[t + 1] != ID[t])
target += log_sum_exp(logp);
}
}