State-space model - log_lik and loo when using the forward algorithm

Hi Everyone,

I am using a 4 state state-space model from the mark-recapture book by Kéry and Schaub 2012, and kindly translated into stan by Hiroki Itô, The model estimates 4 parameters true survival, detection, site fidelity, and tag return rate - full code attached.

I would like to calculate the log-likelihood so that I can compare my models using loo.

I was previously using CJS models and was able to find a simple example and extract the log-likelihood in the generated quantities section, but I’m a beginner in stan and my basic knowledge is now struggling to see how to do this for this differently coded likelihood. Any help with this would be much appreciated.

  // Likelihood
  // Forward algorithm derived from Stan Modeling Language
  // User's Guide and Reference Manual
  for (i in 1:nind) {
    if (first[i] > 0) {
      for (k in 1:4)
        gamma[first[i], k] = (k == y[i, first[i]]);

      for (t in (first[i] + 1):n_occasions) {
        for (k in 1:4) {
          for (j in 1:4)
          acc[j] = gamma[t - 1, j] * ps[j, i, t - 1, k]*po[k, i, t - 1, y[i, t]];
          gamma[t, k] = sum(acc);
          print(gamma[t,k])
        }
      }
      target += log(sum(gamma[n_occasions]));
    }
  }

lifedead_KL_final.stan (4.7 KB)

Hi @kjl - I think you could use leave-one-individual-out cross validation using the log likelihood for each individual’s capture history.

To do this, you can store the log likelihood for each individual in a vector in the transformed parameters block, then increment the log probability in the model block e.g.,

// -------------------------------------------------
// States (S):
// 1 alive in study area
// 2 alive outside study area
// 3 recently dead and recovered
// 4 recently dead, but not recovered, or dead (absorbing)
// Observations (O):
// 1 seen alive
// 2 recovered dead
// 3 neither seen nor recovered
// -------------------------------------------------

functions {
  /**
   * Return an integer value denoting occasion of first capture.
   * This function is derived from Stan Modeling Language
   * User's Guide and Reference Manual.
   *
   * @param y         Observed values
   * @return Occasion of first capture
   */
  int first_capture(int[] y_i) {
    for (k in 1:size(y_i))
      if (y_i[k] != 3)
        return k;
    return 0;
  }
}

data {
  int<lower=0> nind;
  int<lower=0> n_occasions;
  int<lower=1,upper=3> y[nind, n_occasions];
  real<lower=0,upper=1> p_adjust[n_occasions-1];// adjusts different sampling effort
  int<lower=2,upper=4> phi_adjust[n_occasions-1];//adjusts for uneven sampling
}

transformed data {
  int n_occ_minus_1 = n_occasions - 1;
  int<lower=0,upper=n_occasions> first[nind];

  for (i in 1:nind)
    first[i] = first_capture(y[i]);
}

parameters {
  real<lower=0,upper=1> mean_s;    // Mean survival
  real<lower=0,upper=1> mean_f;    // Mean fidelity
  real<lower=0,upper=1> mean_r;    // Mean recovery
  real<lower=0,upper=1> mean_p;    // Mean recapture
}

transformed parameters {
  vector<lower=0,upper=1>[n_occ_minus_1] s; // True survival probability
  vector<lower=0,upper=1>[n_occ_minus_1] F; // Fidelity probability
  vector<lower=0,upper=1>[n_occ_minus_1] r; // Recovery probability
  vector<lower=0,upper=1>[n_occ_minus_1] p; // Recapture/resighting probability
  simplex[4] ps[4, nind, n_occ_minus_1];
  simplex[3] po[4, nind, n_occ_minus_1];
  vector[nind] log_lik = rep_vector(0, nind);

  // Constraints
  for (t in 1:n_occ_minus_1) {
    s[t] = exp(phi_adjust[t]*(log(mean_s)/3)); // adjust for uneven sampling interval
    //s[t] = mean_s;
    F[t] = mean_f;
    r[t] = mean_r;
    p[t] = mean_p*inv_logit(p_adjust[t]); //adjusts for uneven sampling effort 
    //p[t] = mean_p ;
  
  }

  // Define state-transition and observation matrices
  for (i in 1:nind) {
    // Define probabilities of state S(t+1) given S(t)
    for (t in 1:n_occ_minus_1) {
      ps[1, i, t, 1] = s[t] * F[t];
      ps[1, i, t, 2] = s[t] * (1.0 - F[t]);
      ps[1, i, t, 3] = (1.0 - s[t]) * r[t];
      ps[1, i, t, 4] = (1.0 - s[t]) * (1.0 - r[t]);
      ps[2, i, t, 1] = 0.0;
      ps[2, i, t, 2] = s[t];
      ps[2, i, t, 3] = (1.0 - s[t]) * r[t];
      ps[2, i, t, 4] = (1.0 - s[t]) * (1.0 - r[t]);
      ps[3, i, t, 1] = 0.0;
      ps[3, i, t, 2] = 0.0;
      ps[3, i, t, 3] = 0.0;
      ps[3, i, t, 4] = 1.0;
      ps[4, i, t, 1] = 0.0;
      ps[4, i, t, 2] = 0.0;
      ps[4, i, t, 3] = 0.0;
      ps[4, i, t, 4] = 1.0;

      // Define probabilities of O(t) given S(t)
      po[1, i, t, 1] = p[t];//in state 1 (alive inside) and obs alive 1 must be detected = p[t]
      po[1, i, t, 2] = 0.0; //in state 1 (alive inside) and obs not detected obs 2 (recovered dead) = 0
      po[1, i, t, 3] = 1.0 - p[t];// in state 1 (alive inside) and not detected obs 3 = 1-p[t]
      po[2, i, t, 1] = 0.0; // in state 2 alive outside so alive not detected = 0
      po[2, i, t, 2] = 0.0; // in state 2 alive outside and cannot be recovered dead  = 0
      po[2, i, t, 3] = 1.0; // in state 2 and obs is not detected = 1
      po[3, i, t, 1] = 0.0; // state 3 recovered can't be detectect = 0
      po[3, i, t, 2] = 1.0; // state 3 recovered found dead 2 = 1
      po[3, i, t, 3] = 0.0; // state 3 recovered not seen or found dead 3 = 0
      po[4, i, t, 1] = 0.0; // state 4 dead not seen not alive = 0
      po[4, i, t, 2] = 0.0; // state 4 dead without recovery then can't be recovered dead afterwards = 0 
      po[4, i, t, 3] = 1.0; // state 4 dead without recovery then obs 3 not detected = 1
      }
   }
   
  // compute log likelihood for individual capture histories
  {
    real acc[4];
    vector[4] gamma[n_occasions];

    // Forward algorithm derived from Stan Modeling Language
    // User's Guide and Reference Manual
    for (i in 1:nind) {
      if (first[i] > 0) {
        for (k in 1:4)
          gamma[first[i], k] = (k == y[i, first[i]]);
  
        for (t in (first[i] + 1):n_occasions) {
          for (k in 1:4) {
            for (j in 1:4)
            acc[j] = gamma[t - 1, j] * ps[j, i, t - 1, k]*po[k, i, t - 1, y[i, t]];
            gamma[t, k] = sum(acc);
            print(gamma[t,k])
          }
        }
        log_lik[i] = log(sum(gamma[n_occasions]));
      }
    }
  }
}

model {
  // Uniform priors are implicitly defined.
  //  mean_s ~ uniform(0, 1);
  //  mean_f ~ uniform(0, 1);
  //  mean_r ~ uniform(0, 1);
  //  mean_p ~ uniform(0, 1);
  target += sum(log_lik);
}

2 Likes

Great! Thank you, I’ll give it a go! I was trying to do something much more complicated, and wrong - so thanks for the prompt reply!

1 Like