Capture-Recapture with further impact on survivability

Dear Forum,
I have been using Cormack Jolly Seber (CJS) examples in the manual and online to try and estimate posteriors for survivability and prob of recapture, in addition to the influence of a trait which reducing survivability (phi).

p = prob of recapture
phi = prob of survivability
dphi[4] = reduction in survivability based one 4 levels (1=nil, 2=low, 3=med, 4=high)

I created a dataset and for normal p and phi estimations it works using the code in the manual.
Sample of capture recapture data (X)
V1 V2 V3 V4 V5 V6 V7 V8 V9 V10 V11 V12 V13 V14 V15 V16 V17 V18 V19 V20 V21 V22 V23
[1,] 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0
[2,] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0
[3,] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0
[4,] 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
[5,] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 1 0
[6,] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0

The additional parameter is also included in the generated data as Y. Most animals do NOT exhibit this, but it is present.
V1 V2 V3 V4 V5 V6 V7 V8 V9 V10 V11 V12 V13 V14 V15 V16 V17 V18 V19 V20 V21 V22
[1,] 1 1 1 1 1 1 1 1 1 1 2 1 1 1 1 1 1 1 1 1 1 1
[2,] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
[3,] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 3
[4,] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
[5,] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 1 1 1
[6,] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1

I am getting initialization errors, but have tried to truncate priors to mitigate that as an initial work-around.

Any pearls of wisdom would be greatly appreciated.

Thanks,
Steve

data {
  int<lower=1> K;                      // capture events
  int<lower=0> I;                      // number of individuals
  int<lower=0,upper=1> X[I,K];         // X[i,k]: individual i captured at k (binary matrix)
  int<lower=1,upper=4> Y[I,K-1];       // Y[i,k-1]: individual influence on future survivability (1=nil, 2=low, 3=med, 4=high)
}

transformed data {
  int<lower=0,upper=K+1> first[I];     // first[i]: ind i first capture
  int<lower=0,upper=K+1> last[I];      // last[i]:  ind i last capture

  first = rep_array(K+1,I);
  last = rep_array(0,I);
  for (i in 1:I) {
    for (k in 1:K) {
      if (X[i,k] == 1) {
        if (k < first[i]) 
          first[i] = k;
        if (k > last[i]) 
          last[i] = k;
      }
    }
  }

}

parameters {
  vector<lower=0,upper=1>[K-1] phi;       // phi[k]: Pr[alive at k + 1 | alive at k]
  vector<lower=0,upper=1>[K] p;           // p[k]: Pr[capture at k]
  real<lower=0,upper=1> dphi[4];          //upper limit should be < phi... but we I set it as 1?
}

model {
  for (i in 1:I) {
    if (last[i] > 0) { //implies there are some captures for the animal atleast (put in based on released animals never originally caught)
      for (k in (first[i]+1):last[i]) {
      
      //SURVIVABILITY
        if (Y[i, k-1] > 1)                                // if individual influencing parameter exists
          target += (log(phi[k-1] - dphi[Y[i,k-1]]));     // uses one case, else uses another
        else                                              
          target += (log(phi[k-1]));
        
      //CAPTUREABILITY  
        if (X[i,k] == 1)
          target += (log(p[k]));       // i captured at k  // increment_log_prob(log(p[k]));
        else
          target += (log1m(p[k]));     // i not captured at k  // increment_log_prob(log1m(p[k]));
      }
    }
  }

    p ~ normal( 0.3 , 0.2 );
    phi ~ normal( 0.7 , 0.1 );
    dphi[1] ~ uniform( 0.4 , 0.6 ); //This is coded to be redundant above
    dphi[2] ~ normal( 0.3 , 0.1 ); // T[0,(min(phi)-0.01)];
    dphi[3] ~ normal( 0.6 , 0.1 ); // T[0,(min(phi)-0.01)];
    dphi[4] ~ normal( 0.9 , 0.2 ); // T[0,(min(phi)-0.01)];
}

Hi! Can you paste the error message? Thanks for the data and model code.

Hi,

The error results from when trying to initialise.
Chain 1: Rejecting initial value:
Chain 1: Log probability evaluates to log(0), i.e. negative infinity.
Chain 1: Stan can’t start sampling from this initial value.

I suspect it is this:
target += (log(phi[k-1] - dphi[Y[i,k-1]]));

Have you had experience using a bernoulli model instead
Regards,
Steve

Is it possible that you’re getting negative survival probabilities by subtracting dphi from phi?

This phi[k-1] - dphi[Y[i,k-1]] could give negative values under the current parameterization, if I’m understanding correctly. I wonder whether a different parameterization for your survival effects would help, e.g.,

\phi_{ik} = \text{logit}^{-1}(\alpha + \beta_{y_{i k}}),

where \alpha is an intercept, and \beta is a vector with 4 elements, indexed by y. You could declare alpha and beta in the parameters block, and compute phi as a transformed parameter. The logit transform would ensure that you get survival probabilities between 0 and 1.

1 Like

Also can you post the model call? I want to see if what you settings are and if you specified inits.

Hi,

@mbjoseph Started looking into logit. Seems like a logical way forward, just yet to get it working. Alternatively, perhaps inits can be specified (if I am reading into what @Ara_Winter is saying).

@Ara_Winter
sampling_iterations = 1e5
options(mc.cores = 3)
out = rstan::sampling(
object=model
, data = data
, chains = 3
, iter = sampling_iterations
, warmup = sampling_iterations/2
, thin = 4#2#1
, refresh = sampling_iterations/10
)

Regards,
Steve

Thanks @Stephen_Bradshaw ! Can you try setting your init = 0 in the model call? But I would also take the approach that @mbjoseph outlined.

@Stephen_Bradshaw here’s an example of how you might ensure that the survival probabilities are bounded between 0 and 1 (note the use of log_inv_logit where previously you had log:

data {
  int<lower=1> K;                      // capture events
  int<lower=0> I;                      // number of individuals
  int<lower=0,upper=1> X[I,K];         // X[i,k]: individual i captured at k (binary matrix)
  int<lower=1,upper=4> Y[I,K-1];       // Y[i,k-1]: individual influence on future survivability (1=nil, 2=low, 3=med, 4=high)
}

transformed data {
  int<lower=0,upper=K+1> first[I];     // first[i]: ind i first capture
  int<lower=0,upper=K+1> last[I];      // last[i]:  ind i last capture

  first = rep_array(K+1,I);
  last = rep_array(0,I);
  for (i in 1:I) {
    for (k in 1:K) {
      if (X[i,k] == 1) {
        if (k < first[i]) 
          first[i] = k;
        if (k > last[i]) 
          last[i] = k;
      }
    }
  }
}

parameters {
  real alpha;
  vector[4] beta;
  vector<lower=0,upper=1>[K] p;           // p[k]: Pr[capture at k]
}

model {
  for (i in 1:I) {
    if (last[i] > 0) { 
      //implies there > 1 captures for the animal 
      // (put in based on released animals never originally caught)
      for (k in (first[i]+1):last[i]) {
        // survival
        if (Y[i, k-1] > 1)  // if individual influencing parameter exists
          target += log_inv_logit(alpha + beta[Y[i, k-1]]); 
        else                                              
          target += log_inv_logit(alpha);
        
        // capture
        X[i, k] ~ bernoulli(p[k]);
      }
    }
  }

  p ~ beta(1, 1);
  alpha ~ normal(0, 1.5);
  beta ~ std_normal();
}

1 Like

@ mbjoseph

Thanks a lot for the code snippet! I’ve started implementing it but am yet to get it converge to values which I am expecting (from generated data).
I appreciate your help.

Steve

1 Like

Hi again,

@mbjoseph I have played with your suggestions and since worked through the standev github code to better understand this.

The posteriors for p and phi come out near what I expect (for generated data), although a lot of the posterior data is zero, hence I need to subset the posterior draws (doesn’t sit well with me, but the “constaint” section shows this as there is a lot of data which doesn’t inform the estimations of p and phi… Is this sound practice (trucating the posteriors)?

Secondly, the beta terms exist for each individual animal, at each step in time. 4 values are estimated, the first of which I expect to be uninformative as it is a “filler” variable (representing a normal animal), whereas the subsequent beta values represent a presence of a certain trait in differing levels.

I expected the logit of the posterior of these would have meaning, but only when I further subtract that from one it has meaning. Is this sensical in obtaining meaning from a parameter established in this way?

Any advice is very welcome.

Thanks,
Steve

transformed data {
  int n_occ_minus_1 = n_occasions - 1;
  int<lower=0,upper=n_occasions> first[nind];
  int<lower=0,upper=n_occasions> last[nind];
  vector<lower=0,upper=nind>[n_occasions] n_captured;

  for (i in 1:nind)
    first[i] = first_capture(y[i]);
  for (i in 1:nind)
    last[i] = last_capture(y[i]);
  n_captured = rep_vector(0, n_occasions);
  for (t in 1:n_occasions)
    for (i in 1:nind)
      if (y[i, t])
        n_captured[t] = n_captured[t] + 1;
}

parameters {
  vector[4] beta;                                // Slope parameter
  real<lower=0,upper=1> mean_phi;     // Mean survival
  real<lower=0,upper=1> mean_p;       // Mean recapture
}

transformed parameters {
  matrix<lower=0,upper=1>[nind, n_occ_minus_1] phi;
  matrix<lower=0,upper=1>[nind, n_occ_minus_1] p;
  matrix<lower=0,upper=1>[nind, n_occasions] chi;
  real mu = logit(mean_phi);
  
   // Constraints
  for (i in 1:nind) {
    for (t in 1:(first[i] - 1)) {
      phi[i, t] = 0;
      p[i, t] = 0;
    }
    
    for (t in first[i]:n_occ_minus_1) {
      phi[i, t] = inv_logit(mu + beta[x[i, t]]);
      p[i, t] = mean_p;
    }
  }

  chi = prob_uncaptured(nind, n_occasions, p, phi);
}

model {
  // Priors
  beta ~ normal(0, 1);

  for (i in 1:nind) {
    if (first[i] > 0) {
      for (t in (first[i] + 1):last[i]) {
        1 ~ bernoulli(phi[i, t - 1]);
        y[i, t] ~ bernoulli(p[i, t - 1]);
      }
      1 ~ bernoulli(chi[i, last[i]]);
    }
  }
}
1 Like

Subsetting posterior draws like this sounds sketchy to me too, but if I follow, you are subsetting the values that you are hard-coding to 0 in the transformed parameters block. If this is correct, then it’s not statistically invalid but it does seem inefficient. There may be a way to code the model where you don’t need to do this. For example, detection probability p appears to be constant for all individuals and time steps. Can you use mean_p everywhere instead? Second, you might be able to avoid defining phi as a variable. e.g., something like:

  for (i in 1:nind) {
    if (first[i] > 0) {
      for (t in (first[i] + 1):last[i]) {
        1 ~ bernoulli_logit(mu + beta[x[i, t]]);
        y[i, t] ~ bernoulli(mean_p);
      }
      1 ~ bernoulli(chi[i, last[i]]);
    }
  }

I’d need to see what’s going on with the prob_uncaptured function to give a more complete suggestion, but I suspect there may be a way to construct chi more efficiently (something that doesn’t require a bunch of 0’s in these matrices).

Are you looking for a way to interpret the posterior for beta? If so, can you look at posterior densities for survival probability for your 4 different levels of beta? Visualizing survival probabilities instead of beta alone might be easiest, since the interpretation of the beta parameter may depend somewhat on the values of mu.

Hi,
Apologies for the delayed response @mbjoseph .A colleague and I discussed this before I saw your post. I have changed the code somewhat. However, there is still something amiss.

(1) Indivdual Covariates:
Beta represents individual effects. It is either not applied, or applied to phi at a level of 1-3. Thus I have made it a vector of length 3 to reduce the computations. In the event it is not necessary, phi only depends on mu.

(2) Subsetting data post draws:
I have instead only captured mean_phi, mean_p and betas as parameters of interest. Mitigates the decision of having the phi and p matrices with zeroes.

(3) Removal of phi:
I haven’t attempted this. It is necessary for chi calculations (which in turn influences p and phi). Beyond me at present.

Ideally I would have wanted a beta value, but they are not making sense. I have started to shift thinking to your statement above, but would need to consider how to structure this . The individual covariates are also temporal. So if an animal was caught it could be impacted which would affect phi. If it survived and was later recaught it may not be impacted that time.

(4) Generated Data:
In my generated data I have increased the number of animals experiencing these individual covariates which influence their survival. I had expected that the generated data had too few affected animals to successfully estimates posteriors for beta… The problem remains.
I have also reviewed the generated data method with our understanding of how the model is structured and we believe that their structures are congruent.

(5) Indexing:
I have worked trhough a couple of number sets a few times. I am not understanding the indexing in parts. See

Model block:
y[i, t] ~ bernoulli(p[i, t - 1]); //.... NEED TO REVIEW AGAIN.... If an animal has been seen from time capture 2, then the third time capture should have y[i,3] ~ bern(p[i,3]) right? Why the minus 1?

chi set up:
chi[i, t_curr] = (1 - phi[i, t_curr]) + phi[i, t_curr] * (1 - p[i, t_next - 1]) * chi[i, t_next];
// NOTE: I think the above line is wrong. I think it should be (t_next - 1) should be t_next… NEED TO REVIEW AGAIN

All code:

// This models is derived from section 12.3 of "Stan Modeling Language
// User's Guide and Reference Manual"

functions {
  int first_capture(int[] y_i) {
    for (k in 1:size(y_i))
      if (y_i[k])
        return k;
    return 0;
  }

  int last_capture(int[] y_i) {
    for (k_rev in 0:(size(y_i) - 1)) {
      // Compoud declaration was enabled in Stan 2.13
      int k = size(y_i) - k_rev;
      //      int k;
      //      k = size(y_i) - k_rev;
      if (y_i[k])
        return k;
    }
    return 0;
  }

  matrix prob_uncaptured(int nind, int n_occasions,
                         matrix p, matrix phi) {
    matrix[nind, n_occasions] chi;

    for (i in 1:nind) {
      chi[i, n_occasions] = 1.0;
      for (t in 1:(n_occasions - 1)) {
        // Compound declaration was enabled in Stan 2.13
        int t_curr = n_occasions - t;
        int t_next = t_curr + 1;
        chi[i, t_curr] = (1 - phi[i, t_curr]) + phi[i, t_curr] * (1 - p[i, t_next - 1]) * chi[i, t_next];
        // NOTE: I think the above line is wrong. I think it should be (t_next - 1) should be t_next.... NEED TO REVIEW AGAIN
        // Link: https://mc-stan.org/docs/2_22/stan-users-guide/mark-recapture-models.html
        // This requires a change of the size of variables. 
      }
    }
    return chi;
  }
}

data {
  int<lower=0> nind;            // Number of individuals
  int<lower=2> n_occasions;     // Number of capture occasions
  int<lower=0,upper=1> y[nind, n_occasions];    // Capture-history
  int<lower=1,upper=4> x[nind, n_occasions-1];         // individual covariate influencing phi
}

transformed data {
  int n_occ_minus_1 = n_occasions - 1;
  int<lower=0,upper=n_occasions> first[nind];
  int<lower=0,upper=n_occasions> last[nind];
  vector<lower=0,upper=nind>[n_occasions] n_captured;

  for (i in 1:nind)
    first[i] = first_capture(y[i]);
  for (i in 1:nind)
    last[i] = last_capture(y[i]);
  n_captured = rep_vector(0, n_occasions);
  for (t in 1:n_occasions)
    for (i in 1:nind)
      if (y[i, t])
        n_captured[t] = n_captured[t] + 1;
}

parameters {
  vector[3] beta;                     // Slope parameter
  real<lower=0,upper=1> mean_phi;     // Mean survival
  real<lower=0,upper=1> mean_p;       // Mean recapture
}

transformed parameters {
  matrix<lower=0,upper=1>[nind, n_occ_minus_1] phi;
  matrix<lower=0,upper=1>[nind, n_occ_minus_1] p;
  matrix<lower=0,upper=1>[nind, n_occasions] chi;

  real mu = logit(mean_phi);
  
  // Constraints
  for (i in 1:nind) {
    for (t in 1:(first[i] - 1)) {
      phi[i, t] = 0;
      p[i, t] = 0;
    }
    
    for (t in first[i]:n_occ_minus_1) {
      p[i, t] = mean_p;
      
      if (x[i, t] == 1){
        phi[i, t] = inv_logit(mu);
      } else {
	  phi[i, t] = inv_logit(mu + beta[x[i, t]-1]);
      }
    }
  }
  chi = prob_uncaptured(nind, n_occasions, p, phi);
}

model {
  for (i in 1:nind) {
    if (first[i] > 0) {
      for (t in (first[i] + 1):last[i]) {  
        1 ~ bernoulli(phi[i, t - 1]);
        y[i, t] ~ bernoulli(p[i, t - 1]);   //.... NEED TO REVIEW AGAIN.... If an animal has been seen from time capture 2, then the third time capture  should have y[i,3] ~ bern(p[i,3]) right? Why the minus 1?
      }
      1 ~ bernoulli(chi[i, last[i]]);
    }
  }
}

I am inclined to belive the code from the stan repo is correct, but my implementation and understanding is not. Any help in understanding this would be greatly appreciated. The indexing to start with?

Kind regards,
Steve