Specifying a multinomial N-mixture model for capture-recapture encounter histories


I’m currently working on adapting a multinomial N-mixture model from Jags to Stan. The goal of the model is to estimate the effects of some covariate “X” on both the site-specific abundance and site-specific detectability of some species. Ultimately, I’d like to have both a binomial N-mixture model (already done) that uses the variation among counts as well as a multinomial N-mixture model (in progress). My goal is to better compare the performance of the two given a shared data generating process - e.g. do they tend to diverge at a particular sample size, detection rate, underlying abundance, etc.

My goal is for the model to take encounter histories from 3 repeat survey events to make a statement about effects of a site-specific covariate on abundance and detection. I’m less interested in estimating survival (CJS model) or total abundance in the system (zero inflated data augmentation model). I do have both of those working in Stan but not in a way that allows me to use the encounter histories to estimate effects of site-specific covariates on abundance/detection in a way that is directly comparable to the binomial N-mixture model estimates.

Right now I have a working multinomial N-mixture model written in Jags, but I’m having a hard time specifying my model in Stan. I get the error "Output: “Chain 1: Rejecting initial value: Chain 1: Log probability evaluates to log(0), i.e. negative infinity. Chain 1: Stan can’t start sampling from this initial value.” I’m assuming this means I’ve not properly specified my model.

I’ve tried specifying the likelihood as a three part marginalized model: likelihood of some possible abundance from max number of observed individuals to some arbitrarily large value K; likelihood of observing some number of individuals across all site visits given the proposed abundance and an estimated total probability of detection across all multinomial cell probabilities; and finally likelihood of the observed cell distribution of encounters given the conditional multinomial cell probabilities.

I’d greatly appreciate if anyone has any tips on where this model specification is going wrong! Apologies also if this type of question has already been asked and answered. I couldn’t find anything precisely analogous in the discourse.

My simulation and model files are uploaded and attached below the model code. Thanks!

// Multinomial N-mixture model
// for estimating effects of covariate on abundance and detection rate
// given a set of capture/encounter histories (y) from N (value unknown) individuals
// across a spatially stratified system consisting of n_sites units.

// there are 3 repeat surveys per site -> 
// 8 possible detection histories (including 000 never detected)

data {
  
  int<lower=0> n_sites; // number of sites
  int<lower=0> y[n_sites, 7]; // capture histories matrix (7 observable histories possible)
  int<lower=0> nobs[n_sites]; // total number of individuals observed at each site 
                                // (with any history) // i.e. nobs = sum(y[,1:7])
  int<lower=0> K; // some arbitrarilly large maximum site-specific abundance to search up until
  vector[n_sites] X; // site covariate (categorical 0 or 1)

}

parameters {
  
  real mu_alpha0; // abundance intercept
  real mu_alpha1; // effect of covariate on abundance 
  real mu_beta0; // detection intercept
  real mu_beta1; // effect of covariate on detection
  
}

transformed parameters {
  
  vector<lower=0, upper=1>[n_sites] p; // site-specific detection, ranges between 0 and 1 
  vector[n_sites] lambda; // site-specific abundance
  
  real cellprobs[n_sites,8]; // cell probabilities (probability of landing in a cell)
  real cellprobs_conditional[n_sites,7]; // cell probs conditional on the sum of all other observable probabilities
  real p_det[n_sites]; // probability of detecting an individual >= once
  
  for(i in 1:n_sites){
    
    lambda[i] = mu_alpha0 + mu_alpha1 * X[i]; // linear model for abundance
    p[i] = inv_logit(mu_beta0 + mu_beta1 * X[i]); // linear model for detection
    
    // define cell probabilities (capture histories based on site-specific detection prob.)
    cellprobs[i, 1] = p[i]*p[i]*p[i]; // 111 (observed, observed, observed)
    cellprobs[i, 2] = p[i]*p[i]*(1-p[i]); // 110 (observed, observed, not observed)
    cellprobs[i, 3] = p[i]*(1-p[i])*p[i]; // 101
    cellprobs[i, 4] = p[i]*(1-p[i])*(1-p[i]); // 100
    cellprobs[i, 5] = (1-p[i])*p[i]*p[i]; // 011
    cellprobs[i, 6] = (1-p[i])*p[i]*(1-p[i]); // 010
    cellprobs[i, 7] = (1-p[i])*(1-p[i])*p[i]; // 001
    cellprobs[i, 8] = 1-sum(cellprobs[i,1:7]); // 000 // INDIVIDUAL NEVER OBSERVED
    // define conditional cell probabilities (each cell prob relative contribution to total prob.)
    cellprobs_conditional[i,1] = cellprobs[i, 1]/sum(cellprobs[i, 1:7]);
    cellprobs_conditional[i,2] = cellprobs[i, 2]/sum(cellprobs[i, 1:7]);
    cellprobs_conditional[i,3] = cellprobs[i, 3]/sum(cellprobs[i, 1:7]);
    cellprobs_conditional[i,4] = cellprobs[i, 4]/sum(cellprobs[i, 1:7]);
    cellprobs_conditional[i,5] = cellprobs[i, 5]/sum(cellprobs[i, 1:7]);
    cellprobs_conditional[i,6] = cellprobs[i, 6]/sum(cellprobs[i, 1:7]);
    cellprobs_conditional[i,7] = cellprobs[i, 7]/sum(cellprobs[i, 1:7]);
    // probability of detecting an individual >= 1 times
    p_det[i] = sum(cellprobs[i,1:7]); 
    
  }
  
}

model {
  
  // Priors
  mu_alpha0 ~ normal(0, 2); // abundance intercept
  mu_alpha1 ~ normal(0, 2); // effect of covariate on abundance
  mu_beta0 ~ normal(0, 2); // detection intercept
  mu_beta1 ~ normal(0, 2); // effect of covariate on detection
  
  // Likelihood
  for (i in 1:n_sites) {
    
    // Roughly how I would write this if I were using Jags:
    //y[i,1:7] ~ multinomial(cellprobs_conditional[i,1:7]);
    //nobs[i] ~ binomial(N[i], p_det[i]);
    //N[i] ~ poisson(lambda[i]);
    
    // marginalized across possible total abundances nobs:K
    vector[K - nobs[i] + 1] lp; // lp vector of length of possible abundances 
      
      for (j in 1:(K - nobs[i] + 1)) { // for each possible abundance:
      
        // lp of capture histories conditional on
        // abundance model and
        // observation model
        lp[j] = 
          // prob of N individuals per site given lambda per site
          poisson_log_lpmf(nobs[i] + j - 1 | lambda[i]) +
          // prob of observing nobs individuals given N per site and p_det odds of detecting individuals >= 1 times
          binomial_lpmf(nobs[i] | nobs[i] + j - 1, p_det[i]) +
          // prob of the site-specific detection histories, given the site-specific conditional cell probabilities
          multinomial_lpmf(y[i,1:7] | to_vector(cellprobs_conditional[i,1:7]))
        ; // end lp[j]
          
        target += log_sum_exp(lp);
      }
      
  }
  
}

generated quantities{
  
  // predict total abundance in the system 
  // not really an important result (for me) but it's a nice initial check of
  // whether the model agrees with the simulation
  // develop posterior predictive check next
  
  vector[n_sites] N; // abundance per site (to be simulated)
  real totalN; // total abundance across all sites
  
  for(i in 1:n_sites){
    N[i] = poisson_rng(lambda[i]); // simulate site-specific abundance
  }

  totalN = sum(N); // sum abundance across sites

}


Stan program and simulation R code:
multinomial_Nmix_model.stan (5.0 KB)
simulate_data_multinomialNmix.R (11.7 KB)

1 Like

Hi, @julrich, and sorry nobody has answered yet. This kind of model debugging question is the hardest kind of question we get as it’s usually very hard to see where things are going wrong at a glance.

First, it helps a lot if the model is well indented and vertical space reduced and commented out code eliminated. Things like the non-standard indentation after lp[j] also make it hard to read. Also, some of the doc is misleading, like the binomial_lpfm referencing a non-existent variable N

Some of the code can be cleaned up to make it easier to follow, e.g.,

cellprobs_conditional[i,1] = cellprobs[i, 1]/sum(cellprobs[i, 1:7]);
...
cellprobs_conditional[i,7] = cellprobs[i, 7]/sum(cellprobs[i, 1:7]);

can just be

vector[7] qs = cellprobs[i, 1:7];
cellprobs_conditional[i, 1:7] = qs / sum(qs);

though I’m not sure why this is only 7 of the 8 (it looks like the last state is for individuals never observed). Aren’t there usually probabilities of death in CJS model that play together with the observation types?

It’s way more efficient to use linear algebra, so brig out the definition of lambda as in

lambda = mu_alpha0 + mu_alpha1 * X;

and then just index into it. I’d also define

vector[?] one_minus_p = 1 - p;

and then index into one_minus_p. You want to cut down as much as possible on duplicated work. There’s more that could be done to this end for cellprobs, but the above is a start.

There’s also a lot of redundant computation in the inner j loop, with variables that could be defined outside the loop and re-used, like to_vector(cellprobs_conditional[i,1:7]).

Alas, none of this is going to fix your model. I’d look into the capture histories and make sure what you’re doing there is really getting the state-space model right.

The generated quantities can also be cleaned up:

generated quantities {
  array[n_sites] real N = poisson_rng(lambda);
  real totalN = sum(N);
}

It’s easier to debug if it’s easier to see more of the code at once.

For the lp case in the model, there’s usually a probability of each mixture component that’s used and I don’t see anything like that here.

2 Likes

The first thing I’d try is printing lp. Are all elements bad or just some? For all sites, or just some?

1 Like

Thanks for your suggestions @Bob_Carpenter and @jsocolar. I’ll report back after working through both of these things

Hi @Bob_Carpenter and @jsocolar. Thanks for your suggestions, I was able to make some progress. So, first, I think there is still more to do to simplify the code. I will keep working on that after this.

That being said I did figure out the issue. My model now accurately and precisely returns the parameters mu_alpha0, mu_alpha1, mu_beta0 and mu_beta1 as well as the imperfectly observed total abundance across all sites. I’ve never used the print command before but that helped me isolate the issue. I had placed log_sum_exp() inside the j loop. This was causing log sum of the lp vector before I had I actually calculated lp for all of the j possible latent abundance states.

@Bob_Carpenter you mentioned:

For the lp case in the model, there’s usually a probability of each mixture component that’s used and I don’t see anything like that here.

I’m just trying to come to a better understanding of what Stan is doing here to calculate lp, after also having read through documentation. My understanding was that mixing ratios would need to be defined if the same outcome could be drawn from different distributions. In my case where the outcomes are different (but inter-related parameters) could there be issues with simply adding together the log probability masses? i.e.,
lp[j] = poisson_log_lpmf(latent_abundance | lambda[i]) + binomial_lpmf(nobs[i] | latent_abundance, p_det[i]) + multinomial_lpmf(y[i,1:7] | site_cellprobs);

Here is my updated working (but work in progress) stan code:

// Multinomial N-mixture model
// for estimating effects of covariate on abundance and detection rate
// given a set of capture/encounter histories (y) from N (value unknown) individuals
// across a spatially stratified system consisting of n_sites units.
// There are 3 repeat surveys per site -> 
// 7 observable detection histories plus an unobservable 8th history (never detected)
data {
  int<lower=0> n_sites; // number of sites
  int<lower=0> y[n_sites, 7]; // capture histories matrix (7 observable histories possible)
  int<lower=0> nobs[n_sites]; // total number of individuals observed at each site, i.e. nobs = sum(y[,1:7])
  int<lower=0> K; // some arbitrarilly large maximum site-specific abundance to search up until
  vector[n_sites] X; // site covariate (categorical 0 or 1)
}
parameters {
  real mu_alpha0; // abundance intercept
  real mu_alpha1; // effect of covariate on abundance 
  real mu_beta0; // detection intercept
  real mu_beta1; // effect of covariate on detection
}
transformed parameters {
  vector<lower=0, upper=1>[n_sites] p = inv_logit(mu_beta0 + mu_beta1 * X); // linear model for detection; // site-specific detection, ranges between 0 and 1 
  vector[n_sites] lambda = mu_alpha0 + mu_alpha1 * X; // linear model for abundance
  real cellprobs[n_sites, 8]; // cell probabilities (probability of landing in a cell)
  real cellprobs_conditional[n_sites, 7]; // cell probs conditional on the sum of all other observable probabilities
  real p_det[n_sites]; // probability of detecting an individual >= once
  // define cell probabilities (capture histories based on site-specific detection prob.)
  for(i in 1:n_sites){
    real one_minus_p = 1 - p[i];
    cellprobs[i, 1] = p[i]^3; // 111 (observed, observed, observed)
    cellprobs[i, 2] = p[i]^2*one_minus_p; // 110 (observed, observed, not observed)
    cellprobs[i, 3] = cellprobs[i, 2]; // 101
    cellprobs[i, 4] = p[i]*one_minus_p^2; // 100
    cellprobs[i, 5] = cellprobs[i, 2]; // 011
    cellprobs[i, 6] = cellprobs[i, 4]; // 010
    cellprobs[i, 7] = cellprobs[i, 4]; // 001
    cellprobs[i, 8] = 1-sum(cellprobs[i, 1:7]); // 000 // INDIVIDUAL NEVER OBSERVED
    // define conditional cell probabilities (each cell prob relative contribution to total prob.)
    vector[7] qs = to_vector(cellprobs[i, 1:7]);
    cellprobs_conditional[i, 1:7] = to_array_1d(qs / sum(qs));
    // probability of detecting an individual >= 1 times
    p_det[i] = sum(cellprobs[i,1:7]); 
  }
}
model {
  // Priors
  mu_alpha0 ~ normal(0, 2); // abundance intercept
  mu_alpha1 ~ normal(0, 2); // effect of covariate on abundance
  mu_beta0 ~ normal(0, 2); // detection intercept
  mu_beta1 ~ normal(0, 2); // effect of covariate on detection
  // Likelihood
  for (i in 1:n_sites) {
    int length_lp = K - nobs[i] + 1;
    vector[length_lp] lp; // lp vector of length of possible abundances at site i
    vector[7] site_cellprobs = to_vector(cellprobs_conditional[i,1:7]);
    for (j in 1:length_lp) { // for each possible abundance:
      int latent_abundance = nobs[i] + j - 1;
      lp[j] = poisson_log_lpmf(latent_abundance | lambda[i]) + // abundance
            binomial_lpmf(nobs[i] | latent_abundance, p_det[i]) + // detection of n ind
            multinomial_lpmf(y[i,1:7] | site_cellprobs); // observed detection histories
    }
    target += log_sum_exp(lp); // sum across possible latent states
  }
}
generated quantities {
  array[n_sites] real N = poisson_log_rng(lambda);
  real totalN = sum(N);
}