Help with reparameterization

I am trying to run an N-mixture-type model with a different parameterization than I’ve used before. The model runs and the posterior means are reasonable, but the credible intervals are enormous. I’m not sure whether the problem is the parameterization itself or putting things in the wrong blocks for sampling, but I suspect the model formulation. Any help or suggestions would be appreciated; I’m still very new to Stan.

Data generation in R

# Simulate N-mixture Data
n_sites <- 100 # Number of sites
n_visits <- 4 # Number of visits per site

x1 <- rnorm(n_sites, 0, 1) # covariate on normalized scale
beta0 <- log(10) # mean abundance on log scale
beta1 <- 1.2 # effect of covariate 1

lambda <- exp(beta0 + beta1 * x1)

p <- 0.3 # static detection probability

# True abundance
N <- rep(NA, times = length(lambda))
for(i in 1:n_sites) {
  N[i] <- rpois(1, lambda = lambda[i])
}

# Count process
counts <- matrix(NA, n_sites, n_visits)
for(i in 1:n_sites) {
  for(j in 1:n_visits) {
    counts[i, j] <- rbinom(1, N[i], p)
  }
}

Stan code

// Binomial mixture model with covariates
data {
  int<lower=0> R;       // Number of sites
  int<lower=0> T;       // Number of temporal replications
  int<lower=0> y[R, T]; // Counts
  vector[R] X;          // Covariate
}

parameters {
  real beta0;               // intercept (log mean abundance)
  real beta1;               // slope parameter
  real<lower=0, upper=1> p; // detection probability
}

transformed parameters {
  vector[R] log_lambda; // Log expected count
  vector[R] log_mu;     // Log expected population size
  real log_p;           // Log detection probability
  vector[R] lambda;     // Expected count intensity
  vector[R] mu;         // Expected population size (rate parameter)

  log_mu = beta0 + beta1 * X;
  log_p = log(p);
  log_lambda = log_mu + log_p;
  mu = exp(log_mu);
  lambda = exp(log_lambda);
}

model {
  // Priors
  // beta0 ~ normal(0, 10);
  // beta1 ~ normal(0, 10);
  beta0 ~ uniform(-10, 10);
  beta1 ~ uniform(-10, 10);
  p ~ uniform(0, 1);

  // Likelihood
  for (i in 1:R) {
    for (j in 1:T) {
      y[i, j] ~ poisson(lambda[i]);
      // y[i, j] ~ poisson_log(log_lambda[i]);
      // target += poisson_log_lpmf(y[i, j] | log_lambda[i]);
    }
  }
}

generated quantities {
  int N[R];
  int totalN;

  for (i in 1:R)
    // N[i] = poisson_rng(mu[i]);
    N[i] = poisson_log_rng(log_mu[i]);
  totalN = sum(N);
}

Code to run via rstan

library(rstan)

rstan_options(auto_write = TRUE)
options(mc.cores = parallel::detectCores())

# load data

## Parameters monitored
params <- c("beta0", "beta1", "p", "N")

## MCMC settings
ni <- 20000
nt <- 10
nb <- 10000
nc <- 3

# Note: this list is not used below; the stan() call passes its own list
# with names matching the Stan data block (y, R, T, X).
data_list <- list(counts = counts,
  x1 = x1,
  n_sites = nrow(counts),
  n_visits = ncol(counts))

Nst <- apply(counts, 1, max) + 1   # Important to give good inits for latent N

## Initial values
inits <- lapply(1:nc, function(i)
  list(N = Nst, 
       beta0 = runif(1, -1, 1),
       beta1 = runif(1, -1, 1),
       p = runif(1, 0.2, 0.8)))

## Call Stan from R
# if(!dir.exists("Results/Stan")) dir.create("Results/Stan", recursive = TRUE)
out <- stan("Code/Stan_Models/nmix_alt.stan",
            data = list(y = counts, R = n_sites, T = n_visits, X = x1),
            init = inits,
            pars = params,
            chains = nc, iter = ni, warmup = nb, thin = nt,
            seed = 123,
            open_progress = FALSE)

print(out, digits = 3)
plot(out, par = c("beta0", "beta1", "p"))
traceplot(out, par = c("beta0", "beta1", "p"))

You’re generating data with: counts[i, j] <- rbinom(1, N[i], p)

But fitting with y[i, j] ~ poisson(lambda[i]);

That seems like an issue.

N-mixture models are a bit tricky in Stan because the latent abundance N is discrete and unbounded (e.g., see the discussion in this old stan-users thread: !searchin/stan-users/n-mixture|sort:date/stan-users/9mMsp1oB69g/78vW13OmE18J).

It is possible to code these up either by choosing some maximum abundance K and enumerating over all possible values of N, or by using a multinomial-Poisson parameterization. Some examples of both of these can be found here:
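For reference, the first option (marginalizing N by summing up to a fixed maximum abundance) looks roughly like the sketch below. This is not tested here, and two names are assumptions rather than part of the model in the question: `K` would be a user-chosen upper bound on N declared in `data`, and `max_y[i]` would be the largest count observed at site i, computed in `transformed data`.

```stan
model {
  for (i in 1:R) {
    // log joint density for each feasible value of the latent abundance
    vector[K - max_y[i] + 1] lp;
    for (n in max_y[i]:K)
      lp[n - max_y[i] + 1] = poisson_log_lpmf(n | log_mu[i])
                             + binomial_lpmf(y[i] | n, p); // sums over the T visits
    // marginalize the discrete latent N out of the likelihood
    target += log_sum_exp(lp);
  }
}
```

The `log_sum_exp` over `lp` is what replaces sampling N directly; the cost is that each leapfrog step sums over up to K + 1 terms per site, which is why the choice of K matters for speed.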


Yeah, sorry, I should have explained better in my question. I knew the data generation wasn’t going to match the Stan model, but I wanted to see if

count \sim \text{Poisson}(\mu p)
N \sim \text{Poisson}(\mu)

would approximate the standard

count \sim \text{Bin}(N, p)
N \sim \text{Poisson}(\mu)

I also wasn’t careful about my inconsistent use of \mu and \lambda in the generating code vs. the sampling code.


I have marginalized out N in the past by summing over values up to K (the maximum remotely possible N), but I’ve found it very slow compared to using a Gibbs sampler to sample N directly.

However, I was thinking that the alternative parameterization of multiplying \mu * p might work. Since people don’t do that I figured there was probably a reason but I couldn’t think of it offhand so I decided to simulate and see what happens. But then since I’m new to stan, I couldn’t tell if it was something about my code or just the parameterization that was the issue.
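A quick way to see what is lost (a sketch in Python; mu and p mirror the simulation settings above): under the true binomial-Poisson model the repeat counts at a site share the same latent N, so they are positively correlated across visits, while the \text{Poisson}(\mu p) model treats them as independent. It is exactly that cross-visit dependence that carries the information separating p from \mu.

```python
import numpy as np

rng = np.random.default_rng(1)
n_sites, mu, p = 200_000, 10.0, 0.3

# True generative model: one latent N per site, shared across visits
N = rng.poisson(mu, n_sites)
y1 = rng.binomial(N, p)   # visit 1
y2 = rng.binomial(N, p)   # visit 2

# Each visit's marginal matches Poisson(mu * p) ...
print(y1.mean(), mu * p)            # both ~ 3.0

# ... but visits at the same site are correlated (corr = p here),
# which the independent-Poisson likelihood ignores.
print(np.corrcoef(y1, y2)[0, 1])    # ~ 0.3

# Under the Poisson(mu * p) model, visits are independent draws
z1 = rng.poisson(mu * p, n_sites)
z2 = rng.poisson(mu * p, n_sites)
print(np.corrcoef(z1, z2)[0, 1])    # ~ 0.0
```

So the means come out right because the marginals agree, but the likelihood only ever sees the product \mu p, consistent with the huge credible intervals on p and the intercept separately.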

I’ve never looked into the multinomial Poisson, so I’ll check that option out. Thanks for the link.

It’s been a while since I’ve thought about N-mixture models, but IIRC, with a single survey the binomial-Poisson hierarchy identifies only the product \lambda p, not \lambda and p separately. The tricky bit with N-mixture models is that the repeat surveys provide the information that can inform p and N separately. If we have repeat surveys t = 1, ..., T, then the likelihood for the repeat counts y_1, ..., y_T (at one site) is

\prod_{t=1}^T \text{Binomial}(y_t \mid N, p) \, \text{Poisson}(N \mid \lambda)

and in that case I don’t know of a closed form marginalization over N.
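For a single survey (T = 1), though, the marginalization over N is just the standard Poisson thinning identity:

\sum_{N=y}^{\infty} \text{Binomial}(y \mid N, p) \, \text{Poisson}(N \mid \lambda) = \text{Poisson}(y \mid \lambda p)

which is exactly the count \sim \text{Poisson}(\mu p) likelihood discussed above, and makes explicit that a single survey only ever informs the product \lambda p.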

It looks like this thread is a little old, but in case someone else finds it useful, there are some closed-form methods for estimating N and p separately.

  1. Use a multivariate Poisson model (Dennis et al. 2015, DOI: 10.1111/biom.12246).
  2. Reparameterize the likelihood of the binomial-Poisson mixture following Haines (2016, DOI: 10.1111/biom.12521).
  3. Use distance sampling, which collects additional data to estimate p, so that count ~ poisson(mu * p) can be used in the likelihood.