Hi Folks
I’m trying to build a Stan version of the JAGS models currently used to analyse the North American Breeding Bird Survey (BBS) monitoring data. The model is effectively a hierarchical, overdispersed Poisson regression, with a number of intercept parameters to account for observereffects, routelevel variation in abundance, and random annual fluctuations around the slopeline.
The model works. It generates reasonable estimates, no divergent transitions, effective sample sizes are generally ~80% of the total samples. If the max_treedepth argument is increased beyond 10.
The problem
The problem is that the sampler exceeds the maximum tree depth, even if the limit is set to ~1517 (haven’t tried higher yet because life is short). The high max_treedepth limit means that the sampler takes a really long time, even for a species which has very few data compared to many in the dataset.
For example, at tree_depth 15, the n_leapfrog steps is > 30,000. So even a limited initial run of 500 (400 warmup) requires ~10 hours. If max_treedepth at 17, it’s closer to 24 hours for the same small run (500 transitions). If I were to try to run the model on some of the most datarich species in the dataset, it would require months of runtime.
Help!
Is there an alternative parameterization that might make this posterior easier to sample?
I’m really new to Stan (this is my 3rd bespoke model), and I’m no mathematician. So it’s entirely possible that I’m doing something dumb.
The model
model {
// This is a Stan implementation of the bbsBayes slope model
data {
int<lower=1> nstrata;
int<lower=1> ncounts;
int<lower=1> nyears;
int<lower=0> count[ncounts]; // count observations
int<lower=1> strat[ncounts]; // strata indicators
int<lower=1> year[ncounts]; // year index
int<lower=1> fixedyear; // centering value for years
int<lower=0> firstyr[ncounts]; // first year index
int<lower=1> obser[ncounts]; // observer indicators
int<lower=1> nobservers;
}
parameters {
vector[ncounts] noise_raw; //non centered observationlevel random effect to model overdispersion
real lambda[ncounts]; // Poisson means
vector[nstrata] beta_p; //non centered random effect slopes
real BETA; //Slope hyperparameter
vector[nstrata] strata_p; //non centered random effect intercepts
real STRATA; //intercept hypterparameter
real eta; //firstyear observer intercept (observer startup effect)
matrix[nstrata,nyears] yeareffect_raw; //random yeareffects modeling departures from slopebased trajectory
vector[nobservers] obs_raw; // sd of year effects
real<lower=0> sdnoise; // sd of overdispersion
real<lower=0> sdobs; // sd of observer effects
real<lower=0> sdbeta; // sd of slopes
real<lower=0> sdstrata; // sd of intercepts
real<lower=0> sdyear[nstrata]; // sd of year effects
}
transformed parameters {
vector[ncounts] E; // log_scale additive likelihood
vector[ncounts] noise; // extraPoisson lognormal variance
vector[nstrata] beta;
vector[nstrata] strata;
vector[nobservers] obs; //observer effects
matrix[nstrata,nyears] yeareffect;
obs = sdobs*obs_raw;
noise = sdnoise*noise_raw;
// covariate effect on intercepts and slopes
beta = (sdbeta*beta_p) + BETA;
strata = (sdstrata*strata_p) + STRATA;
for(s in 1:nstrata){
yeareffect[s,] = sdyear[s]*yeareffect_raw[s,];
}
for(i in 1:ncounts){
E[i] = beta[strat[i]] * (year[i]fixedyear) + strata[strat[i]] + yeareffect[strat[i],year[i]] + obs[obser[i]] + eta*firstyr[i] + noise[i];
}
}
model {
sdnoise ~ normal(0,1); //prior on scale of extra Poisson lognormal variance
noise_raw ~ normal(0,1); //non centered prior normal tailed extra Poisson lognormal variance
sdobs ~ normal(0,1); //prior on sd of observer effects
sdyear ~ normal(0,1); // prior on sd of yeareffects  stratum specific
obs_raw ~ normal(0,1); //non centered prior on observer effects
for(s in 1:nstrata){
yeareffect_raw[s,] ~ normal(0,1);
sum(yeareffect_raw[s,]) ~ normal(0,0.001*nyears);// soft sum to zero constraint
}
count ~ poisson_log(E); //vectorized count likelihood with logtransformation
BETA ~ normal(0,0.1);// prior on fixed effect mean slope  hyperparameter
STRATA ~ normal(0,1);// prior on fixed effect mean intercept  hyperparameter
eta ~ normal(0,1);// prior on firstyear observer effect
sdstrata ~ normal(0,1); //prior on sd of intercept variation
sdbeta ~ normal(0,0.1); //prior on sd of slope variation
beta_p ~ normal(0,1); //non centered prior on stratumlevel slopes
strata_p ~ normal(0,1); //non centered prior on stratumlevel intercepts
//sum to zero constraints
sum(strata_p) ~ normal(0,0.001*nstrata);
sum(beta_p) ~ normal(0,0.001*nstrata);
}
}
The Data
I haven’t included the data here (it’s large), but I have the example data and all code necessary to run the model at this GitHub repo. This repo also includes some background on the data, as well as some of the JAGS modeling that’s been done with these data.
In a nutshell, the data object is a list with the following components:

count  the counts of individual birds on each route and year (the example data here has ~4000 counts)

strat  the integer indicator for each of the 18 geographic strata included in this species’ range (some species in the BBS database cover ~ 150 strata)

year  the integer indicating which year of the timeseries

firstyr  1 if this routeyear combination represents the observers first year on this particular route

obser  integer indicating which observer conducted the survey

ncounts  number of observations

nstrata  number of geographic strata

nyears  number of years included

nobservers  number of unique observerroute combinations in the data

fixedyear  midpoint of the timeseries, used to center the yearvalues