Suggestions for improving model of bird monitoring data

Thanks @cmcd. I’m going to try moving some of the transformed parameters in the model block. The discussion on the link you shared suggested that writing to disc may vary between the blocks and there are many data involved here so perhaps those disc-write limits are important. I also noticed that post suggested using the std_normal() distribution instead of the normal(0,1). I’ll try that too.

The soft sum to zero constraints is pretty new for me too. I discovered it when working with an iCAR model in Stan. There’s some discussion of different approaches to these constraints here:

and the modelling work where I first discovered them is here:
https://mc-stan.org/users/documentation/case-studies/icar_stan.html

I have tried versions of this same model without the sum to zero constraints (as you suggested, they’re easy to comment-out). There’s little difference in run-time and the convergence is generally a bit worse (particularly for the main slope and intercept hyperparameters).

Update: I’ve tried these two suggestions

and I’ve used the std_normal() distribution instead of the normal(0,1).
It seems to have improved things a bit:
The warmup seems to be going faster (mostly)…

get_elapsed_time(slope_stanfit)/3600 ## in hours

          warmup   sample
chain:1 5.227583 2.122542
chain:2 7.342417 2.112506
chain:3 5.522083 2.103778

Here’s the same result from the original parameterization:

          warmup   sample
chain:1 7.798528 2.105014
chain:2 8.093528 2.224678
chain:3 7.706306 2.151717

It’s not the kind of speed I’m hoping for yet, so I’m going to keep trying things. I’m wondering if it’s worth digging deep in to the error distribution on the counts. The sdnoise parameter is one of the least well sampled parameters. Perhaps that’s causing some miss-specification.

So, here’s the model now:

// This is a Stan implementation of the bbsBayes slope model

data {
  int<lower=1> nstrata;
  int<lower=1> ncounts;
  int<lower=1> nyears;

  int<lower=0> count[ncounts];              // count observations
  int<lower=1> strat[ncounts];               // strata indicators
  int<lower=1> year[ncounts]; // year index
  int<lower=1> fixedyear; // centering value for years
  
  int<lower=0> firstyr[ncounts]; // first year index
  
  int<lower=1> obser[ncounts];              // observer indicators
  int<lower=1> nobservers;
  
}

parameters {
  vector[ncounts] noise_raw;             //non centered observation-level random effect to model over-dispersion
  real lambda[ncounts];             // Poisson means
  
  vector[nstrata] beta_p; //non centered random effect slopes
  real BETA; //Slope hyperparameter

  vector[nstrata] strata_p; //non centered random effect intercepts
  real STRATA; //intercept hypterparameter

  real eta; //first-year observer intercept (observer startup effect)
  
  matrix[nstrata,nyears] yeareffect_raw; //random year-effects modeling departures from slope-based trajectory

  vector[nobservers] obs_raw;    // sd of year effects

  real<lower=0> sdnoise;    // sd of over-dispersion
 //real<lower=1> nu;  //optional heavy-tail df for t-distribution
  real<lower=0> sdobs;    // sd of observer effects
  real<lower=0> sdbeta;    // sd of slopes 
  real<lower=0> sdstrata;    // sd of intercepts
  real<lower=0> sdyear[nstrata];    // sd of year effects

  
}


model {


  vector[ncounts] E;           // log_scale additive likelihood
  vector[ncounts] noise;           // extra-Poisson log-normal variance
  vector[nstrata] beta;
  vector[nstrata] strata;
  vector[nobservers] obs; //observer effects
  matrix[nstrata,nyears] yeareffect;




  sdnoise ~ std_normal(); //prior on scale of extra Poisson log-normal variance
  noise_raw ~ std_normal(); //non centered prior normal tailed extra Poisson log-normal variance
  
  sdobs ~ std_normal(); //prior on sd of observer-route effects
  sdyear ~ std_normal(); // prior on sd of yeareffects - stratum specific
  obs_raw ~ std_normal(); //non centered prior on observer effects
  
  
  //nu ~ gamma(2,0.1); // alternate prior on df for t-distribution of heavy tailed 
 for(s in 1:nstrata){

  yeareffect_raw[s,] ~ std_normal();
  sum(yeareffect_raw[s,]) ~ normal(0,0.001*nyears);// soft sum to zero constraint
  
 }
 
  BETA ~ normal(0,0.1);// prior on fixed effect mean slope - hyperparameter
  STRATA ~ std_normal();// prior on fixed effect mean intercept - hyperparameter
  eta ~ std_normal();// prior on first-year observer effect
  
  
  sdstrata ~ std_normal(); //prior on sd of intercept variation
  sdbeta ~ normal(0,0.1); //prior on sd of slope variation

  beta_p ~ std_normal(); //non centered prior on stratum-level slopes
  strata_p ~ std_normal(); //non centered prior on stratum-level slopes

  //sum to zero constraints
  sum(strata_p) ~ normal(0,0.001*nstrata);
  sum(beta_p) ~ normal(0,0.001*nstrata);
  
  
       obs = sdobs*obs_raw;
     noise = sdnoise*noise_raw;
      

// covariate effect on intercepts and slopes
  beta = (sdbeta*beta_p) + BETA;
  strata = (sdstrata*strata_p) + STRATA;
  
   for(s in 1:nstrata){
     yeareffect[s,] = sdyear[s]*yeareffect_raw[s,];
   }
  

  for(i in 1:ncounts){
    E[i] =  beta[strat[i]] * (year[i]-fixedyear) + strata[strat[i]] + yeareffect[strat[i],year[i]] + obs[obser[i]] + eta*firstyr[i] + noise[i];
  }
  
    count ~ poisson_log(E); //vectorized count likelihood with log-transformation
 
}
1 Like