Difficulties with non-linear hierarchical model

Hi All, first post. I’m fitting what is known as a functional response to a dataset of Peregrine Falcon prey consumption. Here is my model definition:

Del ~ Poisson(mu) # number of prey deliveries to chicks in a day

mu = a * N ^ (m + 1) / (1 + a * t * N ^ (m + 1)) * offset * chicks

this is the functional response formula, and the offset is the number of hours the nest was observed (by camera). chicks is the number of chicks in the nest, which converts the units of the functional response to per chick. a, t and m are parameters to be estimated during fitting that control the shape of the response, and N is prey density in the surrounding landscape.

I have taken the further step of allowing a and t to vary according to the number of chicks in the nest and their age

a = exp(alpha_a + beta_chick_a * chicks + beta_age_a * age)
t = exp(alpha_t + beta_chick_t * chicks + beta_age_t * age)

a and t must be > 0 for the model to make biological sense, hence exp()

priors for submodel terms:
alpha_a, alpha_t ~ N(0, 5)
beta_chick_a, beta_chick_t, beta_age_a, beta_age_t ~ N(0, 1)

m ~ halfnorm(0, 1)

this definition for m constrains the exponent in the model formula to be > 1, which is also necessary for biological plausibility.

This model runs cleanly in ulam from the rethinking package, although does exhibit very high posterior correlations between terms in the linear submodels for a and t.

My difficulty arises when I introduce a random effect for nest ID into the submodel for a, i.e.,

a = exp(alpha_a + beta_chick_a * chicks + beta_age_a * age + zz[nestID])
zz[nestID] ~ N(0, sigma_zz)
sigma_zz ~ halfnorm(0, 1)

then I get divergent iterations, R-hat warnings and poorly mixed chains:



I have tried the non-centered approach, i.e.,

a = exp(alpha_a + beta_chick_a * chicks + beta_age_a * age + zz[nestID] * sigma_zz)
zz[nestID] ~ N(0, 1)
sigma_zz ~ halfnorm(0, 1)

but this does not solve the problem. I can actually fit a much more complicated joint model for multiple prey species that also accounts for unidentified prey and uncertainty in prey density estimates, which also runs cleanly if no random effects are present (though with similarly high posterior correlations), but then fails if random effects are included (divergent iterations and poor mixing).

I’m somewhat raw as a Bayesian and would be really grateful if someone could maybe advise me here. I’m kind of at a loss as to a good approach.

Thanks!

Hi @Kevin_Hawkshaw and welcome. Could you please let us know if you are using rstan, rstanarm, brms, pystan, or something along those lines? It’s also helpful to know which versions of the software you are running. Having a snippet of the data, the model, and model call are also great. If you can’t supply the model then some fake data works too!

Thanks Ara, absolutely. I am using the ulam function in the rethinking package, which uses rstan.

here is the model in stan code:

data{
int Ndel_sb[1556];
vector[1556] hours_op;
vector[1556] area;
vector[1556] se_songbird;
vector[1556] prediction_songbird;
vector[1556] se_shorebird;
vector[1556] prediction_shorebird;
int fsite_year[1556];
vector[1556] age_std;
int chicks[1556];
}
parameters{
vector[1556] songbird;
vector[1556] shorebird;
vector[81] zz_sb;
real<lower=0> sigma_zz_sb;
real alpha_sb_aa;
real alpha_sb_tt;
real<lower=0> mm_sb;
real beta_sb_tt_chicks;
real beta_sb_aa_chicks;
real beta_sb_tt_age;
real beta_sb_aa_age;
}
model{
vector[1556] lambda_sb;
vector[1556] smallbird;
vector[1556] aa_sb;
vector[1556] tt_sb;
beta_sb_aa_age ~ normal( 0 , 1 );
beta_sb_tt_age ~ normal( 0 , 1 );
beta_sb_aa_chicks ~ normal( 0 , 1 );
beta_sb_tt_chicks ~ normal( 0 , 1 );
mm_sb ~ normal( 0 , 1 );
alpha_sb_tt ~ normal( 0 , 5 );
alpha_sb_aa ~ normal( 0 , 5 );
sigma_zz_sb ~ normal( 0 , 1 );
zz_sb ~ normal( 0 , sigma_zz_sb );
for ( i in 1:1556 ) {
tt_sb[i] = exp(alpha_sb_tt + beta_sb_tt_chicks * chicks[i] + beta_sb_tt_age * age_std[i]);
}
for ( i in 1:1556 ) {
aa_sb[i] = exp(alpha_sb_aa + beta_sb_aa_chicks * chicks[i] + beta_sb_aa_age * age_std[i] + zz_sb[fsite_year[i]]);
}
shorebird ~ normal( 4.44 , 1 );
prediction_shorebird ~ lognormal( shorebird , se_shorebird );
songbird ~ normal( 6.64 , 1 );
prediction_songbird ~ lognormal( songbird , se_songbird );
for ( i in 1:1556 ) {
smallbird[i] = (exp(songbird[i]) + exp(shorebird[i]))/area[i]/24.30585 * 100;
}
for ( i in 1:1556 ) {
lambda_sb[i] = aa_sb[i] * smallbird[i]^(mm_sb + 1)/(1 + aa_sb[i] * tt_sb[i] * smallbird[i]^(mm_sb + 1)) * hours_op[i] * chicks[i];
}
Ndel_sb ~ poisson( lambda_sb );
}

Just to avoid confusion, the model accounts for uncertainty in prey abundance estimates - hence the data named prediction_ and se_

I’m using R version 4.0.3 and rstan 2.26.0.9000

@Kevin_Hawkshaw sorry last minute meeting with a student and rolling into the weekend here. If you are somewhat new to Bayesian stuff and Stan my recommendations would be to simplify the model.

I’d install the brms package and code up a simplified version of this model and use some fake data to make sure everything is working. From the brms package you can dump out the stan code. This way you can focus in the modelling aspects and then code it up in Stan later.

Someone in another thread reminded me that there is a great tutorial on non-linear models here:
https://cran.r-project.org/web/packages/brms/vignettes/brms_nonlinear.html

1 Like

Thanks Ara, that is a nice vignette on non-linear models. My problem seems to have been due to my selection of priors - in particular the priors for the intercept terms within the linear submodels. Because those models have log-link functions, the selection of priors is tricky. N(0, 5) was a poor choice because it piles up a ton of density between zero and one on the exponential scale. I switched those priors for N(3, 1), which gives those linear submodels broad support in the range 0-100. That combined with a non-centered parameterization for the random effects has rid me of the divergent transitions and poorly mixed chains, in both the small single species model that I posted, and also my larger multispecies functional response.

I guess the takeaway is beware prior selection in non-linear models. Thanks for pointing out that vignette, and I’ll say also that Chapter 11 of McElreath’s textbook was helpful as well.

1 Like