- Operating System: Windows 10, Intel Xeon processor
- brms Version: 2.10.0
Hello! This is a follow-up to my original post (Priors on predicted shape parameter(s) in mixed zinb model), where I received helpful feedback on interpreting the shape variance parameter in ZINB models. Here I am looking for assistance in properly specifying the model.
I am attempting a zero-inflated negative binomial (ZINB) mixed-effects model, and I am also interested in predicting the shape (dispersion) parameter, so as to assess whether variance differs across years in my model. I realize variance parameters are tricky and can require a lot of data, but if there is a way, I would like to predict it.
I am predicting a count response: cars95 (annual reproductive success).
Predictors include:
Main effects:
- age (discrete) - most central to my question
- year (discrete) - most central to my question
- mort (0/1) - sparse data, not my primary focus
- bwev (0/1) - sparse data, not my primary focus
- prevsire (0/1) - sparse data, not my primary focus
- age x year interaction - most central to my question
Group-level effects:
- age|id (where id is a categorical identifier for individuals across years; this group-level intercept and slope for age are needed to account for repeated observations of the same individual across years)
Thus, the model I would like to fit is:
bf(cars95 ~ age * year + mort + bwev + prevsire + (age | id),
   shape ~ year)
This model fails to converge with default priors:
    prior                  class      coef       group  resp  dpar   nlpar  bound
1                          b
2                          b          age
3                          b          age:year
4                          b          bwev1
5                          b          mort1
6                          b          prevsire1
7                          b          year
8   student_t(3, -2, 10)   Intercept
9   student_t(3, 0, 10)    Intercept                          shape
10  lkj_corr_cholesky(1)   L
11                         L                     id
12  student_t(3, 0, 10)    sd
13  student_t(3, 0, 10)    sd                                 shape
14                         sd                    id
15                         sd         age        id
16                         sd         Intercept  id
17                         sd                    year         shape
18                         sd         Intercept  year         shape
19  beta(1, 1)             zi
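For reference, default priors like the ones above can be listed before fitting with get_prior(); a minimal sketch, assuming the same formula, data, and family as in the reproducible example at the end of this post:
#list the default priors brms would use for this model (before fitting)
get_prior(bf(cars95 ~ age * year + mort + bwev + prevsire + (age | id),
             shape ~ year),
          data = c95data,
          family = zero_inflated_negbinomial())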
The run produces the following warnings:
Warning messages:
1: There were 861 divergent transitions after warmup. Increasing adapt_delta above 0.9 may help. See
http://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
2: There were 59139 transitions after warmup that exceeded the maximum treedepth. Increase max_treedepth above 10. See
http://mc-stan.org/misc/warnings.html#maximum-treedepth-exceeded
3: Examine the pairs() plot to diagnose sampling problems
4: The largest R-hat is 4.13, indicating chains have not mixed.
Running the chains for more iterations may help. See
http://mc-stan.org/misc/warnings.html#r-hat
5: Bulk Effective Samples Size (ESS) is too low, indicating posterior means and medians may be unreliable.
Running the chains for more iterations may help. See
http://mc-stan.org/misc/warnings.html#bulk-ess
6: Tail Effective Samples Size (ESS) is too low, indicating posterior variances and tail quantiles may be unreliable.
Running the chains for more iterations may help. See
http://mc-stan.org/misc/warnings.html#tail-ess
Note, however, that if I do not predict shape, the model converges fine. There is a lot of uncertainty around the shape parameter when no priors are given, so for this model I tweaked the default prior on shape from gamma(0.01, 0.01) to gamma(1, 1), though I admittedly did not follow a rigorous method for doing so.
Family: zero_inflated_negbinomial
Links: mu = log; shape = identity; zi = identity
Formula: cars95 ~ age * year + mort + bwev + prevsire + (age | id)
Data: c95data (Number of observations: 220)
Samples: 4 chains, each with iter = 15000; warmup = 5000; thin = 2;
total post-warmup samples = 20000
Group-Level Effects:
~id (Number of levels: 60)
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(Intercept) 1.82 0.97 0.15 3.86 1.00 4798 9806
sd(age) 0.54 0.22 0.14 1.02 1.00 4365 6019
cor(Intercept,age) -0.63 0.37 -0.96 0.52 1.00 6200 6898
Population-Level Effects:
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept -4.04 1.77 -7.93 -0.98 1.00 7970 13616
age 1.44 0.61 0.33 2.72 1.00 7962 14286
year 0.10 0.23 -0.35 0.58 1.00 14321 16159
mort1 0.58 0.90 -1.21 2.36 1.00 16313 17632
bwev1 -0.97 0.89 -2.81 0.70 1.00 16957 16902
prevsire1 0.07 0.50 -0.91 1.05 1.00 8872 14221
age:year -0.15 0.08 -0.30 -0.00 1.00 8792 14912
Family Specific Parameters:
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
shape 1.68 0.95 0.54 4.14 1.00 9684 14440
zi 0.12 0.09 0.01 0.34 1.00 14419 16585
Samples were drawn using sampling(NUTS). For each parameter, Eff.Sample
is a crude measure of effective sample size, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).
I’ve conducted loo tests and a number of posterior predictive checks on the converged model without a predicted shape. Many, but not all, check out. Problems that arise with this initial model are a p_loo estimate that is higher than the number of parameters in my model, and a somewhat poor-performing ppc_dens_overlay plot:
Loo output:
Computed from 20000 by 220 log-likelihood matrix
Estimate SE
elpd_loo -172.6 18.5
p_loo 34.2 6.5
looic 345.1 37.1
------
Monte Carlo SE of elpd_loo is 0.4.
Pareto k diagnostic values:
Count Pct. Min. n_eff
(-Inf, 0.5] (good) 197 89.5% 13
(0.5, 0.7] (ok) 23 10.5% 121
(0.7, 1] (bad) 0 0.0% <NA>
(1, Inf) (very bad) 0 0.0% <NA>
All Pareto k estimates are ok (k < 0.7).
See help('pareto-k-diagnostic') for details.
Posterior predictive checks (plots not reproduced here): bars, ECDF overlay, stat (mean), stat (mean and SD), and the default pp_check() density overlay.
My questions for this post:
- Might including more informative priors for a model where I predict the shape variance parameter help convergence?
- If so, what priors should I assign, particularly for the shape-specific parameters (the b, sd, and Intercept classes associated with dpar = shape)? A sketch of the kind of specification I have in mind follows this list.
- Do you have any other suggestions/thoughts to improve model convergence beyond improved prior specification?
- Would you agree with my assessment that the model NOT predicting shape (for which I’ve included the model output and posterior predictive checks above) ‘fits’ decently well, even considering the high p_loo value relative to my number of parameters?
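To make the prior question concrete, this is the kind of specification I have in mind. The distributions and scales below are placeholders for discussion rather than recommendations, and the sd line would only apply if year enters the shape part as a group-level term:
#hypothetical, more informative priors for the shape part of the model
#(placeholder values, not recommendations)
shape_part_priors <- c(
  set_prior("normal(0, 1)", class = "b", dpar = "shape"),
  set_prior("student_t(3, 0, 3)", class = "Intercept", dpar = "shape"),
  set_prior("student_t(3, 0, 1)", class = "sd", dpar = "shape")
)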
I’ve included a reproducible example together with a link to data:
library(dplyr)
library(brms)
library(ggeffects)
library(GGally)
library(rstantools)
library(rstanarm)
library(loo)
library(bayesplot)
c95data <- read.csv("ARS95KAc.csv")
c95data$id <- as.factor(c95data$id)
c95data$year <- as.integer(c95data$year) #year (2-8) as integer NOT categorical
c95data$mort <- as.factor(c95data$mort)
c95data$prevsire <- as.factor(c95data$prevsire)
c95data$bwev <- as.factor(c95data$bwev)
#MODELS
#NOT predicting shape
mod1 <- brm(cars95 ~ age * year + mort + bwev + prevsire + (age | id),
            data = c95data,
            family = zero_inflated_negbinomial(),
            prior = set_prior("gamma(1, 1)", class = "shape"),
            iter = 20000, warmup = 5000, thin = 2, chains = 4,
            control = list(adapt_delta = 0.99),
            cores = 6)
##Predicting shape -- DOES NOT CONVERGE
mod2_nopri <- brm(bf(cars95 ~ age * year + mort + bwev + prevsire + (age | id),
                     shape ~ year),
                  data = c95data,
                  family = zero_inflated_negbinomial(),
                  iter = 20000, warmup = 5000, thin = 2, chains = 4,
                  control = list(adapt_delta = 0.99),
                  cores = 6)
#LOO
loo1 <- loo(mod1, reloo = TRUE, cores = 6)
##PPCs
#posterior predictive checks on mod1
yrep <- posterior_predict(mod1)
y <- c95data$cars95
bayesplot::ppc_stat(y, yrep[1:50, ], stat = "mean")   #stat check: mean
brms::pp_check(mod1)                                  #density overlay
brms::pp_check(mod1, type = "stat_2d", stat = c("mean", "sd"), nsamples = 30)
brms::pp_check(mod1, type = "ecdf_overlay", discrete = TRUE, nsamples = 30)
brms::pp_check(mod1, type = "bars", nsamples = 30)
DATA: https://drive.google.com/file/d/1T51Cp0m8qHtsf2fZcpqchMQJo6dIdQn1/view?usp=sharing
Thank you in advance to everyone/anyone who gives my struggles and questions some time! It is so appreciated.
El