Hello,
I am analysing cross-sectional time-series aggregate data that summarise the annual effect of an environmental exposure (EE) on body-weight (bwt) in large groups of people (300-30000, median 1000). The same data are available as either the mean bwt (which I call xbwt) for each entire group, or as the number of people who exceed a pre-set bwt threshold in each group (which I call tbwt).
The environmental exposure is measured with error (ee.err) and I wish to account for this measurement error. So, I included me(EE, ee.err) in my analyses. The analyses also covary other aggregate population indices (ethnicity, income, etc) and random effects of large geographical regions and smaller socioeconomic areas (within regions) and I weight them by the inverse probability (ipwt) of receiving exposure, for each group.
My code for the negative binomial analysis is:-
mod1=brm(tbwt|weights(ipwt) ~ year+s(region,bs=‘mrf’,xt=list(polys=geom))
+(1|area)+ethnicity+income… etc… +me(EE,ee.err)+offset(log(N)), family=negbinomial(log))
My code for the analogous gaussian analysis of the aggregate mean bwt (xbwt) is:-
mod2=brm(xbwt|weights(ipwt) ~ year+s(region,bs=‘mrf’,xt=list(polys=geom))
+(1|area)+ethnicity+income… etc… +me(EE,ee.err), family=gaussian)
mod2 has the problem outlined below - but mod2a runs well:-
mod2a=brm(xbwt|weights(ipwt) ~ year+s(region,bs=‘mrf’,xt=list(polys=geom))
+(1|area)+ethnicity+income… etc… +EE, family=gaussian)
I run all the above models with:-
bwprior=c(set_prior(‘normal(0,1)’,class=‘b’),set_prior(‘normal(0,ee.err)’,class=‘meanme’))
brm(model, data = dat, chains = 4, cores = 4, iter=2500, prior=bwprior, future=F, control=list(adapt_delta=.95, max_treedepth = 15), inits=0, seed=123, save_mevars=T)
mod1 runs quickly (about an hour) and efficiently (all Rhats = 1.00 and all effective sample sizes >1000).
In contrast, mod2 runs slowly (3-8 hours) and extremely inefficiently (most Rhats>100 and most effective sample sizes<10).
mod2a runs as quickly and efficiently as the negative binomial model.
I am puzzled. If anything, I would have expected including the me(EE,ee.err) term to affect the negative binomial model, not the normal model.
Could you please explain what is happening here?
With many thanks, in anticipation of your help
Jonathan Williams
R version 3.4.3 (2017-11-30)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 18362)
Matrix products: default
locale:
[1] LC_COLLATE=English_United Kingdom.1252 LC_CTYPE=English_United Kingdom.1252
[3] LC_MONETARY=English_United Kingdom.1252 LC_NUMERIC=C
[5] LC_TIME=English_United Kingdom.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] brms_2.8.0 Rcpp_1.0.1
loaded via a namespace (and not attached):
[1] mvtnorm_1.0-8 lattice_0.20-35 zoo_1.8-5 gtools_3.5.0
[5] assertthat_0.2.0 digest_0.6.12 mime_0.5 R6_2.2.2
[9] plyr_1.8.4 backports_1.1.2 ggridges_0.5.0 stats4_3.4.3
[13] coda_0.19-1 ggplot2_2.2.1 colourpicker_1.0 pillar_1.3.1
[17] rlang_0.4.0 lazyeval_0.2.0 miniUI_0.1.1 Matrix_1.2-12
[21] DT_0.2 shinythemes_1.1.1 shinyjs_0.9.1 stringr_1.4.0
[25] htmlwidgets_1.3 loo_2.1.0 igraph_1.1.2 munsell_0.5.0
[29] shiny_1.0.5 compiler_3.4.3 httpuv_1.3.5 rstan_2.17.3
[33] pkgconfig_2.0.1 base64enc_0.1-3 rstantools_1.4.0 htmltools_0.3.6
[37] tidyselect_0.2.5 tibble_2.1.3 gridExtra_2.3 threejs_0.3.1
[41] matrixStats_0.52.2 crayon_1.3.4 dplyr_0.8.3 grid_3.4.3
[45] nlme_3.1-131 xtable_1.8-2 gtable_0.2.0 magrittr_1.5
[49] StanHeaders_2.17.2 scales_1.0.0 stringi_1.2.4 reshape2_1.4.2
[53] dygraphs_1.1.1.4 xts_0.11-2 tools_3.4.3 glue_1.3.0
[57] shinystan_2.4.0 markdown_0.8 purrr_0.3.2 crosstalk_1.0.0
[61] rsconnect_0.8.5 abind_1.4-5 parallel_3.4.3 inline_0.3.14
[65] colorspace_1.3-2 bridgesampling_0.4-0 bayesplot_1.5.0 Brobdingnag_1.2-4