Few effective samples for a nested three-level logistic model with a random intercept

Please also provide the following information in addition to your question:

  • Operating System: Win10
  • brms Version: 2.8.0

Hello! I have built a three-level binary logistic model where a small group is nested within a big group. I have six population-level variables (a1–a5 are categorical and a6 is numerical) and one group-level variable (b1, a variable at the big-group level). As a quick, wide-coverage choice, I adopted a Cauchy(0, 2.5) prior for “sd”. However, I ran into a problem with my model results: there are very few effective samples for the “sd” estimates at both group levels, and also few effective samples for the “Intercept” and the group-level variable “b1”. Could anyone help me find the problem with my model? Are there any suggestions for dealing with it? Many thanks, Diva
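Roughly, the model was set up like this (a sketch reconstructed from the formula, prior, and sampler settings shown in the output below; the exact arguments may have differed):

library(brms)

# Sketch of the model call; formula, data name, family, prior, and
# sampler settings are taken from the summary output below.
fit <- brm(
  Rf ~ a1 + a2 + a3 + a4 + a5 + a6 + b1 + (1 | BigGroup/SmallGroup),
  data = orgin,
  family = bernoulli(),
  prior = set_prior("cauchy(0, 2.5)", class = "sd"),
  chains = 2, iter = 4000, warmup = 2000
)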

The output is shown below:

Family: bernoulli
Links: mu = logit
Formula: Rf ~ a1 + a2 + a3 + a4 + a5 + a6 + b1 + (1 | BigGroup/SmallGroup)
Data: orgin (Number of observations: 5314)
Samples: 2 chains, each with iter = 4000; warmup = 2000; thin = 1;
total post-warmup samples = 4000

Group-Level Effects:
~BigGroup (Number of levels: 14)
Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
sd(Intercept) 1.00 0.34 0.54 1.79 578 1.00

~BigGroup:SmallGroup (Number of levels: 115)
Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
sd(Intercept) 1.24 0.11 1.03 1.47 859 1.00

Population-Level Effects:
Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
Intercept -0.91 0.33 -1.57 -0.27 1272 1.00
a1F -0.09 0.07 -0.23 0.06 3535 1.00
a2W 0.31 0.11 0.10 0.52 2588 1.00
a2G 0.78 0.17 0.46 1.11 2646 1.00
a2S 0.45 0.19 0.09 0.80 2701 1.00
a3dd -0.16 0.10 -0.35 0.04 3081 1.00
a3nn -0.06 0.13 -0.31 0.18 2769 1.00
a3tm -0.13 0.11 -0.36 0.10 2998 1.00
a3ot -0.29 0.16 -0.61 0.01 3326 1.00
a4low -0.04 0.10 -0.23 0.15 3390 1.00
a4mid 0.05 0.12 -0.18 0.29 3150 1.00
a5HH 0.49 0.09 0.32 0.66 3228 1.00
a6 -0.06 0.07 -0.20 0.08 2324 1.00
b1 -0.28 0.34 -0.84 0.41 784 1.00

To me those look like reasonably healthy diagnostics, and I wouldn’t worry much as long as you get at least a few hundred effective samples. Note that 578 effective samples is plenty to determine the overall tendency and likely still enough for 90% credible intervals. Wider intervals (e.g. 95%) might be slightly unstable, but if you need good resolution for those, just run more iterations (via the iter argument), as in the sketch below.
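For example, assuming your fitted model object is called fit (name assumed), you can refit with more iterations via update():

# Refit with more post-warmup draws; update() re-runs the sampler
# with the new settings without changing the model itself.
fit_more <- update(fit, iter = 8000, warmup = 2000)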

That is actually quite a wide prior for logistic regression. Quickly simulating with

N_samples <- 1e4
# Draw an sd from a half-Cauchy(0, 2.5), draw an effect from N(0, sd),
# and estimate P(|effect| > log(100)), i.e. an odds ratio > 100 or < 0.01
mean(abs(rnorm(N_samples, 0, abs(rcauchy(N_samples, 0, 2.5)))) > log(100))

shows that this prior assumes almost a 25% probability of an effect with an odds ratio > 100 or < 0.01. This seems unrealistic to me for most contexts. But please make sure my reasoning and code make sense to you - I definitely do make mistakes :-)

A N(0, 2) prior puts the a priori probability of such large effects just below 5%, which is still quite wide but not unreasonably so.
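You can check this with the same simulation (brms restricts “sd” parameters to be positive, so a normal(0, 2) prior on “sd” acts as a half-normal, mimicked here with abs(rnorm(...))):

N_samples <- 1e4
# Same check as above, but with sd ~ half-normal(0, 2)
mean(abs(rnorm(N_samples, 0, abs(rnorm(N_samples, 0, 2)))) > log(100))
# comes out just below 0.05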

Hope that helps.

Thanks for your reply. Another question: how can I extract 90% credible intervals instead of the default 95% credible intervals? Thanks in advance.

Best,

Diva

Most functions, such as summary, posterior_interval, and many more, accept a prob argument, which you can set to prob = 0.9 for a 90% posterior interval. Check the documentation of the functions you use to summarize your results.
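For example, assuming your fitted model object is called fit:

summary(fit, prob = 0.9)              # 90% intervals in the summary table
posterior_interval(fit, prob = 0.9)   # 90% intervals for all parameters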

Thanks for your answer. I have a new problem, this time with an interaction. I constructed a formula like bf(Rf ~ a1 + a2 + a3 + a4 + a5 + a6 * b1 + (1 | BigGroup/SmallGroup)). But when I run the model in brms, errors appear such as “Error in unserialize(socklist[[n]]): error reading from connection” and “Error in serialize(data, node$con, xdr = FALSE): error writing to connection”. Could you please help me check what’s wrong with my model? Is the interaction correctly symbolized by “a6 * b1” in the formula?

Do you use the future package?

No, just brms & rstan. Is there any documentation explaining how to construct a formula with interaction terms?

This is almost certainly not (directly) about the interaction term. These errors appear when one of the parallel Stan worker processes crashes without producing a proper error message - this is typical of crashes within the C++ code. It might even be a bug in Stan, but the cause is often more innocuous (e.g. running out of memory). To troubleshoot, you should definitely run with chains = 1 and possibly also verbose = TRUE.
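A minimal sketch of such a debugging run (formula and names taken from the posts above; cores = 1 is added here so sampling stays in the main R process, where a crash surfaces as a real error instead of a broken worker connection):

# Note: a6 * b1 is standard R formula shorthand for a6 + b1 + a6:b1,
# i.e. both main effects plus their interaction - the syntax itself is fine.
fit_debug <- brm(
  Rf ~ a1 + a2 + a3 + a4 + a5 + a6 * b1 + (1 | BigGroup/SmallGroup),
  data = orgin, family = bernoulli(),
  chains = 1, cores = 1, verbose = TRUE
)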

My best bet would be memory - how many levels do you have for the a6 and b1 terms? If that number is very large, you could indeed run out of memory.