Confused by estimate vs. marginal_effects plot for a zero-inflated beta model in brms

Hello everyone,

I am still new to the Bayesian world and I am confused by the following.
I ran a study in which participants self-reported their current level of fatigue hourly on a 100-point scale, and I want to test whether the level of fatigue predicts how much a participant uses their smartphone in the subsequent time frame while at work (an app on the participant’s smartphone calculates the amount of smartphone use in seconds). It turned out that smartphone use is not normally distributed and is heavily zero-inflated, so I transformed it into a proportion of the maximum possible use in the given time frame (e.g., dividing by 600 seconds for a 10-minute window) to fit a zero-inflated beta model (please let me know if this makes sense to you). I also standardized fatigue within participants so that it has a mean of 0 and an SD of 0.5; a rough sketch of this preprocessing follows the model call below. I then fitted the following model:

m.f.total_a10_beta <- brm(bf(total_a10_beta ~ 1 + fatigueS + (1 + fatigueS | pp/day),
                             zi ~ 1 + fatigueS + (1 + fatigueS | pp/day)),
                          data = appData_beta_10, family = zero_inflated_beta(),
                          prior = c(set_prior("normal(0, 0.5)", class = "b"),
                                    set_prior("cauchy(0, 1)", class = "sd")),
                          inits = 0, control = list(adapt_delta = 0.90))
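
For reference, the preprocessing described above looks roughly like this (just a sketch: use_seconds and fatigue are placeholder column names for the raw app seconds and the raw fatigue ratings, and dplyr is only one way to do it):

```r
library(dplyr)

appData_beta_10 <- appData_beta_10 %>%
  # proportion of the maximum possible use in a 10-minute window (600 seconds)
  mutate(total_a10_beta = use_seconds / 600) %>%
  # within-participant standardization of fatigue to mean 0 and SD 0.5
  group_by(pp) %>%
  mutate(fatigueS = 0.5 * (fatigue - mean(fatigue)) / sd(fatigue)) %>%
  ungroup()
```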

Rhat values, effective samples, and trace plots indicate that the model converged.

summary(m.f.total_a10_beta)
Family: zero_inflated_beta
Links: mu = logit; phi = identity; zi = logit
Formula: total_a10_beta ~ 1 + fatigueS + (1 + fatigueS | pp/day)
zi ~ 1 + fatigueS + (1 + fatigueS | pp/day)
Data: appData_beta_10 (Number of observations: 1531)
Samples: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
total post-warmup samples = 4000

Group-Level Effects:
~pp (Number of levels: 82)
Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
sd(Intercept) 0.36 0.08 0.20 0.50 731 1.01
sd(fatigueS) 0.11 0.08 0.00 0.31 2369 1.00
sd(zi_Intercept) 0.84 0.10 0.65 1.05 1575 1.00
sd(zi_fatigueS) 0.21 0.15 0.01 0.57 1166 1.00
cor(Intercept,fatigueS) -0.20 0.56 -0.97 0.91 3453 1.00
cor(zi_Intercept,zi_fatigueS) -0.01 0.53 -0.94 0.93 3102 1.00

~pp:day (Number of levels: 243)
Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
sd(Intercept) 0.09 0.07 0.00 0.24 983 1.01
sd(fatigueS) 0.11 0.09 0.00 0.32 2395 1.00
sd(zi_Intercept) 0.26 0.15 0.01 0.56 538 1.02
sd(zi_fatigueS) 0.28 0.20 0.01 0.75 729 1.00
cor(Intercept,fatigueS) 0.03 0.58 -0.95 0.95 3414 1.00
cor(zi_Intercept,zi_fatigueS) 0.14 0.56 -0.92 0.96 1443 1.01

Population-Level Effects:
Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
Intercept -1.18 0.07 -1.32 -1.05 1991 1.00
zi_Intercept 0.44 0.11 0.23 0.65 1427 1.00
fatigueS -0.07 0.09 -0.24 0.10 4907 1.00
zi_fatigueS -0.24 0.12 -0.48 -0.00 4659 1.00

Family Specific Parameters:
Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
phi 2.40 0.15 2.11 2.72 2072 1.00

Samples were drawn using sampling(NUTS). For each parameter, Eff.Sample
is a crude measure of effective sample size, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).

This is the summary of the model. Looking at the population-level effects, I see that the predictor fatigueS has a small, negative estimate. However, when I plot the marginal effect, I get the following:

[marginal_effects plot of fatigueS]

Here, it looks like as fatigue increases, so does subsequent smartphone use (albeit not by much).
I have never encountered anything like this with lme4, and I am at a loss as to how to interpret it.
Maybe someone can help? Sorry if I did not provide all the information needed…

Best,
Jonas

  • Operating System: macOS Mojave 10.14.6
  • brms Version: 2.9.0

This makes sense. However, it is always preferable not to rely solely on intuition and to check that your model captures the data with posterior predictive checks (see pp_check and browse these forums for more details).
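
For example, something along these lines (a sketch; nsamples is the argument name in brms 2.9.0, and prop_zero is just an illustrative statistic I am making up here to check the zero part):

```r
# Overlay densities of 100 replicated data sets on the observed distribution
pp_check(m.f.total_a10_beta, nsamples = 100)

# Does the model reproduce the observed proportion of exact zeros?
prop_zero <- function(x) mean(x == 0)
pp_check(m.f.total_a10_beta, type = "stat", stat = "prop_zero", nsamples = 100)
```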

Also, do I understand correctly that pp marks your participants and day marks the days of the study?

This might be because the marginal effects plot also includes the varying slopes from (1 + fatigueS | pp/day), so quite possibly the participant and day you are plotting the marginal effect for have a positive varying slope that overrides the overall negative trend. You might try changing the pp and day values you plot the marginal effect for and/or investigating the estimates of the individual varying slopes; a sketch of both follows below.
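
Roughly like this (a sketch using ranef and marginal_effects as in brms 2.9.0; the "1" levels are placeholders that you would replace with participant/day levels actually present in appData_beta_10):

```r
# Individual varying slopes of fatigueS per participant
ranef(m.f.total_a10_beta)$pp[, , "fatigueS"]

# Population-level effect only, ignoring all varying effects
marginal_effects(m.f.total_a10_beta, effects = "fatigueS", re_formula = NA)

# Effect for one specific participant/day combination
cond <- data.frame(pp = "1", day = "1")
marginal_effects(m.f.total_a10_beta, effects = "fatigueS",
                 conditions = cond, re_formula = NULL)
```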

And a technical note: you can use triple backticks (```) to mark blocks of text as code or program output, which would make the model summary nicer to read.

Hope that helps!