Hello everyone,
I am still new to the Bayesian world and I am confused by the following.
I ran a study in which participants self-reported their current level of fatigue every hour on a 100-point scale, and I want to test whether fatigue predicts how much a participant uses their smartphone in the subsequent time frame while at work (an app on the participant's smartphone records the amount of smartphone use in seconds). Smartphone use turned out to be non-normally distributed and heavily zero-inflated, so I converted it into a proportion of the maximum possible use in the given time frame (e.g., dividing by 600 seconds for a 10-minute window) in order to fit a zero-inflated beta model (please let me know if this makes sense to you). I standardized fatigue within participants so that it has a mean of 0 and an SD of 0.5. I then fitted the following model:
m.f.total_a10_beta <- brm(
  bf(total_a10_beta ~ 1 + fatigueS + (1 + fatigueS | pp/day),
     zi ~ 1 + fatigueS + (1 + fatigueS | pp/day)),
  data = appData_beta_10,
  family = zero_inflated_beta(),
  prior = c(set_prior("normal(0, 0.5)", class = "b"),
            set_prior("cauchy(0, 1)", class = "sd")),
  inits = 0,
  control = list(adapt_delta = 0.90)
)
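In case it helps to judge the preprocessing, here is a minimal sketch of the two transformations described above, run on made-up toy data. The raw column names `total_a10` and `fatigue`, the helper `standardize_half_sd`, and the toy values are placeholders assumed for illustration; only `total_a10_beta`, `fatigueS`, and `pp` correspond to names in my model.

```r
# Toy data (made up). `total_a10` = seconds of use in the 10-minute window,
# `fatigue` = raw 0-100 rating; both column names are assumptions.
appData_beta_10 <- data.frame(
  pp        = rep(c("p1", "p2"), each = 4),
  fatigue   = c(20, 35, 50, 80, 10, 40, 60, 70),
  total_a10 = c(0, 120, 0, 600, 30, 0, 0, 450)
)

window_seconds <- 600  # maximum possible use in a 10-minute window

# Proportion of the maximum possible use in the window
appData_beta_10$total_a10_beta <- appData_beta_10$total_a10 / window_seconds

# zero_inflated_beta() handles exact zeros but still requires y < 1,
# so count windows in which the phone was used the whole time
n_ones <- sum(appData_beta_10$total_a10_beta >= 1)

# Standardize within participant to mean 0 and SD 0.5
# (hypothetical helper: centre, then divide by two standard deviations)
standardize_half_sd <- function(x) (x - mean(x)) / (2 * sd(x))
appData_beta_10$fatigueS <- ave(appData_beta_10$fatigue,
                                appData_beta_10$pp,
                                FUN = standardize_half_sd)
```

In the toy data one window has a proportion of exactly 1, which is the kind of value that would have to be handled before fitting, since the beta part of the model only covers values strictly between 0 and 1.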
Rhat values, effective samples, and trace plots indicate that the model converged.
summary(m.f.total_a10_beta)

 Family: zero_inflated_beta
  Links: mu = logit; phi = identity; zi = logit
Formula: total_a10_beta ~ 1 + fatigueS + (1 + fatigueS | pp/day)
         zi ~ 1 + fatigueS + (1 + fatigueS | pp/day)
   Data: appData_beta_10 (Number of observations: 1531)
Samples: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
         total post-warmup samples = 4000

Group-Level Effects:
~pp (Number of levels: 82)
                              Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
sd(Intercept)                     0.36      0.08     0.20     0.50        731 1.01
sd(fatigueS)                      0.11      0.08     0.00     0.31       2369 1.00
sd(zi_Intercept)                  0.84      0.10     0.65     1.05       1575 1.00
sd(zi_fatigueS)                   0.21      0.15     0.01     0.57       1166 1.00
cor(Intercept,fatigueS)          -0.20      0.56    -0.97     0.91       3453 1.00
cor(zi_Intercept,zi_fatigueS)    -0.01      0.53    -0.94     0.93       3102 1.00

~pp:day (Number of levels: 243)
                              Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
sd(Intercept)                     0.09      0.07     0.00     0.24        983 1.01
sd(fatigueS)                      0.11      0.09     0.00     0.32       2395 1.00
sd(zi_Intercept)                  0.26      0.15     0.01     0.56        538 1.02
sd(zi_fatigueS)                   0.28      0.20     0.01     0.75        729 1.00
cor(Intercept,fatigueS)           0.03      0.58    -0.95     0.95       3414 1.00
cor(zi_Intercept,zi_fatigueS)     0.14      0.56    -0.92     0.96       1443 1.01

Population-Level Effects:
             Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
Intercept       -1.18      0.07    -1.32    -1.05       1991 1.00
zi_Intercept     0.44      0.11     0.23     0.65       1427 1.00
fatigueS        -0.07      0.09    -0.24     0.10       4907 1.00
zi_fatigueS     -0.24      0.12    -0.48    -0.00       4659 1.00

Family Specific Parameters:
    Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
phi     2.40      0.15     2.11     2.72       2072 1.00

Samples were drawn using sampling(NUTS). For each parameter, Eff.Sample
is a crude measure of effective sample size, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).
This is the summary from the model. If I look at the population-level effects, I see that the predictor fatigueS has a small, negative estimate. However, when I plot the marginal effect I get the following:
Here, it looks as though subsequent smartphone use increases with fatigue (albeit not by much), which is the opposite direction from the fatigueS estimate in the summary.
I have never encountered anything like this with lme4, and I am at a loss as to how to interpret it.
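One thing I am unsure about is whether the plot shows the expected value of the whole response rather than the mu part alone. As far as I understand, the mean of a zero-inflated beta response is (1 - zi) * mu, with both mu and zi on the logit scale in my model. Below is a rough sketch of that calculation using only the posterior means of the population-level effects from the summary above. Ignoring the group-level effects and the posterior uncertainty is a simplifying assumption, and the object names are made up for illustration.

```r
# Posterior means copied from the population-level effects in the summary
b_mu <- c(Intercept = -1.18, fatigueS = -0.07)  # beta part (logit link)
b_zi <- c(Intercept =  0.44, fatigueS = -0.24)  # zero-inflation part (logit link)

fatigue_grid <- c(-1, 0, 1)  # illustrative values of fatigueS

# Mean of the non-zero (beta) part
mu <- plogis(b_mu[["Intercept"]] + b_mu[["fatigueS"]] * fatigue_grid)

# Probability of an exact zero
zi <- plogis(b_zi[["Intercept"]] + b_zi[["fatigueS"]] * fatigue_grid)

# Expected proportion of use, combining both parts
expected_use <- (1 - zi) * mu

round(data.frame(fatigueS = fatigue_grid, mu, zi, expected_use), 3)
```

If this reading is right, mu drops slightly with fatigue while the probability of a zero drops more strongly, so the combined expected use goes up over this grid (roughly 0.08, 0.09, 0.10). That would match the direction of the plot, but I would appreciate confirmation that this is what the marginal effects plot actually displays.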
Maybe someone can help? Sorry if I did not provide all the information needed…
Best,
Jonas
- Operating System: macOS Mojave 10.14.6
- brms Version: 2.9.0