Hello, I’m trying to fit a multilevel model to some longitudinal data in brms. The data come from skin conductance recordings and are positively skewed, so I’ve adopted a log-normal model. The posterior predictive distribution (PPD) was heavily over-dispersed, even with weakly informative priors. So I reduced the model to a single-level one, and the results were the same. Switching to a Gamma family gave similar results, which surprised me because when I used rstanarm there was some over-dispersion, but not of the magnitude I’m seeing in the brms model. I should add that I’m new to brms, so perhaps there’s some specification I’m missing. Is there an explanation for this over-dispersion in the PPD? If so, could someone point me toward how to expand the model to resolve it?
Here’s the model and its summary.
library(brms)

# Priors are on the log scale because the family is lognormal()
priors <- c(
  set_prior("student_t(4, -3.4, 0.1)", class = "Intercept"),
  set_prior("normal(0, 1)", class = "b", coef = "Lat"),
  set_prior("normal(0, 1)", class = "b", coef = "Timings")
)

# Single-level model: SCR regressed on Timings and Lat
ln_fit <- brm(
  SCR ~ Timings + Lat,
  data = scr_data_long, family = lognormal(), prior = priors,
  chains = 4, cores = 3, warmup = 1000, iter = 2000, thin = 4
)
 Family: lognormal
  Links: mu = identity; sigma = identity
Formula: SCR ~ Timings + Lat
   Data: scr_data_long (Number of observations: 4263)
  Draws: 4 chains, each with iter = 2000; warmup = 1000; thin = 4;
         total post-warmup draws = 1000

Regression Coefficients:
          Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept    -5.70      0.15    -5.99    -5.43 1.00      828      931
Timings       0.03      0.00     0.02     0.03 1.00      846      953
Lat          -0.13      0.08    -0.29     0.01 1.00      984      837

Further Distributional Parameters:
      Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sigma     3.34      0.04     3.27     3.41 1.00      976      866

Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).
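To put the sigma estimate in perspective, here is a rough base-R sketch of what a log-normal with these parameters implies on the original scale. It plugs in the posterior means from the summary above as if they were fixed values and sets Timings and Lat to zero, both of which are simplifications, so it is only a back-of-the-envelope check rather than a proper posterior predictive calculation.

```r
# Point estimates copied from the summary above (treated as fixed: an assumption)
mu_hat    <- -5.70  # intercept on the log scale, i.e. Timings = 0 and Lat = 0
sigma_hat <- 3.34   # residual standard deviation on the log scale

# 2.5%, 50% and 97.5% quantiles of the implied log-normal on the SCR scale
interval <- qlnorm(c(0.025, 0.5, 0.975), meanlog = mu_hat, sdlog = sigma_hat)

# Ratio of the upper to the lower quantile: how many-fold the central 95% spans
spread_ratio <- interval[3] / interval[1]

# The log-normal mean exceeds the median by a factor of exp(sigma^2 / 2)
ln_median <- exp(mu_hat)
ln_mean   <- exp(mu_hat + sigma_hat^2 / 2)

print(signif(interval, 3))
print(signif(spread_ratio, 3))
print(signif(ln_mean / ln_median, 3))
```

Because the spread ratio grows as exp(2 * 1.96 * sigma), a log-scale sigma above 3 spans many orders of magnitude, which may be part of why individual PPD draws look so dispersed on the raw scale.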
I also show the mcmc plot below and one draw from the PPD in a histogram figure.
For reference, here’s a histogram of the actual data.
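Since raw-scale histograms of a heavy-tailed outcome are dominated by a few extreme draws, a comparison on the log scale may be easier to read. Below is a minimal base-R sketch of that idea. The vectors `y` and `yrep` are simulated stand-ins (hypothetical values, not the SCR recordings); with the fitted model, `yrep` could instead come from something like `posterior_predict(ln_fit, ndraws = 20)` and `y` from `scr_data_long$SCR`.

```r
set.seed(1)

# Stand-in for the observed outcome (hypothetical, strictly positive)
y <- rlnorm(500, meanlog = -5.7, sdlog = 1.5)

# Stand-in for posterior predictive draws: rows = draws, columns = observations
yrep <- matrix(rlnorm(20 * 500, meanlog = -5.7, sdlog = 3.34), nrow = 20)

# Compare the spread of data and replicates on the log scale
sd_obs <- sd(log(y))
sd_rep <- apply(log(yrep), 1, sd)

# Share of replicated data sets that are more dispersed than the observed data
p_more_dispersed <- mean(sd_rep > sd_obs)
print(p_more_dispersed)

# Overlay log-scale densities: replicates in grey, observed data in black
plot(density(log(y)), lwd = 2, main = "Log-scale PPC sketch", xlab = "log(SCR)")
for (i in seq_len(nrow(yrep))) {
  lines(density(log(yrep[i, ])), col = "grey70")
}
lines(density(log(y)), lwd = 2)
```

If the replicated log-scale standard deviations sit consistently above the observed one, that would point to the residual sigma absorbing variation the predictors do not explain, rather than to a plotting artifact.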