Thanks so much Bob. Very helpful comments indeed. I will try to expand and add some clarity.
In the dataset, we only have point estimates, and 95% confidence intervals for sensitivity and specificity for each study, as in d
above. We don’t have access to the numbers of participants who were true positives
, true negatives
, false positives
, and false negatives
, nor overall denominators.
So derived.phi_sens
& derived.phi_spec
allows us to weight study level estimates by their precision. p
in the brm
formula allows us to model the correlation between sensitivity and specificity within studies. e.g. see here..
I followed your advice and also modelled these data on the unconstrained log odds scale following this approach:
mdata2 <- d %>%
mutate(
logodds_sens = log(sens/(1-sens)),
logodds_spec = log(spec/(1-spec))
) %>%
mutate(
# SE on probability scale
se_sens_prob = (sens_upper - sens_lower) / (2 * 1.96),
se_spec_prob = (spec_upper - spec_lower) / (2 * 1.96),
# SE on log-odds scale
se_sens_logodds = se_sens_prob / (sens * (1 - sens)),
se_spec_logodds = se_spec_prob / (spec * (1 - spec))
)
m2 <- brm(
bf(logodds_sens | se(se_sens_logodds) ~ 1 + (1|p|id)) +
bf(logodds_spec | se(se_spec_logodds) ~ 1 + (1|p|id)) +
set_rescor(FALSE),
data = mdata2,
family = "gaussian",
control = list(adapt_delta = 0.99),
cores = 4,
chains = 4,
iter = 4000
)
Here is a summary of the posteriors for sens and spec on the probability scale:
add_epred_draws(object = m2,
newdata = mdata2,
re_formula = NA) %>%
mutate(epred_prop = inv_logit_scaled(.epred),
.category = case_when(.category=="logoddssens" ~"Sens",
.category=="logoddsspec" ~ "Spec")) %>%
ggplot() +
stat_halfeye(aes(x=epred_prop, fill=.category), alpha=0.8, ) +
facet_grid(.category~.)
I then fit an additional model m3
with the set_rescor(TRUE)
option, and compare all three in a table here:
(Note point estimates very similar between the beta models and log odds models, but slightly greater uncertainty with the beta model).
Would be very interested in your thoughts about:
- why this might be, and which model would be “preferred”, and why. Or some other alternative approach?
- any insights into modelling the residual correlation - I am not 100% clear about what this is doing and whether is necessary here.
Thanks!