In this recent brilliant article, Dr. Kubinec proposed a new Bayesian model, Ordered Beta Regression, “for continuous distributions with both lower and upper bounds, such as data arising from survey slider scales, visual analog scales, and dose-response relationships”. The abstract, quoted below, explains it further. The accompanying R package, ordbetareg, uses brms under the hood.
“This model employs the cutpoint technique popularized by ordered logit to fit a single linear model to both continuous (0,1) and degenerate [0,1] responses. The model can be estimated with or without observations at the bounds, and as such is a general solution for this type of data. Employing a Monte Carlo simulation, I show that the model is noticeably more efficient than ordinary least squares regression, zero-and-one-inflated beta regression, re-scaled beta regression and fractional logit while fully capturing nuances in the outcome. I apply the model to a replication of the Aidt and Jensen (2012) study of suffrage extensions in Europe. The model can be fit with the R package ordbetareg to facilitate hierarchical, dynamic and multivariate modeling.”
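To make the cutpoint idea in the abstract more concrete, here is a minimal base-R sketch of the data-generating process as I understand it: one linear predictor, two ordered cutpoints that decide whether a response is 0, in (0,1), or 1, and a beta distribution (mean/precision form) for the middle part. The function name rordbeta_sketch, the cutpoint values, and phi are my own illustrative assumptions, not the ordbetareg API or estimates from any fitted model.

```r
# Hypothetical helper (not part of ordbetareg): simulate n outcomes in [0, 1]
# from a sketch of the ordered beta data-generating process.
rordbeta_sketch <- function(n, eta, cutpoints = c(-1, 1), phi = 5) {
  stopifnot(length(cutpoints) == 2, cutpoints[1] < cutpoints[2], phi > 0)
  eta <- rep_len(eta, n)

  # Ordered-logit style probabilities, all driven by the same linear predictor
  p_zero <- 1 - plogis(eta - cutpoints[1])  # Pr(y = 0)
  p_one  <- plogis(eta - cutpoints[2])      # Pr(y = 1)

  # Assign each observation to 0, 1, or the continuous middle (left as NA for now)
  u <- runif(n)
  y <- rep(NA_real_, n)
  y[u < p_zero] <- 0
  y[u > 1 - p_one] <- 1

  # Continuous part: beta with mean plogis(eta) and precision phi
  mid <- is.na(y)
  mu <- plogis(eta[mid])
  y[mid] <- rbeta(sum(mid), mu * phi, (1 - mu) * phi)
  y
}

set.seed(123)
y_sim <- rordbeta_sketch(n = 2000, eta = 0.3)

# Share of simulated responses at each bound and in between
round(c(zero   = mean(y_sim == 0),
        middle = mean(y_sim > 0 & y_sim < 1),
        one    = mean(y_sim == 1)), 2)
```

Because the cutpoints are ordered, the two bound probabilities never overlap, so every simulated observation falls into exactly one of the three parts.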
Given these exciting characteristics, I decided to fit the Ordered Beta Regression model described in ordbetareg’s vignette.
Following the vignette, I plotted a density-overlay posterior predictive check (PPC), which Dr. Kubinec interprets as follows:
“The model can’t capture all of the modality in the distribution – there are effectively four separate modes – but it is reasonably accurate over the middle responses and the responses near the bounds.”
The density plot doesn’t seem ideal for this model, probably because the outcome combines continuous values in (0,1) with discrete point masses at 0 and 1, which a kernel density estimate smooths over.
I therefore also tried four other PPC types (ECDF overlay, histograms, and the stat and stat_2d summaries):
Given these 5 different PPC plots, my first impression was that “this indicates a failure of the model to describe an aspect of the data” (Gelman et al., 2020).
However, Dr. Kubinec suggested that a PPC combining a histogram (for the continuous values) with a bar plot (for the responses at 0 and 1) would suit the Ordered Beta Regression model better. Unfortunately, I do not have the skills needed to create such a plot.
Question
I would like to use this case study to ask:
- How can I create the custom PPC plot mentioned above?
- How do you interpret the 5 PPC plots shown above?
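As a possible starting point for the first question, the idea could be sketched roughly as follows: one panel compares the observed share of responses at 0 and 1 (bars) with the posterior predictive shares (median and 90% interval), and a second panel overlays the density of each replicated dataset on a histogram of the observed values strictly inside (0,1). The helper name ppc_ordbeta_sketch is hypothetical and not part of ordbetareg, brms, or bayesplot. It only assumes an observed vector y and a draws-by-observations matrix yrep.

```r
library(ggplot2)

# Hypothetical helper: y is the observed outcome in [0, 1],
# yrep is a (draws x observations) matrix of posterior predictive draws.
ppc_ordbeta_sketch <- function(y, yrep, bins = 30) {
  stopifnot(is.matrix(yrep), ncol(yrep) == length(y))

  # Discrete part: share of responses exactly at each bound
  obs <- data.frame(bound = factor(c(0, 1)),
                    prop  = c(mean(y == 0), mean(y == 1)))
  rep_props <- cbind(rowMeans(yrep == 0), rowMeans(yrep == 1))
  pred <- data.frame(bound = factor(c(0, 1)),
                     mid = apply(rep_props, 2, median),
                     lo  = apply(rep_props, 2, quantile, probs = 0.05),
                     hi  = apply(rep_props, 2, quantile, probs = 0.95))

  p_disc <- ggplot(obs, aes(x = bound, y = prop)) +
    geom_col(fill = "grey80") +
    geom_pointrange(data = pred, aes(y = mid, ymin = lo, ymax = hi),
                    colour = "steelblue") +
    labs(x = "Bound", y = "Proportion of responses",
         title = "Responses at 0 and 1")

  # Continuous part: only values strictly between the bounds
  inside <- function(v) v[v > 0 & v < 1]
  cont_obs <- data.frame(value = inside(y))
  cont_rep <- do.call(rbind, lapply(seq_len(nrow(yrep)), function(i) {
    v <- inside(yrep[i, ])
    data.frame(draw = rep(i, length(v)), value = v)
  }))

  p_cont <- ggplot(cont_obs, aes(x = value)) +
    geom_histogram(aes(y = after_stat(density)), bins = bins,
                   boundary = 0, fill = "grey80") +
    geom_line(data = cont_rep, aes(group = draw), stat = "density",
              colour = "steelblue", alpha = 0.3) +
    labs(x = "Outcome in (0, 1)", y = "Density",
         title = "Continuous responses")

  list(discrete = p_disc, continuous = p_cont)
}

# Possible usage with the fitted model (assumptions: a brms version that
# accepts `ndraws`, and that brms::get_y() returns the outcome on the
# model's [0, 1] scale):
# yrep  <- brms::posterior_predict(ord_fit_mean, ndraws = 100)
# plots <- ppc_ordbeta_sketch(brms::get_y(ord_fit_mean), yrep)
# plots$discrete + plots$continuous  # combined with patchwork
```

Note that the density panel is conditional on a response falling inside (0,1), so it has to be read together with the bar panel rather than on its own.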
Lastly, I’d like to highlight that my ultimate goal is to understand the Ordered Beta Regression model better so I can use it in future analyses. It looks extremely promising.
Code:
pacman::p_load(ordbetareg,
               brms,
               dplyr,     # needed for select(), mutate(), mutate_at() and %>%
               ggplot2,
               patchwork)
data("pew")
model_data <- select(pew, therm,
                     age = "F_AGECAT_FINAL",
                     sex = "F_SEX_FINAL",
                     income = "F_INCOME_FINAL",
                     ideology = "F_IDEO_FINAL",
                     race = "F_RACETHN_RECRUITMENT",
                     education = "F_EDUCCAT2_FINAL",
                     region = "F_CREGION_FINAL",
                     approval = "POL1DT_W28",
                     born_again = "F_BORN_FINAL",
                     relig = "F_RELIG_FINAL",
                     news = "NEWS_PLATFORMA_W28") %>%
  # drop the last level of each factor, so those responses become NA
  mutate_at(c("race", "ideology", "income", "approval", "sex", "education", "born_again", "relig"), function(c) {
    factor(c, exclude = levels(c)[length(levels(c))])
  }) %>%
  # need to make these ordered factors for BRMS
  mutate(education = ordered(education),
         income = ordered(income))
# Fit model
# ord_fit_mean <- ordbetareg(formula=therm ~ mo(education) + mo(income) + (1|region),
# data=model_data,
# cores=2,chains=2,iter=1000)
# Instead of fitting the model, one can load it through the ordbetareg package
data("ord_fit_mean")
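# Plot 1: default PPC (density overlay)
pp_check(ord_fit_mean)
# Plots 2-3: ECDF overlay (A) and histograms (B), combined with patchwork
pp_check(ord_fit_mean, type = "ecdf_overlay") +
  pp_check(ord_fit_mean, type = "hist") +
  plot_annotation(tag_levels = "A")
# Plots 4-5: test-statistic PPCs, one statistic (A) and two statistics (B)
pp_check(ord_fit_mean, type = "stat") +
  pp_check(ord_fit_mean, type = "stat_2d") +
  plot_annotation(tag_levels = "A")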