Beta family distribution and posterior predictive checks with continuous outcome variable

Hi everyone!

I’m relatively new to Bayesian modeling, and I was hoping to get some insights into my approach.

I have a dataset of ratings in which people make binary choices about a given stimulus. They rate each stimulus many times over the course of an experiment, yielding a continuous average-percent-chosen variable bounded between 0 and 1 (PercentChosen). I am interested in how features of the stimulus (Stimulus) and the condition people were assigned to (Condition) influence this average proportion. There are 21 pairs of stimuli (Pair) and 57 participants (Participant) in this dataset. Stimulus is a continuous numeric variable, Condition is a factor with two levels (Friend, Rival), Pair is a factor with 21 levels, and Participant is a factor with 57 levels.

Please see my full brms model below:

brm(PercentChosen ~ Stimulus * Condition + (Condition | Pair) + (Condition | Participant),
    data = df,
    prior = c(prior(normal(0, 10), class = "b"),
              prior(cauchy(0, 5), class = "sd"),
              prior(normal(0.5, 1), class = "Intercept")),
    family = Beta(link = "logit"),
    iter = 10000,
    warmup = 2000,
    init = 0,
    chains = 4,
    cores = 6)

Here is the model summary I get:

Family: beta 
Links: mu = logit; phi = identity 
Formula: PercentChosen ~ Stimulus * Condition + (Condition | Pair) + (Condition | Participant) 
Data: df (Number of observations: 9500) 
Draws: 4 chains, each with iter = 10000; warmup = 2000; thin = 1;
total post-warmup draws = 32000

Group-Level Effects: 
~Pair (Number of levels: 21) 
                              Estimate Est.Error l-95% CI u-95% CI    Rhat Bulk_ESS Tail_ESS
sd(Intercept)                  0.71489   0.12139  0.52183  0.99565 1.00095     6334    11065
sd(ConditionA)                 1.24624   0.21052  0.91177  1.73056 1.00112     6247    11241
cor(Intercept,ConditionA)     -0.97654   0.01324 -0.99278 -0.94268 1.00057     9074    15853

~Participant (Number of levels: 57) 
                              Estimate Est.Error l-95% CI u-95% CI    Rhat Bulk_ESS Tail_ESS
sd(Intercept)                  0.01371   0.01070  0.00050  0.03988 1.00017    26272    17285
sd(ConditionA)                 0.02098   0.01625  0.00081  0.06054 1.00019    23371    17556
cor(Intercept,ConditionA)     -0.16661   0.57854 -0.97376  0.93088 1.00006    45602    22570

Population-Level Effects: 
                                    Estimate Est.Error l-95% CI u-95% CI    Rhat Bulk_ESS Tail_ESS
Intercept                            0.01210   0.16454 -0.31312  0.33992 1.00129     3561     6790
Stimulus                            -0.21907   0.13921 -0.49233  0.05294 0.99993    37531    25153
ConditionA                          -0.11456   0.28352 -0.67777  0.44660 1.00123     3673     7099
Stimulus:ConditionA                  0.58161   0.19753  0.20340  0.96824 1.00014    38952    26825

Family Specific Parameters: 
    Estimate Est.Error l-95% CI u-95% CI    Rhat Bulk_ESS Tail_ESS
phi  0.92056   0.01099  0.89916  0.94206 1.00002    73177    22498

My question arises when I run posterior predictive checks on this model with pp_check().

pp_check(model, ndraws = 50)

pp_check(model, "stat_grouped", ndraws = 50, group = "Condition")

Based on online tutorials I have been following (namely Causal inference with beta regression | A. Solomon Kurz), the Beta distribution should be appropriate for a continuous outcome bounded on (0, 1). However, I'm not sure why there are such large discrepancies between the observed data and the posterior draws in my pp_check() plots.

Any insights or guidance would be greatly appreciated, thank you!

Hello @helen_schmidt, what do the data look like? Based on that PPC, it looks to me like the data are strongly clustered around some round numbers (0, 0.2, 0.4, 0.6, 0.8, and 1, or thereabouts).
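A quick histogram of the raw outcome would show this (a minimal sketch, using the df and PercentChosen names from your post):

# Look for clumping at a small set of attainable proportions.
hist(df$PercentChosen, breaks = 100,
     main = "Observed PercentChosen", xlab = "Proportion chosen")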

Another potential issue with the model is that the priors are quite wide. On the logistic scale this can create some strange behaviour because it puts a lot of prior weight on very large effects.
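To see why, you can push draws from that normal(0, 10) slope prior through the inverse logit; the implied probabilities pile up near 0 and 1 (a quick sketch, not tied to your fitted model):

# Draws from a normal(0, 10) prior on the logit scale, mapped to the
# probability scale with the inverse logit: most of the prior mass
# ends up very close to 0 or 1.
set.seed(123)
b <- rnorm(1e5, mean = 0, sd = 10)
hist(plogis(b), breaks = 50,
     main = "Implied prior on the probability scale")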

Hi there,

I will add some thoughts to what @AWoodward already said. To me, the data are not really continuous. If, for example, each stimulus receives 10 ratings, the only proportions you can observe are 0, 0.1, 0.2, …, 0.9, 1, which violates the Beta distribution's assumption of a genuinely continuous outcome.

Have you considered using a binomial model for your outcome? I think it would fit better with what you are doing, since each individual rating can be thought of as a "success" or "failure".
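Something like this, perhaps (a sketch; I'm assuming you can recover the raw counts rather than only the averaged proportion, with hypothetical column names SelectedCount and PresentedCount):

# Model the number of times each stimulus was chosen (SelectedCount)
# out of the number of times it was presented (PresentedCount).
brm(SelectedCount | trials(PresentedCount) ~ Stimulus * Condition +
      (Condition | Pair) + (Condition | Participant),
    data = df,
    family = binomial(link = "logit"))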


100% agree with @MarijnG’s suggestion.

Thank you everyone for such prompt and helpful replies!

I’ve narrowed my priors and switched to a beta-binomial distribution, which together seem to fix the problem. Thanks!

brm(SelectedCount | trials(PresentedCount) ~ Stimulus * Condition +
      (Condition | Pair) + (Condition | Participant),
    data = df,
    prior = c(prior(normal(0, 5), class = "b"),
              prior(cauchy(0, 5), class = "sd"),
              prior(normal(0.5, 1), class = "Intercept")),
    family = beta_binomial(link = "logit"),
    iter = 10000,
    warmup = 2000,
    init = 0,
    chains = 4,
    cores = 6)
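In case it helps anyone else, these are the checks I re-ran on the refit (assigned to model as before) to verify the fit:

pp_check(model, ndraws = 50)
pp_check(model, type = "stat_grouped", ndraws = 50, group = "Condition")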
