Modeling with Bernoulli or binomial distribution

jaredv · September 16, 2022, 12:07am

Hello,

I apologize if this is the incorrect forum to ask this question. I am happy to delete it and redirect it elsewhere if it is inappropriate.

I am modeling a small dataset (n = 256) with a few predictors using brms (brms 2.17.0, RStudio 4.2.1). The data come from an experiment in which 64 participants each took part in two conditions, with two trials per condition (for several participants, the data from one trial had to be discarded). In each trial, participants could respond either one way (= 0) or another (= 1). Thus, on a trial-by-trial basis, the data are Bernoulli distributed. On a condition-by-condition basis, however, the data are binomially distributed (i.e., a participant's summed score per condition could be 0, 1, or 2).

I have successfully modeled the data as intended. My question is, rather, about which modeling strategy is most appropriate. Modeling the data as binomially distributed requires cutting the number of observations in half (because I have to aggregate the data within conditions), but it also lets the model formula account for the number of trials each participant completed (most completed 4 trials in total, but some completed only 3). Is one data distribution somehow “more informative” or otherwise preferable to the other? In particular, are there statistical grounds for preferring one observation model over the other in this case? So far, I have used the “bernoulli” and “binomial” families to model the data (and also “beta-binomial”, but the overdispersion parameter turned out to be unnecessary).
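For concreteness, here is a minimal sketch of the two formulations being compared. The data are simulated and the column names (`id`, `condition`, `response`, `successes`, `n_trials`), the single `condition` predictor, and the helper `fit_both()` are all hypothetical stand-ins, not the actual variables or formula from the experiment.

```r
# Simulated trial-level data: one row per trial (all names are assumptions).
set.seed(1)
trials <- expand.grid(id = 1:64, condition = c("A", "B"), trial = 1:2)
trials$response <- rbinom(nrow(trials), size = 1, prob = 0.5)
trials <- trials[-c(3, 40), ]  # drop two rows to mimic discarded trials

# Aggregate to one row per participant and condition:
# number of 1-responses and number of trials actually completed.
successes <- aggregate(response ~ id + condition, data = trials, FUN = sum)
n_trials  <- aggregate(response ~ id + condition, data = trials, FUN = length)
agg <- merge(successes, n_trials, by = c("id", "condition"),
             suffixes = c("", "_n"))
names(agg) <- c("id", "condition", "successes", "n_trials")

# Hypothetical helper that fits both versions; requires brms and a working
# Stan toolchain, so it is only defined here and not called.
fit_both <- function(trials, agg) {
  list(
    # Bernoulli: one 0/1 outcome per trial
    bernoulli = brms::brm(response ~ condition + (1 | id),
                          data = trials, family = brms::bernoulli()),
    # Binomial: successes out of n_trials per participant and condition
    binomial = brms::brm(successes | trials(n_trials) ~ condition + (1 | id),
                         data = agg, family = binomial())
  )
}
# fits <- fit_both(trials, agg)
```

The `trials(n_trials)` term is what carries the per-cell trial count (1 or 2 here) into the binomial model, so a participant with a discarded trial simply gets a smaller count for that condition.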

It is worth mentioning that all models, regardless of the assumed data distribution, give very similar posterior parameter estimates. The model comparison, however, favored the full model (with an interaction term) over the null model slightly more strongly for the models fitted to the binomial data than for those fitted to the Bernoulli data.

Thanks for any information or guidance offered.

Solomon · September 16, 2022, 2:20pm

If I followed your question correctly, I don’t think it should matter. Switching from Bernoulli to binomial may change the way you’ve structured your data, but it doesn’t change the amount of information in the model. The binomial n parameter (the number of trials) ensures that.
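To spell out why, here is a short sketch, assuming every trial within a participant-by-condition cell shares the same success probability $p$ (i.e., there are no trial-level predictors). With $n$ trials $y_1, \dots, y_n$ in a cell and $y = \sum_i y_i$:

$$
\prod_{i=1}^{n} p^{y_i} (1 - p)^{1 - y_i} = p^{y} (1 - p)^{n - y},
\qquad
\mathrm{Binomial}(y \mid n, p) = \binom{n}{y} \, p^{y} (1 - p)^{n - y}.
$$

The two likelihoods differ only by the factor $\binom{n}{y}$, which does not depend on $p$, so they imply the same posterior for the model parameters. If a predictor varied from trial to trial within a cell, this would no longer hold and the trial-level Bernoulli form would be needed.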

jaredv · September 16, 2022, 2:33pm

Thank you for the helpful response. I was thinking along similar lines, but I wanted to be sure. Also, thanks for the work on Statistical Rethinking with brms; it has been an invaluable resource to me in the past.

Solomon · September 16, 2022, 2:34pm

Cheers!
