Zero-one inflated beta as "censored" beta?

This is an intellectual musing; not one I am currently in need of.

I finally went through @matti’s excellent tutorial on zero-one-inflated Beta models for VAS scales:

It left me wondering if the rate of extreme ratings (0s and 1s) could be thought of as a “catch all more extreme than this threshold”, just like the extreme categories in ordinal regression. I find it likely that participants map the VAS response options to a narrow region of the Beta distribution, e.g., [0.2, 0.7], and that everything more extreme than that is “rounded in” to the nearest response option (here [0, 0.2[ becomes 0).

That way, maybe the model could be parameterized as the Beta parameters + thresholds. I.e.:

  • precision,
  • mean [0, 1],
  • lower threshold [0, upper]
  • upper threshold [lower, 1].
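
One hypothetical way to write this parameterization down is sketched below. The symbols (\ell for the lower threshold, u for the upper threshold, F_B and f_B for the Beta CDF and density) and the linear mapping of interior ratings onto [\ell, u] are assumptions made for illustration; they are not taken from the tutorial.

```latex
% Latent Beta with mean \mu and precision \phi, i.e. shapes
% a = \mu\phi and b = (1-\mu)\phi, with 0 < \ell < u < 1.
\begin{aligned}
\Pr(y = 0) &= F_B\bigl(\ell \mid \mu\phi,\ (1-\mu)\phi\bigr), \\
\Pr(y = 1) &= 1 - F_B\bigl(u \mid \mu\phi,\ (1-\mu)\phi\bigr), \\
p(y) &= (u - \ell)\, f_B\bigl(\ell + (u - \ell)\,y \mid \mu\phi,\ (1-\mu)\phi\bigr),
\qquad 0 < y < 1 .
\end{aligned}
```

Under this sketch every rating, including the 0s and 1s, enters the likelihood through the same \mu and \phi.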

Since the Beta parameters are generative of ALL data in this model, these parameters should be estimated with greater precision, and perhaps even greater validity. As I understand it, the ZOIB “discards” 0s and 1s as a different process, but I think that a whole bunch of 1s should actually be taken as evidence that the beta-mean is higher.

So I’m wondering (1) if this is even possible to model, and (2) is there already a way to model it? If so, this would def be worth a tutorial, and perhaps a paper! It could be generalized to much more than beta.
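
As a rough feasibility check for (1), here is a minimal base-R sketch that simulates ratings from a thresholded Beta and fits them by maximum likelihood. Everything in it is an assumption made for illustration: the function name `censored_beta_nll`, the true values (mean 0.6, precision 5, thresholds 0.2 and 0.7), and the linear stretching of the interior region onto (0, 1). It is not an existing brms or Stan model.

```r
# Hypothetical sketch of a "censored" Beta model; all names and values are illustrative.
set.seed(1)

# Negative log-likelihood. Parameters are unconstrained:
# par = (logit mean, log precision, logit lower, logit of upper's position in (lower, 1)).
censored_beta_nll <- function(par, y) {
  mu    <- plogis(par[1])
  phi   <- exp(par[2])
  lower <- plogis(par[3])
  upper <- lower + (1 - lower) * plogis(par[4])
  a <- mu * phi
  b <- (1 - mu) * phi
  is0 <- y == 0
  is1 <- y == 1
  mid <- !(is0 | is1)
  # Map observed interior ratings back to the latent Beta scale.
  x <- lower + (upper - lower) * y[mid]
  ll <- sum(is0) * pbeta(lower, a, b, log.p = TRUE) +
    sum(is1) * pbeta(upper, a, b, lower.tail = FALSE, log.p = TRUE) +
    sum(dbeta(x, a, b, log = TRUE) + log(upper - lower))
  -ll
}

# Simulate: latent Beta(mean 0.6, precision 5), thresholds at 0.2 and 0.7.
n <- 2000
latent <- rbeta(n, 0.6 * 5, 0.4 * 5)
y <- (latent - 0.2) / (0.7 - 0.2)  # stretch [0.2, 0.7] onto [0, 1]
y <- pmin(pmax(y, 0), 1)           # everything more extreme is "rounded in"

start <- c(0, 0, qlogis(0.3), 0)
start_nll <- censored_beta_nll(start, y)
fit <- optim(start, censored_beta_nll, y = y, control = list(maxit = 2000))

# Back-transform the estimates to the natural scale.
lower_hat <- plogis(fit$par[3])
est <- c(
  mu    = plogis(fit$par[1]),
  phi   = exp(fit$par[2]),
  lower = lower_hat,
  upper = lower_hat + (1 - lower_hat) * plogis(fit$par[4])
)
print(est)
```

How well the thresholds are identified separately from the Beta parameters would need a proper simulation study; the sketch only shows that the likelihood can be written down and optimized.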


I have had similar thoughts about the extremes in [0, 1] scales, i.e. that we should not model them as separate processes, or should at least model them as being affected by the same mean parameter.

My workaround in a current project, where responses were on a visual analog scale between 0 and 100 (with a lot of 0s and 100s and multimodality), is to simply model them as ordinal with 20 categories, assuming that subjects can’t properly differentiate within a 5-point interval (out of a total of 100 points) anyway. This is surely not the most principled way of handling this (and will not apply to all kinds of beta distributed responses), though.
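
The binning step of this workaround could look roughly like the following base-R sketch. The simulated ratings and the names `vas` and `vas_ord` are made up for illustration, and the brms call in the final comment (with a hypothetical predictor `condition` and data frame `d`) is only one way the ordinal model might then be specified.

```r
# Illustrative data: VAS ratings on 0-100 with piles of 0s and 100s.
set.seed(42)
vas <- round(c(rep(0, 30), rep(100, 30), runif(140, 0, 100)))

# Collapse into 20 ordered categories of width 5.
# With include.lowest = TRUE the first bin is [0, 5] and the rest are (5, 10], ..., (95, 100].
vas_ord <- cut(
  vas,
  breaks = seq(0, 100, by = 5),
  include.lowest = TRUE,
  ordered_result = TRUE
)

table(vas_ord)

# The ordered factor could then be the response of a cumulative ordinal model, e.g.
# brm(vas_ord ~ condition, family = cumulative("logit"), data = d)
```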

Another option is to use the overall mean parameterization of zero-inflated and hurdle models, which should generalize to zero-one-inflated beta models as well.

Anyway, I would be very interested in hearing other ideas about this issue.


I’m glad someone is bringing this topic up. I’ve been working on simulations and analyses of large datasets of visual analog scale (VAS) ratings with the aim of evaluating different models of these types of data.

My conclusion so far is similar to what Paul is suggesting; while the ZOIB clearly “fits” the data better than a gaussian model, in practice we may nevertheless usually opt to use the gaussian because it doesn’t separate the edge responses into a separate process. (I am happy to share what I have so far if anyone is interested.) One way to think of this is in terms of statistical power (I know…): It is possible to construct situations where ZOIB wins, but in many realistic scenarios, a gaussian model will have greater power just because all responses count toward the parameters of interest. A normal model is also much easier to interpret, which matters in my opinion.

However, I’m not greatly in favor of Jonas’s suggestion about modeling the data with thresholds for extreme categories. The data that I’m familiar with doesn’t seem to favor that interpretation. For example, participants tend to be more careful in selecting values near the edges than near the middle of the scale (i.e. the distance between a .97 and a .98 rating is more meaningful than that between .57 and .58).

Paul’s approach seems more useful, but then also has the limitations that he points out. For example, the ratings may have a resolution of 1000 points on the scale, and that’s just too many intercepts. (And I would not prefer to cut things up post hoc to reduce the number of scales–although this is common practice and usually seems to work OK.)

Having said all that, the “marginal” model approach looks very interesting, especially if it works for zero and one inflation.


Power will only be a sensible measure if the model does not exceed the nominal type 1 error rate. I am not sure how the normal model behaves in that respect when applied to ZOIB data. I think the “marginal” approach does indeed have some potential, and it is worth investigating all the alternatives in more detail once we have the marginal approach up and running in brms.

Thanks for your thoughts! I really appreciate your work on this and see the points you’re making. I’ll await the upcoming work and revisit it later.

Have you thought about using the beta-binomial model? This would involve rescaling the values to integers zero to M, where M is the maximum score (here, M would also determine the resolution at which you think the VAS can measure).

The advantage of the beta-binomial model is that extreme responses (0, M) will not lead to infinite likelihoods as in the beta distribution. The disadvantage is that it is much slower than a normal distribution. I did some simulations to figure out if it matters if one uses a beta-binomial model or a normal model, and it looked as if the beta-binomial model was only clearly better if the responses were very skewed.
Here are the simulations (the topic of this document is modeling sum-scores, which can also be seen as a result of a beta-binomial process):
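
Separately from the linked simulations, the point about edge responses can be illustrated with a minimal base-R sketch. The helper `dbetabinom_log` and the values M = 100, mean 0.6 and precision 5 are assumptions chosen for illustration; base R has no built-in beta-binomial density, so it is written out via `lchoose()` and `lbeta()`.

```r
# Log-density of a beta-binomial in a mean/precision parameterization (hypothetical helper).
dbetabinom_log <- function(k, M, mu, phi) {
  a <- mu * phi
  b <- (1 - mu) * phi
  lchoose(M, k) + lbeta(k + a, M - k + b) - lbeta(a, b)
}

M   <- 100  # resolution of the rescaled VAS (integers 0..M)
mu  <- 0.6
phi <- 5

# Edge responses 0 and M have finite log-likelihood under the beta-binomial ...
bb_edges <- dbetabinom_log(c(0, M), M, mu, phi)

# ... whereas the Beta log-density is not finite at exactly 0 or 1.
beta_edges <- dbeta(c(0, 1), mu * phi, (1 - mu) * phi, log = TRUE)

# Sanity check: the probabilities over 0..M sum to one.
total <- sum(exp(dbetabinom_log(0:M, M, mu, phi)))

print(bb_edges)
print(beta_edges)
print(total)
```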


Hi Guido, thanks for the suggestion. I have not considered the beta binomial before, but I think it can make sense and will certainly take a look at it. (And thanks for sharing the simulations!)

Hi I just randomly found this… I had the same intuition (before reading the post), and just coded up a simulation that does what you describe – i.e., combines ordinal modeling with the beta regression to handle 0/1 outcomes with only a single linear model. All I have is the simulation, though, no time to code in Stan yet.

Hi @matti -

I’ve coded up a new model that I think might be able to implement these kinds of things. It explicitly combines beta regression within ordinal modeling. I’m currently working on comparing it to the existing models. You can see the code at this post:

Initial comparisons to the ZOIB and standard beta regression suggest it captures a larger (versus standard beta reg) and more statistically efficient (versus ZOIB) parameter on the mean of the overall distribution for a given covariate. The model only has a single linear model that predicts everything, so marginal/predicted effects over the whole distribution are fairly easy to calculate.

What is also interesting is that because it has ordered cutpoints, it could be possible also to model where the cutpoints occur–i.e., what is the point where a respondent “moves” from (0,1) to 0 or 1.

Would love to get your thoughts on the approach – @paul.buerkner too if you have time.


Sounds exciting! I look forward to reading more about your model.

I am relatively new to Bayesian modeling and I have found myself looking for a way to utilize the zero-one inflated beta model as described here for a clinical trial data set. Briefly, I have pre/post data (see quick graph of data below) where the scaled outcome is 0-100, but as you can see there is an excess of both 0 and 100. We expected skew, but the inflation was not anticipated in this sample.

I am looking for a worked example that uses a mixed effects model approach for these data and could help me think through coding the model. This is what I have thus far, although I admit that it does not converge well with this parameterization. I have several outcomes with similar distributions, so any example or advice would be much appreciated.

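library(brms)

# Assumes Emo10 has already been rescaled from 0-100 to [0, 1].
zoib_model <- bf(
  Emo10 ~ Tx + Time + Tx:Time + (1 | ID),  # mean of the (0, 1) responses
  phi ~ Tx + Time + Tx:Time + (1 | ID),    # precision
  zoi ~ Tx + Time + Tx:Time + (1 | ID),    # probability of a 0 or 1
  coi ~ Tx + Time + Tx:Time + (1 | ID),    # probability of a 1, given a 0 or 1
  family = zero_one_inflated_beta()
)

SFemoBayesZOIB <- brm(
  data = SFemo, formula = zoib_model,
  iter = 4000, warmup = 1000,
  chains = 4, cores = 4, seed = 123
)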


Hi -

What might work better for you is a model I designed that is a simplified version of zero-one inflation. It’s particularly for situations where you don’t really care about the 0s and 1s, which seems to be your case. Also, you can fit this model with brms with mixed effects etc., and it should be more stable/easier to estimate. See the link here:


@saudiwin This is very helpful and I have tried out your model. It solves most of the convergence problems and the results are much easier to interpret!


Glad to hear it!