Continuous analog of the multinomial for simplexes that contain zeros

AFH · July 2, 2021, 2:52pm

This is a statistics question in addition to a Stan one:

I am modeling data collected at different ages. I started out modeling counts at age for simplicity’s sake, so the vector of observed counts is modeled as multinomial, with the simplex of estimated proportions from the process model as its probability vector.

However, the true data aren’t actually integers, they’re real numbers (technically densities, not counts). I want to update the model to reflect this, and input the data as reals instead of integers.

But I’m not sure how to replace the multinomial. The obvious candidate is the Dirichlet, but every element of the observed vector (theta in the Stan documentation) needs to be strictly positive, which is not the case for my data (it contains a lot of zeros). See 23.1 Dirichlet Distribution | Stan Functions Reference.

Can you think of another analogous distribution that I can use (or a different version of the Dirichlet that allows zeros)?

martinmodrak · July 10, 2021, 6:53am

I think the most direct way would be to have a “zero-inflated Dirichlet” modelled after the “zero-inflated Beta” (because the Dirichlet is the multivariate extension of the Beta). The brms GitHub issue “Implement the zero-inflated dirichlet distribution” (Issue #722, paul-buerkner/brms) has some references on how that construction could be made. You can’t just add a vector of independent zero-inflation probabilities, because you know that not all values can be zero simultaneously, although that might be a good simple approximation.
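
To make the simple approximation above concrete, here is a minimal Python sketch (standard library only), not a reference implementation. It assumes each component is zero independently with its own probability, and that the nonzero components follow a Dirichlet using only the matching entries of alpha. The names `zi_dirichlet_lpdf` and `pi_zero` are hypothetical, and the construction deliberately ignores the constraint that not all components can be zero at once.

```python
import math


def dirichlet_lpdf(y, alpha):
    """Log density of Dirichlet(alpha) at a strictly positive simplex y."""
    if any(v <= 0.0 for v in y):
        raise ValueError("Dirichlet density needs strictly positive components")
    log_norm = math.lgamma(sum(alpha)) - sum(math.lgamma(a) for a in alpha)
    return log_norm + sum((a - 1.0) * math.log(v) for a, v in zip(alpha, y))


def zi_dirichlet_lpdf(y, alpha, pi_zero):
    """Hypothetical zero-inflated Dirichlet log-likelihood.

    Assumption: component k is zero with probability pi_zero[k] (strictly
    between 0 and 1), independently of the other components.
    """
    lp = 0.0
    pos_y, pos_alpha = [], []
    for v, a, p in zip(y, alpha, pi_zero):
        if v == 0.0:
            lp += math.log(p)        # Bernoulli term: component observed as zero
        else:
            lp += math.log1p(-p)     # Bernoulli term: component observed as nonzero
            pos_y.append(v)
            pos_alpha.append(a)
    # With fewer than two nonzero components there is no continuous part left.
    if len(pos_y) >= 2:
        lp += dirichlet_lpdf(pos_y, pos_alpha)
    return lp


y = [0.5, 0.0, 0.5]                  # observed simplex containing a zero
alpha = [1.0, 1.0, 1.0]              # illustrative concentration parameters
pi_zero = [0.2, 0.5, 0.2]            # illustrative zero probabilities
print(zi_dirichlet_lpdf(y, alpha, pi_zero))
```

Because the nonzero entries of an observed simplex already sum to one, no renormalization is needed before evaluating the Dirichlet part. A more careful construction would replace the independent Bernoulli terms with a distribution over zero patterns that excludes the all-zero pattern.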

With that said, it is rare that one directly observes densities. In practice, densities are often derived from actual count observations. It is possible the counts are not accessible to you, but one needs to be mindful of the fact that densities derived from low counts would be inherently noisier than those derived from large counts (and thus one could, e.g., include any predictors presumed related to the underlying counts as predictors for the precision of the Dirichlet).
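
As a rough illustration of letting the precision depend on the underlying counts, the sketch below uses the mean/precision parameterization alpha = mu * phi and a log-linear model for phi. The log link, the coefficients `b0` and `b1`, and the function names are all assumptions made for the example, not something taken from Stan or brms.

```python
import math


def precision_from_count(n, b0, b1):
    """Hypothetical log-linear model: log(phi) = b0 + b1 * log(n)."""
    return math.exp(b0 + b1 * math.log(n))


def dirichlet_alpha(mu, phi):
    """Concentration vector for a Dirichlet with mean simplex mu and precision phi."""
    return [m * phi for m in mu]


def component_variances(mu, phi):
    """Marginal variances of a Dirichlet(mu * phi): mu_k * (1 - mu_k) / (phi + 1)."""
    return [m * (1.0 - m) / (phi + 1.0) for m in mu]


mu = [0.2, 0.3, 0.5]                 # expected proportions at age (illustrative)
for n in (9, 999):                   # small vs. large underlying count
    phi = precision_from_count(n, b0=0.0, b1=1.0)
    print(n, dirichlet_alpha(mu, phi), component_variances(mu, phi))
```

With a positive `b1`, samples derived from larger counts get a larger phi and therefore a tighter Dirichlet around the same mean, which matches the intuition that densities from low counts are noisier.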

Best of luck with your model!
