I have data on historic election results in Britain that I would like to model using brms.
The data look something like this:
party1  party2  party3  party4  party5  region
0.4     0.4     0.2     0       0       England
0.5     0.5     0       0       0       England
0.2     0.5     0.1     0       0.2     Scotland
0.3     0.2     0.4     0.1     0       Wales
These data are well suited to Dirichlet regression: each vote share takes a value between 0 and 1, and the shares within each row must sum to 1.
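For reference, the Dirichlet density can be written as follows. The notation is my own, not taken from brms: K is the number of parties, y_k is the share of party k, and α_k > 0 is its concentration parameter. Note that the support is the interior of the simplex, so every share has to be strictly positive.

$$
f(y \mid \alpha) = \frac{\Gamma\left(\sum_{k=1}^{K} \alpha_k\right)}{\prod_{k=1}^{K} \Gamma(\alpha_k)} \prod_{k=1}^{K} y_k^{\alpha_k - 1},
\qquad y_k > 0, \quad \sum_{k=1}^{K} y_k = 1
$$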
The problem, however, is that some parties do not stand in particular cases. Typically this is for two reasons. Either the party is a regionalist one and stands only in particular areas or the party is a minor one without the resources to contest all elections. In both cases, the party gets a zero.
This is a problem because Dirichlet regression requires all values to be strictly greater than 0. Rather than fudge this by replacing 0 with a tiny number, I’d like to model the zero-inflation directly. Effectively, this would be a multivariate extension of the zero-inflated beta distribution, in the same way that the Dirichlet distribution is a multivariate extension of the standard beta distribution. I’d like to do this both because I intend to use the model to make predictions and because the zero-inflation is theoretically interesting in its own right.
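To make the idea concrete, here is a minimal sketch of one possible zero-inflated Dirichlet log-likelihood for a single row. It is written in plain Python rather than as a brms family, and it rests on assumptions of mine: each party stands independently with probability `p_stand[k]`, and the non-zero shares follow a Dirichlet over the standing parties only. The function names and parameters are hypothetical, and the sketch ignores the constraint that at least one party must stand.

```python
import math

def dirichlet_logpdf(y, alpha):
    """Log density of a Dirichlet at y (all y > 0, summing to 1)."""
    log_norm = math.lgamma(sum(alpha)) - sum(math.lgamma(a) for a in alpha)
    return log_norm + sum((a - 1.0) * math.log(v) for v, a in zip(y, alpha))

def zi_dirichlet_loglik(y, alpha, p_stand):
    """Hypothetical zero-inflated Dirichlet log-likelihood for one row.

    y       : vote shares; a zero marks a party that did not stand
    alpha   : positive concentration parameters, one per party
    p_stand : assumed probability that each party stands (0 < p < 1)
    """
    ll = 0.0
    present = []
    for k, (share, p) in enumerate(zip(y, p_stand)):
        if share > 0:
            ll += math.log(p)        # party stood
            present.append(k)
        else:
            ll += math.log1p(-p)     # party did not stand
    # Dirichlet part over the standing parties only; with a single
    # standing party its share is 1 with certainty, so nothing is added.
    if len(present) > 1:
        ll += dirichlet_logpdf([y[k] for k in present],
                               [alpha[k] for k in present])
    return ll

# Example: the Wales row from the table, with made-up parameter values.
wales = [0.3, 0.2, 0.4, 0.1, 0.0]
alpha = [2.0, 2.0, 2.0, 2.0, 2.0]
p_stand = [0.9, 0.9, 0.8, 0.3, 0.2]
print(zi_dirichlet_loglik(wales, alpha, p_stand))
```

In a regression setting, both `alpha` and `p_stand` would depend on predictors such as region, which is the part I would hope to express in brms.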
Is this possible in brms? For example, is it possible for me to define this using the
Edit: This appears to be possible using the zadr() function in the Compositional package, and an accompanying paper can be found here. There is also a paper on zero-inflated Dirichlet regression for modelling microbiome data.