Priors for highly skewed multinomial word counts

stemangiola · May 15, 2020, 11:38pm

More highly skewed than what? Count data are typically highly skewed, and the multinomial models that. For more overdispersed counts you can use the Dirichlet-multinomial, although neither of these allows extremely heavy tails. I think of the multinomial as being to the Poisson what the Dirichlet-multinomial is to the negative binomial (but this is pure intuition and non-scientific :) )
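To put a number on the overdispersion, here is a minimal sketch comparing the variance of a single category count under the two models. It uses the standard closed-form variances; the function names and the alpha values are made up for illustration.

```python
def multinomial_var(n, p):
    """Variance of one category count under Multinomial(n, p)."""
    return n * p * (1.0 - p)


def dirichlet_multinomial_var(n, alpha_k, alpha_plus):
    """Variance of one category count under the Dirichlet-multinomial.

    alpha_k is that category's concentration, alpha_plus the sum over
    all categories. The multinomial variance is inflated by
    (n + alpha_plus) / (1 + alpha_plus).
    """
    p = alpha_k / alpha_plus
    inflation = (n + alpha_plus) / (1.0 + alpha_plus)
    return n * p * (1.0 - p) * inflation


n = 100
alpha = [0.5, 1.0, 3.5]  # hypothetical concentrations
alpha_plus = sum(alpha)
for a in alpha:
    p = a / alpha_plus
    print(p, multinomial_var(n, p), dirichlet_multinomial_var(n, a, alpha_plus))
```

The inflation factor goes to 1 as alpha_plus grows, so a large sum(alpha) recovers the plain multinomial, much like a large dispersion parameter makes the negative binomial approach the Poisson.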

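// Equivalent to multinomial(dirichlet(alpha)), up to the multinomial
// coefficient, which does not depend on alpha and is omitted here.
// Newer Stan releases declare the first argument as `array[] int y`.
real dirichlet_multinomial_lpmf(int[] y, vector alpha) {
  real alpha_plus = sum(alpha);

  return lgamma(alpha_plus) + sum(lgamma(alpha + to_vector(y)))
         - lgamma(alpha_plus + sum(y)) - sum(lgamma(alpha));
}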

where alpha is the vector of concentration parameters of the Dirichlet.
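If you want to sanity-check the formula outside Stan, a rough sketch in plain Python could look like the one below. It repeats the same four lgamma terms and adds the multinomial coefficient, which the Stan function leaves out (it is constant in alpha, so it does not affect inference on alpha). The Python function names and the alpha values are my own, purely for illustration.

```python
from itertools import product
from math import exp, lgamma


def dirichlet_multinomial_kernel(y, alpha):
    """Same terms as the Stan function: log pmf without the multinomial coefficient."""
    alpha_plus = sum(alpha)
    return (lgamma(alpha_plus)
            + sum(lgamma(a + c) for a, c in zip(alpha, y))
            - lgamma(alpha_plus + sum(y))
            - sum(lgamma(a) for a in alpha))


def log_multinomial_coefficient(y):
    """log(n! / prod(y_k!)), which is constant in alpha."""
    return lgamma(sum(y) + 1) - sum(lgamma(c + 1) for c in y)


def dirichlet_multinomial_lpmf(y, alpha):
    """Fully normalized log pmf."""
    return log_multinomial_coefficient(y) + dirichlet_multinomial_kernel(y, alpha)


alpha = [0.5, 1.0, 3.5]  # hypothetical concentrations
n = 6
# Sum the pmf over every count vector with total n.
total = sum(exp(dirichlet_multinomial_lpmf(y, alpha))
            for y in product(range(n + 1), repeat=len(alpha))
            if sum(y) == n)
print(total)  # should be 1 up to floating-point error
```

The brute-force enumeration only works for small n and few categories, but that is enough to confirm the terms and signs are right.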

If the log proportions look roughly normal, you should be OK with a Dirichlet prior. Otherwise, changing the prior on alpha does not do much; you should use something other than a Dirichlet, for example multinomial(softmax(parameter_coming_from_student_t)), but we didn’t manage to get anywhere with that.
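For what the softmax idea means on the generative side, here is a small simulation sketch in plain Python. It only draws fake data (heavy-tailed logits, then proportions, then counts) and says nothing about how well such a model fits in practice. The helper names, the number of categories, the degrees of freedom and the seed are all assumptions for illustration.

```python
import math
import random


def softmax(x):
    """Map real-valued logits to proportions that sum to 1."""
    m = max(x)  # subtract the max for numerical stability
    e = [math.exp(v - m) for v in x]
    s = sum(e)
    return [v / s for v in e]


def student_t(nu, rng):
    """One Student-t draw with nu degrees of freedom: normal / sqrt(chi2 / nu)."""
    z = rng.gauss(0.0, 1.0)
    chi2 = rng.gammavariate(nu / 2.0, 2.0)  # chi-square(nu) = Gamma(shape nu/2, scale 2)
    return z / math.sqrt(chi2 / nu)


def sample_counts(n, probs, rng):
    """Multinomial draw of n items via repeated categorical sampling."""
    counts = [0] * len(probs)
    for k in rng.choices(range(len(probs)), weights=probs, k=n):
        counts[k] += 1
    return counts


rng = random.Random(1)
K, nu = 20, 3.0  # hypothetical: 20 categories, 3 degrees of freedom
logits = [student_t(nu, rng) for _ in range(K)]
probs = softmax(logits)
counts = sample_counts(1000, probs, rng)
print(max(probs), counts)
```

Swapping student_t for a plain rng.gauss draw gives the roughly-normal log proportions case to compare against.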

You could use a rectangular-beta prior for proportion data with extremely heavy tails, although I have zero experience with that.

I think you might be well off with dirichlet_multinomial. My experience is not to go too exotic.
