Priors for highly skewed multinomial word counts

More skewed than what? Count data are typically highly skewed, and the multinomial already models that. For more overdispersed counts you can use the Dirichlet-multinomial, although neither of these allows extremely heavy tails. I think of it as: multinomial is to Poisson as Dirichlet-multinomial is to negative binomial (but this is pure intuition and non-scientific :) )
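To make "more overdispersed" concrete, here are the standard variance formulas for a single count y_k out of n total words, with p_k = alpha_k / alpha_+ and alpha_+ = sum(alpha). The symbols n, p_k and alpha_+ are my notation, not something from the question.

```latex
\begin{aligned}
\text{multinomial:} \quad & \operatorname{Var}(y_k) = n\,p_k\,(1 - p_k) \\
\text{Dirichlet-multinomial:} \quad & \operatorname{Var}(y_k) = n\,p_k\,(1 - p_k)\,\frac{n + \alpha_+}{1 + \alpha_+}
\end{aligned}
```

The extra factor is always at least 1 and goes to 1 as alpha_+ grows, so a small alpha_+ means strong overdispersion and a large alpha_+ recovers the multinomial. That is the same role the dispersion parameter plays when moving from Poisson to negative binomial.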

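// Equivalent to multinomial(dirichlet(alpha)) with the Dirichlet draw integrated out.
// The multinomial coefficient is dropped: it depends only on y, not on alpha.
// array[] int replaces the older int[] syntax, which recent Stan versions no longer accept.
real dirichlet_multinomial_lpmf(array[] int y, vector alpha) {
  real alpha_plus = sum(alpha);  // total concentration

  return lgamma(alpha_plus) + sum(lgamma(alpha + to_vector(y)))
         - lgamma(alpha_plus + sum(y)) - sum(lgamma(alpha));
}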

where alpha is the vector of Dirichlet concentration parameters.
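For reference, this is the log probability mass the function computes, written out term by term. Here n = sum(y) and alpha_+ = sum(alpha) (alpha_plus in the code); the index k runs over the categories.

```latex
\log p(y \mid \alpha)
  = \log\Gamma(\alpha_+)
  + \sum_{k} \log\Gamma(\alpha_k + y_k)
  - \log\Gamma(\alpha_+ + n)
  - \sum_{k} \log\Gamma(\alpha_k)
  + \underbrace{\log n! - \sum_{k} \log y_k!}_{\text{constant in } \alpha,\ \text{not computed by the function}}
```

Leaving out the last term does not affect sampling, since it does not involve any parameter, but it would matter if you compared log likelihoods against a model that does include it.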

If the log proportions look roughly normal, you should be OK with a Dirichlet prior. Otherwise, changing the prior on alpha does not do much; you should use something other than a Dirichlet, for example multinomial(softmax(parameter_coming_from_student_t)), but we didn't manage to get anywhere with that.
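A rough sketch of what the softmax alternative could look like, for a single count vector. This is not the model we tried: the names K, y and eta_free, the degrees of freedom (3), and the scale (2) are all placeholder assumptions, and pinning the last logit to zero is just one way to make the softmax identifiable.

```stan
data {
  int<lower=2> K;               // number of distinct words (assumed name)
  array[K] int<lower=0> y;      // word counts (assumed name)
}
parameters {
  vector[K - 1] eta_free;       // free logits, heavy-tailed prior below
}
transformed parameters {
  // Pin the last logit to 0 so the softmax is identifiable.
  vector[K] eta = append_row(eta_free, 0);
}
model {
  // Student-t on the logits allows a few extreme proportions.
  // df = 3 and scale = 2 are arbitrary illustrative choices.
  eta_free ~ student_t(3, 0, 2);
  y ~ multinomial(softmax(eta));
}
```

The heavy tails go on the logit scale here, which is what lets a handful of words take proportions far from the bulk. Expect it to be harder to sample than the Dirichlet version.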

You could use a rectangular-beta prior for proportion data with extremely heavy tails, although I have zero experience with that.

I think you might be well off with dirichlet_multinomial. In my experience it is better not to go too exotic.
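For completeness, a minimal sketch of a full program built around that likelihood, assuming D documents over a vocabulary of K words. The function is renamed word_dm_lpmf (a made-up name) only to avoid clashing with any built-in of the same name in your Stan version; D, K, phi, kappa and both priors are illustrative assumptions, not recommendations.

```stan
functions {
  // Same log-pmf as above, under a hypothetical name to avoid name clashes.
  real word_dm_lpmf(array[] int y, vector alpha) {
    real alpha_plus = sum(alpha);
    return lgamma(alpha_plus) + sum(lgamma(alpha + to_vector(y)))
           - lgamma(alpha_plus + sum(y)) - sum(lgamma(alpha));
  }
}
data {
  int<lower=1> D;                  // number of documents (assumed)
  int<lower=2> K;                  // vocabulary size (assumed)
  array[D, K] int<lower=0> y;      // word counts per document
}
parameters {
  simplex[K] phi;                  // mean word proportions
  real<lower=0> kappa;             // total concentration, alpha = kappa * phi
}
model {
  phi ~ dirichlet(rep_vector(1, K));   // flat prior, placeholder choice
  kappa ~ lognormal(0, 2);             // wide prior, placeholder choice
  for (d in 1:D) {
    y[d] ~ word_dm(kappa * phi);
  }
}
```

Splitting alpha into a simplex phi and a scalar kappa keeps the mean proportions and the amount of overdispersion as separate parameters, which makes it easier to put sensible priors on each.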
