Transforming a multinomial model into a dirichlet-multinomial

jriou · February 17, 2022, 8:59am

Hi everyone,

So I have this ODE model that is fitted to categorical data using a multinomial likelihood. Basically it’s about antimocrobial resistance, and the categories correspond to increasing levels of resistance. This model works well, the sampling is quick, there are no divergences and the posterior predictive checks are good on average. So I don’t think there is anything wrong with the model itself. However, the model wildly underestimates the uncertainty, or equivalently the data is overdispersed. So to take a Poisson analogy I read from @stemangiola, I need for my multinomial model what the negative binomial is to the Poisson.

So following a few threads in this forum I tried a dirichlet-multinomial version of the same model. It all started well, the sampling was relatively quick and it all finished without warning. The posterior predictive checks were exactly what I wanted: very good fit to data, the right amount of uncertainty. However, and here comes my problem, I had more divergences that I have ever seen in a model (and believe me I have fitted some shady models). >95% divergences in all runs, regardless of adapt_delta, without any way to know where the problem is exactly (since it’s apparently everywhere). I also tried changing the priors or fixing the parameters one by one without any improvement. The prior predictive checks also look good, without divergence. So my conclusion is that the underlying model is fine but that I did not implement the Dirichlet-Multinomial properly, althought I mostly mirrored other implementations (eg [https://www.biorxiv.org/content/10.1101/711317v3](https://this one)). I also tried reimplementing a few times without success, that’s why I finally came here.

The models is quite big and I’m not allowed to share the data, so I thought I’ll just focus on the important parts. I understand that this makes it more difficult to help me but maybe it’s enough to get some advice or fix an obvious mistake? If not I will come back with a simulated dataset.

Here are the important parts of the initial model:

parameters {
  real<lower=0> theta[P]; // parameters for the ODE
} 
transformed parameters {
  real y[S,C]; // C is the number of ODE compartments (one more than K, the number of categories), S is the number of evaluation times
  simplex[K] output[S];
  y = integrate_ode_rk45(ode, init, t0, ts, theta, x_r, x_i, 1.0E-6, 1.0E-6, 1.0E3);
  for(i in 1:S) for(j in 1:C) y[i,j] = y[i,j] + 2*1.0E-6; // avoid negative values due to solver tolerance
  for(i in 1:S) output[i] = ( to_vector(y[g,i,2:C]) ) ./ sum( to_vector(y[g,i,2:C]) ); // get the proportion in each category
}
model {
  ... // priors for theta
  for(i in 1:S) cat_data[i] ~ multinomial(output[i] );
}

and here is my Dirichlet-Multinomial version:

parameters {
  real<lower=0> theta[P]; // parameters for the ODE
  real<lower=0> phi;
  simplex[K] trans_output[G,L];
} 
transformed parameters {
  real y[S,C]; // C is the number of ODE compartments (one more than K, the number of categories), S is the number of evaluation times
  simplex[K] output[S];
  y = integrate_ode_rk45(ode, init, t0, ts, theta, x_r, x_i, 1.0E-6, 1.0E-6, 1.0E3);
  for(i in 1:S) for(j in 1:C) y[i,j] = y[i,j] + 2*1.0E-6; // avoid negative values due to solver tolerance
  for(i in 1:S) output[i] = ( to_vector(y[g,i,2:C]) ) ./ sum( to_vector(y[g,i,2:C]) ); // get the proportion in each category
}
model {
  ... // priors for theta
  phi ~ exponential(0.001); // from https://www.biorxiv.org/content/10.1101/711317v3
  for(i in 1:S) trans_output[i] ~ dirichlet(phi*output[i]);
  for(i in 1:S) cat_data[i] ~ multinomial(trans_output[i] );
}

Does someone have any advice? Thanks!

nhuurre · February 17, 2022, 10:12am

Err, shouldn’t that be multinomial(trans_output[i])? And actually, how do the shapes make sense here, why does first loop have 1:L when output is of length S and the outermost dimension of trans_output is G?

Dirichlet-multinomial has a closed-form PMF. Maybe you could use that.

functions {
  real dirichlet_multinomial_lpmf(int[] y, vector alpha) {
    real sum_alpha = sum(alpha);
    return lgamma(sum_alpha) - lgamma(sum(y) + sum_alpha)
           // + lgamma(sum(y)+1) - sum(lgamma(to_vector(y)+1) // constant, may omit
           + sum(lgamma(to_vector(y) + alpha)) - sum(lgamma(alpha));
  }
}
parameters {
  real<lower=0> theta[P]; // parameters for the ODE
  real<lower=0> phi;
}
model {
  ...
  for(i in 1:S) cat_data[i] ~ dirichlet_multinomial(phi*output[i]);
}

jriou · February 17, 2022, 3:51pm

Oh right, I made a few typos while adapting the original model into this simplified version. It’s indeed multinomial(trans_output[i]), and 1:S!

Thanks for pointing out the closed-form PMF, I’ll try it out and report back!

Many thanks!

jriou · February 17, 2022, 5:19pm

It did the trick! Thank you so much @nhuurre, I was scratching my head for quite some time!

Topic		Replies	Views
More flexible than Dirichlet-multinomial: gamma-multinomial (?) Modeling techniques , bioinformatics , dirichlet-multinomial	19	6057	May 12, 2020
Problems to model choice probabilites directly instead of multinomial data with a dirichlet distribution Modeling fitting-issues , dirichlet-multinomial	3	627	October 7, 2022
Dirichlet negative multinomial regression Modeling bioinformatics	0	520	February 19, 2021
Choosing prior for "overdispersion" in Dirichlet Multinomial distribution Modeling prior-choice , dirichlet-multinomial	3	1676	May 12, 2020
Trying to generalise the dirichlet-multinomial (non-analytical) framework (replacing Dirichlet with other distributions) Modeling techniques , bioinformatics , dirichlet-multinomial	23	2644	November 5, 2018

Transforming a multinomial model into a dirichlet-multinomial

Related topics