Hierarchical multinomial model with sparse data

Felix_Fischer · October 18, 2022, 10:24am

Hi,

I’m trying to fit a hierarchical multinomial model. Unfortunately, my data are sparse: I have 25 categories in 40 studies, but each study only has an n between 20 and 100. So when I fit the model below, I get convergence issues.

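data {
  int<lower=0> response[40,25];  // counts: 40 studies x 25 categories
}
parameters {
  simplex[25] p[40];             // category probabilities, one simplex per study
  simplex[25] hyper_p;           // Dirichlet parameter vector, declared as a simplex
}

model {
  for (i in 1:40){
    p[i] ~ dirichlet(hyper_p);         // study-level probabilities share one Dirichlet
    response[i] ~ multinomial(p[i]);   // observed counts for study i
  }
}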

Ultimately, I want to use the estimated Dirichlet distribution as a prior in a subsequent analysis. I have tried to run this without the hyperprior (this works fine and fast), but then I only get the mean probabilities and can’t see how to account for the between-study variation.

So I wonder whether a different choice of hyperprior could help me. Alternatively, could I collapse some of the categories to make the data less sparse?

thanks, felix

kholsinger · October 18, 2022, 8:19pm

I may be missing something here, but as written it looks as if you are trying to estimate the parameters of the Dirichlet. What if you specified a symmetric Dirichlet instead? See section 23.1, "Dirichlet Distribution", of the Stan Functions Reference.

Felix_Fischer · October 19, 2022, 9:50pm

Thanks, I’ll try that. In my code, I declared hyper_p as a simplex. This is wrong, as the parameters of the Dirichlet can take values > 1 as well. When I fix that, I still get convergence issues, and the estimates of hyper_p tend to be quite large (1000-10000). What would be a meaningful prior on the parameter of the Dirichlet?
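
One way to read the large values reported above: a Dirichlet parameter vector can be written as kappa * phi, where phi is a simplex of mean probabilities and kappa is the sum of the parameters. Component k then has mean phi_k and variance phi_k * (1 - phi_k) / (kappa + 1), so kappa controls the between-study variation. The sketch below is plain Python rather than Stan, and the names `phi`, `kappa`, `dirichlet_sd` and `sample_dirichlet` are illustrative, as is the assumption of equal mean probabilities. It compares the analytic standard deviation with a simulated one for a few concentrations.

```python
import math
import random

def dirichlet_sd(phi_k, kappa):
    """Standard deviation of one component of Dirichlet(kappa * phi)."""
    return math.sqrt(phi_k * (1.0 - phi_k) / (kappa + 1.0))

def sample_dirichlet(alpha, rng):
    """Draw from Dirichlet(alpha) by normalizing independent Gamma(alpha_k, 1) draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]

K = 25                 # number of categories, as in the thread
phi = [1.0 / K] * K    # assumed mean probabilities (equal, for illustration only)
rng = random.Random(1)

for kappa in (25.0, 1000.0, 10000.0):
    alpha = [kappa * p for p in phi]
    # Simulated study-level probabilities for the first category
    first = [sample_dirichlet(alpha, rng)[0] for _ in range(2000)]
    mean = sum(first) / len(first)
    sim_sd = math.sqrt(sum((x - mean) ** 2 for x in first) / (len(first) - 1))
    print(f"kappa={kappa:>7.0f}  analytic sd={dirichlet_sd(phi[0], kappa):.4f}  simulated sd={sim_sd:.4f}")
```

Under these assumptions, a total concentration of 10000 implies a standard deviation of roughly 0.002 around a mean of 0.04, i.e. almost no between-study variation. That may be part of why a prior on the overall concentration matters with sparse counts.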

thanks, felix

kholsinger · October 20, 2022, 8:09pm

Unless you have prior information (from earlier observations or from theoretical considerations), it is probably reasonable to imagine a priori that every probability vector on the simplex is equally likely. That would be the simplex equivalent of a uniform distribution in one dimension. If that’s reasonable, then you can set all Dirichlet parameters to 1. If you are agnostic about the values the components can take, except that extreme values are unlikely, then you could pick a common parameter bigger than 1. A good way to get a 1-d feel for it is to look at plots of the beta distribution, which is the Dirichlet with two categories. Beta(1, 1) corresponds to the uniform distribution on (0, 1).
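
To get that 1-d feel without plotting, the symmetric beta density can be evaluated on a small grid. This is a minimal Python sketch using only the standard library. The helper `beta_pdf`, the grid and the example values 0.5, 1 and 5 are illustrative choices, not something taken from the thread.

```python
import math

def beta_pdf(x, a, b):
    """Density of Beta(a, b) at x in (0, 1), computed on the log scale for stability."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm + (a - 1.0) * math.log(x) + (b - 1.0) * math.log(1.0 - x))

grid = [0.05, 0.25, 0.5, 0.75, 0.95]
for a in (0.5, 1.0, 5.0):
    # Symmetric case Beta(a, a): the two-category analogue of a symmetric Dirichlet
    row = "  ".join(f"{beta_pdf(x, a, a):6.3f}" for x in grid)
    print(f"Beta({a}, {a}): {row}")
```

With a parameter of 1 the density is flat, with a parameter above 1 it peaks in the middle and drops toward 0 and 1, and with a parameter below 1 it piles up near the extremes.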
