Hi,
I'm trying to fit a hierarchical multinomial model. Unfortunately, my data are sparse: I have 25 categories in 40 studies, but each study only has an n of 20 to 100. When I fit the model below, I get convergence issues.
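data {
  int response[40,25];    // counts: 40 studies x 25 categories
}
parameters {
  simplex[25] p[40];      // per-study category probabilities
  simplex[25] hyper_p;    // shared Dirichlet parameter
}
model {
  for (i in 1:40){
    p[i] ~ dirichlet(hyper_p);
    response[i] ~ multinomial(p[i]);
  }
}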
Ultimately, I want to use the resulting Dirichlet distribution as a prior in a later analysis. I have tried running this without the hyperprior (this works fine and is fast), but then I only get the mean probabilities and can't see how to account for the between-study variation.
So I wonder whether a different choice of hyperprior could help. Alternatively, I could collapse some of the categories to make the data less sparse.
Thanks, Felix
I may be missing something here, but as written it looks as if you are trying to estimate the parameters of the Dirichlet. What if you specified a symmetric Dirichlet instead? See section 23.1, "Dirichlet Distribution", in the Stan Functions Reference.
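A minimal sketch of what a symmetric Dirichlet could look like here, assuming the common parameter is fixed and passed in as data. The name `alpha` is hypothetical and not from the original model, and the declarations use the current `array[...]` syntax (older Stan versions would write `int response[40,25];` and `simplex[25] p[40];`).

```stan
data {
  array[40, 25] int<lower=0> response;  // counts: 40 studies x 25 categories
  real<lower=0> alpha;                  // assumed: common Dirichlet parameter, fixed by the user
}
parameters {
  array[40] simplex[25] p;              // per-study category probabilities
}
model {
  for (i in 1:40) {
    // symmetric Dirichlet: every category gets the same parameter
    p[i] ~ dirichlet(rep_vector(alpha, 25));
    response[i] ~ multinomial(p[i]);
  }
}
```

With `alpha` fixed, nothing is estimated at the hyper level, so the studies no longer share information through the prior.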
Thanks, I'll try that. In my code, I defined the hyperparameter as a simplex. This is wrong, as the Dirichlet parameters can take values greater than 1 as well. When I fix that, I still get convergence issues, and the hyperparameter values tend to be quite large (1000-10000). What would be a meaningful prior on the parameter of the Dirichlet?
Thanks, Felix
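One common way to let the Dirichlet parameter exceed 1 while keeping it interpretable is to split it into a mean simplex and a positive concentration. The sketch below is only an illustration under that assumption: the names `mu` and `kappa` are hypothetical, and the priors on them are placeholders rather than recommendations.

```stan
data {
  array[40, 25] int<lower=0> response;  // counts: 40 studies x 25 categories
}
parameters {
  array[40] simplex[25] p;   // per-study category probabilities
  simplex[25] mu;            // assumed: mean category probabilities across studies
  real<lower=0> kappa;       // assumed: concentration; larger means less between-study variation
}
model {
  mu ~ dirichlet(rep_vector(1, 25));  // placeholder: flat prior on the mean simplex
  kappa ~ lognormal(3, 1.5);          // placeholder: weakly informative prior on the concentration
  for (i in 1:40) {
    // Dirichlet parameter is kappa * mu, so its components are positive and can exceed 1
    p[i] ~ dirichlet(kappa * mu);
    response[i] ~ multinomial(p[i]);
  }
}
```

In this parameterization, very large posterior values of `kappa` would indicate that the study-level probabilities barely differ from `mu`.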
Unless you have prior information (from earlier observations or from theoretical considerations), it is probably reasonable to assume a priori that every point on the simplex is equally likely. That would be the simplex equivalent of a uniform distribution in one dimension. If that's reasonable, then you can set all Dirichlet parameters to 1. If you are agnostic about the values the components can take, except that extreme values are unlikely, then you could pick a common parameter bigger than 1. A good way to get a one-dimensional feel for it is to look at plots of the beta distribution: Beta(1, 1) corresponds to a uniform distribution.
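To see why setting all parameters to 1 gives a uniform distribution, it may help to write out the Dirichlet density. The symbols below (K categories, probabilities \theta_k, parameters \alpha_k) are notation introduced here for illustration.

```latex
p(\theta \mid \alpha)
  = \frac{\Gamma\!\left(\sum_{k=1}^{K} \alpha_k\right)}{\prod_{k=1}^{K} \Gamma(\alpha_k)}
    \prod_{k=1}^{K} \theta_k^{\alpha_k - 1},
\qquad
\alpha_1 = \dots = \alpha_K = 1
\;\Longrightarrow\;
p(\theta \mid \alpha) = \Gamma(K) = (K-1)!
```

With every \alpha_k equal to 1, the exponents vanish and the density is constant over the simplex. For K = 2 this is the Beta(1, 1) case with density 1, and a common parameter bigger than 1 pulls mass away from the edges of the simplex.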