Categorical Model is taking really long for estimation

My first guess is that your model is overparametrized: a categorical distribution with C categories is completely determined by C - 1 parameters (because the probabilities need to sum to 1). However, if I read the model correctly, you predict C parameters. Usually, when working on the logit scale (i.e. before applying softmax), one would fix one of the vector elements to 0. Without that constraint, adding the same constant to all C elements leaves the probabilities unchanged, so the data cannot pin down the parameters. Such overparametrization both prevents you from interpreting the coefficients in a useful way and creates weird interdependencies between the parameters that are hard for the sampler to work with.
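To make the point concrete, here is a small numeric sketch. It is plain Python rather than Stan, and the helper names (`softmax`, `pin_first_to_zero`) and the logit values are made up for illustration. It only shows that shifting all logits by a constant, or pinning the first one to 0, gives the same probabilities:

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def pin_first_to_zero(logits):
    """Shift the logits so the first (reference) category is exactly 0."""
    return [x - logits[0] for x in logits]

# C = 3 unconstrained logits (illustrative values, not from the original model)
logits = [1.5, -0.3, 0.8]
# The same logits plus an arbitrary constant
shifted = [x + 10.0 for x in logits]

p = softmax(logits)
p_shifted = softmax(shifted)
p_pinned = softmax(pin_first_to_zero(logits))

# All three sets of logits describe the same categorical distribution,
# so only C - 1 = 2 of the 3 numbers carry information.
print(p)
print(p_shifted)
print(p_pinned)
```

With the first element fixed to 0, the remaining C - 1 values can be read as log-odds relative to the reference category, which is what makes the coefficients interpretable.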

I discussed a similar issue recently in the thread "Two questions: ①Rejecting initial value but still sampling. ②regarding divergent transitions", but feel free to ask for clarification here if it is hard to understand.

A few additional minor suggestions:

  • You can use the cholesky_factor_cov type so that the sampler works directly with the decomposition (this avoids having to decompose the matrix in the transformed parameters block and is usually more numerically stable). In many use cases it is recommended to separate the correlation matrix from the vector of scales; then you can use the cholesky_factor_corr type and the lkj_corr_cholesky prior.
  • categorical_logit(X) is a more efficient shorthand for categorical(softmax(X))
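As a rough check of the last point, the sketch below compares the two ways of computing the log-probability of an outcome. Again this is plain Python, not Stan: `categorical_lpmf` and `categorical_logit_lpmf` are hypothetical helpers that mimic the arithmetic, not Stan's actual implementation, and the logits are made-up values.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def log_sum_exp(values):
    """log(sum(exp(v))) computed without overflow."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))

def categorical_lpmf(y, probs):
    """log P(Y = y) from a probability vector; y is 1-based, as in Stan."""
    return math.log(probs[y - 1])

def categorical_logit_lpmf(y, logits):
    """log P(Y = y) directly from logits, skipping the explicit softmax."""
    return logits[y - 1] - log_sum_exp(logits)

logits = [0.2, -1.0, 2.5]  # illustrative values only

for y in (1, 2, 3):
    via_softmax = categorical_lpmf(y, softmax(logits))
    via_logit = categorical_logit_lpmf(y, logits)
    print(y, via_softmax, via_logit)
```

Both routes give the same number; the logit form just avoids building the full probability vector and then taking a log of one of its entries.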

Best of luck with the model!
