Testing a categorical model

Sadly, it’s not particularly straightforward. In general, when a model has integer parameters, the way to do it in Stan is to use the generated quantities block. But that essentially requires rewriting the model, so it isn’t any easier than just generating the data in R or Python.
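To make the "generate the data in Python" option concrete, here is a minimal sketch using only the standard library. It is not your model: the names alpha, beta, and x, and all the numeric values, are hypothetical stand-ins for whatever builds your vector of logits.

```python
import math
import random

def softmax(logits):
    """Map a list of logits to probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def simulate(alpha, beta, x, rng):
    """Draw one 1-based category per observation from softmax(alpha + beta * x[n])."""
    K = len(alpha)
    h = []
    for x_n in x:
        probs = softmax([alpha[k] + beta[k] * x_n for k in range(K)])
        h.append(rng.choices(range(1, K + 1), weights=probs)[0])
    return h

rng = random.Random(1234)          # fixed seed so the simulation is reproducible
N = 500                            # hypothetical number of observations
alpha = [0.0, 0.5, -0.5]           # hypothetical intercepts, one per category
beta = [0.0, 1.0, -1.0]            # hypothetical slopes, one per category
x = [rng.gauss(0.0, 1.0) for _ in range(N)]
h = simulate(alpha, beta, x, rng)  # simulated outcomes to pass to Stan as data
```

The categories are 1-based to match what Stan expects for a categorical outcome. You can then fit the model to h and check whether the posterior recovers the values you simulated from.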

You definitely do not want to sample a continuous value and then cut it into discrete categories. That will break Stan’s sampler, which assumes the log density is continuously differentiable.

I don’t understand how the categorical_logit statement could work, given that its argument seems to be a scalar rather than a vector. The Stan compiler agrees with me:

> cmdstan_model('temp.stan')
...
    23:      for (n in 1:N) { h[n] ~ categorical_logit(phi * g[n] + upsilon * u[n]); }
                              ^
...
Ill-typed arguments to '~' statement. No distribution 'categorical_logit' was found with the correct signature

The likelihood is also problematic in that terms like phi * g[n] are not identified, because phi and g[n] are both parameters. If you divide phi by a constant and multiply every g[n] by the same constant, the product is unchanged. You can identify the model by giving g and phi priors, as you have done, but this leads to banana-shaped posteriors, which can be very difficult to sample. Furthermore, the sum phi * g[n] + upsilon * u[n] is problematic because you can add a constant to the left-hand term and subtract the same constant from the right-hand term and get the same answer.
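A quick numeric check of both problems, as a sketch with made-up values (none of these numbers come from your model, and c and delta are arbitrary constants chosen for illustration):

```python
import math

# Hypothetical values for a single observation n
phi, g_n = 2.0, 0.75
upsilon, u_n = -1.5, 0.4

original = phi * g_n + upsilon * u_n

# Multiplicative trade-off: divide phi by c and multiply g[n] by c
c = 4.0
rescaled = (phi / c) * (c * g_n) + upsilon * u_n

# Additive trade-off: move delta from the right-hand term to the left-hand term
delta = 0.3
shifted = (phi * g_n + delta) + (upsilon * u_n - delta)

print(original, rescaled, shifted)
```

All three values agree up to floating-point rounding, so the likelihood on its own cannot tell these parameter settings apart. Only the priors distinguish them.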

When you do sort out the categorical_logit, there’s an issue with identifiability there, too, in that you can add the same constant to (or subtract it from) every element of the argument vector and get the same probabilities.
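The shift invariance is easy to see numerically. This sketch uses a hypothetical logit vector eta. Pinning the last element to zero is one common way to remove the extra degree of freedom, shown here as an illustration rather than as the fix for your model.

```python
import math

def softmax(logits):
    """Map a list of logits to probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

eta = [1.2, -0.3, 0.5]                # hypothetical logits for three categories
shifted = [e + 10.0 for e in eta]     # add the same constant to every logit
pinned = [e - eta[-1] for e in eta]   # shift so the last logit is exactly zero

p = softmax(eta)
p_shifted = softmax(shifted)
p_pinned = softmax(pinned)
print(p, p_shifted, p_pinned)
```

The three probability vectors match up to floating-point rounding, so only differences between logits are identified by the data.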