I used the Phi transform because I needed to constrain the parameters to (0,1) for w and (0,10)
It’s probably best to start out just using the Stan constraints for things like this, if for no other reason than readability. For instance, to say a parameter is between zero and ten:
real<lower=0.0, upper=10.0> beta;
And it’s clear that this should be a parameter uniform from 0.0 to 10.0. Hopefully the sampler can achieve that as well :D. It’s hard to reason about parameters if they go through too many transforms.
And since w is a 2-component simplex, just start by declaring it as one.
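For instance, the declaration would look something like this (the size 2 here just matches the two-component case described above):

```stan
parameters {
  simplex[2] w;  // w[1] and w[2] are each in (0, 1) and sum to 1
}
```

With the simplex type, Stan handles the constrained-to-unconstrained transform for you, so there’s no need for a manual Phi transform.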
You can use a Dirichlet distribution for the prior on w, and it’ll probably be easier to think about what’s going on (do you strongly believe you know nothing about your simplex? Do you believe it should be near 50/50? Do you believe it should be near 20/80? – this is what the Dirichlet prior lets you specify).
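As a sketch of how those beliefs map to concentration parameters (the alpha values here are illustrative, not recommendations):

```stan
model {
  // rep_vector(1, 2)   -> flat over the simplex ("I know nothing")
  // rep_vector(10, 2)  -> concentrated near 50/50
  // [2, 8]'            -> pulled toward 20/80
  w ~ dirichlet(rep_vector(1, 2));
}
```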
As far as identifiability goes, there could be issues, but get that answer from the sampler. If Stan is giving “maximum treedepth exceeded” warnings (you can also get this specific information from shinystan), or you look at the pairplots of the posterior and find highly correlated parameters, then it’ll be worth worrying about identifiability. But if everything just runs, your posteriors look nice and uncorrelated, and all the diagnostics pass, you’re good!
And real quick, this looks suspicious to me:
choice[i,t] ~ categorical_logit(Qsum);
I think Qsum needs to have multiple entries for this to make sense. From the docs,
categorical_lpmf(ints y | vector theta)
The log categorical probability mass function with outcome(s) y in 1 : N given N-vector of outcome probabilities theta.
categorical_logit is just the same, except the probabilities come from a vector that is passed through softmax. If you pass in only a single element, it’ll always be softmaxed to a one-element vector with probability one!
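So Qsum should be a vector with one entry per choice option, something like this (the size 2 is hypothetical – use however many options your task has):

```stan
vector[2] Qsum;  // one unnormalized log probability per option
// ... fill in Qsum from the Q-values ...
choice[i, t] ~ categorical_logit(Qsum);  // softmax over the 2 entries
```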
Hope that helps!