jonsjoberg http://discourse.mc-stan.org/u/jonsjoberg
May 9
Yeah, I know, which is why I don’t feel comfortable with moving away from
Stan. But ideally, I would like to specify my hierarchical models in a
centered way, and then Stan (or some other library) could automagically
transform and use whatever parameterization is most efficient for the
data I have, though I do realise that it may not be possible.
It’s possible, we just aren’t there yet. Know anybody who wants to
contribute? :)
But is there any ongoing work/plan/ideas in this area?
More specifically, my problems are usually: I want to fit some regression
and I have a bunch of hierarchical covariates. For every covariate I test
which parameterization works best, or if I sequentially add covariates I
get a bunch of models, and for each I have to make sure its parameterization
works. Usually I go through centered → non-centered (Section 26.6 in the
manual) → hard sum-to-zero (Section 8.7 in the manual) to mitigate divergent
transitions.
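For reference, the hard sum-to-zero step in that workflow can be sketched like this (a hypothetical K-level effect; the names are mine, not from the manual):

```stan
data {
  int<lower=2> K;
}
parameters {
  vector[K - 1] beta_free;   // K-1 free parameters
}
transformed parameters {
  // The K-th element is determined by the constraint sum(beta) == 0.
  vector[K] beta = append_row(beta_free, -sum(beta_free));
}
```

This trades one sampled dimension for a deterministic element, which removes the additive non-identifiability that the constraint is meant to fix.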
You shouldn’t need to test all of these. Use non-centered unless you have
plenty of observations in all groups; even then, non-centered works fine.
Think about identifiability first; it comes up in specific contexts. If you
think the issue will come up, just code for it to begin with.
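As a sketch of that default, a non-centered varying-intercept model looks something like this (a hypothetical model, not from the thread):

```stan
data {
  int<lower=1> N;
  int<lower=1> J;
  int<lower=1, upper=J> group[N];
  vector[N] y;
}
parameters {
  real mu;
  real<lower=0> tau;
  vector[J] alpha_raw;                      // standardized group offsets
}
transformed parameters {
  vector[J] alpha = mu + tau * alpha_raw;   // implies alpha ~ normal(mu, tau)
}
model {
  mu ~ normal(0, 5);
  tau ~ normal(0, 2);                       // half-normal via the lower bound
  alpha_raw ~ normal(0, 1);                 // centered version: alpha ~ normal(mu, tau)
  y ~ normal(alpha[group], 1);
}
```

The centered version samples alpha directly from normal(mu, tau); the non-centered version above samples standardized offsets, which avoids the funnel geometry when groups have few observations.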
Also, priors in these models are critical. You won’t know what the
implications of your priors are without simulation in a complex hierarchical
model. Check that simulation from the weak prior yields reasonable values
for parameters. Seriously, check!
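One way to run that check is a prior-predictive-only Stan program (a sketch with made-up priors; swap in your own):

```stan
// Simulate from the priors only: no data, no likelihood.
generated quantities {
  real mu = normal_rng(0, 5);
  real tau = fabs(normal_rng(0, 2));   // half-normal
  real alpha = normal_rng(mu, tau);
  real y_sim = normal_rng(alpha, 1);   // check y_sim is on a sensible scale
}
```

Running this with the fixed-parameter sampler (algorithm=fixed_param in CmdStan) gives draws of y_sim you can eyeball against the scale of your actual data.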
Changing adapt_delta very seldom helps, and this is what takes time…
You should be able to do relatively short runs to check this stuff. A few
dozen iterations at most to see where the step size ends up, and if that’s
good you can do a longer run to check for divergences.
Is this a good workflow, or are there more efficient ways to work?
And what do I do when I still have divergent transitions after going
through this process?
If you are using the gamma CDF or incomplete gamma functions, or the beta
binomial, a few math library calculations weren’t/aren’t as good as they
should be. I have some improvements in that should make the beta binomial
better, and a branch that makes gamma models easier to fit. It’s mostly
numerical inaccuracy that messes with adaptation.
I guess the answer to that is very model-specific, but in general I have
interpreted it as: with the current data it is not possible to fit the
model, and I must look at changing what covariates I’m trying to fit or
what priors I’m using.
Sometimes it also means we have a problem to fix, so don’t be afraid to
ask questions on the list, or just make a reproducible example and file an
issue. Stan should be able to fit hierarchical GLMs.
Sometimes it also means your model doesn’t fit your data in a really bad
way. Check that too.
(Sorry if this got long and off-topic, and as you probably can tell I’m not
a “real” statistician, but rather come from a design/product development
background.)
In the age of machine learning and data science I think you’re doing fine.