I agree that we should improve the way the Jacobian (and other) warnings work to make it clearer they’re based on heuristics and the user needs to exercise judgement.
A Jacobian adjustment involving only data will be a constant, so it shouldn’t alter the result from sampling. This is related to the relationship between the normal density of a log-transformed variable and the lognormal density: the lognormal builds in the Jacobian. Specifically,

```
lognormal(y | mu, sigma)
  = normal(log(y) | mu, sigma) * |d log(y) / dy|
  = normal(log(y) | mu, sigma) * (1 / y)
```

But since that Jacobian term `(1 / y)` only involves `y`, you can get away with coding this in Stan either way if `y` is data. If `y` is a parameter, you get different results with `lognormal(y | mu, sigma)` and `normal(log(y) | mu, sigma)`.
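To make the data case concrete, here’s a minimal Stan sketch (the model is illustrative; `mu` and `sigma` are left with flat priors just to keep it short). The two sampling statements differ only by the constant `-log(y)` in the log density, so they yield the same posterior:

```stan
data {
  real<lower=0> y;
}
parameters {
  real mu;
  real<lower=0> sigma;
}
model {
  y ~ lognormal(mu, sigma);
  // Same posterior, since the omitted Jacobian term -log(y) is constant:
  // log(y) ~ normal(mu, sigma);
}
```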
Is there a section of the manual that you could point to? Preferably with an example where you do need to add a Jacobian and one where you don’t. (Given that most people don’t understand how to do a change of variables, or when they need to do one…)
Yes, there’s a change of variables chapter with contrasting examples where you do and don’t need Jacobians. But you’re right—it’s vexing when it gets to multiple dimensions.
The standard problem users run into is defining two parameters but putting a prior only on some function of the two: the resulting prior is improper unless there’s also a prior on one of the parameters or on some other function of them. That intrinsic dimensionality argument, and why you get ridges, is also covered in the problematic posteriors chapter of the manual.
Yes:

```
a ~ normal(1, 1)
b ~ normal(1, 1)
a / b ~ normal(3, 0.2)
```

with `a` and `b` confined to positive values, is a perfectly well-defined prior, for example. You can think of it as independent (truncated) normals multiplied by a further weighting function that describes the dependency structure between the two variables. If you try to apply some kind of Jacobian “correction” to the last line, you get a different model. What model do you want?
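Here’s a complete, runnable version of that model in Stan, as a sketch (the `<lower=0>` declarations handle the confinement to positive values). Stan’s parser should emit the Jacobian warning on the last line, which is exactly the point raised above: the warning is a heuristic, and in this model the statement is intentional:

```stan
parameters {
  real<lower=0> a;
  real<lower=0> b;
}
model {
  a ~ normal(1, 1);        // truncated to a > 0 by the constraint
  b ~ normal(1, 1);        // truncated to b > 0 by the constraint
  a / b ~ normal(3, 0.2);  // deliberate extra weighting on the ratio;
                           // no Jacobian adjustment wanted here
}
```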
Also, in the lognormal example,

```
y ~ normal(0, 1);
ylognormal = exp(y);
```

gives `ylognormal` (a transformed parameter) the push-forward measure obtained by pushing the normal through the exp function, whereas with `ylognormal` declared as a parameter,

```
log(ylognormal) ~ normal(0, 1);
```

doesn’t give `ylognormal` a lognormal density, but it does give it *a* density. So whether you need a Jacobian correction comes down to which density you want `ylognormal` to have. The reason it’s “obvious” which one we want is that we immediately assume something about the motivations of the programmer. In a much more complicated model, where motivations are not so clear, it’s all the more important to think carefully about what you want, and to have the dog wag the tail rather than the other way around.
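Side by side, the two versions look like this in Stan (a sketch; the second deliberately omits any Jacobian term):

```stan
// Version 1: ylognormal is a transformed parameter and inherits the
// push-forward (lognormal) measure from the normal prior on y.
parameters {
  real y;
}
transformed parameters {
  real<lower=0> ylognormal = exp(y);
}
model {
  y ~ normal(0, 1);
}
```

```stan
// Version 2: ylognormal is itself the parameter. Without a Jacobian
// term, its density is proportional to normal(log(ylognormal) | 0, 1):
// a perfectly usable density on (0, inf), just not the lognormal.
parameters {
  real<lower=0> ylognormal;
}
model {
  log(ylognormal) ~ normal(0, 1);
}
```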
Yup, not the right situation for a Jacobian. You only want the Jacobians when you have a parameter `x`, define `y = f(x)` for some monotonic invertible `f`, and want to define `p(x)` in terms of `p(y)`.
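In Stan, that case looks something like the following sketch, with `f = log`, so the adjusted density on `x` is just lognormal(0, 1):

```stan
parameters {
  real<lower=0> x;
}
model {
  // Define p(x) through y = log(x):
  // p(x) = normal(log(x) | 0, 1) * |dy/dx|
  target += normal_lpdf(log(x) | 0, 1);
  target += -log(x);  // log Jacobian: log |d/dx log(x)| = -log(x)
}
```

Drop the `target += -log(x);` line and you’re back in the Version 2 situation above.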