Basic (newbie) question about transforming a parameter and Jacobians

I know this gets asked about sometimes, but I’m trying to make sure I understand when to include Jacobians in the model or not. I’m sure this is a stupid question, so brace yourselves!

Say we were interested in a model like this:

data_vector ~ N(mu, sigma)
sigma ~ LN(0, 100)

Sometimes, people might want to rewrite it like this:

data_vector ~ N(mu, sigma)
log(sigma) ~ N(0, 100)

But we could also write it like this:

data_vector ~ N(mu, exp(log_sigma))
log_sigma ~ N(0, 100)

So, for my questions:
(1) The second model requires a Jacobian correction, correct?
(2) Would the third model require such a correction? I can’t quite decide. I think it probably would…

Assuming sigma is declared in the parameters block, then the second model requires a Jacobian correction. Assuming log_sigma is declared in the parameters block, the third model does not require a Jacobian correction. Essentially the rule is that if you put priors only on the things declared in the parameters block, then no Jacobian corrections are required and adding Jacobian corrections would imply that you are drawing from a posterior distribution that you did not intend.

Conceptually, you are drawing from the conditional distribution of the parameters given the data (and the transformed data); only in rare cases would you need to do a change-of-variables to draw from the conditional distribution of changed parameters given the data (and the transformed data), in which case you have to do a Jacobian correction to reconcile what you want with what Stan is going to do.
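To make the rule concrete, here is a small numeric sketch in plain Python (standard library only). The function names are made up for illustration, and the 100 is treated as a scale (standard deviation), which is how Stan parameterizes normal and lognormal. It checks that the second model's statement, log(sigma) ~ N(0, 100) with sigma as the parameter, matches the lognormal prior of the first model only once the log-Jacobian term -log(sigma) is added.

```python
import math

def normal_lpdf(x, mu, sd):
    """Log density of Normal(mu, sd) at x (sd is a standard deviation)."""
    return -0.5 * math.log(2 * math.pi) - math.log(sd) - 0.5 * ((x - mu) / sd) ** 2

def lognormal_lpdf(x, mu, sd):
    """Log density of LogNormal(mu, sd) at x > 0."""
    return normal_lpdf(math.log(x), mu, sd) - math.log(x)

def model2_log_prior(sigma, with_jacobian):
    """Prior term from `log(sigma) ~ N(0, 100)` when sigma is the parameter."""
    lp = normal_lpdf(math.log(sigma), 0.0, 100.0)
    if with_jacobian:
        # log |d log(sigma) / d sigma| = log(1 / sigma) = -log(sigma)
        lp += -math.log(sigma)
    return lp

for sigma in (0.5, 1.0, 3.0):
    target = lognormal_lpdf(sigma, 0.0, 100.0)
    # First difference: with the correction. Second: without it.
    print(sigma,
          model2_log_prior(sigma, True) - target,
          model2_log_prior(sigma, False) - target)
```

With the correction the difference from the lognormal log density is zero; without it the difference is log(sigma), which varies with sigma and so changes the posterior rather than just rescaling it.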

Thanks!

If (3) doesn’t require it, is that particular to Stan somehow (perhaps it handles it behind the scenes?), or would that still be the case if I were to e.g., write a basic random walk MH sampler?

In the latter case, for example, if I were generating proposals using logsigma_t+1 ~ N(logsigma_t, variance), I thought that does still require a Jacobian adjustment.

It is not at all particular to Stan. If logsigma_t is declared in the parameters block, then

target += normal_lpdf(logsigma_t + 1 | logsigma_t, sigma); 

conceptually requires a Jacobian correction. But the derivative of logsigma_t with respect to logsigma_t + 1 is a constant (i.e., 1), so adding the logarithm of a constant to the posterior log-kernel in practice would have no effect on the values of the proposed parameters or which proposals are accepted, so you might as well just omit the Jacobian correction.

Just to stress @bgoodri’s answer here, Stan simply requires you to define the log density over which you want to sample. So no, there’s nothing special about where Stan requires Jacobians—it’s in exactly the same places as the math stats textbooks.

Where things get confusing is when there are constrained parameters. For those, Stan transforms to the unconstrained scale and automatically applies a Jacobian correction. From the user’s perspective, this just means that, absent any other prior statement, each parameter is by default uniform over its declared constraints.
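As a rough illustration of what that automatic step amounts to for a parameter with a lower bound of zero, here is a sketch in plain Python. It shows the idea only, not Stan's actual implementation; the Exponential(1) target and all function names are assumptions chosen for the example. The sampler works with u = log(sigma), and the log density on u is the log density on sigma plus log |d sigma / d u| = u.

```python
import math

def log_p_sigma(sigma):
    """Assumed target on the constrained scale: sigma ~ Exponential(1)."""
    return -sigma

def log_p_unconstrained(u):
    """Log density of u = log(sigma): log p(exp(u)) + log |d sigma / d u|."""
    sigma = math.exp(u)
    return log_p_sigma(sigma) + u  # log |d exp(u) / d u| = u

def integrate(f, lo, hi, n=200_000):
    """Midpoint-rule integral of f over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(f(lo + (i + 0.5) * h) for i in range(n))

# Integrate over u; the range [-30, 5] covers essentially all the mass here.
mass = integrate(lambda u: math.exp(log_p_unconstrained(u)), -30.0, 5.0)
mean_sigma = integrate(
    lambda u: math.exp(u) * math.exp(log_p_unconstrained(u)), -30.0, 5.0
)
print(mass, mean_sigma)  # both should be close to 1 for Exponential(1)
```

Because the Jacobian term is included, integrating over u recovers the same total mass and the same mean of sigma as integrating the original density over sigma.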

That was actually what I was trying to get at in my last question (if my hamfisted notation weren’t so… hamfisted). The case of making proposals on an unconstrained parameter space, then transforming back, is where I’m used to having to include corrections when I write a basic MH sampler, for instance. Knowing that Stan takes care of that particular aspect in the background is very helpful. Thanks!
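For anyone who wants to see that hand-written case, here is a minimal random-walk Metropolis sketch in plain Python. Everything in it is an assumption for illustration: the toy target sigma ~ LogNormal(0, 1), the step size, the seed, and the function names. Proposals are made on u = log(sigma). If the density is specified on sigma, the term log |d sigma / d u| = u has to be added; leaving it out samples a different distribution.

```python
import math
import random

def log_lognormal_kernel(sigma):
    """Log of the LogNormal(0, 1) density on sigma, up to a constant."""
    log_s = math.log(sigma)
    return -0.5 * log_s ** 2 - log_s

def log_target_u(u, with_jacobian):
    """Log density (up to a constant) used by the sampler on u = log(sigma)."""
    lp = log_lognormal_kernel(math.exp(u))
    if with_jacobian:
        lp += u  # log |d sigma / d u| with sigma = exp(u)
    return lp

def rw_metropolis(with_jacobian, n_iter=50_000, step=2.0, seed=1):
    """Random-walk Metropolis on u; returns the list of draws of u."""
    rng = random.Random(seed)
    u = 0.0
    lp = log_target_u(u, with_jacobian)
    draws = []
    for _ in range(n_iter):
        u_prop = u + rng.gauss(0.0, step)
        lp_prop = log_target_u(u_prop, with_jacobian)
        # Symmetric proposal, so the acceptance ratio is the density ratio.
        # 1 - random() lies in (0, 1], which keeps log() well defined.
        if math.log(1.0 - rng.random()) < lp_prop - lp:
            u, lp = u_prop, lp_prop
        draws.append(u)
    return draws

for flag in (True, False):
    draws = rw_metropolis(flag)
    print(flag, sum(draws) / len(draws))
```

With the correction, log(sigma) is N(0, 1), so its sample mean should sit near 0. Without it, the kernel on u works out to N(-1, 1), so the mean drifts toward -1. If the prior had instead been written directly on log_sigma, as in the third model, no such term would be needed.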

Don’t worry about the notation—Andrew, Matt, and I spent about a month when we first started this project trying to calibrate notation and terminology!