Yes, please!
Aki
Yes, please!
Aki
I created an issue:
The current approach without the Jacobian already includes this. The implicit uniform priors on the constrained parameters are already in the posterior p( theta | y ) as defined by the model.
Maybe an example will help. Consider the model:
parameter {
real<lower=0> x;
}
model {
x ~ exponential(1);
}
What is the MAP estimate of x? I suspect that youâre thinking that it should be at x=0, where the exponential distribution is maximized. However, if you include the Jacobian transformation, youâll get x = 1.
The constrained distribution p( x ) is simple p(x) = exp(-x)
The unconstrained distribution is
q( x_unc ) = exp(-exp(x_unc)) * exp(x_unc)
= exp(x_unc - exp(x_unc))
If you maximize p(x) wrt x, you get x = 0. If you maximize q(x_unc) wrt x_unc you get x_unc ~= 0 and x = exp(x_unc) ~= 1.
In Stan right now, the optimizer will (try to) give you a value of x=0. (I say try because it would require the optimizer getting to -infinity but thatâs another issue.)
I donât know where the idea came from that weâre not currently doing a valid MAP estimate, we most definitely are.
Perhaps we need to have a skype call to sort this out?
Marcus_Brubaker
October 13
The current approach without the Jacobian already includes this. The implicit uniform priors on the constrained parameters are already in the posterior p( theta | y ) as defined by the model.Maybe an example will help. Consider the model:
parameter {
real<lower=0> x;
}
model {
x ~ exponential(1);
}What is the MAP estimate of x? I suspect that youâre thinking that it should be at x=0, where the exponential distribution is maximized. However, if you include the Jacobian transformation, youâll get x = 1.
The constrained distribution p( x ) is simple p(x) = exp(-x)
The unconstrained distribution isq( x_unc ) = exp(-exp(x_unc)) * exp(x_unc)
= exp(x_unc - exp(x_unc))If you maximize p(x) wrt x, you get x = 0. If you maximize q(x_unc) wrt x_unc you get x_unc ~= 0 and x = exp(x_unc) ~= 1.
In Stan right now, the optimizer will (try to) give you a value of x=0. (I say try because it would require the optimizer getting to -infinity but thatâs another issue.)
I donât know where the idea came from that weâre not currently doing a valid MAP estimate, we most definitely are.
I do. Aki requested was that the optimizer optimize the same density
as was being sampled. I thought that meant we would need to include
the Jacobian. My calculus is atrocious, so Iâm almost certainly wrong
if thereâs any doubt as to whoâs confused.
The example is great. Thanks. I think this sorts it out. Let me
let it sink in and Iâll get back to you if I need further clarification.
Iâll close the issue in the meantime.
Thanks.