L-BFGS-B comment

avehtari · October 13, 2016, 3:42pm

Yes, please!

Aki

Bob_Carpenter · October 13, 2016, 5:29pm

I created an issue:

Bob

Marcus_Brubaker · October 13, 2016, 7:20pm

The current approach without the Jacobian already includes this. The implicit uniform priors on the constrained parameters are already in the posterior p( theta | y ) as defined by the model.

Maybe an example will help. Consider the model:

parameters {
  real<lower=0> x;
}
model {
  x ~ exponential(1);
}

What is the MAP estimate of x? I suspect that you’re thinking that it should be at x=0, where the exponential distribution is maximized. However, if you include the Jacobian transformation, you’ll get x = 1.

The constrained distribution p( x ) is simply p(x) = exp(-x).
The unconstrained distribution is

q( x_unc ) = exp(-exp(x_unc)) * exp(x_unc) 
                 = exp(x_unc - exp(x_unc))

If you maximize p(x) wrt x, you get x = 0. If you maximize q(x_unc) wrt x_unc you get x_unc ~= 0 and x = exp(x_unc) ~= 1.
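
For anyone who wants the calculus spelled out, here is a short derivation of both modes. It uses only the two densities defined above and adds no new assumptions. Note that it gives x_unc = 0 exactly, so x = 1 exactly.

```latex
\begin{align*}
\log q(x_{\mathrm{unc}}) &= x_{\mathrm{unc}} - \exp(x_{\mathrm{unc}}) \\
\frac{d}{dx_{\mathrm{unc}}} \log q(x_{\mathrm{unc}}) &= 1 - \exp(x_{\mathrm{unc}}) = 0
  \;\Longrightarrow\; x_{\mathrm{unc}} = 0, \quad x = \exp(0) = 1 \\
\frac{d^2}{dx_{\mathrm{unc}}^2} \log q(x_{\mathrm{unc}}) &= -\exp(x_{\mathrm{unc}}) < 0
  \quad \text{(so this is a maximum)} \\
\frac{d}{dx} \log p(x) &= -1 < 0
  \quad \text{(so } p \text{ is decreasing on } x \ge 0 \text{ and is maximized at } x = 0\text{)}
\end{align*}
```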

In Stan right now, the optimizer will (try to) give you a value of x=0. (I say try because it would require the optimizer getting to -infinity but that’s another issue.)
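
A quick numerical check of the two answers, as a sketch only. This is plain Python with a grid search, not Stan's optimizer; the function names and the grid bounds [-10, 10] are made up for illustration. It compares the unconstrained target with the Jacobian term against the same target without it.

```python
import math

def log_p(x):
    # Log density of exponential(1) on the constrained scale (x >= 0).
    return -x

def log_q(x_unc):
    # Unconstrained log density WITH the log-Jacobian term:
    # log p(exp(x_unc)) + x_unc = x_unc - exp(x_unc).
    return log_p(math.exp(x_unc)) + x_unc

def log_p_unc(x_unc):
    # Same target WITHOUT the Jacobian term, evaluated at x = exp(x_unc).
    return log_p(math.exp(x_unc))

def argmax_on_grid(f, lo, hi, n=20001):
    # Hypothetical helper: brute-force maximizer over an evenly spaced grid.
    step = (hi - lo) / (n - 1)
    grid = [lo + i * step for i in range(n)]
    return max(grid, key=f)

# Illustrative search range on the unconstrained scale (an assumption).
LO, HI = -10.0, 10.0

x_unc_with_jac = argmax_on_grid(log_q, LO, HI)
x_unc_no_jac = argmax_on_grid(log_p_unc, LO, HI)

x_with_jac = math.exp(x_unc_with_jac)  # mode with the Jacobian, near 1
x_no_jac = math.exp(x_unc_no_jac)      # pinned at exp(LO), close to 0

print("with Jacobian:    x_unc =", x_unc_with_jac, " x =", x_with_jac)
print("without Jacobian: x_unc =", x_unc_no_jac, " x =", x_no_jac)
```

Without the Jacobian term the maximizer sits on the lower edge of the grid no matter how far the range is widened, which is the "getting to -infinity" behaviour mentioned above.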

I don’t know where the idea came from that we’re not currently doing a valid MAP estimate; we most definitely are.

Perhaps we need to have a skype call to sort this out?

Bob_Carpenter · October 13, 2016, 7:47pm

Marcus_Brubaker
October 13
I don’t know where the idea came from that we’re not currently doing a valid MAP estimate, we most definitely are.

I do. What Aki requested was that the optimizer optimize the same density
as was being sampled. I thought that meant we would need to include
the Jacobian. My calculus is atrocious, so I’m almost certainly wrong
if there’s any doubt as to who’s confused.

The example is great. Thanks. I think this sorts it out. Let me
let it sink in and I’ll get back to you if I need further clarification.
I’ll close the issue in the meantime.

Thanks.

Bob
