Monte Carlo maximum likelihood estimation may be more efficient to compute than a full posterior derivation. Are there examples of doing this in Stan?

Stan only implements MLE via optimization (a limited-memory quasi-Newton method, specifically L-BFGS).

Could I ask a follow-up question: when I use the optimizer, is the prior I declared in the Stan code incorporated? In other words, is it actually calculating the MAP rather than the MLE when I call the optimizer?

Optimization will use the priors. If you don’t want the priors to influence your inference, you can use flat priors. But I hasten to add that (a) flat priors are not non-informative, they just say all parameter values are a priori equally likely, and (b) flat priors can make it much harder to find the optimal parameter values.
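To make the point concrete, here is a minimal sketch (not Stan itself, just a toy SciPy reproduction of the idea): with a flat prior the penalized objective reduces to the log likelihood, so the optimizer returns the MLE, while adding a proper prior shrinks the mode toward the prior mean. The data and the normal(0, 1) prior are made up for illustration.

```python
import numpy as np
from scipy.optimize import minimize_scalar

y = np.array([1.8, 2.2, 2.0, 1.9, 2.1])  # toy data; sigma fixed at 1

def neg_log_lik(mu):
    # negative log likelihood of normal data (up to a constant)
    return 0.5 * np.sum((y - mu) ** 2)

def neg_log_post(mu):
    # same objective plus a normal(0, 1) prior penalty on mu
    return neg_log_lik(mu) + 0.5 * mu ** 2

mle = minimize_scalar(neg_log_lik).x      # sample mean, 2.0
map_est = minimize_scalar(neg_log_post).x # shrunk toward 0: 5 * 2.0 / 6

print(mle, map_est)
```

With a flat prior (no penalty term) the two objectives coincide, which is exactly why “optimization uses the priors” matters only when the priors are informative.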

Unfortunately (in my opinion), Stan uses a different target when sampling (with Jacobians) than when optimizing (without Jacobians). This means you don’t get the MAP from optimizing. Bob explains some of this in this case study: http://mc-stan.org/users/documentation/case-studies/mle-params.html

As clearly stated in the manual and on the website, Stan estimates posterior expectations and computes penalized maximum likelihood estimators. That’s it at the moment and likely for the immediate future.

Yes.

What’s *not* included is the Jacobian correction in mapping from the unconstrained back to the constrained parameters.

That makes our calculation equivalent to a penalized MLE, not a proper posterior mode (aka MAP estimate).

The underlying switch is there to turn the Jacobian on or off, so we could easily compute posterior modes.

Yes, please?

I created an issue on stan-dev/stan. @seantalts, want to have a crack at this one? You said you were looking for something to do in the Stan code. After it goes into Stan, the flags will need to go up through the interfaces, but that should be relatively easy.