Any tutorial-like examples for stan::optimization?

I would like to follow up on this thread: https://groups.google.com/forum/#!topic/stan-users/CPOrkY4gaMk

I have a C++ function that I would like to optimize using stan::services::optimize::lbfgs

template <typename T>
T objective (const Eigen::Ref<const Eigen::Matrix<T, 1, -1>>& params) {
    ...
}

I’ve been trying to figure it out by looking at the unit test and the .hpp generated for linear regression, but it’s a losing battle.

Any help would be greatly appreciated,

krey

Not that I know of. The issue is that the algorithms all assume a model that follows the model concept (so implements all the stuff in the .hpp file you were looking at). We need a base class for people to work from, but nobody has had the time to write one, basically.

Out of curiosity, what drew you to Stan, rather than another platform with autodiff such as TensorFlow?

There’s a unit test that’ll show exactly how it’s called. I’m on my phone now, but when I get a free minute at my computer, I’ll dig up exactly how to run the optimization routine. Are you looking for lbfgs?

Here’s the test for the Stan C++ method for lbfgs:

Hey Aaron

I got the impression that TensorFlow was harder to use/deploy than Stan, which is just a bunch of headers.
But really, it was the documentation that sealed the deal https://arxiv.org/abs/1509.07164

(Funnily enough, they don’t seem to have an optimizer: https://github.com/tensorflow/tensorflow/issues/9837)

Hey Daniel,

I linked that too in my original post.
Good to know that I’m looking at the right stuff!

As Krzysztof pointed out, the issue is that my “model” is in C++, not in Stan.
Do you happen to have a pure C++ example?

Is your C++ function templated so that it can accept the var type?
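That is, would something like this compile and run, with stan::math::gradient instantiating it at var under the hood? A minimal sketch, with a toy quadratic standing in for your objective (the column-vector functor shape is just what the gradient functional expects):

#include <stan/math.hpp>
#include <Eigen/Dense>
#include <iostream>

// Toy stand-in for the real objective; any T supporting ordinary
// arithmetic (double, stan::math::var) works.
struct toy_objective {
  template <typename T>
  T operator()(const Eigen::Matrix<T, Eigen::Dynamic, 1>& theta) const {
    T sum = 0;
    for (int i = 0; i < theta.size(); ++i)
      sum += theta(i) * theta(i);  // sum of squares
    return sum;
  }
};

int main() {
  Eigen::VectorXd x = Eigen::VectorXd::Ones(3);
  double fx;
  Eigen::VectorXd grad;
  // Instantiates operator() with stan::math::var internally.
  stan::math::gradient(toy_objective(), x, fx, grad);
  std::cout << "f = " << fx << ", grad = " << grad.transpose() << std::endl;
}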

Hey Krzysztof,

Yes, that’s the only reason it’s templated!
Sorry, I should’ve made that more clear.

Then, if you really don’t want to understand the .hpp file, you could make a dummy Stan file that has the data/parameters sections your function needs. Add a functions section with the signature your function would need and a dummy implementation, and a dummy model body with something like:

target += my_function(....)

Then replace the my_function call with the call to your actual C++ function, change the includes, etc. That should work fine.
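Something like this, say (names and dimensions here are placeholders for whatever your objective actually takes):

functions {
  real my_function(row_vector params) {
    return 0; // dummy body; you swap out the C++ Stan generates for this
  }
}
parameters {
  row_vector[3] params; // match your objective's dimensionality
}
model {
  target += my_function(params);
}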

Alternatively, just read through the .hpp file to understand what a model is expected to provide.
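From memory, the shape is roughly this, abridged and possibly out of date, so treat the generated .hpp and the wiki doc as authoritative:

#include <cstddef>
#include <vector>
#include <string>
#include <ostream>

// Abridged sketch of the model concept.
class my_model {
 public:
  // the objective: log density, templated so it can be autodiffed
  template <bool propto, bool jacobian, typename T>
  T log_prob(std::vector<T>& params_r, std::vector<int>& params_i,
             std::ostream* msgs) const;

  size_t num_params_r() const;
  size_t num_params_i() const;

  // bookkeeping the services layer uses for naming and I/O
  std::string model_name() const;
  void get_param_names(std::vector<std::string>& names) const;
  void constrained_param_names(std::vector<std::string>& names,
                               bool include_tparams, bool include_gqs) const;
  // ... plus transform_inits, write_array, etc.
};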

… or just incorporate your function into a branch of Stan: 1) add it to the math library; and 2) add the function to the Stan language… it’s not an onerous process and we’re happy to give pointers (there’s a tutorial in the wiki).

I can send over an example later. The Model template isn’t that complicated. Do you have an example function you want to use?

There’s also doc on the Wiki for the model concept.


I forgot we had that!

@krey, some bad news: it’s not as easy as I thought. The good news: it’s easy enough to have Stan generate something with all the methods other than the objective function (or you can follow the model template wiki that @Bob_Carpenter linked to).

We really need to make the optimization routines take a simple functor.

+100, all the algorithms.

As I always say, we take pull requests.

These things always sound easy when you just drop one-liners into the group. When you get down and wrestle with the guts of what’s going on, you’ll see things are much more complex. Let’s just take the first step and try to flesh out “take a simple functor”.

When you say “simple functor”, do you mean one with a templated method like this:

template <typename T>
T operator()(const Eigen::Matrix<T, -1, 1>& theta) const;

That is, we assume the functor provides something we can run with autodiff? Or do you want to supply a class instance with two operations, one to do the log density and one to do gradients? Or maybe one method to compute both and return

std::pair<double, Eigen::VectorXd> operator()(const Eigen::VectorXd& theta) const;

where the first return element is the value and the second is the gradient? Or maybe for efficiency we pass the gradient in?

double operator()(const Eigen::VectorXd& theta, Eigen::VectorXd& grad) const;

where it returns the log density and sets the gradient?
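For what it’s worth, if the functor is templated as in the first option, the other forms can be derived mechanically; here’s a sketch of an adapter to the last form using the reverse-mode gradient functional in stan::math:

#include <stan/math.hpp>
#include <Eigen/Dense>

// Sketch: wrap a templated functor (first option) so it presents the
// value-plus-gradient interface (last option).
template <typename F>
struct grad_adapter {
  F f;
  explicit grad_adapter(const F& f_in) : f(f_in) {}
  double operator()(const Eigen::VectorXd& theta, Eigen::VectorXd& grad) const {
    double fx;
    stan::math::gradient(f, theta, fx, grad);  // fills grad by autodiff
    return fx;
  }
};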

What do we do about saving intermediate output? If it’s a simple functor, where do print statements go in the model that generated the functor? How do things like rejections in the body of the functor get handled and reported? Presumably that would mean wrapping up the model class into a simple functor where all this stuff was plumbed into the evaluation. What do we then do about saving output such as warnings, rejections, intermediate state, etc.?

No problem with speculating. I was thinking a little more about the output piece of this. My suggestion long term is to make the policy for handling output a separate template parameter for the algorithms. It could have some sensible defaults, and it would still require names and dimensions from the model objects, but it doesn’t all need to be packed into one concept. I don’t see a need to plumb output from the algorithm back through the model.
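A rough sketch of the shape I have in mind (all names here are hypothetical):

#include <string>
#include <Eigen/Dense>

// Hypothetical output policy with a do-nothing default; the algorithm
// calls into it instead of owning the output logic itself.
struct null_output_policy {
  void iteration(int n, double fx, const Eigen::VectorXd& theta) {}
  void message(const std::string& msg) {}
};

template <typename Model, typename OutputPolicy = null_output_policy>
int lbfgs(Model& model, OutputPolicy& out /* , tolerances, seed, etc. */);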