For some problems, it’s much faster to optimize and then importance sample, using a Gaussian proposal built from the Hessian at the optimum as the initial distribution. I’ve written up a function to handle this, but it gets a bit ugly when trying to leverage multiple cores. Currently, the basic algorithm is:

optimize / get Hessian
for i in 1:nloops {
  sample ndraws from proposal
  split draws across cores, compute lp
  combine draws, importance sample, update proposal
}
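A toy sketch of this loop in base R, with a closed-form `lp()` standing in for Stan’s `log_prob()` and an independent-normal proposal standing in for the mode + inverse-Hessian Gaussian; all names and values here are illustrative, not the actual ctsem code:

```r
# Iterated importance sampling with a moment-matched Gaussian proposal.
set.seed(1)
lp <- function(x) sum(dnorm(x, mean = c(1, 2), sd = 1, log = TRUE))  # toy "posterior"

mu  <- c(0, 0)   # stand-in for the mode found by optimizing()
sdv <- c(2, 2)   # stand-in for sqrt(diag(inverse Hessian)), deliberately wide
ndraws <- 4000
for (i in 1:3) {
  # sample ndraws from the current proposal
  draws <- cbind(rnorm(ndraws, mu[1], sdv[1]), rnorm(ndraws, mu[2], sdv[2]))
  # log importance weights: target log density minus proposal log density
  lq <- dnorm(draws[, 1], mu[1], sdv[1], log = TRUE) +
        dnorm(draws[, 2], mu[2], sdv[2], log = TRUE)
  lw <- apply(draws, 1, lp) - lq
  w  <- exp(lw - max(lw)); w <- w / sum(w)
  # update the proposal by moment-matching the weighted draws
  mu  <- colSums(w * draws)
  sdv <- sqrt(colSums(w * sweep(draws, 2, mu)^2))
}
```

After a few iterations `mu` and `sdv` should sit close to the target’s mean and standard deviation; in the real version the `lp` evaluations are what gets split across cores.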

but in order to ‘split draws across cores’, the stanfit object has to be recreated each time, on each core (the stanfit object is needed to compute the lp from the draws). This works OK (though removing this overhead would be nice), but the main problem is that R eventually crashes. I presume it’s related to too many DLLs being loaded, or something similar, but I don’t know exactly why.

Any suggestions for avoiding / reducing this problem?

Is it the same model each time? If so, are you compiling the model with stan_model() outside of the loop and then using the stanmodel object in a call to optimizing() inside the loop?

Also, what exactly do you mean by “using the Hessian from optimization as an initial distribution”? I have seen the posterior mode plus a scaled Hessian used as a proposal distribution before, but I don’t quite see how one uses the Hessian to form an initial distribution.

Yes, same model each time. I compile outside the loop; inside the loop I call sampling() with iter=1 and max_treedepth=1 just to generate a stanfit object that I can use with log_prob().
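For reference, that pattern looks roughly like this (untested sketch; it assumes a compiled stanmodel `sm`, a data list `standata`, and a named parameter list `pars_list`):

```r
library(rstan)

# Throwaway one-iteration run, just to obtain a stanfit object tied to the data:
fit <- sampling(sm, data = standata, chains = 1, iter = 1, refresh = 0,
                control = list(max_treedepth = 1))

# Evaluate the log posterior at a point on the unconstrained scale:
upar <- unconstrain_pars(fit, pars_list)
lp   <- log_prob(fit, upar, adjust_transform = TRUE)
```

The one-iteration sampling() call is pure overhead here; it exists only because log_prob() and unconstrain_pars() are methods on stanfit objects.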

Re the Hessian: yes, I mean just as you say, an initial proposal distribution; sorry if that wasn’t clear :)

avehtari - you can find the code buried at the bottom of ctStanFit in the ctsem GitHub repo, but it’s probably quite some work to make sense of; I was planning to pull it out into a standalone function anyway - I can do that next week. I had to learn about IS / iterated IS while coding it, so I would be very pleased to see / discuss improvements :)

I don’t understand your point re putting the log density into a function - what is the stanfit object doing except providing that function? I don’t see which bits of the model (except generated quantities, which don’t seem to be used by log_prob() anyway) could be dropped.

By calling that function directly you avoid the overhead of starting any algorithm, building the autodiff expression tree, computing gradients, etc. You then need to call that function for each draw separately, but since you wanted to parallelize anyway, that gives you a natural place to split the computation. It’s more coding, but you asked for a speedup :)
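For example, once the log density is available as a plain R function, the per-draw calls split across cores directly (toy stand-in below; mclapply forks, so on Windows you would use a parLapply cluster instead):

```r
library(parallel)

lp <- function(x) -0.5 * sum(x^2)       # stand-in for the exposed log density
draws <- matrix(rnorm(200), ncol = 2)   # 100 draws from some proposal

# evaluate lp for each draw, split across 2 cores
lps <- unlist(mclapply(seq_len(nrow(draws)),
                       function(i) lp(draws[i, ]), mc.cores = 2))
```

Nothing in the workers touches rstan, so the repeated stanfit recreation (and whatever is crashing R) drops out entirely.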

Calling rstan again too quickly seems to lead to crashes. A small delay between calls seems to help.

So, just to be explicit about it: are you suggesting I create a large custom Stan function that contains my model code, but with the parameters determined by a parameter-vector argument, and then do something like expose_stan_functions() and call that function with my sample as the argument? I guess it shouldn’t be ‘too’ hard to paste the model together inside a function, and it would resolve this difficulty with repeated rstan calls…
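A sketch of that idea (untested; the Stan code, function name, and data objects are all illustrative, not the actual ctsem model):

```r
# A functions block holding the model's log density, with the parameters
# passed in as a vector argument:
stancode <- "
functions {
  real model_lp(vector theta, vector y) {
    return normal_lpdf(y | theta[1], exp(theta[2]));
  }
}
"
rstan::expose_stan_functions(rstan::stanc(model_code = stancode))

# model_lp() is now an ordinary R function, callable per draw:
lps <- sapply(seq_len(nrow(draws)), function(i) model_lp(draws[i, ], y))
```

Calling the same model_lp() from the model block (target += model_lp(theta, y)) keeps the sampling and importance-sampling code paths identical.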

Yes. You can also call that same function from the model block so that you are using the same code for sampling. This is of course not the optimal way; it’s a known issue, and an easier way is on the Stan roadmap, but it will take some time before it can be fixed.

I’ve been using sampling(). When I try your suggestion it first complains about data types, and then, after dealing with that mess, I can’t apply log_prob() to the resulting object class. I think I’m doing something wrong?

@avehtari I needed a way to test the performance of the log_prob and gradient functions for some optimizations for my model. I knew it had to be possible since there is the log_prob function, so I read through the stanfit class in RStan to figure out how this was done and how to do it without first generating a stanfit object.

It’s not really a supported feature of RStan, as evidenced by the call to the unexposed grab_cxxfun method (the :::), but it gets the job done!

I really think that RStan should expose a “model with data” object to make this sort of thing easier.

[edit]
I actually remembered I posted about this when I first discovered it:

I’ve been lobbying for just this three-stage process, as it’s how Stan works under the hood. I think it may be in the works.

| Stage | Object | How it is created | RStan call |
|---|---|---|---|
| 1 | Joint distribution given by Stan program | compile C++ class | stan_model() |
| 2 | Posterior | instantiate C++ class with data | n/a |
| 3 | Posterior draws | create stanfit object with sample from posterior | sampling() |

(2) is the “model with data” object that @aaronjg suggests, and there’s currently no implementation to match. It is exactly this posterior (model conditioned on data) object that is needed for the log density function, variable transforms, etc. The shapes and sizes of all the variables are known only after instantiating the C++ class; before that, all that is known is the static member variables holding the variable names and shapes (but not sizes).
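In current RStan the three stages therefore collapse into two calls, with stage (2) happening implicitly inside sampling() (sketch; the file and variable names are illustrative):

```r
sm  <- stan_model("model.stan")        # (1) compile the C++ class
fit <- sampling(sm, data = standata)   # (2) + (3): instantiate with data, then sample

# The log density and the variable transforms need the instantiated
# stage-(2) object, which is why a full stanfit is required here:
lp <- log_prob(fit, unconstrain_pars(fit, pars_list))
```

A first-class stage-(2) object would let users get log_prob() and unconstrain_pars() without paying for a sampling run at all.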