Seamlessly Running Stan Much Faster (using AWS/HPC etc)

Some Stan models run too slowly. Let’s assume for a moment that the ongoing work by Stan developers results in a version of Stan that can run on Amazon Web Services (AWS), High Performance Computing (HPC) resources and/or GPU farms, and that can generate samples much faster than is possible with the current “desktop” instantiation of Stan. Ongoing work on within-chain parallelism, on running Stan on GPUs, and on Sequential Monte Carlo samplers makes it likely that this will be possible.

To maximise uptake, I think it will be important that users can make use of such a future version of Stan without resorting to the command line, so the interface to it needs to be easy to use. My sense is that we’d sensibly keep the interface as close as we can to the one we currently have: you would still call the same functions from R/Julia/Python, but you might need to call some additional functions to associate the remote compute resource with the local instance of Stan and/or log in to that remote resource.
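To make the "same functions, plus one extra call" idea concrete, here is a minimal Python sketch. Every name in it (`LocalBackend`, `RemoteBackend`, `use_backend`, `sample`) is hypothetical and made up for illustration; none of it is an existing Stan interface, and the backends only return stub dictionaries rather than running any sampler.

```python
from dataclasses import dataclass
from typing import Optional, Protocol


class Backend(Protocol):
    """Anything that can take a Stan program plus data and run it somewhere."""

    def run(self, program_code: str, data: dict) -> dict: ...


@dataclass
class LocalBackend:
    """Hypothetical stand-in for today's desktop execution."""

    def run(self, program_code: str, data: dict) -> dict:
        # A real implementation would compile and sample locally.
        return {"where": "local", "n_data": len(data)}


@dataclass
class RemoteBackend:
    """Hypothetical stand-in for an AWS/HPC/GPU resource."""

    url: str
    token: Optional[str] = None  # placeholder for whatever login is needed

    def run(self, program_code: str, data: dict) -> dict:
        # A real implementation would ship the program and data to self.url.
        return {"where": self.url, "n_data": len(data)}


_backend: Backend = LocalBackend()


def use_backend(backend: Backend) -> None:
    """The one additional call: associate a compute resource with this session."""
    global _backend
    _backend = backend


def sample(program_code: str, data: dict) -> dict:
    """The call users already make; it does not change when the backend does."""
    return _backend.run(program_code, data)
```

With something like this, a user would call `use_backend(RemoteBackend(url=...))` once and then keep calling `sample(...)` exactly as before. Whether the association should be global state, as here, or an argument on the model object is one of the design choices worth discussing.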

I’m keen to solicit thoughts on how something like this might sensibly appear to the users. I will start a separate thread on how to achieve what the users articulate that they would like.


The developer-centric spin on the same issue is here:


Moving this over from a duplicate thread in ‘interfaces’; I closed that topic but can’t delete it.

There is a server interface, httpstan (GitHub: stan-dev/httpstan, an HTTP interface to Stan, a package for Bayesian inference), that was mentioned and that covers half the problem (hosting). A client should be fairly straightforward.
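As a rough idea of what a thin client might involve, here is a Python sketch that builds the two requests I understand httpstan to expect: one to compile a model and one to start a fit. The endpoint paths, the `program_code`/`function`/`data`/`num_samples` fields, and the default `localhost:8080` address are my reading of the httpstan documentation and should be treated as assumptions to check against the version you run; the helper names are made up for illustration.

```python
import json
import urllib.request

# Name of the Stan services function for NUTS sampling (assumed from httpstan docs).
HMC_NUTS = "stan::services::sample::hmc_nuts_diag_e_adapt"


def create_model_request(base_url, program_code):
    """Return (url, json_body) for compiling a Stan program on the server."""
    return f"{base_url.rstrip('/')}/v1/models", {"program_code": program_code}


def create_fit_request(base_url, model_name, data, num_samples=1000):
    """Return (url, json_body) for starting a fit.

    model_name is the "models/<id>" string the server is assumed to
    return when the model is created.
    """
    body = {"function": HMC_NUTS, "data": data, "num_samples": num_samples}
    return f"{base_url.rstrip('/')}/v1/{model_name}/fits", body


def post_json(url, body):
    """POST a JSON body and decode the JSON response (needs a running server)."""
    req = urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Against a running server the flow would be roughly `model = post_json(*create_model_request(base, code))` followed by `post_json(*create_fit_request(base, model["name"], data))`, then polling the returned operation until it is done. Authentication, data transfer for large datasets, and retrieving draws are the parts this sketch leaves out.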

@storopoli suggested an option at the cmdstanR level, something like:

library(cmdstanr)
# `server` is a hypothetical argument; it is not part of cmdstanr today
model <- cmdstan_model("my_slow_model.stan", server = "https://monster_server_farm",...)
fit <- model$sample(data = stanData)

Someone (I can’t remember who) mentioned that RStudio already has a launcher feature that may offer full-on hosting, since they do run Stan in RStudio Cloud. I have sent an email asking about pricing, which interfaces are run, etc. On first examination this is a GUI-integrated ‘run in the cloud’ service: Using Background Jobs in the RStudio IDE (RStudio Solutions).

I’ll report back on the RStudio Launcher feature.

Please add ideas/elaborate as you see fit.



FYI, one can programmatically spin up a VM on Google Compute Engine via this package.