StanHeaders and rstan

Hi all,

I currently maintain the R package {rater}, which relies on rstan. Because of this, I have become quite interested in the development and maintenance of rstan.

I have been particularly excited to see the recent work on allowing rstan to more closely match the current version of Stan.

Reading the GitHub threads about these changes, it seems that a major problem is the need to keep the CRAN versions of rstan and StanHeaders consistent. There has also been some discussion about why the StanHeaders package exists and whether it may no longer be necessary.

I wanted to contribute to these discussions but quickly realized that I didn’t know nearly enough to do so! To remedy this I was wondering if anyone could answer a few questions I have about StanHeaders and rstan. Specifically:

  • Why does the StanHeaders package exist?
  • Why do rstan and the StanHeaders package need to be kept in sync?
  • And are the issues with the StanHeaders package the main thing holding rstan back from being more up to date?

Thanks in advance for your thoughts!

Best,
Jeffrey

One of the reasons it exists is that I wanted to use the autodiff functionality in OpenMx. The dependency on StanHeaders is pretty narrowly localized in the OpenMx source code. The code could probably be rewritten to not depend on StanHeaders with a few weeks of work.

FYI, it was me who put together the first version of StanHeaders.

Hi Joshua,

Thanks for that context, and thank you for your work on the original StanHeaders!

It’s interesting that there are packages which depend on StanHeaders and not rstan, though the code below suggests there are only two (other than rstan itself):

rstan_lt <- crandep::get_dep("rstan", "Reverse_linking_to")
sh_lt <- crandep::get_dep("StanHeaders", "Reverse_linking_to")
setdiff(sh_lt, rstan_lt)
> [1] "OpenMx"   "ProbReco" "rstan"

@Joshua_Pritikin already answered the first question. StanHeaders obviously has its own use case and there is no reason to discontinue supporting it.

I do, however, think we should work on shipping the Stan and Stan Math code inside the rstan package, while keeping StanHeaders around for at least as long as other packages depend on it. That is, if it is possible, and the feedback when I asked about this on GitHub was that it is.

StanHeaders contains the Stan Math sources as well as the Stan sources (the NUTS, HMC, VI, … algorithms). The Stan sources also include the Stan-to-C++ transpiler, but that part is going away.
The Stan sources are, however, also part of rstan, because the Stan-to-C++ transpiler is built as part of rstan.

So when we upgrade to a new version of Stan (from X to X+1), we first need to upload StanHeaders X+1, and it has to work with rstan X. This means that the old generated code from the Stan-to-C++ transpiler (part of rstan X) must also compile with StanHeaders X+1.

To me this is not sustainable in the long term, and it causes a lot of work for Ben G., who has to patch things every time. And, I guess, a bunch of ifdefs everywhere.

The Stan-to-C++ transpiler (stanc3 now), Stan, and Stan Math are just too tightly coupled for this to work. Theoretically we could support this process in the backend, but that would either severely limit what we can do there and halt backend development, or force us to add many workarounds, ifdefs, and duplicated code in the generated C++ and the backend to support all previous versions.

Simple example:

We added a function model_compile_info to the generated C++ model. This function returns the version of the Stan-to-C++ transpiler and the flags used to compile. Not super important but can help with numerical reproducibility and debugging. Adding a function in the generated C++ model also meant we had to add a virtual method to the model_base class which lives in the Stan repository.

The problem is that this makes the old generated code not work with the new model_base class, and the new generated code does not work with the old base class.
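To make the coupling concrete, here is a minimal C++ sketch of the situation described above. It is not the actual Stan code: model_base_old, model_base_new, old_generated_model and new_generated_model are hypothetical stand-ins, the real model_base has many more methods, and I am assuming the new method is pure virtual. The returned strings are placeholders, not real transpiler output.

```cpp
#include <cassert>
#include <iostream>
#include <string>
#include <type_traits>
#include <vector>

// Hypothetical stand-in for the base class shipped with Stan version X.
struct model_base_old {
  virtual ~model_base_old() = default;
  virtual std::string model_name() const = 0;
};

// Hypothetical stand-in for version X+1: one extra (assumed pure) virtual method.
struct model_base_new {
  virtual ~model_base_new() = default;
  virtual std::string model_name() const = 0;
  virtual std::vector<std::string> model_compile_info() const = 0;
};

// Shaped like code emitted by the older transpiler: it knows nothing about
// model_compile_info. Templated only so it can be checked against both bases.
template <class Base>
struct old_generated_model : Base {
  std::string model_name() const override { return "example_model"; }
};

// Shaped like code emitted by the newer transpiler. Deriving this from
// model_base_old would not compile, because the second `override` would
// have nothing to override.
struct new_generated_model : model_base_new {
  std::string model_name() const override { return "example_model"; }
  std::vector<std::string> model_compile_info() const override {
    return {"stanc_version = <placeholder>", "stancflags = <placeholder>"};
  }
};

// Old generated code against the old base: complete, can be instantiated.
static_assert(!std::is_abstract<old_generated_model<model_base_old>>::value,
              "old code works with the old base");
// Old generated code against the new base: still abstract, so it cannot be
// instantiated and any attempt to construct the model fails to compile.
static_assert(std::is_abstract<old_generated_model<model_base_new>>::value,
              "old code is incomplete against the new base");
// New generated code against the new base: complete again.
static_assert(!std::is_abstract<new_generated_model>::value,
              "new code works with the new base");

// Small helper to exercise the matching pair at run time.
inline void run_demo() {
  new_generated_model m;
  assert(m.model_compile_info().size() == 2);
  std::cout << m.model_name() << ": " << m.model_compile_info()[0] << "\n";
}
```

Calling run_demo() from a main() exercises the matching pair. The mismatched pairs are only checked at compile time, which is the point: neither direction builds unless the base class and the generated code are updated together.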

If we update both together, then there is no issue whatsoever. Exhibits A and B are cmdstan(r/py) and pystan3.

I don't know if these are the biggest issue, as I was never involved in the CRAN process, but I would say it is a big issue.

Hi Rok,

Thank you very much for that detailed explanation!

I am very much in favor of putting the Stan Math and Stan source code into rstan! I can’t imagine all the work @bgoodri must do to keep everything working under the current system.

If any help is needed with that or other changes in rstan I would be very happy to help. (I have already opened some issues + a PR about other possible changes to rstan recently).

All the best!

Thank you for working on those issues and for the PR!
