Stan 2.18

seantalts · February 16, 2018, 3:20pm

Constantly, haha. Evaluating a different math library involves looking at the diff between the Math library and another option, figuring out how many people are working on the other option and what % of their time will go towards pieces of their library that we find valuable, and estimating how long it would take us to add the rest of the things we need to their library and switch over.

re: a StanVM-like approach: I don’t think we can compile our math library into a shared object we could distribute (as an interpreted approach would need) because we rely heavily on Eigen’s templated linear algebra library, which keeps us header-only. But I’m not sure how much extra performance we get from that vs. the cost. So I think the hardest part of doing a StanVM approach would be switching away from Eigen or figuring out how to precompile all of the instantiations that we need into a very small binary.

syclik · February 16, 2018, 3:39pm

Close. The reason we can’t do it is that a shared object library would need so many instantiations that it would be prohibitively large. I accidentally tried to do this early on, and something like the normal distribution (only) builds a shared object on the order of a few GB. Theoretically, there isn’t a reason we couldn’t. But I don’t think any of the tools are really capable of handling shared objects of TB size well (and we wouldn’t really want to distribute that).

seantalts · February 16, 2018, 3:40pm

Sorry, I alluded to that but didn’t call it out as directly.

syclik · February 16, 2018, 3:45pm

Just wanted to make it clear that we don’t have to be header only. For example, if we exposed a subset of the language that was not vectorized, we could probably do that. Or if we had a version that promoted everything, then the instantiations also drop immensely.

seantalts · February 16, 2018, 3:48pm

Good call. What about only vectorizing on Eigen row vectors or something like that? That should drop the combinatorics a lot.
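To make the combinatorics concrete, here is a rough back-of-the-envelope sketch. The type counts are illustrative assumptions, not measured from the Math library: it assumes each argument of a three-argument density can be one of two scalar types (double or an autodiff type) in one of four shapes (scalar, std::vector, Eigen column vector, Eigen row vector), and it ignores extra template flags such as Propto. The helper name instantiation_count is hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Number of distinct template instantiations of a function whose
// num_args arguments can each independently take one of
// types_per_arg concrete types: types_per_arg ^ num_args.
std::size_t instantiation_count(std::size_t types_per_arg,
                                std::size_t num_args) {
  assert(types_per_arg > 0);
  std::size_t count = 1;
  for (std::size_t i = 0; i < num_args; ++i) {
    count *= types_per_arg;
  }
  return count;
}
```

Under those assumptions, full vectorization gives 2 * 4 = 8 types per argument, so instantiation_count(8, 3) is 512 for a single density. Restricting to scalars and row vectors gives 2 * 2 = 4 types per argument and 64 instantiations, and promoting everything to one scalar type while keeping all four shapes also gives 4 per argument and 64.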

syclik · February 16, 2018, 3:50pm

At some point I wondered if we could instantiate all the meta programs and whether that’d help. That’s something we could evaluate.

Krzysztof_Sakrejda · February 16, 2018, 7:11pm

Fairness might be a little too much to hope for. But yes, rstan can fail due to finding the wrong boost even when everything else works (e.g., run-time linking vs. Armadillo or Eigen using Rcpp), so the opaqueness of the system is an issue for users. I don’t think this changes what we do. “We” don’t have the people-power to get better messages in place or to fix the problem at the moment, but there’s no sense in acting like it isn’t a pain point for users. If I could prioritize it, I would write some doc and helper functions to allow users to better report on these issues rather than leaving it at “something is wrong with your compiler install”… but ah can’t right now.

Bob_Carpenter · February 17, 2018, 7:39pm

We’ve talked about a lot of this over the years. Let me try to summarize.

The only things I don’t think we can handle are the probability functions.

I also think we can precompile the algorithms with a better base class (that doesn’t rely on templated methods, which can’t be virtual).
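As a minimal sketch of that idea, assuming a hypothetical interface (the names model_base, std_normal_model, and finite_diff_gradient are made up for illustration and are not Stan’s actual classes): the algorithm is written against a base class with ordinary virtual methods, so it can be compiled once, and each model only has to override them.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical non-templated interface. Because log_prob is an
// ordinary virtual method, code using it needs no template parameters.
class model_base {
 public:
  virtual ~model_base() {}
  virtual double log_prob(const std::vector<double>& theta) const = 0;
};

// Stand-in for a precompiled algorithm: a central finite-difference
// gradient that only sees the model through the base class.
std::vector<double> finite_diff_gradient(const model_base& model,
                                         std::vector<double> theta,
                                         double eps = 1e-6) {
  assert(eps > 0.0);
  std::vector<double> grad(theta.size());
  for (std::size_t i = 0; i < theta.size(); ++i) {
    const double saved = theta[i];
    theta[i] = saved + eps;
    const double up = model.log_prob(theta);
    theta[i] = saved - eps;
    const double down = model.log_prob(theta);
    theta[i] = saved;  // restore before moving to the next coordinate
    grad[i] = (up - down) / (2.0 * eps);
  }
  return grad;
}

// Stand-in for a generated model: independent standard normals,
// with the log density taken up to an additive constant.
class std_normal_model : public model_base {
 public:
  double log_prob(const std::vector<double>& theta) const override {
    double lp = 0.0;
    for (double t : theta) {
      lp += -0.5 * t * t;
    }
    return lp;
  }
};
```

The trade-off in this sketch is a virtual call per log density evaluation in exchange for not recompiling the algorithm for every model.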

We aren’t header-only with CVODES. And we also won’t be with MPI.

A more general solution would be to precompile POD types and use those. They’d be easy to delegate to. For instance, the normal would be:

template <bool Propto, bool Jacobian,
          typename T1, typename T2, typename T3>
typename return_type<T1, T2, T3>::type
normal_lpdf(const T1* y, size_t y_size,
            const T2* mu, size_t mu_size,
            const T3* sigma, size_t sigma_size);

There’s one double instantiation and seven autodiff instantiations (for each autodiff type, but now we only have rev). We can get a T1* from a T using &x, from a vector<T> with &x[0] and from a Matrix<T, R, C> as &x(0).
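A minimal sketch of that delegation pattern, restricted to the all-double case and to standard-library containers so that it stands alone. The names normal_lpdf_raw, data_ptr, and data_size are hypothetical, and the broadcasting rule (each argument has size 1 or the common size) is an assumption made for the example. An Eigen overload would hand over its pointer in the same way.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical precompilable core: only pointers and sizes appear in
// the signature, so it could live in a compiled library.
double normal_lpdf_raw(const double* y, std::size_t y_size,
                       const double* mu, std::size_t mu_size,
                       const double* sigma, std::size_t sigma_size) {
  const std::size_t n = std::max({y_size, mu_size, sigma_size});
  // Assumed broadcasting rule: every argument has size 1 or size n.
  assert(y_size == 1 || y_size == n);
  assert(mu_size == 1 || mu_size == n);
  assert(sigma_size == 1 || sigma_size == n);
  const double half_log_two_pi = 0.5 * std::log(2.0 * std::acos(-1.0));
  double lp = 0.0;
  for (std::size_t i = 0; i < n; ++i) {
    const double yi = y[y_size == 1 ? 0 : i];
    const double mi = mu[mu_size == 1 ? 0 : i];
    const double si = sigma[sigma_size == 1 ? 0 : i];
    const double z = (yi - mi) / si;
    lp += -half_log_two_pi - std::log(si) - 0.5 * z * z;
  }
  return lp;
}

// Header-side adapters that turn a scalar or a std::vector into a
// pointer and a size.
inline const double* data_ptr(const double& x) { return &x; }
inline const double* data_ptr(const std::vector<double>& x) {
  return x.data();
}
inline std::size_t data_size(const double&) { return 1; }
inline std::size_t data_size(const std::vector<double>& x) {
  return x.size();
}

// Thin templated front end: all the container combinations funnel
// into the single precompiled core above.
template <typename T1, typename T2, typename T3>
double normal_lpdf(const T1& y, const T2& mu, const T3& sigma) {
  return normal_lpdf_raw(data_ptr(y), data_size(y),
                         data_ptr(mu), data_size(mu),
                         data_ptr(sigma), data_size(sigma));
}
```

Only the thin front end stays templated on the container types here. The numerical work is instantiated once per combination of scalar types rather than once per combination of container types.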

Bob_Carpenter · February 17, 2018, 7:40pm

This is a problem for everything we do unless we pack up our own dependencies. I was arguing we do just that, but it’s not compatible with CRAN size restrictions and everyone seems to think being on CRAN is critical.

Bob_Carpenter · February 17, 2018, 7:41pm

How hard would it be to have a fully encapsulated RStan install from GitHub that managed all the headers? And I don’t mean to just build it, but also maintain it going forward.

seantalts · February 17, 2018, 7:42pm

I wonder if we can’t have both? Can we have a package on CRAN that includes a setup/install function that people must call before it can be used? e.g.

install.packages("rstan")
library(rstan)
install_rstan()
