Actually, they were released before Christmas. We will have to push some fixes to CRAN in the next couple of days to fix things on Solaris, but otherwise they should be fine for StanCon.
The main change with StanHeaders (the R package that includes the Stan Math Library) is that a commit was made that fixed the massive slowdown due to using strings rather than pointers to character arrays for function names.
The main change to rstan is that it should work with a Mac that uses the upstream version of clang (which CRAN now uses) rather than Xcode’s clang. It is necessary that the upstream clang be installed in /usr/local/clang4/bin/clang++. After that, there should be no more “unknown exception” messages.
The changes to rstanarm were far more extensive. In terms of user-visible changes to existing functions:
- The default prior on the auxiliary parameter in GLMs (e.g. sigma if the likelihood is Gaussian) is exponential rather than Cauchy. You can still specify `prior_aux = cauchy()` explicitly, although as always you probably will not get exactly the same numerical results as in previous rstanarm versions.
- Aki’s new hierarchical shrinkage priors made it in ( https://twitter.com/avehtari/status/948670477240864768 ) and have new arguments (`slab_df = 4` and `slab_scale = 2.5`).
- Models estimated with `stan_gamm4` may be a bit different now that `stan_gamm4` is using `mgcv::jagam` to parse the formula rather than `mgcv::gamm` or `gamm4::gamm4`.
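As a rough sketch of the first two changes, the calls below ask for a Cauchy prior on sigma explicitly and spell out the slab arguments of the hierarchical shrinkage prior. The `mtcars` data, the formulas, the Cauchy scale of 5, and the short chain settings are assumptions chosen only to keep the example small; they are not recommendations.

```r
library(rstanarm)

# Request a Cauchy prior on the auxiliary parameter (sigma) explicitly
# instead of relying on the new exponential default.
# The location/scale values (0, 5) are illustrative assumptions.
fit_cauchy <- stan_glm(mpg ~ wt + hp, data = mtcars, family = gaussian(),
                       prior_aux = cauchy(0, 5),
                       chains = 2, iter = 1000, seed = 12345, refresh = 0)

# Hierarchical shrinkage prior on the coefficients, with the new slab
# arguments written out at the values mentioned in the post.
fit_hs <- stan_glm(mpg ~ wt + hp + qsec + drat, data = mtcars,
                   family = gaussian(),
                   prior = hs(slab_df = 4, slab_scale = 2.5),
                   chains = 2, iter = 1000, seed = 12345, refresh = 0)

# Show which priors were actually used for each model.
prior_summary(fit_cauchy)
prior_summary(fit_hs)
```

`prior_summary()` is a convenient way to confirm that the prior you intended is the one the model was fit with, which matters when defaults change between versions.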
You can also use (some of) the families in the mgcv package with rstanarm model fitting functions. For example, the `mgcv::betar` family allows you to fit models with a beta likelihood using group-specific and / or non-linear functions with `stan_glmer` or `stan_gamm4` as long as you are not parameterizing the auxiliary parameter (whereas the existing `stan_betareg` function allows you to parameterize the auxiliary parameter with covariates but does not permit group-specific or non-linear functions).
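A minimal sketch of the beta-likelihood case might look like the following. The simulated data, the variable names (`y`, `x`, `g`), and the sampler settings are all assumptions made for illustration, and the example assumes the mgcv package is installed.

```r
library(rstanarm)

# Simulate a response in (0, 1) with a group-specific intercept.
# All names and numeric values here are made up for the example.
set.seed(1)
J <- 8                       # number of groups
n_per <- 20                  # observations per group
g <- rep(seq_len(J), each = n_per)
x <- rnorm(J * n_per)
a <- rnorm(J, sd = 0.5)      # group-specific intercept shifts
mu <- plogis(-0.5 + 0.8 * x + a[g])
phi <- 10                    # precision of the beta distribution
y <- rbeta(J * n_per, mu * phi, (1 - mu) * phi)
d <- data.frame(y = y, x = x, g = factor(g))

# Beta likelihood with an lme4-style group-specific term.
fit_beta <- stan_glmer(y ~ x + (1 | g), data = d, family = mgcv::betar,
                       chains = 2, iter = 1000, seed = 12345, refresh = 0)
print(fit_beta)
```

If the precision itself needs to depend on covariates, `stan_betareg` remains the function to reach for, at the cost of the group-specific term.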
A bunch of new functions were added to rstanarm, such as
- `bayes_R2`, which calculates the posterior distribution of the ratio of the variance (over the observations) of conditional mean to the variance of the predictive distribution for GLMs (including those with group-specific terms that you can condition on or integrate over). Andy already blogged about it.
- `stan_nlmer`, which uses the same likelihood as `nlmer` in the lme4 package, which fits models with a Gaussian likelihood but with a non-linear transformation of the linear predictor that can depend on group-specific terms. Even if you hate Bayesianism, the posterior means of these models (conditional on the group-specific terms) may well be more reliable than MLEs (that integrate over the group-specific terms), but watch out for multimodality in the posterior distribution.
- `stan_clogit`, which uses the same likelihood as `clogit` in the survival package, which fits “case-control” models where by the research design a fixed number of observations within a group will be successes, such as a competition with exactly one winner per contest. The `stan_clogit` formula is a bit different than the `clogit` one in that the former can have lme4-style group-specific terms.
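The three functions above can be sketched as follows. The datasets (`mtcars`, `Orange`, `infert`), the formulas, the rescaling of `Orange`, and the short chains are assumptions picked so the example is self-contained; treat it as a starting point rather than a recommended analysis.

```r
library(rstanarm)

# bayes_R2: one R-squared value per posterior draw.
fit_glm <- stan_glm(mpg ~ wt + hp, data = mtcars, family = gaussian(),
                    chains = 2, iter = 1000, seed = 12345, refresh = 0)
r2 <- bayes_R2(fit_glm)
summary(r2)

# stan_nlmer: logistic growth curve with a tree-specific asymptote.
# Rescaling the variables is an assumption made to ease sampling.
orange <- datasets::Orange
orange$circumference <- orange$circumference / 100
orange$age <- orange$age / 100
fit_nl <- stan_nlmer(
  circumference ~ SSlogis(age, Asym, xmid, scal) ~ Asym | Tree,
  data = orange,
  chains = 1, iter = 1000, seed = 12345, refresh = 0
)

# stan_clogit: conditional logit with an lme4-style group-specific term.
# The data are ordered by stratum before fitting.
dat <- infert[order(infert$stratum), ]
fit_cl <- stan_clogit(case ~ spontaneous + induced + (1 | education),
                      strata = stratum, data = dat,
                      subset = parity <= 2, QR = TRUE,
                      chains = 2, iter = 500, seed = 12345, refresh = 0)
print(fit_cl)
```

Because `bayes_R2` returns draws rather than a single number, it is natural to report a posterior median and interval instead of a point estimate.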
Finally, Sam Brilleman contributed a ton of code related to his Ph.D. dissertation. The `stan_jm` function is a lot like the `jointModel` function in the JM package or the `jointModelBayes` function in the JMbayes package in that it estimates a “joint” model for the survival time and the severity of the symptoms for people with terminal diseases. These two things are obviously not conditionally independent, and you can specify a variety of dependence forms for the association structure. Also, Sam contributed `stan_mvmer`, which is sort of a generalization of what `stan_jm` does but for lme4-style models that have multiple outcomes with correlated error terms.
We are using a different build process for rstanarm now, so new R packages that come with compiled Stan models should follow this pattern, which is enforced by the `rstan_package.skeleton` function in the rstantools R package. It is probably a good idea for existing R packages that come with compiled Stan models to migrate over to the new pattern, although not necessarily ASAP. Under the old way of doing things, Stan programs were in the `exec/` folder and chunks of Stan code were in the `inst/chunks/` folder. Now, Stan programs are in the root of the `src/stan_files/` folder and chunks of Stan code can be in subfolders of `src/stan_files/` and included via the “native” method in `stanc` (the `#include` statement must be flush left and there can be no whitespace or comments after the file name).
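To make the new layout concrete, here is a small sketch that writes a toy Stan program and one chunk into a temporary directory and then checks the `#include` rule described above. The package name `mypkg`, the subfolder `chunks/`, the file names, and the assumption that include paths are resolved relative to `src/stan_files/` are all hypothetical; check them against the skeleton that rstantools generates.

```r
# Hypothetical package root inside a temporary directory.
pkg <- file.path(tempdir(), "mypkg")
stan_dir <- file.path(pkg, "src", "stan_files")
chunk_dir <- file.path(stan_dir, "chunks")   # subfolder name is made up
dir.create(chunk_dir, recursive = TRUE, showWarnings = FALSE)

# A reusable chunk of Stan code (file name is made up).
writeLines("  real<lower=0> sigma;",
           file.path(chunk_dir, "aux_parameter.stan"))

# A Stan program in the root of src/stan_files/ that includes the chunk.
# The #include line is flush left with nothing after the file name.
program <- c(
  "data {",
  "  int<lower=0> N;",
  "  vector[N] y;",
  "}",
  "parameters {",
  "  real mu;",
  "#include chunks/aux_parameter.stan",
  "}",
  "model {",
  "  y ~ normal(mu, sigma);",
  "}"
)
writeLines(program, file.path(stan_dir, "normal.stan"))

# Check the rule: each include starts in column one and has no
# trailing whitespace or comment after the file name.
is_include <- grepl("#include", program, fixed = TRUE)
well_formed <- grepl("^#include [^[:space:]]+$", program[is_include])
stopifnot(all(well_formed))
```

A check like this is cheap to run before building, since a malformed include line otherwise only shows up as a failure when the Stan program is parsed.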
The new `src/Makevars` or `src/Makevars.win` files cause the Stan programs to be compiled separately and then combined into a shared object (whereas before we were gluing all the C++ files into a massive C++ file and compiling that). The upshot of this is that (packages like) rstanarm can now be compiled with much less RAM than before. Conversely, it takes much longer to build rstanarm from source unless you have previously specified the environment variable `MAKEFLAGS = -j4` or something (which negates the RAM savings). If you are not using Windows, the best of both worlds can be achieved by using link-time optimization in your local `~/.R/Makevars` file and we may try to accomplish that automatically in the next version of rstanarm.
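One way to set the variable for a single R session, rather than globally, is sketched below. The value `-j4` comes from the text above; the right number of jobs depends on your cores and RAM. The install call is left commented out so the snippet has no side effects beyond the environment variable.

```r
# Remember the previous value so it can be restored afterwards.
old_makeflags <- Sys.getenv("MAKEFLAGS", unset = NA)

# Ask make to run up to 4 jobs in parallel for builds started from this
# session. More jobs means a faster build but more RAM in use at once.
Sys.setenv(MAKEFLAGS = "-j4")

# Uncomment to build rstanarm from source with the setting above.
# install.packages("rstanarm", type = "source")

Sys.getenv("MAKEFLAGS")
```

Setting it per session keeps the trade-off explicit: you opt in to the higher memory use only for the builds where the time savings are worth it.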