@jonah, @mjskay, and I have been working on a new R package over the last couple of weeks, which we call ‘posterior’. It is intended to provide useful tools for both users and developers of packages for fitting Bayesian models or working with output from Bayesian models. The primary goals of the package are to:
Efficiently convert between many different useful formats of draws (samples) from posterior or prior distributions.
Provide various summaries of draws in convenient formats.
Provide lightweight implementations of state-of-the-art MCMC diagnostics.
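As a rough sketch of what these three goals look like in practice, here is a short example. It assumes the function names from the current CRAN release of posterior (example_draws(), as_draws_df(), as_draws_matrix(), summarise_draws()); the alpha version announced here may name things differently.

```r
library(posterior)

# Example draws shipped with the package (eight schools model).
# Assumption: this returns a draws_array (iterations x chains x variables).
x <- example_draws("eight_schools")

# 1. Convert between formats of draws
x_df  <- as_draws_df(x)      # data frame with .chain, .iteration, .draw columns
x_mat <- as_draws_matrix(x)  # draws x variables matrix

# 2. Summarise the draws with the default set of summary functions
default_summary <- summarise_draws(x)

# 3. Request specific MCMC diagnostics by name
diagnostics <- summarise_draws(x, "rhat", "ess_bulk", "ess_tail")
print(diagnostics)
```

Both summaries come back as a data frame with one row per variable, so they can be filtered or joined like any other table.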
Our plan is to use posterior in all of our R packages wherever we work with posterior (or prior) draws, so that users and developers can rely on a unified and consistent interface.
Currently, posterior is still in alpha (we just made the repo public yesterday), so a lot of things may change. We appreciate any input from you to help us improve posterior!
For now you can find the package on Jonah’s GitHub page (https://github.com/jgabry/posterior), but it may eventually be moved to the stan-dev GitHub organization.
Have a great day and don’t forget to try out posterior!
Maybe a question to @mjskay: From a brief glimpse it looks like there is some overlap with the functionality of the tidybayes package (which I really really like btw). Will the posterior package eventually replace tidybayes?
Short answer from me: no, it won’t replace it by any means, but it may be used inside tidybayes for some specific operations. @mjskay surely has more details on it.
Yeah, exactly: e.g. tidybayes::tidy_draws() (which is the entry-point in tidybayes for converting any other format to a tidy data frame of draws) will likely be replaced with (or become an alias for) posterior::as_draws_df(). Most of the other things built on top of that will likely stay in tidybayes.
I see the split as follows: posterior is a package that supports consistent operations across several formats (one of which is tidy data frames of draws), whereas tidybayes is a package specifically for operations on tidy data frames of draws. So it makes sense for things like spread_draws() and gather_draws() to live in tidybayes and not in posterior.
There’s a related question from way back that @jonah, @paul.buerkner, and @tjmahr discussed briefly, which is whether the various geoms in tidybayes might be spun off into another package (something like a ggdist?). I’ve been mulling that idea lately, especially as those geoms grow in number (the next version adds a bunch more).
Yeah, it’s probably hard to draw the line where the scope of one package stops and that of another one starts. And what you said makes sense.
In tidybayes I love being able to do something like this: posterior %>% gather_draws(rho[group], alpha[group], beta[group]).
I don’t know if something like this is planned for posterior, but being able to do something like subset(eight_schools_df, variable = "theta[school]", chain = 1:2, iteration = 1:5) would be awesome, although it would probably push things too much into the “tidy” direction.
In any case… it’s cool to see so much happening (CmdStanR, posterior, …)!
@Max_Mantei Agreed, that’s useful. If you want to open an issue, we’ll certainly consider it! Like @mjskay said, we’re not trying to replace tidybayes with posterior, but it could be worth allowing syntax like this in subset, since posterior could then make it work for other types (array, matrix, etc.) and not just data frames.
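To illustrate the point about one subsetting call working across formats, here is a minimal sketch. It assumes the subset_draws() function and the variable, chain, and iteration arguments from the current CRAN release of posterior, not the alpha-era subset() method discussed above. It selects variables by their full names only; the tidybayes-style "theta[school]" syntax proposed above is not shown.

```r
library(posterior)

# Same example draws in two formats
x_arr <- example_draws("eight_schools")  # draws_array
x_df  <- as_draws_df(x_arr)              # draws_df

# Variables are selected by their full names here (an illustrative choice)
vars <- c("theta[1]", "theta[2]")

# The same call is applied to the array and to the data frame
sub_arr <- subset_draws(x_arr, variable = vars, chain = 1:2, iteration = 1:5)
sub_df  <- subset_draws(x_df,  variable = vars, chain = 1:2, iteration = 1:5)

# Both results hold 2 chains x 5 iterations of the 2 selected variables
ndraws(sub_arr)
ndraws(sub_df)
```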