I don’t know if this is the right place to talk about Bayesplot, but here goes:
I’m writing some plotting functions for a package, and I was going to use Bayesplot’s base classes to deal with the irritating reality of handling arrays that might have a variety of labels for each margin (you know, long-format data represented as a vector in Stan, with the implicit indexing specified elsewhere). But then I see that the base classes in Bayesplot just treat samples as arrays. Is there any room to do more OOP with this?
Asking for a friend who wastes a lot of time on indexing mistakes…
I’m imagining something like:
- consistency enforced by validation pre/post operations
- a sample is an array
- the first two dimensions are: 1) iteration; 2) chain;
- all further dimensions are sample-specific
- methods for, in order of implementation:
0) constructor takes an array and, optionally, a list of data frames for labels
- merging chains
- trimming warmup
- thinning
- labelling dimensions w/ a data.frame that functions like the attributes dplyr::group_by uses
- transformations to ggplot-friendly long-format data frames made by unrolling specific margins
- calculating diagnostics
- element-wise math
- broadcasting
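To make the list above concrete, here is a rough sketch of the first few items as a base R S3 class. Everything in it is hypothetical: the class name `labelled_sample` and the helpers `validate_labelled_sample()`, `subset_iterations()`, `trim_warmup()`, `thin()`, and `merge_chains()` are names I made up for illustration, and none of them exist in Bayesplot or any other Stan package.

```r
# Hypothetical S3 class; names are illustrative only, not part of any package.

# Check the invariants: >= 2 dims (iteration, chain) and one label
# data frame per sample-specific dimension, with matching row counts.
validate_labelled_sample <- function(x) {
  draws <- unclass(x)
  if (!is.array(draws) || length(dim(draws)) < 2L) {
    stop("draws must be an array with at least two dimensions (iteration, chain)")
  }
  labels <- attr(x, "labels")
  extra <- dim(draws)[-(1:2)]
  if (length(labels) != length(extra)) {
    stop("need one label data frame per sample-specific dimension")
  }
  for (i in seq_along(labels)) {
    if (!is.data.frame(labels[[i]]) || nrow(labels[[i]]) != extra[i]) {
      stop("label data frame ", i, " must have ", extra[i], " rows")
    }
  }
  x
}

# Constructor: takes an array and, optionally, a list of label data frames.
labelled_sample <- function(draws, labels = NULL) {
  if (is.null(labels)) {
    # Default labels: a plain index column for each extra dimension.
    labels <- lapply(dim(draws)[-(1:2)], function(n) data.frame(index = seq_len(n)))
  }
  validate_labelled_sample(
    structure(draws, labels = labels, class = "labelled_sample")
  )
}

# Keep a subset of iterations (positive indices only), then re-validate.
subset_iterations <- function(x, keep) {
  d <- dim(x)
  m <- matrix(as.vector(unclass(x)), nrow = d[1])  # rows are iterations
  draws <- array(m[keep, , drop = FALSE], dim = c(length(keep), d[-1]))
  labelled_sample(draws, attr(x, "labels"))
}

trim_warmup <- function(x, n_warmup) {
  stopifnot(n_warmup >= 0, n_warmup < dim(x)[1])
  subset_iterations(x, seq.int(n_warmup + 1, dim(x)[1]))
}

thin <- function(x, by) {
  subset_iterations(x, seq.int(1, dim(x)[1], by = by))
}

# Stack all chains into a single chain. Iteration varies fastest in
# column-major storage, so resetting the dims is enough.
merge_chains <- function(x) {
  d <- dim(x)
  draws <- array(as.vector(unclass(x)), dim = c(d[1] * d[2], 1L, d[-(1:2)]))
  labelled_sample(draws, attr(x, "labels"))
}

# Example: 10 iterations, 2 chains, one parameter margin of length 3.
draws <- array(seq_len(10 * 2 * 3), dim = c(10, 2, 3))
s <- labelled_sample(draws, list(data.frame(group = c("a", "b", "c"))))
s2 <- merge_chains(thin(trim_warmup(s, 4), by = 2))
```

The point of routing every operation back through the constructor is that the labels and the array dimensions get re-checked after each step, which is the "validation pre/post operations" idea. Diagnostics, element-wise math, and broadcasting are left out of this sketch.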
Anyway, I’ve implemented a bunch of stuff like this in a package as free functions that operate on a list. My next step was to make the operations more reliable at keeping the object in a consistent state, so I wanted to see if there’s room in one of the Stan R packages for that kind of thing.
Just to be clear, I’m not thinking of a complete ‘fit’ object, but something that could reliably represent the labelled sample from a given multi-dimensional parameter. Maybe it would make more sense in that rstantools (?) package?
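For the ggplot-friendly part, this is roughly what I mean by unrolling a margin of one labelled parameter. It is a minimal base R sketch that assumes a three-dimensional array (iteration, chain, one parameter margin); `to_long()` is a hypothetical helper name, not an existing function in any Stan package.

```r
# Hypothetical helper: unroll a 3-D draws array (iteration, chain, margin)
# into a long data frame and attach the label row for each margin index.
to_long <- function(draws, labels) {
  stopifnot(
    is.array(draws), length(dim(draws)) == 3L,
    is.data.frame(labels), nrow(labels) == dim(draws)[3]
  )
  # One row of (iteration, chain, margin) indices per array element.
  idx <- arrayInd(seq_along(draws), dim(draws))
  long <- data.frame(
    iteration = idx[, 1],
    chain = idx[, 2],
    value = as.vector(draws)
  )
  # Repeat the label rows to line up with the unrolled values.
  lab <- labels[idx[, 3], , drop = FALSE]
  rownames(lab) <- NULL
  cbind(long, lab)
}

# Example: 2 iterations, 2 chains, 3 subjects.
draws <- array(1:12, dim = c(2, 2, 3))
labels <- data.frame(
  subject = c("s1", "s2", "s3"),
  condition = c("ctl", "trt", "trt")
)
long <- to_long(draws, labels)
```

The result has one row per draw, with `iteration`, `chain`, `value`, and the label columns, so it could go straight into something like `ggplot2::ggplot(long, ggplot2::aes(value))` with a `facet_wrap(~ subject)`.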