Bayesplot: ppc_ribbon using points for observations

I’ve been working through Bayesplot’s features; thank you to the authors for putting together this package!

I’m currently looking at the PPC interval plots: https://mc-stan.org/bayesplot/reference/PPC-intervals.html. I’m wondering if there’s a way to construct a predictive plot where the observed data are plotted as points and the predictions as lines with ribbons. With ggplot, I can produce something like this:

There are two parts to this: first, using points for the observations; second, accommodating the possibility that the number of observations doesn’t match the number of predictions.

It’s not too hard to do this with ggplot, but I figured I’d check if bayesplot might support something like this.
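
For readers who want the plain ggplot2 version, here is a minimal sketch of the kind of plot described above. Everything in it is an assumption made for illustration: the simulated data, the names `time_obs`, `time_pred`, `pred`, and `obs`, and the choice of a 90% interval. The point is only that the ribbon and the points come from two separate data frames, so the number of observations need not match the number of predictions.

```r
library(ggplot2)

set.seed(1)

# Hypothetical data: 10 observation times but 50 prediction times
time_obs  <- seq(1, 10, length.out = 10)
time_pred <- seq(0, 11, length.out = 50)
y_obs <- sin(time_obs) + rnorm(length(time_obs), sd = 0.3)

# 200 simulated predictive draws (rows) at each prediction time (columns)
n_draws <- 200
yrep <- t(replicate(n_draws, sin(time_pred) + rnorm(length(time_pred), sd = 0.3)))

# Summarise the draws column by column: median and a 90% interval
pred <- data.frame(
  time  = time_pred,
  lower = apply(yrep, 2, quantile, probs = 0.05),
  mid   = apply(yrep, 2, median),
  upper = apply(yrep, 2, quantile, probs = 0.95)
)
obs <- data.frame(time = time_obs, y = y_obs)

# Ribbon and line use `pred`; the points layer brings its own data frame
p <- ggplot(pred, aes(x = time)) +
  geom_ribbon(aes(ymin = lower, ymax = upper), alpha = 0.3) +
  geom_line(aes(y = mid)) +
  geom_point(data = obs, aes(x = time, y = y))
p
```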

@jonah

Glad you like it!

If you add geom_point() to the ppc_ribbon() plot, does that do what you’re looking for, or not quite? To test it out, you can try something like this:

EDIT: this actually does the wrong thing! See my follow-ups below.

library("ggplot2")
library("bayesplot")
y <- rnorm(50)
yrep <- matrix(rnorm(5000, 0, 2), ncol = 50)
ppc_ribbon(y, yrep) + geom_point()

I guess that will still plot y as lines but it will also overlay the points. If you prefer just the points and no line for y then I think that will require a slight change to the ppc_ribbon code but it’s not hard. If you want to open an issue for it at the bayesplot repo that would be great. I can probably get to it pretty quickly.

Actually, never mind. I think that’s plotting the prediction point estimates as the points instead of y, so that’s no good.

OK, so just adding geom_point() plots the point predictions, but you can add points for y like this:

y <- rnorm(50)
yrep <- matrix(rnorm(5000, 0, 2), ncol = 50)
ppc_ribbon(y, yrep) + 
  geom_point(data = data.frame(x = seq_along(y), y = y), 
             mapping = aes(x, y), 
             inherit.aes = FALSE)

I can definitely add this to ppc_ribbon to be turned on by an argument if you want to open an issue.
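
To see why a bare geom_point() picks up the predictions rather than y, it can help to inspect what the plot object carries. The sketch below is an illustration, not part of the original replies: it assumes the plot-level mapping is visible through `p$mapping` and that `ppc_ribbon_data()` returns the summarised data frame behind the plot. Column names have changed between bayesplot versions, so check `names(d)` rather than relying on any particular name.

```r
library(ggplot2)
library(bayesplot)

set.seed(123)
y <- rnorm(50)
yrep <- matrix(rnorm(5000, 0, 2), ncol = 50)

p <- ppc_ribbon(y, yrep)

# Plot-level aesthetics: a layer added without its own mapping inherits these,
# which is how geom_point() can end up drawing the predictive summaries
p$mapping

# The summarised data used for the plot: one row per observation index
d <- ppc_ribbon_data(y, yrep)
names(d)
head(d)
```

Passing a separate data frame with inherit.aes = FALSE, as in the snippet above, sidesteps the inherited mapping entirely.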

Happy to. I’ll open an issue and submit a PR. Thanks for the guidance.

After some straightforward coding on the branch feature-ppc_ribbon_obspoints, I was able to produce the following (notice the last argument):

bayesplot::ppc_ribbon(y = yobs, yrep = yrep, x = time, 
                      y_is_point = TRUE)

[Figure: ppc_ribbon_example]

This branch of the package can be installed via

devtools::install_github("stan-dev/bayesplot", ref = "feature-ppc_ribbon_obspoints", dependencies = TRUE, build_vignettes = FALSE)

I find this pretty satisfactory. The only thing that bothers me is the legend: for y it should only include a point and for y_rep a ribbon. I know it’s possible to customize legends, but rather than doing it for this specific case, is there a way of fixing this issue globally?

Also @jonah, how do the unit tests run? I figure the relevant file is test-ppc-intervals.r and I’d need to add a visual test. But I’m not sure how to run the test, since not all of the dependencies are loaded in that R script.

Cool, thanks a lot Charles! I like it.

I’m not 100% sure. I’d need to spend some time playing around with it, but I don’t have the time right now. I definitely remember it being tricky to get the legends right when y and yrep use different geoms, but it’s been a while since I last tried to mess around with the legends, so I don’t fully remember.

You can run that individual file with

devtools::test(filter = "ppc-intervals")

or all the test files with

devtools::test()

Where can I specify the benchmark for the unit test? I’m looking at the following code, and it’s not clear to me how the first argument of expect_doppelganger is specified (or how it would point to a benchmark):

  p_50 <- ppc_ribbon(vdiff_y, vdiff_yrep, prob = 0.5)
  vdiffr::expect_doppelganger("ppc_ribbon (interval width)", p_50)
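
As a hedged note on the question above: in vdiffr, the first argument of expect_doppelganger() is a title, not a path to a benchmark. vdiffr derives the name of the reference SVG from that title, records the image the first time the test runs, and compares against it on later runs. Where the reference files live and how new ones are reviewed depends on the vdiffr version, so treat the following as a sketch only. The inputs and the title are hypothetical; bayesplot’s own tests use vdiff_y and vdiff_yrep as in the snippet above.

```r
library(testthat)
library(bayesplot)

# Hypothetical fixed inputs so the figure is reproducible across runs
set.seed(42)
y_example <- rnorm(30)
yrep_example <- matrix(rnorm(30 * 100), ncol = 30)
p_example <- ppc_ribbon(y_example, yrep_example, prob = 0.5)

test_that("ppc_ribbon renders as expected", {
  skip_if_not_installed("vdiffr")

  # The title identifies the reference figure; changing the title means
  # vdiffr treats this as a new figure with no existing reference
  vdiffr::expect_doppelganger("ppc_ribbon (hypothetical example)", p_example)
})
```

In a package, a block like this would normally sit in a file under tests/testthat and be run through devtools::test() rather than pasted into the console.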