Bayesplot: ppc_ribbon using points for observations

charlesm93 · December 2, 2020, 4:58pm

I’ve been working through Bayesplot’s features; thank you to the authors for putting together this package!

I’m currently looking at ppc’s: https://mc-stan.org/bayesplot/reference/PPC-intervals.html. I’m wondering if there’s a way to construct a predictive plot where the observed data is plotted as points and the predictions as lines with ribbons. With ggplot, I can produce something like this:

There are two parts to this. The first is using points for the observations. The second is accommodating the possibility that the number of observations doesn’t match the number of predictions.

It’s not too hard to do this with ggplot, but I figured I’d check if bayesplot might support something like this.
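For reference, here is a minimal sketch of the plain ggplot2 approach described above. The data are simulated and every name in it (t_obs, y_obs, t_pred, pred, obs) is made up for illustration. The point is that the observations and the predictions live in separate data frames, so their lengths do not need to match.

```r
library("ggplot2")

set.seed(123)

# Hypothetical data: 10 observations, predictions on a finer grid of 50 times.
t_obs  <- seq(1, 10, length.out = 10)
y_obs  <- sin(t_obs) + rnorm(length(t_obs), sd = 0.3)
t_pred <- seq(1, 10, length.out = 50)

# 200 simulated draws (rows) at each prediction time (columns).
yrep <- t(replicate(200, sin(t_pred) + rnorm(length(t_pred), sd = 0.3)))

# Summarise the draws column-wise into a median and a 90% interval.
pred <- data.frame(
  x     = t_pred,
  lower = apply(yrep, 2, quantile, probs = 0.05),
  mid   = apply(yrep, 2, median),
  upper = apply(yrep, 2, quantile, probs = 0.95)
)
obs <- data.frame(x = t_obs, y = y_obs)

# Ribbon and line come from `pred`; points come from `obs`.
p <- ggplot(pred, aes(x = x)) +
  geom_ribbon(aes(ymin = lower, ymax = upper), alpha = 0.3) +
  geom_line(aes(y = mid)) +
  geom_point(data = obs, mapping = aes(x = x, y = y))
```

Printing p draws the figure. The 90% interval and the median are arbitrary choices for this sketch.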

@jonah

jonah · December 2, 2020, 8:41pm

Glad you like it!

If you add geom_point() to the ppc_ribbon() plot, does that do what you’re looking for, or not quite? To test it out you can try something like this:

EDIT: this actually does the wrong thing! See my follow-ups below.

library("ggplot2")
library("bayesplot")
y <- rnorm(50)
yrep <- matrix(rnorm(5000, 0, 2), ncol = 50)
ppc_ribbon(y, yrep) + geom_point()

I guess that will still plot y as lines but it will also overlay the points. If you prefer just the points and no line for y then I think that will require a slight change to the ppc_ribbon code but it’s not hard. If you want to open an issue for it at the bayesplot repo that would be great. I can probably get to it pretty quickly.

jonah · December 2, 2020, 8:42pm

Actually, never mind. I think that’s plotting the prediction point estimates as the points instead of y, so that’s no good.

jonah · December 2, 2020, 8:47pm

Ok so just using geom_point() will plot point predictions but you can add points for y like this:

y <- rnorm(50)
yrep <- matrix(rnorm(5000, 0, 2), ncol = 50)
ppc_ribbon(y, yrep) + 
  geom_point(data = data.frame(x = seq_along(y), y = y), 
             mapping = aes(x, y), 
             inherit.aes = FALSE)

I can definitely add this to ppc_ribbon to be turned on by an argument if you want to open an issue.
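A variant of the same workaround, sketched for the case where ppc_ribbon() is given an x argument: the extra points layer then needs the same x values instead of seq_along(y). The time vector and the obs data frame here are made-up illustration names, not part of bayesplot.

```r
library("ggplot2")
library("bayesplot")

set.seed(1)

# Hypothetical x values, one per observation.
time <- seq(0.5, 25, by = 0.5)
y <- rnorm(50)
yrep <- matrix(rnorm(5000, 0, 2), ncol = 50)

# Observations in their own data frame so the points layer does not
# pick up the prediction summaries from the ppc_ribbon() plot data.
obs <- data.frame(x = time, y = y)

p <- ppc_ribbon(y, yrep, x = time) +
  geom_point(data = obs, mapping = aes(x, y), inherit.aes = FALSE)
```

As in the snippet above, this still draws the line for y underneath the points.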

charlesm93 · December 3, 2020, 2:25pm

Happy to. I’ll open an issue and submit a PR. Thanks for the guidance.

charlesm93 · January 4, 2021, 4:51pm

After some straightforward coding on the branch feature-ppc_ribbon_obspoints, I was able to produce the following (notice the last argument):

bayesplot::ppc_ribbon(y = yobs, yrep = yrep, x = time, 
                      y_is_point = TRUE)

[Image: ppc_ribbon_example]

This branch of the package can be installed via

devtools::install_github("https://github.com/stan-dev/bayesplot/tree/feature-ppc_ribbon_obspoints", dependencies = TRUE, build_vignettes = FALSE)

I find this pretty satisfactory. The only thing that bothers me is the legend: for y it should only show a point, and for yrep only a line with a ribbon. I know it’s possible to customize legends, but rather than doing it for this specific case, is there a way of fixing this globally?

Also @jonah, how do I run the unit tests? I figure the relevant file is test-ppc-intervals.r and I’d need to add a visual test. But I’m not sure how to run the test, since not all of the dependencies are in that R script.

jonah · January 4, 2021, 5:07pm

Cool, thanks a lot Charles! I like it.

I’m not 100% sure about the legend. I’d need to spend some time playing around with it, but I don’t quite have the time right now. I definitely remember it being tricky to get the legends right when y and yrep use different geoms, but it’s been a while since I last messed around with the legends, so I don’t fully remember.

You can run that individual file with

devtools::test(filter = "ppc-intervals")

or all the test files with

devtools::test()

charlesm93 · January 4, 2021, 10:10pm

Where can I specify the benchmark for the unit test? I’m looking at the following code, and it’s not clear to me how the first argument of expect_doppelganger is specified (or how it would point to a benchmark):

  p_50 <- ppc_ribbon(vdiff_y, vdiff_yrep, prob = 0.5)
  vdiffr::expect_doppelganger("ppc_ribbon (interval width)", p_50)
