I’m trying to understand what is plotted in the violin plot based on yrep.
Say that my data looks like this:
data <- data.frame(n= 1:50, group= rep(1:10,5), x=rnorm(50), y= rnorm(50))
I fit a linear model and get my y_rep. When I use ppc_dens_overlay, I get one density curve for each simulated dataset, so if my data has 50 observations and I have 20 draws from the posterior predictive distribution, I will get 20 light blue density curves, each built from 50 observations, right? This makes total sense to me.
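To make the shapes concrete, here is a minimal sketch of that setup. The yrep below is only a stand-in simulated from an lm fit (an assumption for illustration, not real posterior predictive draws), and n_draws is a name I made up for the example.

```r
set.seed(1)
data <- data.frame(n = 1:50, group = rep(1:10, 5), x = rnorm(50), y = rnorm(50))

# Stand-in for posterior predictive draws (assumption: this ignores
# parameter uncertainty, so it is not a true posterior predictive sample).
fit <- lm(y ~ x, data = data)
n_draws <- 20
yrep <- t(replicate(n_draws, rnorm(nrow(data), mean = fitted(fit), sd = sigma(fit))))

# One row per draw, one column per observation.
dim(yrep)  # 20 50

# Each row of yrep becomes one light blue density curve.
if (requireNamespace("bayesplot", quietly = TRUE)) {
  p <- bayesplot::ppc_dens_overlay(data$y, yrep)
}
```

Calling print(p) would display the overlay, if bayesplot is installed.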
I want a separate plot for each group. I was expecting to find a ppc_dens_overlay_grouped, where each group appears in its own facet, so that each facet would still show 20 light blue density curves, but each based on only 5 observations. As you know, there is no function like that, and the closest thing is ppc_violin_grouped:
ppc_violin_grouped(y, yrep, group= data$group)
What I don’t understand is how ppc_violin_grouped gives me one “predicted” violin for each group without my choosing any stat. Is it just building each violin from the 20 (draws) * 5 (observations per group) y_rep values, ignoring which draw each y_rep value comes from? If so, does that make sense? Shouldn’t I get 20 overlaid violins for each group?
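To spell out the two readings I have in mind, here is a rough sketch in base R plus ggplot2. The yrep matrix is filled with placeholder random numbers (an assumption, just to get the shapes right), and pooled, long, and n_draws are hypothetical names. I am not claiming this is what bayesplot does internally.

```r
set.seed(1)
group <- rep(1:10, 5)
n_draws <- 20
# Placeholder draws: 20 rows (draws) by 50 columns (observations).
yrep <- matrix(rnorm(n_draws * length(group)), nrow = n_draws)

# Reading 1: pool every draw within a group, giving 20 * 5 = 100 values
# per group and therefore a single violin per group.
pooled <- lapply(split(seq_along(group), group),
                 function(idx) as.vector(yrep[, idx]))
lengths(pooled)  # 100 for each of the 10 groups

# Reading 2: keep draws separate, giving 5 values per draw per group.
# as.vector() walks the matrix column by column, so draws cycle fastest.
long <- data.frame(
  draw  = rep(seq_len(n_draws), times = length(group)),
  group = rep(group, each = n_draws),
  value = as.vector(yrep)
)

# The faceted overlay I was expecting: 20 curves in each group facet.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  p <- ggplot2::ggplot(long, ggplot2::aes(x = value, group = draw)) +
    ggplot2::geom_density(colour = "lightblue") +
    ggplot2::facet_wrap(~ group)
}
```

With only 5 values behind each curve in the second reading, the individual densities would be very noisy, which may be part of the reason for pooling.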
I hope my problem is clear; if not, I’ll be happy to expand and give more examples.