This is my first attempt at Bayesian statistics. I need to figure out whether I have enough information to set more informative priors for a zero-one-inflated beta regression in brms. Initially I used the default priors (coi: beta(1, 1), Intercept: student_t(3, 0, 10), phi: gamma(0.01, 0.01), sd: student_t(3, 0, 10), zoi: beta(1, 1)), which, as I understand it, are uninformative and have little impact on the posterior. Many people in Bayesian statistics encourage researchers to use more informative priors, so I wonder whether I could derive something from a previous study I did with the same dependent variables.
Previous study:
The stimuli were a set of 36 clips, selected subjectively by the experimenters to fit into 9 classes: the combinations of two 3-level categorical factors, pleasure (negative, neutral, positive) and intensity (low, medium, high), with 6 clips per class. The participants watched the clips and rated them on pleasure and intensity using two continuous sliders with values in [0,1].
Current study:
The stimuli are a set of 18 clips generated from the clips of the previous study, with the generative model conditioned on the continuous pleasure/intensity scores given by the participants of the previous study. The clips are labeled with 3-level categorical tags, p_cat (negative, neutral, positive) and i_cat (low, medium, high), derived by discretizing the conditioning continuous scores into 3 bins. Current participants evaluated them as in the previous study: pleasure and intensity on two continuous sliders with values in [0,1]. Here is my current model:
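Roughly (sketching the call; the participant grouping term and the data frame name are placeholders for my actual setup):

fit <- brm(
  bf(pleasure ~ p_cat + i_cat + (1 | participant)) +
    bf(intensity ~ p_cat + i_cat + (1 | participant)),
  data = ratings,  # placeholder name for my data frame
  family = zero_one_inflated_beta()
)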
where p_cat and i_cat are the 3-level categories for pleasure and intensity.
My questions:
1. Can I use information from the distribution of the responses in the previous study as a prior for the current study?
2. If that is an acceptable approach, would it make sense to add priors on the coefficients for each level of my categorical variables p_cat and i_cat? For instance, for p_cat and the response pleasure: one prior on the Intercept coefficient (the reference level, negative p_cat), another on the coefficient for neutral p_cat, and a third on the coefficient for positive p_cat.
3. If that is acceptable, can I take all the responses from the previous study, split them into 3 bins, compute the mean and variance of each bin, and use these to set the priors mentioned in 2?
I think the current basic recommendation for priors is to check what your prior implies in terms of predictions from your model (prior predictive checks): https://youtu.be/ZRpo41l02KQ?t=2694
So in terms of getting priors, just make a rough guess using normals and then see what the predictions are. If you get crazy stuff, wiggle things up or down and see what happens.
It might seem unsatisfactory to do something like this because it's like doing a really rough inference in your head haha. But don't try to fit to the data or anything; just make sure your predictions are on the right scale and such.
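Something like this (untested sketch; the formula, data name, and prior object are placeholders for whatever you're actually fitting):

prior_fit <- brm(
  bf(pleasure ~ p_cat + i_cat) + bf(intensity ~ p_cat + i_cat),
  data = ratings,
  family = zero_one_inflated_beta(),
  prior = your_priors,
  sample_prior = "only"  # draw from the priors only, ignoring the likelihood
)
pp_check(prior_fit, resp = "pleasure")  # simulated responses should sit sensibly in [0, 1]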
Well presumably you could, but the technical aspect of this question isn't the tricky part. The onus is on you to decide whether that makes sense in your application and to report what you did appropriately. I guess that's the tricky part haha.
You probably want priors everywhere, yes. I don’t know if specifying them separately is possible in brms or rstanarm (I doubt it).
That’s up to you!
It really sounds like what you want to do is fit two models to two sets of data at once. You can do that in Stan. Get both models working on their own and then lump 'em together and share the parameters. Taking a posterior from one inference and using it in another isn't really something we do so much (because we don't really have the posterior distribution itself as an object).
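If you ever try it, the nearest brms-level approximation I can sketch is just stacking the two datasets so the category coefficients are shared; it's not the full Stan treatment, and the variable names here are made up:

combined <- rbind(
  cbind(previous_data, study = "previous"),
  cbind(current_data,  study = "current")
)
# p_cat/i_cat coefficients are shared across the studies; the study term
# absorbs overall between-study differences.
fit <- brm(pleasure ~ p_cat + i_cat + study,
           data = combined,
           family = zero_one_inflated_beta())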
Unfortunately, my background in Bayesian statistics is very basic, since this is my first time trying to apply it, and I have no experience with Stan at all. That's why I decided to work with brms at this stage. I am not sure I can fit two models to two datasets at once, since that would be more complicated for me. Maybe I can have a look at it in the future, but for now I gave it some thought as you suggested and decided that I could use three normal priors, one for each level of my predictor p_cat. For the rest of the predictors, I do not think I can justify my choices. I think brms allows doing that for the different levels of a categorical predictor as follows:
my_prior <- c(
  prior(normal(0.17, 0.10), class = Intercept, resp = "pleasure"),
  prior(normal(0.33, 0.10), class = b, coef = p_catNeu, resp = "pleasure"),
  prior(normal(0.64, 0.10), class = b, coef = p_catPos, resp = "pleasure")
)
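I then pass this to brm along these lines (the ... stands for the rest of my call):

fit <- brm(..., prior = my_prior, sample_prior = TRUE)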
But I wonder:
Since the Intercept codes the reference level, how will the first prior affect the rest of the predictors? For instance, the reference level of the i_cat variable (for which I do not set any priors) is also absorbed into the Intercept. Should I avoid setting a prior on the Intercept and instead set priors only on the two other levels (p_catNeu, p_catPos)?
In terms of the prior predictive checks, I watched the video and tried to understand the code, but I am confused about how to implement it, because in my case I only have priors for one 3-level categorical predictor and no priors for the rest of the predictors. I've called prior_samples() after setting sample_prior = TRUE in brm. Can I use this with something like the pp_check function?
Thank you so much for the help, it is great for a beginner to get support in this forum.
Don’t be scared to look at Stan code. In the end brms/rstanarm just translate into Stan stuff anyway. The issue with Stan is that it can be super tedious to do the things that brms/rstanarm make easy.
I don’t know, honestly. This is the type of thing where I’d look at the generated Stan code/Stan data to make sure what I expected was happening was happening :D. I assume the coding with an intercept makes the other two terms offsets from that intercept, but I don’t know.
You can get that stuff with:
make_stancode(...)
make_standata(...)
where ... is all the stuff you’d pass to brm (brm(...))
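For example (guessing at your call; swap in your actual formula, data, and priors):

make_stancode(bf(pleasure ~ p_cat + i_cat),
              data = ratings,
              family = zero_one_inflated_beta(),
              prior = my_prior)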
Do whatever plots you can think of to check if the predictions are reasonable.
If you think you don't know anything about pleasure/intensity, then have a look and make sure they're sorta uniform looking. Since they're on [0, 1], make sure all the prior mass isn't stuck at 0 and 1 or something weird.
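A quick way to eyeball that (sketch; prior_fit here is a model fit with sample_prior = "only" as above):

yrep <- posterior_predict(prior_fit, resp = "pleasure")  # prior predictive draws (draws x observations)
hist(yrep, breaks = 50)  # watch for piles of mass at exactly 0 or 1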
Sorry to intrude, but this discussion makes me think it is fine to fiddle with the priors until we arrive at “significance” (e.g. credible intervals that don’t include zero). I thought the reason not to use uninformative priors is that they increase the chances of a Type I error. But what kind of error arises from trying priors until you obtain one that gives you what you want? It sounds like cheating :) What am I missing?
Thanks!