Novice with no prior experience

Hi,

I would like to specify my priors but am still in the ‘beginner’ camp and I’m not confident in what I am doing.

At the moment I have weakly informative priors:

m1priors <- c(
  prior(normal(0, 2), class = "Intercept"),
  prior(normal(0, 1), class = "b")
)

I am not completely sure I understand what this is saying. Is it saying I have a normal prior (and what does that mean)? With a mean of 0 and a standard deviation of 2 for the intercept, and 1 for class = "b"? (Again, I'm not sure what that means.)

So what I would like to know (aside from the above) is: how do I know what to set my priors to in order to make them more specific? Would I look at my data and set priors according to it? For example, I know that my data range from 0 to 4 and are left skewed.

Thank you all in advance

Hi Gabriella, I'd recommend reading up on these things, since it will help you tremendously. One book I can recommend to start with: Statistical Rethinking, 2nd edition.

In short, no, you should never set your priors based on data. You should always do prior predictive checks and conduct simulations.
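If it helps to see what that looks like in practice, here is a minimal sketch of a prior-only fit in brms using your priors. The formula outcome ~ 1 + predictor and the data frame my_data are placeholders for whatever your actual model is:

library(brms)

m1priors <- c(
  prior(normal(0, 2), class = "Intercept"),
  prior(normal(0, 1), class = "b")
)

# sample_prior = "only" ignores the likelihood, so the fit reflects the priors alone
m1_prior_only <- brm(
  outcome ~ 1 + predictor,
  data = my_data,
  prior = m1priors,
  sample_prior = "only"
)

# Compare outcomes simulated from the priors against the observed outcome
pp_check(m1_prior_only, ndraws = 50)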


When you’re new to priors, prior predictive checks are especially useful. Here are some resources:


I've long violated this, somewhat knowingly, usually by scaling the data to zero mean and unit variance. I've recently wondered whether running an SBC with that scaling step included would help convince even stubborn folks like me that it yields distorted inference (I just haven't had time to do it myself).


FWIW, rstanarm does this by default.
https://cran.r-project.org/web/packages/rstanarm/vignettes/priors.html

Of course it’s better to actually elicit domain expertise to inform the prior, in which case it doesn’t matter if one scales the data or not (as long as one scales the prior accordingly). A seriously bad choice of prior will always yield distorted inference; the question is how frequently a default prior that works with scaled data turns out to be “seriously bad”.
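For concreteness, here is a rough sketch of how that rescaling is expressed in rstanarm (placeholder formula and data; autoscale = TRUE rescales the stated prior to the units of the data, as described in the vignette linked above):

library(rstanarm)

fit <- stan_glm(
  outcome ~ predictor,
  data = my_data,
  prior = normal(0, 1, autoscale = TRUE),              # coefficient prior, rescaled to the data
  prior_intercept = normal(0, 2.5, autoscale = TRUE)   # intercept prior, rescaled to the data
)

prior_summary(fit)  # shows the priors actually used after rescaling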

Suppose the quantiles were (not) uniformly distributed. Would that mean inference is (not) distorted?

prior(normal(0, 2), class = "Intercept") says that your prior for the intercept is a normal distribution with a mean of zero and a standard deviation of 2. In a similar way, prior(normal(0, 1), class = "b") says that your prior for the predictor(s) is a normal distribution with a mean of zero and a standard deviation of 1.
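One quick way to get a feel for what those numbers imply is to look at each prior's central 95% interval (nothing here beyond the two normal distributions above):

# Roughly 95% of the intercept prior's mass lies within about +/- 3.9 of zero
qnorm(c(0.025, 0.975), mean = 0, sd = 2)

# Roughly 95% of each coefficient prior's mass lies within about +/- 2 of zero
qnorm(c(0.025, 0.975), mean = 0, sd = 1)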

It's hard to interpret these priors without knowing which likelihood you're using (e.g., Gaussian, ordinal with a logit link). Let's say you're using the Gaussian, and let's further presume you have standardized your predictor(s). Your intercept prior suggests you expect your data to be at the lower end of their limits (i.e., near zero) when your predictor(s) are at their means. When I have data of this kind, my default assumption is that the intercept will be more toward the middle of its possible range, which would incline me to set the intercept prior to something more like prior(normal(2, 2), class = "Intercept").
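If it helps, you can plot the two candidate intercept priors against the 0-4 range of your outcome to see what each one implies (just a sketch of the comparison described above):

library(ggplot2)

# Density of each candidate intercept prior, with the 0-4 limits of the outcome marked
ggplot(data.frame(x = c(-4, 8)), aes(x)) +
  stat_function(aes(colour = "normal(0, 2)"), fun = dnorm, args = list(mean = 0, sd = 2)) +
  stat_function(aes(colour = "normal(2, 2)"), fun = dnorm, args = list(mean = 2, sd = 2)) +
  geom_vline(xintercept = c(0, 4), linetype = "dashed") +
  labs(x = "intercept", colour = "prior")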

Anyway, this should give you some food for thought.

Also, +1 for the prior-predictive check suggestions. However, I acknowledge that is a very intimidating thing to do as a beginner. As suggested, follow along with McElreath’s examples to learn how to do this.


thank you, I will start with your recommended book :)

Brilliant, thank you!

Thanks so much, all these answers are really helpful, will go and do some more reading :)

Hi @GabriellaS-K,
Statistical Rethinking, including the online lectures, is a really amazing resource. Going through most of its chapters will definitely help you understand your question at a pretty deep level.

In direct answer to your question:

prior(normal(0, 2), class = "Intercept")
This assumes that, before seeing the data, the intercept in your regression model lies somewhere in a normal distribution with a mean of 0 and a standard deviation of 2. You can see what that distribution looks like:

library(ggplot2)
library(tibble)

# Density of 10,000 draws from the normal(0, 2) intercept prior
ggplot(data = tibble(intercept = rnorm(10000, 0, 2))) +
  geom_density(aes(x = intercept))

prior(normal(0, 1), class = "b")
b here stands for 'beta' in the regression model equation, i.e., the coefficient for some variable, which represents a deviation from the intercept. For example, it might be a regression where you are looking at Group A vs. Group B:

outcome ~ 1 + Group

Group A would represent the intercept (the `1`). Group B represents the effect of group, in terms of its deviation from the intercept. So the prior on b in this case indicates that the difference in means between Group A and Group B, before seeing the data, is expected to lie in a normal distribution with a mean of 0 and a standard deviation of 1.
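To make that concrete, here is a small simulation that draws from both priors and looks at the group means they imply (pure prior simulation, no data involved):

set.seed(1)

n_draws <- 10000
intercept <- rnorm(n_draws, 0, 2)   # prior draws for the Group A mean
beta_group <- rnorm(n_draws, 0, 1)  # prior draws for the Group B - Group A difference

group_b_mean <- intercept + beta_group

# The implied Group B - Group A difference is centred on 0 with a standard deviation of 1
quantile(beta_group, probs = c(0.025, 0.5, 0.975))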
