Novice with no prior experience

Hi,

I would like to specify my priors but am still in the ‘beginner’ camp and I’m not confident in what I am doing.

At the moment I have weakly informative priors:

m1priors <- c(
  prior(normal(0, 2), class = "Intercept"),
  prior(normal(0, 1), class = "b")
)

I am not completely sure I understand what this is saying. Is it saying I have a normal prior (and what does that mean)? With a mean of 0 and a standard deviation of 2 for the intercept, and 1 for class = "b"? (Again, I'm not sure what that means.)

So what I would like to know (aside from the above) is: how do I know what to set my priors to in order to make them more specific? Would I look at my data and set priors according to it? For example, I know that my data range from 0 to 4 and are left skewed.

Thank you all in advance

Hi Gabriella, I'd recommend reading up on these things, since it will help you tremendously. One book I can recommend to start with: Statistical Rethinking, 2nd edition.

In short, no, you should never set your priors based on data. You should always do prior predictive checks and conduct simulations.
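If it helps to see what that looks like in practice, here is a minimal sketch of a prior-only fit in brms using your priors. The formula outcome ~ 1 + predictor and the data frame my_data are placeholders for whatever your actual model is:

library(brms)

m1priors <- c(
  prior(normal(0, 2), class = "Intercept"),
  prior(normal(0, 1), class = "b")
)

# sample_prior = "only" ignores the likelihood, so the fit reflects the priors alone
m1_prior_only <- brm(
  outcome ~ 1 + predictor,
  data = my_data,
  prior = m1priors,
  sample_prior = "only"
)

# Compare outcomes simulated from the priors against the observed outcome
pp_check(m1_prior_only, ndraws = 50)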


When you’re new to priors, prior predictive checks are especially useful. Here are some resources:


I've long violated this, somewhat knowingly, usually by scaling the data to zero mean and unit variance. I've recently wondered whether running an SBC with that scaling step included would help convince even stubborn folks like me that it yields distorted inference (I just haven't had time to do it myself).


FWIW, rstanarm does this by default.
https://cran.r-project.org/web/packages/rstanarm/vignettes/priors.html

Of course it’s better to actually elicit domain expertise to inform the prior, in which case it doesn’t matter if one scales the data or not (as long as one scales the prior accordingly). A seriously bad choice of prior will always yield distorted inference; the question is how frequently a default prior that works with scaled data turns out to be “seriously bad”.
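For concreteness, here is a rough sketch of how that rescaling is expressed in rstanarm (placeholder formula and data; autoscale = TRUE rescales the stated prior to the units of the data, as described in the vignette linked above):

library(rstanarm)

fit <- stan_glm(
  outcome ~ predictor,
  data = my_data,
  prior = normal(0, 1, autoscale = TRUE),              # coefficient prior, rescaled to the data
  prior_intercept = normal(0, 2.5, autoscale = TRUE)   # intercept prior, rescaled to the data
)

prior_summary(fit)  # shows the priors actually used after rescaling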

Suppose the quantiles were (not) uniformly distributed. Would that mean inference is (not) distorted?

prior(normal(0, 2), class = "Intercept") says that your prior for the intercept is a normal distribution with a mean of zero and a standard deviation of 2. In a similar way, prior(normal(0, 1), class = "b") says that your prior for the predictor(s) is a normal distribution with a mean of zero and a standard deviation of 1.
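One quick way to get a feel for what those numbers imply is to look at each prior's central 95% interval (nothing here beyond the two normal distributions above):

# Roughly 95% of the intercept prior's mass lies within about +/- 3.9 of zero
qnorm(c(0.025, 0.975), mean = 0, sd = 2)

# Roughly 95% of each coefficient prior's mass lies within about +/- 2 of zero
qnorm(c(0.025, 0.975), mean = 0, sd = 1)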

It's hard to interpret these priors without knowing which likelihood you're using (e.g., Gaussian, ordinal with a logit link). Let's say you're using the Gaussian, and let's further presume you have standardized your predictor(s). Your intercept prior suggests you expect your data to be at the lower end of their limits (i.e., near zero) when your predictor(s) are at their means. When I have data of this kind, my default assumption is that the intercept will be more toward the middle of its possible range, which would incline me to set the intercept prior to something more like prior(normal(2, 2), class = "Intercept").
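If it helps, you can plot the two candidate intercept priors against the 0-4 range of your outcome to see what each one implies (just a sketch of the comparison described above):

library(ggplot2)

# Density of each candidate intercept prior, with the 0-4 limits of the outcome marked
ggplot(data.frame(x = c(-4, 8)), aes(x)) +
  stat_function(aes(colour = "normal(0, 2)"), fun = dnorm, args = list(mean = 0, sd = 2)) +
  stat_function(aes(colour = "normal(2, 2)"), fun = dnorm, args = list(mean = 2, sd = 2)) +
  geom_vline(xintercept = c(0, 4), linetype = "dashed") +
  labs(x = "intercept", colour = "prior")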

Anyway, this should give you some food for thought.

Also, +1 for the prior-predictive check suggestions. However, I acknowledge that is a very intimidating thing to do as a beginner. As suggested, follow along with McElreath’s examples to learn how to do this.


thank you, I will start with your recommended book :)

Brilliant, thank you!

Thanks so much, all these answers are really helpful, will go and do some more reading :)

Hi @GabriellaS-K,
Statistical Rethinking, including the online lectures, is a really amazing resource. Going through most of its chapters will definitely help you understand your question at a pretty deep level.

In direct answer to your question:

prior(normal(0, 2), class = "Intercept")
This assumes that, before seeing the data, the intercept in your regression model lies somewhere in a normal distribution with a mean of 0 and a standard deviation of 2. You can see what that distribution looks like:

library(ggplot2)
library(tibble)

# Density of 10,000 draws from the normal(0, 2) intercept prior
ggplot(data = tibble(intercept = rnorm(10000, 0, 2))) +
  geom_density(aes(x = intercept))

prior(normal(0, 1), class = "b")
b here stands for 'beta' in the regression model equation, i.e., the coefficient for some variable, which represents a deviation from the intercept. For example, it might be a regression where you are looking at Group A vs. Group B:

outcome ~ 1 + Group

Group A would represent the intercept (the `1`). Group B represents the effect of group, in terms of its deviation from the intercept. So the prior on b in this case indicates that the difference in means between Group A and Group B, before seeing the data, is expected to lie in a normal distribution with a mean of 0 and a standard deviation of 1.
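To make that concrete, here is a small simulation that draws from both priors and looks at the group means they imply (pure prior simulation, no data involved):

set.seed(1)

n_draws <- 10000
intercept <- rnorm(n_draws, 0, 2)   # prior draws for the Group A mean
beta_group <- rnorm(n_draws, 0, 1)  # prior draws for the Group B - Group A difference

group_b_mean <- intercept + beta_group

# The implied Group B - Group A difference is centred on 0 with a standard deviation of 1
quantile(beta_group, probs = c(0.025, 0.5, 0.975))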
