Use of `sample_prior='only'` in brms

torkar · May 28, 2018, 1:41pm

(Using brms v2.3.0)

Hi,

If I fit a model using sample_prior='only', i.e.,

fit ← brm(formula = tp ~ 1 + technique + category + subject,
data = d,
prior = c(set_prior(“normal(0,1)”, class=“Intercept”),
set_prior(“normal(0,1)”, class=“b”)),
family = poisson(link=log),
sample_prior = “only” #To ignore the likelihood and check if the model is sane
)

I would expect that I later could do:
ppc_dens_overlay(y = d$tp, yrep = posterior_predict(fit, draws=25))

but unfortunately, I get:
Error in validate_yrep(yrep, y) : NAs not allowed in 'yrep'.

Fake data to reproduce this can be found here:

I have a nagging feeling I’ve completely missed something here…

jonah · May 28, 2018, 5:06pm

I do think that approach should work in general. Is posterior_predict returning NA’s? If the error message is right then it seems like there are NA predictions when there shouldn’t be. If there are NAs, are there just a few or are all the predictions NA? If not then it could be a bayesplot issue (incorrect error message).

torkar · May 28, 2018, 5:10pm

Quite many NAs…

yrep_fit_posterior ← posterior_predict(fit_prio_only)
There were 47 warnings (use warnings() to see them)
sum(is.na(yrep_fit_posterior))
[1] 7536
length(yrep_fit_posterior)
[1] 280000

But none when I disregard sample_prior='only

paul.buerkner · May 28, 2018, 5:11pm

I just tested it and think I found the reason. The priors are so wide that they imply large numbers which cause exp(<larger number>) to overflow causing NAs to be predicted.

As a side issue it should be nsamples not draws, but that is not the problem here.

torkar · May 28, 2018, 5:12pm

The nsamples I got to know about just a few minutes back by reading the docs ;)

torkar · May 28, 2018, 5:20pm

You mean that my N(0,1) and cauchy(0,1) are too wide? And I thought they were fairly tight…

Zachary_Horne · May 28, 2018, 5:56pm

I’m also confused that you said normal(0,1) is wide

paul.buerkner · May 28, 2018, 7:31pm

Sorry, I should have put that into more context. The scales of the predictors are so large that the chosen priors lead to very a large prior on the intercept, which is parameterised in a special way. See ?set_prior for details. I see the following output:

Population-Level Effects: 
          Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
Intercept    -0.13     18.23   -36.15    35.96       4000 1.00
technique    -0.02      1.02    -1.99     2.01       4000 1.00
category     -0.01      1.00    -1.96     1.95       4000 1.00
subject       0.01      1.00    -2.01     1.98       4000 1.00

Thus, the intercept is the problem. You may put a prior on the intercept directly via

fit2 <- brm(formula = tp ~ 0 + intercept + technique + category + subject,
           data = dat,
           prior = set_prior("normal(0,1)", class="b"),
           family = poisson(link=log),
           sample_prior = "only" #To ignore the likelihood and check if the model is sane
)

In this case, the normal(0, 1) prior really affects the intercept instead of the transformed intercept.

Zachary_Horne · May 28, 2018, 7:34pm

This is helpful. Thank you

avehtari · May 28, 2018, 7:44pm

@paul.buerkner already solved the problem with the intercept, but before that I was thinking about this

For poisson(link=log) they can be too wide, and compared to your data they seem to be too wide. If I read your model right, the intended model has four additive terms for log intensity with each having a prior N(0,1), and thus the prior for the log intensity would be N(0,4^2), which corresponds to assuming that there is 5% probability that the intensity is larger than 720. This may or may not be against your prior information, but at least it seems to be 36 times larger than the largest count in your data. See also @anon75146577’s blog post

torkar · May 28, 2018, 8:06pm

Thanks @paul.buerkner and @avehtari, much clearer now and much appreciated. And thanks to @jonah et al. for Bayesplot!

torkar · May 29, 2018, 8:37am

It might be worth adding that the model @paul.buerkner provides in his answer still generates NAs when running posterior_predict(). However, as indicated by Paul and @avehtari, setting the prior to, e.g., set_prior("normal(0,0.1)", class="b"), removes the NAs.

jonah · May 29, 2018, 9:13pm

Also related to the prior predictive checking section of the visualization paper (Dan Simpson is everywhere)

Topic		Replies	Views
Prior predictive checks with brms brms	4	2866	August 16, 2023
Sampled Priors for Intercept don't match the setting brms prior-predictive	3	229	February 28, 2024
Predictions from a fitted model's prior distribution brms brms	6	678	October 17, 2023
NAs produced in posterior_predict for truncated model brms	6	721	April 8, 2022
Brms doesn't appear to account for priors on `sigma` when using `sample_prior = "only"`? brms priors , prior-predictive , brms	2	523	April 6, 2023

Use of `sample_prior='only'` in brms

Related topics