Use of `sample_prior='only'` in brms


#1

(Using brms v2.3.0)

Hi,

If I fit a model using sample_prior='only', i.e.,

fit <- brm(formula = tp ~ 1 + technique + category + subject,
data = d,
prior = c(set_prior(“normal(0,1)”, class=“Intercept”),
set_prior(“normal(0,1)”, class=“b”)),
family = poisson(link=log),
sample_prior = “only” #To ignore the likelihood and check if the model is sane
)

I would expect that I later could do:
ppc_dens_overlay(y = d$tp, yrep = posterior_predict(fit, draws=25))

but unfortunately, I get:
Error in validate_yrep(yrep, y) : NAs not allowed in 'yrep'.

Fake data to reproduce this can be found here:

I have a nagging feeling I’ve completely missed something here…


#2

I do think that approach should work in general. Is posterior_predict returning NA’s? If the error message is right then it seems like there are NA predictions when there shouldn’t be. If there are NAs, are there just a few or are all the predictions NA? If not then it could be a bayesplot issue (incorrect error message).


#3

Quite many NAs…

yrep_fit_posterior <- posterior_predict(fit_prio_only)
There were 47 warnings (use warnings() to see them)
sum(is.na(yrep_fit_posterior))
[1] 7536
length(yrep_fit_posterior)
[1] 280000

But none when I disregard sample_prior='only


#4

I just tested it and think I found the reason. The priors are so wide that they imply large numbers which cause exp(<larger number>) to overflow causing NAs to be predicted.

As a side issue it should be nsamples not draws, but that is not the problem here.


#5

The nsamples I got to know about just a few minutes back by reading the docs ;)


#6

You mean that my N(0,1) and cauchy(0,1) are too wide? And I thought they were fairly tight…


#7

I’m also confused that you said normal(0,1) is wide


#8

Sorry, I should have put that into more context. The scales of the predictors are so large that the chosen priors lead to very a large prior on the intercept, which is parameterised in a special way. See ?set_prior for details. I see the following output:

Population-Level Effects: 
          Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
Intercept    -0.13     18.23   -36.15    35.96       4000 1.00
technique    -0.02      1.02    -1.99     2.01       4000 1.00
category     -0.01      1.00    -1.96     1.95       4000 1.00
subject       0.01      1.00    -2.01     1.98       4000 1.00

Thus, the intercept is the problem. You may put a prior on the intercept directly via

fit2 <- brm(formula = tp ~ 0 + intercept + technique + category + subject,
           data = dat,
           prior = set_prior("normal(0,1)", class="b"),
           family = poisson(link=log),
           sample_prior = "only" #To ignore the likelihood and check if the model is sane
)

In this case, the normal(0, 1) prior really affects the intercept instead of the transformed intercept.


#9

This is helpful. Thank you


#10

@paul.buerkner already solved the problem with the intercept, but before that I was thinking about this

For poisson(link=log) they can be too wide, and compared to your data they seem to be too wide. If I read your model right, the intended model has four additive terms for log intensity with each having a prior N(0,1), and thus the prior for the log intensity would be N(0,4^2), which corresponds to assuming that there is 5% probability that the intensity is larger than 720. This may or may not be against your prior information, but at least it seems to be 36 times larger than the largest count in your data. See also @Daniel_Simpson’s blog post


#11

Thanks @paul.buerkner and @avehtari, much clearer now and much appreciated. And thanks to @jonah et al. for Bayesplot!


#12

It might be worth adding that the model @paul.buerkner provides in his answer still generates NAs when running posterior_predict(). However, as indicated by Paul and @avehtari, setting the prior to, e.g., set_prior("normal(0,0.1)", class="b"), removes the NAs.


#13

Also related to the prior predictive checking section of the visualization paper (Dan Simpson is everywhere)