NAs produced in posterior_predict for truncated model

I’ve been running some truncated normal models in brms, alongside standard normal and cumulative (ordinal) models. When I use posterior_predict, or even just pp_check, a couple of predicted values in some of the posterior draws occasionally come out as NA in the truncated model, but never in the standard normal or ordinal models. There are no NAs in the data being predicted from, and I can see no reason why an NA would be produced. Is something specific to truncated models causing this? For example, does it predict a value outside the truncation range and then set it to NA?

  • Operating System: Windows 10
  • brms Version: 2.16.3

Could you please provide a minimal reproducible example?

Yes - here is an example. I’m using real data, so I’ve cut out a subsection of it for you to use. The real data have about 2k respondents, so I don’t think a lack of data is the issue, in case that is a concern with this small data set:

library(brms)

test_form <- importance | trunc(lb = .99, ub = 10.01) ~ 1 + item + (1 | respondent_id)

prior <- c(set_prior("exponential(1)", class = "sigma"),
           set_prior("exponential(1)", class = "sd"))

test_fit <- brm(formula = test_form,
                family = gaussian(),
                data = data,
                control = list(adapt_delta = 0.95, max_treedepth = 10),
                chains = 3,
                cores = 3,
                iter = 4000,
                backend = "cmdstanr",
                threads = threading(4),
                prior = prior,
                warmup = 500)


reprex.RData (3.2 MB)
The uploaded file includes the data and model fit.

Running pp_check on this fit reports that there are NAs, and you can see the same NAs by calling posterior_predict.

(On a separate note, if you can point me to something that explains what exactly the parameters of the truncated model represent, it would be much appreciated. Sometimes the sds and beta values are dramatically different from those of a non-truncated model, yet the predictions are quite similar!)

I have checked your example, and the NAs appear because the truncation bounds yield numerically identical quantiles (both at the 0% or both at the 100% quantile) within the untruncated distribution. Unfortunately, there is not much I can do about it at this point :-(
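To make the explanation above concrete, here is a minimal sketch in base R. It assumes the predictions are drawn by inverse-CDF sampling between the CDF values of the two bounds, and that non-finite results are returned as NA; that is my reading of the explanation, not a confirmed description of brms internals. The helper `r_trunc_normal` and the parameter values are hypothetical and chosen only for illustration.

```r
# Hypothetical helper: inverse-CDF sampling from a truncated normal.
# This is an assumed mechanism for illustration, not brms's actual code.
r_trunc_normal <- function(n, mu, sigma, lb, ub) {
  p_lb <- pnorm(lb, mean = mu, sd = sigma)  # CDF of the untruncated normal at lb
  p_ub <- pnorm(ub, mean = mu, sd = sigma)  # CDF of the untruncated normal at ub
  u <- runif(n, min = p_lb, max = p_ub)     # uniform draws between the two CDF values
  x <- qnorm(u, mean = mu, sd = sigma)      # map back through the quantile function
  x[!is.finite(x)] <- NA_real_              # assumption: non-finite draws become NA
  x
}

lb <- 0.99
ub <- 10.01

set.seed(123)

# Mean inside the bounds: the two CDF values differ, so sampling works.
draws_ok <- r_trunc_normal(5, mu = 5, sigma = 2, lb = lb, ub = ub)

# Mean far below the bounds: both CDF values round to exactly 1 in
# double precision, so every draw maps to Inf and ends up as NA.
p_lb_bad <- pnorm(lb, mean = -20, sd = 1)
p_ub_bad <- pnorm(ub, mean = -20, sd = 1)
draws_bad <- r_trunc_normal(5, mu = -20, sigma = 1, lb = lb, ub = ub)

print(draws_ok)
print(c(p_lb_bad, p_ub_bad))
print(draws_bad)  # all NA
```

In a fitted model, this would only affect the occasional draw whose linear predictor lands many sigmas away from the truncation range, which would be consistent with seeing just a couple of NAs in some draws.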

The parameters of truncated distributions are hard to interpret because of the truncation, and there is no general rule for what they mean. To get a better feeling for them, I recommend creating plots that show the density or samples of the truncated distribution, then varying the parameters to see how the implied truncated distribution changes.
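A rough sketch of the kind of plot suggested above, using only base R. The helper `d_trunc_normal` and the three parameter pairs are hypothetical; the bounds are the ones from the model formula. The density is the untruncated normal density rescaled by the probability mass between the bounds.

```r
# Hypothetical helper: density of a normal truncated to [lb, ub].
d_trunc_normal <- function(x, mu, sigma, lb, ub) {
  # Probability mass of the untruncated normal that falls inside the bounds
  mass <- pnorm(ub, mean = mu, sd = sigma) - pnorm(lb, mean = mu, sd = sigma)
  ifelse(x >= lb & x <= ub, dnorm(x, mean = mu, sd = sigma) / mass, 0)
}

lb <- 0.99
ub <- 10.01
x <- seq(lb, ub, length.out = 400)

# Illustrative parameter pairs (assumed values, not estimates from the fit)
params <- list(
  c(mu = 8,  sigma = 2),
  c(mu = 12, sigma = 3),
  c(mu = 20, sigma = 5)
)

# One column of density values per parameter pair
dens <- sapply(params, function(p) {
  d_trunc_normal(x, mu = p[["mu"]], sigma = p[["sigma"]], lb = lb, ub = ub)
})

matplot(x, dens, type = "l", lty = 1, col = 1:3,
        xlab = "importance", ylab = "truncated density")
legend("topleft", lty = 1, col = 1:3,
       legend = sapply(params, function(p) {
         sprintf("mu = %g, sigma = %g", p[["mu"]], p[["sigma"]])
       }))
```

Overlaying the curves this way makes it easier to see how mu and sigma jointly determine the shape inside the bounds, which may help explain why quite different parameter values can give similar predictions.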

Thanks for looking into it - much appreciated! Would ignoring these NAs and summarising the posterior without them introduce any strange biases or other problems that should not be ignored? There are usually only a couple of them in some of the posterior draws, and the posterior samples otherwise make sense, so they don’t seem to be doing anything too worrying.

If there are just a few, it will likely not matter much, if at all, but it is still likely to bias things somewhat (the more NAs, the more bias).
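One way to judge whether there are "just a few" is to count the NAs before summarising. The sketch below uses a small made-up matrix as a stand-in for the draws-by-observations matrix that posterior_predict returns; the values and the positions of the NAs are invented for illustration.

```r
set.seed(1)

# Stand-in for the output of posterior_predict():
# rows are posterior draws, columns are observations (made-up values).
yrep <- matrix(runif(20, min = 1, max = 10), nrow = 5, ncol = 4)
yrep[2, 3] <- NA
yrep[4, 3] <- NA

# Overall share of NA predictions
na_share_total <- mean(is.na(yrep))

# Share of NA predictions per observation, to see whether they cluster
na_share_by_obs <- colMeans(is.na(yrep))

# Per-observation predictive means, ignoring the NAs
pred_mean <- colMeans(yrep, na.rm = TRUE)

print(na_share_total)
print(na_share_by_obs)
print(pred_mean)
```

If the NAs cluster on a few observations rather than being spread evenly, summaries for those observations are the ones most likely to be affected by dropping them.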
