Help with handling NAs in posterior_predict

I have fit a model using rstanarm::stan_glm. However, when I run posterior_predict I get an error about NAs in the data. It looks like NAs are not supported yet. Is there a workaround?

Here is the call, followed by the error and its traceback:

pp <- posterior_predict(fit_pooled_stan_1, newdata = test_2019, transform=TRUE)

Error: NAs are not allowed in 'newdata'.
stop("NAs are not allowed in 'newdata'.", call. = FALSE)
validate_newdata(object, newdata = newdata, m = m)
posterior_predict.stanreg(fit_pooled_stan_1, newdata = test_2019, transform = TRUE)
posterior_predict(fit_pooled_stan_1, newdata = test_2019, transform = TRUE)

  • Operating System: Mac OS Mojave
  • rstanarm Version: rstanarm_2.21.2


I think this really depends on what the NAs represent. How do you imagine a prediction should work without knowing all of the predictors? One meaningful special case is predicting varying intercepts or varying effects (random effects) for previously unseen groups. In that case (for rstanarm) you don’t put NA in newdata. Instead, you either supply actual new, previously unseen levels of the grouping factor (if you want new values sampled for those levels), or use re.form to leave some of the grouping terms out of the prediction.
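A minimal sketch of both options, assuming a varying-intercept model. The data set (lme4::sleepstudy), the model, and the level name "new_subject" are illustrative assumptions, not the original poster's setup, and the short single chain is only there to keep the example quick:

```r
library(rstanarm)

# Illustrative model: reaction time by day, with a varying intercept per subject.
fit <- stan_lmer(
  Reaction ~ Days + (1 | Subject),
  data = lme4::sleepstudy,
  chains = 1, iter = 500, refresh = 0, seed = 1
)

# Option 1: give the unseen group an actual new level instead of NA.
# Its intercept is then drawn from the estimated group-level distribution.
new_subject <- data.frame(Days = 0:9, Subject = "new_subject")
pp_new <- posterior_predict(fit, newdata = new_subject)

# Option 2: re.form = NA conditions on no group-level terms, so the
# predictions use only the population-level part of the model.
pp_fixed <- posterior_predict(fit, newdata = new_subject, re.form = NA)

dim(pp_new)    # posterior draws x rows of newdata
dim(pp_fixed)
```

Both calls return a matrix with one row per posterior draw and one column per row of newdata. The draws in pp_new would normally be more spread out, because they also carry the between-subject variation.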

If your use case is different, please explain what behaviour you expect, and we may be able to help you achieve it with rstanarm.
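One guess at a common case, offered as a sketch rather than a diagnosis: the NAs sit in the outcome column (the future has not been observed) or in a few rows with incomplete predictors. Depending on the rstanarm version, the NA check may cover more columns than the predictors alone, so it can help to pass only the predictor columns and drop incomplete rows first. The helper prepare_newdata() and the toy data below are hypothetical, not part of rstanarm:

```r
# Hypothetical helper (not part of rstanarm): keep only the predictor columns
# of `newdata` and drop rows in which any predictor is missing.
prepare_newdata <- function(newdata, model_formula) {
  predictors <- all.vars(delete.response(terms(model_formula)))
  missing_cols <- setdiff(predictors, names(newdata))
  if (length(missing_cols) > 0) {
    stop("newdata lacks predictors: ", paste(missing_cols, collapse = ", "))
  }
  keep <- complete.cases(newdata[, predictors, drop = FALSE])
  list(
    data = newdata[keep, predictors, drop = FALSE],
    kept_rows = which(keep)  # original row indices, to map predictions back
  )
}

# Toy stand-in for test_2019: the outcome is unknown for every row,
# and the second row is also missing a predictor.
toy_newdata <- data.frame(
  y  = NA_real_,
  x1 = c(1.2, NA, 0.7),
  x2 = c(10, 20, 30)
)

prepared <- prepare_newdata(toy_newdata, y ~ x1 + x2)
print(prepared$kept_rows)
print(prepared$data)

# With a fitted model this would be used roughly as follows (not run here):
# pp <- posterior_predict(fit_pooled_stan_1, newdata = prepared$data)
```

Rows with missing predictors are dropped rather than imputed, so kept_rows is returned to line the columns of the prediction matrix up with the original rows.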

Does that make sense?

Best of luck with your model.

This sounds loosely related to my question, Visualizing predictive uncertainty. In that case, there were potentially NAs in the y values. The first (left) part of the graph showed the fit of the model. On the second (right) part I wanted (and still want) to show the forecasted results and their uncertainty (e.g., 50% and 90% CIs). There is no y to show, as that future has yet to happen. But ppc_intervals() won’t work unless the length of y equals the number of columns of yrep (that is, the number of rows of newdata), and it won’t let me append NAs to y to make the lengths match.
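One way around needing a y at all is to summarise the predictive draws directly and plot the intervals yourself. The sketch below is an assumption-laden illustration: yrep_future is simulated here as a stand-in for the draws-by-observations matrix that posterior_predict(fit, newdata = future) would return, and all names are made up for the example:

```r
set.seed(1)

# Stand-in for posterior_predict(fit, newdata = future):
# 1000 posterior draws (rows) for 12 future periods (columns).
n_draws <- 1000
n_future <- 12
yrep_future <- sapply(
  seq_len(n_future),
  function(t) rnorm(n_draws, mean = 100 + 2 * t, sd = 5 + t)
)

# Column-wise quantiles: 90% interval (ll, hh), 50% interval (l, h), median (m).
probs <- c(ll = 0.05, l = 0.25, m = 0.50, h = 0.75, hh = 0.95)
q <- apply(yrep_future, 2, quantile, probs = probs, names = FALSE)
intervals <- data.frame(x = seq_len(n_future), t(q))
names(intervals) <- c("x", names(probs))
print(head(intervals, 3))

# Thin segments show the 90% interval, thick segments the 50% interval,
# and points the posterior predictive median. No observed y is required.
plot(intervals$x, intervals$m,
     ylim = range(intervals$ll, intervals$hh), pch = 19,
     xlab = "Future period", ylab = "Predicted y")
segments(intervals$x, intervals$ll, intervals$x, intervals$hh, lwd = 1)
segments(intervals$x, intervals$l,  intervals$x, intervals$h,  lwd = 3)
```

The same intervals data frame could be placed to the right of the fitted portion of the graph, since it depends only on the predictive draws and not on observed outcomes.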