Issue using posterior_vs_prior with Poisson or Neg binomial models

Rachel · August 27, 2019, 4:00am

Please also provide the following information in addition to your question:

Operating System: macOS Mojave 10.14.6
rstanarm Version: 2.18.2

First, thank you to the developers of rstanarm, it’s such a user-friendly way to learn Bayesian statistics.

I have been unable to use the posterior_vs_predict function for rstanarm Poisson or Negative binomial models. I have read through the R documentation and googled but I still don’t understand why I can’t get it to work. In a recent statistics class, the teacher (on Linux, I believe same version of rstanarm) also encountered this problem. Is there something different I must call within this function for Poisson or Neg binomial models? For Gaussian or Binomial rstanarm models I normally use the below code with no issues:

posterior_vs_prior(rstanarm.model, pars = c(‘parameters specified by name’),
color_by = ‘vs’, group_by = TRUE, facet_args = list(scales = ‘free_y’))

Even keeping it simple, eg. posterior_vs_prior(rstanarm.model), this won’t work with a Poisson or Neg binomial model.

Thanks in advance!

stijn · August 27, 2019, 5:04am

Welcome to the forum!

Could you give more detail on the error that you get? I tried a minimal example with simulated data and that seemed to work.

x <- rnorm(100, 0, 1)
y <- rpois(100, exp(x))
d <- data.frame(x = x, y = y)
stp <- stan_glm(y ~ x, data = d, family = poisson)
posterior_vs_prior(stp)

Rachel · August 29, 2019, 4:31am

Thanks for your response. You reminded me of the importance of going back to the basics to locate the issue!

Keeping with your minimal example, it’s only when changing both the iterations and warmup from the default that the plot throws an error.

stp ← stan_glm(y ~ x, data = d, family = poisson, iter = 5000, warmup = 2000)
posterior_vs_prior(stp)
Error in FUN(newX[, i], . . . ) : is.atomic(x) is not TRUE

I don’t really understand the error, except x is not an atomic vector. Isn’t a numeric vector (which x is currently classified as) a type of atomic vector?

I believe it may be something to do with the default warmup to iteration ratio for Poisson and Negative binomial models being set at 50%. I assume this is in place as a safe guard.

I decreased the warmup (to 40%) in my models since validation checks still looked good and I thought it’s best to make use of as many samples as possible. Is this unnecessary?

Thanks in advance.

stijn · August 29, 2019, 6:04am

I think this is a bug. I get the same error with

stp <- stan_glm(y ~ x, data = d, family = poisson, iter = 4001, warmup = 2000)
stp <- stan_glm(y ~ x, data = d, family = gaussian, iter = 4001, warmup = 2000)

but not with

stp <- stan_glm(y ~ x, data = d, family = poisson, iter = 3001, warmup = 1000)
stp <- stan_glm(y ~ x, data = d, family = gaussian, iter = 3001, warmup = 1000)
stp <- stan_glm(y ~ x, data = d, family = poisson, iter = 4000, warmup = 1000)
stp <- stan_glm(y ~ x, data = d, family = gaussian, iter = 4000, warmup = 1000)

@bgoodri @jonah I can report this on github if you think it’s a bug. I followed traceback and I think it happens in or shortly after this line

EBFMI <- get_num_upars(object)/apply(E, 2, var)

in throw_sample_warnings when sampling from the prior (update.stanreg)

stijn · August 30, 2019, 7:59am

It’s a bug. I reported it here

Rachel · August 30, 2019, 8:42am

Thank you!
I really like the posterior_vs_prior function as it’s a quick and easy way to check my priors makes sense, so I’m looking forward to a solution.

stijn · August 30, 2019, 9:05am

If this is representative of your specific example

stp <- stan_glm(y ~ x, data = d, family = poisson, iter = 5000, warmup = 2000)

you can do

stp <- stan_glm(y ~ x, data = d, family = poisson, iter = 5000)
posterior_vs_prior(stp)

You have 500 more warmup and 500 less posterior samples per chain but that should still work.

The following will also work but sampling from the prior will have a shorter warmup (which is probably fine for a regular regression).

stp <- stan_glm(y ~ x, data = d, family = poisson, iter = 5000, warmup = iter / 2.5)
posterior_vs_prior(stp)

Rachel · August 30, 2019, 9:31am

Thanks, yea I have been doing the former example and it’s fine. It would only provide more flexibility if I can change the warmup and still use this function in the future.
Maybe it’s time I learn rstan/stan.

stijn · August 30, 2019, 10:42am

I should be revising my paper but couldn’t stop thinking about this. This is also more for others who might hit this problem. If you load bayesplot first and then use this slightly hacked together version of the original function, I think you can keep using it even when you specify the warmup = 2000 in the original stan_glm call.


require(bayesplot)
posterior_vs_prior_stanreg <- function (object, pars = NULL, regex_pars = NULL, prob = 0.9, 
          color_by = c("parameter", "vs", "none"), group_by_parameter = FALSE, 
          facet_args = list(), ...) 
{
  if (!rstanarm:::used.sampling(object)) 
    rstanarm:::STOP_sampling_only("posterior_vs_prior")
  stopifnot(isTRUE(prob > 0 && prob < 1))
  color_by <- switch(match.arg(color_by), parameter = "parameter", 
                     vs = "model", none = NA)
  if (group_by_parameter) {
    group_by <- "parameter"
    xvar <- "model"
  }
  else {
    group_by <- "model"
    xvar <- "parameter"
  }
  aes_args <- list(x = xvar, y = "estimate", ymin = "lb", ymax = "ub")
  if (!is.na(color_by)) 
    aes_args$color <- color_by
  if (!length(facet_args)) {
    facet_args <- list(facets = group_by)
  }
  else {
    facet_args$facets <- group_by
  }
  message("\nDrawing from prior...")
  capture.output(Prior <- suppressWarnings(update(object, prior_PD = TRUE, 
                                                  refresh = -1, iter = 2000,
                                                  warmup = 1000, chains = 2)))
  objects <- rstanarm:::nlist(Prior, Posterior = object)
  plot_data <- rstanarm:::stack_estimates(objects, prob = prob, pars = pars, 
                               regex_pars = regex_pars)
  graph <- ggplot(plot_data, mapping = do.call("aes_string", 
                                               aes_args)) + geom_pointrange(...) + do.call("facet_wrap", 
                                                                                           facet_args) + theme_default() + xaxis_title(FALSE) + 
    yaxis_title(FALSE) + xaxis_ticks() + xaxis_text(angle = -30, 
                                                    hjust = 0) + grid_lines(color = "gray", size = 0.1)
  if (group_by == "parameter") 
    return(graph)
  abbrevs <- abbreviate(plot_data$parameter, 12, method = "both.sides", 
                        dot = TRUE)
  graph + scale_x_discrete(name = "Parameter", labels = abbrevs)
}

jonah · September 12, 2019, 7:43pm

Thanks for reporting this and for attempting to troubleshoot! I haven’t had a chance to look seriously into it yet but hopefully will in the coming days.

mcol · September 13, 2019, 4:31pm

There’s a PR on the rstanarm github repo about this.

jonah · September 14, 2019, 12:02am

Awesome, thanks!! Ben and I shoild be getting back to rstanarm in the next few days.

Topic		Replies	Views
Poisson and negative binomial regression rstanarm fitting-issues , specification	8	3597	July 9, 2017
Posterior predictive checking Modeling	10	2077	November 26, 2019
Prediction with rstanarm differs from prediction with stan rstanarm	4	936	January 19, 2018
Rstanarm prior specification: stan_glm.nb() and stan_glm() Poisson rstanarm	6	1255	June 21, 2017
Help with Poisson model Modeling loo	45	7738	March 3, 2018

Issue using posterior_vs_prior with Poisson or Neg binomial models

Related topics