loo_moment_match() and reloo() errors

I was thrilled to see the new moment-matching approach to importance sampling in loo, and it has helped me a lot with a series of models so far. However, when I apply it to the Poisson hurdle model described below, I get: “Error in validate_ll(log_ratios) : All input values must be finite.” Yet the pointwise log-likelihoods of the model are all finite. Trying reloo() on the same model yields NaN for the elpd estimate.
Here are the data and the code:

pconstr.csv (1.2 KB)

library(brms)

d <- read.csv('pconstr.csv', stringsAsFactors = TRUE)
m <- brm(bf(Count ~ ConstraintType + L2 + (1|Language),
            hu ~ ConstraintType + L2 + (1|Language)),
         family = hurdle_poisson(),
         prior = c(prior(student_t(5, 0, 1), class = b),
                   prior(cauchy(0, 1), class = sd)),
         data = d,
         chains = 4, cores = 4,
         iter = 2000, warmup = 1000,
         save_all_pars = TRUE, # keep all parameters for moment matching
         control = list(adapt_delta = .99))
loo(m, moment_match = TRUE) # error: "All input values must be finite."
any(!is.finite(log_lik(m))) # FALSE
loo(m, reloo = TRUE) # NaN for the elpd estimate

I have also tried the same model with more regularizing priors on the intercept and/or the predictors, without the varying intercept, without the L2 predictor, without the ConstraintType predictor, with deviation instead of treatment coding, without predictors on the hurdle, and with negative-binomial instead of poisson likelihoods. The error persists. It also persists when I throw out data at predictor levels that induce high std. errors, or when I exclude a third of the 0s in the data.

I’d be grateful for any suggestions.

  • Operating System: macOS 10.15.6
  • R version: 4.0.2
  • brms Version: 2.13.5
  • loo Version: 2.3.1

You don’t have much data, and I guess the hurdle part has complete separation. The priors in the code you showed seem quite wide, which, under complete separation, lets the probability of zero go to 0 or 1 within floating-point accuracy. You could first fit a logistic or probit regression just for zeros vs. non-zeros and see whether you run into problems.
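To illustrate what “0 or 1 within floating-point accuracy” means, here is a minimal base-R sketch. The logit value 40 is an arbitrary stand-in for a linear predictor that has drifted far out under separation. It is not taken from the model above, and this is not a claim about how brms or Stan evaluate the hurdle likelihood internally.

```r
# Arbitrary large value on the logit scale, chosen only for illustration.
eta <- 40

# Inverse logit: the probability rounds to exactly 1 in double precision.
p_zero <- plogis(eta)
p_zero == 1

# Taking the log of the complement naively then gives log(0) = -Inf ...
naive <- log(1 - p_zero)

# ... whereas computing the same quantity directly on the log scale stays finite.
stable <- plogis(eta, lower.tail = FALSE, log.p = TRUE)

c(naive = naive, stable = stable)
```

Once a probability has saturated like this, any step that recomputes log-likelihoods from shifted parameter values can run into non-finite numbers, even if the original draws all gave finite values.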

Thanks for the response! Alas, the problems persist even with very narrow priors, such as N(0, .1) and Exp(5) (which didn’t change the posterior much). But I tried the Bernoulli model on 0 vs. non-0 that you suggested, and it works fine, with no errors.
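For anyone wanting to repeat this check, a sketch of the Bernoulli model might look like the following. The column name IsZero, the helper names, and the normal(0, 1) prior are assumptions made for illustration; only the predictors are copied from the hurdle formula above.

```r
# Add a 0/1 indicator for zero counts (hypothetical column name: IsZero).
# It is coded so that 1 means "zero count", matching the direction of hu.
add_zero_indicator <- function(d) {
  d$IsZero <- as.integer(d$Count == 0)
  d
}

# Cross-tabulate predictor levels against zero vs. non-zero counts.
# An empty cell marks a level where only zeros or only non-zeros occur,
# which is a sign of possible separation.
zero_table <- function(d) table(d$ConstraintType, d$Count == 0)

# Fit only the zero vs. non-zero part. Requires brms; the prior is a placeholder.
fit_zero_model <- function(d) {
  brms::brm(IsZero ~ ConstraintType + L2 + (1 | Language),
            family = brms::bernoulli(),
            prior = brms::set_prior("normal(0, 1)", class = "b"),
            data = add_zero_indicator(d),
            chains = 4, cores = 4,
            control = list(adapt_delta = .99))
}
```

Running zero_table(d) before fit_zero_model(d) gives a quick view of where the zeros sit across ConstraintType levels.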

I guess for the counts I’d then fit a truncated Poisson (Count | trunc(lb = 1) ~) to the remaining count data (i.e. without the zeros). I tried that. The truncated model still triggered complaints about non-finite values (contradicting any(!is.finite(log_lik(m)))), but we really don’t have much data left there. But reloo() works with no NaN (and a modest std. err. on the elpd)!
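A rough sketch of that truncated model, for reference. The helper names are hypothetical, and reusing the predictors and priors from the hurdle model's count part is an assumption, since the exact specification is not shown here.

```r
# Keep only rows with at least one count, since trunc(lb = 1) excludes zeros.
keep_positive_counts <- function(d) d[d$Count > 0, , drop = FALSE]

# Fit a Poisson model truncated below at 1 to the non-zero counts.
# Requires brms; predictors and priors are assumed to match the hurdle model.
fit_truncated_counts <- function(d) {
  brms::brm(Count | trunc(lb = 1) ~ ConstraintType + L2 + (1 | Language),
            family = poisson(),
            prior = c(brms::set_prior("student_t(5, 0, 1)", class = "b"),
                      brms::set_prior("cauchy(0, 1)", class = "sd")),
            data = keep_positive_counts(d),
            chains = 4, cores = 4,
            control = list(adapt_delta = .99))
}
```

With fit <- fit_truncated_counts(d), the comparison described here is loo(fit, reloo = TRUE).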

EDIT: I was puzzled by the fact that the Bernoulli but not the hurdle models overcome the separation problem. So I double-checked brms’s prior specification of the hurdle model. It turns out that one needs to specify the priors for the hurdle separately. The model now allows moment-matching IS with super-strong priors on the hurdle:

c(prior(student_t(5, 0, 1), class = b),
  prior(normal(0, .1), class = b, dpar = hu),
  prior(exponential(1), class = sd),
  prior(exponential(6), class = sd, dpar = hu))
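The reason the earlier narrow priors had no effect on the hurdle is that, in brms, a prior given with class = b or class = sd and no dpar applies only to the main parameter, so the hu terms keep their defaults. One way to see this before fitting is get_prior(), sketched below. The helper name is hypothetical, and the exact columns and default priors shown may differ between brms versions.

```r
# List the prior rows that belong to the hurdle part (dpar == "hu").
# Requires brms; d is the data frame read from pconstr.csv.
hurdle_prior_rows <- function(d) {
  p <- brms::get_prior(
    brms::bf(Count ~ ConstraintType + L2 + (1 | Language),
             hu ~ ConstraintType + L2 + (1 | Language)),
    family = brms::hurdle_poisson(),
    data = d
  )
  # Rows with dpar == "hu" are the ones a prior without dpar does not reach.
  p[p$dpar == "hu", ]
}
```

For a model that has already been fitted, prior_summary(m) shows which priors were actually used for each parameter class.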

Thank you very much for your help!
