"Better model" by marginal likelihood is worse by visual inspection

I have some count data with many zeros that I am trying to model using either a Zero-Inflated or a Hurdle model. I am comparing Poisson and Negative Binomial as the base distribution, and across the standard comparison methods (likelihood, marginal likelihood, LOO) the Negative Binomial models come out ahead. However, when I look at their parameter distributions, these models are not informative (they are indistinguishable from the null model). Also, when I plot their predictions, the predictive distributions have extremely long tails, with some predictions going 7 orders of magnitude higher than the data!

Visually, I would rather choose the Poisson models, as their parameter distributions are informative and the distributions of their predictions follow the data more closely. I don’t know if I’m misinterpreting model selection based on log likelihood, or if there’s some other diagnostic I am missing, either during fitting (no warnings were triggered) or after the models are fit.

Any comments would be appreciated!

If the Poisson model is underdispersed, the posterior can be too narrow, and what you call more informative can be a lie. Can you post some posterior and LOO predictive plots, similar to the ones I used in the Roaches case study and the Nabiximols case study? In the Roaches case study it is clear that the (zero-inflated) negative binomial is better than the Poisson models, but there, too, the tail of the posterior predictive distribution is long. This is likely to happen because there is not enough information about the tail shape beyond the largest observed count. If you know the maximum feasible count, you could just use a truncated distribution, or you could use a highly informative prior on the shape parameter to tell the model that the tail needs to go down faster than some rate you think is reasonable.
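A minimal brms sketch of both ideas; the data frame d, outcome y, predictor x, the gamma(10, 1) hyperparameters, and the upper bound of 1000 are all placeholders, not values from this thread:

```r
library(brms)

# Sketch A: an informative prior on the negative-binomial shape parameter.
# Larger shape values mean a shorter tail; gamma(10, 1) is only a placeholder.
fit_zinb_prior <- brm(
  y ~ x,
  family = zero_inflated_negbinomial(),
  prior = prior(gamma(10, 1), class = shape),
  data = d
)

# Sketch B: if a maximum feasible count is known (1000 is a placeholder),
# an upper-truncated distribution bounds the predictive tail directly.
# Truncation is not available for every brms family, so this uses plain
# negbinomial().
fit_nb_trunc <- brm(
  y | trunc(ub = 1000) ~ x,
  family = negbinomial(),
  data = d
)
```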

Thanks for your response! I ran the diagnostics using both pit_ecdf and loo_pit_overlay. Although pit_ecdf shows no big difference between distributions, loo_pit_overlay seems to show a big difference. What does this difference mean exactly?



It seems you are using an old version of brms. Add ndraws = 4000 to the call, i.e. pp_check(..., type = "pit_ecdf", ndraws = 4000), to get better plots.
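For reference, a minimal sketch of the calls I mean, assuming the two fits are named fit_zip and fit_zinb (placeholder names):

```r
library(brms)

# The PIT-ECDF check needs plenty of draws, so request them explicitly.
pp_check(fit_zip,  type = "pit_ecdf", ndraws = 4000)
pp_check(fit_zinb, type = "pit_ecdf", ndraws = 4000)
```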

Also, use the plotting code exactly as shown in those case studies, as loo_pit in bayesplot is not yet correct for discrete data (there is a dependency on another package, which makes it slower to fix).
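For discrete outcomes, a randomized LOO-PIT can be computed by hand along these lines. This is a sketch, not the exact code from the case studies; it assumes a brms fit named fit_zinb whose outcome column in the data is y:

```r
library(brms)
library(loo)
library(bayesplot)

loo_zinb <- loo(fit_zinb, save_psis = TRUE)
lw   <- weights(loo_zinb$psis_object)   # normalized log PSIS weights, draws x obs
yrep <- posterior_predict(fit_zinb)     # predictive draws, draws x obs
y    <- fit_zinb$data$y                 # assumes the outcome column is named y

set.seed(1)
pit <- vapply(seq_along(y), function(i) {
  w_i  <- exp(lw[, i])
  p_lt <- sum(w_i * (yrep[, i] <  y[i]))
  p_eq <- sum(w_i * (yrep[, i] == y[i]))
  p_lt + runif(1) * p_eq                # randomization handles the discreteness
}, numeric(1))

ppc_pit_ecdf(pit = pit)
```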

Please also post the loo() output for both models so that I can see the diagnostics.
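For the comparison itself, something like the following (placeholder fit names) also gives the elpd difference with its standard error:

```r
library(brms)

loo_zip  <- loo(fit_zip)
loo_zinb <- loo(fit_zinb)
loo_compare(loo_zip, loo_zinb)
```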

Thanks! In that case, the ppc_pit_ecdf plots with LOO look relatively similar in terms of performance:


And here are the loo outputs.

ZIP

Computed from 200000 by 20 log-likelihood matrix

         Estimate    SE
elpd_loo   -205.4  86.1
p_loo        92.7  50.2
looic       410.8 172.2
------
Monte Carlo SE of elpd_loo is 0.0.

Pareto k diagnostic values:
                         Count Pct.    Min. n_eff
(-Inf, 0.5]   (good)     15    75.0%   10324     
 (0.5, 0.7]   (ok)        5    25.0%   2351      
   (0.7, 1]   (bad)       0     0.0%   <NA>      
   (1, Inf)   (very bad)  0     0.0%   <NA>      

All Pareto k estimates are ok (k < 0.7).
See help('pareto-k-diagnostic') for details.

ZINB

Computed from 200000 by 20 log-likelihood matrix

         Estimate   SE
elpd_loo    -92.9 18.3
p_loo        14.9 10.4
looic       185.8 36.5
------
Monte Carlo SE of elpd_loo is 0.1.

Pareto k diagnostic values:
                         Count Pct.    Min. n_eff
(-Inf, 0.5]   (good)     12    60.0%   41667     
 (0.5, 0.7]   (ok)        8    40.0%   156       
   (0.7, 1]   (bad)       0     0.0%   <NA>      
   (1, Inf)   (very bad)  0     0.0%   <NA>      

All Pareto k estimates are ok (k < 0.7).
See help('pareto-k-diagnostic') for details.