LOO Continuous Ranked Probability Score

I am comparing two models using the LOO implementation of the continuous ranked probability score (CRPS), and I just want to make sure I am interpreting the results correctly. In most applied research the CRPS is > 0, with lower values indicating a better model; at least most of the literature on the subject seems to say as much. Yet both the example here and my results (Fit 1 = -0.304 and Fit 2 = -0.298) are negative.

Am I right to assume that these are the average differences between the CDF of the fitted model and the observations, and that they are on the same scale as the observations? In other words, Fit 2 is technically “better”, although given the SE of the estimates they are practically the same.

Pinging @avehtari given his work with loo.

We have followed the convention in arXiv:1912.05642 (Local scale invariance and robustness of proper scoring rules), with higher values being better, since we have also used the log score defined as log(p(.)). At the moment we are missing loo_compare() support for CRPS, which would compute the correct SE for the comparison (currently the SE reported is for each estimate separately). There is an open issue for this (loo_compare for crps and loo_crps · Issue #220 · stan-dev/loo on GitHub), but no one has had time to code it yet. If we see more requests for this, we will eventually prioritize it.
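Until then, here is a minimal sketch of how you could compute a paired difference and its SE yourself from the per-observation scores. It assumes `crps1` and `crps2` are the objects returned by `loo_crps()` for Fit 1 and Fit 2 on the same observations, and that each exposes the per-observation values in `$pointwise` (adjust the names if your loo version differs):

```r
# Sketch (not an official loo API): paired comparison of two models'
# LOO-CRPS estimates from their per-observation scores.
pw1 <- as.numeric(crps1$pointwise)
pw2 <- as.numeric(crps2$pointwise)

# Positive differences favor Fit 2 (higher is better in loo's convention).
diff_pointwise <- pw2 - pw1
n <- length(diff_pointwise)

crps_diff    <- mean(diff_pointwise)          # mean difference, same scale as the reported estimates
crps_diff_se <- sd(diff_pointwise) / sqrt(n)  # SE of the paired difference

c(diff = crps_diff, se = crps_diff_se)
```

This mirrors what loo_compare() does for elpd, where the comparison SE is computed from the paired per-observation differences rather than from the two separate SEs.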
