I am comparing two models using the LOO implementation of the continuous ranked probability score (CRPS), and I want to make sure I am interpreting the results correctly. In most applied research the CRPS is > 0, with lower values indicating a better model; at least, most of the literature on the subject says as much. Yet both the example here and my results (Fit 1 = -0.304 and Fit 2 = -0.298) are negative.

Am I right to assume that these are the average differences between the CDF of the fitted model and the observations, and that they are on the same scale as the observations? In other words, technically Fit 2 is “better”, although given the SE of the estimate, they are practically the same.
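To make the question concrete, here is a minimal sketch of how I understand the sample-based CRPS for a single observation, using the textbook form E|X - y| - 0.5 E|X - X'|. This is not the loo code: the function name `crps_from_draws`, the simulated draws, and the observed value are all made up for illustration. The negated value at the end only reflects my guess that the reported scores are flipped so that higher is better, which is exactly the part I would like confirmed.

```python
import numpy as np

def crps_from_draws(draws, draws2, y):
    """Sample-based CRPS for one observation: E|X - y| - 0.5 * E|X - X'|.

    draws and draws2 are two independent sets of predictive draws
    (hypothetical inputs, not tied to any particular package).
    """
    draws = np.asarray(draws, dtype=float)
    draws2 = np.asarray(draws2, dtype=float)
    e_xy = np.mean(np.abs(draws - y))        # distance from draws to the observation
    e_xx = np.mean(np.abs(draws - draws2))   # spread of the predictive distribution
    return e_xy - 0.5 * e_xx

rng = np.random.default_rng(0)
y_obs = 0.3                                  # made-up observation
x1 = rng.normal(0.0, 1.0, size=4000)         # made-up predictive draws
x2 = rng.normal(0.0, 1.0, size=4000)

crps_pos = crps_from_draws(x1, x2, y_obs)    # usual orientation: lower is better
crps_neg = -crps_pos                         # ASSUMED flipped orientation: higher is better
print(crps_pos, crps_neg)
```

Under that assumption, -0.298 would be the better score because it is closer to zero, and the magnitude would be in the units of the outcome.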

Pinging @avehtari given his work with loo.