I have been working on a project analysing routinely collected outcomes data from clients receiving opioid agonist treatment (methadone, buprenorphine) in Australian public drug and alcohol clinics. I am examining the effect of level of amphetamine use (in days used) in the 28 days prior to starting treatment on substance use and quality of life in the first year of treatment. I have been using brms, following the workflow developed by @avehtari (for which I am eternally grateful, see here), and I have been posting on the forum quite a lot (here and here and here and here).
In that workflow Professor Vehtari explains how you can compare a Gaussian regression, where the outcome is treated as numeric, to a binomial or beta-binomial regression, where the outcome is treated as a (discrete) bounded count, using LOO-CV, as long as the width of the interval around each possible count is 1 (because then the posterior predictive probability and the posterior predictive density are equivalent). If you compare two models where the outcome is expressed in different functional forms, numeric vs. count, the loo_compare() function issues a warning:
Warning message:
Not all models have the same y variable. ('yhash' attributes do not match)
But Professor Vehtari explains why, in this special case, this warning can be ignored and you can legitimately make inferences about which model explains the data better (obviously with lots of posterior and prior predictive checking).
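If I have understood the argument correctly, the key point is that for an integer-valued outcome the continuous model's predictive density at an observed value approximates the predictive probability of the unit-width interval around it (my notation, not Professor Vehtari's):

$$
\Pr(y = k) \;\approx\; \int_{k-\frac{1}{2}}^{\,k+\frac{1}{2}} p(\tilde y \mid \text{data}) \, d\tilde y \;\approx\; p(k \mid \text{data}) \times 1,
$$

so the log predictive densities from the continuous model and the log predictive probabilities from the discrete model end up on a comparable scale, which is what makes the elpd comparison meaningful.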
Using the same dataset I have been analysing quality-of-life outcomes, examining whether amphetamine use at the start of treatment affects the trajectory of quality of life over the first year of treatment. This time the outcome is a score from 0-10, where a higher score indicates better psychological health. Responses are integers (i.e. people cannot indicate fractions between whole numbers). I was wondering whether, using the same logic as laid out in Professor Vehtari's notebook, one could compare a Gaussian model, where the outcome is treated as a number, to a cumulative link model, where the outcome is treated as an ordinal categorical variable (with 1 added to each score so that the response scale runs from 1-11 rather than 0-10)?
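For concreteness, the two model specifications look roughly like this (a minimal sketch: the data frame and column names, e.g. `psych_score`, `amph_days`, `time`, `client_id`, are placeholders, and the actual formula in my models includes more terms):

```r
library(brms)

# Gaussian model: outcome treated as numeric
fit_gaussian_psych <- brm(
  psych_score ~ amph_days * time + (1 | client_id),
  family = gaussian(),
  data = d
)

# Cumulative-link (ordinal) model: outcome shifted from 0-10 to 1-11
# and treated as an ordered categorical variable
d$psych_ord <- d$psych_score + 1
fit_ordinal_psych <- brm(
  psych_ord ~ amph_days * time + (1 | client_id),
  family = cumulative("probit"),
  data = d
)

# Compare approximate out-of-sample predictive performance
loo_compare(loo(fit_gaussian_psych), loo(fit_ordinal_psych))
```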
I ran the two models in brms, compared them using loo_compare(), and got the following output.
elpd_diff se_diff
fit_ordinal_psych 0.0 0.0
fit_gaussian_psych -103.4 13.1
This was accompanied by the same warning I listed above, that the models do not have the same y variable. It looks like the ordinal model performs better on this metric than the Gaussian.
Can I ignore the warning?