Pointwise loo likelihood for binary classification

kcar · April 29, 2019, 8:18am

Hi all,
I am currently using Stan for a binary classification problem. To check the discriminatory power of my model, I want to use measures such as recall, precision, and the receiver operating characteristic (ROC) curve. Because of the size of the data set, I do not want to split it into a training and a validation set. Instead, I thought about using an approach similar to the one in Vehtari, Gelman, and Gabry's paper, in which Pareto smoothed importance sampling is used to calculate the cross-validation likelihood by:

$$p(y_i \mid y_{-i}) \approx \frac{\sum_{s=1}^{S} w_i^s \, p(y_i \mid \theta^s)}{\sum_{s=1}^{S} w_i^s}$$

Here w_i^s is the weight for observation i and posterior draw s, obtained by Pareto smoothing the raw importance sampling ratios.
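For readers who want to see the arithmetic, here is a minimal sketch of the self-normalized weighted average for a single observation. It assumes the log weights have already been Pareto smoothed elsewhere (for example by the loo package); the function name `loo_weighted_average` and the toy numbers are made up for illustration. Passing the per-draw likelihood of the observed outcome gives the cross-validation likelihood; passing p(y_i = 1 | theta^s) instead gives a LOO predictive probability for class 1.

```python
import math

def loo_weighted_average(values, log_weights):
    """Self-normalized importance-weighted average for one observation i.

    values[s]      : per-draw quantity, e.g. p(y_i | theta^s)
    log_weights[s] : log w_i^s, ASSUMED to be Pareto smoothed already
    """
    m = max(log_weights)  # subtract the max before exponentiating, for stability
    w = [math.exp(lw - m) for lw in log_weights]
    return sum(wi * vi for wi, vi in zip(w, values)) / sum(w)

# Toy inputs (invented for illustration): 4 posterior draws, one observation
probs = [0.9, 0.8, 0.6, 0.7]
log_w = [math.log(1.0), math.log(1.0), math.log(2.0), math.log(1.0)]

p_loo = loo_weighted_average(probs, log_w)
p_equal = loo_weighted_average(probs, [0.0, 0.0, 0.0, 0.0])  # equal weights: plain mean
```

With equal weights the result reduces to the ordinary posterior mean, so the weights are what shift the estimate toward the leave-one-out posterior.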
As input for recall and the other measures, I want to use the cross-validation likelihoods. However, I am not sure whether this is a good approach, given the variance and bias in the approximation. Does anyone know whether it is okay to use these approximations for these types of measures?
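To make the intended use concrete, the sketch below computes precision, recall, and the area under the ROC curve from a vector of LOO predictive probabilities. It is only an illustration: the 0.5 classification threshold, the function names, and the toy labels and probabilities are assumptions, and it says nothing about how much the importance sampling approximation error affects these measures.

```python
def precision_recall(y_true, p_loo, threshold=0.5):
    """Precision and recall after thresholding LOO probabilities (threshold is an assumption)."""
    pred = [1 if p >= threshold else 0 for p in p_loo]
    tp = sum(1 for y, yh in zip(y_true, pred) if y == 1 and yh == 1)
    fp = sum(1 for y, yh in zip(y_true, pred) if y == 0 and yh == 1)
    fn = sum(1 for y, yh in zip(y_true, pred) if y == 1 and yh == 0)
    precision = tp / (tp + fp) if tp + fp else float("nan")
    recall = tp / (tp + fn) if tp + fn else float("nan")
    return precision, recall

def roc_auc(y_true, p_loo):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [p for y, p in zip(y_true, p_loo) if y == 1]
    neg = [p for y, p in zip(y_true, p_loo) if y == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg)
    return wins / (len(pos) * len(neg))

# Toy labels and LOO probabilities (invented for illustration)
y = [1, 0, 1, 1, 0, 0]
p = [0.9, 0.2, 0.4, 0.7, 0.6, 0.1]

precision, recall = precision_recall(y, p)
auc = roc_auc(y, p)
```

The pairwise AUC is quadratic in the number of observations, which is fine for a sketch but not for large data sets.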
Kind regards,
Koen

avehtari · April 29, 2019, 3:07pm

Maybe this example helps?

kcar · May 1, 2019, 10:06am

Yes, thank you. I wanted to use something similar to subsection 4.3. I also like the addition of the qplot. Would you expect a high k-value when a data point is not on the diagonal line?

avehtari · May 1, 2019, 6:15pm

If by a data point you mean predictive probability vs. LOO predictive probability, then yes: when their difference is large, it is more likely that the corresponding khat is large.
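One way to inspect this relationship is to list, per observation, the gap between the two probabilities next to its khat. The sketch below assumes the three vectors have already been computed elsewhere; the function name `rank_by_gap`, the toy numbers, and the 0.7 cutoff (a commonly cited khat threshold, treated here as an assumption) are illustrative only.

```python
def rank_by_gap(p_post, p_loo, khat, k_threshold=0.7):
    """Sort observations by |posterior prob - LOO prob|, largest gap first."""
    rows = []
    for i, (pp, pl, k) in enumerate(zip(p_post, p_loo, khat)):
        rows.append({
            "i": i,
            "abs_diff": abs(pp - pl),    # distance from the diagonal in the scatter plot
            "khat": k,
            "high_khat": k > k_threshold,  # cutoff is an assumption, see above
        })
    return sorted(rows, key=lambda r: r["abs_diff"], reverse=True)

# Toy values (invented for illustration)
p_post = [0.80, 0.30, 0.95]
p_loo = [0.78, 0.31, 0.60]
khat = [0.1, 0.2, 0.9]

rows = rank_by_gap(p_post, p_loo, khat)
for r in rows:
    print(r["i"], round(r["abs_diff"], 3), r["khat"], r["high_khat"])
```

In this toy input the observation farthest from the diagonal is also the one with the large khat, which is the pattern described above, though a large gap makes a large khat more likely rather than certain.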
