Model comparison when Pareto k is high?

Hi everyone!

I’m modeling some data with a joint model for survival and longitudinal outcomes. The longitudinal data have a hierarchical structure with two levels of random effects: each patient has at least one metastasis, and each metastasis is observed at least once.

I have several models that I would like to compare.

I’ve used the loo package with both leave-one-patient-out and leave-one-measurement-out, but both scenarios lead to high Pareto k values.
As I understand it, this doesn’t necessarily mean that the model is wrong, but that the importance sampling part can’t be trusted.
WAIC is also failing.

So I wanted to use the kfold function, but I get the impression it only works with rstanarm models and not with rstan models. Am I wrong? Or is there a workaround to use the kfold function (or K-fold cross-validation more generally) with rstan?

Otherwise, is there another way to compare models?

Any help would be appreciated!

Thank you

-S

The loo package has a vignette, “Holdout validation and K-fold cross-validation of Stan programs with the loo package”, which shows how to do K-fold CV with RStan.
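Following the approach in that vignette, here is a minimal sketch of grouped K-fold CV with RStan. It assumes your Stan program takes a `holdout` indicator in the data, skips held-out points in the likelihood, and computes `log_lik` for all points in generated quantities; `stan_program`, `dat`, and `patient_id` are illustrative names, not from your post:

```r
library(rstan)
library(loo)

K <- 10
# Assign each *patient* (not each measurement) to a fold, so that all
# measurements from one patient are held out together.
fold <- loo::kfold_split_grouped(K = K, x = dat$patient_id)

sm <- stan_model(model_code = stan_program)  # compile once, reuse K times
elpd_per_obs <- numeric(length(fold))

for (k in seq_len(K)) {
  dat_k <- dat
  dat_k$holdout <- as.integer(fold == k)  # model must exclude these from the likelihood
  fit_k <- sampling(sm, data = dat_k)

  # draws x observations matrix of pointwise log-likelihoods
  ll <- extract_log_lik(fit_k)
  ll_heldout <- ll[, fold == k, drop = FALSE]

  # elpd for each held-out observation: log of the posterior mean likelihood
  elpd_per_obs[fold == k] <- apply(
    ll_heldout, 2,
    function(x) matrixStats::logSumExp(x) - log(length(x))
  )
}

elpd_kfold <- sum(elpd_per_obs)
```

The grouped split is the part that matches your leave-one-patient-out setting; for leave-one-measurement-out you could use `loo::kfold_split_random()` instead.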

Based on this, it is likely that your model is flexible and the posterior changes a lot when one observation or one patient is removed, which can explain the high Pareto k values. If you tell us more about your model and post the model code and the loo output, I may be able to comment further.

WAIC always fails before PSIS-LOO fails.

Thank you!
I was using
fit <- stan(model_code = stan_program, data = dat, ...)
instead of compiling the model once and calling
fit <- sampling(stanmodel, data = dat, ...)
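For completeness, the compile-once pattern looks like this (a sketch; `stan_program` and `dat` are the objects from the original post). stan() recompiles the C++ model on every call, whereas stan_model() compiles once and sampling() reuses the compiled model, which matters when refitting K times for cross-validation:

```r
library(rstan)

stanmodel <- stan_model(model_code = stan_program)  # compile once
fit <- sampling(stanmodel, data = dat, chains = 4)  # refit cheaply per fold
```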