Thanks for posting the questions. It seems we should clarify the documentation a bit regarding convergence diagnostics and loo.
See also the loo-glossary help page in the loo package.
From the glossary:
- If p_loo > p, then the model is likely to be badly misspecified. If the number of parameters p << N, then PPCs are also likely to detect the problem. See the case study at Roaches cross-validation demo for an example. If p is relatively large compared to the number of observations, say p > N/5 (more accurately we should count the number of observations influencing each parameter, as in hierarchical models some groups may have few observations and other groups many), it is possible that PPCs won't detect the problem.
You have p_loo=285 > p=262.
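As a concrete sketch, this is how p_loo can be read off a loo object in R (log_lik here is a hypothetical draws-by-observations matrix of pointwise log-likelihoods):

```r
library(loo)

# log_lik: S x N matrix of pointwise log-likelihood draws (hypothetical name)
loo_fit <- loo(log_lik)

# The estimates table has rows elpd_loo, p_loo, and looic
print(loo_fit$estimates)
p_loo <- loo_fit$estimates["p_loo", "Estimate"]

# Compare the effective number of parameters to the actual count
p <- 262   # total number of parameters in your model
p_loo > p  # TRUE in your case, which triggers the criterion above
```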
From the glossary:
- If k > 0.7, then importance sampling is not able to provide a useful estimate for that component/observation. Pareto k is also useful as a measure of the influence of an observation. Highly influential observations have high k values. Very high k values often indicate model misspecification, outliers, or mistakes in data processing. See Section 6 of Gabry et al. (2019) for an example.
You have several k > 0.7, that is, importance sampling is failing because the full posterior and the leave-one-out posteriors are too different.
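To see which observations are failing, the loo package has helpers for exactly this (a sketch, reusing the hypothetical loo_fit from above):

```r
# Counts of observations in each Pareto k range
pareto_k_table(loo_fit)

# Indices of the observations with k > 0.7
bad <- pareto_k_ids(loo_fit, threshold = 0.7)
print(bad)

# Plot all k values; points above 0.7 are where PSIS fails
plot(loo_fit)
```

Inspecting the data rows listed in `bad` is often the fastest way to spot outliers or data-processing mistakes.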
It is likely that the problem is now mostly in importance sampling and not in MCMC.
That applies mostly to MCMC, making it more likely that the Rhat and n_eff computations are reliable. You could compute Rhats and n_eff's for exp(log_lik) (see the loo function relative_eff) if you think you have a problem with MCMC sampling.
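A sketch of that check, assuming log_lik_arr is a hypothetical iterations x chains x observations array of log-likelihood draws and a recent rstan for the Rhat function:

```r
library(loo)

# Relative efficiency of exp(log_lik) for each observation;
# values far below 1 indicate poor mixing for that term
r_eff <- relative_eff(exp(log_lik_arr))

# Pass r_eff to loo so the PSIS diagnostics account for it
loo_fit <- loo(log_lik_arr, r_eff = r_eff)

# Rhat for a single observation's exp(log_lik), e.g. the first one
rstan::Rhat(exp(log_lik_arr[, , 1]))
```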
Before using loo, it is recommended that you check that sampling works with Rhat, n_eff, divergences, E-BFMI, etc. loo itself checks only the combined n_eff and Pareto khat, but if the combined n_eff's are large and the Pareto k's are small, there is no need to check Rhat for each exp(log_lik) separately.
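For completeness, a sketch of those pre-loo checks for an rstan fit (fit is a hypothetical stanfit object; adapt to your interface):

```r
library(rstan)

# Divergences, treedepth, and E-BFMI in one call
check_hmc_diagnostics(fit)

# Worst-case Rhat and n_eff across all parameters
s <- summary(fit)$summary
max(s[, "Rhat"], na.rm = TRUE)
min(s[, "n_eff"], na.rm = TRUE)
```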
For discrete models, elpd_loo can be interpreted as log probabilities. For continuous models, elpd_loo can be compared to a baseline model. A large SE indicates problems. If the Monte Carlo SE of elpd_loo is NA, then the result is very unreliable.
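The comparison to a baseline is easiest with loo_compare (a sketch; loo_baseline is a hypothetical loo object for the simpler model):

```r
# Differences in elpd_loo and their standard errors
loo_compare(loo_fit, loo_baseline)
```

Look at elpd_diff together with se_diff rather than the raw elpd_loo values.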
It can be used and it works sometimes, but since you have a latent variable model with n latent variables, it seems that in your case you would need to marginalize over the latent variables in order to get a reliable result. If you are using the latent variables just to add overdispersion, consider using an overdispersed observation model instead.
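Not knowing your exact model, here is one generic illustration of that swap using brms (the formula, data, and variable names are all hypothetical):

```r
library(brms)

# Latent-variable overdispersion: one normal latent per observation
fit_latent <- brm(y ~ x + (1 | obs_id), data = d, family = poisson())

# Marginalized alternative: a negative binomial observation model
# integrates the extra variation into the likelihood, so the
# pointwise log_lik (and hence loo) behaves much better
fit_nb <- brm(y ~ x, data = d, family = negbinomial())
```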
I’m not familiar with blavaan.
Yes.