A quick note what I infer from p_loo and Pareto k values

schen5 · March 21, 2018, 4:45pm

Thank you for the suggestions!

The data has 5 groups in total, with sample size ranging from 552 to 1005 for each (552, 624, 770, 983, 1005).

There are indeed some outliers, which I would expect to be hard to predict from the rest of the data. I am not sure why they are so influential on the posterior, though. Would you suggest using more informative priors (only for the partial-pooling model)?

The posterior predictive check looks okay (I think): some outliers were not predicted accurately, and no clear difference was visible between partial pooling and total pooling (although it is hard to tell from a visualization of ~4000 data points… that’s why I am turning to LOO-CV to quantify the difference).
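To make the LOO-CV comparison concrete, here is a minimal sketch of how the difference between two models could be quantified from pointwise ELPD values: the total difference is the sum of the pointwise differences, and its standard error is sqrt(n) times their sample standard deviation. The function name `elpd_diff_summary` and the example numbers are made up for illustration; in practice the pointwise values would come from the loo output of each model.

```python
import math
import statistics


def elpd_diff_summary(elpd_a, elpd_b):
    """Return (sum of pointwise ELPD differences, standard error of that sum).

    elpd_a and elpd_b hold one pointwise ELPD value per observation,
    in the same observation order for both models.
    """
    if len(elpd_a) != len(elpd_b):
        raise ValueError("both models need one pointwise value per observation")
    diffs = [a - b for a, b in zip(elpd_a, elpd_b)]
    n = len(diffs)
    total = sum(diffs)
    # SE of a sum of n terms: sqrt(n) times the sample standard deviation.
    se = math.sqrt(n) * statistics.stdev(diffs)
    return total, se


# Hypothetical pointwise values, for illustration only.
partial_pooling = [-1.2, -0.8, -3.5, -1.0]
total_pooling = [-1.3, -0.9, -3.4, -1.4]
diff, se = elpd_diff_summary(partial_pooling, total_pooling)
print(f"elpd_diff = {diff:.3f}, se = {se:.3f}")
```

A difference that is small relative to its standard error would be consistent with the two models predicting about equally well, although the estimate itself is less trustworthy when many Pareto k values are high.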

The visualization paper was very helpful! I am new to LOO-PIT and not totally sure if I am doing it correctly, but my plot looks like this (top two are total pooling, bottom two are partial pooling):

and pointwise differences in ELPD:

and k_hat (left: total pooling, right: partial pooling; the last group, 3300-th to 4000-th, is more heterogeneous):
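Since the k_hat plot is hard to read by eye for ~4000 points, a per-group tally may help show whether the high values really concentrate in one group. The sketch below is only an illustration: the helper name `khat_summary_by_group` is hypothetical, the example values are made up, and the 0.7 cutoff is the conventional threshold above which the Pareto-smoothed estimate is usually considered unreliable.

```python
from collections import defaultdict


def khat_summary_by_group(k_hat, groups, threshold=0.7):
    """Count observations and high-k_hat observations per group.

    k_hat  : one Pareto k estimate per observation
    groups : group label of each observation, same order as k_hat
    """
    if len(k_hat) != len(groups):
        raise ValueError("k_hat and groups must have the same length")
    summary = defaultdict(lambda: {"n": 0, "n_high": 0, "max_k": float("-inf")})
    for k, g in zip(k_hat, groups):
        s = summary[g]
        s["n"] += 1
        if k > threshold:
            s["n_high"] += 1
        s["max_k"] = max(s["max_k"], k)
    return dict(summary)


# Hypothetical values, for illustration only.
k_hat = [0.1, 0.3, 0.75, 0.2, 0.9, 1.1]
groups = [1, 1, 1, 2, 2, 2]
summary = khat_summary_by_group(k_hat, groups)
for g, s in sorted(summary.items()):
    print(g, s["n"], s["n_high"], round(s["max_k"], 2))
```

Running the same tally on both models would show whether the same observations are flagged under total pooling and partial pooling.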

Any insights would be appreciated. Thank you so much again!

(On the other hand, I am not 100% sure if the hierarchical model I am specifying is correct. I might start another thread for this model specification soon… here)
