Best practices for Simulation Based Calibration with hierarchical models

SBC doesn’t require that the estimators are exact, just that they’re unbiased. Consequently the only challenge in implementing SBC for Markov chain Monte Carlo is removing the autocorrelations as much as possible. Increasing the number of samples just makes the SBC assessment more sensitive to potential problems.
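To make the role of autocorrelation concrete, here is a minimal sketch of the SBC loop with thinning. Everything in it is an illustrative assumption rather than anything from the paper: the model is a conjugate normal (prior N(0, 1), unit-variance normal observations) so that the posterior is known exactly, and `ar1_chain` is a hypothetical stand-in for correlated MCMC output, not a real sampler.

```python
import numpy as np

def ar1_chain(rng, mean, sd, rho, n_draws):
    # Stationary AR(1) sequence whose marginal is N(mean, sd^2).
    # Hypothetical stand-in for autocorrelated MCMC draws.
    innovations = rng.normal(0.0, sd * np.sqrt(1.0 - rho**2), size=n_draws)
    draws = np.empty(n_draws)
    draws[0] = rng.normal(mean, sd)
    for t in range(1, n_draws):
        draws[t] = mean + rho * (draws[t - 1] - mean) + innovations[t]
    return draws

def sbc_ranks(n_sims=500, n_obs=10, n_draws=990, thin=10, rho=0.9, seed=0):
    rng = np.random.default_rng(seed)
    ranks = np.empty(n_sims, dtype=int)
    for s in range(n_sims):
        # 1. Draw a "true" parameter from the prior.
        theta = rng.normal(0.0, 1.0)
        # 2. Simulate data conditional on that parameter.
        y = rng.normal(theta, 1.0, size=n_obs)
        # 3. Exact conjugate posterior for this toy model.
        post_var = 1.0 / (n_obs + 1.0)
        post_mean = y.sum() * post_var
        # 4. Correlated posterior draws, thinned to reduce autocorrelation.
        chain = ar1_chain(rng, post_mean, np.sqrt(post_var), rho, n_draws)
        thinned = chain[::thin]
        # 5. Rank of the prior draw among the thinned posterior draws.
        ranks[s] = np.sum(thinned < theta)
    n_thinned = len(range(0, n_draws, thin))
    return ranks, n_thinned

ranks, n_thinned = sbc_ranks()
```

With the defaults each chain is thinned to 99 draws, so the ranks take values in 0 to 99. Setting `thin=1` in this toy setup is one way to see how leftover autocorrelation shows up in the ranks.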

To be clear, SBC assesses the accuracy of a sampling procedure, and implicitly posterior expectation value estimation, within the context of a specified model, not the model itself. The method says nothing about whether the specified model is useful in any given application.

The problem is that there is no single deviation from uniformity. The rank histogram can deviate in many different ways, and each distinct deviation says something different about the nature of the problem. Even if a single summary/test is designed to capture interpretable deviations, such as those discussed in the paper, it will largely ignore other possible deviations.

Many uniformity tests, for example Kolmogorov-Smirnov, are based on statistics that don’t correspond to any particularly interpretable deviation in the SBC case, and hence aren’t all that useful for automated testing. We considered trying to come up with template-like tests to capture the smiles/frowns/tilts discussed in the paper but ultimately the rank histogram was the most information dense way of presenting the results.
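For what it's worth, here is a rough sketch of how one might tabulate a rank histogram together with a reference band for visual inspection. The function name `rank_histogram`, the bin count, and the normal approximation to the binomial band are all my own assumptions for illustration; the band is only a guide for the eye, not a formal test.

```python
import numpy as np

def rank_histogram(ranks, n_ranks, n_bins=20, z=2.576):
    # n_ranks is the number of possible rank values (posterior draws + 1).
    # Bins must divide the ranks evenly, otherwise uniform ranks would
    # not give uniform bin counts.
    if n_ranks % n_bins != 0:
        raise ValueError("n_bins must divide n_ranks evenly")
    width = n_ranks // n_bins
    counts = np.bincount(np.asarray(ranks) // width, minlength=n_bins)
    # Under uniformity each bin count is Binomial(n, 1 / n_bins).
    # Normal approximation to a roughly 99% band (z = 2.576).
    n = len(ranks)
    p = 1.0 / n_bins
    expected = n * p
    half_width = z * np.sqrt(n * p * (1.0 - p))
    return counts, expected - half_width, expected + half_width

# Perfectly uniform ranks: every bin sits exactly at the expected count.
uniform_ranks = np.tile(np.arange(100), 10)
counts, lower, upper = rank_histogram(uniform_ranks, n_ranks=100)
```

Plotting `counts` against the band makes smiles, frowns, and tilts visible at a glance, which a single test statistic would compress away.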

At the same time recall that even in high-dimensional models there are often only a few parameters/summaries of interest to the final application, and SBC is much more productive when those parameters are prioritized instead of trying to test every parameter at once.

In my opinion Wasserstein is most usefully interpreted as an integral probability metric that bounds differences in the expectation values of certain sets of functions instead of just differences of probabilities. See for example the discussion in https://betanalpha.github.io/assets/case_studies/markov_chain_monte_carlo.html#33_extra_credit:_theoretical_convergence.
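As a sketch of that contrast, and assuming we are talking about the 1-Wasserstein distance specifically, the Kantorovich-Rubinstein duality writes it as a supremum over 1-Lipschitz functions, whereas total variation is a supremum over probabilities of measurable sets:

$$
W_1(\mu, \nu) = \sup_{\|f\|_{\mathrm{Lip}} \le 1} \left| \mathbb{E}_{\mu}[f] - \mathbb{E}_{\nu}[f] \right|,
\qquad
\mathrm{TV}(\mu, \nu) = \sup_{A} \left| \mu(A) - \nu(A) \right|.
$$

So a small $W_1$ directly controls the error in the expectation value of any function with bounded Lipschitz constant, which is the kind of quantity we usually care about estimating.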
