Could SBC rank uniformity be broken by an unstable equilibrium point in the generator?

There are two kinds of justification for SBC rank uniformity:

  • theoretical: a measure-theoretic argument that excludes computational/numerical considerations (integration and optimization error)
  • experimental: mostly with multivariate normal models

I was curious whether the generator’s nonlinearity can impede the diagnostic (e.g. in an ODE model). For instance, would rank uniformity still hold for a tight prior near an unstable equilibrium point (as in fig.10-6)? Could the diverging force in such a one-directional process undermine SBC’s self-consistency principle, namely that prior and posterior draws are exchangeable conditional on the data? Or would SBC be free from these dynamic forces because there exists a legitimate "screenshot" statistical model which cancels out the diverging effect? By screenshot, I mean a version of the statistical model, used for both data generation and parameter estimation, that ideally aggregates over a time window of a specific length in the process. For instance, we can imagine some representative statistical model for every five periods in fig.10-7, where the diverging effect is relatively less prominent.

To be honest, this post was motivated by the fact that nonlinearity can break outcome rank uniformity in the Polya urn model. In the linear model, where the probability of adding a ball of a given color in each run is proportional to the current proportion of that color, the ranks are uniform, as in fig.10-3; the nonlinear Polya process disrupts this uniformity, as in fig.10-7. Aside from numerical issues from approximate integration and optimization (i.e. stuck or inconsistent evaluations of the target density or its gradient), I am curious whether there are deeper mechanisms that forbid the \theta, \theta' symmetry in \pi(\theta, y, \theta') from this post, which delves into the SBC proof. The figures are from chp.10, Path Dependence, of Business Dynamics.

Note.1. A similar question was asked in Simulation Based Calibration (SBC) for non-linear model. I think it is worth revisiting in the specific context of the Polya process, and in light of the philosophy that iterating between simulation-based calibration and calibration-based simulation approximates the joint space of the simulate and calibrate functions.

Note.2. I wonder whether this question can be sidestepped by considering verification procedures specific to stochastic processes, e.g. this paper on SBC for Gaussian processes. There are two pieces to this question: i) does an unstable equilibrium point affect their method, and ii) if not, could we view their model as correctly calibrated? (statically calibrated, but perhaps not dynamically calibrated)


The rank uniformity of SBC follows from well-defined assumptions – the models used to simulate data and construct a posterior distribution have to be consistent with each other and the posterior samples have to be exact.
Ideally the models are not only consistent with each other but also accurately represent the data generating process of applied interest, but that is not technically necessary.
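
As a sketch of where those two assumptions enter, using the \pi(\theta, y, \theta') notation from the question: let \theta be the prior draw, y the data simulated from it, and \theta' a posterior draw given y. Then

```latex
\begin{align*}
\pi(\theta, y, \theta')
  &= \pi(\theta' \mid y)\, \pi(y \mid \theta)\, \pi(\theta) \\
  &= \pi(\theta' \mid y)\, \pi(\theta \mid y)\, \pi(y).
\end{align*}
```

The second line is Bayes' theorem, and it applies only if the posterior is built from the same \pi(y \mid \theta) and \pi(\theta) that generated the simulation (consistency) and \theta' really is a draw from \pi(\theta' \mid y) (exactness). The result is symmetric in \theta and \theta', so conditional on y the prior draw and the posterior draws are exchangeable, which is what makes the rank uniform. Nothing in this step depends on whether \pi(y \mid \theta) comes from linear or nonlinear dynamics.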

In theory chaotic dynamics are not a problem. For example an exact solver could explore all of the many possible final states as the initial states are varied across different simulations; in the Polya urn case it doesn’t matter if forward simulations of the dynamics get stuck in certain modes so long as repeated simulations get stuck in all of the different modes, and do so with the right proportions. Alternatively a probabilistic solver could quantify the distribution of final states, unimodal or multimodal, and then simulated data could be generated by sampling from that distribution. Provided that the forward process can be simulated exactly then the prior predictive simulations will match the data generating process.
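
To make the "all of the different modes, with the right proportions" point concrete, here is a minimal sketch of repeated Polya urn simulations. The specific nonlinear rule (choice probabilities proportional to a power of the current proportions), the starting urn of one red and one blue ball, and all function names are assumptions for illustration, not taken from the book or the question.

```python
import numpy as np

def polya_final_fraction(n_steps, power, rng):
    """Run one urn and return the final fraction of red balls.

    power=1 is the linear Polya process; power>1 is a hypothetical
    nonlinear variant in which the current majority is favoured.
    """
    red, total = 1, 2  # assumed start: one red ball, one blue ball
    for _ in range(n_steps):
        p = red / total
        w_red, w_blue = p ** power, (1.0 - p) ** power
        if rng.random() < w_red / (w_red + w_blue):
            red += 1
        total += 1
    return red / total

def simulate_many(n_runs=2000, n_steps=200, power=1.0, seed=1):
    """Repeat the urn simulation and collect the final red fractions."""
    rng = np.random.default_rng(seed)
    return np.array([polya_final_fraction(n_steps, power, rng)
                     for _ in range(n_runs)])

linear = simulate_many(power=1.0)
nonlinear = simulate_many(power=3.0)

# Each nonlinear run locks in near 0 or near 1, but across runs both
# modes are visited, roughly half of the time each by symmetry.
print("linear mean fraction:", linear.mean())
print("nonlinear share ending red-dominated:", (nonlinear > 0.5).mean())
print("nonlinear mean distance from 0.5:", np.abs(nonlinear - 0.5).mean())
```

A single nonlinear run looks stuck, but the ensemble of runs still represents the full distribution of final states, and that ensemble is what the prior predictive simulations need.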

Unfortunately in practice we do not enjoy those kinds of exact solvers. If the numerical solvers used in practice are not able to reach all of the relevant final states then the prior predictive simulations will be biased away from the desired data generating process.

How this affects SBC depends on what happens with the posterior samples. For example if the posterior samples are exact then there will be a mismatch between the prior predictive simulations and the posterior samples, and this will manifest as non-uniformity in the ranks. Note that in general the non-uniformity in the ranks may not be resolvable without a large number of simulated ranks.
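
The last caveat can be illustrated with a small calculation. If the ranks are binned into K bins and the N simulated ranks fall into them with probabilities p_i, the expected Pearson chi-square statistic against uniformity is K * sum(p_i * (1 - p_i)) + N * K * sum((p_i - 1/K)^2). The sketch below evaluates this for a hypothetical mild tilt in the bin probabilities; the tilt size and the function name are assumptions chosen only for illustration.

```python
import numpy as np

def expected_chi2(probs, n_ranks):
    """Expected Pearson chi-square statistic for multinomial bin counts
    tested against a uniform distribution over the bins."""
    probs = np.asarray(probs, dtype=float)
    k = probs.size
    noise = k * np.sum(probs * (1.0 - probs))               # equals k - 1 when uniform
    signal = n_ranks * k * np.sum((probs - 1.0 / k) ** 2)   # grows linearly with n_ranks
    return float(noise + signal)

k = 10
tilt = np.linspace(-0.01, 0.01, k)  # hypothetical mild skew in the rank histogram
probs = 1.0 / k + tilt

for n_ranks in (100, 1_000, 10_000):
    print(n_ranks, expected_chi2(probs, n_ranks))
```

Under exact uniformity the expected statistic is K - 1 regardless of N. With this tilt it barely moves from that value at N = 100 and only separates clearly once N is in the thousands, so a short SBC run can look uniform even when the simulations and the posterior samples disagree.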

In most cases, however, chaos in the forward model will bias not only the implementation of the prior predictive simulations but also the posterior samples. For example in Stan we would need to integrate the forward dynamics in order to evaluate the likelihood function and its gradient, and inaccurate integration would lead to inaccurate or even inconsistent evaluations which can bias Markov chain Monte Carlo sampling. What happens to SBC depends on how compatible this bias is with the prior predictive bias!

Typically the two biases will be different. In this case neither the prior predictive simulations nor the posterior samples will be consistent with the data generating process, but more importantly they will not be consistent with each other, and the SBC ranks will be non-uniform (although again the non-uniformity might not be detectable).

In theory, however, it is possible that the biases can be consistent with each other. In other words while neither the prior predictive simulations nor the posterior samples are consistent with the desired data generating process they can be consistent with some other, biased data generating processes. In this case the SBC ranks will be uniform!
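
Here is a toy sketch of both situations, under assumptions chosen so that everything is available in closed form. The dynamics are dx/dt = x, which has an unstable equilibrium at x = 0, so x(T) = a * x0 with a = exp(T) exactly and a = (1 + T/n)^n under n explicit Euler steps. The prior is x0 ~ normal(0, tau) and a single observation is y ~ normal(a * x0, sigma), which makes the posterior conjugate. The coarse Euler solver stands in for "an inaccurate solver"; all names and numbers are hypothetical.

```python
import numpy as np

def growth_factor(t_end, n_steps=None):
    """Factor a in x(t_end) = a * x0 for dx/dt = x."""
    if n_steps is None:
        return float(np.exp(t_end))                    # exact solution
    return float((1.0 + t_end / n_steps) ** n_steps)   # explicit Euler

def sbc_ranks(a_sim, a_fit, n_sims=2000, n_draws=99, tau=0.1, sigma=0.5, seed=2):
    """SBC ranks when data are simulated with a_sim and the posterior uses a_fit."""
    rng = np.random.default_rng(seed)
    ranks = np.empty(n_sims, dtype=int)
    for s in range(n_sims):
        x0 = rng.normal(0.0, tau)              # tight prior near the equilibrium
        y = rng.normal(a_sim * x0, sigma)      # prior predictive simulation
        post_var = 1.0 / (1.0 / tau**2 + a_fit**2 / sigma**2)
        post_mean = post_var * a_fit * y / sigma**2
        draws = rng.normal(post_mean, np.sqrt(post_var), size=n_draws)
        ranks[s] = np.sum(draws < x0)          # rank of the prior draw, 0..n_draws
    return ranks

def chi2_uniform(ranks, n_draws=99, n_bins=10):
    """Pearson chi-square statistic of the binned ranks against uniformity."""
    counts, _ = np.histogram(ranks, bins=n_bins, range=(0, n_draws + 1))
    expected = len(ranks) / n_bins
    return float(np.sum((counts - expected) ** 2 / expected))

a_exact = growth_factor(3.0)
a_euler = growth_factor(3.0, n_steps=3)  # deliberately coarse solver

results = {
    "exact sim, exact fit": chi2_uniform(sbc_ranks(a_exact, a_exact)),
    "euler sim, exact fit": chi2_uniform(sbc_ranks(a_euler, a_exact)),
    "euler sim, euler fit": chi2_uniform(sbc_ranks(a_euler, a_euler)),
}
for label, stat in results.items():
    print(label, stat)
```

With 10 bins the statistic should sit near 9 when the ranks are uniform. The mismatched case should be far larger, while the case where both sides share the same Euler bias should look just as uniform as the exact case, even though neither side matches the true dynamics.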

Now it’s not clear how unlikely this is, but this kind of conspiratorial behavior is technically possible. This doesn’t invalidate SBC because all of the claims of SBC still hold conditioned on the assumptions. Critically uniformity of SBC ranks doesn’t require that the prior predictive simulations and posterior samples are consistent with a given data generating process, it just requires that they are consistent with each other. This isn’t really a failure of SBC but rather a failure in how SBC is often misinterpreted.

It’s easy to take the accuracy of the prior predictive simulations for granted, but to use SBC robustly one has to be careful to check that accuracy independently of the SBC ranks.