@avehtari, following up on the short discourse regarding CIs, just to make sure I see where the analytic “difficulty” is located:

Denote by \mathcal{C}_k the number of (integer) ranks that are less than or equal to k (integer), with 0 \leq k \leq L. For \ell (integer) with 0\leq \ell \leq N, and under the null hypothesis that the N ranks are i.i.d. uniform on the integers \{0,1,\dots, L\}, don’t we have:

\mathbb{P}[\mathcal{C}_k \leq \ell ] = I_{1-\frac{k+1}{L+1}}(N-\ell, 1+\ell)

where N is the number of synthetic datasets generated as part of the SBC algorithm and L is the number of (uncorrelated) posterior samples generated per synthetic dataset. I is the regularized incomplete beta function. This follows from the cumulative distribution function of a Binomial with number of trials equal to N and success probability equal to (k+1)/(L+1), evaluated at \ell. (For \ell = N the beta form is degenerate and the probability is simply 1.)
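As a quick sanity check of this identity, here is a minimal sketch in plain Python. The helper names (`binom_cdf`, `reg_inc_beta`, `max_identity_gap`) and the values N = 10, L = 4 are my own choices for illustration, and the incomplete beta function is approximated by Simpson's rule rather than taken from a library, so agreement is only up to integration error.

```python
from math import comb, exp, lgamma


def binom_cdf(ell, n, p):
    """P[C <= ell] for C ~ Binomial(n, p), by direct summation."""
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(ell + 1))


def reg_inc_beta(x, a, b, steps=2000):
    """I_x(a, b) via composite Simpson's rule (assumes integer a, b >= 1, even steps)."""
    h = x / steps

    def integrand(t):
        return t ** (a - 1) * (1 - t) ** (b - 1)

    total = integrand(0.0) + integrand(x)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(i * h)
    integral = total * h / 3
    # Normalise by the complete beta function B(a, b).
    log_beta = lgamma(a) + lgamma(b) - lgamma(a + b)
    return integral / exp(log_beta)


def max_identity_gap(N, L):
    """Largest |Binomial CDF - incomplete beta form| over all k and all ell < N."""
    gap = 0.0
    for k in range(L + 1):
        p = (k + 1) / (L + 1)
        for ell in range(N):  # ell = N is the degenerate case with probability 1
            lhs = binom_cdf(ell, N, p)
            rhs = reg_inc_beta(1 - p, N - ell, ell + 1)
            gap = max(gap, abs(lhs - rhs))
    return gap


print(max_identity_gap(N=10, L=4))
```

The printed gap should be tiny compared with the probabilities themselves if the identity holds as written.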

Now, the empirical cumulative distribution function (ECDF) at rank k, denoted by \mathcal{E}_k\in[0,1]\subseteq \mathbb{R}, equals \mathcal{C}_k/N, thus:

\mathbb{P}[\mathcal{E}_k\leq \frac{\ell}{N}] = \mathbb{P}[\mathcal{C}_k \leq \ell ] = I_{1-\frac{k+1}{L+1}}(N-\ell, 1+\ell)
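To convince myself of the ECDF statement, here is a small Monte Carlo sketch. Everything in it is an assumption for illustration: the function names, the values N = 20, L = 9, k = 3, \ell = 8, the number of replications, and the seed. It draws N uniform ranks on \{0,\dots,L\} many times and compares the observed frequency of \mathcal{E}_k \leq \ell/N with the Binomial CDF.

```python
import random
from math import comb


def binom_cdf(ell, n, p):
    """P[C <= ell] for C ~ Binomial(n, p), by direct summation."""
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(ell + 1))


def simulate_ecdf_prob(k, ell, N, L, reps=20000, seed=0):
    """Monte Carlo estimate of P[E_k <= ell / N] under uniform ranks on {0, ..., L}."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        ranks = [rng.randint(0, L) for _ in range(N)]  # randint includes both ends
        count = sum(r <= k for r in ranks)  # this is C_k
        # E_k <= ell / N is the same event as C_k <= ell, so compare integers.
        if count <= ell:
            hits += 1
    return hits / reps


N, L, k, ell = 20, 9, 3, 8
estimate = simulate_ecdf_prob(k, ell, N, L)
exact = binom_cdf(ell, N, (k + 1) / (L + 1))
print(estimate, exact)
```

The two printed numbers should agree up to Monte Carlo noise.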

In theory we could use this to get the intended CI, given the null hypothesis of N ranks uniformly distributed over the integers \{0, 1,\dots,L\}:

So, for each 0\leq k \leq L we need to find integers \ell_k \leq \ell_k' such that (say)

\mathbb{P}[\frac{\ell'_k}{N} < \mathcal{E}_k] = 0.05 \Leftrightarrow
\mathbb{P}[\frac{\ell'_k}{N}\geq \mathcal{E}_k] = 0.95

and

\mathbb{P}[\frac{\ell_k}{N}\geq \mathcal{E}_k] = 0.05

to get a 90\% CI. For each k, in the first case, couldn’t we start at \ell_k'=N and decrease \ell_k' by one at a time until we find that \mathbb{P}[\frac{\ell'_k}{N}\geq \mathcal{E}_k]\leq 0.95? Similarly, in the second case, we would start at \ell_k=0 and increase by one until the first time we have \mathbb{P}[\frac{\ell_k}{N}\geq \mathcal{E}_k] \geq 0.05?
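The search described above can be sketched as follows. This is only a literal transcription of the two stopping rules, treating each k separately; the names `binom_cdf` and `search_bounds` and the values N = 100, L = 9 are hypothetical choices of mine. Note that at k = L the ECDF equals 1 with certainty, so the rules are degenerate there and the sketch skips that point.

```python
from math import comb


def binom_cdf(ell, n, p):
    """P[C_k <= ell] = P[E_k <= ell / n] for C_k ~ Binomial(n, p)."""
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(ell + 1))


def search_bounds(k, N, L, lower_tail=0.05, upper_level=0.95):
    """Return (ell_k, ell_k') for one rank k using the stepwise search."""
    p = (k + 1) / (L + 1)

    # Upper bound: start at N and step down until P[E_k <= ell'/N] <= upper_level.
    upper = N
    while upper > 0 and binom_cdf(upper, N, p) > upper_level:
        upper -= 1

    # Lower bound: start at 0 and step up until P[E_k <= ell/N] >= lower_tail.
    lower = 0
    while lower < N and binom_cdf(lower, N, p) < lower_tail:
        lower += 1

    return lower, upper


N, L = 100, 9
for k in range(L):  # k = L is skipped: the ECDF is exactly 1 there
    lower, upper = search_bounds(k, N, L)
    print(k, lower / N, upper / N)
```

Each printed row gives k together with the lower and upper ECDF bounds \ell_k/N and \ell_k'/N. Because the Binomial is discrete, the tail probabilities will generally not hit 0.05 and 0.95 exactly, which is why the stopping rules use inequalities.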