I’m trying to follow the SBC example in the Stan User’s Guide, but I have some questions.
Section 25.3.3 says:
K is the number of parameters
N is the total number of simulated data sets and fits
M is the number of posterior draws per simulated data set
But section 25.4.1 is confusing. Shouldn’t I have one rank for each parameter in each simulation, that is, r_{n,k} rather than r_{n,m}?
So to make it consistent with the previous page, shouldn’t this
    Inputs: M draws, J bins, N parameters, ranks r[n, m]
    b[1:J] = 0
    for (m in 1:M)
      ++b[1 + floor(r[n, m] * J / (M + 1))]
be as follows
    Inputs: N simulations, J bins, K parameters, ranks r[n, k]
    b[1:J, 1:K] = 0
    for (k in 1:K)
      for (n in 1:N)
        ++b[1 + floor(r[n, k] * J / (N + 1)), k]
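For anyone who wants to experiment with the per-parameter binning proposed above, here is a minimal Python sketch. It is not code from the User's Guide or from rstan; the names `bin_ranks`, `num_draws`, and `num_bins` are made up for illustration. It assumes each rank r[n, k] counts how many of the M posterior draws fall below the simulated value, so ranks lie in 0..M. Under that assumption the divisor is M + 1 rather than N + 1, which keeps the largest rank inside the last bin.

```python
import numpy as np

def bin_ranks(ranks, num_draws, num_bins):
    """Count SBC ranks into bins, separately for each parameter.

    ranks:     integer array of shape (N, K); ranks[n, k] is assumed
               to lie in 0..num_draws (one rank per simulation and parameter).
    num_draws: M, the number of posterior draws used to compute each rank.
    num_bins:  J, the number of histogram bins.

    Returns an integer array of shape (num_bins, K) of bin counts.
    """
    ranks = np.asarray(ranks)
    n_sims, n_params = ranks.shape
    counts = np.zeros((num_bins, n_params), dtype=int)
    for k in range(n_params):
        for n in range(n_sims):
            # 0-based bin index; integer division plays the role of floor().
            j = (ranks[n, k] * num_bins) // (num_draws + 1)
            counts[j, k] += 1
    return counts

# Toy input: N = 4 simulations, K = 2 parameters, M = 3 draws, J = 2 bins.
example_ranks = [[0, 3], [1, 2], [2, 1], [3, 0]]
example_counts = bin_ranks(example_ranks, num_draws=3, num_bins=2)
```

Each column of the result sums to N, since every simulation contributes exactly one rank per parameter. Choosing J so that it divides M + 1 gives every bin the same number of possible rank values.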
I haven’t read that section yet (I’ll take a look but I don’t have time right this second), but tagging @Bob_Carpenter because I think he wrote that section (and most sections!) and he might be able to clear this up quickly.
After debugging the functions in SBC.r I’m now more confused than before about these sections. The implementation in rstan seems quite different from what Bob wrote, and binning seems to be simply ignored.
To be honest I’m only vaguely familiar with @bgoodri’s RStan implementation and I haven’t read @Bob_Carpenter’s implementation at all yet. Maybe they can reconcile their implementations because (even if they’re both correct) I think it’s confusing if there are multiple versions of this that people have a hard time comparing.
The main difference is that rstan::sbc() just gathers up the rankings and leaves the thinning, binning, etc. for later, mostly because it is hard to know in advance exactly how to do that.
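To make the "thin later" idea concrete, here is a rough Python sketch of computing ranks from stored draws after the fact. This is not how rstan::sbc() is implemented; `ranks_from_draws` and its arguments are hypothetical names, and it assumes the posterior draws for one simulated data set have been kept as an (M_total, K) array alongside the simulated parameter values.

```python
import numpy as np

def ranks_from_draws(theta_sim, draws, thin=1):
    """Compute one rank per parameter from (optionally thinned) draws.

    theta_sim: array of shape (K,), the simulated parameter values.
    draws:     array of shape (M_total, K), posterior draws for the
               data set simulated from theta_sim.
    thin:      keep every thin-th draw before ranking.

    Returns (ranks, num_kept): ranks has shape (K,) with values in
    0..num_kept, and num_kept is the number of draws left after thinning.
    """
    theta_sim = np.asarray(theta_sim)
    kept = np.asarray(draws)[::thin]
    # Rank = number of kept draws strictly below the simulated value.
    ranks = (kept < theta_sim).sum(axis=0)
    return ranks, kept.shape[0]

# Toy input: 10 draws of a single parameter, simulated value 4.5.
toy_draws = np.arange(10, dtype=float).reshape(10, 1)
full_ranks, full_m = ranks_from_draws([4.5], toy_draws)
thin_ranks, thin_m = ranks_from_draws([4.5], toy_draws, thin=2)
```

Because the ranks are computed only after thinning, the thinning interval and the number of bins can be changed without refitting, as long as the draws themselves were saved.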