Comparing theory and stan implementation on sbc

hyunji.moon · July 23, 2020, 8:28am

I was comparing rstan sbc.R and stan user manual(+paper) and had some questions:

1. Rank consistency
vignettes has: int ranks_[1] = {pi > pi_};

but user manual and original paper, Validating Bayesian Inference Algorithms with Simulation-Based Calibration, uses

\sum_{m=1}^{M} \mathbf{I}\left[\theta_{k}^{(n, m)}<\theta_{k}^{\operatorname{sim}(n)}\right]

Would it be better to use int ranks_[1] = {pi < pi_}; as the paper shows diagnostic techniques, which depend on the inequality direction.

2. M, N
stan manual explains:

M: the number of posterior draws per simulated data set
N: total number of simulated data sets and fits

How many times of stan fit is supposed to happen in sbc? MN or N?
If the answer is N, does M in code, correspond to N in user guide, as M times of sampling is happening?

3. n_eff consideration
Considering n_eff, suggested by the above paper, would the following code improvement be o.k? if(n_eff/M >= 1-n_eff_reltol && n_eff/M <= 1+n_eff_reltol) is what @avehtari suggested here.

question: can M substituted as iter_sampling?
concern: n_eff/iter_sampling might have different value, once thinned. However, in current frame, thinning is not happening because of anesthetic chain as @bgoodri suggested here.
original

    out <- sampling(stanmodel, data, pars = c("ranks_", if (has_log_lik) "log_lik"), include = TRUE,
      chains = 1L, cores = 1L, seed = S, save_warmup = FALSE, thin = 1L, ...)

with n_eff (cmdstanr code)

    iter_sampling = 1000
    while(TRUE){
      out <- stanmodel$sample(data, chains = 1, iter_sampling = iter_sampling, seed = floor(S), parallel_chains = 1, save_warmup = FALSE, thin = 1)
      n_eff <- out$summary("lp__") %>% pull("ess_bulk")
      if(n_eff/iter_sampling >= 1-n_eff_reltol && n_eff/iter_sampling <= 1+n_eff_reltol) break
      iter_sampling = 2 * iter_sampling
    }

.

bbbales2 · July 24, 2020, 11:54pm

I don’t know. @jonah do you know this?

N is the number of Stan fits. M is the number of draws in the manual.

It looks like M in that code is number of stan fits.

I guess M is the number of draws after thinning, and iter_sampling is the number of draws before thinning.

I think we still do thinning in SBC, just what Bgoodrich is saying is you might choose the amount of thinning after you do all the calculations (because it is expensive to rerun the calculations and it won’t be obvious how to thin before you do them).

Jean_Billie · July 26, 2020, 3:16am

I guess, you are correct. The inequality sign is the opposite between the paper and Stan.

Even if the interpretation of the histogram of ranks depends on the inequality direction, the uniformity dose not depend on the inequality.
However, when the histogram is not uniformly distributed, then maybe, we have to interpret it by modifying the paper’s manner according to the inequality. E.g., If a histogram gerated by rstan::sbc() has the asymmetry form in page 10 Fig 7 (b) of the paper, then the interpretation will be modified.

I guess the anwer is the product of the size of posteriors and the size of ranks, because we can specify the number of posterior samples and the number of samples of rank statistics as following manner. For each single rank statistic, the size iter of posterior samples are drawn, and thus, for all rank statistics, it is the product.

rstan::sbc(
stanmodel = stanmodel,
      data = data,
          M = M, The sample size of rank statistics
        iter = iter, The sample size of the posteriors
 ....)

I am a beginner, so, …dont trust me. I am also interested in this topic.
Please correct me if I am wrong.

jonah · July 27, 2020, 6:14pm

Good question @hyunji.moon. I’m not actually super familiar with the SBC implementation in RStan. I do think it would be better if the code and paper were consistent with each other, but I don’t think it’s wrong to do it either way. I’m not an SBC expert though! We can check with @bgoodri though as he implemented SBC in RStan. Was there a particular reason to do the inequality this way?

bgoodri · July 27, 2020, 6:44pm

It doesn’t matter for whether the histograms (now freqpolys) look uniform.

hyunji.moon · July 30, 2020, 10:27pm

In this case, the number of fit itself is N, right? We draw M samples from one fit which happens N times. Thanks for the reply.

hyunji.moon · July 30, 2020, 10:35pm

I agree, but as I have mentioned above, ‘as the paper shows diagnostic techniques, which depend on the inequality direction’ the inequality direction might matter in some cases, the direction of bias for example.

Juan_Ignacio_de_Oyarbide · October 29, 2021, 2:58pm

Hello everyone,

I am sorry I am jumping into this topic with a basic question, but why don’t we compute the rank statistics within the data averaged posterior? If we do it for each simulated dataset of SBC, wouldn’t the uniformity depend on the number of draws? If the number of draws goes to infinity the posterior should be centered on the ground truth that we sampled and then used to generate the data.

I am also following the example in the manual: 27.3 SBC in Stan | Stan User’s Guide, which is also confusing for me.

Thank you,

Topic		Replies	Views
Draws and simulations in the SBC example in the Stan's user guide Modeling specification , simulation-based-calibration	4	590	October 27, 2020
Extending Rstan's Simulation Based Calibration Algorithms simulation-based-calibration	4	654	July 25, 2024
Simulation-based calibration case study with RStan Algorithms simulation-based-calibration	31	3055	July 30, 2019
Simulation-based calibration (SBC) and thinning Algorithms simulation-based-calibration	4	1034	November 8, 2021
SBC: definition of ranks Algorithms simulation-based-calibration	3	912	October 15, 2019

Comparing theory and stan implementation on sbc

Related topics