Simulation-based calibration case study with RStan

In case someone would like to have a parallel version of Bob’s SBC function, I copied from the rstanarm implementation of sbc() (the latter does not support the automatic adaption of thinning based on n_eff while ensuring a fixed number of output samples). Here we go:

sbc_parallel <- function(model, data = list(),
                sbc_sims = 1000, stan_sims = 999,
                init_thin = 4, max_thin = 64,
                target_n_eff = 0.8 * stan_sims) {
    num_params <- num_monitored_params(model, data) 
    ranks <- matrix(nrow = sbc_sims, ncol = num_params) 
    thins <- rep(NA, sbc_sims)
    sbc_sim <- function(n) {
        seed <- seq(from = 0, to = .Machine$integer.max, length.out = sbc_sims)[n]
        n_eff <- 0
        thin <- init_thin 
        ranks <- rep(NA,num_params) #matrix(nrow = sbc_sims, ncol = num_params) 
        while (TRUE) {
            fit <- sampling(model,
                            data = data,
                            seed=seed,
                            chains = 1,
                            iter = 2 * thin * stan_sims,
                            thin = thin,
                            control = list(adapt_delta = 0.99), 
                            refresh = 0)
            fit_summary <- summary(fit, pars = c("lp__"), probs = c())$summary 
            n_eff <- fit_summary["lp__", "n_eff"]
            if (n_eff >= target_n_eff || (2 * thin) > max_thin) break;
            thin <- 2 * thin
        }
        
        lt_sim <- rstan::extract(fit)$I_lt_sim
        for (i in 1:num_params)ranks[i] <- sum(lt_sim[ , i]) + 1
        c(thin,ranks)
    }
    #rslt <- lapply(1:sbc_sims, sbc_sim)
    rslt <- parallel::mclapply(1:sbc_sims, sbc_sim)
    for (n in 1:sbc_sims) {
        thins[n] = rslt[[n]][1]
        ranks[n,] = rslt[[n]][-1]
    }
    pval <- rep(NA, num_params) 
    for (i in 1:num_params)
        pval[i] <- test_uniform_ranks(ranks[ , i],
                                      max_rank = stan_sims + 1)
    list(rank = ranks, p_value = pval, thin = thins) 
}

Moreover I also copied the seed specification from @bgoodri, albeit I understand from the discussion above that the “max-spacing” approach for seeds might not be needed (so we could have set seed=n above as well, I guess, or leave it as it is in Bob’s original sbc implementation).

3 Likes