Thank you for your suggestion, and apologies for my late reply.
I finally managed to write more efficient code that separates the compilation and sampling steps, as shown below. Since I fit the same model to multiple simulated datasets, I moved the compilation step outside of future_map(), as you suggested. In the compilation part, I use brms::make_stancode() to generate the Stan program automatically and accurately, and pass it to cmdstanr's write_stan_file() and cmdstan_model() for the actual compilation.
The sampling part iterates CmdStanModel$sample(..., chains = 4, threads_per_chain = 3, parallel_chains = 4) with furrr::future_map() so that all 12 of my logical cores are used; Task Manager's Performance tab confirmed that every core was busy during sampling. It took only 151.69 seconds to run 100 simulations of 6000 iterations each (2000 warm-up + 4000 post-warm-up). The same simulation had taken approximately 40 minutes with purrr::map(), so this is roughly a 16-fold speed-up.
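One detail the post does not show is the future backend setup: future_map() only runs in parallel once a plan() has been registered. A minimal sketch of that step, where the worker count is my assumption rather than something stated in the post:

```r
library(future)

# Register a parallel backend before calling furrr::future_map().
# Note that each $sample() call above already requests
# parallel_chains = 4 chains x threads_per_chain = 3 threads,
# so with 12 logical cores even a few multisession workers can
# oversubscribe the machine; tune `workers` to your hardware.
plan(multisession, workers = 4)
```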
I really appreciate your supportive comments, @rok_cesnovar and @ajnafa!
model <- make_stancode(
  samples ~ 1,
  ## whichever simulated dataset is passed here, the same Stan code is
  ## generated, as long as the same priors are used across simulations
  data = samples_Nsim |>
    filter(sample_id == 1) |>
    dplyr::select(samples),
  family = gaussian(),
  prior = prior(normal(10, 1), class = Intercept) +
    prior(normal(0, 5), class = sigma),
  ## chains/cores/control are sampling arguments that make_stancode()
  ## ignores, so they are omitted here; note that without
  ## threads = threading(), the generated code contains no reduce_sum(),
  ## so threads_per_chain cannot parallelize within chains
  backend = "cmdstanr"
) |>
  write_stan_file() |>
  cmdstan_model(
    # force_recompile = TRUE,
    cpp_options = list(stan_threads = TRUE)
  )
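Before launching all 100 simulations, one might fit the compiled model once as a quick sanity check. This sketch is not from the original post; it assumes the `model` and `samples_Nsim` objects defined above:

```r
# Build the Stan data for the first simulated dataset only, using the
# same formula, family, and priors as the compiled model.
smoke_data <- make_standata(
  samples ~ 1,
  data = samples_Nsim |> filter(sample_id == 1),
  family = gaussian(),
  prior = prior(normal(10, 1), class = Intercept) +
    prior(normal(0, 5), class = sigma)
)

# A short single-chain run; threads_per_chain is still required because
# the model was compiled with cpp_options = list(stan_threads = TRUE).
smoke_fit <- model$sample(
  data = smoke_data,
  chains = 1,
  iter_warmup = 500,
  iter_sampling = 500,
  threads_per_chain = 1,
  refresh = 0
)

# Inspect the intercept estimate before committing to the full run.
smoke_fit$summary("Intercept")
```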
bayes_samples_Nsim_without_brm <- samples_Nsim |>
  group_by(sample_id) |>
  nest(samples = samples) |>
  ungroup() |>
  mutate(
    stan_data = map(
      samples,
      ~ make_standata(
        samples ~ 1,
        data = .x,
        family = gaussian(),
        ## use the same priors as in make_stancode() so that the
        ## generated data matches the compiled model
        prior = prior(normal(10, 1), class = Intercept) +
          prior(normal(0, 5), class = sigma)
      )
    ),
    Bayesian_model = future_map(
      .x = stan_data,
      .options = furrr_options(
        scheduling = 1,
        seed = TRUE,
        prefix = "prefix"
      ),
      ~ model$sample(
        data = .x,
        seed = 123,
        iter_sampling = 4000,
        iter_warmup = 2000,
        chains = 4,
        threads_per_chain = 3,
        parallel_chains = 4
      )
    ),
    estimates = map(
      .x = Bayesian_model,
      ~ posterior::summarise_draws(
        .x,
        mean,
        sd,
        ## quantile() returns a named vector, so these columns come out
        ## named "2.5%" and "97.5%"; \(v) avoids shadowing the outer .x
        q2.5 = \(v) quantile(v, probs = 0.025),
        q97.5 = \(v) quantile(v, probs = 0.975)
      ) |>
        filter(variable == "Intercept")
    ),
    Bayesian_estimate = map_dbl(estimates, ~ pull(.x, mean)),
    Bayesian_std_error = map_dbl(estimates, ~ pull(.x, sd)),
    Bayesian_lower = map_dbl(estimates, ~ pull(.x, `2.5%`)),
    Bayesian_upper = map_dbl(estimates, ~ pull(.x, `97.5%`))
)
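For reference, the speed-up implied by the two timings quoted above can be computed directly:

```r
# Timings reported in the post: ~40 min sequentially with purrr::map(),
# 151.69 s in parallel with furrr::future_map().
sequential_sec <- 40 * 60
parallel_sec <- 151.69
round(sequential_sec / parallel_sec, 1)  # ~15.8x
```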