Problems running LOO on >1 core

Longshot408 · July 29, 2020, 2:32am

Can you confirm that this would be the correct method then?

> Main_EffectsModel=stan_glm(Accept_Reject~Discount+Floor, 
+                            family = binomial(link = "logit"), 
+                            data=sonadata_clean, 
+                            prior = Priors_MEmodel,
+                            #prior_intercept = normal(), 
+                            #prior_PD = TRUE, 
+                            algorithm = c("sampling"), 
+                            mean_PPD = TRUE,
+                            adapt_delta = 0.95, 
+                            #QR = FALSE, 
+                            #sparse = FALSE,
+                            chains=3,iter=550,cores=3)
> lik=log_lik(Main_EffectsModel)
> loo::loo(lik, save_psis=TRUE,cores=16)

Computed from 825 by 633 log-likelihood matrix

         Estimate   SE
elpd_loo   -231.6 16.3
p_loo         4.0  0.4
looic       463.3 32.6
------
Monte Carlo SE of elpd_loo is 0.1.

All Pareto k estimates are good (k < 0.5).
See help('pareto-k-diagnostic') for details.
Warning message:
Relative effective sample sizes ('r_eff' argument) not specified.
For models fit with MCMC, the reported PSIS effective sample sizes and 
MCSE estimates will be over-optimistic.

jonah · July 29, 2020, 2:34am

Looks good!
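
One optional refinement, given the r_eff warning in the output above: a minimal sketch of passing relative effective sample sizes to loo. A simulated matrix stands in for log_lik(Main_EffectsModel) so the snippet runs on its own. The chain_id construction assumes the rows are stacked chain by chain (3 chains of 275 post-warmup draws each, which matches the 825 rows reported).

```r
# Sketch only: simulated values stand in for log_lik(Main_EffectsModel).
set.seed(1)
n_chains <- 3
draws_per_chain <- 275   # assumes iter = 550 with the default warmup of iter / 2
n_obs <- 633
lik <- matrix(rnorm(n_chains * draws_per_chain * n_obs, mean = -0.4, sd = 0.05),
              nrow = n_chains * draws_per_chain, ncol = n_obs)

# One chain label per row, assuming draws are stacked chain by chain.
chain_id <- rep(seq_len(n_chains), each = draws_per_chain)

if (requireNamespace("loo", quietly = TRUE)) {
  # relative_eff() expects likelihood values, hence exp() of the log-likelihood.
  r_eff <- loo::relative_eff(exp(lik), chain_id = chain_id)
  loo_fit <- loo::loo(lik, r_eff = r_eff, save_psis = TRUE, cores = 1)
  print(loo_fit)
}
```

With the real model, lik would come from log_lik(Main_EffectsModel), and cores can be raised once the Windows issue discussed in this thread is not in play.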

paul.buerkner · July 29, 2020, 5:40am

The other difference is that brms uses loo.matrix by default while rstanarm uses loo.function. The brms loo method has a pointwise argument to switch to loo.function. When I activate that together with cores > 1, the loo computation does not terminate in a reasonable time (a few minutes) for a model that takes 2 seconds with pointwise evaluation when cores = 1. So there may be another problem with loo.function and cores on Windows more generally. I will try to look into it in more detail later.

paul.buerkner · July 29, 2020, 11:34am

I have opened a PR to fix the problem with relative_eff.function on Windows (https://github.com/stan-dev/loo/pull/152)

Longshot408 · July 29, 2020, 1:03pm

On a somewhat related note, is this the only way to get my model itself to run in parallel on Windows?

jonah · July 29, 2020, 7:09pm

@Longshot408 No, I think parallelization when running a model should work fine. The only recent issue with parallelization I know of is

Are you getting that or other errors? If so, can you open a separate topic and we’ll try to sort it out there?

jonah · July 29, 2020, 7:09pm

Thanks, that would be helpful!

Longshot408 · July 29, 2020, 7:37pm

Nope, no errors, just disappointing benchmarks. After switching from an i5-6500 to a Ryzen 7 3700X, I wanted to see how much faster my model runs would be; it turns out the only benefit was the higher clock speed of the newer CPU. Benchmarks stopped improving once I set “cores=3”.

After doing some more digging: is this because the cores setting is limited by the number of chains you run?

jonah · July 29, 2020, 7:47pm

Yeah, that’s right. But you can put the extra cores to work by using certain features of the Stan language that allow for within-chain parallelization. Here’s a good tutorial from @bbbales2 on using one of those functions:

https://mc-stan.org/users/documentation/case-studies/reduce_sum_tutorial.html

Unfortunately, we haven’t yet updated the models in rstanarm to use those functions.
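
To make the chains limit concrete, here is a rough sketch, assuming at most one worker per chain as described above. workers_used is a hypothetical helper written for illustration, not part of rstanarm.

```r
# Hypothetical helper, not part of rstanarm: how many chains can run at once
# when each chain occupies at most one core.
workers_used <- function(chains, cores) min(chains, cores)

workers_used(chains = 3, cores = 3)   # three chains, three cores
workers_used(chains = 3, cores = 16)  # extra cores beyond the chain count sit idle
workers_used(chains = 8, cores = 8)   # more chains give more cores something to do
```

Under that assumption, benchmarks with chains=3 would be expected to flatten at cores=3, which matches what was reported.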

jonah · July 29, 2020, 8:13pm

I think the error with multiple cores on Windows is now fixed on GitHub. So it’s possible to either use the workaround I mentioned above

or to install the development version of the loo package from GitHub, in which case the workaround shouldn’t be necessary anymore:

devtools::install_github("stan-dev/loo")
