Parallelizing the same model fit for different datasets

Hi Stan community,

I am trying to fit the same model (using cmdstanr) to a number of different data inputs, data_list_all[[idx]]. I thought a parallel for loop could save some time, so in the following code I use foreach and %dopar% from library(parallel), library(foreach), and library(doParallel). It works when I shorten each data_list_all[[idx]] for testing, but with the full-length data the fits for a few of the data_list_all[[idx]] were not saved properly, because the CmdStan output files could not be found in the temp folder. If I run everything sequentially on the full data, every dataset fits fine.

Do you know what’s going on here? Is there a better approach? My gut feeling is that running parallel chains inside Stan may not be fully compatible with a foreach loop, so that a finished chain for data_list_all[[2]] gets overwritten by a chain for data_list_all[[6]] before the other chains for data_list_all[[2]] complete.

Another option I am considering is to index the parameters in the model and feed in all the datasets at once with that indexing; as long as I don’t pool parameters across datasets, this should be identical to the loop solution. Do you think I could then tell Stan to use 4 cores per data_list_all[[idx]]?
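For concreteness, here is a sketch of what I mean on the data side (hypothetical: it assumes each data_list_all[[idx]] is a list with a size N and an outcome vector y; the real fields would differ):

  # Flatten all datasets into one long data list, with a group index g
  # so the model can keep a separate (unpooled) parameter per dataset.
  sizes <- sapply(data_list_all, function(d) d$N)
  stan_data <- list(
    N = sum(sizes),                            # total number of observations
    G = length(data_list_all),                 # number of datasets
    g = rep(seq_along(data_list_all), sizes),  # dataset index per observation
    y = unlist(lapply(data_list_all, function(d) d$y))
  )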

Thank you very much :)

  library(foreach)
  library(doParallel)

  registerDoParallel(cores = 4)  # %dopar% needs a registered backend

  fit_list_all <- foreach(
    idx = fit_idx,
    .packages = "cmdstanr"
  ) %dopar% {
    mod$sample(
      data = data_list_all[[idx]], iter_warmup = 1000, iter_sampling = 1000,
      chains = 4, parallel_chains = 4, show_messages = FALSE
    )
  }

You may be right, despite R randomly generating the file names. You could try the output_dir or output_basename arguments of the $sample() method of a cmdstan_model in cmdstanr.
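Something along these lines (an untested sketch; the directory naming is just for illustration):

  # give each dataset its own output directory so parallel fits
  # cannot clobber each other's CSV files
  out_dir <- file.path("stan_out", paste0("data_", idx))
  dir.create(out_dir, recursive = TRUE, showWarnings = FALSE)
  fit <- mod$sample(
    data = data_list_all[[idx]],
    output_dir = out_dir,
    chains = 4, parallel_chains = 4
  )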

I’m pinging @jgabry, who should know the answer here.

Also, you want to be careful not to spawn more jobs than you have cores. Even then, I find that my rather beefy Xeon-based iMac Pro can’t run 8 chains in parallel nearly as fast as it runs a single chain on its own. So you might not see much gain from parallelization if you’re getting close to, or exceeding, the number of cores you have.


I will try specifying the CmdStan output folder. Hopefully @jonah has a better solution.

You are absolutely right; this has already happened to me. I do the following to avoid the issue:

# check how many cores are available
parallel::detectCores()
# leave two cores free for the OS and other processes
n.cores <- parallel::detectCores() - 2
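Since each fit already runs parallel_chains = 4, I also cap the number of foreach workers so that workers times chains stays within the core budget. A sketch (the divisor should match your parallel_chains setting):

  # at most n.cores / 4 simultaneous fits, each using 4 parallel chains
  n.workers <- max(1, n.cores %/% 4)
  doParallel::registerDoParallel(cores = n.workers)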

@jonah Do you know why this might be happening? I’m seeing the same thing using furrr::future_map. The models fit fine and I can extract posterior draws, but only if I don’t do the extraction in parallel as well. If I have a list of fit objects and run something like furrr::future_map(list_of_fits, \(f) spread_rvars(f, x[i])), I get errors that the CSV files cannot be found.

What happens if you pass the output_basename argument to cmdstanr’s $sample() method, so that each iteration of the parallel map or loop writes to a deterministically unique CSV filename?
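Something like this (a sketch; output_basename is available in recent cmdstanr versions):

  fit_list_all <- foreach(idx = fit_idx, .packages = "cmdstanr") %dopar% {
    mod$sample(
      data = data_list_all[[idx]],
      chains = 4, parallel_chains = 4,
      # unique, deterministic basename per dataset, so no two jobs
      # can overwrite each other's CSV files
      output_basename = paste0("fit_", idx)
    )
  }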


I like @jsocolar’s suggestion. Curious if that resolves the issue.

Trying it now. Will let you know if I hit any problems.


It works. Thanks, all!


Great, thanks for following up and letting us know!
