I have just started using RStan to fit some fairly simple dynamic factor models (DFMs). I have data on a number of countries and would like to fit the same DFM to each country separately, in a tidy way.
For this I arranged the time series data in a nested data frame, where country is the nesting variable and each country's data are collapsed into a list column. I then wanted to use `purrr::map` to apply the same model to every element of that list column, i.e. the data for a single country, with a custom function:
## custom function
library(rstan)
library(dplyr)
library(purrr)

stan_map <- function(.data, .stan_mdl, ..., fct_vars = keep_vars) {
  # keep date plus the selected variables, then drop rows with NAs
  clean_x <- .data %>%
    select(
      date,
      all_of(fct_vars)
    ) %>%
    na.omit()
  # shape into the data list for Stan (N and Y skip the first two columns)
  mod_list <- list(
    T = nrow(clean_x),
    N = ncol(clean_x) - 2,
    Y = clean_x[, c(-1, -2)]
  )
  # run stan model
  stanout <- sampling(
    object = .stan_mdl,
    data = mod_list,
    refresh = -1,
    open_progress = FALSE,
    # extra options passed on to sampling()
    ...
  )
  return(stanout)
}
Then I use `mutate` and `map` to add another list column in which each element should be the output of the sampling, like this:
# compile stan model from file
bdfm2 <- stan_model(
  file = "./f2.stan",
  model_name = "BDFM2",
  verbose = TRUE,
  auto_write = TRUE
)
# apply to individual countries
data_nest <- data_nest %>%
  mutate(
    stan_obj = map(
      .x = ctry_data,
      .f = stan_map,
      .stan_mdl = bdfm2,
      # for testing
      iter = 1000,
      chains = 1,
      cores = 1
    )
  )
Now, for some reason, R crashes without any error message or warning, and I am left without any clue as to how to correctly fit models in a tidy way with `map`. My hunch is that it has something to do with parallelization and some clash between `purrr::map` and `rstan`, also because the custom function on its own works outside the `mutate` + `map` call. I am running Windows 10 with R 4.2.1 and RStan 2.26.23; I cannot use CmdStan since I am on a restricted corporate machine.
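For anyone who wants to reproduce the setup, here is a minimal sketch of the nested layout described above, built from toy data. The column names `country`, `x1` and `x2` and the helper `fit_stub` are hypothetical and not taken from my real data. `fit_stub` only stands in for `stan_map`, so the example runs without Stan and shows just the `mutate` + `map` mechanics.

```r
library(dplyr)
library(tidyr)
library(purrr)

set.seed(1)

# Toy long-format panel: 2 countries x 6 monthly observations.
# `country`, `x1`, `x2` are made-up names for illustration only.
panel <- tibble(
  country = rep(c("AA", "BB"), each = 6),
  date    = rep(seq(as.Date("2020-01-01"), by = "month", length.out = 6), times = 2),
  x1      = rnorm(12),
  x2      = rnorm(12)
)

# One row per country; everything except `country` goes into the
# list column `ctry_data`
data_nest <- panel %>%
  nest(ctry_data = -country)

# Hypothetical stand-in for stan_map(): returns the dimensions that would
# be passed to Stan instead of a fitted model
fit_stub <- function(.data, ...) {
  list(T = nrow(.data), N = ncol(.data) - 1)
}

# Same mutate + map pattern as in the question
data_nest <- data_nest %>%
  mutate(fit = map(ctry_data, fit_stub))
```

Swapping `fit_stub` for `stan_map` (plus the `.stan_mdl` argument) gives the call shown earlier in the question.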