Hello! This is my first post in the Stan forums, so I’m not sure if I’ve chosen the most appropriate tags to categorize the issue—apologies if not!
I’d like to share an issue I’ve encountered with the moment_match
method in the loo
package. If I set cores = 1
, the method works as expected and adjusts problematic Pareto k values. But if I run the same call with multiple cores, moment_match
is silently skipped and the result is identical to a regular loo()
call—giving the false impression that moment matching was applied but failed to improve the fit.
Below is a reproducible example using the same code from the vignette Avoiding model refits in leave-one-out cross-validation with moment matching. I’m using cmdstanr
here, but the same behavior happens with rstan
. The Stan model used is identical to the one in the vignette, except that I renamed the offset
variable to shift
, as offset
is a reserved name in Stan. The full model code is attached at the end of this post.
# Load packages
library(cmdstanr)
library(loo)
# Set seed for reproducible results
seed <- 9547
set.seed(seed)
# Prepare data
data(roaches, package = "rstanarm")
roaches$roach1 <- sqrt(roaches$roach1)
y <- roaches$y
x <- roaches[,c("roach1", "treatment", "senior")]
shift <- log(roaches[,"exposure2"])
n <- dim(x)[1]
k <- dim(x)[2]
# Stan data list
standata <- list(
N = n,
K = k,
x = as.matrix(x),
y = y,
shift = shift,
beta_prior_scale = 2.5,
alpha_prior_scale = 5.0)
# Compile stan model with cmdstanr
stanmodel <- cmdstan_model(stan_file = "MM_test.stan", force_recompile = TRUE)
# Fit model
fit <- stanmodel$sample(
data = standata,
seed = seed,
refresh = 500
)
As you can see in the following R chunk, loo()
produces some high Pareto k diagnostics, and moment_match
corrects this when run in single-threaded mode. But if I run the same call with multiple cores, it silently returns the uncorrected loo()
result—giving the false impression that moment_match
fails to improve things.
> # Raw loo
> (loo1 <- fit$loo())
Computed from 4000 by 262 log-likelihood matrix.
Estimate SE
elpd_loo -5455.2 692.6
p_loo 250.5 53.0
looic 10910.5 1385.3
------
MCSE of elpd_loo is NA.
MCSE and ESS estimates assume MCMC draws (r_eff in [0.5, 1.0]).
Pareto k diagnostic values:
Count Pct. Min. ESS
(-Inf, 0.7] (good) 245 93.5% 58
(0.7, 1] (bad) 10 3.8% <NA>
(1, Inf) (very bad) 7 2.7% <NA>
See help('pareto-k-diagnostic') for details.
Aviso:
Some Pareto k diagnostic values are too high. See help('pareto-k-diagnostic') for details.
>
> # Moment matching
> fit$loo(moment_match = TRUE)
Computed from 4000 by 262 log-likelihood matrix.
Estimate SE
elpd_loo -5477.4 699.8
p_loo 272.6 61.7
looic 10954.8 1399.6
------
MCSE of elpd_loo is 0.4.
MCSE and ESS estimates assume MCMC draws (r_eff in [0.5, 1.0]).
All Pareto k estimates are good (k < 0.7).
See help('pareto-k-diagnostic') for details.
>
> # Parallelized moment matching
> fit$loo(moment_match = TRUE, cores = 6)
Computed from 4000 by 262 log-likelihood matrix.
Estimate SE
elpd_loo -5455.2 692.6
p_loo 250.5 53.0
looic 10910.5 1385.3
------
MCSE of elpd_loo is NA.
MCSE and ESS estimates assume MCMC draws (r_eff in [0.5, 1.0]).
Pareto k diagnostic values:
Count Pct. Min. ESS
(-Inf, 0.7] (good) 245 93.5% 58
(0.7, 1] (bad) 10 3.8% <NA>
(1, Inf) (very bad) 7 2.7% <NA>
See help('pareto-k-diagnostic') for details.
Aviso:
Some Pareto k diagnostic values are too high. See help('pareto-k-diagnostic') for details.
In addition to this, when using moment_match
in parallel, I’m unable to pass external R objects (e.g., a data frame). As noted in the loo
vignette on large datasets, sometimes it’s more efficient to compute the log-likelihood outside of Stan. If we choose this approach, we also need to supply custom likelihood functions for moment_match
. Continuing with the same example, the following code shows how to do this using cmdstanr
instead of rstan
:
# Log‐likelihood for observation i
ll_fun_i <- function(x, i, data = standata, ...) {
# Posterior draws
draws <- x$draws(variables = c("beta", "intercept"), format = "draws_array")
# Parameter arrays
betas_array <- draws[,,grep("^beta", dimnames(draws)[[3]])]
intercept_array <- draws[,,"intercept"]
# Empty linear prediction array
mu_list <- vector(mode = "list", length = dim(draws)[2])
for(j in 1:dim(draws)[2]){
# Prepare Beta and X matrices
Beta <- betas_array[,j,,drop = TRUE]
X <- matrix(data$x[i, ], ncol = 1)
# Compute linear prediction
mu_list[[j]] <- Beta %*% X + c(intercept_array[,j,]) + data$shift[i]
}
# Generate a S x P matrix
mu_mat <- do.call(cbind, mu_list)
# Compute log-likelihood
dpois(x = data$y[i], lambda = exp(mu_mat), log = TRUE)
}
# Log-Likelihoo based on unconstrain parameters
log_lik_i_upars <- function(x, upars, i, data = standata, ...) {
# Constrain pars
pars <- apply(upars, 1, x$constrain_variables)
# Beta matrix and intercept vector
Beta <- do.call(rbind, lapply(pars, function(x) x$beta))
intercept <- sapply(pars, function(x) x$intercept)
# Compute linear prediction
mu_vec <- Beta %*% matrix(data$x[i, ], ncol = 1) + intercept + data$shift[i]
# Compute log-likelihood
c(dpois(x = data$y[i], lambda = exp(mu_vec), log = TRUE))
}
# Hand-made moment matching
loo_mm <- loo::loo_moment_match.default(
x = fit,
loo = loo1,
post_draws = function(x, ...) { x$draws(format = "draws_matrix") },
log_lik_i = ll_fun_i,
unconstrain_pars = function(x, pars, ...) { x$unconstrain_draws(format = "draws_matrix") },
log_prob_upars = function(x, upars, ...) { apply(upars, 1, x$log_prob) },
log_lik_i_upars = log_lik_i_upars, cores = 4, data = standata)
With cores = 1
, everything works: moment_match
runs correctly, and my external data can be accessed. But increasing the number of cores breaks both behaviors: moment_match
is not applied, and I get an error saying that my data object is not found. I’ve tried various workarounds (passing data
explicitly, omitting it and relying on ...
, etc.), but it seems the problem isn’t just how the arguments are passed.
> # Cores = 1 and without specifying data
> loo_mm <- loo::loo_moment_match.default(
+ x = fit,
+ loo = loo1,
+ post_draws = function(x, ...) { x$draws(format = "draws_matrix") },
+ log_lik_i = ll_fun_i,
+ unconstrain_pars = function(x, pars, ...) { x$unconstrain_draws(format = "draws_matrix") },
+ log_prob_upars = function(x, upars, ...) { apply(upars, 1, x$log_prob) },
+ log_lik_i_upars = log_lik_i_upars, cores = 1)
> loo_mm
Computed from 4000 by 262 log-likelihood matrix.
Estimate SE
elpd_loo -5477.4 699.8
p_loo 272.6 61.7
looic 10954.8 1399.6
------
MCSE of elpd_loo is 0.4.
MCSE and ESS estimates assume MCMC draws (r_eff in [0.5, 1.0]).
All Pareto k estimates are good (k < 0.7).
See help('pareto-k-diagnostic') for details.
> # Cores > 1 and specifying data
> loo_mm_2 <- loo::loo_moment_match.default(
+ x = fit,
+ loo = loo1,
+ post_draws = function(x, ...) { x$draws(format = "draws_matrix") },
+ log_lik_i = ll_fun_i,
+ unconstrain_pars = function(x, pars, ...) { x$unconstrain_draws(format = "draws_matrix") },
+ log_prob_upars = function(x, upars, ...) { apply(upars, 1, x$log_prob) },
+ log_lik_i_upars = log_lik_i_upars, cores = 4, data = standata)
Error en checkForRemoteErrors(val):
4 nodes produced errors; first error: objeto 'standata' no encontrado
Finally, I’m including my full R session info below (I’m running this on Windows 11) in case it helps to diagnose the issue!
> sessionInfo()
R version 4.4.2 (2024-10-31 ucrt)
Platform: x86_64-w64-mingw32/x64
Running under: Windows 11 x64 (build 26100)
Matrix products: default
locale:
[1] LC_COLLATE=Spanish_Spain.utf8 LC_CTYPE=Spanish_Spain.utf8 LC_MONETARY=Spanish_Spain.utf8 LC_NUMERIC=C
[5] LC_TIME=Spanish_Spain.utf8
time zone: Europe/Madrid
tzcode source: internal
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] loo_2.8.0 cmdstanr_0.8.1.9000
loaded via a namespace (and not attached):
[1] vctrs_0.6.5 cli_3.6.3 knitr_1.50 rlang_1.1.4 xfun_0.52 processx_3.8.6
[7] generics_0.1.3 tensorA_0.36.2.1 data.table_1.17.0 jsonlite_2.0.0 glue_1.8.0 backports_1.5.0
[13] distributional_0.5.0 ps_1.9.1 evaluate_1.0.3 tibble_3.2.1 abind_1.4-8 lifecycle_1.0.4
[19] compiler_4.4.2 posterior_1.6.1 Rcpp_1.0.14 pkgconfig_2.0.3 rstudioapi_0.17.1 R6_2.6.1
[25] utf8_1.2.4 pillar_1.10.2 parallel_4.4.2 magrittr_2.0.3 checkmate_2.3.2 tools_4.4.2
[31] withr_3.0.2 matrixStats_1.5.0
And this is the Stan model code
data {
int<lower=1> K;
int<lower=1> N;
matrix[N,K] x;
array[N] int y;
vector[N] shift;
real beta_prior_scale;
real alpha_prior_scale;
}
parameters {
vector[K] beta;
real intercept;
}
model {
y ~ poisson(exp(x * beta + intercept + shift));
beta ~ normal(0,beta_prior_scale);
intercept ~ normal(0,alpha_prior_scale);
}
generated quantities {
vector[N] log_lik;
for (n in 1:N)
log_lik[n] = poisson_lpmf(y[n] | exp(x[n] * beta + intercept + shift[n]));
}