Integer overflow when using loo_moment_match

Problem: Integer Overflow when using loo_moment_match()
GitHub issue: https://github.com/stan-dev/loo/issues/151

What I did:

  • Ran loo_moment_match() on a stanfit model, following the vignette here, with all default parameters and functions.
  • Number of cores for parallelisation: 10

Expected behaviour: returns the value with no exceptions

Actual behaviour: returns an error message

Error in while (t < nrow(acov) - 5 && !is.nan(rho_hat_even + rho_hat_odd) &&  : 
  missing value where TRUE/FALSE needed
In addition: Warning message:
In N * N : NAs produced by integer overflow

Is the problem random? No

What I tried:

  • Reducing the number of cores didn’t work
  • Removing the high-dimensional random effect parameter didn’t work; the parameter is required

Session Info

R version 4.0.2 (2020-06-22)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 18.04.4 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.7.1
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.7.1

locale:
 [1] LC_CTYPE=C.UTF-8       LC_NUMERIC=C           LC_TIME=C.UTF-8        LC_COLLATE=C.UTF-8    
 [5] LC_MONETARY=C.UTF-8    LC_MESSAGES=C.UTF-8    LC_PAPER=C.UTF-8       LC_NAME=C             
 [9] LC_ADDRESS=C           LC_TELEPHONE=C         LC_MEASUREMENT=C.UTF-8 LC_IDENTIFICATION=C   

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] bayesplot_1.7.2      rstan_2.19.3         ggplot2_3.3.2        StanHeaders_2.21.0-5
[5] magrittr_1.5         data.table_1.12.8   

loaded via a namespace (and not attached):
 [1] tinytex_0.24       tidyselect_1.1.0   xfun_0.15          purrr_0.3.4        colorspace_1.4-1  
 [6] vctrs_0.3.1        generics_0.0.2     stats4_4.0.2       loo_2.3.1          rlang_0.4.7       
[11] pkgbuild_1.1.0     pillar_1.4.6       glue_1.4.1         withr_2.2.0        RColorBrewer_1.1-2
[16] readxl_1.3.1       matrixStats_0.56.0 lifecycle_0.2.0    plyr_1.8.6         munsell_0.5.0     
[21] gtable_0.3.0       cellranger_1.1.0   codetools_0.2-16   inline_0.3.15      GGally_2.0.0      
[26] callr_3.4.3        ps_1.3.3           parallel_4.0.2     fansi_0.4.1        Rcpp_1.0.4.6      
[31] backports_1.1.8    checkmate_2.0.0    scales_1.1.1       ggmcmc_1.4.1       RcppParallel_5.0.1
[36] gridExtra_2.3      processx_3.4.3     dplyr_1.0.0        grid_4.0.2         cli_2.0.2         
[41] tools_4.0.2        tibble_3.0.3       crayon_1.3.4       tidyr_1.1.0        pkgconfig_2.0.3   
[46] ellipsis_0.3.1     prettyunits_1.1.1  ggridges_0.5.2     lubridate_1.7.9    assertthat_0.2.1  
[51] reshape_0.8.8      rstudioapi_0.11    R6_2.4.1           compiler_4.0.2    

Hmm I gave this a run and it worked. Can you copy-paste code from the vignette to produce a regular .R file that’ll blow up for you? (edit: then I’ll run that to make sure we’re doing the same thing)

> sessionInfo()
R version 3.6.3 (2020-02-29)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 18.04.4 LTS

...
other attached packages:
[1] loo_2.3.1            rstan_2.19.3         ggplot2_3.3.1       
[4] StanHeaders_2.21.0-5

Thanks,

These are the lines of code that I copied:

# create a named list of draws for use with rstan methods
.rstan_relist <- function(x, skeleton) {
  out <- utils::relist(x, skeleton)
  for (i in seq_along(skeleton)) {
    dim(out[[i]]) <- dim(skeleton[[i]])
  }
  out
}

# rstan helper function to get dims of parameters right
.create_skeleton <- function(pars, dims) {
  out <- lapply(seq_along(pars), function(i) {
    len_dims <- length(dims[[i]])
    if (len_dims < 1) return(0)
    return(array(0, dim = dims[[i]]))
  })
  names(out) <- pars
  out
}

# extract original posterior draws
post_draws_stanfit <- function(x, ...) {
  as.matrix(x)
}

# compute a matrix of log-likelihood values for the ith observation
# matrix contains information about the number of MCMC chains
log_lik_i_stanfit <- function(x, i, parameter_name = "log_lik", ...) {
  loo::extract_log_lik(x, parameter_name, merge_chains = FALSE)[, , i]
}

# transform parameters to the unconstrained space
unconstrain_pars_stanfit <- function(x, pars, ...) {
  skeleton <- .create_skeleton(x@sim$pars_oi, x@par_dims[x@sim$pars_oi])
  upars <- apply(pars, 1, FUN = function(theta) {
    rstan::unconstrain_pars(x, .rstan_relist(theta, skeleton))
  })
  # for one parameter models
  if (is.null(dim(upars))) {
    dim(upars) <- c(1, length(upars))
  }
  t(upars)
}

# compute log_prob for each posterior draw on the unconstrained space
log_prob_upars_stanfit <- function(x, upars, ...) {
  apply(upars, 1, rstan::log_prob, object = x,
        adjust_transform = TRUE, gradient = FALSE)
}

# compute log_lik values based on the unconstrained parameters
log_lik_i_upars_stanfit <- function(x, upars, i, parameter_name = "log_lik",
                                  ...) {
  S <- nrow(upars)
  out <- numeric(S)
  for (s in seq_len(S)) {
    out[s] <- rstan::constrain_pars(x, upars = upars[s, ])[[parameter_name]][i]
  }
  out
}

The function call:

mm_3h3a  <- loo::loo_moment_match(x = model_3h3a_19EI,
                             loo = loo_3h3a,
                             cores = 10,
                             post_draws = post_draws_stanfit,
                             log_lik_i = log_lik_i_stanfit,
                             unconstrain_pars = unconstrain_pars_stanfit,
                             log_prob_upars = log_prob_upars_stanfit,
                             log_lik_i_upars = log_lik_i_upars_stanfit)

I fit a model with some parameters of dim (260, 1), using four chains of 40000 non-warmup iterations each. I saw something that looks like N * N in the warning message. Perhaps the large dimension size is what makes the number overflow?
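To illustrate the suspicion, here is a minimal sketch in plain R, independent of loo. It assumes (hypothetically) that N is the total number of non-warmup draws, 4 x 40000, held as an integer; the names n_chains, n_iter, sq_int, sq_dbl, sq_pow and cond are made up for the example and are not taken from the loo source.

```r
# Hypothetical sizes based on the run described above:
# 4 chains x 40000 non-warmup draws, stored as R integers.
n_chains <- 4L
n_iter <- 40000L
N <- n_chains * n_iter            # 160000L, still an integer

# Integer arithmetic cannot exceed .Machine$integer.max (2147483647);
# the product becomes NA with an "integer overflow" warning.
sq_int <- suppressWarnings(N * N)

# Converting to double first, or using ^ (which returns a double),
# avoids the overflow.
sq_dbl <- as.numeric(N) * N
sq_pow <- N^2

# An NA that reaches a logical condition produces the
# "missing value where TRUE/FALSE needed" error.
cond <- tryCatch(
  if (sq_int > 0) "ok",
  error = function(e) conditionMessage(e)
)

print(sq_int)
print(sq_dbl)
print(cond)
```

Whether N really is the total number of draws inside loo_moment_match is an assumption here, but an integer product turning into NA and then reaching a while condition would match both the warning and the error above.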

Quite possibly. Can you share the model + data here to run this?

Thanks for reporting this.

This is an old bug that has been fixed in the Rhat paper code and the posterior package, but we missed fixing it in the loo package as well. See more at


Thanks, that is super helpful.