Difficulty comparing models with loo

rstan 2.19.3
I’m also on R 3.6.2 still. I’ve a bunch of half-done projects and so I don’t want to update to R 4.0+ until some are more done. Is that a problem here?

I don’t know. You can have R 3.6 and R 4 installed on the same machine without breaking anything, so that could be worth a try but I honestly don’t know what the problem is as I cannot reproduce the error.

1 Like

It’s likely that moment matching will not be able to help with that many problematic pareto_k values. Please post how you call brms (all br,s agruments such as formula, priors, etc) and tell how many observations you have (if hierarchical model then also how many group levels etc), so that I can give better informed suggestion what to do next.

1 Like

Ok I’ll try to find time to install it in coming days, although I’m backed up alot with deadlines. I have a paper under revision with about a month of cpu time behind it and I’m terrified to change anything until the reviews comeback in case I break anything 😅… I really really really don’t want to run it all again from scratch 😂😇

Sure no problem - there are 278 individuals(groups levels), and in total there are 745 observations. I’m interested in comparing different outcome measures and so I’m using multivariate models and the residual correlations are of primary interest. The model specifications are below - they are basically mixed effects models for each outcome (based on apriori knowledge). I tuned the priors based on prior predictive checks to be weakly informative.

fit1 <- brm(
    mvbind(out_fvc, out_svc, out_snip, out_peak) ~ days_from_baseline + (days_from_baseline | uin),
    data = df2, chains = 4, cores = 4,
    prior = c(set_prior("normal(0, 0.1)", class = "b", resp = c("outfvc", "outsvc","outsnip", "outpeak" )),
              set_prior("normal(3.2, 0.1)", class = "Intercept", resp = "outfvc" ),
              set_prior("normal(3.1, 0.2)", class = "Intercept", resp = "outsvc" ),
              set_prior("normal(65, 20)", class = "Intercept", resp = "outsnip" ),
              set_prior("student_t(3, 350, 100)", class = "Intercept", resp = "outpeak" )
    sample_prior = "no", save_all_pars = TRUE )

fit2 <- brm(
    mvbind(out_fvc, out_svc, out_snip, out_peak) ~ days_from_baseline + (days_from_baseline | p | uin),
    data = df2, chains = 4, cores = 4,
    prior = c(set_prior("normal(0, 0.1)", class = "b", resp = c("outfvc", "outsvc","outsnip", "outpeak" )),
              set_prior("normal(3.2, 0.1)", class = "Intercept", resp = "outfvc" ),
              set_prior("normal(3.1, 0.2)", class = "Intercept", resp = "outsvc" ),
              set_prior("normal(65, 20)", class = "Intercept", resp = "outsnip" ),
              set_prior("student_t(3, 350, 100)", class = "Intercept", resp = "outpeak" )
    sample_prior = "no", save_all_pars = TRUE )

Would increasing the cores used and /or iterations help with the pareto_k values (I don’t have a deep understanding here)

If some uin groups have just one or a few observations, then removing an observation can affect a lot the posterior of the corresponding uin intercept and coefficient parameters. On average there is 2.7 observations to influence the posterior of 2 local parameters (or since you are using mvbind are there 1+4 parameters?), so there has to be many groups where you have just 1 or 2 observations to influence the posterior of 2 local parameters. When you remove 1 out of 1 observation then the posterior of the local parameters is just the prior which is likely to much wider. When you remove 1 out of 2 observations then you are removing half of the data for those local parameters and it’s still likely that the posterior will change a lot. See related random effect model loo example at Roaches loo case study

Unlikely. See also questions 16 and 17 in CV-FAQ


Thanks for those thoughts @avehtari - that makes alot of sense. Alas yes quite a few only have one observation (blame covid for stopping data collection early 😢). I will look through those links thank you.

1 Like

Hi @paul.buerkner just to come back to you on this. I dug out my back-up computer to test this. It had been wiped, so under Mojave 10.14.6, I did a clean install of R 3.6.3, installed rstan and brms then ran my reprex above -> the problem was still there. I updated to R 4.0.2, reinstalled rstan and brms and ran the models again -> problem resolved, loo ran fine 👍 Now, while I’m at it install Anaconda and test out a few python-y things on my todo list without having to worry about this: https://xkcd.com/1987/

@avehtari reading the Roaches loo case study I have been trying out kfold CV with my models. I was wondering about two things:

  1. When I run kfold, the p_kfold comes out as NA - is this a worry?
  2. I see that I have the option with kfold to split the dataset by group - would you advise to use this for hierarchical models like mine above?
1 Like

The case study gives also NA for p_kfold, because computing that requires also running the model with full data. I don’t remember why we report it like that, and maybe we should run the full data model, too, to show proper values. I should make an issue of this.

It depends what is your goal. I discuss this in Cross-validation for hierarchical models case study with link also to a video with more explanation.


Very helpful thanks - the video too 👍

1 Like

I ran into the same problem just now and found this thread. I already learned something from @avehtari’s comments but just wanted to share my experience in case it really is a bug.

I am running brms 2.13.5, rstan 2.21.2, R version 4.0.2 on a Ubuntu 18.4 workstation.

I am trying to compare multiple model specifications using only a subset of my data, which might explain the high pareto_k values that I am experiencing. My full dataset has about 20k observations but needs some days to run which is why I am using a 1k sample for trying out model specifications

I adapted a simulated MRP example of Lauren Kennedy to produce this example that produces the error on my machine. I’m sorry that it’s a long code, but I hope it reproduces the error.

Error in .update_pars(x, upars = upars, ...) : 
  length(new_samples) == nrow(pars) is not TRUE
In addition: Warning message:
Some Pareto k diagnostic values are slightly high. See help('pareto-k-diagnostic') for details.
Error: Moment matching failed. Perhaps you did not set 'save_all_pars' to TRUE when fitting your model?

Okay, I just tried to reproduce the error on my Laptop running Windows 10, R 4.0.2, and brms 2.13.0. Would not reproduce. I updated to brms 2.13.5 and now I got the error again on my Laptop as well.

Edit: I might’ve updated other packages as well, not quite sure. I’m just reinstalling Brms 2.13.0 and will see if I can reproduce it again with that version.


#Define function that simulates mrp data ####
simulate_mrp_data <- function(n) {
  J <- c(2, 3, 5, 3, 400) # male or not, jobstatus, age (<18, 18-29, 30-49, 50-64, >64),migration background, 
  poststrat <- as.data.frame(array(NA, c(prod(J), length(J)+1))) # Columns of post-strat matrix, plus one for size
  colnames(poststrat) <- c("male", "job", "age","migback", "state",'N')
  count <- 0
  for (i1 in 1:J[1]){ # for i1 in 1:2 (i.e. for both genders)
    for (i2 in 1:J[2]){
      for (i3 in 1:J[3]){
        for (i4 in 1:J[4]){
          for (i5 in 1:J[5]){ # 1:J[5] is 1:50 because J[5] is 50
            count <- count + 1
            # Fill them in so we know what category we are referring to
            poststrat[count, 1:5] <- c(i1-1, i2, i3,i4,i5) 
  # Proportion in each sample in the population (Values based on Zensus 2011 DE)
  p_male <- c(0.512, 0.488)
  p_job <- c(0.515, 0.025, 0.459)
  p_age <- c(0.164,0.142,0.285,0.204,0.206)
  p_migback <- c(.808,.116,.76)
  p_state_tmp <- runif(n = 400, min = 10,max = 20)
  p_state <- p_state_tmp/sum(p_state_tmp)
  poststrat$N <- 0
  for (j in 1:prod(J)){ # 82e6 is a random massive population size here, about Germany
    poststrat$N[j] <- round(82e5 * p_male[poststrat[j,1]+1] * # get the value for each j in poststrat and multiply by the proportions in the sample
                              p_job[poststrat[j,2]] * p_age[poststrat[j,3]] * 
                              p_migback[poststrat[j,4]] * p_state[poststrat[j,5]]) #Adjust the N to be the number observed in each category in each group
  # Now let's adjust for the probability of response depending on characteristics
  p_response_baseline <- 0.01
  p_response_male <- c(2, 0.8) / 2.8
  p_response_job <- c(1, 1.2, 2.5) / 4.7
  p_response_age <- c(0.5, 0.4, 1, 1.5,  3) / 6.4
  p_response_migback <- c(1, 0.8, 0.6) / 2.4
  p_response_state <- rbeta(n = 400, shape1 =  1, shape2 =  1)
  p_response_state <- p_response_state / sum(p_response_state)
  p_response <- rep(NA, prod(J))
  for (j in 1:prod(J)) {
    p_response[j] <-
      p_response_baseline * p_response_male[poststrat[j, 1] + 1] *
      p_response_job[poststrat[j, 2]] * p_response_age[poststrat[j, 3]] *
      p_response_migback[poststrat[j, 4]] * p_response_state[poststrat[j, 5]]
  people <- sample( prod(J), size = n, replace = TRUE, prob = poststrat$N * p_response) #sample random people depending on the proportions and response rate of that cell
  ## For respondent i, people[i] is that person's poststrat cell, (Which cell does that person in our sample stem from)
  ## some number between 1 and 32
  n_cell <- rep(NA, prod(J))
  for (j in 1:prod(J)) {
    n_cell[j] <- sum(people == j)
  coef_male <- c(0, -1)
  coef_job <- c(1, -1, 0.5)
  coef_age <- c(2, 0 , -1, -0.5, 1)
  coef_migback <- c(0, -0.5, -1)
  coef_state <- c(0, round(rnorm(399, 0, 1.5), 1))
  true_popn <- data.frame(poststrat[, 1:5], lsat = rep(NA, prod(J))) # get the first columns from the cell matrix, define a new one with variable of interest NAs
  for (j in 1:prod(J)) {
    true_popn$lsat_influence[j] <- sum(  # Influence on life satisfaction depending on subgroup
      coef_male[poststrat[j, 1] + 1] +
        coef_job[poststrat[j, 2]] + coef_age[poststrat[j, 3]] +
        coef_migback[poststrat[j, 4]] + coef_state[poststrat[j, 5]]
  true_popn$lsat <- round(rtruncnorm(n = 1, a = 0, b = 10, sd = 2, mean = 6 + true_popn$lsat_influence),0) #generate a life satisfaction value for each observation between 0 and 10
  #male or not, jobstatus, age, migback, state
  y <- round(true_popn$lsat[people] , 0) #for our sample "people" we get the life satisfaction from cell that the person belongs to
  male <- poststrat[people, 1]
  job <- poststrat[people, 2]
  age <- poststrat[people, 3]
  migback <- poststrat[people, 4]
  state <- poststrat[people, 5]
  sample <- data.frame(lsat = y, 
                       male, age, job, migback, state, 
                       id = 1:length(people)) #create a dataframe from our people sample
  #Make all numeric:
  for (i in 1:ncol(poststrat)) {
    poststrat[, i] <- as.numeric(poststrat[, i])
  for (i in 1:ncol(true_popn)) {
    true_popn[, i] <- as.numeric(true_popn[, i])
  for (i in 1:ncol(sample)) {
    sample[, i] <- as.numeric(sample[, i])
    sample = sample,
    poststrat = poststrat,
    true_popn = true_popn

# Generate sample data using the function ####
mrp_sim <- simulate_mrp_data(n=1000)
sample <- mrp_sim[["sample"]]
sample$state <- factor(sample$state, levels=1:400)

prior4 <- c(prior(normal(6,1), class = Intercept),
            prior(normal(0,0.75), class = b),
            prior(normal(0,0.75), class = sd),
            prior(exponential(0.5), class = sigma),
            prior(lkj(1), class = cor))

fit <- brm(
  formula = lsat ~ male + age + job + migback + (1 + male + age + job + migback | state),
  data = sample, family = gaussian,
  control=list(adapt_delta=.8, max_treedepth=15),
  chains = 8, cores = 8, iter = 4000, warmup = 1000,
  prior = prior4, save_all_pars = TRUE)

fit <- add_criterion(fit, "waic")
fit <- add_criterion(fit, "loo", moment_match = TRUE)

Any updates on this issue?
I’m having the exact same problem with versions “brms_2.13.5” and “rstan_2.21.2”. It happens on both R 4.0.2 (windows 10) and 3.6.3 (CentOS Linux 7). I did a fresh rstan/brms install on both systems. In my case setting “save_all_pars=TRUE” does not help either.

AndreasR: Did downgrading to 2.13.0 help in your case?

Can you post a minimal reproducible example that causes the error with R 4.0.x? I have still trouble reproducing this myself.

Here is code and data (both attached) that gives me the error. It might not be the “very minimal” example, but sufficiently minimal for you to start debugging, I hope.

Also my sessionInfo(), maybe useful:

R version 4.0.2 (2020-06-22)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 17763)

Matrix products: default

[1] LC_COLLATE=Finnish_Finland.1252  LC_CTYPE=Finnish_Finland.1252    LC_MONETARY=Finnish_Finland.1252
[4] LC_NUMERIC=C                     LC_TIME=Finnish_Finland.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] brms_2.13.5 Rcpp_1.0.5 

loaded via a namespace (and not attached):
 [1] Brobdingnag_1.2-6    jsonlite_1.7.0       gtools_3.8.2         StanHeaders_2.21.0-6 RcppParallel_5.0.2  
 [6] threejs_0.3.3        shiny_1.5.0          assertthat_0.2.1     stats4_4.0.2         backports_1.1.7     
[11] pillar_1.4.6         lattice_0.20-41      glue_1.4.1           digest_0.6.25        checkmate_2.0.0     
[16] promises_1.1.1       colorspace_1.4-1     htmltools_0.5.0      httpuv_1.5.4         Matrix_1.2-18       
[21] plyr_1.8.6           dygraphs_1.1.1.6     pkgconfig_2.0.3      rstan_2.21.2         purrr_0.3.4         
[26] xtable_1.8-4         mvtnorm_1.1-1        scales_1.1.1         processx_3.4.3       later_1.1.0.1       
[31] tibble_3.0.3         bayesplot_1.7.2      generics_0.0.2       ggplot2_3.3.2        ellipsis_0.3.1      
[36] DT_0.15              withr_2.2.0          shinyjs_1.1          cli_2.0.2            magrittr_1.5        
[41] crayon_1.3.4         mime_0.9             ps_1.3.4             fansi_0.4.1          nlme_3.1-148        
[46] xts_0.12-0           pkgbuild_1.1.0       colourpicker_1.0     rsconnect_0.8.16     tools_4.0.2         
[51] loo_2.3.1            prettyunits_1.1.1    lifecycle_0.2.0      matrixStats_0.56.0   stringr_1.4.0       
[56] V8_3.2.0             munsell_0.5.0        callr_3.4.3          compiler_4.0.2       rlang_0.4.7         
[61] grid_4.0.2           ggridges_0.5.2       rstudioapi_0.11      htmlwidgets_1.5.1    crosstalk_1.1.0.1   
[66] igraph_1.2.5         miniUI_0.1.1.1       base64enc_0.1-3      codetools_0.2-16     gtable_0.3.0        
[71] inline_0.3.15        abind_1.4-5          curl_4.3             markdown_1.1         reshape2_1.4.4      
[76] R6_2.4.1             gridExtra_2.3        rstantools_2.1.1     zoo_1.8-8            bridgesampling_1.0-0
[81] dplyr_1.0.2          fastmap_1.0.1        shinystan_2.5.0      shinythemes_1.1.2    stringi_1.4.6       
[86] parallel_4.0.2       vctrs_0.3.2          tidyselect_1.1.0     coda_0.19-3         

minimal_example.R (1.3 KB) my_testdata.csv (3.2 KB)

@paul.buerkner when I run this with cmdstan backend I get (on the line where we add criteria LOO):

Error: Backend 'rstan' is required for this method.

And when I run with backend="rstan" I get:

Error: $ operator not defined for this S4 class

so something is fishy…

Thanks! So many different error messages :-D

1 Like

Good news. I can finally reproduce the error and will work on a fix later on today.


I fixed at least 2 issues with loo_moment_match that were brms related. Depending on the model you may still need to install loo from github as well to avoid at least one further error.


What is the latest version of LOO? Is it 2.3.1?

Yeah that’s the latest CRAN release. The version on GitHub has a few bug fixes that still need to be released, so it could be advisable to install from GitHub: remotes::install_github("stan-dev/loo").

Ah ok yeah this is what I was wondering. Thank you!

1 Like