Brm summary failing with categorical response, multilevel model


I’m new to categorical models and want to check whether I should be using a different syntax for group-level effects. Each sample (e.g. a box containing items) is sampled exhaustively, and every item falls into one of 12 outcome categories. The total number of items can differ among samples, and I have 5 independent samples per treatment (control vs impact).

When I fit the model structure

model <- brms::brm(response ~ treatment + (1 | sample_id), data = data,
                   family = "categorical", chains = 4, cores = 4, iter = 1e4,
                   sample_prior = "yes")

It seems to fit fine, but then I get the following error message:

> summary(model)
Error: Duplicate variable names are not allowed in draws objects.
The following variable names are duplicates:

Looking at model$fit and stancode(model), I can see that a separate group-level standard deviation is estimated for each response category (which makes sense). However, they all share the same parameter name, which I think is what’s causing the issue.
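One way to confirm which names clash is to list the parameter names and pick out the repeats. The sketch below is only an illustration: the helper find_duplicates() and the names in par_names are made up, and with a fitted model the input would presumably be names(model$fit) instead.

```r
# Return every name that occurs more than once in a character vector.
find_duplicates <- function(x) {
  unique(x[duplicated(x)])
}

# Made-up stand-in for a vector of parameter names (NOT real brms output).
# With a fitted model, names(model$fit) could be passed in instead.
par_names <- c("alpha", "sigma", "sigma", "lp__")

dups <- find_duplicates(par_names)
print(dups)
```

An empty result would mean the names are unique and the error comes from somewhere else.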

Did I code this incorrectly? Thanks for your help! Full reprex below.

rstan::rstan_options(auto_write = TRUE)
options(mc.cores = parallel::detectCores())

data_as_string <- "response,treatment,sample_id\n A,control,ID_1\n C,control,ID_1\n C,control,ID_1\n D,control,ID_1\n E,control,ID_1\n B,control,ID_1\n C,control,ID_1\n D,control,ID_1\n E,control,ID_1\n D,control,ID_1\n C,control,ID_2\n C,control,ID_2\n C,control,ID_2\n E,control,ID_2\n E,control,ID_2\n C,control,ID_2\n C,control,ID_3\n C,control,ID_3\n C,control,ID_3\n C,control,ID_3\n C,control,ID_4\n C,control,ID_4\n C,control,ID_4\n D,control,ID_4\n D,control,ID_4\n F,control,ID_4\n C,control,ID_4\n G,control,ID_4\n C,control,ID_4\n D,control,ID_4\n F,control,ID_4\n F,control,ID_4\n C,control,ID_4\n D,control,ID_4\n C,control,ID_4\n F,control,ID_4\n H,control,ID_4\n D,control,ID_4\n D,control,ID_4\n D,control,ID_4\n C,control,ID_5\n C,control,ID_5\n E,control,ID_5\n E,control,ID_5\n E,control,ID_5\n C,control,ID_5\n I,control,ID_5\n D,control,ID_5\n C,control,ID_5\n G,impact,ID_6\n C,impact,ID_6\n E,impact,ID_6\n C,impact,ID_6\n F,impact,ID_6\n G,impact,ID_6\n C,impact,ID_6\n C,impact,ID_6\n D,impact,ID_6\n D,impact,ID_6\n C,impact,ID_6\n C,impact,ID_6\n D,impact,ID_6\n D,impact,ID_6\n D,impact,ID_6\n G,impact,ID_6\n C,impact,ID_6\n D,impact,ID_6\n C,impact,ID_6\n G,impact,ID_7\n C,impact,ID_7\n D,impact,ID_7\n C,impact,ID_7\n J,impact,ID_8\n D,impact,ID_8\n C,impact,ID_8\n C,impact,ID_8\n D,impact,ID_8\n D,impact,ID_8\n C,impact,ID_8\n C,impact,ID_8\n C,impact,ID_8\n E,impact,ID_8\n D,impact,ID_8\n C,impact,ID_9\n E,impact,ID_9\n G,impact,ID_9\n C,impact,ID_9\n E,impact,ID_9\n C,impact,ID_9\n C,impact,ID_9\n E,impact,ID_9\n C,impact,ID_9\n E,impact,ID_9\n L,impact,ID_9\n D,impact,ID_10\n M,impact,ID_10\n D,impact,ID_10\n D,impact,ID_10\n C,impact,ID_10\n C,impact,ID_10\n D,impact,ID_10\n D,impact,ID_10\n D,impact,ID_10\n E,impact,ID_10\n C,impact,ID_10\n C,impact,ID_10\n E,impact,ID_10\n D,impact,ID_10\n C,impact,ID_10\n D,impact,ID_10\n D,impact,ID_10\n D,impact,ID_10\n D,impact,ID_10\n"

data <- read.csv(text = data_as_string, sep = ",", header = TRUE,
                 stringsAsFactors = FALSE)

model <- brms::brm(response ~ treatment + (1 | sample_id), data = data,
                   family = "categorical", chains = 4, cores = 4, iter = 1e4,
                   sample_prior = "yes")

> sessionInfo()
R version 4.1.0 (2021-05-18)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Catalina 10.15.7

Matrix products: default
BLAS:   /Library/Frameworks/R.framework/Versions/4.1/Resources/lib/libRblas.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.1/Resources/lib/libRlapack.dylib

locale:
[1] en_AU.UTF-8/en_AU.UTF-8/en_AU.UTF-8/C/en_AU.UTF-8/en_AU.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] brms_2.16.1 Rcpp_1.0.7 

loaded via a namespace (and not attached):
  [1] nlme_3.1-152         matrixStats_0.60.1   xts_0.12.1           threejs_0.3.3        rstan_2.21.3        
  [6] tensorA_0.36.2       tools_4.1.0          backports_1.2.1      utf8_1.2.2           R6_2.5.1            
 [11] DT_0.19              DBI_1.1.1            mgcv_1.8-35          projpred_2.0.2       colorspace_2.0-2    
 [16] tidyselect_1.1.1     gridExtra_2.3        prettyunits_1.1.1    processx_3.5.2       Brobdingnag_1.2-6   
 [21] emmeans_1.6.0        curl_4.3.2           compiler_4.1.0       cli_3.0.1            shinyjs_2.0.0       
 [26] sandwich_3.0-1       colourpicker_1.1.0   posterior_1.1.0      scales_1.1.1         dygraphs_1.1.1.6    
 [31] checkmate_2.0.0      mvtnorm_1.1-2        ggridges_0.5.3       callr_3.7.0          StanHeaders_2.21.0-7
 [36] stringr_1.4.0        digest_0.6.27        minqa_1.2.4          base64enc_0.1-3      pkgconfig_2.0.3     
 [41] htmltools_0.5.2      lme4_1.1-27.1        fastmap_1.1.0        htmlwidgets_1.5.4    rlang_0.4.11        
 [46] shiny_1.6.0          farver_2.1.0         generics_0.1.0       jsonlite_1.7.2       zoo_1.8-9           
 [51] crosstalk_1.1.1      gtools_3.9.2         dplyr_1.0.7          distributional_0.2.2 inline_0.3.19       
 [56] magrittr_2.0.1       loo_2.4.1            bayesplot_1.8.1      Matrix_1.3-3         munsell_0.5.0       
 [61] fansi_0.5.0          abind_1.4-5          lifecycle_1.0.0      multcomp_1.4-17      stringi_1.7.4       
 [66] MASS_7.3-54          pkgbuild_1.2.0       plyr_1.8.6           grid_4.1.0           parallel_4.1.0      
 [71] promises_1.2.0.1     crayon_1.4.1         miniUI_0.1.1.1       lattice_0.20-44      splines_4.1.0       
 [76] ps_1.6.0             pillar_1.6.2         igraph_1.2.6         boot_1.3-28          estimability_1.3    
 [81] markdown_1.1         shinystan_2.5.0      codetools_0.2-18     reshape2_1.4.4       stats4_4.1.0        
 [86] rstantools_2.1.1     glue_1.4.2           V8_3.4.2             RcppParallel_5.1.4   vctrs_0.3.8         
 [91] nloptr_1.2.2.2       httpuv_1.6.2         gtable_0.3.0         purrr_0.3.4          assertthat_0.2.1    
 [96] ggplot2_3.3.5        mime_0.11            xtable_1.8-4         coda_0.19-4          later_1.3.0         
[101] survival_3.2-11      rsconnect_0.8.24     tibble_3.1.4         shinythemes_1.2.0    gamm4_0.2-6         
[106] TH.data_1.0-10       ellipsis_0.3.2       bridgesampling_1.1-2

I think this has something to do with the sample_prior = "yes" argument. I ran your code after converting all the variables to factors and got the same error. When I removed the sample_prior = "yes" argument and refit the model, summary() worked fine. The default for that argument is "no".
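For reference, the factor conversion mentioned above can be done in one line. This is a minimal sketch on a toy data frame (toy is a made-up stand-in with the same three columns as the reprex data, not the real data):

```r
# Toy stand-in for the reprex data: same column names, far fewer rows.
toy <- data.frame(
  response  = c("A", "C", "C", "D"),
  treatment = c("control", "control", "impact", "impact"),
  sample_id = c("ID_1", "ID_1", "ID_6", "ID_6"),
  stringsAsFactors = FALSE
)

# Convert every column to a factor; toy[] keeps the data.frame structure.
toy[] <- lapply(toy, factor)

str(toy)
```

The same line applied to the reprex object would be data[] <- lapply(data, factor), followed by the brm() call without sample_prior = "yes".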

Thanks @jd_c, setting sample_prior = "no" also worked on my end. However, with the default sample_prior it takes a bit more code to compare posteriors and priors, because I can’t do it via plot(hypothesis(...)). Is this expected behaviour?
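Until that works, a prior-versus-posterior comparison can be put together by hand. The sketch below rests on assumptions that should be checked against the actual model: it assumes the group-level sd prior is a half-Student-t with 3 degrees of freedom and scale 2.5 (prior_summary(model) shows the prior actually used), the helper draw_half_student_t() is made up for illustration, and posterior_draws is a simulated placeholder rather than real output.

```r
# Draw from a half-Student-t distribution: |t_df| multiplied by `scale`.
# Assumption: the sd prior is student_t(3, 0, 2.5) truncated at zero.
draw_half_student_t <- function(n, df = 3, scale = 2.5) {
  abs(scale * stats::rt(n, df = df))
}

set.seed(1)
prior_draws <- draw_half_student_t(4000)

# Simulated placeholder for the posterior draws of one sd parameter.
# With a fitted model these would come from a column of as.data.frame(model).
posterior_draws <- abs(stats::rnorm(4000, mean = 1, sd = 0.3))

# Overlay both densities on a truncated x-range, since the half-t prior
# has a heavy right tail that would otherwise dominate the plot.
plot(stats::density(posterior_draws, from = 0, to = 10),
     xlim = c(0, 10), main = "Prior vs posterior (sketch)", xlab = "sd")
lines(stats::density(prior_draws, from = 0, to = 10), lty = 2)
legend("topright", legend = c("posterior", "prior"), lty = c(1, 2))
```

This only covers the visual comparison, not the evidence ratios that hypothesis() reports.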

I don’t use the hypothesis() method, so I wouldn’t know. But read the note about intercepts in the Details of the hypothesis() section of the brms manual. I’m not sure whether that is related.
@paul.buerkner would know

Paul has fixed it on GitHub.