Convergence issues with item response model with hurdle negative binomial response distribution

I’m trying to fit an item response model (Bürkner, 2020) using brms. I’d like to fit a two-parameter IRT model that estimates difficulty/easiness (i.e., eta) and discrimination (i.e., alpha). The items are count data with zero-inflation and heavy skewness/overdispersion (the variance is greater than the mean). A negative binomial response distribution appears to fit the data better than a Poisson distribution. I’ve been trying to model the response distribution with a hurdle negative binomial model to account for the zero-inflation, but I am having difficulty getting the model to converge.

Because of these convergence difficulties, I tried simulating data that the model should be able to fit. However, I’m still seeing convergence problems (e.g., high R-hat values) even with the simulated data. A reproducible example is below. I simulated the items from a Poisson distribution (I’m not sure how to simulate correlated data from a negative binomial distribution).

Here is the code for the model:

brms_formula <- bf(
  response ~ exp(logalpha) * eta,
  eta ~ 1 + predictor + (1 | item) + (1 | subid),
  logalpha ~ 1 + predictor + (1 | item),
  shape ~ 1 + predictor,
  hu ~ 1 + predictor,
  nl = TRUE
)

model_fit <- brm(
  formula = brms_formula,
  data = simData,
  family = brmsfamily("hurdle_negbinomial"),
  chains = 1,
  iter = 10000,
  init = 0,
  seed = 52242,
  control = list(adapt_delta = 0.85),
  backend = "cmdstanr"
)

Here is the reproducible example:

library("simstudy")
library("brms")
#> Loading required package: Rcpp
#> Loading 'brms' package (version 2.20.4). Useful instructions
#> can be found by typing help('brms'). A more detailed introduction
#> to the package is available through vignette('brms_overview').
#> 
#> Attaching package: 'brms'
#> The following object is masked from 'package:stats':
#> 
#>     ar
library("cmdstanr")
#> This is cmdstanr version 0.6.1
#> - CmdStanR documentation and vignettes: mc-stan.org/cmdstanr
#> - CmdStan path: D:/Documents/.cmdstan/cmdstan-2.33.1
#> - CmdStan version: 2.33.1
library("tidyverse")

# Simulate Data
set.seed(52242)

lambdas <- c(1, 8, 10, 12, 20)

simData <- genCorGen(
  1000,
  nvars = length(lambdas),
  params1 = lambdas, 
  dist = "poisson",
  rho = .6, 
  corstr = "cs",
  wide = FALSE)

simData <- simData %>% 
  rename(
    item = period,
    subid = id,
    response = X
  )

simData$predictor <- simData$response + rnorm(nrow(simData), sd = 20)

cor(simData)
#>                 subid      item   response   predictor
#> subid     1.000000000 0.0000000 0.01872772 0.002656721
#> item      0.000000000 1.0000000 0.85882741 0.279491061
#> response  0.018727723 0.8588274 1.00000000 0.324219762
#> predictor 0.002656721 0.2794911 0.32421976 1.000000000

# Model syntax
brms_formula <- bf(
  response ~ exp(logalpha) * eta,
  eta ~ 1 + predictor + (1 | item) + (1 | subid),
  logalpha ~ 1 + predictor + (1 | item),
  shape ~ 1 + predictor,
  hu ~ 1 + predictor,
  nl = TRUE
)

# Model priors
get_prior(
  brms_formula,
  data = simData,
  family = brmsfamily("hurdle_negbinomial")
)
#>                 prior     class      coef group resp  dpar    nlpar lb ub
#>                (flat)         b                                 eta      
#>                (flat)         b Intercept                       eta      
#>                (flat)         b predictor                       eta      
#>  student_t(3, 0, 2.5)        sd                                 eta  0   
#>  student_t(3, 0, 2.5)        sd            item                 eta  0   
#>  student_t(3, 0, 2.5)        sd Intercept  item                 eta  0   
#>  student_t(3, 0, 2.5)        sd           subid                 eta  0   
#>  student_t(3, 0, 2.5)        sd Intercept subid                 eta  0   
#>                (flat)         b                            logalpha      
#>                (flat)         b Intercept                  logalpha      
#>                (flat)         b predictor                  logalpha      
#>  student_t(3, 0, 2.5)        sd                            logalpha  0   
#>  student_t(3, 0, 2.5)        sd            item            logalpha  0   
#>  student_t(3, 0, 2.5)        sd Intercept  item            logalpha  0   
#>                (flat)         b                         hu               
#>                (flat)         b predictor               hu               
#>        logistic(0, 1) Intercept                         hu               
#>                (flat)         b                      shape               
#>                (flat)         b predictor            shape               
#>  student_t(3, 0, 2.5) Intercept                      shape               
#>        source
#>       default
#>  (vectorized)
#>  (vectorized)
#>       default
#>  (vectorized)
#>  (vectorized)
#>  (vectorized)
#>  (vectorized)
#>       default
#>  (vectorized)
#>  (vectorized)
#>       default
#>  (vectorized)
#>  (vectorized)
#>       default
#>  (vectorized)
#>       default
#>       default
#>  (vectorized)
#>       default

# Fit the model
model_fit <- brm(
  formula = brms_formula,
  data = simData,
  family = brmsfamily("hurdle_negbinomial"),
  chains = 1,
  iter = 10000,
  init = 0,
  seed = 52242,
  control = list(adapt_delta = 0.85),
  backend = "cmdstanr"
)
#> In file included from stan/lib/stan_math/stan/math/prim/prob/von_mises_lccdf.hpp:5,
#>                  from stan/lib/stan_math/stan/math/prim/prob/von_mises_ccdf_log.hpp:4,
#>                  from stan/lib/stan_math/stan/math/prim/prob.hpp:356,
#>                  from stan/lib/stan_math/stan/math/prim.hpp:16,
#>                  from stan/lib/stan_math/stan/math/rev.hpp:14,
#>                  from stan/lib/stan_math/stan/math.hpp:19,
#>                  from stan/src/stan/model/model_header.hpp:4,
#>                  from C:/Users/ITPETE~1/AppData/Local/Temp/RtmpK82p8f/model-cec4452265f.hpp:2:
#> stan/lib/stan_math/stan/math/prim/prob/von_mises_cdf.hpp: In function 'stan::return_type_t<T_x, T_sigma, T_l> stan::math::von_mises_cdf(const T_x&, const T_mu&, const T_k&)':
#> stan/lib/stan_math/stan/math/prim/prob/von_mises_cdf.hpp:194: note: '-Wmisleading-indentation' is disabled from this point onwards, since column-tracking was disabled due to the size of the code/headers
#>   194 |       if (cdf_n < 0.0)
#>       |
#> stan/lib/stan_math/stan/math/prim/prob/von_mises_cdf.hpp:194: note: adding '-flarge-source-files' will allow for more column-tracking support, at the expense of compilation time and memory
#> Start sampling
#> Chain 1 Informational Message: The current Metropolis proposal is about to be rejected because of the following issue:
#> Chain 1 Exception: Exception: neg_binomial_2_log_lpmf: Precision parameter is inf, but must be positive finite! (in 'C:/Users/ITPETE~1/AppData/Local/Temp/RtmpK82p8f/model-cec4452265f.stan', line 71, column 6 to line 73, column 53) (in 'C:/Users/ITPETE~1/AppData/Local/Temp/RtmpK82p8f/model-cec4452265f.stan', line 195, column 6 to column 82)
#> Chain 1 If this warning occurs sporadically, such as for highly constrained variable types like covariance matrices, then the sampler is fine,
#> Chain 1 but if this warning occurs often then your model may be either severely ill-conditioned or misspecified.
#> Chain 1
#> Chain 1 Informational Message: The current Metropolis proposal is about to be rejected because of the following issue:
#> Chain 1 Exception: Exception: neg_binomial_2_log_lpmf: Precision parameter is inf, but must be positive finite! (in 'C:/Users/ITPETE~1/AppData/Local/Temp/RtmpK82p8f/model-cec4452265f.stan', line 71, column 6 to line 73, column 53) (in 'C:/Users/ITPETE~1/AppData/Local/Temp/RtmpK82p8f/model-cec4452265f.stan', line 195, column 6 to column 82)
#> Chain 1 If this warning occurs sporadically, such as for highly constrained variable types like covariance matrices, then the sampler is fine,
#> Chain 1 but if this warning occurs often then your model may be either severely ill-conditioned or misspecified.
#> Chain 1
#> Chain 1 Informational Message: The current Metropolis proposal is about to be rejected because of the following issue:
#> Chain 1 Exception: Exception: neg_binomial_2_log_lpmf: Precision parameter is inf, but must be positive finite! (in 'C:/Users/ITPETE~1/AppData/Local/Temp/RtmpK82p8f/model-cec4452265f.stan', line 71, column 6 to line 73, column 53) (in 'C:/Users/ITPETE~1/AppData/Local/Temp/RtmpK82p8f/model-cec4452265f.stan', line 195, column 6 to column 82)
#> Chain 1 If this warning occurs sporadically, such as for highly constrained variable types like covariance matrices, then the sampler is fine,
#> Chain 1 but if this warning occurs often then your model may be either severely ill-conditioned or misspecified.
#> Chain 1
#> Chain 1 Informational Message: The current Metropolis proposal is about to be rejected because of the following issue:
#> Chain 1 Exception: Exception: neg_binomial_2_log_lpmf: Precision parameter is inf, but must be positive finite! (in 'C:/Users/ITPETE~1/AppData/Local/Temp/RtmpK82p8f/model-cec4452265f.stan', line 71, column 6 to line 73, column 53) (in 'C:/Users/ITPETE~1/AppData/Local/Temp/RtmpK82p8f/model-cec4452265f.stan', line 195, column 6 to column 82)
#> Chain 1 If this warning occurs sporadically, such as for highly constrained variable types like covariance matrices, then the sampler is fine,
#> Chain 1 but if this warning occurs often then your model may be either severely ill-conditioned or misspecified.
#> Chain 1
#> Chain 1 Informational Message: The current Metropolis proposal is about to be rejected because of the following issue:
#> Chain 1 Exception: Exception: neg_binomial_2_log_lpmf: Precision parameter is inf, but must be positive finite! (in 'C:/Users/ITPETE~1/AppData/Local/Temp/RtmpK82p8f/model-cec4452265f.stan', line 71, column 6 to line 73, column 53) (in 'C:/Users/ITPETE~1/AppData/Local/Temp/RtmpK82p8f/model-cec4452265f.stan', line 195, column 6 to column 82)
#> Chain 1 If this warning occurs sporadically, such as for highly constrained variable types like covariance matrices, then the sampler is fine,
#> Chain 1 but if this warning occurs often then your model may be either severely ill-conditioned or misspecified.
#> Chain 1
#> Chain 1 Informational Message: The current Metropolis proposal is about to be rejected because of the following issue:
#> Chain 1 Exception: Exception: neg_binomial_2_log_lpmf: Precision parameter is inf, but must be positive finite! (in 'C:/Users/ITPETE~1/AppData/Local/Temp/RtmpK82p8f/model-cec4452265f.stan', line 71, column 6 to line 73, column 53) (in 'C:/Users/ITPETE~1/AppData/Local/Temp/RtmpK82p8f/model-cec4452265f.stan', line 195, column 6 to column 82)
#> Chain 1 If this warning occurs sporadically, such as for highly constrained variable types like covariance matrices, then the sampler is fine,
#> Chain 1 but if this warning occurs often then your model may be either severely ill-conditioned or misspecified.
#> Chain 1
#> Chain 1 Informational Message: The current Metropolis proposal is about to be rejected because of the following issue:
#> Chain 1 Exception: Exception: neg_binomial_2_log_lpmf: Precision parameter is 0, but must be positive finite! (in 'C:/Users/ITPETE~1/AppData/Local/Temp/RtmpK82p8f/model-cec4452265f.stan', line 71, column 6 to line 73, column 53) (in 'C:/Users/ITPETE~1/AppData/Local/Temp/RtmpK82p8f/model-cec4452265f.stan', line 195, column 6 to column 82)
#> Chain 1 If this warning occurs sporadically, such as for highly constrained variable types like covariance matrices, then the sampler is fine,
#> Chain 1 but if this warning occurs often then your model may be either severely ill-conditioned or misspecified.
#> Chain 1
#> Chain 1 Informational Message: The current Metropolis proposal is about to be rejected because of the following issue:
#> Chain 1 Exception: Exception: neg_binomial_2_log_lpmf: Precision parameter is inf, but must be positive finite! (in 'C:/Users/ITPETE~1/AppData/Local/Temp/RtmpK82p8f/model-cec4452265f.stan', line 71, column 6 to line 73, column 53) (in 'C:/Users/ITPETE~1/AppData/Local/Temp/RtmpK82p8f/model-cec4452265f.stan', line 195, column 6 to column 82)
#> Chain 1 If this warning occurs sporadically, such as for highly constrained variable types like covariance matrices, then the sampler is fine,
#> Chain 1 but if this warning occurs often then your model may be either severely ill-conditioned or misspecified.
#> Chain 1
#> Chain 1 Informational Message: The current Metropolis proposal is about to be rejected because of the following issue:
#> Chain 1 Exception: Exception: neg_binomial_2_log_lpmf: Precision parameter is 0, but must be positive finite! (in 'C:/Users/ITPETE~1/AppData/Local/Temp/RtmpK82p8f/model-cec4452265f.stan', line 71, column 6 to line 73, column 53) (in 'C:/Users/ITPETE~1/AppData/Local/Temp/RtmpK82p8f/model-cec4452265f.stan', line 195, column 6 to column 82)
#> Chain 1 If this warning occurs sporadically, such as for highly constrained variable types like covariance matrices, then the sampler is fine,
#> Chain 1 but if this warning occurs often then your model may be either severely ill-conditioned or misspecified.
#> Chain 1
#> Chain 1 Informational Message: The current Metropolis proposal is about to be rejected because of the following issue:
#> Chain 1 Exception: Exception: neg_binomial_2_log_lpmf: Log location parameter is inf, but must be finite! (in 'C:/Users/ITPETE~1/AppData/Local/Temp/RtmpK82p8f/model-cec4452265f.stan', line 71, column 6 to line 73, column 53) (in 'C:/Users/ITPETE~1/AppData/Local/Temp/RtmpK82p8f/model-cec4452265f.stan', line 195, column 6 to column 82)
#> Chain 1 If this warning occurs sporadically, such as for highly constrained variable types like covariance matrices, then the sampler is fine,
#> Chain 1 but if this warning occurs often then your model may be either severely ill-conditioned or misspecified.
#> Chain 1
#> Chain 1 Informational Message: The current Metropolis proposal is about to be rejected because of the following issue:
#> Chain 1 Exception: Exception: neg_binomial_2_log_lpmf: Precision parameter is 0, but must be positive finite! (in 'C:/Users/ITPETE~1/AppData/Local/Temp/RtmpK82p8f/model-cec4452265f.stan', line 71, column 6 to line 73, column 53) (in 'C:/Users/ITPETE~1/AppData/Local/Temp/RtmpK82p8f/model-cec4452265f.stan', line 195, column 6 to column 82)
#> Chain 1 If this warning occurs sporadically, such as for highly constrained variable types like covariance matrices, then the sampler is fine,
#> Chain 1 but if this warning occurs often then your model may be either severely ill-conditioned or misspecified.
#> Chain 1
#> Warning: 5000 of 5000 (100.0%) transitions ended with a divergence.
#> See https://mc-stan.org/misc/warnings for details.

# Model summary
summary(model_fit)
#> Warning: Parts of the model have not converged (some Rhats are > 1.05). Be
#> careful when analysing the results! We recommend running more iterations and/or
#> setting stronger priors.
#> Warning: There were 5000 divergent transitions after warmup. Increasing
#> adapt_delta above 0.85 may help. See
#> http://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
#>  Family: hurdle_negbinomial 
#>   Links: mu = log; shape = log; hu = logit 
#> Formula: response ~ exp(logalpha) * eta 
#>          eta ~ 1 + predictor + (1 | item) + (1 | subid)
#>          logalpha ~ 1 + predictor + (1 | item)
#>          shape ~ 1 + predictor
#>          hu ~ 1 + predictor
#>    Data: simData (Number of observations: 5000) 
#>   Draws: 1 chains, each with iter = 10000; warmup = 5000; thin = 1;
#>          total post-warmup draws = 5000
#> 
#> Group-Level Effects: 
#> ~item (Number of levels: 5) 
#>                        Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS
#> sd(eta_Intercept)          1.00      0.00     1.00     1.00 1.39        2
#> sd(logalpha_Intercept)     1.00      0.00     1.00     1.00 1.97        1
#>                        Tail_ESS
#> sd(eta_Intercept)            20
#> sd(logalpha_Intercept)       10
#> 
#> ~subid (Number of levels: 1000) 
#>                   Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
#> sd(eta_Intercept)     1.00      0.00     1.00     1.00 1.72        2       30
#> 
#> Population-Level Effects: 
#>                    Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
#> shape_Intercept       -2.52      0.01    -2.53    -2.50 1.76        2       12
#> hu_Intercept           0.05      0.00     0.05     0.06 1.52        2       11
#> eta_Intercept          0.08      0.00     0.08     0.08 1.89        1       17
#> eta_predictor          0.13      0.00     0.12     0.13 1.44        4       24
#> logalpha_Intercept     0.02      0.00     0.01     0.02 1.98        1       20
#> logalpha_predictor    -0.02      0.00    -0.02    -0.02 1.70        2       17
#> shape_predictor        0.23      0.00     0.23     0.23 1.76        2       11
#> hu_predictor          -0.01      0.00    -0.01    -0.01 1.44        2       11
#> 
#> Draws were sampled using sample(hmc). For each parameter, Bulk_ESS
#> and Tail_ESS are effective sample size measures, and Rhat is the potential
#> scale reduction factor on split chains (at convergence, Rhat = 1).

sessionInfo()
#> R version 4.3.1 (2023-06-16 ucrt)
#> Platform: x86_64-w64-mingw32/x64 (64-bit)
#> Running under: Windows 10 x64 (build 19045)
#> 
#> Matrix products: default
#> 
#> 
#> locale:
#> [1] LC_COLLATE=English_United States.utf8 
#> [2] LC_CTYPE=English_United States.utf8   
#> [3] LC_MONETARY=English_United States.utf8
#> [4] LC_NUMERIC=C                          
#> [5] LC_TIME=English_United States.utf8    
#> 
#> time zone: America/Chicago
#> tzcode source: internal
#> 
#> attached base packages:
#> [1] stats     graphics  grDevices utils     datasets  methods   base     
#> 
#> other attached packages:
#>  [1] lubridate_1.9.3 forcats_1.0.0   stringr_1.5.0   dplyr_1.1.3    
#>  [5] purrr_1.0.2     readr_2.1.4     tidyr_1.3.0     tibble_3.2.1   
#>  [9] ggplot2_3.4.4   tidyverse_2.0.0 cmdstanr_0.6.1  brms_2.20.4    
#> [13] Rcpp_1.0.11     simstudy_0.7.0 
#> 
#> loaded via a namespace (and not attached):
#>   [1] gridExtra_2.3        inline_0.3.19        rlang_1.1.1         
#>   [4] magrittr_2.0.3       matrixStats_1.0.0    compiler_4.3.1      
#>   [7] loo_2.6.0            callr_3.7.3          vctrs_0.6.4         
#>  [10] reshape2_1.4.4       pkgconfig_2.0.3      crayon_1.5.2        
#>  [13] fastmap_1.1.1        backports_1.4.1      ellipsis_0.3.2      
#>  [16] utf8_1.2.3           threejs_0.3.3        promises_1.2.1      
#>  [19] rmarkdown_2.25       tzdb_0.4.0           markdown_1.11       
#>  [22] ps_1.7.5             xfun_0.40            reprex_2.0.2        
#>  [25] jsonlite_1.8.7       later_1.3.1          styler_1.10.2       
#>  [28] parallel_4.3.1       prettyunits_1.2.0    R6_2.5.1            
#>  [31] dygraphs_1.1.1.6     stringi_1.7.12       StanHeaders_2.26.28 
#>  [34] rstan_2.32.3         knitr_1.44           zoo_1.8-12          
#>  [37] base64enc_0.1-3      R.utils_2.12.2       mvnfast_0.2.8       
#>  [40] bayesplot_1.10.0     timechange_0.2.0     httpuv_1.6.11       
#>  [43] Matrix_1.6-1.1       R.cache_0.16.0       igraph_1.5.1        
#>  [46] tidyselect_1.2.0     rstudioapi_0.15.0    abind_1.4-5         
#>  [49] yaml_2.3.7           codetools_0.2-19     miniUI_0.1.1.1      
#>  [52] curl_5.1.0           processx_3.8.2       pkgbuild_1.4.2      
#>  [55] lattice_0.21-9       plyr_1.8.9           shiny_1.7.5.1       
#>  [58] withr_2.5.1          bridgesampling_1.1-2 posterior_1.4.1     
#>  [61] coda_0.19-4          evaluate_0.22        RcppParallel_5.1.7  
#>  [64] xts_0.13.1           pillar_1.9.0         tensorA_0.36.2      
#>  [67] checkmate_2.2.0      DT_0.30              stats4_4.3.1        
#>  [70] shinyjs_2.1.0        distributional_0.3.2 generics_0.1.3      
#>  [73] hms_1.1.3            rstantools_2.3.1.1   munsell_0.5.0       
#>  [76] scales_1.2.1         gtools_3.9.4         xtable_1.8-4        
#>  [79] glue_1.6.2           tools_4.3.1          shinystan_2.6.0     
#>  [82] data.table_1.14.8    colourpicker_1.3.0   fs_1.6.3            
#>  [85] mvtnorm_1.2-3        grid_4.3.1           QuickJSR_1.0.7      
#>  [88] crosstalk_1.2.0      colorspace_2.1-0     nlme_3.1-163        
#>  [91] cli_3.6.1            fansi_1.0.5          Brobdingnag_1.2-9   
#>  [94] V8_4.4.0             gtable_0.3.4         R.methodsS3_1.8.2   
#>  [97] digest_0.6.33        htmlwidgets_1.6.2    farver_2.1.1        
#> [100] htmltools_0.5.6.1    R.oo_1.25.0          lifecycle_1.0.3     
#> [103] mime_0.12            shinythemes_1.2.0

Created on 2023-10-21 with reprex v2.0.2

I’m not sure whether I need to make an adjustment to the model, priors, and/or response distribution. In the models with my actual (non-simulated) data, I’ve also further increased adapt_delta, but with no luck. I had similar convergence difficulties when fitting a model to my actual data using other response distributions (e.g., zero-inflated negative binomial, hurdle/zero-inflated Poisson). I’m hopeful that working through and solving the issue with fitting the model to these simulated data will help me make the necessary adjustments to fit the model to my actual data. I’m a bit new to the hurdle negative binomial model, so any help and feedback would be greatly appreciated.

I’ve copied/pasted part of your summary() output. Notice that brm() defaults to the log link for the \mu model, yet in your brms_formula() code, you have already applied exp() on the right-hand side of the equation. If you look through Paul’s vignette here, you’ll see you might be unintentionally applying the link function twice. So the first step might be setting family = brmsfamily("hurdle_negbinomial", link = "identity") to use the identity link for the \mu model.
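In code, that adjustment might look something like this (a sketch only, reusing your existing brms_formula and data; everything except the family's link for \mu stays the same):

```r
# Sketch: use the identity link for mu, since exp() is already applied
# inside the nonlinear formula. All other settings are unchanged from
# the original call.
model_fit_identity <- brm(
  formula = brms_formula,
  data = simData,
  family = brmsfamily("hurdle_negbinomial", link = "identity"),
  chains = 1,
  iter = 10000,
  init = 0,
  seed = 52242,
  control = list(adapt_delta = 0.85),
  backend = "cmdstanr"
)
```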

Two more thoughts:

  • In my experience, default priors don’t tend to work well with non-linear models, so you’ll want to use priors that are at least weakly informative to help guide the model.
  • Have you had success with much simpler versions of the model, say without the complications of the conditional shape and hu parameters, and perhaps with even fewer random effects? Whenever I’m building up to a complicated model, I always start simple first, which helps me shake out any unexpected bugs.
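To make the first bullet concrete, here is one way weakly informative priors could be specified with brms's prior() interface. The specific distributions and scales below are illustrative assumptions, not recommendations tuned to your data; the classes and nlpar names come from your get_prior() output.

```r
library(brms)

# Hypothetical weakly informative priors for the nonlinear IRT model.
# Scales are assumptions and should be adjusted after prior predictive checks.
irt_priors <- c(
  prior(normal(0, 2), class = "b", nlpar = "eta"),        # eta fixed effects
  prior(normal(0, 1), class = "b", nlpar = "logalpha"),   # discrimination (log scale)
  prior(exponential(1), class = "sd", nlpar = "eta"),      # group-level SDs for eta
  prior(exponential(1), class = "sd", nlpar = "logalpha")  # group-level SDs for logalpha
)

# These would then be passed to brm() via the prior argument, e.g.
# brm(brms_formula, data = simData, prior = irt_priors, ...)
```

Prior predictive simulation (sample_prior = "only" in brm()) is a useful way to check whether choices like these imply plausible response counts before fitting.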

Hi Solomon,

Thank you for the very helpful response. I was able to get the simulated model to converge by removing the predictors of the shape and hu parameters (as you suggested) and by increasing the number of iterations to 30,000. That’s great–thank you! (Updated reproducible example below.) I am now trying to see whether I can get the model with my actual data to fit by implementing these changes.

Here is the updated code for the model:

brms_formula <- bf(
  response ~ exp(logalpha) * eta,
  eta ~ 1 + predictor + (1 | item) + (1 | subid),
  logalpha ~ 1 + predictor + (1 | item),
  #shape ~ 1 + predictor,
  #hu ~ 1 + predictor,
  nl = TRUE
)

model_fit <- brm(
  formula = brms_formula,
  data = simData,
  family = brmsfamily("hurdle_negbinomial"),
  chains = 1,
  iter = 30000, # previously 10000
  init = 0,
  seed = 52242,
  backend = "cmdstanr"
)

A few follow-up questions:

  1. Do you have any guidance for setting priors in these kinds of models? I’m relatively new to setting priors in Bayesian models, and it’s hard for me to know what the estimated parameters will/should be in these nonlinear models, so it’s difficult to think about what would be appropriate priors for each parameter that are at least weakly informative. Any guidance on how to go about setting weakly informative priors for these models would be greatly appreciated.
  2. Are you sure that I’m applying the link function twice? Is there a way to confirm? The post you linked to suggested that it is important to use link = identity for the bernoulli family when the nonlinear predictor term already contains the desired link function, in that case, the inverse logit. This is similar to the example on p. 34 of Bürkner’s (2020) white paper. However, on pp. 27–28 of the white paper, Paul provides an example of a Bernoulli model with a logit link whose nonlinear predictor term contains the exp() function, similar to my model. Moreover, the model converges with the default log link, but it does not converge when I set link = identity. The error when I set link = identity is below:
#> Start sampling
#> Chain 1 Exception: Exception: neg_binomial_2_lpmf: Location parameter is 0, but must be positive finite! (in 'C:/Users/ITPETE~1/AppData/Local/Temp/RtmpAHPnjP/model-220435b24831.stan', line 16, column 6 to line 17, column 47) (in 'C:/Users/ITPETE~1/AppData/Local/Temp/RtmpAHPnjP/model-220435b24831.stan', line 169, column 6 to column 66)
#> Chain 1 Initialization failed.
#> Warning: No chains finished successfully. Unable to retrieve the fit.

See the reproducible example below:

library("simstudy")
library("brms")
#> Loading required package: Rcpp
#> Loading 'brms' package (version 2.20.4). Useful instructions
#> can be found by typing help('brms'). A more detailed introduction
#> to the package is available through vignette('brms_overview').
#> 
#> Attaching package: 'brms'
#> The following object is masked from 'package:stats':
#> 
#>     ar
library("cmdstanr")
#> This is cmdstanr version 0.6.1
#> - CmdStanR documentation and vignettes: mc-stan.org/cmdstanr
#> - CmdStan path: D:/Documents/.cmdstan/cmdstan-2.33.1
#> - CmdStan version: 2.33.1
library("tidyverse")

# Simulate Data
set.seed(52242)

lambdas <- c(1, 8, 10, 12, 20)

simData <- genCorGen(
  1000,
  nvars = length(lambdas),
  params1 = lambdas, 
  dist = "poisson",
  rho = .6, 
  corstr = "cs",
  wide = FALSE)

simData <- simData %>% 
  rename(
    item = period,
    subid = id,
    response = X
  )

simData$predictor <- simData$response + rnorm(nrow(simData), sd = 20)

cor(simData)
#>                 subid      item   response   predictor
#> subid     1.000000000 0.0000000 0.01872772 0.002656721
#> item      0.000000000 1.0000000 0.85882741 0.279491061
#> response  0.018727723 0.8588274 1.00000000 0.324219762
#> predictor 0.002656721 0.2794911 0.32421976 1.000000000

# Model syntax
brms_formula <- bf(
  response ~ exp(logalpha) * eta,
  eta ~ 1 + predictor + (1 | item) + (1 | subid),
  logalpha ~ 1 + predictor + (1 | item),
  #shape ~ 1 + predictor,
  #hu ~ 1 + predictor,
  nl = TRUE
)

# Model priors
get_prior(
  brms_formula,
  data = simData,
  family = brmsfamily("hurdle_negbinomial")
)
#>                 prior class      coef group resp dpar    nlpar lb ub
#>            beta(1, 1)    hu                                     0  1
#>     gamma(0.01, 0.01) shape                                     0   
#>                (flat)     b                                eta      
#>                (flat)     b Intercept                      eta      
#>                (flat)     b predictor                      eta      
#>  student_t(3, 0, 2.5)    sd                                eta  0   
#>  student_t(3, 0, 2.5)    sd            item                eta  0   
#>  student_t(3, 0, 2.5)    sd Intercept  item                eta  0   
#>  student_t(3, 0, 2.5)    sd           subid                eta  0   
#>  student_t(3, 0, 2.5)    sd Intercept subid                eta  0   
#>                (flat)     b                           logalpha      
#>                (flat)     b Intercept                 logalpha      
#>                (flat)     b predictor                 logalpha      
#>  student_t(3, 0, 2.5)    sd                           logalpha  0   
#>  student_t(3, 0, 2.5)    sd            item           logalpha  0   
#>  student_t(3, 0, 2.5)    sd Intercept  item           logalpha  0   
#>        source
#>       default
#>       default
#>       default
#>  (vectorized)
#>  (vectorized)
#>       default
#>  (vectorized)
#>  (vectorized)
#>  (vectorized)
#>  (vectorized)
#>       default
#>  (vectorized)
#>  (vectorized)
#>       default
#>  (vectorized)
#>  (vectorized)

# Fit the model
model_fit <- brm(
  formula = brms_formula,
  data = simData,
  family = brmsfamily("hurdle_negbinomial"),
  chains = 1,
  iter = 30000, # previously 10000
  init = 0,
  seed = 52242,
  backend = "cmdstanr"
)
#> In file included from stan/lib/stan_math/stan/math/prim/prob/von_mises_lccdf.hpp:5,
#>                  from stan/lib/stan_math/stan/math/prim/prob/von_mises_ccdf_log.hpp:4,
#>                  from stan/lib/stan_math/stan/math/prim/prob.hpp:356,
#>                  from stan/lib/stan_math/stan/math/prim.hpp:16,
#>                  from stan/lib/stan_math/stan/math/rev.hpp:14,
#>                  from stan/lib/stan_math/stan/math.hpp:19,
#>                  from stan/src/stan/model/model_header.hpp:4,
#>                  from C:/Users/ITPETE~1/AppData/Local/Temp/RtmpAHPnjP/model-2204322d7042.hpp:2:
#> stan/lib/stan_math/stan/math/prim/prob/von_mises_cdf.hpp: In function 'stan::return_type_t<T_x, T_sigma, T_l> stan::math::von_mises_cdf(const T_x&, const T_mu&, const T_k&)':
#> stan/lib/stan_math/stan/math/prim/prob/von_mises_cdf.hpp:194: note: '-Wmisleading-indentation' is disabled from this point onwards, since column-tracking was disabled due to the size of the code/headers
#>   194 |       if (cdf_n < 0.0)
#>       |
#> stan/lib/stan_math/stan/math/prim/prob/von_mises_cdf.hpp:194: note: adding '-flarge-source-files' will allow for more column-tracking support, at the expense of compilation time and memory
#> Start sampling
#> Chain 1 Informational Message: The current Metropolis proposal is about to be rejected because of the following issue:
#> Chain 1 Exception: gamma_lpdf: Random variable is 0, but must be positive finite! (in 'C:/Users/ITPETE~1/AppData/Local/Temp/RtmpAHPnjP/model-2204322d7042.stan', line 135, column 2 to column 43)
#> Chain 1 If this warning occurs sporadically, such as for highly constrained variable types like covariance matrices, then the sampler is fine,
#> Chain 1 but if this warning occurs often then your model may be either severely ill-conditioned or misspecified.
#> Chain 1
#> Chain 1 Informational Message: The current Metropolis proposal is about to be rejected because of the following issue:
#> Chain 1 Exception: gamma_lpdf: Random variable is 0, but must be positive finite! (in 'C:/Users/ITPETE~1/AppData/Local/Temp/RtmpAHPnjP/model-2204322d7042.stan', line 135, column 2 to column 43)
#> Chain 1 If this warning occurs sporadically, such as for highly constrained variable types like covariance matrices, then the sampler is fine,
#> Chain 1 but if this warning occurs often then your model may be either severely ill-conditioned or misspecified.
#> Chain 1
#> [the same gamma_lpdf warning repeats several more times]
#> Chain 1 Informational Message: The current Metropolis proposal is about to be rejected because of the following issue:
#> Chain 1 Exception: Exception: neg_binomial_2_log_lpmf: Log location parameter is inf, but must be finite! (in 'C:/Users/ITPETE~1/AppData/Local/Temp/RtmpAHPnjP/model-2204322d7042.stan', line 52, column 6 to line 53, column 53) (in 'C:/Users/ITPETE~1/AppData/Local/Temp/RtmpAHPnjP/model-2204322d7042.stan', line 169, column 6 to column 70)
#> Chain 1 If this warning occurs sporadically, such as for highly constrained variable types like covariance matrices, then the sampler is fine,
#> Chain 1 but if this warning occurs often then your model may be either severely ill-conditioned or misspecified.
#> Chain 1
#> Warning: 7870 of 15000 (52.0%) transitions ended with a divergence.
#> See https://mc-stan.org/misc/warnings for details.
#> Warning: 7130 of 15000 (48.0%) transitions hit the maximum treedepth limit of 10.
#> See https://mc-stan.org/misc/warnings for details.

# Model summary
summary(model_fit)
#> Warning: Parts of the model have not converged (some Rhats are > 1.05). Be
#> careful when analysing the results! We recommend running more iterations and/or
#> setting stronger priors.
#> Warning: There were 7870 divergent transitions after warmup. Increasing
#> adapt_delta above 0.8 may help. See
#> http://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
#>  Family: hurdle_negbinomial 
#>   Links: mu = log; shape = identity; hu = identity 
#> Formula: response ~ exp(logalpha) * eta 
#>          eta ~ 1 + predictor + (1 | item) + (1 | subid)
#>          logalpha ~ 1 + predictor + (1 | item)
#>    Data: simData (Number of observations: 5000) 
#>   Draws: 1 chains, each with iter = 30000; warmup = 15000; thin = 1;
#>          total post-warmup draws = 15000
#> 
#> Group-Level Effects: 
#> ~item (Number of levels: 5) 
#>                        Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS
#> sd(eta_Intercept)          9.35     10.24     2.96    37.31 2.12        1
#> sd(logalpha_Intercept)     0.68      0.38     0.42     1.74 1.94        2
#>                        Tail_ESS
#> sd(eta_Intercept)            20
#> sd(logalpha_Intercept)       18
#> 
#> ~subid (Number of levels: 1000) 
#>                   Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
#> sd(eta_Intercept)     1.51      1.37     0.69     5.29 2.12        1       20
#> 
#> Population-Level Effects: 
#>                    Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
#> eta_Intercept         16.06     16.48     6.81    62.09 1.98        2       20
#> eta_predictor          0.00      0.01    -0.00     0.02 1.65        7       13
#> logalpha_Intercept    -1.59      0.68    -3.34    -0.95 1.96        2       24
#> logalpha_predictor     0.00      0.00     0.00     0.00 1.54       15       14
#> 
#> Family Specific Parameters: 
#>       Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
#> shape  1019.82    196.16   772.64  1545.88 1.77        4       16
#> hu        0.07      0.00     0.06     0.08 1.92        3       15
#> 
#> Draws were sampled using sample(hmc). For each parameter, Bulk_ESS
#> and Tail_ESS are effective sample size measures, and Rhat is the potential
#> scale reduction factor on split chains (at convergence, Rhat = 1).

# Change link of response distribution to "identity"
model_fit2 <- brm(
  formula = brms_formula,
  data = simData,
  family = brmsfamily("hurdle_negbinomial", link = "identity"),
  chains = 1,
  iter = 30000,
  init = 0,
  seed = 52242,
  backend = "cmdstanr"
)
#> In file included from stan/lib/stan_math/stan/math/prim/prob/von_mises_lccdf.hpp:5,
#>                  from stan/lib/stan_math/stan/math/prim/prob/von_mises_ccdf_log.hpp:4,
#>                  from stan/lib/stan_math/stan/math/prim/prob.hpp:356,
#>                  from stan/lib/stan_math/stan/math/prim.hpp:16,
#>                  from stan/lib/stan_math/stan/math/rev.hpp:14,
#>                  from stan/lib/stan_math/stan/math.hpp:19,
#>                  from stan/src/stan/model/model_header.hpp:4,
#>                  from C:/Users/ITPETE~1/AppData/Local/Temp/RtmpAHPnjP/model-220435b24831.hpp:2:
#> stan/lib/stan_math/stan/math/prim/prob/von_mises_cdf.hpp: In function 'stan::return_type_t<T_x, T_sigma, T_l> stan::math::von_mises_cdf(const T_x&, const T_mu&, const T_k&)':
#> stan/lib/stan_math/stan/math/prim/prob/von_mises_cdf.hpp:194: note: '-Wmisleading-indentation' is disabled from this point onwards, since column-tracking was disabled due to the size of the code/headers
#>   194 |       if (cdf_n < 0.0)
#>       |
#> stan/lib/stan_math/stan/math/prim/prob/von_mises_cdf.hpp:194: note: adding '-flarge-source-files' will allow for more column-tracking support, at the expense of compilation time and memory
#> Start sampling
#> Chain 1 Exception: Exception: neg_binomial_2_lpmf: Location parameter is 0, but must be positive finite! (in 'C:/Users/ITPETE~1/AppData/Local/Temp/RtmpAHPnjP/model-220435b24831.stan', line 16, column 6 to line 17, column 47) (in 'C:/Users/ITPETE~1/AppData/Local/Temp/RtmpAHPnjP/model-220435b24831.stan', line 169, column 6 to column 66)
#> Chain 1 Initialization failed.
#> Warning: No chains finished successfully. Unable to retrieve the fit.
#> Error in cmdstanr::read_cmdstan_csv(out$output_files(), variables = "", : Assertion on 'files' failed: No file provided.

# Model summary
summary(model_fit2)
#> Error in eval(expr, envir, enclos): object 'model_fit2' not found

sessionInfo()
#> R version 4.3.1 (2023-06-16 ucrt)
#> Platform: x86_64-w64-mingw32/x64 (64-bit)
#> Running under: Windows 10 x64 (build 19045)
#> 
#> Matrix products: default
#> 
#> 
#> locale:
#> [1] LC_COLLATE=English_United States.utf8 
#> [2] LC_CTYPE=English_United States.utf8   
#> [3] LC_MONETARY=English_United States.utf8
#> [4] LC_NUMERIC=C                          
#> [5] LC_TIME=English_United States.utf8    
#> 
#> time zone: America/Chicago
#> tzcode source: internal
#> 
#> attached base packages:
#> [1] stats     graphics  grDevices utils     datasets  methods   base     
#> 
#> other attached packages:
#>  [1] lubridate_1.9.3 forcats_1.0.0   stringr_1.5.0   dplyr_1.1.3    
#>  [5] purrr_1.0.2     readr_2.1.4     tidyr_1.3.0     tibble_3.2.1   
#>  [9] ggplot2_3.4.4   tidyverse_2.0.0 cmdstanr_0.6.1  brms_2.20.4    
#> [13] Rcpp_1.0.11     simstudy_0.7.0 
#> 
#> loaded via a namespace (and not attached):
#>   [1] gridExtra_2.3        inline_0.3.19        rlang_1.1.1         
#>   [4] magrittr_2.0.3       matrixStats_1.0.0    compiler_4.3.1      
#>   [7] loo_2.6.0            callr_3.7.3          vctrs_0.6.4         
#>  [10] reshape2_1.4.4       pkgconfig_2.0.3      crayon_1.5.2        
#>  [13] fastmap_1.1.1        backports_1.4.1      ellipsis_0.3.2      
#>  [16] utf8_1.2.4           threejs_0.3.3        promises_1.2.1      
#>  [19] rmarkdown_2.25       tzdb_0.4.0           markdown_1.11       
#>  [22] ps_1.7.5             xfun_0.40            reprex_2.0.2        
#>  [25] jsonlite_1.8.7       later_1.3.1          styler_1.10.2       
#>  [28] parallel_4.3.1       prettyunits_1.2.0    R6_2.5.1            
#>  [31] dygraphs_1.1.1.6     stringi_1.7.12       StanHeaders_2.26.28 
#>  [34] rstan_2.32.3         knitr_1.44           zoo_1.8-12          
#>  [37] base64enc_0.1-3      R.utils_2.12.2       mvnfast_0.2.8       
#>  [40] bayesplot_1.10.0     timechange_0.2.0     httpuv_1.6.12       
#>  [43] Matrix_1.6-1.1       R.cache_0.16.0       igraph_1.5.1        
#>  [46] tidyselect_1.2.0     rstudioapi_0.15.0    abind_1.4-5         
#>  [49] yaml_2.3.7           codetools_0.2-19     miniUI_0.1.1.1      
#>  [52] curl_5.1.0           processx_3.8.2       pkgbuild_1.4.2      
#>  [55] lattice_0.22-5       plyr_1.8.9           shiny_1.7.5.1       
#>  [58] withr_2.5.1          bridgesampling_1.1-2 posterior_1.4.1     
#>  [61] coda_0.19-4          evaluate_0.22        RcppParallel_5.1.7  
#>  [64] xts_0.13.1           pillar_1.9.0         tensorA_0.36.2      
#>  [67] checkmate_2.3.0      DT_0.30              stats4_4.3.1        
#>  [70] shinyjs_2.1.0        distributional_0.3.2 generics_0.1.3      
#>  [73] hms_1.1.3            rstantools_2.3.1.1   munsell_0.5.0       
#>  [76] scales_1.2.1         gtools_3.9.4         xtable_1.8-4        
#>  [79] glue_1.6.2           tools_4.3.1          shinystan_2.6.0     
#>  [82] data.table_1.14.8    colourpicker_1.3.0   fs_1.6.3            
#>  [85] mvtnorm_1.2-3        grid_4.3.1           QuickJSR_1.0.7      
#>  [88] crosstalk_1.2.0      colorspace_2.1-0     nlme_3.1-163        
#>  [91] cli_3.6.1            fansi_1.0.5          Brobdingnag_1.2-9   
#>  [94] V8_4.4.0             gtable_0.3.4         R.methodsS3_1.8.2   
#>  [97] digest_0.6.33        htmlwidgets_1.6.2    farver_2.1.1        
#> [100] htmltools_0.5.6.1    R.oo_1.25.0          lifecycle_1.0.3     
#> [103] mime_0.12            shinythemes_1.2.0

Created on 2023-10-27 with reprex v2.0.2

Before we address your first question, I think we should nail down the second one. To show you what I mean, let’s use a simulated data set of Poisson counts with a mean of 5.

# load packages
library(tidyverse)
library(brms)

# simulate Poisson data
set.seed(1)

dat <- data.frame(count = rpois(n = 200, lambda = 5))

# visualize
dat %>%
  ggplot(aes(count)) +
  geom_bar()

First I’ll fit an intercept-only Poisson model using the conventional log link by way of family = poisson(link = "log").

model1 <- brm(
  data = dat,
  family = poisson(link = "log"),
  count ~ 1,
  cores = 4, seed = 1
)

Second, I’ll fit the same basic model, but this time using the non-linear syntax and family = poisson(link = "identity").

model2 <- brm(
  data = dat,
  family = poisson(link = "identity"),
  bf(count ~ exp(logalpha),
     logalpha ~ 1,
     nl = TRUE),
  cores = 4, seed = 1
)

Here are the summary results for the two models.

summary(model1)
summary(model2)
Family: poisson 
  Links: mu = log 
Formula: count ~ 1 
   Data: dat (Number of observations: 200) 
  Draws: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
         total post-warmup draws = 4000

Population-Level Effects: 
          Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept     1.63      0.03     1.57     1.69 1.00     1317     2032

Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).

 Family: poisson 
  Links: mu = identity 
Formula: count ~ exp(logalpha) 
         logalpha ~ 1
   Data: dat (Number of observations: 200) 
  Draws: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
         total post-warmup draws = 4000

Population-Level Effects: 
                   Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
logalpha_Intercept     1.63      0.03     1.57     1.69 1.00     1477     2399

Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).

Notice that the posterior mean for the sole parameter in both models, 1.63, is on the log scale. If we exponentiate that value, we get 5 (with a little simulation variance).

exp(1.63)
[1] 5.103875

See? When you use the formula syntax of count ~ exp(logalpha), you’re effectively using the log link already. Thus you want to set link = "identity" for models of that kind.
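To see the double exponentiation numerically, here’s a quick sketch in plain R (no model fitting needed; 1.63 is the posterior mean from the summaries above):

```r
logalpha <- 1.63

# identity link: the formula's exp() is the only transformation
exp(logalpha)       # ~5.1, a sensible Poisson mean

# log link stacked on top of the same formula exponentiates twice
exp(exp(logalpha))  # ~165, nowhere near the data's mean of 5
```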

Thank you for taking the time to put together that very helpful response and example; it definitely convinced me that the model was applying the link twice. Do you have thoughts about how to get the model to run when using link = "identity"? It’s interesting that, for whatever reason, the model converges with link = "log" but doesn’t even initialize the chain with link = "identity". I also tried simplifying the model by removing predictor and the random effects for subid and item, but I still received the same error. Do you have a sense of whether it’s an issue with the priors, the starting values, or how the model is formulated (or something else)? Any guidance on how to proceed would be greatly appreciated. Thanks for your help working through this!

Here’s the model syntax:

brms_formula <- bf(
  response ~ exp(logalpha) * eta,
  eta ~ 1 + predictor + (1 | item) + (1 | subid),
  logalpha ~ 1 + predictor + (1 | item),
  #shape ~ 1 + predictor,
  #hu ~ 1 + predictor,
  nl = TRUE
)

model_fit <- brm(
  formula = brms_formula,
  data = simData,
  family = brmsfamily("hurdle_negbinomial", link = "identity"),
  chains = 1,
  iter = 30000,
  init = 0,
  seed = 52242,
  backend = "cmdstanr"
)

Here are the priors:

                prior class      coef group resp dpar    nlpar lb ub       source
           beta(1, 1)    hu                                     0  1      default
    gamma(0.01, 0.01) shape                                     0         default
               (flat)     b                                eta            default
               (flat)     b Intercept                      eta       (vectorized)
               (flat)     b predictor                      eta       (vectorized)
 student_t(3, 0, 7.4)    sd                                eta  0         default
 student_t(3, 0, 7.4)    sd            item                eta  0    (vectorized)
 student_t(3, 0, 7.4)    sd Intercept  item                eta  0    (vectorized)
 student_t(3, 0, 7.4)    sd           subid                eta  0    (vectorized)
 student_t(3, 0, 7.4)    sd Intercept subid                eta  0    (vectorized)
               (flat)     b                           logalpha            default
               (flat)     b Intercept                 logalpha       (vectorized)
               (flat)     b predictor                 logalpha       (vectorized)
 student_t(3, 0, 7.4)    sd                           logalpha  0         default
 student_t(3, 0, 7.4)    sd            item           logalpha  0    (vectorized)
 student_t(3, 0, 7.4)    sd Intercept  item           logalpha  0    (vectorized)

Here is the error:

Chain 1 Exception: Exception: neg_binomial_2_lpmf: Location parameter is 0, but must be positive finite! (in 'C:/Users/ITPETE~1/AppData/Local/Temp/Rtmp0W7zlL/model-678828961.stan', line 16, column 6 to line 17, column 47) (in 'C:/Users/ITPETE~1/AppData/Local/Temp/Rtmp0W7zlL/model-678828961.stan', line 169, column 6 to column 66)
Chain 1 Initialization failed.
Error in cmdstanr::read_cmdstan_csv(out$output_files(), variables = "",  : 
  Assertion on 'files' failed: No file provided.
In addition: Warning message:
No chains finished successfully. Unable to retrieve the fit.

When I remove init = 0, I get the following error:

Chain 1 Exception: Exception: neg_binomial_2_lpmf: Location parameter is -17.9456, but must be positive finite! (in 'C:/Users/ITPETE~1/AppData/Local/Temp/Rtmp0W7zlL/model-67885c2a6e46.stan', line 16, column 6 to line 17, column 47) (in 'C:/Users/ITPETE~1/AppData/Local/Temp/Rtmp0W7zlL/model-67885c2a6e46.stan', line 169, column 6 to column 66)
...
Chain 1 Initialization between (-2, 2) failed after 100 attempts. 
Chain 1  Try specifying initial values, reducing ranges of constrained values, or reparameterizing the model.
Chain 1 Initialization failed.

Here is the reproducible example:

library("simstudy")
library("brms")
#> Loading required package: Rcpp
#> Loading 'brms' package (version 2.20.4). Useful instructions
#> can be found by typing help('brms'). A more detailed introduction
#> to the package is available through vignette('brms_overview').
#> 
#> Attaching package: 'brms'
#> The following object is masked from 'package:stats':
#> 
#>     ar
library("cmdstanr")
#> This is cmdstanr version 0.6.1
#> - CmdStanR documentation and vignettes: mc-stan.org/cmdstanr
#> - CmdStan path: D:/Documents/.cmdstan/cmdstan-2.33.1
#> - CmdStan version: 2.33.1
library("tidyverse")

# Simulate Data
set.seed(52242)

lambdas <- c(1, 8, 10, 12, 20)

simData <- genCorGen(
  1000,
  nvars = length(lambdas),
  params1 = lambdas, 
  dist = "poisson",
  rho = .6, 
  corstr = "cs",
  wide = FALSE)

simData <- simData %>% 
  rename(
    item = period,
    subid = id,
    response = X
  )

simData$predictor <- simData$response + rnorm(nrow(simData), sd = 20)

cor(simData)
#>                 subid      item   response   predictor
#> subid     1.000000000 0.0000000 0.01872772 0.002656721
#> item      0.000000000 1.0000000 0.85882741 0.279491061
#> response  0.018727723 0.8588274 1.00000000 0.324219762
#> predictor 0.002656721 0.2794911 0.32421976 1.000000000

# Model syntax
brms_formula <- bf(
  response ~ exp(logalpha) * eta,
  eta ~ 1 + predictor + (1 | item) + (1 | subid),
  logalpha ~ 1 + predictor + (1 | item),
  #shape ~ 1 + predictor,
  #hu ~ 1 + predictor,
  nl = TRUE
)

# Model priors
get_prior(
  brms_formula,
  data = simData,
  family = brmsfamily("hurdle_negbinomial", link = "identity")
)
#>                 prior class      coef group resp dpar    nlpar lb ub
#>            beta(1, 1)    hu                                     0  1
#>     gamma(0.01, 0.01) shape                                     0   
#>                (flat)     b                                eta      
#>                (flat)     b Intercept                      eta      
#>                (flat)     b predictor                      eta      
#>  student_t(3, 0, 7.4)    sd                                eta  0   
#>  student_t(3, 0, 7.4)    sd            item                eta  0   
#>  student_t(3, 0, 7.4)    sd Intercept  item                eta  0   
#>  student_t(3, 0, 7.4)    sd           subid                eta  0   
#>  student_t(3, 0, 7.4)    sd Intercept subid                eta  0   
#>                (flat)     b                           logalpha      
#>                (flat)     b Intercept                 logalpha      
#>                (flat)     b predictor                 logalpha      
#>  student_t(3, 0, 7.4)    sd                           logalpha  0   
#>  student_t(3, 0, 7.4)    sd            item           logalpha  0   
#>  student_t(3, 0, 7.4)    sd Intercept  item           logalpha  0   
#>        source
#>       default
#>       default
#>       default
#>  (vectorized)
#>  (vectorized)
#>       default
#>  (vectorized)
#>  (vectorized)
#>  (vectorized)
#>  (vectorized)
#>       default
#>  (vectorized)
#>  (vectorized)
#>       default
#>  (vectorized)
#>  (vectorized)

# Fit the model
model_fit <- brm(
  formula = brms_formula,
  data = simData,
  family = brmsfamily("hurdle_negbinomial", link = "identity"),
  chains = 1,
  iter = 30000,
  init = 0,
  seed = 52242,
  backend = "cmdstanr"
)
#> Start sampling
#> Chain 1 Exception: Exception: neg_binomial_2_lpmf: Location parameter is 0, but must be positive finite! (in 'C:/Users/ITPETE~1/AppData/Local/Temp/RtmpiSUlUn/model-77d016284462.stan', line 16, column 6 to line 17, column 47) (in 'C:/Users/ITPETE~1/AppData/Local/Temp/RtmpiSUlUn/model-77d016284462.stan', line 169, column 6 to column 66)
#> Chain 1 Initialization failed.
#> Warning: No chains finished successfully. Unable to retrieve the fit.
#> Error in cmdstanr::read_cmdstan_csv(out$output_files(), variables = "", : Assertion on 'files' failed: No file provided.

# Model summary
summary(model_fit)
#> Error in eval(expr, envir, enclos): object 'model_fit' not found


I think I have some concrete suggestions, but before that, we might back up even further. Putting the simulated data aside, tell me more about your actual data. What kinds of items are you using that they’re giving you count data?

Good question. The data are reported counts of how often a person engaged in each of various behaviors during the past year (e.g., how often the person steals, gets into fights, etc). The data are zero-inflated and heavily skewed. I’d be happy to provide some example histograms or other summary information, if it’d be helpful.

Yeah, a couple faceted histograms might be helpful.

Here are a few examples of histograms for some of the items. There are more items than just these, but this should give you an idea of the kinds of distributions in the data. You’ll notice a few things:

  • zero is the most common response (zero-inflation)
  • the count data are heavily positively skewed
  • the variance is greater than the mean, suggesting a negative binomial would fit better than a Poisson, unless it’s an overdispersed Poisson
  • the mean (i.e., average count) differs considerably from item to item (i.e., the x-axes in the plots below have widely differing limits from plot to plot), suggesting that it will be important to include a random effect of item
  • there are some extreme values; we’re not sure whether or not we should truncate these, in order to get the model to fit
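For reference, here’s a small dplyr sketch of the per-item variance-to-mean check (using the response and item column names from the reprex; the real data’s names may differ):

```r
library(dplyr)

dat %>%
  group_by(item) %>%
  summarise(
    mean  = mean(response),
    var   = var(response),
    ratio = var / mean  # > 1 suggests overdispersion relative to the Poisson
  )
```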

[faceted histograms of several example items]
Preamble

Okay, I’m going to recommend a couple things and give you a concrete example of my recommendation.

First, I recommend you go with a zero-inflated negative-binomial (ZINB) likelihood, rather than a hurdle likelihood. I’m a psychologist, and from a behavioral perspective, I don’t think a zero count is fundamentally different from a non-zero count. E.g., I can be a smoker or a drinker, but I just didn’t happen to smoke or drink on a given day or week.

Second, I don’t think you need to use the non-linear syntax for your use case at all. You can get what you want with a good old distributional model.

What went wrong before?

Before we get moving with the solution, I believe the reason you were having problems with the last model is that you were modeling Poisson data as if they were hurdle-NB data. This is difficult because the dispersion parameter (called shape or \phi) wants to go to \infty, and the default priors aren’t reining it in well. There’s a similar problem with the zero-inflation parameter (called zi, or hu in your hurdle model), which wants to go to zero.
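To see why \phi is poorly identified here, note that the negative binomial converges to the Poisson as \phi grows, so truly-Poisson counts provide no upper bound on shape:

```r
# the NB pmf approaches the Poisson pmf as the dispersion grows
dpois(3, lambda = 5)            # ~0.1404
dnbinom(3, mu = 5, size = 1e6)  # nearly identical
```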

So to avoid these difficulties, let’s simulate proper zero-inflated negative-binomial data.

Simulate zero-inflated negative-binomial data

Base R doesn’t have a function to simulate ZINB data. But we can simulate plain NB counts with the MASS package’s rnegbin() function, which is parameterized directly by a mean and dispersion, and then use a little hack to add in a zero-inflation component.

Say we want NB data with a mean of 5 and a dispersion of 1. Let’s further say these data are a mixture of those NB counts and an extra column of 0’s, where 75% are normal NB counts, and the other 25% are 0’s. Here’s how to do that.

# load all packages we'll use
library(tidyverse)
library(brms)
library(tidybayes)

# define the population parameters
mu <- 5
phi <- 1
zi <- 0.25

# how many do you want?
n_count <- 1000

# simulate NB counts
set.seed(1)

data.frame(count = MASS::rnegbin(n = n_count, 
                                 mu = mu,
                                 theta = phi)) %>% 
    # here's the hack where we add in extra 0's
    mutate(count = ifelse(rbinom(n = n(), 
                                 size = 1, 
                                 prob = zi) == 1, 
                          0, count)) %>% 
    
    # plot the results
    ggplot(aes(x = count)) +
    geom_histogram(binwidth = 2)
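As a sanity check on that mixture, the implied zero probability can be computed directly (dnbinom()’s size argument is the dispersion \phi):

```r
mu  <- 5
phi <- 1
zi  <- 0.25

# P(0) = zi + (1 - zi) * NB(0 | mu, phi), where NB(0 | mu, phi) = (phi / (phi + mu))^phi
zi + (1 - zi) * dnbinom(0, size = phi, mu = mu)
#> [1] 0.375
```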

Now we’re ready to expand from this little example to define our model, simulate data based on that model, and fit a Bayesian model to those data with brm().

Define a zero-inflated negative-binomial model

As a place to start, I propose we simulate data based on the model:

\begin{align}
\text{count}_{pi} & \sim \operatorname{ZINB}(\mu_{pi}, \phi_i, z_i) \\
\log (\mu_{pi}) & = \beta_0 + u_{0p} + v_{0i} \\
\log(\phi_i) & = \gamma_0 + v_{1i} \\
\operatorname{logit}(z_i) & = \delta_0 + v_{2i} \\
\begin{bmatrix} u_{0p} \\ v_{0i} \\ v_{1i} \\ v_{2i} \end{bmatrix} & \sim \operatorname{Normal} \left ( \mathbf 0, \begin{bmatrix} \sigma_{u_0}^2 \\ 0 & \sigma_{v_0}^2 \\ 0 & 0 & \sigma_{v_1}^2 \\ 0 & 0 & 0 & \sigma_{v_2}^2 \end{bmatrix} \right ),
\end{align}

where your counts vary by p persons and i items. Here the \beta_0, \gamma_0, and \delta_0 parameters are the population-level parameters. The u_{0p} term captures the person-specific deviations from the grand mean \beta_0. The v_{0i}, v_{1i}, and v_{2i} terms capture the item-specific deviations from the \beta_0, \gamma_0, and \delta_0 parameters, respectively. The bottom line of the equation shows how all four deviation terms are modeled as multivariate normal with means of exactly zero and a diagonal variance/covariance matrix.

Also notice we’re using link functions; these are the same links brm() will use by default when we eventually fit a Bayesian version of this model.
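One note on that last line of the model: because the variance/covariance matrix is diagonal, drawing the four deviation terms jointly from the multivariate normal is equivalent to taking independent univariate draws, which is what the simulation code below does with separate rnorm() calls. A small sketch of the joint version, using MASS::mvrnorm() just to illustrate the equivalence:

```r
library(MASS)

# item-level deviations with a diagonal covariance matrix
Sigma <- diag(c(1, 0.5, 0.5)^2)  # sigma_v0^2, sigma_v1^2, sigma_v2^2
v <- mvrnorm(n = 10, mu = rep(0, 3), Sigma = Sigma)

# the columns of v have the same distributions as
# rnorm(10, 0, 1), rnorm(10, 0, 0.5), and rnorm(10, 0, 0.5)
```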

Simulate data based on our zero-inflated negative-binomial model

Since we’re simulating data according to that model, I’m going to pick population values I suspect will be in the ballpark of what you’ll see in your real-world data. Here are those values.

# define population parameters
mu <- 5
phi <- 1
zi <- 0.25

b0 <- log(mu)
g0 <- log(phi)
d0 <- qlogis(zi)

sigma_u0 <- 1

sigma_v0 <- 1
sigma_v1 <- 0.5
sigma_v2 <- 0.5

Play around with different data-generating values on your own.

Now that we have our population values, we’ll need to decide how many synthetic persons and items we’d like. Here’s what I used; use different numbers as desired.

# how many persons and items?
n_person <- 100
n_item   <- 10

Before we simulate our counts, we’ll first want a data frame with synthetic persons, items, and columns with all the data-generating parameter values. As a first step, let’s save a dat.population data frame that contains our population parameters.

# save population parameters
dat.population <- data.frame(
  b0 = b0,
  g0 = g0,
  d0 = d0
)

Now we want a person-level data frame, which we’ll call dat.person.

# simulate and save persons and their parameters
set.seed(1)

dat.person <- data.frame(
    person = 1:n_person %>% factor(),
    u0 = rnorm(n = n_person, mean = 0, sd = sigma_u0)
)

Next, an item-level data frame we’ll call dat.item.

# simulate and save items and their parameters
set.seed(1)

dat.item <- data.frame(
    item = 1:n_item %>% factor(),
    v0 = rnorm(n = n_item, mean = 0, sd = sigma_v0),
    v1 = rnorm(n = n_item, mean = 0, sd = sigma_v1),
    v2 = rnorm(n = n_item, mean = 0, sd = sigma_v2)
)

Combine all the mini data frames into a whole called dat.

# combine all the previous data sets into a single data frame
dat <- crossing(
    person = 1:n_person %>% factor(),
    item = 1:n_item %>% factor()
) %>% 
    bind_cols(dat.population) %>% 
    left_join(dat.person, by = "person") %>% 
    left_join(dat.item, by = "item")

We’re finally ready to simulate ZINB counts.

# use the population, person, and item parameters to simulate counts
set.seed(1)

dat <- dat %>% 
    mutate(count = MASS::rnegbin(n = n(), 
                                 # NB mean model
                                 mu = exp(b0 + u0 + v0), 
                                 # NB dispersion model
                                 theta = exp(g0 + v1))) %>% 
    mutate(count = ifelse(rbinom(n = n(), 
                                 size = 1, 
                                 # mixture zero-inflation model
                                 prob = plogis(d0 + v2)) == 1, 
                          0, count))
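
If you want to confirm the mixture step behaves as intended, here's a standalone sanity check (independent of dat) that the simulated marginal zero rate matches the theoretical value implied by mu, phi, and zi:

```r
# standalone check: with mu = 5, theta (phi) = 1, and zi = 0.25,
# the marginal probability of a zero is zi + (1 - zi) * P(NB = 0)
set.seed(1)
n  <- 1e5
mu <- 5; theta <- 1; zi <- 0.25

y <- MASS::rnegbin(n, mu = mu, theta = theta)
y <- ifelse(rbinom(n, size = 1, prob = zi) == 1, 0, y)

# theoretical marginal zero probability
p0 <- zi + (1 - zi) * dnbinom(0, size = theta, mu = mu)
p0
#> [1] 0.375
mean(y == 0)  # should be close to p0
```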

Take a look at those hot ZINB count data with histograms, faceted by item.

dat %>% 
    ggplot(aes(x = count)) +
    geom_histogram(binwidth = 10) +
    facet_wrap(~ item)

Looks like a dream, doesn’t it?

Fit a zero-inflated negative-binomial model with brm()

Here’s how to model those data with brm(). I’ve added weakly-regularizing priors. Adjust your priors as desired. On my computer (2023 MacBook Pro) this took less than a minute to run.

my_fit <- brm(
    data = dat,
    family = zero_inflated_negbinomial(),
    bf(count ~ 1 + (1 | person) + (1 | item),
       shape ~ 1 + (1 | item),
       zi ~ 1 + (1 | item)),
    # priors for population parameters
    prior = prior(normal(log(5), 1), class = Intercept) +
        prior(normal(log(1), 1), class = Intercept, dpar = shape) +
        prior(normal(-1.098612, 1), class = Intercept, dpar = zi) +  # qlogis(0.25) = -1.098612
        # priors for level-2 SD parameters
        prior(exponential(1), class = sd) +
        prior(exponential(1), class = sd, dpar = shape) +
        prior(exponential(1), class = sd, dpar = zi),
    cores = 4, seed = 1
)

Check the model summary.

print(my_fit)
 Family: zero_inflated_negbinomial 
  Links: mu = log; shape = log; zi = logit 
Formula: count ~ 1 + (1 | person) + (1 | item) 
         shape ~ 1 + (1 | item)
         zi ~ 1 + (1 | item)
   Data: dat (Number of observations: 1000) 
  Draws: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
         total post-warmup draws = 4000

Group-Level Effects: 
~item (Number of levels: 10) 
                    Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(Intercept)           0.81      0.22     0.50     1.38 1.00     1116     2035
sd(shape_Intercept)     0.53      0.23     0.17     1.07 1.01      946     1511
sd(zi_Intercept)        0.47      0.22     0.12     1.00 1.01     1022      919

~person (Number of levels: 100) 
              Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(Intercept)     0.86      0.08     0.71     1.04 1.00      827     1934

Population-Level Effects: 
                Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept           1.86      0.27     1.32     2.40 1.00      618     1177
shape_Intercept     0.13      0.22    -0.27     0.60 1.00     1238     1948
zi_Intercept       -0.99      0.20    -1.43    -0.61 1.00     1981     2166

Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).

Looks good so far. Here’s a pp-check.

set.seed(1)
pp_check(my_fit, type = "hist", ndraws = 8, binwidth = 25)

Not bad!

And here’s one way you might display the item-level parameters.

parameters_levels <- c("mu", "phi", "italic(z)")

dat %>% 
    distinct(item) %>% 
    # by default, add_epred_draws() returns results back on the original count metric
    add_epred_draws(my_fit, 
                    # here we indicate we only want item-level results
                    re_formula = ~ (1 | item),
                    # here we request information about the non-mean parameters, too
                    dpar = c("shape", "zi")) %>% 
    rename(mu = .epred,
           phi = shape,
           `italic(z)` = zi) %>% 
    pivot_longer(mu:`italic(z)`, names_to = "parameters") %>% 
    mutate(parameters = factor(parameters, parameters_levels)) %>% 
    group_by(item, parameters) %>% 
    mean_qi(value) %>% 
    
    ggplot(aes(x = value, xmin = .lower, xmax = .upper, y = item)) +
    geom_pointrange() +
    xlim(0, NA) +
    facet_wrap(~ parameters, labeller = label_parsed, scales = "free_x")

Next steps

You can generalize this model in various ways, like adding person-specific deviation parameters for the \phi and/or z models, or allowing the population parameters to differ by some experimental condition. To learn more about the zero-inflated negative-binomial parameters, see Section #4 here. The only thing to keep in mind with the equations at that site is that their \alpha parameter is the inverse of our brms \phi parameter, meaning \alpha = 1 / \phi.
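
To make the \alpha = 1 / \phi relation concrete: in the brms parameterization, the NB variance is \mu + \mu^2 / \phi, which is \mu + \alpha \mu^2 in the other parameterization. Here's a quick simulation check with the population values from above:

```r
set.seed(1)
mu    <- 5
phi   <- 1        # brms shape parameter
alpha <- 1 / phi  # the alternative dispersion parameterization

# base R's rnbinom() uses size for what brms calls shape
y <- rnbinom(1e6, size = phi, mu = mu)

# implied variance: mu + mu^2 / phi = mu + alpha * mu^2
mu + mu^2 / phi
#> [1] 30
var(y)  # should be close to 30
```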


Thanks very much, Solomon, for the detailed simulated example and explanations—those are very helpful. In addition to the three parameters from the zero-inflated negative binomial model (mu, phi, and zi), would it be possible to estimate traditional IRT parameters similar to a 2PL model: 1) easiness (i.e., \text{difficulty} \times -1) and 2) discrimination? The easiness and discrimination parameters are helpful for characterizing the functioning of items from IRT models, for evaluating item/test information and reliability, and for testing whether the items show differential item functioning (DIF) in terms of discrimination as a function of age/sex, etc.

According to p. 27 of Bürkner’s white paper, the two-parameter (i.e., easiness and discrimination) model for a logistic model is as follows:

P(y = 1) = \mu = \text{logistic}(\alpha_i(\theta_p + \xi_i))

He noted, “There are multiple ways to force \alpha_i to be positive, one of which is to model it on the log-scale, that is to estimate \log \alpha_i and then exponentiating the result to obtain the actual discrimination via \alpha_i = \exp(\log \alpha_i)…The parameter eta represents the sum of person parameter and item easiness, whereas logalpha represents the log discrimination.”

That is what I was trying to estimate with the following lines of the model formula (based on his code example on p. 27):

response ~ exp(logalpha) * eta,
eta ~ 1 + (1 | item) + (1 | subid),
logalpha ~ 1 + (1 | item),
nl = TRUE
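
As a quick numeric illustration of the nonlinear predictor those lines encode (the values here are arbitrary, just to show the mechanics):

```r
# arbitrary illustrative values for one person-item pair
eta      <- 0.5  # person parameter + item easiness
logalpha <- 0.3  # log discrimination

# discrimination is positive by construction
alpha <- exp(logalpha)

# in the logistic 2PL, P(y = 1) is the inverse-logit of the predictor
p <- plogis(alpha * eta)

stopifnot(alpha > 0, p > 0.5)  # eta > 0 implies p > 0.5
```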

However, I’m not sure how well this generalizes to the negative binomial model. I tried to adapt your simulated example using the following code, but the model did not converge, suggesting that—if this is the correct formulation—I might need different priors:

bf(
  count ~ exp(logalpha) * eta,
  eta ~ 1 + (1 | item) + (1 | person),
  logalpha ~ 1 + (1 | item),
  shape ~ 1 + (1 | item),
  zi ~ 1 + (1 | item),
  nl = TRUE
)

Is it possible to estimate item easiness and discrimination in this framework? Thanks again for all your help with this!

No, I don’t think so. The thing is, the 2PL you’re presenting is in terms of the binomial likelihood, where you’re using the logit link and so on. That’s just not going to translate directly to other likelihoods. With unbounded counts, you have mean-centric parameters, which correspond roughly to the easiness parameters, and you have dispersion-centric parameters, which correspond roughly to the discrimination parameters. And of course with hurdle and/or zero-inflation likelihoods, you have parameters for those extra zeros, which are entirely different from the simple 2PL framework. But regardless of the likelihood, you do still have person parameters and item parameters.

That makes sense that we can’t use the same binomial parametrization of easiness and discrimination when estimating negative binomial models. And I can see how the mean-centric parameter (mu) is similar to easiness. However, I don’t believe that the dispersion-centric parameter (shape) is similar to discrimination. An item’s discrimination represents the strength of the association between the item scores and a person’s latent ability estimate (theta; \theta), not the variability of responses. If possible, it would be nice to be able to estimate how strongly each item is correlated with the latent trait. This can help inform item selection, for instance, whether an item needs to be dropped at a given age because it isn’t correlated with the latent trait.

It looks like there are recent extensions of count models that allow estimation of a discrimination parameter. For instance, Beisemann (2022) adapted the Conway-Maxwell-Poisson model to allow estimation of a discrimination parameter (i.e., Two-Parameter Conway-Maxwell-Poisson model; 2PCMPM). Myszkowski & Storme (2021) adapted the Poisson Counts model to allow estimation of a discrimination parameter (2PPCM). The relevant R package for estimating these models is here: https://github.com/mbsmn/countirt.

Beisemann (2022, pp. 413–414) described the estimation of item difficulty and discrimination in the two-parameter Conway-Maxwell-Poisson model:

one assumes that the expected number of counts \mu_{ij} given by person i in response to an item j depends on the item parameters \alpha_j and d_j and the person’s latent ability \theta_i (all on the log scale) as follows…

\mu_{ij} = \text{exp}(\alpha_j(\theta_i - d_j))

Magnus and Garnier-Villarreal (2022, pp. 263–264) describe estimation of the discrimination and location parameters in a Poisson model. In the zero-inflated Poisson, a zero can arise from either the zero state or the Poisson process, whereas a non-zero response can arise only from the Poisson process:

\begin{align} P(Y_{ij} = 0) &= p_{ij} + (1 - p_{ij})\text{exp}(-\lambda_{ij}) \\ P(Y_{ij} = y_{ij}) &= \frac{(1 - p_{ij})\text{exp}(-\lambda_{ij})\lambda_{ij}^{y_{ij}}}{y_{ij}!}, \quad y_{ij} = 1, 2, \ldots \end{align}

Then they note: “Wang (2010) uses a logit link function and Bernoulli conditional response distribution to model person i ’s probability of belonging to the zero state for item j (p_{ij}) and a log link function and Poisson conditional response distribution to model the expected count (\lambda_{ij}) given that someone does not belong to the zero state. Each model component has a set of item discrimination and location parameters, as well as a single common latent variable \theta_i that represents substance dependency:

\begin{align} p_{ij} &= \frac{\text{exp}[a_{0j}(\theta_i - b_{0j})]}{1 + \text{exp}[a_{0j}(\theta_i - b_{0j})]} \\ \lambda_{ij} &= \text{exp}[a_{1j}(\theta_i - b_{1j})] \end{align}

In this case, the a_j is the item discrimination and the b_j is the item difficulty. They provide their Stan code here: OSF.
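
Those two model components are easy to evaluate directly in base R. Here's a sketch with hypothetical parameter values (not estimates from any fitted model) showing how the zero-state and count-state pieces combine:

```r
# hypothetical parameter values for one person-item pair
theta <- 0.5            # latent trait
a0 <- 1.2; b0 <- 0.0    # zero-state discrimination and location
a1 <- 0.8; b1 <- -0.3   # count-state discrimination and location

# probability of the zero state (logit link)
p_ij <- plogis(a0 * (theta - b0))

# expected count given the Poisson state (log link)
lambda_ij <- exp(a1 * (theta - b1))

# marginal probability of observing a zero under the ZIP mixture
p_zero <- p_ij + (1 - p_ij) * dpois(0, lambda_ij)

stopifnot(p_zero > p_ij)  # the Poisson state contributes extra zeros
```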

Thus, it seems possible to estimate item discrimination for count models in Stan. Is it possible to extend this to estimate item discrimination in brms? It’d be nice to be able to stay in the brms framework, if possible.

I can see why people use the language of “latent traits” with binary and ordinal data, given the link functions used, and I totally get the latent variable theories associated with conventional IRT analyses. But I just don’t think that language translates well to models fit with other likelihoods when you’re using a multilevel framework like Paul presented in his paper. It just seems like a bad fit. You might need others to weigh in, here.

Thanks, Solomon, and sounds good—I’ll wait for others to chime in.

Just so we’re referring to the same thing…in your example, we could extract the estimates of the latent factor scores (i.e., people’s levels on the latent factor) using posterior_linpred() based on the following code (adapted from Guido’s post here):

library(data.table)

my_fit_posterior <- posterior_linpred(my_fit)

tmp = data.table(t(my_fit_posterior))
setnames(tmp,names(tmp), paste0("iter.",1:ncol(tmp)))
complete.dat = data.table(dat)[complete.cases(dat)]
plp_wide = cbind(complete.dat, tmp)

plp_long = 
  melt(plp_wide,
       id.vars = names(dat),
       variable.name = "iter")

plp_long.ID.iter = 
  plp_long[,list(latent = mean(value)),
           by = c("person","iter")]

plp.long.ID = 
  plp_long.ID.iter[,list(m = mean(latent),
                         CI05 = quantile(latent,.05),
                         CI95 = quantile(latent,.95)),
                   by = c("person")]

Items differ in their “validity” (or association) with respect to the latent factor scores (i.e., item discrimination).