Reduce_sum with occupancy model

I am working through the tutorial, Reduce Sum: A Minimal Example (mc-stan.org). I am able to get the tutorial to work using using Docker with cmdstanr_0.5.1.9000 and Stan 2.29.

I am in the process of converting my own hierarchical occupancy model to run reduce_sum. Conceptually, I loop over units (e.g., sites or visits) and I am trying to use reduce_sum over each unit.

When I use this function without reduce_sum(), the model works for units 1 to n_obs_psi. (This is the index where I am trying to use reduce sums).

 target += unit_prob_loop(
   any_samples, 1, n_obs_psi, y, k_samples, any_seen,
   log_psi, log1m_psi, logit_p, start_idx, end_idx);

However, when I try to use this function with reduce_sum():

  target += 
     reduce_sum(unit_prob_loop, any_samples, grainsize,
       y, k_samples, any_seen,
       log_psi, log1m_psi, logit_p, start_idx, end_idx);

Following the example, I though I just needed to place the start and end variables as the 2nd and 3rd inputs and then drop them. The model compiles, but I get an error when I try to use the model:

Running MCMC with 2 parallel chains, with 4 thread(s) per chain...

Chain 1 Unrecoverable error evaluating the log probability at the initial value. 
Chain 1 Exception: Exception: array[uni, ...] index: accessing element out of range. index 201 out of range; expecting index to be between 1 and 1 (in '/home/rstudio/Documents/.//occupancy_function.stan', line 74, column 4, included from 
Chain 1 Exception: Exception: array[uni, ...] index: accessing element out of range. index 201 out of range; expecting index to be between 1 and 1 (in '/home/rstudio/Documents/.//occupancy_function.stan', line 74, column 4, included from
Chain 1 '/tmp/RtmpRcZsmT/model-4463abae0c1.stan', line 3, column 0) (in '/tmp/RtmpRcZsmT/model-4463abae0c1.stan', line 155, column 2 to line 158, column 56) 
Chain 1 Exception: Exception: array[uni, ...] index: accessing element out of range. index 201 out of range; expecting index to be between 1 and 1 (in '/home/rstudio/Documents/.//occupancy_function.stan', line 74, column 4, included from
Chain 1 '/tmp/RtmpRcZsmT/model-4463abae0c1.stan', line 3, column 0) (in '/tmp/RtmpRcZsmT/model-4463abae0c1.stan', line 155, column 2 to line 158, column 56)
Chain 2 Unrecoverable error evaluating the log probability at the initial value. 
Chain 2 Exception: Exception: array[uni, ...] index: accessing element out of range. index 101 out of range; expecting index to be between 1 and 1 (in '/home/rstudio/Documents/.//occupancy_function.stan', line 74, column 4, included from 
Chain 2 Exception: Exception: array[uni, ...] index: accessing element out of range. index 101 out of range; expecting index to be between 1 and 1 (in '/home/rstudio/Documents/.//occupancy_function.stan', line 74, column 4, included from
Chain 2 '/tmp/RtmpRcZsmT/model-4463abae0c1.stan', line 3, column 0) (in '/tmp/RtmpRcZsmT/model-4463abae0c1.stan', line 155, column 2 to line 158, column 56) 
Chain 2 Exception: Exception: array[uni, ...] index: accessing element out of range. index 101 out of range; expecting index to be between 1 and 1 (in '/home/rstudio/Documents/.//occupancy_function.stan', line 74, column 4, included from
Chain 2 '/tmp/RtmpRcZsmT/model-4463abae0c1.stan', line 3, column 0) (in '/tmp/RtmpRcZsmT/model-4463abae0c1.stan', line 155, column 2 to line 158, column 56)
Warning: Chain 1 finished unexpectedly!

Warning: Chain 2 finished unexpectedly!

Warning: Use read_cmdstan_csv() to read the results of the failed chains.
Warning messages:
1: All chains finished unexpectedly! Use the $output(chain_id) method for more information.
 
2: No chains finished successfully. Unable to retrieve the fit. 
> 

I am already using the Onion method for the model and this example motivated me to encapsulate code into function. I have attached code for a reproducible example:

Session info:

> sessionInfo()
R version 4.1.2 (2021-11-01)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 20.04.3 LTS

Matrix products: default
BLAS/LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.8.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=C             
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods  
[7] base     

other attached packages:
 [1] ggpubr_0.4.0        ggthemes_4.2.4      GGally_2.1.2       
 [4] forcats_0.5.1       stringr_1.4.0       dplyr_1.0.8        
 [7] purrr_0.3.4         readr_2.1.2         tidyr_1.2.0        
[10] tibble_3.1.6        ggplot2_3.3.5       tidyverse_1.3.1    
[13] cmdstanr_0.5.1.9000

loaded via a namespace (and not attached):
 [1] nlme_3.1-157            matrixStats_0.61.0     
 [3] fs_1.5.2                lubridate_1.8.0        
 [5] RColorBrewer_1.1-3      httr_1.4.2             
 [7] rstan_2.29.2.9000       tensorA_0.36.2         
 [9] tools_4.1.2             backports_1.4.1        
[11] utf8_1.2.2              R6_2.5.1               
[13] DBI_1.1.2               colorspace_2.0-3       
[15] withr_2.5.0             tidyselect_1.1.2       
[17] gridExtra_2.3           prettyunits_1.1.1      
[19] processx_3.5.3          curl_4.3.2             
[21] compiler_4.1.2          cli_3.2.0              
[23] rvest_1.0.2             xml2_1.3.3             
[25] posterior_1.2.1         scales_1.1.1           
[27] checkmate_2.0.0         callr_3.7.0            
[29] StanHeaders_2.29.2.9000 minqa_1.2.4            
[31] pkgconfig_2.0.3         lme4_1.1-29            
[33] dbplyr_2.1.1            rlang_1.0.2            
[35] readxl_1.4.0            rstudioapi_0.13        
[37] farver_2.1.0            generics_0.1.2         
[39] jsonlite_1.8.0          car_3.0-12             
[41] distributional_0.3.0    inline_0.3.19          
[43] magrittr_2.0.3          loo_2.5.1              
[45] Matrix_1.4-1            Rcpp_1.0.8.3           
[47] munsell_0.5.0           fansi_1.0.3            
[49] abind_1.4-5             lifecycle_1.0.1        
[51] stringi_1.7.6           carData_3.0-5          
[53] MASS_7.3-56             pkgbuild_1.3.1         
[55] plyr_1.8.7              grid_4.1.2             
[57] parallel_4.1.2          crayon_1.5.1           
[59] lattice_0.20-45         haven_2.4.3            
[61] splines_4.1.2           hms_1.1.1              
[63] knitr_1.38              ps_1.6.0               
[65] pillar_1.7.0            boot_1.3-28            
[67] ggsignif_0.6.3          codetools_0.2-18       
[69] stats4_4.1.2            reprex_2.0.1           
[71] glue_1.6.2              V8_4.1.0               
[73] data.table_1.14.2       RcppParallel_5.1.5     
[75] modelr_0.1.8            nloptr_2.0.0           
[77] vctrs_0.4.0             tzdb_0.3.0             
[79] cellranger_1.1.0        gtable_0.3.0           
[81] reshape_0.8.8           assertthat_0.2.1       
[83] xfun_0.30               broom_0.7.12           
[85] rstatix_0.7.0           ellipsis_0.3.2  
1 Like

In unit_prob_loop you are indexing into any_samples with idx, but any_samples has been sliced by reduce_sum, so the valid indices are 1 to 1 + end - start.

2 Likes

@jsocolar thank you.

For anyone following along, II had to setup a second index in unit_prob_loop():

  for (idx in 1:(1 + end - start)) {
    // loop over units
    int idx_non_slice = idx - 1 + start;
    target_temp += unit_occupancy_prob_lpmf(
      y |
      k_samples,
      any_samples[idx],
      any_seen[idx_non_slice],
      log_psi[idx_non_slice],
      log1m_psi[idx_non_slice],
      logit_p,
      start_idx[idx_non_slice],
      end_idx[idx_non_slice]);
  }

I’ll post here to help the Stan community if I end up changing this code.