Model fitting (compilation) error in "wythamewa" model

The problem is as follows: I’m trying to reproduce the results of Aplin, Sheldon & McElreath (2017) using the “wythamewa” package, i.e., fitting a Stan model to the data via the “ewa_fit” function (everything is provided within the package).
While the model starts precompiling, the model fitting eventually fails at the compilation stage, producing the following messages and warnings:

Error in compileCode(f, code, language = language, verbose = verbose) : 
                   from file166c284c27c5.cpp:14:C:/Users/User/Documents/R/win-library/4.0/StanHeaders/include/stan/math/rev/core/set_zero_all_adjoints.hpp: 
At global scope:C:/Users/User/Documents/R/win-library/4.0/StanHeaders/include/stan/math/rev/core/set_zero_all_adjoints.hpp:14:13: 
warning: 'void stan::math::set_zero_all_adjoints()' defined but not used [-Wunused-function] static void set_zero_all_adjoints() 
{             ^~~~~~~~~~~~~~~~~~~~~make: *** [C:/PROGRA~1/R/R-40~1.2/etc/x64/Makeconf:229: file166c284c27c5.o] Error 1
In addition: Warning message:
In system(paste(CXX, ARGS), ignore.stdout = TRUE, ignore.stderr = TRUE) :
  'C:/rtools40/usr/mingw_/bin/g++' not found
Error in sink(type = "output") : invalid connection

My sessionInfo():

Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 18363)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252    LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C                           LC_TIME=English_United States.1252    
system code page: 1251

attached base packages:
[1] parallel  stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] wythamewa_1.03            rethinking_2.13           rstan_2.21.2              ggplot2_3.3.2             StanHeaders_2.21.0-6     
[6] RcppArmadillo_0.9.900.3.0 Rcpp_1.0.5                inline_0.3.16            

loaded via a namespace (and not attached):
 [1] pillar_1.4.6       compiler_4.0.2     prettyunits_1.1.1  tools_4.0.2        pkgbuild_1.1.0     lattice_0.20-41    jsonlite_1.7.1    
 [8] lifecycle_0.2.0    tibble_3.0.3       gtable_0.3.0       pkgconfig_2.0.3    rlang_0.4.7        cli_2.0.2          rstudioapi_0.11   
[15] curl_4.3           mvtnorm_1.1-1      loo_2.3.1          coda_0.19-3        gridExtra_2.3      withr_2.3.0        dplyr_1.0.2       
[22] generics_0.0.2     vctrs_0.3.4        stats4_4.0.2       grid_4.0.2         tidyselect_1.1.0   glue_1.4.2         R6_2.4.1          
[29] processx_3.4.4     fansi_0.4.1        purrr_0.3.4        callr_3.4.4        magrittr_1.5       MASS_7.3-51.6      codetools_0.2-16  
[36] matrixStats_0.56.0 scales_1.1.1       ps_1.3.4           ellipsis_0.3.1     assertthat_0.2.1   shape_1.4.5        colorspace_1.4-1  
[43] V8_3.2.0           RcppParallel_5.0.2 munsell_0.5.0      crayon_1.3.4      

I have browsed through several discussions that seemed relevant to me as a novice to Stan (e.g., the “Error in compileCode” kind), yet none of the proposed solutions (e.g., a clean installation into default locations without spaces in names) has solved the problem thus far.

Any help will be greatly appreciated & thanks for reading!

Hi, I don’t have Windows myself but @andrjohns has been helping a lot of people with their Windows issues.

Hi Sean,

Sorry I missed this! The first thing to check is your RStan installation. Can you first run:

library(rstan)
example(stan_model, run.dontrun = TRUE, verbose = TRUE)

And post any lines that start with error:?

Next, we need to check how RStan and other packages are configured on your system. Can you post the output from:

Sys.getenv("PATH")
Sys.getenv("BINPREF")
readLines("~/.R/Makevars.win")
readLines("~/.Rprofile")
readLines("~/.Renviron")
devtools::session_info("rstan")
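
If some of those files do not exist, readLines() stops with "cannot open the connection". A minimal base-R sketch that reports missing files instead of erroring could look like this; read_if_exists is a hypothetical helper name used only for illustration, not part of RStan or any package:

```r
# Read a config file if it exists; otherwise report that it is missing
# instead of stopping with "cannot open the connection".
# read_if_exists is a hypothetical helper, not part of any package.
read_if_exists <- function(path) {
  path <- path.expand(path)
  if (file.exists(path)) {
    readLines(path, warn = FALSE)
  } else {
    paste("<no such file:", path, ">")
  }
}

config_files <- c("~/.R/Makevars.win", "~/.Rprofile", "~/.Renviron")
config_report <- lapply(config_files, read_if_exists)
names(config_report) <- config_files
print(config_report)

# Where (if anywhere) R finds the compiler and make on the PATH
print(Sys.which(c("g++", "make")))
```

An empty string from Sys.which() means the tool is not visible on the PATH of the current R session.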

Thanks for the response!

Running the suggested example(stan_model) check results in no error messages, only warnings (the second is harmless, I think):
Warning messages:

1: In find.package(package, lib.loc, verbose = verbose) :
package ‘base’ found more than once, using the first from
“C:/PROGRA~1/R/R-40~1.2/library/base”,
“C:/Program Files/R/R-4.0.2/library/base”
2: In system(paste(CXX, ARGS), ignore.stdout = TRUE, ignore.stderr = TRUE) :
‘C:/rtools40/usr/mingw_/bin/g++’ not found

As for the following checks:

Sys.getenv("PATH")
[1] "C:\\rtools40\\usr\\bin;C:\\Program Files\\R\\R-4.0.2\\bin\\i386;C:\\Program Files (x86)\\Common Files\\Oracle\\Java\\javapath_target_67096062;C:\\Windows\\SysWOW64;C:\\Windows;C:\\Windows\\SysWOW64\\wbem;C:\\Windows\\SysWOW64\\WindowsPowerShell\\v1.0;C:\\Windows\\System32\\OpenSSH\\;C:\\Program Files\\Calibre2;C:\\Users\\User\\AppData\\Local\\Microsoft\\WindowsApps"

Sys.getenv("BINPREF")
[1] ""

readLines("~/.R/Makevars.win")
Error in file(con, "r") : cannot open the connection
In addition: Warning message:
In file(con, "r") :
  cannot open file 'C:/Users/User/Documents/.R/Makevars.win': No such file or directory

readLines("~/.Rprofile")
Error in file(con, "r") : cannot open the connection
In addition: Warning message:
In file(con, "r") :
  cannot open file 'C:/Users/User/Documents/.Rprofile': No such file or directory

readLines("~/.Renviron")
[1] "PATH=\"${RTOOLS40_HOME}\\usr\\bin;${PATH}\""

devtools::session_info("rstan")
- Session info ------------------------------------------------------------------------------------------------------------------------------
 setting  value                       
 version  R version 4.0.2 (2020-06-22)
 os       Windows 10 x64              
 system   i386, mingw32               
 ui       RStudio                     
 language (EN)                        
 collate  English_United States.1252  
 ctype    English_United States.1252  
 tz       Asia/Jerusalem              
 date     2020-09-29                  

- Packages ----------------------------------------------------------------------------------------------------------------------------------
 ! package      * version   date       lib source        
   assertthat     0.2.1     2019-03-21 [1] CRAN (R 4.0.2)
   backports      1.1.10    2020-09-15 [1] CRAN (R 4.0.2)
   BH             1.72.0-3  2020-01-08 [1] CRAN (R 4.0.0)
   callr          3.4.4     2020-09-07 [1] CRAN (R 4.0.2)
   checkmate      2.0.0     2020-02-06 [1] CRAN (R 4.0.2)
   cli            2.0.2     2020-02-28 [1] CRAN (R 4.0.2)
   colorspace     1.4-1     2019-03-18 [1] CRAN (R 4.0.2)
   crayon         1.3.4     2017-09-16 [1] CRAN (R 4.0.2)
   curl           4.3       2019-12-02 [1] CRAN (R 4.0.2)
   desc           1.2.0     2018-05-01 [1] CRAN (R 4.0.2)
   digest         0.6.25    2020-02-23 [1] CRAN (R 4.0.2)
   ellipsis       0.3.1     2020-05-15 [1] CRAN (R 4.0.2)
   evaluate       0.14      2019-05-28 [1] CRAN (R 4.0.2)
   fansi          0.4.1     2020-01-08 [1] CRAN (R 4.0.2)
   farver         2.0.3     2020-01-16 [1] CRAN (R 4.0.2)
   ggplot2        3.3.2     2020-06-19 [1] CRAN (R 4.0.2)
   glue           1.4.2     2020-08-27 [1] CRAN (R 4.0.2)
   gridExtra      2.3       2017-09-09 [1] CRAN (R 4.0.2)
   gtable         0.3.0     2019-03-25 [1] CRAN (R 4.0.2)
   inline         0.3.16    2020-09-06 [1] CRAN (R 4.0.2)
   isoband        0.2.2     2020-06-20 [1] CRAN (R 4.0.2)
   jsonlite       1.7.1     2020-09-07 [1] CRAN (R 4.0.2)
   labeling       0.3       2014-08-23 [1] CRAN (R 4.0.0)
   lattice        0.20-41   2020-04-02 [2] CRAN (R 4.0.2)
   lifecycle      0.2.0     2020-03-06 [1] CRAN (R 4.0.2)
   loo            2.3.1     2020-07-14 [1] CRAN (R 4.0.2)
   magrittr       1.5       2014-11-22 [1] CRAN (R 4.0.2)
   MASS           7.3-51.6  2020-04-26 [2] CRAN (R 4.0.2)
   Matrix         1.2-18    2019-11-27 [2] CRAN (R 4.0.2)
   matrixStats    0.56.0    2020-03-13 [1] CRAN (R 4.0.2)
   mgcv           1.8-31    2019-11-09 [2] CRAN (R 4.0.2)
   munsell        0.5.0     2018-06-12 [1] CRAN (R 4.0.2)
   nlme           3.1-148   2020-05-24 [2] CRAN (R 4.0.2)
   pillar         1.4.6     2020-07-10 [1] CRAN (R 4.0.2)
   pkgbuild       1.1.0     2020-07-13 [1] CRAN (R 4.0.2)
   pkgconfig      2.0.3     2019-09-22 [1] CRAN (R 4.0.2)
   pkgload        1.1.0     2020-05-29 [1] CRAN (R 4.0.2)
   praise         1.0.0     2015-08-11 [1] CRAN (R 4.0.2)
   prettyunits    1.1.1     2020-01-24 [1] CRAN (R 4.0.2)
   processx       3.4.4     2020-09-03 [1] CRAN (R 4.0.2)
   ps             1.3.4     2020-08-11 [1] CRAN (R 4.0.2)
   R6             2.4.1     2019-11-12 [1] CRAN (R 4.0.2)
   RColorBrewer   1.1-2     2014-12-07 [1] CRAN (R 4.0.0)
   Rcpp           1.0.5     2020-07-06 [1] CRAN (R 4.0.2)
   RcppEigen      0.3.3.7.0 2019-11-16 [1] CRAN (R 4.0.2)
 D RcppParallel   5.0.2     2020-06-24 [1] CRAN (R 4.0.2)
   rlang          0.4.7     2020-07-09 [1] CRAN (R 4.0.2)
   rprojroot      1.3-2     2018-01-03 [1] CRAN (R 4.0.2)
   rstan          2.21.2    2020-07-27 [1] CRAN (R 4.0.2)
   rstudioapi     0.11      2020-02-07 [1] CRAN (R 4.0.2)
   scales         1.1.1     2020-05-11 [1] CRAN (R 4.0.2)
   StanHeaders    2.21.0-6  2020-08-16 [1] CRAN (R 4.0.2)
   testthat       2.3.2     2020-03-02 [1] CRAN (R 4.0.2)
   tibble         3.0.3     2020-07-10 [1] CRAN (R 4.0.2)
   utf8           1.1.4     2018-05-24 [1] CRAN (R 4.0.2)
   V8             3.2.0     2020-06-19 [1] CRAN (R 4.0.2)
   vctrs          0.3.4     2020-08-29 [1] CRAN (R 4.0.2)
   viridisLite    0.3.0     2018-02-01 [1] CRAN (R 4.0.2)
   withr          2.3.0     2020-09-22 [1] CRAN (R 4.0.2)

[1] C:/Users/User/Documents/R/win-library/4.0
[2] C:/Program Files/R/R-4.0.2/library

 D -- DLL MD5 mismatch, broken installation.

Alright, so it looks like RStan is working fine, but there’s an issue with the model being called by ewa_fit. I’m not familiar with the package, would you be able to post the code to run the model and reproduce the error?

Posting the exact Stan code can be a bit tricky in this case, so I’ve added the links to the package (wythamewa) to the original post (reattaching here).

After installing the package

install.packages("remotes")
remotes::install_github("rmcelreath/wythamewa")

I’ve loaded the data:

data("WythamUnequal")
datx <- WythamUnequal

Then ran:

lms <- list(
    "mu[1] + a_bird[bird[i],1] + b_age[1]*age[i]",
    "mu[2] + a_bird[bird[i],2] + b_age[2]*age[i]",
    "mu[3] + a_bird[bird[i],3] + b_age[3]*age[i]",
    "mu[4] + a_bird[bird[i],4] + b_age[4]*age[i]"
)
links <- c("logit", "logit", "log", "logit", "")
prior <- "
mu ~ normal(0,1);
diff_hi ~ cauchy(0,1);
b_age ~ normal(0,1);
to_vector(z_bird) ~ normal(0,1);
L_Rho_bird ~ lkj_corr_cholesky(3);
sigma_bird ~ exponential(2);"
mod1 <- ewa_def( model=lms , prior=prior , link=links , data=datx )

cat(mod1$stan_code)

cat(mod1$stan_code) reproduces the default Stan model code I want to fit:

data(template_raw)

Then I ran ewa_fit with the following code, which results in the error described in the original post:

model_name <- "Sva_Gva_Lva_Pva"

lms <- list(
    "mu[1] + a_bird[bird[i],1] + b_age[1]*age[i]",
    "mu[2] + a_bird[bird[i],2] + b_age[2]*age[i]",
    "mu[3] + a_bird[bird[i],3] + b_age[3]*age[i]",
    "mu[4] + a_bird[bird[i],4] + b_age[4]*age[i]"
)
links <- c("logit", "logit", "log", "logit", "")

prior <- "
mu ~ normal(0,1);
diff_hi ~ cauchy(0,1);
b_age ~ normal(0,1);
to_vector(z_bird) ~ normal(0,1);
L_Rho_bird ~ lkj_corr_cholesky(3);
sigma_bird ~ exponential(2);"

mod1 <- ewa_def( model=lms , prior=prior , link=links )
set.seed(1)
m <- ewa_fit( mod1 , warmup=500 , iter=1000 , chains=3 , cores=3 ,
    control=list( adapt_delta=0.99 , max_treedepth=12 ) )

Thanks! I’ll dig into it later this afternoon and let you know

This is an error we’ve seen before which is specific to this RStan version (2.21). You can work around it by running the model through cmdstanR. It looks like the rethinking package has the option to use cmdstanR as the backend, but the wythamewa package is too old for that. Instead, you can run the model through cmdstanR yourself and then read the results into the stanfit format that ewa_fit would return.

First up, install cmdstanR and cmdstan by following the instructions here: https://mc-stan.org/cmdstanr/articles/cmdstanr.html. If you run into errors with the installation, have a look at the instructions in the thread “Error caused by missing stan header” as well.

Then, set up your priors & model as usual:

    model_name <- "Sva_Gva_Lva_Pva"

    lms <- list(
    "mu[1] + a_bird[bird[i],1] + b_age[1]*age[i]",
    "mu[2] + a_bird[bird[i],2] + b_age[2]*age[i]",
    "mu[3] + a_bird[bird[i],3] + b_age[3]*age[i]",
    "mu[4] + a_bird[bird[i],4] + b_age[4]*age[i]"
    )
    links <- c("logit", "logit", "log", "logit", "")

    prior <- "
    mu ~ normal(0,1);
    diff_hi ~ cauchy(0,1);
    b_age ~ normal(0,1);
    to_vector(z_bird) ~ normal(0,1);
    L_Rho_bird ~ lkj_corr_cholesky(3);
    sigma_bird ~ exponential(2);"

    mod1 <- ewa_def( model=lms , prior=prior , link=links )
    set.seed(1)

Then compile and run your model through cmdstanR:

library(cmdstanr)
cmd_mod = cmdstan_model(write_stan_tempfile(mod1$stan_code))
cmd_samp = cmd_mod$sample(
    data = your_data,
    num_samples = 500,
    num_warmup = 500,
    num_chains = 3,
    num_cores = 3,
    adapt_delta = 0.99,
    max_treedepth = 12,
    validate_csv = FALSE
)

Finally, read the results back in to the same type of object returned by ewa_fit:

m = rstan::read_stan_csv(cmd_samp$output_files())

First, thanks for the swift replies!

As for the workaround, after installing cmdstanR I ran
cmd_mod = cmdstan_model(write_stan_tempfile(mod1$stan_code))
yet it returned the following error:

Compiling Stan program...
-
Syntax error in 'C:/Users/User/AppData/Local/Temp/RtmpasdJJx/model-46a062561abe.stan', line 35, column 5 to column 11, parsing error:
   -------------------------------------------------
    33:      
    34:      // priors
    35:      !priors!
                    ^
    36:      // init attractions to preference for low
    37:      for ( i in 1:N_birds ) {
   -------------------------------------------------

Ill-formed expression. Found identifier. There are many ways to complete this to a well-formed expression.


mingw32-make.exe: *** [make/program:53: C:/Users/User/AppData/Local/Temp/RtmpasdJJx/model-46a062561abe.hpp] Error 1
Error: An error occured during compilation! See the message above for more information.
In addition: Warning message:
write_stan_tempfile() is deprecated. Please use write_stan_file() instead.

Running cmd_samp afterwards results in a sampling failure:

Chain 1 Exception: variable does not exist; processing stage=data initialization; variable name=agej; base type=double (in 'C:/Users/User/AppData/Local/Temp/RtmpasdJJx/model-46a0444065db.stan', line 11, column 4 to column 23)
Warning: Chain 1 finished unexpectedly!

Chain 2 Exception: variable does not exist; processing stage=data initialization; variable name=agej; base type=double (in 'C:/Users/User/AppData/Local/Temp/RtmpasdJJx/model-46a0444065db.stan', line 11, column 4 to column 23)
Warning: Chain 2 finished unexpectedly!

Chain 3 Exception: variable does not exist; processing stage=data initialization; variable name=agej; base type=double (in 'C:/Users/User/AppData/Local/Temp/RtmpasdJJx/model-46a0444065db.stan', line 11, column 4 to column 23)
Warning: Chain 3 finished unexpectedly!

Warning: Use read_cmdstan_csv() to read the results of the failed chains.
Warning messages:
1: In cmd_mod$sample(data = datx, num_samples = 500, num_warmup = 500,  :
  'num_cores' is deprecated. Please use 'parallel_chains' instead.
2: In cmd_mod$sample(data = datx, num_samples = 500, num_warmup = 500,  :
  'num_chains' is deprecated. Please use 'chains' instead.
3: In cmd_mod$sample(data = datx, num_samples = 500, num_warmup = 500,  :
  'num_warmup' is deprecated. Please use 'iter_warmup' instead.
4: In cmd_mod$sample(data = datx, num_samples = 500, num_warmup = 500,  :
  'num_samples' is deprecated. Please use 'iter_sampling' instead.
5: All chains finished unexpectedly!
 
6: No chains finished successfully. Unable to retrieve the fit. 

It looks like the priors block wasn’t correctly picked up by ewa_def.

If you run:

lms <- list(
    "mu[1] + a_bird[bird[i],1] + b_age[1]*age[i]",
    "mu[2] + a_bird[bird[i],2] + b_age[2]*age[i]",
    "mu[3] + a_bird[bird[i],3] + b_age[3]*age[i]",
    "mu[4] + a_bird[bird[i],4] + b_age[4]*age[i]"
)
links <- c("logit", "logit", "log", "logit", "")

prior <- "
    mu ~ normal(0,1);
    diff_hi ~ cauchy(0,1);
    b_age ~ normal(0,1);
    to_vector(z_bird) ~ normal(0,1);
    L_Rho_bird ~ lkj_corr_cholesky(3);
    sigma_bird ~ exponential(2);"

mod1 <- ewa_def( model=lms , prior=prior , link=links )

And then run:

cat(mod1$stan_code)

You should be able to see a section in the model block with:

    // priors
    
    mu ~ normal(0,1);
    diff_hi ~ cauchy(0,1);
    b_age ~ normal(0,1);
    to_vector(z_bird) ~ normal(0,1);
    L_Rho_bird ~ lkj_corr_cholesky(3);
    sigma_bird ~ exponential(2);
    // init attractions to preference for low
    for ( i in 1:N_birds ) {
        A[i,1] = 0;
        A[i,2] = A_init[i];    // should be STRENGTH_PREF_PREVEXP
    }
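
A quick way to check for an unfilled template slot without reading the whole program is to search the generated code for leftover markers. The sketch below is only a heuristic: it assumes placeholders have the form !name! (the only one seen in this thread is !priors!), find_placeholders is a hypothetical helper, and the stan_code string is a toy stand-in for mod1$stan_code.

```r
# Toy stand-in for mod1$stan_code (hypothetical, for illustration only).
stan_code <- "model {\n    // priors\n    !priors!\n    // init attractions\n}"

# Find template placeholders of the assumed form !name! that were never
# filled in. find_placeholders is a hypothetical helper, not part of wythamewa.
find_placeholders <- function(code) {
  lines <- strsplit(code, "\n", fixed = TRUE)[[1]]
  hits <- grep("![A-Za-z_]+!", lines)
  data.frame(line = hits, text = trimws(lines[hits]))
}

print(find_placeholders(stan_code))
```

In a real session you would pass mod1$stan_code instead of the toy string; a result with zero rows suggests that every slot was substituted.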

Thanks, the compilation seems to complete successfully this time:

Compiling Stan program...
Warning message:
write_stan_tempfile() is deprecated. Please use write_stan_file() instead.

However, the sampling with cmd_samp leads to the same error as above:

Chain 1 Exception: variable does not exist; processing stage=data initialization; variable name=agej; base type=double (in 'C:/Users/User/AppData/Local/Temp/RtmpasdJJx/model-46a02fbc6ee2.stan', line 11, column 4 to column 23)
Warning: Chain 1 finished unexpectedly!

Chain 2 Exception: variable does not exist; processing stage=data initialization; variable name=agej; base type=double (in 'C:/Users/User/AppData/Local/Temp/RtmpasdJJx/model-46a02fbc6ee2.stan', line 11, column 4 to column 23)
Warning: Chain 2 finished unexpectedly!

Chain 3 Exception: variable does not exist; processing stage=data initialization; variable name=agej; base type=double (in 'C:/Users/User/AppData/Local/Temp/RtmpasdJJx/model-46a02fbc6ee2.stan', line 11, column 4 to column 23)
Warning: Chain 3 finished unexpectedly!

Warning: Use read_cmdstan_csv() to read the results of the failed chains.
Warning messages:
1: In cmd_mod$sample(data = datx, num_samples = 500, num_warmup = 500,  :
  'num_cores' is deprecated. Please use 'parallel_chains' instead.
2: In cmd_mod$sample(data = datx, num_samples = 500, num_warmup = 500,  :
  'num_chains' is deprecated. Please use 'chains' instead.
3: In cmd_mod$sample(data = datx, num_samples = 500, num_warmup = 500,  :
  'num_warmup' is deprecated. Please use 'iter_warmup' instead.
4: In cmd_mod$sample(data = datx, num_samples = 500, num_warmup = 500,  :
  'num_samples' is deprecated. Please use 'iter_sampling' instead.
5: All chains finished unexpectedly!
 
6: No chains finished successfully. Unable to retrieve the fit.

Does it indicate a problem in the “wythamewa” model itself? (i.e., specification of a variable not provided in the data)

It looks like the ewa_def function does some custom data pre-processing, so you need to use the data from that function in the cmdstanr call:

cmd_samp = cmd_mod$sample(
    data = mod1$data,
    num_samples = 500,
    num_warmup = 500,
    num_chains = 3,
    num_cores = 3,
    adapt_delta = 0.99,
    max_treedepth = 12,
    validate_csv = FALSE
)
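
To see this kind of mismatch before sampling, you can compare the variable names the model expects with the names in the list you pass as data. This is a rough sketch with toy values: only agej comes from the error message above, and the rest of the required vector and the contents of data_list are assumptions for illustration.

```r
# Variables the model's data block is assumed to expect. Only "agej" is taken
# from the error message in this thread; "N_birds" is an illustrative guess.
required <- c("N_birds", "agej")

# Toy stand-in for the list passed as `data` (hypothetical values).
data_list <- list(N_birds = 10, age = rnorm(10))

# Names the model needs but the data list does not provide
missing_vars <- setdiff(required, names(data_list))
if (length(missing_vars) > 0) {
  message("Missing from data: ", paste(missing_vars, collapse = ", "))
}
```

In a real session, comparing names(mod1$data) with names(datx) in the same way would show which variables ewa_def adds during pre-processing.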

Also, it’s probably best practice not to use the deprecated functions (I’m a bit behind on cmdstanR!), so the workflow should be:

library(cmdstanr)
cmd_mod = cmdstan_model(write_stan_file(mod1$stan_code))
cmd_samp = cmd_mod$sample(
    data = mod1$data,
    iter_sampling = 500,
    iter_warmup = 500,
    chains = 3,
    parallel_chains = 3,
    adapt_delta = 0.99,
    max_treedepth = 12,
    validate_csv = FALSE
)

m = rstan::read_stan_csv(cmd_samp$output_files())

The workaround with cmdstanR did the job.

All 3 chains finished successfully.
Mean chain execution time: 12583.6 seconds.
Total execution time: 15676.8 seconds.

Many thanks for your help, it’s deeply appreciated!
Sean

No worries, glad that we could get you sorted!
