Compiled code from Stan and Rcpp

I’m migrating this from a StackOverflow discussion, because I think it will be easier to discuss what’s happening here. The issue is a package that contains compiled code from both Stan models and Rcpp functions. A minimal reproducible example can be found in my rcppstan repository. After adding infrastructure from the rstantools package to support Stan models, I get this NOTE from devtools::check():

#> ❯ checking R code for possible problems ... NOTE
#>   meanC: no visible binding for global variable ‘_rcppstan_meanC’
#>   Undefined global functions or variables:
#>     _rcppstan_meanC

And indeed, when I try to call the R function that uses the meanC function, I get an error saying Error in meanC(x) : object '_rcppstan_meanC' not found.

From what I can tell, here is what is changing when I modify the package to work with rstan, and thus the likely cause.

  1. When only using Rcpp, the following is in the src/RcppExports.cpp:

     static const R_CallMethodDef CallEntries[] = {
         {"_rcpptest_timesTwo", (DL_FUNC) &_rcpptest_timesTwo, 1},
         {NULL, NULL, 0}
     };
    
     RcppExport void R_init_rcpptest(DllInfo *dll) {
         R_registerRoutines(dll, NULL, CallEntries, NULL, NULL);
         R_useDynamicSymbols(dll, FALSE);
     }
    
  2. When Stan is incorporated, that code is no longer generated in the src/RcppExports.cpp file. Instead, it appears that this is being handled by the src/init.cpp file created by the rstantools package. The relevant chunk from that file is here:

     static const R_CallMethodDef CallEntries[] = {
       {NULL, NULL, 0}
     };
    
     void attribute_visible R_init_rcppstan(DllInfo *dll) {
       // next line is necessary to avoid a NOTE from R CMD check
       R_registerRoutines(dll, NULL, CallEntries, NULL, NULL);
       R_useDynamicSymbols(dll, TRUE); // necessary for .onLoad() to work
     }
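
To make the difference between the two tables concrete, here is a simplified, standalone model of a name lookup in a NULL-terminated registration table. It is only a sketch: CallMethodDef, DlFunc, meanC_stub, and is_registered are hypothetical stand-ins invented for illustration, not R's real types or API (the real struct is R_CallMethodDef from R_ext/Rdynload.h). It shows why a table holding only the terminator cannot resolve _rcppstan_meanC, while the table Rcpp generates can.

```cpp
#include <cstring>

// Hypothetical stand-in for a generic native function pointer.
using DlFunc = void (*)();

// Hypothetical stand-in for R's R_CallMethodDef: name, pointer, arity.
struct CallMethodDef {
    const char* name;
    DlFunc fun;
    int numArgs;
};

// Placeholder for the compiled wrapper that would back meanC.
static void meanC_stub() {}

// Shape of the table in the rstantools-generated init.cpp:
// nothing but the terminating entry.
static const CallMethodDef kStanOnlyEntries[] = {
    {nullptr, nullptr, 0}
};

// Shape of the table Rcpp writes into RcppExports.cpp.
static const CallMethodDef kRcppEntries[] = {
    {"_rcppstan_meanC", &meanC_stub, 1},
    {nullptr, nullptr, 0}
};

// Walk the table until the terminator and report whether `name` is present.
bool is_registered(const CallMethodDef* table, const char* name) {
    for (const CallMethodDef* e = table; e->name != nullptr; ++e) {
        if (std::strcmp(e->name, name) == 0) {
            return true;
        }
    }
    return false;
}
```

In this model, is_registered(kStanOnlyEntries, "_rcppstan_meanC") is false and is_registered(kRcppEntries, "_rcppstan_meanC") is true, which matches the "object '_rcppstan_meanC' not found" symptom when only the init.cpp table is registered.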
    

@bgoodri has been very helpful on the StackOverflow question, and suggested I remove src/init.cpp. However, this caused a new error:

make: *** No rule to make target 'init.o', needed by 'rcppstan.so'. Stop.
rm stan_files/uni_irt.cc
ERROR: compilation failed for package 'rcppstan'

@bgoodri further suggested changing the line in src/Makevars{.win} from:
OBJECTS = $(SOURCES:.stan=.o) init.o
to just
OBJECTS = $(SOURCES:.stan=.o)

This solved the compilation problem, but has possibly added more confusion. Following these edits, this code now appears in src/RcppExports.cpp as expected:

static const R_CallMethodDef CallEntries[] = {
    {"_rcppstan_meanC", (DL_FUNC) &_rcppstan_meanC, 1},
    {NULL, NULL, 0}
};

RcppExport void R_init_rcppstan(DllInfo *dll) {
    R_registerRoutines(dll, NULL, CallEntries, NULL, NULL);
    R_useDynamicSymbols(dll, FALSE);
}

However, I now get new errors in devtools::check():

#> ❯ checking R code for possible problems ... NOTE
#>   meanC: no visible binding for global variable ‘_rcppstan_meanC’
#>   Undefined global functions or variables:
#>     _rcppstan_meanC
#> 
#> ❯ checking compiled code ... NOTE
#>   File ‘rcppstan/libs/rcppstan.so’:
#>     Found no calls to: ‘R_registerRoutines’, ‘R_useDynamicSymbols’
#>   
#>   It is good practice to register native routines and to disable symbol
#>   search.
#>   
#>   See ‘Writing portable packages’ in the ‘Writing R Extensions’ manual.

It appears that even though the registration is happening in src/RcppExports.cpp, it is for some reason not being recognized when the package builds/installs. The updated repo with all of these changes is here: rcppstan. Any help would be greatly appreciated! Thanks!

I think I should have said to change OBJECTS = $(SOURCES:.stan=.o) init.o to OBJECTS = $(SOURCES:.stan=.o) RcppExports.o.

This introduces a new installation error:

installing to /Users/jakethompson/R/rcppstan/libs
** R
** inst
** byte-compile and prepare package for lazy loading
** help
*** installing help indices
** building package indices
** testing if installed package can be loaded
Error in dyn.load(dllfile) : 
  unable to load shared object '/Users/jakethompson/Documents/GIT/packages/rcppstan/src/rcppstan.so':
  dlopen(/Users/jakethompson/Documents/GIT/packages/rcppstan/src/rcppstan.so, 6): Symbol not found: __Z5meanCN4Rcpp6VectorILi14ENS_15PreserveStorageEEE
  Referenced from: /Users/jakethompson/Documents/GIT/packages/rcppstan/src/rcppstan.so
  Expected in: flat namespace
 in /Users/jakethompson/Documents/GIT/packages/rcppstan/src/rcppstan.so

That undefined symbol is meanC(Rcpp::Vector<14, Rcpp::PreserveStorage>), which I think is coming from the Rcpp stuff rather than the Stan stuff. Can you successfully compile and load just the Rcpp stuff?
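
For anyone wanting to decode a symbol like this themselves, here is a minimal sketch using abi::__cxa_demangle from <cxxabi.h>. It assumes GCC or Clang (the header is not available with MSVC), and the demangle helper name is made up for illustration. The dlopen message comes from macOS, where symbols carry one extra leading underscore, so the helper strips it before demangling.

```cpp
#include <cxxabi.h>
#include <cstdlib>
#include <string>

// Demangle an Itanium-ABI C++ symbol name.
// Returns an empty string if the name cannot be demangled.
std::string demangle(std::string symbol) {
    // Mach-O (macOS) prepends an underscore to every symbol, so "__Z..."
    // in a dlopen error corresponds to "_Z..." in the C++ ABI.
    if (symbol.rfind("__Z", 0) == 0) {
        symbol.erase(0, 1);
    }

    int status = 0;
    // With null buffer arguments, __cxa_demangle allocates the result
    // with malloc; the caller must release it with free.
    char* out = abi::__cxa_demangle(symbol.c_str(), nullptr, nullptr, &status);
    if (status != 0 || out == nullptr) {
        return std::string();
    }

    std::string result(out);
    std::free(out);
    return result;
}
```

Passing the symbol from the error message to demangle() should give back the meanC signature quoted above, i.e. the plain C++ function that the generated wrapper calls but that was not linked into rcppstan.so.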

Yes, this was my mistake. I needed to regenerate src/rcppstan.so. Now the package builds correctly. I’ve now reached what I think is the last hurdle. The package builds and installs fine, but attaching it with library(rcppstan) throws this error:

Error: package or namespace load failed for ‘rcppstan’ in .doLoadActions(where, attach):
 error in load action .__A__.1 for package rcppstan: is(module, "character"): object 'm' not found

I believe this is because src/RcppExports.cpp is using
R_useDynamicSymbols(dll, FALSE);
instead of what the rstantools package uses, which is:
R_useDynamicSymbols(dll, TRUE);

In rstantools the comments say that TRUE is necessary for .onLoad() to work, so I’m assuming this is the cause of the error. I have tried manually setting this to TRUE, but when the package is built, Rcpp regenerates src/RcppExports.cpp and overrides my changes, putting FALSE back in place.

Do you know how to stop the automatic override of src/RcppExports.cpp? Or is this a better question for the Rcpp devs?

Yeah. Those Stan things are Rcpp modules. So the question for the Rcpp devs is how to have Rcpp modules and Rcpp non-modules in the same package?

It is possible that you can take the module loading line out of .onLoad() in zzz.R and not load it until right before you call rstan::sampling.

It appears I was mistaken earlier. src/rcppstan.so was generating correctly originally. In an effort to compile only the Rcpp functions, I removed all of the Stan infrastructure, and built the package. This worked fine, with no errors as expected. Then I added back in the Stan files. When rebuilding the package, I got no errors, and assumed it was fixed. This, however, was not the case. When doing a fresh install of the package using devtools::install_github("wjakethompson/rcppstan"), I again get this error:

Error: package or namespace load failed for ‘rcppstan’ in dyn.load(file, DLLpath = DLLpath, ...):
 unable to load shared object '/Users/jakethompson/R/rcppstan/libs/rcppstan.so':
  dlopen(/Users/jakethompson/R/rcppstan/libs/rcppstan.so, 6): Symbol not found: __Z5meanCN4Rcpp6VectorILi14ENS_15PreserveStorageEEE
  Referenced from: /Users/jakethompson/R/rcppstan/libs/rcppstan.so
  Expected in: flat namespace
 in /Users/jakethompson/R/rcppstan/libs/rcppstan.so
Error: loading failed
Execution halted

After some more rigorous experimenting, it seems the problem is the Makevars files. If I have only the Rcpp functions present (no Stan scripts), the package builds fine. However, when I add back in src/Makevars{.win}, I again get the above error. Is there something in src/Makevars{.win} that would be causing this? I’m not familiar enough with those files to understand exactly what they are doing.

Those Makevars files tell R what needs to get compiled and how. It seems that the procedures for making Stan packages and Rcpp packages are interfering with each other.

Yeah, I think the easiest and probably best solution, for now anyway, is to split off the Stan functionality into a separate package. This way those processes won’t interfere with each other. Thanks for all of your help!