Compiled code from Stan and Rcpp


#1

I’m migrating this from a StackOverflow discussion because I think it will be easier to discuss what’s happening here. The issue is a package that contains compiled code from both Stan models and Rcpp functions. A minimal reproducible example can be found in my rcppstan repository. After adding infrastructure from the rstantools package to support Stan models, I get this NOTE from devtools::check():

#> ❯ checking R code for possible problems ... NOTE
#>   meanC: no visible binding for global variable ‘_rcppstan_meanC’
#>   Undefined global functions or variables:
#>     _rcppstan_meanC

And indeed, when I try to call the R function that uses the meanC function, I get an error saying Error in meanC(x) : object '_rcppstan_meanC' not found.

From what I can tell, here is what is changing when I modify the package to work with rstan, and thus the likely cause.

  1. When only using Rcpp, the following is in the src/RcppExports.cpp:

     static const R_CallMethodDef CallEntries[] = {
         {"_rcpptest_timesTwo", (DL_FUNC) &_rcpptest_timesTwo, 1},
         {NULL, NULL, 0}
     };
    
     RcppExport void R_init_rcpptest(DllInfo *dll) {
         R_registerRoutines(dll, NULL, CallEntries, NULL, NULL);
         R_useDynamicSymbols(dll, FALSE);
     }
    
  2. When Stan is incorporated, that code is no longer generated in the src/RcppExports.cpp file. Instead, it appears that registration is being handled by the src/init.cpp file created by the rstantools package. The relevant chunk from that file is here:

     static const R_CallMethodDef CallEntries[] = {
       {NULL, NULL, 0}
     };
    
     void attribute_visible R_init_rcppstan(DllInfo *dll) {
       // next line is necessary to avoid a NOTE from R CMD check
       R_registerRoutines(dll, NULL, CallEntries, NULL, NULL);
       R_useDynamicSymbols(dll, TRUE); // necessary for .onLoad() to work
     }
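
The difference between the two tables above can be sketched with a toy model. This is not R's API: ToyCallEntry, toy_is_registered, and the two table names are hypothetical stand-ins. The sketch assumes that the generated R wrapper refers to the routine by its registered name, which is what the "no visible binding" NOTE suggests. It only shows that a table holding nothing but the terminator cannot yield an entry for _rcppstan_meanC.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical stand-in for R's R_CallMethodDef (the real struct also
// carries a function pointer; it is omitted here to stay self-contained).
struct ToyCallEntry {
    const char *name;
    int num_args;
};

// Shape of the table in the rstantools-generated src/init.cpp:
// only the terminating entry, so nothing is registered.
static const ToyCallEntry kInitCppEntries[] = {
    {nullptr, 0}
};

// Shape of the table Rcpp would generate for the meanC wrapper.
static const ToyCallEntry kRcppExportsEntries[] = {
    {"_rcppstan_meanC", 1},
    {nullptr, 0}
};

// Walk a terminator-ended table and report whether `name` appears in it.
bool toy_is_registered(const ToyCallEntry *table, const char *name) {
    assert(table != nullptr && name != nullptr);
    for (const ToyCallEntry *e = table; e->name != nullptr; ++e) {
        if (std::strcmp(e->name, name) == 0) {
            return true;
        }
    }
    return false;
}
```

In this model, toy_is_registered(kInitCppEntries, "_rcppstan_meanC") is false and toy_is_registered(kRcppExportsEntries, "_rcppstan_meanC") is true, which mirrors the symptom: with the empty table there is nothing for the R-level name to bind to.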
    

@bgoodri has been very helpful on the StackOverflow question, and suggested I remove src/init.cpp. However, this caused a new error:

make: *** No rule to make target 'init.o', needed by 'rcppstan.so'. Stop.
rm stan_files/uni_irt.cc
ERROR: compilation failed for package 'rcppstan'

@bgoodri further suggested changing the line in src/Makevars{.win} from:
OBJECTS = $(SOURCES:.stan=.o) init.o
to just
OBJECTS = $(SOURCES:.stan=.o)

This solved the compilation problem, but has possibly added more confusion. Following these edits, this code now appears in src/RcppExports.cpp as expected:

static const R_CallMethodDef CallEntries[] = {
    {"_rcppstan_meanC", (DL_FUNC) &_rcppstan_meanC, 1},
    {NULL, NULL, 0}
};

RcppExport void R_init_rcppstan(DllInfo *dll) {
    R_registerRoutines(dll, NULL, CallEntries, NULL, NULL);
    R_useDynamicSymbols(dll, FALSE);
}

However, devtools::check() now reports the original NOTE plus a new one:

#> ❯ checking R code for possible problems ... NOTE
#>   meanC: no visible binding for global variable ‘_rcppstan_meanC’
#>   Undefined global functions or variables:
#>     _rcppstan_meanC
#> 
#> ❯ checking compiled code ... NOTE
#>   File ‘rcppstan/libs/rcppstan.so’:
#>     Found no calls to: ‘R_registerRoutines’, ‘R_useDynamicSymbols’
#>   
#>   It is good practice to register native routines and to disable symbol
#>   search.
#>   
#>   See ‘Writing portable packages’ in the ‘Writing R Extensions’ manual.

It appears that even though the registration is happening in src/RcppExports.cpp, it is for some reason not being recognized when the package builds/installs. The updated repo with all of these changes is here: rcppstan. Any help would be greatly appreciated! Thanks!


#2

I think I should have said to change OBJECTS = $(SOURCES:.stan=.o) init.o to OBJECTS = $(SOURCES:.stan=.o) RcppExports.o.


#3

This introduces a new installation error:

installing to /Users/jakethompson/R/rcppstan/libs
** R
** inst
** byte-compile and prepare package for lazy loading
** help
*** installing help indices
** building package indices
** testing if installed package can be loaded
Error in dyn.load(dllfile) : 
  unable to load shared object '/Users/jakethompson/Documents/GIT/packages/rcppstan/src/rcppstan.so':
  dlopen(/Users/jakethompson/Documents/GIT/packages/rcppstan/src/rcppstan.so, 6): Symbol not found: __Z5meanCN4Rcpp6VectorILi14ENS_15PreserveStorageEEE
  Referenced from: /Users/jakethompson/Documents/GIT/packages/rcppstan/src/rcppstan.so
  Expected in: flat namespace
 in /Users/jakethompson/Documents/GIT/packages/rcppstan/src/rcppstan.so

#4

That undefined symbol is meanC(Rcpp::Vector<14, Rcpp::PreserveStorage>), which I think is coming from the Rcpp stuff rather than the Stan stuff. Can you successfully compile and load just the Rcpp stuff?
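
For anyone who wants to decode such a symbol themselves, here is a minimal sketch using abi::__cxa_demangle from <cxxabi.h>, which is available with GCC and Clang. The helper name demangle_symbol is made up for this example. It assumes an Itanium-ABI mangled name and strips the one extra leading underscore that macOS adds to symbols.

```cpp
#include <cassert>
#include <cstdlib>
#include <cxxabi.h>
#include <string>

// Demangle an Itanium-ABI C++ symbol name.
// Returns the input unchanged if it cannot be demangled.
std::string demangle_symbol(const char *mangled) {
    assert(mangled != nullptr);
    std::string name(mangled);

    // macOS linker messages show one extra leading underscore
    // ("__Z..." instead of "_Z..."); drop it before demangling.
    if (name.rfind("__Z", 0) == 0) {
        name.erase(0, 1);
    }

    int status = 0;
    char *out = abi::__cxa_demangle(name.c_str(), nullptr, nullptr, &status);
    if (status != 0 || out == nullptr) {
        return std::string(mangled);
    }

    std::string result(out);
    std::free(out);  // __cxa_demangle allocates the result with malloc
    return result;
}
```

Passing the symbol from the dlopen message, "__Z5meanCN4Rcpp6VectorILi14ENS_15PreserveStorageEEE", gives meanC(Rcpp::Vector<14, Rcpp::PreserveStorage>). The 14 is REALSXP, so this is the C++ function meanC taking an Rcpp::NumericVector: the generated wrapper references it, but no object file that defines it was linked into rcppstan.so.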


#5

Yes, this was my mistake. I needed to regenerate src/rcppstan.so. Now the package builds correctly. I’ve now reached what I think is the last hurdle. The package builds and installs fine, but attaching it with library(rcppstan) throws this error:

Error: package or namespace load failed for ‘rcppstan’ in .doLoadActions(where, attach):
 error in load action .__A__.1 for package rcppstan: is(module, "character"): object 'm' not found

I believe this is because src/RcppExports.cpp is using
R_useDynamicSymbols(dll, FALSE);
instead of what the rstantools package uses, which is:
R_useDynamicSymbols(dll, TRUE);

In rstantools the comments say that TRUE is necessary for .onLoad() to work, so I’m assuming this is the cause of the error. I have tried manually setting this to TRUE, but when the package is built, Rcpp regenerates src/RcppExports.cpp and overrides my changes, putting FALSE back in place.

Do you know how to stop the automatic override of src/RcppExports.cpp? Or is this a better question for the Rcpp devs?


#6

Yeah. Those Stan things are Rcpp modules. So the question for the Rcpp devs is how to have Rcpp modules and Rcpp non-modules in the same package?


#7

It is possible that you can take the module loading line out of .onLoad() in zzz.R and not load it until right before you call rstan::sampling.


#8

It appears I was mistaken earlier. src/rcppstan.so was generating correctly originally. In an effort to compile only the Rcpp functions, I removed all of the Stan infrastructure, and built the package. This worked fine, with no errors as expected. Then I added back in the Stan files. When rebuilding the package, I got no errors, and assumed it was fixed. This, however, was not the case. When doing a fresh install of the package using devtools::install_github("wjakethompson/rcppstan"), I again get this error:

Error: package or namespace load failed for ‘rcppstan’ in dyn.load(file, DLLpath = DLLpath, ...):
 unable to load shared object '/Users/jakethompson/R/rcppstan/libs/rcppstan.so':
  dlopen(/Users/jakethompson/R/rcppstan/libs/rcppstan.so, 6): Symbol not found: __Z5meanCN4Rcpp6VectorILi14ENS_15PreserveStorageEEE
  Referenced from: /Users/jakethompson/R/rcppstan/libs/rcppstan.so
  Expected in: flat namespace
 in /Users/jakethompson/R/rcppstan/libs/rcppstan.so
Error: loading failed
Execution halted

After some more rigorous experimenting, it seems the problem is the Makevars files. If I have only the Rcpp functions present (no Stan scripts), the package builds fine. However, when I add back in src/Makevars{.win}, I again get the above error. Is there something in src/Makevars{.win} that would be causing this? I’m not familiar enough with those files to understand exactly what they are doing.


#9

Those Makevars files tell R what needs to get compiled and how. It seems that the procedures for making Stan packages and Rcpp packages are interfering with each other.


#10

Yeah, I think the easiest and probably best solution, for now anyway, is to split off the Stan functionality into a separate package. This way those processes won’t interfere with each other. Thanks for all of your help!