Struggling with a user-defined C++ function for binning real data into a sequence of integer counts

In brief, I’m trying to write a custom C++ function that takes a vector of reals and returns an array of integer counts, given binning parameters (start, end, bin width). I’m new to both C++ and Stan, and I’ll be using rstan to implement various models that rely on this counting function. I’m also using the task as a way to learn rstan/Stan/C++ and to build experience designing highly customized Stan models. I’ve now hit a parsing error that I can’t figure out, and I would appreciate some help (and any suggestions for improving the code below).
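As a point of reference for the binning described above, here is a minimal standalone sketch in plain C++, independent of Stan and Eigen. The function name bin_counts and the choice of std::vector are assumptions made for illustration; bins are taken to be half-open intervals of width delta starting at minimum, and the number of bins is floor((maximum - minimum) / delta), as in the R script further down.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Count how many values fall into each half-open bin
// [minimum + k*delta, minimum + (k+1)*delta), for k = 0 .. nbins-1.
// Values outside [minimum, maximum) are ignored.
// bin_counts is a hypothetical name used only for this sketch.
std::vector<int> bin_counts(const std::vector<double>& values,
                            double minimum, double maximum, double delta) {
    assert(delta > 0 && maximum >= minimum);
    const int nbins =
        static_cast<int>(std::floor((maximum - minimum) / delta));
    std::vector<int> counts(nbins, 0);
    for (double x : values) {
        // Written this way so that NaN is skipped as well.
        if (!(x >= minimum && x < maximum)) continue;
        const int idx =
            static_cast<int>(std::floor((x - minimum) / delta));
        // Guard against a partial last bin when (maximum - minimum)
        // is not a whole multiple of delta.
        if (idx < nbins) ++counts[idx];
    }
    return counts;
}
```

With the sample values from the R script (minimum = 5, maximum = 25, delta = 5), this gives four bins with counts 5, 5, 3 and 1.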

To begin, the error. In an R session I’m sourcing a script (see below) that builds a Stan model simply to test a custom C++ function (also below). The R code is stored in “count2stan.R”, and the C++ function definition is in “stancount.cpp”. When sourced, the R script produces the following error:

source("./count2stan.R")
SYNTAX ERROR, MESSAGE(S) FROM PARSER:
error in ‘model10333a7f6ac4fb_counting’ at line 3, column 4


 1: 
 2: functions {
 3:     array[] int count( vector somereals,
       ^
 4:                     real minimum,

PARSER EXPECTED: “}”
Error in stanc(file = file, model_code = model_code, model_name = model_name, :
failed to parse Stan model ‘counting’ due to the above error.

Now the relevant code (though, see attached for files so you don’t have to copy-n-paste).
count2stan.R (1.1 KB)
stancount.cpp (722 Bytes)

R script:

library(rstan)

model_code <-
'
functions {
    array[] int count( vector somereals,
                    real minimum,
                    real maximum,
                    real delta );
}
data {
    int N;
    int Nbins;
    vector[N] somereals;
    real minimum;
    real maximum;
    real delta;
}
model {}
generated quantities {
    int y[Nbins] = count(somereals, minimum, maximum, delta);
}
'

somereals = c(1, 2, 3.5, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14.99, 15.7, 16, 17, 24, 26)
N = length(somereals)
minimum = 5
maximum = 25
delta = 5
Nbins = floor((maximum - minimum) / delta)

mydata <- list(
    somereals = somereals,
    N = N,
    minimum = minimum,
    maximum = maximum,
    delta = delta,
    Nbins = Nbins
)

stancount = stan_model(model_code = model_code,
                        model_name = "counting",
                        allow_undefined = TRUE,
                        includes = paste0('\n#include "',
                                        file.path(getwd(), 'stancount.cpp'),
                                        '"\n'))

fit <- sampling(stancount, data = mydata)

#try(readLines(stanc(model_code=model_code, allow_undefined=T)$cppcode))

And the C++:

#include <stan/math.hpp>

Eigen::Matrix<typename boost::math::tools::promote_args<T0__, T1__, T2__, T3__>::type, Eigen::Dynamic, 1>
count(const Eigen::Matrix<T0__, Eigen::Dynamic, 1>& invec,
          const T1__& minimum,
          const T2__& maximum,
          const T3__& delta, std::ostream* pstream__)
{
    int invec_size = invec.size();
    int cntvec_size = floor((maximum - minimum) / delta);
    Eigen::VectorXi cntvec(cntvec_size) = Eigen::VectorXi::Zero(); //vector<int> cntvec(cntvec_size, 0);
    for(int i=0; i<invec_size; i++){
        if(invec[i] >= minimum && invec[i] < maximum){
            int idx = floor((invec[i] - minimum) / delta);
            cntvec[idx]++;
        }
    }
    return cntvec;
}

The parser accepts the Stan function declaration if I use “vector” or “int” as the return type, but throws the aforementioned error when I use “array[] int”. With “vector” as the return type, I was able to call the following to see the generated C++ code and try to match my custom function signature to it,

try(readLines(stanc(model_code=model_code, allow_undefined=T)$cppcode))

I got this from the rstan vignette “Interfacing with External C++ Code” and have been trying to follow that guide.

Any help here would be greatly appreciated!

Operating System: Ubuntu 20.04
Interface Version: latest rstan
Compiler/Toolkit: gcc

Sincerely,
Chris


@andrjohns

By “latest RStan” do you mean the version available on CRAN? If so, that version is quite old and will not accept the newer array[] keyword syntax. You will have to use the older syntax instead: int[] count (...

Yes, I should have been clearer. Sorry about that! I installed from CRAN with install.packages(). I’ll try the older syntax, and maybe try installing the latest dev version of Stan instead of using CRAN.

If you want to try using the newer versions, you can find instructions here: Rstan Versioning - #2 by rok_cesnovar

Okay, it’s compiling now. Thanks for the help. I installed the newer version and fixed up my C++ after reading Interfacing with External C++ Code a bit more carefully.

But I’m now getting a DLL error when I try to call the new C++ function. The source is attached here,

count2stan.R (1.2 KB)
stancount.cpp (498 Bytes)

And here’s the error,

make would use
g++ -std=gnu++14 -I"/usr/share/R/include" -DNDEBUG   -I"/home/carleton/R/x86_64-pc-linux-gnu-library/4.2/Rcpp/include/"  -I"/home/carleton/R/x86_64-pc-linux-gnu-library/4.2/RcppEigen/include/"  -I"/home/carleton/R/x86_64-pc-linux-gnu-library/4.2/RcppEigen/include/unsupported"  -I"/home/carleton/R/x86_64-pc-linux-gnu-library/4.2/BH/include" -I"/home/carleton/R/x86_64-pc-linux-gnu-library/4.2/StanHeaders/include/src/"  -I"/home/carleton/R/x86_64-pc-linux-gnu-library/4.2/StanHeaders/include/"  -I"/home/carleton/R/x86_64-pc-linux-gnu-library/4.2/RcppParallel/include/"  -I"/home/carleton/R/x86_64-pc-linux-gnu-library/4.2/rstan/include" -DEIGEN_NO_DEBUG  -DBOOST_DISABLE_ASSERTS  -DBOOST_PENDING_INTEGER_LOG2_HPP  -DSTAN_THREADS  -DUSE_STANC3 -DSTRICT_R_HEADERS  -DBOOST_PHOENIX_NO_VARIADIC_EXPRESSION  -DBOOST_NO_AUTO_PTR  -include '/home/carleton/R/x86_64-pc-linux-gnu-library/4.2/StanHeaders/include/stan/math/prim/fun/Eigen.hpp'  -D_REENTRANT -DRCPP_PARALLEL_USE_TBB=1      -fpic  -g -O2 -fdebug-prefix-map=/build/r-base-apO4Ea/r-base-4.2.0=. -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2  -c file118f9e2f8fd9ba.cpp -o file118f9e2f8fd9ba.o
if test  "zfile118f9e2f8fd9ba.o" != "z"; then \
  echo g++ -std=gnu++14 -shared -L"/usr/lib/R/lib" -Wl,-Bsymbolic-functions -Wl,-z,relro -o file118f9e2f8fd9ba.so file118f9e2f8fd9ba.o  '/home/carleton/R/x86_64-pc-linux-gnu-library/4.2/rstan/lib//libStanServices.a' -L'/home/carleton/R/x86_64-pc-linux-gnu-library/4.2/StanHeaders/lib/' -lStanHeaders -L'/home/carleton/R/x86_64-pc-linux-gnu-library/4.2/RcppParallel/lib/' -ltbb   -L"/usr/lib/R/lib" -lR; \
  g++ -std=gnu++14 -shared -L"/usr/lib/R/lib" -Wl,-Bsymbolic-functions -Wl,-z,relro -o file118f9e2f8fd9ba.so file118f9e2f8fd9ba.o  '/home/carleton/R/x86_64-pc-linux-gnu-library/4.2/rstan/lib//libStanServices.a' -L'/home/carleton/R/x86_64-pc-linux-gnu-library/4.2/StanHeaders/lib/' -lStanHeaders -L'/home/carleton/R/x86_64-pc-linux-gnu-library/4.2/RcppParallel/lib/' -ltbb   -L"/usr/lib/R/lib" -lR; \
fi
Error in dyn.load(libLFile) : 
  unable to load shared object '/tmp/RtmpVPnwCa/file118f9e2f8fd9ba.so':
  /tmp/RtmpVPnwCa/file118f9e2f8fd9ba.so: undefined symbol: _ZN38model118f9e7749f0af_counting_namespace5countIN5Eigen6MatrixIdLin1ELi1ELi0ELin1ELi1EEEdddEESt6vectorIiSaIiEERKT_RKT0_RKT1_RKT2_PSo
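For reference, the undefined symbol above demangles (Itanium C++ ABI) to a function template named count inside the model's namespace, instantiated with Eigen::Matrix<double, -1, 1> and three doubles, returning std::vector<int> and taking four const references plus a trailing std::ostream*. An undefined symbol of this kind generally means the generated code declared that template but no definition with exactly that signature was available when the shared object was loaded. The following is a minimal sketch of a definition with a matching shape, not the contents of the attached stancount.cpp; the template parameter names and the body are assumptions, and the authoritative signature is whatever stanc emits in its generated C++.

```cpp
#include <cassert>
#include <cmath>
#include <ostream>
#include <vector>

// Shape read off the mangled symbol: four deduced argument types,
// a std::vector<int> return value, and a trailing std::ostream*.
// Parameter names (T0__ etc.) are assumed, not taken from generated code.
template <typename T0__, typename T1__, typename T2__, typename T3__>
std::vector<int> count(const T0__& invec, const T1__& minimum,
                       const T2__& maximum, const T3__& delta,
                       std::ostream* pstream__) {
    (void)pstream__;  // unused in this sketch
    assert(delta > 0);
    const int nbins =
        static_cast<int>(std::floor((maximum - minimum) / delta));
    std::vector<int> counts(nbins > 0 ? nbins : 0, 0);
    // size() and operator[] are available on both Eigen column vectors
    // and std::vector, so the body stays container-agnostic.
    const int n = static_cast<int>(invec.size());
    for (int i = 0; i < n; ++i) {
        const double x = invec[i];
        if (!(x >= minimum && x < maximum)) continue;
        const int idx =
            static_cast<int>(std::floor((x - minimum) / delta));
        if (idx < nbins) ++counts[idx];
    }
    return counts;
}
```

Because the body only uses size() and operator[], the same template can be exercised outside Stan with a std::vector<double> and a null stream pointer, which makes it easy to check the binning logic before wiring it into a model.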

If I edit the model_code so that no calls to count() are made, the compilation finishes with no errors. Here’s the error-free version,

model_code <-
'
functions {
    array[] int count( vector rls,
                    real minimum,
                    real maximum,
                    real delta );
}
data {
    int N;
    int Nbins;
    vector[N] somereals;
    real minimum;
    real maximum;
    real delta;
}
transformed data {
    print(somereals);
    array[Nbins] int y;
    print(y);
}
model {}
generated quantities {}
'

Any ideas where to go from here?

At this point I’m going to defer to @andrjohns @hsbadr

I think I have made some progress again, but again got stuck. To move forward, I tried explicitly referring to cmath::floor() instead of just calling floor() in the cpp file. Again, I’ve included here the most recent code:
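A small note on naming, offered as a sketch rather than a diagnosis: <cmath> is a header, not a namespace, so the qualified form of the call is std::floor. The helper below is hypothetical (bin_index is not part of the attached code) and only illustrates the qualified call and the explicit conversion to int, since std::floor returns a floating-point value.

```cpp
#include <cassert>
#include <cmath>

// Index of the bin of width delta, counted from minimum, that contains x.
// std::floor rounds toward negative infinity, so values below minimum
// map to negative indices instead of being truncated to bin 0.
// bin_index is a hypothetical helper for illustration only.
int bin_index(double x, double minimum, double delta) {
    assert(delta > 0);
    return static_cast<int>(std::floor((x - minimum) / delta));
}
```

For example, bin_index(14.99, 5.0, 5.0) is 1, while bin_index(4.0, 5.0, 5.0) is -1, which a plain cast to int would have turned into 0.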

count2stan.R (1.2 KB)
stancount.cpp (513 Bytes)

That seems to have resolved the previous error (“undefined symbol”). Now the error I’m receiving is,

Compilation ERROR, function(s)/method(s) not created!
Error in compileCode(f, code, language = language, verbose = verbose) : 
  /home/carleton/R/x86_64-pc-linux-gnu-library/4.2/RcppEigen/include/Eigen/src/Core/ProductEvaluators.h:35:90:   required from ‘Eigen::internal::evaluator<Eigen::Product<Lhs, Rhs, Option> >::evaluator(const XprType&) [with Lhs = Eigen::Product<Eigen::CwiseBinaryOp<Eigen::internal::scalar_product_op<double, double>, const Eigen::CwiseNullaryOp<Eigen::internal::scalar_constant_op<double>, const Eigen::Matrix<double, 1, -1> >, const Eigen::Transpose<Eigen::Matrix<double, -1, 1> > >, Eigen::Matrix<double, -1, -1>, 0>; Rhs = Eigen::Matrix<double, -1, 1>; int Options = 0; Eigen::internal::evaluator<Eigen::Product<Lhs, Rhs, Option> >::XprType = Eigen::Product<Eigen::Product<Eigen::CwiseBinaryOp<Eigen::internal::scalar_product_op<double, double>, const Eigen::CwiseNullaryOp<Eigen::internal::scalar_constant_op<double>, const Eigen::Matrix<double, 1, -1> >, const Eigen::Transpose<Eigen::Matrix<double, -1, 1> > >, Eigen::Matrix<double, -1, -1>, 0>, Eigen::Matrix<double, -1, 1>, 0>]’/ho

I get this error as well when running the examples from Interfacing with External C++ Code, so I’m guessing there’s some problem with the way my system is set up. The following seems to execute without error,

example(stan_model, package = "rstan", run.dontrun = TRUE)

so I suspect the problem is specific to using external C++ code, and that it has something to do with my configuration (I’m assuming the examples in Interfacing with External C++ Code run for other people).

Any suggestions for where to look next would be most appreciated.