Parameters converted to `double` when compiling with external C++ and MPI map_rect()

Problem statement

I’m encountering a problem using external C++ code with MPI and map_rect(). Briefly, I can compile and link fine when I avoid map_rect(), but when I switch to using MPI + map_rect() I get linker errors.

An example of the several errors I get is:

ok_wheat_pheno_analytic.hpp:(.text._ZN39ok_wheat_pheno_analytic_model_namespace15continuous_dydtIN4stan4math3varEdS3_EEN5Eigen6MatrixIN5boost4math5tools12promote_argsIT_T0_T1_fffE4typeELin1ELi1ELi0ELin1ELi1EEERKNS5_ISA_Lin1ELi1ELi0ELin1ELi1EEERKNS5_ISB_Lin1ELi1ELi0ELin1ELi1EEERKSt6vectorISC_SaISC_EEPSo[_ZN39ok_wheat_pheno_analytic_model_namespace15continuous_dydtIN4stan4math3varEdS3_EEN5Eigen6MatrixIN5boost4math5tools12promote_argsIT_T0_T1_fffE4typeELin1ELi1ELi0ELin1ELi1EEERKNS5_ISA_Lin1ELi1ELi0ELin1ELi1EEERKNS5_ISB_Lin1ELi1ELi0ELin1ELi1EEERKSt6vectorISC_SaISC_EEPSo]+0x325): undefined reference to `boost::math::tools::promote_args<stan::math::var, double, double, double, float, float>::type ok_wheat_pheno_analytic_model_namespace::sigmoid<stan::math::var, double, double, double>(stan::math::var const&, double const&, double const&, double const&, std::ostream*)'

I have encountered and solved (thanks to @bbbales2!) a similar problem before. I believe that the linker error is arising because the linker is looking for (and not finding) a function with signature:

inline stan::math::var sigmoid(const stan::math::var& x,
                               const double& sen,
                               const double& mid_pt,
                               const double& rate,
                               std::ostream* pstream__) {

However, I don’t understand why it would be looking for a function with that signature. The x argument comes from data and so should be double; I believe it gets “promoted” to stan::math::var because the data are unpacked from a large 1-D array and assigned to a local variable (that doesn’t trouble me much, although I’m open to suggestions for avoiding it). All the other arguments (sen, mid_pt, and rate) are parameters being estimated and should, I believe, only ever be stan::math::var. Interestingly, the problem only manifests if I use:

  target += sum( map_rect(parallel_log_likelihood, theta_vec, dummy, x_r_training, x_i_training) );

The problem goes away if I replace the above code with the following:

  for(i in 1:Nshards){
    target += parallel_log_likelihood(theta_vec, dummy[i], x_r_training[i,], x_i_training[i,]);
  }

I could simply implement a C++ version that matches the signature the linker is looking for, but that “solution” would gloss over what seems to me a more fundamental problem: map_rect() is somehow “demoting” parameters to double and thereby dropping gradient information.

I tried to create a minimal reproducible example of the problem, but have been unsuccessful so far.

Am I missing something? Am I somehow off-base in this interpretation? Any guidance would be much appreciated.

More details are provided below.

Implemented external C++ functions

I have defined three versions of the sigmoid() function in C++; their signatures are:

inline double sigmoid(double x,
                      double sen,
                      double mid_pt,
                      double rate,
                      std::ostream* pstream__) {

inline stan::math::var sigmoid(double x,
                               const stan::math::var& sen,
                               const stan::math::var& mid_pt,
                               const stan::math::var& rate,
                               std::ostream* pstream__) {

inline stan::math::var sigmoid(const stan::math::var& x,
                               const stan::math::var& sen,
                               const stan::math::var& mid_pt,
                               const stan::math::var& rate,
                               std::ostream* pstream__) {

All three versions are in a header file, which I include via USER_HEADER when compiling with cmdstan. Given my description of the arguments above, I don’t think I should even need the first version (all-double arguments). Thus, these function signatures should be sufficient.

Stan code

As mentioned briefly above, the only difference in Stan code between the two versions is in the model block where the code without MPI + map_rect() is:

  for(i in 1:Nshards){
    target += parallel_log_likelihood(theta_vec, dummy[i], x_r_training[i,], x_i_training[i,]);
  }

The code with MPI + map_rect() is:

  target += sum( map_rect(parallel_log_likelihood, theta_vec, dummy, x_r_training, x_i_training) );

I am literally commenting/uncommenting lines between running the compiler, so I’m very confident that this is the only difference.

In both cases, the arguments are the same: theta_vec is a vector that contains transformed parameters (derived using a non-centered parameterization), dummy is a length-Nshards 1-D array of length-0 vectors, x_r_training is a real 2-D array with Nshards rows, and x_i_training is an int 2-D array with Nshards rows.

parallel_log_likelihood is a function that calls run_model(), which calls euler_integrate_ode(), which calls continuous_dydt(), which calls the function in question sigmoid().

Can you show the code for continuous_dydt()?

Actually can you post the whole Stan model? Or is that not possible?

Here is the Stan code ok_wheat_pheno_analytic.stan (18.8 KB) and the header file continuous_functions.hpp (8.2 KB) used to define the external C++ functions. It’s a beast of a model, so please let me know if further explanation is needed. Thanks for the help!

I don’t know if it’s relevant, but in another linker error I noticed that the signature indicates that theta is being passed in as double to continuous_dydt().

This model compiled for me using develop cmdstan (git checkout --recursive https://github.com/stan-dev/cmdstan.git to get a copy).

I renamed things a little, but:

STANCFLAGS=--allow_undefined USER_HEADER=/home/bbales2/tmp/cmdstan-develop/cont.hpp make wheat

With this change in the likelihood:


  //for(i in 1:Nshards){
  //  target += parallel_log_likelihood(theta_vec, dummy[i], x_r_training[i,], x_i_training[i,]);
  //}
  target += sum( map_rect(parallel_log_likelihood, theta_vec, dummy, x_r_training, x_i_training) );

Am I doing the thing that failed for you?

It seems like you are doing what was failing for me. I will make sure I have the current develop version (did you mean git checkout --recurse-submodules?) and see if that fixes the problem.

Whoops I meant git clone --recursive https://github.com/stan-dev/cmdstan.git

Sadly, I’m still getting the same errors. The contents of my make/local are:

LDLIBS+=-lpthread
STAN_MPI=true
CXX=mpicxx
TBB_CXX_TYPE=gcc
STANC2=true

Does that help any?

I’m compiling without STANC2=TRUE. Can you try that?

Actually I’m not sure if I was. Trying again with STANC2=TRUE definitely turned on.

OOoo, also I’m not using mpicxx. I’ll just use a copy of your makefile.

Edit: You are right, with your makefile, I’m getting compile errors! Trying to get some other stuff done right now. I will make a note to investigate this (maybe in a couple days?)

You are right, with your makefile, I’m getting compile errors!

Well, I’m glad it isn’t just me. I’ve been struggling with this issue for a week or so and have not made any headway. I will be very thankful for any insight you can provide when you are able to look into it.

Do you have any suggestions for things I might look into to further clarify/isolate the problem?

Ouch

If you just add the signature so things compile, you can compare the analytic gradients with finite-difference ones by looking at the output of diagnose. Something like:

./mymodel diagnose data file=mydata.dat ...

The gradients can be quite wild far away from the solution, so I think you might be able to provide inits to diagnose to get everything under control.

The other thing you could do is modify the compiled model .hpp, remove the generic function signature, and then try to rebuild the model.

Because you aren’t changing the .stan file, the makefiles shouldn’t overwrite the .hpp, and then you might get more informative compile time errors (instead of link time ones which can be hard to work with).


I tried your suggestion, with rather unexpected results. I compiled the Stan code of the version using MPI and map_rect() as I described before.

I then edited the Stan-generated .hpp file to comment out the generic function signatures for the included external C++ code, like so:

/*
template <typename T0__, typename T1__, typename T2__>
typename boost::math::tools::promote_args<T0__, T1__, T2__>::type
dof_fun(const T0__& To,
            const T1__& rn,
            const T2__& rd, std::ostream* pstream__);

template <typename T0__, typename T1__, typename T2__>
typename boost::math::tools::promote_args<T0__, T1__, T2__>::type
A_fun(const T0__& To,
          const T1__& rn,
          const T2__& rd, std::ostream* pstream__);

template <typename T0__, typename T1__, typename T2__, typename T3__, typename T4__>
typename boost::math::tools::promote_args<T0__, T1__, T2__, T3__, typename boost::math::tools::promote_args<T4__>::type>::type
exp_tt(const T0__& T,
           const T1__& A,
           const T2__& rn,
           const T3__& rdenom,
           const T4__& dof, std::ostream* pstream__);

template <typename T0__, typename T1__, typename T2__, typename T3__>
typename boost::math::tools::promote_args<T0__, T1__, T2__, T3__>::type
sigmoid(const T0__& x,
            const T1__& sen,
            const T2__& mid_pt,
            const T3__& rate, std::ostream* pstream__);
*/

I then rebuilt the model and everything compiled. Preliminary checking indicates that sampling is working fine.

Does it have something to do with how namespaces are managed within MPI and/or map_rect()?

Do these results mean that the linker errors were spurious? Or should I still be concerned?

I didn’t think about this, but a double can implicitly promote to a var, so this would compile.

From these lines in the code:

real wth[Nw, Nwv] = unpack_wth(x_i, x_r);
vector[Ncol] yini = rep_vector(0, Ncol);

It looks to me like the first argument to sigmoid and exp_tt is the only argument that is a function of the parameters of your model?

Like for these:

dydt[1] = exp_tt(wth[1], theta[9], theta[10], theta[11], theta[12]);
vrn_fac = sigmoid(y[1],theta[7], theta[6]/2, 10/theta[6]);

theta is never a function of parameters?

Thanks for looking at it again.

Yes, I think those lines do relate to a promotion to var, as I mentioned in my original post.

That’s not the part that troubles me. It’s the fact that the other parameters are converted from var to double.

Actually, the opposite is true. Judging solely by the function signature the compiler is looking for, your statement would hold, but I don’t think that is the signature the compiler should be looking for. theta_vec is a vector of transformed parameters (defined in the transformed parameters block) which is passed (via map_rect()) to the theta argument of parallel_log_likelihood() on line 636 of ok_wheat_pheno_analytic.stan:

target += sum( map_rect(parallel_log_likelihood, theta_vec, dummy, x_r_training, x_i_training) );

The theta argument of parallel_log_likelihood() is then subsequently passed to the theta argument of run_model() on line 442:

  vector[Nobs] sim = run_model(theta, dummy, x_r, x_i); 

A subset of that theta is then passed to the theta argument of euler_integrate_ode() on line 418:

    output = euler_integrate_ode(yini, theta[p[1]:p[2]], t, wth[w_ind[s,1]:w_ind[s,2],], 1.);

That theta is passed on as the theta argument of continuous_dydt() on line 373:

     dy_dt = continuous_dydt(y, theta, forcings[i,]);

Within continuous_dydt() some elements of the theta argument are passed on to the sigmoid() function on line 346:

  vrn_fac = sigmoid(y[1],theta[7], theta[6]/2, 10/theta[6]);

and line 353:

  ppd_fac = sigmoid(wth[2], theta[13], theta[14], theta[15]);

It seems to me that a vector that starts out as var being passed to map_rect() should still be var when it gets to sigmoid(). Or am I missing something?

The fact that everything compiles when the generic function signatures are commented out suggests to me that the compiler actually has everything it needs, but is somehow being led to instantiate permutations of the function signature that are never actually needed?

Gaah, you’re right, I got the theta thing exactly backwards.

Alright, how about this: comment out the automatically generated declarations, and also add this to your custom functions header:

inline stan::math::var sigmoid(const stan::math::var& x,
                               const double& sen,
                               const double& mid_pt,
                               const double& rate,
                               std::ostream* pstream__) = delete;

This will block the promotion and give you a C++ compile error, and then maybe it will be possible to piece together the template logic (or at least find where things start looking weird).

My compiler gives me this:

wheat.hpp: In instantiation of ‘Eigen::Matrix<typename boost::math::tools::promote_args<RT1, RT2, A>::type, -1, 1> wheat_model_namespace::continuous_dydt(const Eigen::Matrix<Scalar, -1, 1>&, const Eigen::Matrix<RhsScalar, -1, 1>&, const std::vector<T_l>&, std::ostream*) [with T0__ = stan::math::var; T1__ = double; T2__ = stan::math::var; typename boost::math::tools::promote_args<RT1, RT2, A>::type = stan::math::var; std::ostream = std::basic_ostream<char>]’:
wheat.hpp:2011:24:   required from ‘std::vector<std::vector<typename boost::math::tools::promote_args<T1, T2, T3, T4>::type> > wheat_model_namespace::euler_integrate_ode(const Eigen::Matrix<Scalar, -1, 1>&, const Eigen::Matrix<RhsScalar, -1, 1>&, const std::vector<int>&, const std::vector<std::vector<T_l> >&, const T4__&, std::ostream*) [with T0__ = stan::math::var; T1__ = double; T3__ = stan::math::var; T4__ = double; typename boost::math::tools::promote_args<T1, T2, T3, T4>::type = stan::math::var; std::ostream = std::basic_ostream<char>]’
wheat.hpp:2243:28:   required from ‘Eigen::Matrix<typename boost::math::tools::promote_args<RT1, RT2, A>::type, -1, 1> wheat_model_namespace::run_model(const Eigen::Matrix<Scalar, -1, 1>&, const Eigen::Matrix<RhsScalar, -1, 1>&, const std::vector<T_l>&, const std::vector<int>&, std::ostream*) [with T0__ = double; T1__ = stan::math::var; T2__ = double; typename boost::math::tools::promote_args<RT1, RT2, A>::type = stan::math::var; std::ostream = std::basic_ostream<char>]’
wheat.hpp:2414:16:   required from ‘Eigen::Matrix<typename boost::math::tools::promote_args<T1, T2>::type, -1, 1> wheat_model_namespace::parallel_log_likelihood(const Eigen::Matrix<Scalar, -1, 1>&, const Eigen::Matrix<RhsScalar, -1, 1>&, const std::vector<double>&, const std::vector<int>&, std::ostream*) [with T0__ = double; T1__ = stan::math::var; typename boost::math::tools::promote_args<T1, T2>::type = stan::math::var; std::ostream = std::basic_ostream<char>]’
wheat.hpp:2454:31:   required from ‘Eigen::Matrix<typename boost::math::tools::promote_args<T1, T2>::type, -1, 1> wheat_model_namespace::parallel_log_likelihood_functor__::operator()(const Eigen::Matrix<Scalar, -1, 1>&, const Eigen::Matrix<RhsScalar, -1, 1>&, const std::vector<double>&, const std::vector<int>&, std::ostream*) const [with T0__ = double; T1__ = stan::math::var; typename boost::math::tools::promote_args<T1, T2>::type = stan::math::var; std::ostream = std::basic_ostream<char>]’
stan/lib/stan_math/stan/math/rev/functor/map_rect_reduce.hpp:69:24:   required from ‘stan::math::matrix_d stan::math::internal::map_rect_reduce<F, double, stan::math::var>::operator()(const vector_d&, const vector_d&, const std::vector<double>&, const std::vector<int>&, std::ostream*) const [with F = wheat_model_namespace::parallel_log_likelihood_functor__; stan::math::matrix_d = Eigen::Matrix<double, -1, -1>; stan::math::vector_d = Eigen::Matrix<double, -1, 1>; std::ostream = std::basic_ostream<char>]’
stan/lib/stan_math/stan/math/prim/functor/mpi_parallel_call.hpp:299:24:   required from ‘stan::math::mpi_parallel_call<call_id, ReduceF, CombineF>::result_t stan::math::mpi_parallel_call<call_id, ReduceF, CombineF>::reduce_combine() [with int call_id = 1; ReduceF = stan::math::internal::map_rect_reduce<wheat_model_namespace::parallel_log_likelihood_functor__, double, stan::math::var>; CombineF = stan::math::internal::map_rect_combine<wheat_model_namespace::parallel_log_likelihood_functor__, double, stan::math::var>; stan::math::mpi_parallel_call<call_id, ReduceF, CombineF>::result_t = Eigen::Matrix<stan::math::var, -1, 1>]’
stan/lib/stan_math/stan/math/prim/functor/mpi_parallel_call.hpp:260:15:   required from ‘static void stan::math::mpi_parallel_call<call_id, ReduceF, CombineF>::distributed_apply() [with int call_id = 1; ReduceF = stan::math::internal::map_rect_reduce<wheat_model_namespace::parallel_log_likelihood_functor__, double, stan::math::var>; CombineF = stan::math::internal::map_rect_combine<wheat_model_namespace::parallel_log_likelihood_functor__, double, stan::math::var>]’
stan/lib/stan_math/stan/math/prim/functor/mpi_distributed_apply.hpp:40:42:   required from ‘void stan::math::mpi_distributed_apply<F>::run() const [with F = stan::math::mpi_parallel_call<1, stan::math::internal::map_rect_reduce<wheat_model_namespace::parallel_log_likelihood_functor__, double, stan::math::var>, stan::math::internal::map_rect_combine<wheat_model_namespace::parallel_log_likelihood_functor__, double, stan::math::var> >]’
wheat.hpp:3849:1:   required from here
wheat.hpp:1903:22: error: use of deleted function ‘stan::math::var wheat_model_namespace::sigmoid(const stan::math::var&, const double&, const double&, const double&, std::ostream*)’
     vrn_fac = sigmoid(
               ~~~~~~~^
                 rvalue(y, cons_list(index_uni(1), nil_index_list()), "y"),
                 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                 rvalue(theta, cons_list(index_uni(7), nil_index_list()),
                 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                   "theta"),
                   ~~~~~~~~~
                 (rvalue(theta, cons_list(index_uni(6), nil_index_list()),
                 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                    "theta") / 2),
                    ~~~~~~~~~~~~~~
                 (10 /
                 ~~~~~ 
                   rvalue(theta, cons_list(index_uni(6), nil_index_list()),
                   ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                     "theta")), pstream__);

I’ve got some NSF annual reporting “fires” to put out, but I’ll give it a shot later (tonight or maybe tomorrow?) and see if I can make sense of that output. Do you happen to know how to interpret/distinguish Scalar vs RhsScalar?