Vector-valued functions with manual gradients in external C++

Hi,

I’m trying to develop an ODE model that will eventually use values from an external C++ library to inform the derivatives (and the gradients of this function). As a first step, I’ve written a very simple ODE model with the derivative function defined in an external C++ function.

I’ve successfully gotten the model to compile with gradients computed by autodiff, and am now attempting to specify the gradients manually in the C++ function using make_callback_var. However, I’m a bit unsure how to assign the gradients of a vector-valued function with respect to its vector input, and I’ve gotten the following error: could not convert ‘stan::math::make_callback_var<Eigen::Matrix<double, -1, 1>& ... from ‘stan::math::var_value<Eigen::Matrix<double, -1, 1> >’ to ‘Eigen::Matrix<stan::math::var_value<double>, -1, 1>’.

Full error log is below, with Stan model and C++ header.

Any ideas what I might be doing wrong?

Thanks,
Alex

--- Compiling, linking C++ code ---
g++ -std=c++1y -pthread -D_REENTRANT -Wno-sign-compare -Wno-ignored-attributes      -I stan/lib/stan_math/lib/tbb_2020.3/include    -O3 -I src -I stan/src -I stan/lib/rapidjson_1.1.0/ -I lib/CLI11-1.9.1/ -I stan/lib/stan_math/ -I stan/lib/stan_math/lib/eigen_3.4.0 -I stan/lib/stan_math/lib/boost_1.78.0 -I stan/lib/stan_math/lib/sundials_6.1.1/include -I stan/lib/stan_math/lib/sundials_6.1.1/src/sundials    -DBOOST_DISABLE_ASSERTS          -c -Wno-ignored-attributes   -include /home/alecks/git/MinimalODEExample/odefunc.hpp -x c++ -o /home/alecks/git/MinimalODEExample/minimalODEExampleExternal.o /home/alecks/git/MinimalODEExample/minimalODEExampleExternal.hpp
In file included from stan/lib/stan_math/lib/boost_1.78.0/boost/multi_array/multi_array_ref.hpp:32,
                 from stan/lib/stan_math/lib/boost_1.78.0/boost/multi_array.hpp:34,
                 from stan/lib/stan_math/lib/boost_1.78.0/boost/numeric/odeint/algebra/multi_array_algebra.hpp:22,
                 from stan/lib/stan_math/lib/boost_1.78.0/boost/numeric/odeint.hpp:63,
                 from stan/lib/stan_math/stan/math/prim/functor/ode_rk45.hpp:9,
                 from stan/lib/stan_math/stan/math/prim/functor/integrate_ode_rk45.hpp:6,
                 from stan/lib/stan_math/stan/math/prim/functor.hpp:15,
                 from stan/lib/stan_math/stan/math/rev/fun.hpp:198,
                 from stan/lib/stan_math/stan/math/rev.hpp:10,
                 from stan/lib/stan_math/stan/math.hpp:19,
                 from stan/src/stan/model/model_header.hpp:4,
                 from /home/alecks/git/MinimalODEExample/odefunc.hpp:3,
                 from <command-line>:
stan/lib/stan_math/lib/boost_1.78.0/boost/functional.hpp:180:45: warning: ‘template<class _Arg, class _Result> struct std::unary_function’ is deprecated [-Wdeprecated-declarations]
  180 |         : public boost::functional::detail::unary_function<typename unary_traits<Predicate>::argument_type,bool>
      |                                             ^~~~~~~~~~~~~~
In file included from /usr/include/c++/13.1.1/string:49,
                 from /usr/include/c++/13.1.1/bits/locale_classes.h:40,
                 from /usr/include/c++/13.1.1/bits/ios_base.h:41,
                 from /usr/include/c++/13.1.1/ios:44,
                 from /usr/include/c++/13.1.1/istream:40,
                 from /usr/include/c++/13.1.1/sstream:40,
                 from /usr/include/c++/13.1.1/complex:45,
                 from stan/lib/stan_math/lib/eigen_3.4.0/Eigen/Core:50,
                 from stan/lib/stan_math/lib/eigen_3.4.0/Eigen/Dense:1,
                 from stan/lib/stan_math/stan/math/prim/fun/Eigen.hpp:22,
                 from stan/lib/stan_math/stan/math/rev.hpp:4:
/usr/include/c++/13.1.1/bits/stl_function.h:117:12: note: declared here
  117 |     struct unary_function
      |            ^~~~~~~~~~~~~~
stan/lib/stan_math/lib/boost_1.78.0/boost/functional.hpp:214:45: warning: ‘template<class _Arg1, class _Arg2, class _Result> struct std::binary_function’ is deprecated [-Wdeprecated-declarations]
  214 |         : public boost::functional::detail::binary_function<
      |                                             ^~~~~~~~~~~~~~~
/usr/include/c++/13.1.1/bits/stl_function.h:131:12: note: declared here
  131 |     struct binary_function
      |            ^~~~~~~~~~~~~~~
stan/lib/stan_math/lib/boost_1.78.0/boost/functional.hpp:252:45: warning: ‘template<class _Arg, class _Result> struct std::unary_function’ is deprecated [-Wdeprecated-declarations]
  252 |         : public boost::functional::detail::unary_function<
      |                                             ^~~~~~~~~~~~~~
/usr/include/c++/13.1.1/bits/stl_function.h:117:12: note: declared here
  117 |     struct unary_function
      |            ^~~~~~~~~~~~~~
stan/lib/stan_math/lib/boost_1.78.0/boost/functional.hpp:299:45: warning: ‘template<class _Arg, class _Result> struct std::unary_function’ is deprecated [-Wdeprecated-declarations]
  299 |         : public boost::functional::detail::unary_function<
      |                                             ^~~~~~~~~~~~~~
/usr/include/c++/13.1.1/bits/stl_function.h:117:12: note: declared here
  117 |     struct unary_function
      |            ^~~~~~~~~~~~~~
stan/lib/stan_math/lib/boost_1.78.0/boost/functional.hpp:345:57: warning: ‘template<class _Arg, class _Result> struct std::unary_function’ is deprecated [-Wdeprecated-declarations]
  345 |     class mem_fun_t : public boost::functional::detail::unary_function<T*, S>
      |                                                         ^~~~~~~~~~~~~~
/usr/include/c++/13.1.1/bits/stl_function.h:117:12: note: declared here
  117 |     struct unary_function
      |            ^~~~~~~~~~~~~~
stan/lib/stan_math/lib/boost_1.78.0/boost/functional.hpp:361:58: warning: ‘template<class _Arg1, class _Arg2, class _Result> struct std::binary_function’ is deprecated [-Wdeprecated-declarations]
  361 |     class mem_fun1_t : public boost::functional::detail::binary_function<T*, A, S>
      |                                                          ^~~~~~~~~~~~~~~
/usr/include/c++/13.1.1/bits/stl_function.h:131:12: note: declared here
  131 |     struct binary_function
      |            ^~~~~~~~~~~~~~~
stan/lib/stan_math/lib/boost_1.78.0/boost/functional.hpp:377:63: warning: ‘template<class _Arg, class _Result> struct std::unary_function’ is deprecated [-Wdeprecated-declarations]
  377 |     class const_mem_fun_t : public boost::functional::detail::unary_function<const T*, S>
      |                                                               ^~~~~~~~~~~~~~
/usr/include/c++/13.1.1/bits/stl_function.h:117:12: note: declared here
  117 |     struct unary_function
      |            ^~~~~~~~~~~~~~
stan/lib/stan_math/lib/boost_1.78.0/boost/functional.hpp:393:64: warning: ‘template<class _Arg1, class _Arg2, class _Result> struct std::binary_function’ is deprecated [-Wdeprecated-declarations]
  393 |     class const_mem_fun1_t : public boost::functional::detail::binary_function<const T*, A, S>
      |                                                                ^~~~~~~~~~~~~~~
/usr/include/c++/13.1.1/bits/stl_function.h:131:12: note: declared here
  131 |     struct binary_function
      |            ^~~~~~~~~~~~~~~
stan/lib/stan_math/lib/boost_1.78.0/boost/functional.hpp:438:61: warning: ‘template<class _Arg, class _Result> struct std::unary_function’ is deprecated [-Wdeprecated-declarations]
  438 |     class mem_fun_ref_t : public boost::functional::detail::unary_function<T&, S>
      |                                                             ^~~~~~~~~~~~~~
/usr/include/c++/13.1.1/bits/stl_function.h:117:12: note: declared here
  117 |     struct unary_function
      |            ^~~~~~~~~~~~~~
stan/lib/stan_math/lib/boost_1.78.0/boost/functional.hpp:454:62: warning: ‘template<class _Arg1, class _Arg2, class _Result> struct std::binary_function’ is deprecated [-Wdeprecated-declarations]
  454 |     class mem_fun1_ref_t : public boost::functional::detail::binary_function<T&, A, S>
      |                                                              ^~~~~~~~~~~~~~~
/usr/include/c++/13.1.1/bits/stl_function.h:131:12: note: declared here
  131 |     struct binary_function
      |            ^~~~~~~~~~~~~~~
stan/lib/stan_math/lib/boost_1.78.0/boost/functional.hpp:470:67: warning: ‘template<class _Arg, class _Result> struct std::unary_function’ is deprecated [-Wdeprecated-declarations]
  470 |     class const_mem_fun_ref_t : public boost::functional::detail::unary_function<const T&, S>
      |                                                                   ^~~~~~~~~~~~~~
/usr/include/c++/13.1.1/bits/stl_function.h:117:12: note: declared here
  117 |     struct unary_function
      |            ^~~~~~~~~~~~~~
stan/lib/stan_math/lib/boost_1.78.0/boost/functional.hpp:487:68: warning: ‘template<class _Arg1, class _Arg2, class _Result> struct std::binary_function’ is deprecated [-Wdeprecated-declarations]
  487 |     class const_mem_fun1_ref_t : public boost::functional::detail::binary_function<const T&, A, S>
      |                                                                    ^~~~~~~~~~~~~~~
/usr/include/c++/13.1.1/bits/stl_function.h:131:12: note: declared here
  131 |     struct binary_function
      |            ^~~~~~~~~~~~~~~
stan/lib/stan_math/lib/boost_1.78.0/boost/functional.hpp:533:73: warning: ‘template<class _Arg, class _Result> struct std::unary_function’ is deprecated [-Wdeprecated-declarations]
  533 |     class pointer_to_unary_function : public boost::functional::detail::unary_function<Arg,Result>
      |                                                                         ^~~~~~~~~~~~~~
/usr/include/c++/13.1.1/bits/stl_function.h:117:12: note: declared here
  117 |     struct unary_function
      |            ^~~~~~~~~~~~~~
stan/lib/stan_math/lib/boost_1.78.0/boost/functional.hpp:557:74: warning: ‘template<class _Arg1, class _Arg2, class _Result> struct std::binary_function’ is deprecated [-Wdeprecated-declarations]
  557 |     class pointer_to_binary_function : public boost::functional::detail::binary_function<Arg1,Arg2,Result>
      |                                                                          ^~~~~~~~~~~~~~~
/usr/include/c++/13.1.1/bits/stl_function.h:131:12: note: declared here
  131 |     struct binary_function
      |            ^~~~~~~~~~~~~~~
/home/alecks/git/MinimalODEExample/odefunc.hpp: In instantiation of ‘Eigen::Matrix<typename boost::math::tools::promote_args<typename stan::base_type<T>::type>::type, -1, 1> minimalODEExampleExternal_model_namespace::odefunc(const double&, const T1__&, const double&, std::ostream*) [with T1__ = Eigen::Matrix<stan::math::var_value<double>, -1, 1>; stan::require_all_t<stan::is_col_vector<T>, stan::is_vt_not_complex<T1__> >* <anonymous> = 0; typename boost::math::tools::promote_args<typename stan::base_type<T>::type>::type = stan::math::var_value<double>; typename stan::base_type<T>::type = stan::math::var_value<double>; std::ostream = std::basic_ostream<char>]’:
/home/alecks/git/MinimalODEExample/minimalODEExampleExternal.hpp:38:0:   required from ‘Eigen::Matrix<typename boost::math::tools::promote_args<T0__, typename stan::base_type<T1__, void>::type, T2__>::type, -1, 1> minimalODEExampleExternal_model_namespace::odefunc_variadic2_functor__::operator()(const T0__&, const T1__&, std::ostream*, const T2__&) const [with T0__ = double; T1__ = Eigen::Matrix<stan::math::var_value<double>, -1, 1>; T2__ = double; stan::require_all_t<stan::is_stan_scalar<Var1>, stan::is_col_vector<Col>, stan::is_vt_not_complex<T1__>, stan::is_stan_scalar<T2__> >* <anonymous> = 0; typename boost::math::tools::promote_args<T0__, typename stan::base_type<T1__, void>::type, T2__>::type = stan::math::var_value<double>; typename stan::base_type<T1__, void>::type = stan::math::var_value<double>; std::ostream = std::basic_ostream<char>]’
stan/lib/stan_math/stan/math/rev/functor/coupled_ode_system.hpp:130:40:   required from ‘stan::math::coupled_ode_system_impl<false, minimalODEExampleExternal_model_namespace::odefunc_variadic2_functor__, stan::math::var_value<double>, const double&>::operator()(const std::vector<double, std::allocator<double> >&, std::vector<double, std::allocator<double> >&, double)::<lambda(auto:616&& ...)> [with auto:616 = {const double&}]’
stan/lib/stan_math/stan/math/prim/functor/apply.hpp:26:11:   required from ‘constexpr decltype(auto) stan::math::internal::apply_impl(F&&, Tuple&&, std::index_sequence<I ...>) [with F = stan::math::coupled_ode_system_impl<false, minimalODEExampleExternal_model_namespace::odefunc_variadic2_functor__, stan::math::var_value<double>, const double&>::operator()(const std::vector<double, std::allocator<double> >&, std::vector<double, std::allocator<double> >&, double)::<lambda(auto:616&& ...)>; Tuple = std::tuple<const double&>&; long unsigned int ...I = {0}; std::index_sequence<I ...> = std::integer_sequence<long unsigned int, 0>]’
stan/lib/stan_math/stan/math/prim/functor/apply.hpp:44:30:   required from ‘constexpr decltype(auto) stan::math::apply(F&&, Tuple&&) [with F = coupled_ode_system_impl<false, minimalODEExampleExternal_model_namespace::odefunc_variadic2_functor__, var_value<double>, const double&>::operator()(const std::vector<double, std::allocator<double> >&, std::vector<double, std::allocator<double> >&, double)::<lambda(auto:616&& ...)>; Tuple = std::tuple<const double&>&]’
stan/lib/stan_math/stan/math/rev/functor/coupled_ode_system.hpp:129:67:   required from ‘void stan::math::coupled_ode_system_impl<false, F, T_y0, Args ...>::operator()(const std::vector<double, std::allocator<double> >&, std::vector<double, std::allocator<double> >&, double) [with F = minimalODEExampleExternal_model_namespace::odefunc_variadic2_functor__; T_y0 = stan::math::var_value<double>; Args = {const double&}]’
stan/lib/stan_math/lib/boost_1.78.0/boost/numeric/odeint/stepper/dense_output_runge_kutta.hpp:329:16:   [ skipping 4 instantiation contexts, use -ftemplate-backtrace-limit=0 to disable ]
stan/lib/stan_math/stan/math/prim/functor/ode_rk45.hpp:145:20:   required from ‘std::vector<Eigen::Matrix<typename stan::return_type<T_y0, T_t0, T_ts, Args ...>::type, -1, 1> > stan::math::ode_rk45_tol_impl(const char*, const F&, const T_y0&, T_t0, const std::vector<T_l>&, double, double, long int, std::ostream*, const Args& ...) [with F = minimalODEExampleExternal_model_namespace::odefunc_variadic2_functor__; T_y0 = Eigen::Matrix<var_value<double>, -1, 1>; T_t0 = double; T_ts = double; Args = {double}; stan::require_eigen_vector_t<T_CPCs>* <anonymous> = 0; typename stan::return_type<T_y0, T_t0, T_ts, Args ...>::type = var_value<double>; std::ostream = std::basic_ostream<char>]’
stan/lib/stan_math/stan/math/prim/functor/ode_rk45.hpp:253:27:   required from ‘std::vector<Eigen::Matrix<typename stan::return_type<T_y0, T_t0, T_ts, Args ...>::type, -1, 1> > stan::math::ode_rk45(const F&, const T_y0&, T_t0, const std::vector<T_l>&, std::ostream*, const Args& ...) [with F = minimalODEExampleExternal_model_namespace::odefunc_variadic2_functor__; T_y0 = Eigen::Matrix<var_value<double>, -1, 1>; T_t0 = double; T_ts = double; Args = {double}; stan::require_eigen_vector_t<T_CPCs>* <anonymous> = 0; typename stan::return_type<T_y0, T_t0, T_ts, Args ...>::type = var_value<double>; std::ostream = std::basic_ostream<char>]’
/home/alecks/git/MinimalODEExample/minimalODEExampleExternal.hpp:297:0:   required from ‘stan::scalar_type_t<T2> minimalODEExampleExternal_model_namespace::minimalODEExampleExternal_model::log_prob_impl(VecR&, VecI&, std::ostream*) const [with bool propto__ = false; bool jacobian__ = false; VecR = Eigen::Matrix<stan::math::var_value<double>, -1, 1>; VecI = Eigen::Matrix<int, -1, 1>; stan::require_vector_like_t<VecR>* <anonymous> = 0; stan::require_vector_like_vt<std::is_integral, VecI>* <anonymous> = 0; stan::require_st_var<VecR>* <anonymous> = 0; stan::scalar_type_t<T2> = stan::math::var_value<double>; std::ostream = std::basic_ostream<char>]’
/home/alecks/git/MinimalODEExample/minimalODEExampleExternal.hpp:543:0:   required from ‘T_ minimalODEExampleExternal_model_namespace::minimalODEExampleExternal_model::log_prob(Eigen::Matrix<T_job_param, -1, 1>&, std::ostream*) const [with bool propto__ = false; bool jacobian__ = false; T_ = stan::math::var_value<double>; std::ostream = std::basic_ostream<char>]’
stan/src/stan/model/model_base_crtp.hpp:98:0:   required from ‘stan::math::var stan::model::model_base_crtp<M>::log_prob(Eigen::Matrix<stan::math::var_value<double>, -1, 1>&, std::ostream*) const [with M = minimalODEExampleExternal_model_namespace::minimalODEExampleExternal_model; stan::math::var = stan::math::var_value<double>; std::ostream = std::basic_ostream<char>]’
stan/src/stan/model/model_base_crtp.hpp:96:0:   required from here
/home/alecks/git/MinimalODEExample/odefunc.hpp:38: error: could not convert ‘stan::math::make_callback_var<Eigen::Matrix<double, -1, 1>&, minimalODEExampleExternal_model_namespace::odefunc<Eigen::Matrix<stan::math::var_value<double>, -1, 1> >(const double&, const Eigen::Matrix<stan::math::var_value<double>, -1, 1>&, const double&, std::ostream*)::<lambda(auto:660&)> >(dydt, <lambda closure object>minimalODEExampleExternal_model_namespace::odefunc<Eigen::Matrix<stan::math::var_value<double>, -1, 1> >(const double&, const Eigen::Matrix<stan::math::var_value<double>, -1, 1>&, const double&, std::ostream*)::<lambda(auto:660&)>{stan::math::arena_matrix<Eigen::Matrix<stan::math::var_value<double>, -1, 1> >(y_arena), Eigen::Matrix<double, -1, 1>(jacobian)})’ from ‘stan::math::var_value<Eigen::Matrix<double, -1, 1> >’ to ‘Eigen::Matrix<stan::math::var_value<double>, -1, 1>’
   38 |   return stan::math::make_callback_var(dydt, [y_arena, jacobian](auto& res){
      |   ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   39 |     y_arena.adj() += jacobian.transpose() * res.adj();
      |     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   40 |   });
      | 
/home/alecks/git/MinimalODEExample/odefunc.hpp: In instantiation of ‘minimalODEExampleExternal_model_namespace::odefunc<Eigen::Matrix<stan::math::var_value<double>, -1, 1> >(const double&, const Eigen::Matrix<stan::math::var_value<double>, -1, 1>&, const double&, std::ostream*)::<lambda(auto:660&)> [with auto:660 = stan::math::internal::callback_vari<Eigen::Matrix<double, -1, 1>, minimalODEExampleExternal_model_namespace::odefunc<Eigen::Matrix<stan::math::var_value<double>, -1, 1> >(const double&, const Eigen::Matrix<stan::math::var_value<double>, -1, 1>&, const double&, std::ostream*)::<lambda(auto:660&)> >]’:
stan/lib/stan_math/stan/math/rev/core/callback_vari.hpp:21:43:   required from ‘void stan::math::internal::callback_vari<T, F>::chain() [with T = Eigen::Matrix<double, -1, 1>; F = minimalODEExampleExternal_model_namespace::odefunc<Eigen::Matrix<stan::math::var_value<double>, -1, 1> >(const double&, const Eigen::Matrix<stan::math::var_value<double>, -1, 1>&, const double&, std::ostream*)::<lambda(auto:660&)>]’
stan/lib/stan_math/stan/math/rev/core/callback_vari.hpp:21:15:   required from here
/home/alecks/git/MinimalODEExample/odefunc.hpp:39: error: no match for ‘operator+=’ (operand types are ‘const Eigen::CwiseUnaryOp<Eigen::MatrixBase<Eigen::Map<Eigen::Matrix<stan::math::var_value<double>, -1, 1>, 0, Eigen::Stride<0, 0> > >::adj_Op, const Eigen::Map<Eigen::Matrix<stan::math::var_value<double>, -1, 1>, 0, Eigen::Stride<0, 0> > >’ and ‘const Eigen::Product<Eigen::Transpose<const Eigen::Matrix<double, -1, 1> >, Eigen::Map<Eigen::Matrix<double, -1, 1>, 0, Eigen::Stride<0, 0> >, 0>’)
   39 |     y_arena.adj() += jacobian.transpose() * res.adj();
      | 
In file included from stan/lib/stan_math/lib/eigen_3.4.0/Eigen/Core:274:
stan/lib/stan_math/lib/eigen_3.4.0/Eigen/src/Core/MatrixBase.h:493:46: note: candidate: ‘template<class OtherDerived> Derived& Eigen::MatrixBase<Derived>::operator+=(const Eigen::ArrayBase<OtherDerived>&) [with Derived = Eigen::CwiseUnaryOp<Eigen::MatrixBase<Eigen::Map<Eigen::Matrix<stan::math::var_value<double>, -1, 1>, 0, Eigen::Stride<0, 0> > >::adj_Op, const Eigen::Map<Eigen::Matrix<stan::math::var_value<double>, -1, 1>, 0, Eigen::Stride<0, 0> > >]’
  493 |     template<typename OtherDerived> Derived& operator+=(const ArrayBase<OtherDerived>& )
      |                                              ^~~~~~~~
stan/lib/stan_math/lib/eigen_3.4.0/Eigen/src/Core/MatrixBase.h:493:46: note:   template argument deduction/substitution failed:
/home/alecks/git/MinimalODEExample/odefunc.hpp:39: note:   ‘const Eigen::Product<Eigen::Transpose<const Eigen::Matrix<double, -1, 1> >, Eigen::Map<Eigen::Matrix<double, -1, 1>, 0, Eigen::Stride<0, 0> >, 0>’ is not derived from ‘const Eigen::ArrayBase<Derived>’
   39 |     y_arena.adj() += jacobian.transpose() * res.adj();
      | 
In file included from stan/lib/stan_math/lib/eigen_3.4.0/Eigen/Core:299:
stan/lib/stan_math/lib/eigen_3.4.0/Eigen/src/Core/CwiseBinaryOp.h:175:1: note: candidate: ‘Derived& Eigen::MatrixBase<Derived>::operator+=(const Eigen::MatrixBase<OtherDerived>&) [with OtherDerived = Eigen::Product<Eigen::Transpose<const Eigen::Matrix<double, -1, 1> >, Eigen::Map<Eigen::Matrix<double, -1, 1>, 0, Eigen::Stride<0, 0> >, 0>; Derived = Eigen::CwiseUnaryOp<Eigen::MatrixBase<Eigen::Map<Eigen::Matrix<stan::math::var_value<double>, -1, 1>, 0, Eigen::Stride<0, 0> > >::adj_Op, const Eigen::Map<Eigen::Matrix<stan::math::var_value<double>, -1, 1>, 0, Eigen::Stride<0, 0> > >]’ (near match)
  175 | MatrixBase<Derived>::operator+=(const MatrixBase<OtherDerived>& other)
      | ^~~~~~~~~~~~~~~~~~~
stan/lib/stan_math/lib/eigen_3.4.0/Eigen/src/Core/CwiseBinaryOp.h:175:1: note:   passing ‘const Eigen::CwiseUnaryOp<Eigen::MatrixBase<Eigen::Map<Eigen::Matrix<stan::math::var_value<double>, -1, 1>, 0, Eigen::Stride<0, 0> > >::adj_Op, const Eigen::Map<Eigen::Matrix<stan::math::var_value<double>, -1, 1>, 0, Eigen::Stride<0, 0> > >*’ as ‘this’ argument discards qualifiers
In file included from stan/lib/stan_math/lib/eigen_3.4.0/Eigen/Core:275:
stan/lib/stan_math/lib/eigen_3.4.0/Eigen/src/Core/EigenBase.h:143:10: note: candidate: ‘Derived& Eigen::DenseBase<Derived>::operator+=(const Eigen::EigenBase<OtherDerived>&) [with OtherDerived = Eigen::Product<Eigen::Transpose<const Eigen::Matrix<double, -1, 1> >, Eigen::Map<Eigen::Matrix<double, -1, 1>, 0, Eigen::Stride<0, 0> >, 0>; Derived = Eigen::CwiseUnaryOp<Eigen::MatrixBase<Eigen::Map<Eigen::Matrix<stan::math::var_value<double>, -1, 1>, 0, Eigen::Stride<0, 0> > >::adj_Op, const Eigen::Map<Eigen::Matrix<stan::math::var_value<double>, -1, 1>, 0, Eigen::Stride<0, 0> > >]’ (near match)
  143 | Derived& DenseBase<Derived>::operator+=(const EigenBase<OtherDerived> &other)
      |          ^~~~~~~~~~~~~~~~~~
stan/lib/stan_math/lib/eigen_3.4.0/Eigen/src/Core/EigenBase.h:143:10: note:   passing ‘const Eigen::CwiseUnaryOp<Eigen::MatrixBase<Eigen::Map<Eigen::Matrix<stan::math::var_value<double>, -1, 1>, 0, Eigen::Stride<0, 0> > >::adj_Op, const Eigen::Map<Eigen::Matrix<stan::math::var_value<double>, -1, 1>, 0, Eigen::Stride<0, 0> > >*’ as ‘this’ argument discards qualifiers
make: *** [make/program:58: /home/alecks/git/MinimalODEExample/minimalODEExampleExternal] Error 1

minimalODEExampleExternal.stan (559 Bytes)
odefunc.hpp (2.3 KB)


This is presumably obvious, but that error is just telling you that the return type is wrong in the odefunc implementation: you can’t cast that make_callback_var result to the autodiff type. Here, it’s finding a double matrix type rather than a variable type. I think your dydt variable needs to be declared as a Matrix<var, -1, 1>.

This code in your C++ also looks wrong:

Eigen::Matrix<double, -1, 1> jacobian = Eigen::Matrix<double, -1, -1>::Constant(2, 2, 0.0);
    jacobian[0,1] = 1.0;
    jacobian[1,0] = -theta;

You are declaring jacobian to be a vector and then assigning a matrix to it. I also don’t see how jacobian[0, 1] can do what you want: Eigen doesn’t allow bracket indexing for matrices, so it needs to be jacobian(0, 1).

I’ve also simplified your Stan code a bit to use append_array and compound declare-define.

functions {
  vector odefunc(real t, vector y, real theta);
}
data {
  int<lower=1> T;
  real t0;
  real theta; 
  array[T - 1] real times;
  vector[2] y0_prior_mu;
  vector[2] y0_prior_std;
  real sigma;
  vector[T] z;
}
parameters {
  vector[2] y0_raw;
}
transformed parameters {
  vector[2] y0 = y0_raw .* y0_prior_std + y0_prior_mu;
}
model {
  array[T] vector[2] y = append_array({y0}, ode_rk45(odefunc, y0, t0, times, theta));
  z ~ normal(y[, 1], sigma);
}

Bob’s answer is right that make_callback_var() is meant to return stan::math::var_value<T> types. In addition to his fix, I would make an explicit version for var return types, like the following:

template<typename T1__, stan::require_all_t<stan::is_col_vector<T1__>,
                                            stan::is_vt_not_complex<T1__>, stan::is_vt_var<T1__>>* = nullptr>
Eigen::Matrix<stan::math::var, -1, 1>

odefunc(const double& t, const T1__& y, const double& theta,
        std::ostream* pstream__)
{
  Eigen::Matrix<double, -1, 1> dydt =
    Eigen::Matrix<double, -1, 1>::Constant(2, 0.0);
  dydt[0] = y[1].val();
  dydt[1] = -theta * y[0].val();

  stan::arena_t<Eigen::Matrix<double, -1, 1>> jacobian = Eigen::Matrix<double, -1, -1>::Constant(2, 2, 0.0);
  jacobian[0, 1] = 1.0;
  jacobian[1, 0] = -theta;

  stan::arena_t<T1__> y_arena = y;
  // Everything matrix/vector that is used in the reverse pass must be put on the arena
  reverse_pass_callback([y_arena, jacobian]() mutable {
    y_arena.adj() += jacobian.transpose() * y_arena.adj();
  });
  return Eigen::Matrix<stan::math::var, -1, 1>(dydt);
}

Though I think having a specialization for var means the compiler may also ask for a version for fvar<T>. If the compiler doesn’t complain, don’t worry about it; the good news is that if it does complain, you can just define the forward-mode specialization and leave it empty, since the ODE solvers never call forward-mode autodiff.

Thanks Bob. The version that was compiling was before I’d added the jacobian, but the compiler wasn’t complaining about those lines.

I think I’ve confused myself looking at other examples of precomputed gradients where the result of make_callback_var is returned (albeit for scalar returns).

With your suggested edits and returning dydt I’m now getting a different error:

/home/alecks/git/MinimalODEExample/odefunc.hpp:37: error: invalid use of incomplete type ‘class stan::math::var_value<Eigen::Matrix<stan::math::var_value<double>, -1, 1>, void>’
   37 |   stan::math::make_callback_var(dydt, [y_arena, jacobian](auto& res){
      |   ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   38 |     y_arena.adj() += jacobian.transpose() * res.adj();
      |     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   39 |   });
      | 
In file included from stan/lib/stan_math/stan/math/rev/core/vari.hpp:4,
                 from stan/lib/stan_math/stan/math/rev/core/var.hpp:8,
                 from stan/lib/stan_math/stan/math/rev/meta/conditional_var_value.hpp:4,
                 from stan/lib/stan_math/stan/math/rev/meta.hpp:6,
                 from stan/lib/stan_math/stan/math/rev/core/accumulate_adjoints.hpp:5,
                 from stan/lib/stan_math/stan/math/rev/core.hpp:4,
                 from stan/lib/stan_math/stan/math/rev.hpp:8:
stan/lib/stan_math/stan/math/rev/core/var_value_fwd_declare.hpp:8:7: note: declaration of ‘class stan::math::var_value<Eigen::Matrix<stan::math::var_value<double>, -1, 1>, void>’
    8 | class var_value;
      |       ^~~~~~~~~

Apologies if I’m missing something obvious! Current version and log attached.

error.txt (23.9 KB)
odefunc.hpp (2.3 KB)

Thanks Steve.

Are you suggesting an explicit version for var return types would be needed alongside or in place of the version I’ve already got?

Adding your code to what I’ve got after Bob’s edits re: types and indexing, I get an additional error: ‘is_vt_var’ is not a member of ‘stan’; did you mean ‘is_var’? C++ attached.

Also, in your code snippet you wrote y_arena.adj() += jacobian.transpose() * y_arena.adj(); is there a reason why the maths for the gradient would need to change for the reverse mode?

odefunc-explicitvar.hpp (3.2 KB)

Yes, make_callback_var always returns a var_value<T>, but in the math library we have functions that can return either a var_value<double> or a var_value<Eigen::Matrix<double, ...>>. The var_value<Eigen::MatrixXd> is a specialized matrix type; here you want to return an Eigen::Matrix<stan::math::var, -1, 1>, which is the normal matrix type used in our ODE solvers and the rest of the math library.

Yes, anything that uses var types should be specialized. We want to make sure that when this function is called with double or fvar types, the right version is used, so that those instantiations don’t accidentally put things on the reverse-mode autodiff call stack.

Sorry, that was a typo on my end. I’ve cleaned everything up, though I didn’t try to compile the code below; VS Code didn’t complain, so I think it should work. If not, I’ll write up a test case to try it out.

That was a goof on my part, fixed below!

struct odefunc_functor__
{
  template<typename T0__, typename T1__, typename T2__,
           stan::require_all_t<stan::is_stan_scalar<T0__>, stan::is_col_vector<T1__>,
                               stan::is_vt_not_complex<T1__>,
                               stan::is_stan_scalar<T2__>>* = nullptr>
  Eigen::Matrix<stan::promote_args_t<T0__, stan::base_type_t<T1__>, T2__>, -1, 1> operator()(
    const T0__& t, const T1__& y, const T2__& theta,
    std::ostream* pstream__) const;
};

// Specialization for var types
template<typename T1__, stan::require_all_t<stan::is_col_vector<T1__>,
                                            stan::is_vt_not_complex<T1__>, 
                                            stan::is_var<stan::scalar_type_t<T1__>>>* = nullptr>
Eigen::Matrix<stan::math::var, -1, 1>
odefunc(const double& t, const T1__& y, const double& theta,
        std::ostream* pstream__) {
  // All matrices used in reverse pass callback must be arena allocated
  stan::arena_t<Eigen::Matrix<stan::math::var, -1, 1>> dydt(2);
  dydt[0] = y[1].val();
  dydt[1] = -theta * y[0].val();

  stan::arena_t<Eigen::Matrix<double, -1, -1>> jacobian = Eigen::MatrixXd::Zero(2, 2);
  jacobian(0, 1) = 1.0;
  jacobian(1, 0) = -theta;

  stan::arena_t<T1__> y_arena = y;
  // Use reverse_pass_callback since our return type is an Eigen matrix
  // This adds the reverse pass to our callback stack when we call grad() after building the autodiff tree
  reverse_pass_callback([y_arena, jacobian, dydt]() mutable {
    // Could you just swap 1.0 and -theta's insertion points in jacobian instead of doing transpose?
    y_arena.adj() += jacobian.transpose() * dydt.adj();
  });
  // Make a hard copy of dydt to avoid overwriting values used in reverse pass
  return Eigen::Matrix<stan::math::var, -1, 1>(dydt);
}

/**
 *  Only doing the forward pass of above. Supports double and fvar types
 */
template<typename T1__, stan::require_all_t<stan::is_col_vector<T1__>,
                                            stan::is_vt_not_complex<T1__>>* = nullptr,
                        stan::require_not_vt_var<T1__>* = nullptr>
Eigen::Matrix<stan::scalar_type_t<T1__>, -1, 1>
odefunc(const double& t, const T1__& y, const double& theta,
        std::ostream* pstream__) {
  using scalar_t = stan::scalar_type_t<T1__>;
  Eigen::Matrix<scalar_t, -1, 1> dydt = Eigen::Matrix<scalar_t, -1, 1>::Zero(2);
  dydt[0] = y[1].val();
  dydt[1] = -theta * y[0].val();
  return dydt;
}

template<
  typename T0__, typename T1__, typename T2__,
  stan::require_all_t<stan::is_stan_scalar<T0__>, stan::is_col_vector<T1__>,
                      stan::is_vt_not_complex<T1__>, stan::is_stan_scalar<T2__>>*>
Eigen::Matrix<stan::promote_args_t<T0__, stan::base_type_t<T1__>, T2__>, -1, 1> odefunc_functor__::
operator()(const T0__& t, const T1__& y, const T2__& theta,
           std::ostream* pstream__) const
{
  return odefunc(t, y, theta, pstream__);
}

Once you have everything working, this would be a very cool writeup to show people how to add their own functions with custom adjoint calculations in Stan! I’d totally be willing to help with that.


Brilliant, thank you! That’s now compiling and running (with a few minor edits including removing .val() on the forward pass version since doubles don’t have a .val). C++ header attached for posterity.

Stansummary from model with external C++:

Inference for Stan model: minimalODEExampleExternal_model
4 chains: each with iter=(1000,1000,1000,1000); warmup=(0,0,0,0); thin=(1,1,1,1); 4000 iterations saved.

Warmup took (30, 30, 30, 30) seconds, 2.0 minutes total
Sampling took (32, 32, 31, 31) seconds, 2.1 minutes total

                   Mean     MCSE   StdDev       5%      50%      95%  N_Eff  N_Eff/s    R_hat

lp__               -6.0    0.093      1.0     -8.2     -5.7     -5.0    119     0.95      1.0
accept_stat__   7.7e-01  9.4e-03  2.0e-01  3.6e-01  8.2e-01  1.0e+00    477      3.8  1.0e+00
stepsize__      9.1e-04  8.7e-05  1.2e-04  7.6e-04  9.7e-04  1.1e-03    2.0    0.016  2.5e+13
treedepth__     9.9e+00  1.2e-02  4.7e-01  9.0e+00  1.0e+01  1.0e+01   1587       13  1.0e+00
n_leapfrog__    9.9e+02  3.7e+00  1.5e+02  5.1e+02  1.0e+03  1.0e+03   1525       12  1.0e+00
divergent__     0.0e+00      nan  0.0e+00  0.0e+00  0.0e+00  0.0e+00    nan      nan      nan
energy__        7.0e+00  9.1e-02  1.4e+00  5.3e+00  6.7e+00  9.9e+00    244      1.9  1.0e+00

y0_raw[1]         -0.20    0.027     0.30    -0.72    -0.20     0.29    124     0.99      1.0
y0_raw[2]          0.21    0.065     0.37    -0.40     0.22     0.84     31     0.25      1.1
y0[1]               4.8    0.027     0.30      4.3      4.8      5.3    124     0.99      1.0
y0[2]               1.2    0.065     0.37     0.60      1.2      1.8     31     0.25      1.1

It’s running >100x slower with a commensurate hit to the effective sample size and a tiny step size compared to the vanilla model with the function coded in Stan (also attached):

Stansummary from vanilla model:

Inference for Stan model: minimalODEExample_model
4 chains: each with iter=(1000,1000,1000,1000); warmup=(0,0,0,0); thin=(1,1,1,1); 4000 iterations saved.

Warmup took (0.070, 0.072, 0.071, 0.070) seconds, 0.28 seconds total
Sampling took (0.070, 0.073, 0.068, 0.080) seconds, 0.29 seconds total

                 Mean     MCSE  StdDev     5%    50%   95%  N_Eff  N_Eff/s    R_hat

lp__             -6.0  2.3e-02     1.0   -8.0   -5.7  -5.0   1897     6519      1.0
accept_stat__    0.91  1.7e-03    0.11   0.67   0.95   1.0   4580    15739  1.0e+00
stepsize__       0.91  1.2e-03   0.052   0.86   0.97  0.97   2000     6873  1.5e+13
treedepth__       1.8  9.8e-03    0.51    1.0    2.0   3.0   2704     9293  1.0e+00
n_leapfrog__      3.4  1.1e-01     1.7    1.0    3.0   7.0    242      832  1.0e+00
divergent__      0.00      nan    0.00   0.00   0.00  0.00    nan      nan      nan
energy__          7.0  3.4e-02     1.4    5.4    6.7   9.8   1680     5772  1.0e+00

y0_raw[1]       -0.23  5.1e-03    0.30  -0.73  -0.23  0.26   3553    12209     1.00
y0_raw[2]        0.23  6.0e-03    0.35  -0.36   0.23  0.82   3459    11886     1.00
y0[1]             4.8  5.1e-03    0.30    4.3    4.8   5.3   3553    12209     1.00
y0[2]             1.2  6.0e-03    0.35   0.64    1.2   1.8   3459    11886     1.00

Is this to be expected, or does it suggest there’s an error or some finite differencing happening somewhere in the gradient calculation? Obviously with this simple model the answer is to “just write the function in Stan”, but where I’m aiming I need to calculate the derivative dydt and its gradients using another C++ library (for an astrodynamics problem).

I’d be happy to help write this up though - would you want it e.g. as a vignette somewhere in the docs?

odefunc.hpp (2.9 KB)
minimalODEExample.stan (651 Bytes)


Looking at the code from the minimal example, it’s doing much less work (posted here for posterity):

vector odefunc(real t, vector y, real theta) {
  vector[2] dydt;
  dydt[1] = y[2];
  dydt[2] = -theta * y[1];
  return dydt;
}

Where your C++ code builds the full Jacobian, Stan’s internal autodiff just does the assignment (dydt[1] = y[2], no reverse pass) and the adjoint for a var * var (-theta * y[1]). I would remove the full Jacobian and just update the coefficients of y’s adjoint element by element. I think that would look something like:

  reverse_pass_callback([y_arena, theta, dydt]() mutable {
    // dydt[0] depends on y[1] (coefficient 1) and dydt[1] on y[0]
    // (coefficient -theta), so each adjoint lands on the opposite index.
    y_arena.adj().coeffRef(1) += dydt.adj().coeff(0);
    y_arena.adj().coeffRef(0) += -theta * dydt.adj().coeff(1);
  });

Overall this is a pretty simple function and I think having a custom reverse mode here is not going to be faster than what Stan would do. Generally you only need a custom reverse mode if you have a very expensive set of functions whose adjoint you know already and can calculate in one reverse pass instead of multiple reverse passes.

EDIT: Below I just misread indices!

Also in your C++ code you have

  dydt[0] = y[1].val();
  dydt[1] = -theta * y[0].val();

which I think has the indexing swapped? i.e. I think it should be:

  dydt[0] = y[2].val();
  dydt[1] = -theta * y[1].val();

Thanks, more operations → slower runtime makes sense.
However, I don’t understand why the statistical efficiency (N_Eff) would drop if it’s the same underlying model, unless there’s something else going on to force NUTS to take tiny steps.

The model is a simple harmonic oscillator:
y = \left[\begin{matrix}x\\v\end{matrix}\right]
\frac{\partial x}{\partial t} = v
\frac{\partial v}{\partial t} = -\theta x

I think the first is correct unless I’ve missed something.

Edit:

I’ve just run the diagnostic mode on the model with external C++:

TEST GRADIENT MODE

 Log probability=-7.99769

 param idx           value           model     finite diff           error
         0       -0.930348        -2.22247         8.27639        -10.4989
         1        0.151173        -21.3631         1.83853        -23.2016

Apologies, I misread the indices.

That seems pretty off, my guess is that your adjoint calculation is incorrect

Sorry for bumping, I wanted to add some more details showing my working with the hope that it makes it easier to spot where I’ve gone wrong (and hopefully help others in the process).

The system state y consists of the position x and velocity v:
y = \left[\begin{matrix}x\\v\end{matrix}\right]

The ODE system function f gives the derivatives of the state wrt time t:
\frac{\partial y}{\partial t} = \left[\begin{matrix} \frac{\partial x}{\partial t} \\ \frac{\partial v}{\partial t} \end{matrix}\right]= f(y) =\left[\begin{matrix}v\\-\theta x \end{matrix}\right]

Which is coded in the C++ implementation of the function as:

dydt[0] = y[1].val();
dydt[1] = -theta * y[0].val();

The Jacobian for the function f is then:
\mathbf{J}_f = \left[\begin{matrix} 0 & 1 \\ -\theta & 0 \end{matrix}\right]

\mathbf{J}^T_f = \left[\begin{matrix} 0 & -\theta \\ 1 & 0 \end{matrix}\right]

My understanding is that the adjoint calculation should be:
\overline{y} = \mathbf{J}_f^T \cdot \overline{\frac{\partial y}{\partial t}}
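Expanded componentwise (the adjoints accumulate, hence the +=):

\overline{x} \mathrel{+}= -\theta \, \overline{\left(\frac{\partial v}{\partial t}\right)}, \qquad \overline{v} \mathrel{+}= \overline{\left(\frac{\partial x}{\partial t}\right)}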

Which is written in the C++ function with:

stan::arena_t<Eigen::Matrix<double, -1, -1>> jacobianT = Eigen::MatrixXd::Zero(2, 2);
jacobianT(1, 0) = 1.0;
jacobianT(0, 1) = -theta;

stan::arena_t<T1__> y_arena = y;
stan::math::reverse_pass_callback([y_arena, jacobianT, dydt]() mutable {
  y_arena.adj() += jacobianT * dydt.adj();
});

Is my maths for the adjoint calculation incorrect and/or have I mucked up its translation into code?

Hi!
My understanding is that (i) what you’re doing is more akin to a “forward method”, rather than an adjoint and (ii) there is a missing step in the forward method.

I find some of your notation ambiguous, so I’ll be a bit more explicit. Let p be the target density.

Then,

\begin{eqnarray*}
\frac{\text d p}{\text d \theta} & = & \frac{\partial p}{\partial \theta} + \frac{\text d p}{\text d y} \cdot \frac{\text d y}{\text d \theta} \\
& = & \frac{\partial p}{\partial \theta} + \frac{\text d p}{\text d y} \cdot \int \text d t \, \frac{\text d^2 y}{\text d \theta \, \text d t} \\
& = & \frac{\partial p}{\partial \theta} + \frac{\text d p}{\text d y} \cdot \int \text d t \, \frac{\text d f}{\text d \theta}
\end{eqnarray*}

with the second term being the desired adjoint \bar y we want to compute. At this point we have two options: either we compute the term in the integral and then take the product of two Jacobians, or we contract the integral with the Jacobian in front of it to obtain a new integral

\frac{\partial p}{\partial \theta} + \int \text d t \frac{\text d p}{\text d y} \cdot \frac{\text d f}{\text d \theta}.

The adjoint method attempts to solve the above integral directly. Now

\frac{\text d p}{\text d y} = \left [ \frac{\text d p}{\text d y_1}, \frac{\text d p}{\text d y_2} \right ],

and

\left(\frac{\text d f}{\text d \theta}\right)^\dagger = [0, \; -x],

so taking their dot product returns the scalar -\frac{\text d p}{\text d y_2} x. In total, we solve an augmented ODE with 3 states.
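For concreteness in this example (a sketch only; sign conventions for the adjoint system vary between references), writing \lambda = \text d p / \text d y, the augmented backward system would be something like

\frac{\text d \lambda}{\text d t} = -\lambda \cdot \frac{\text d f}{\text d y} = [\theta \lambda_2, \; -\lambda_1], \qquad \frac{\text d g}{\text d t} = \lambda \cdot \frac{\text d f}{\text d \theta} = -\lambda_2 \, y_1,

with the three states (\lambda_1, \lambda_2, g) integrated backwards in time, which is why the forward solution y(t) must be stored or recomputed.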

Your Jacobians have different dimensions than the one I obtain, so I’m not sure how you’ve defined J_f. I also recommend not using the \bar y notation, and instead writing the derivatives out explicitly (taking care to distinguish partial and total derivatives). The difficulty with the adjoint method is that to solve the adjoint ODE (i.e. integrate the scalar above) we need to first solve the ODE forward and then backwards in time. For more details, see https://arxiv.org/pdf/2112.14217.pdf, Section 3.

I suspect you may actually be trying to use a “forward” method, wherein you explicitly compute the full Jacobian matrix,

\frac{\text d y}{\text d \theta} = \int \text d t \frac{\text d f}{\text d \theta},

with every Jacobian appearing above being 1 \times 2. The J_f you computed above looks the most like \text d f / \text d \theta, but this quantity needs to be integrated with respect to time. I believe this is what’s missing in your algorithm.
