Using PETSc library together with Stan

Hello everyone! I’d like to share my work on bringing together Stan and PETSc.
PETSc is a library for parallel scientific computing. The motivation for this work for me was being able to run PETSc-based partial differential equations (PDEs) solvers inside Stan models.
Others might find other use cases.
First, relevant links to repositories:

Let’s see how it works on an example, let’s add to Stan model a Rosenbrock function defined in PETSc world. First, you’d have to install PETSc following the instructions here.
Make sure that the PETSc installation is working via running make all test, you don’t need to wait for all tests to finish.

Now set the PETSC_DIR environment variable to point to the path where PETSc was installed export PETSC_DIR=/path/to/petsc.
Next git clone, git clone
Set the path of stan-math-petsc directory into MATH environment variable export MATH=$PWD/stan-math-petsc/.
Go inside the cmdstan-petsc directory cd cmdstan-petsc.
Run make build inside the cmdstan-petsc directory. Everything should compile now without errors.
Only two usual steps are left compiling the .stan file and running the example.
Compilation of .stan file now requires some more environment variables STANCFLAGS and USER_HEADER.
Now we have a simple rosenbrock.stan file:

functions {
  vector my_rosenbrock(vector xy);

parameters {
  vector[2] xy;

model {
  target += -my_rosenbrock(xy);

And here is the implementation of my_rosenbrock function using built-in Stan tools:

namespace rosenbrock_model_namespace {

template <typename T0__>
Eigen::Matrix<stan::promote_args_t<T0__>, -1, 1>
my_rosenbrock(const Eigen::Matrix<T0__, -1, 1>& xy, std::ostream* pstream__)
    // throw std::logic_error("not implemented");  // this should never be called
    // typedef typename boost::math::tools::promote_args<T0__>::type T;
    if (xy.size() > 2) {
        throw std::logic_error("This function is implemented only for input of size 2.");
    using T = stan::return_type_t<T0__>;
    T x = xy(0);
    T y = xy(1);
    T res = pow((1 - x), 2) + (100 * pow((y - pow(x, 2)), 2));
    Eigen::Matrix<T, -1, 1> out(1);
    out(0) = res;
    return out;

} // rosenbrock_model_namespace

While the implementation relying on PETSc looks like:

#include <stan/math/rev/functor.hpp>
#include <petscvec.h>
#include <petscerror.h>
#include "petsc_rosenbrock.hpp"

using namespace stan::math;

namespace rosenbrock_model_namespace {

template <typename T0__>
Eigen::Matrix<typename boost::math::tools::promote_args<T0__>::type, -1, 1>
my_rosenbrock(const Eigen::Matrix<T0__, -1, 1>& xy, std::ostream* pstream__) 
    throw std::logic_error("not implemented");  // this should never be called

Eigen::Matrix<double, -1, 1> my_rosenbrock<double>(const Eigen::Matrix<double, -1, 1>& xy, std::ostream* pstream__) {
    matrix_v res = adj_jac_apply<stan::math::petsc::petsc_scalar_functor<PetscRosenbrock>>(xy);
    return res.val();

Eigen::Matrix<var, -1, 1> my_rosenbrock<var>(const Eigen::Matrix<var, -1, 1>& xy, std::ostream* pstream__) {
    return adj_jac_apply<stan::math::petsc::petsc_scalar_functor<PetscRosenbrock>>(xy);

} // rosenbrock_model_namespace

Run STANCFLAGS=--allow-undefined USER_HEADER=examples/rosenbrock/ext_ros_petsc.hpp make examples/rosenbrock/rosenbrock.
Now you can run the example using usual CmdStan’s commands, for example examples/rosenbrock/rosenbrock optimize.
Since we’re using PETSc we should be able to use mpirun and indeed we can now run mpirun -n 127 examples/rosenbrock/rosenbrock optimize.
Note that this mpirun command does different things than Stan’s MPI. In fact, I have switched off Stan’s MPI compilation in this fork as the usual PETSc MPI parallelism (single program multiple data) can’t be easily combined with Stan’s MPI paradigm (parallel cluster with listeners processes).
PETSc command line options can be used as well. For example mpirun -n 5 examples/rosenbrock/rosenbrock optimize -log_view will output PETSc level log information in the end of the execution. Rosenbrock function in PETSc was implemented to accept command line arguments to change the condition parameter alpha (the default value is 100.0), we can change this value with mpirun -n 5 examples/rosenbrock/rosenbrock optimize -log_view -alpha 50.0.

PETSc includes an optimization solver TAO, which requires the user to provide functions for evaluation of the objective function and gradients wrt parameters, very much like Stan requirement for external functions. So for this specific Rosenbrock example petsc_rosenbrock.hpp was adapted from TAO’s tutorial for Rosenbrock funciton optimization. If someone is familiar with TAO then it won’t take much effort to adapt the code to be compatible with the current interface. Of course the opposite direction (Stan -> PETSc) has more steep learning curve as it’s expected to be able to use PETSc and be ready to define the multiply_adjoint_jacobian method (that is implementing the gradient computations) manually.

Current example uses stan::math::adj_jac_apply, but it’s also possible to use the new reverse_pass_callback.
To use adj_jac_apply user has to implement a functor (in the future there could be a general templated functor with variadic arguments, etc.) with operator() and multiply_adjoint_jacobian methods.
In operator() Eigen Vectors are transformed into PETSc Vector and passed to the PETSc side of computation, the result is converted back to Eigen type.
In multiply_adjoint_jacobian the input vector is restored from autodiff memory stack, adj vector is converted into PETSc type and PETSc’s pullback function (solve_adjoint in this case) is called and the result is transformed back into Eigen type.
Example for a scalar function with vector valued input (\mathbb{R}^N \rightarrow \mathbb{R})

The relevant conversion functions are defined here.

In parallel mode all PETSc side of things happen in parallel and PETSc takes care of all necessary communications between the processes. In Stan side of things duplicated computations are performed on all processes so that calls to PETSc world are synchronized.

The main components of PETSc library is

  • KSP, a linear solver for Ax=b problems. Derivatives for that are not difficult to implement in PETSc (see 2.3.1 Matrix inverse product here
  • SNES, a nonlinear solver for F(u)=0 problems. Derivatives can be obtained using the adjoint approach.
  • TS, a timestepping solver for F(u̇, u, t) = G(u, t) problems. Derivatives can be calculated using TS-Adjoint.

Feedback and questions welcome!
Thanks! 🙂




Yeah same, cool! Any particular application?

Hopefully in the future we get some sort of formal interface to other languages.

Something like this with a clear definition of how all the Stan types get carried over:

extern "C++" vector my_rosenbrock(vector xy);
extern "julia" vector my_rosenbrock(vector xy);

I know @yizhang talked about this before in Math and I got nervous cause of the scope. I guess I’m still nervous but it’d be nice to have it all figured out lol.

The scope is actually quite modular, as this is not very different from how we append sundials. The main challenge is to have a uniform MPI framework. The way I designed Torsten’s MPI(also cross-chain warmup) backend specifically considered this.

I’m working on phase-field models for metal microstructure evolution. One goal is to combine experiments and simulations from this paper “Phase field modeling of rapid resolidification of Al-Cu thin films”. And this interface to PETSc will be further used to interface with the FEniCS C++ library for solving PDEs with automatic tangent and adjoint equations.

Interesting. Some people would like the “official” foreign-function-interface declared in Stan somehow. I think FFI is already there up to normal C++ usage such as including headers and linking object files since with the adj_jac_apply and reverse_pass_callback external functions do not need to interact with Stan autodiff types, only conversions to and from Eigen arrays are needed.
It took too much time for me to realize that and the end result doesn’t contain much code

Taking this work as a template, we can see that getting something connected to C/C++/Fortran/Julia and running some specialized example probably won’t take much effort. The problem is in generalizing the approach and putting more responsibility on the user as it departs from the probabilistic programming way of using ready-made blocks.

This is fantastic Ivan!

My question is what did you find annoying when interfacing with Cmdstan and what could be done simplify this procedure?

For example what I find annoying when interfacing C++ in Cmdstan is the namespace thing, where you have to set the namespace of the model of the external .hpp code when you change the name of the model. rstan handles that nicer for example.

If we could simplify this to essentially having to add:

#include <stan_petsc/petsc.hpp>

in the Stan model and

-include ../stan_petsc/makestuff

in the make/local

that would get stuff like this a lot more exposure. If it could be done. I think this is basically what @yizhang 's idea was.

Dont get me wrong, this is great as is.

I didn’t find anything to be particularly annoying to me.
Yes, one thing that one needs to keep in mind is the namespace name.
One small inconvenience is that one has to run .stan compilation once before trying to code an external function to copy the expected templated function signature.

I personally think that it’s a good thing that C++ files can’t be included in .stan files. It keeps separate the Stan modeling language and C++.

1 Like