Hello everyone! I’d like to share my work on bringing together Stan and PETSc.

PETSc is a library for parallel scientific computing. The motivation for this work for me was being able to run PETSc-based partial differential equations (PDEs) solvers inside Stan models.

Others might find other use cases.

First, relevant links to repositories:

- Fork of CmdStan: https://github.com/IvanYashchuk/cmdstan-petsc
- Fork of Stan-Math: https://github.com/IvanYashchuk/stan-math-petsc

Let’s see how it works on an example, let’s add to Stan model a Rosenbrock function defined in PETSc world. First, you’d have to install PETSc following the instructions here.

Make sure that the PETSc installation is working via running `make all test`

, you don’t need to wait for all tests to finish.

Now set the `PETSC_DIR`

environment variable to point to the path where PETSc was installed `export PETSC_DIR=/path/to/petsc`

.

Next `git clone https://github.com/IvanYashchuk/cmdstan-petsc.git`

, `git clone https://github.com/IvanYashchuk/stan-math-petsc.git`

.

Set the path of stan-math-petsc directory into `MATH`

environment variable `export MATH=$PWD/stan-math-petsc/`

.

Go inside the cmdstan-petsc directory `cd cmdstan-petsc`

.

Run `make build`

inside the cmdstan-petsc directory. Everything should compile now without errors.

Only two usual steps are left compiling the `.stan`

file and running the example.

Compilation of `.stan`

file now requires some more environment variables `STANCFLAGS`

and `USER_HEADER`

.

Now we have a simple `rosenbrock.stan`

file:

```
functions {
vector my_rosenbrock(vector xy);
}
parameters {
vector[2] xy;
}
model {
target += -my_rosenbrock(xy);
}
```

And here is the implementation of `my_rosenbrock`

function using built-in Stan tools:

```
namespace rosenbrock_model_namespace {
template <typename T0__>
Eigen::Matrix<stan::promote_args_t<T0__>, -1, 1>
my_rosenbrock(const Eigen::Matrix<T0__, -1, 1>& xy, std::ostream* pstream__)
{
// throw std::logic_error("not implemented"); // this should never be called
// typedef typename boost::math::tools::promote_args<T0__>::type T;
if (xy.size() > 2) {
throw std::logic_error("This function is implemented only for input of size 2.");
}
using T = stan::return_type_t<T0__>;
T x = xy(0);
T y = xy(1);
T res = pow((1 - x), 2) + (100 * pow((y - pow(x, 2)), 2));
Eigen::Matrix<T, -1, 1> out(1);
out(0) = res;
return out;
}
} // rosenbrock_model_namespace
```

While the implementation relying on PETSc looks like:

```
#include <stan/math/rev/functor.hpp>
#define PETSC_CLANGUAGE_CXX 1
#include <petscvec.h>
#include <petscerror.h>
#include "petsc_rosenbrock.hpp"
using namespace stan::math;
namespace rosenbrock_model_namespace {
template <typename T0__>
Eigen::Matrix<typename boost::math::tools::promote_args<T0__>::type, -1, 1>
my_rosenbrock(const Eigen::Matrix<T0__, -1, 1>& xy, std::ostream* pstream__)
{
throw std::logic_error("not implemented"); // this should never be called
}
template<>
Eigen::Matrix<double, -1, 1> my_rosenbrock<double>(const Eigen::Matrix<double, -1, 1>& xy, std::ostream* pstream__) {
matrix_v res = adj_jac_apply<stan::math::petsc::petsc_scalar_functor<PetscRosenbrock>>(xy);
return res.val();
}
template<>
Eigen::Matrix<var, -1, 1> my_rosenbrock<var>(const Eigen::Matrix<var, -1, 1>& xy, std::ostream* pstream__) {
return adj_jac_apply<stan::math::petsc::petsc_scalar_functor<PetscRosenbrock>>(xy);
}
} // rosenbrock_model_namespace
```

Run `STANCFLAGS=--allow-undefined USER_HEADER=examples/rosenbrock/ext_ros_petsc.hpp make examples/rosenbrock/rosenbrock`

.

Now you can run the example using usual CmdStan’s commands, for example `examples/rosenbrock/rosenbrock optimize`

.

Since we’re using PETSc we should be able to use `mpirun`

and indeed we can now run `mpirun -n 127 examples/rosenbrock/rosenbrock optimize`

.

Note that this mpirun command does different things than Stan’s MPI. In fact, I have switched off Stan’s MPI compilation in this fork as the usual PETSc MPI parallelism (single program multiple data) can’t be easily combined with Stan’s MPI paradigm (parallel cluster with listeners processes).

PETSc command line options can be used as well. For example `mpirun -n 5 examples/rosenbrock/rosenbrock optimize -log_view`

will output PETSc level log information in the end of the execution. Rosenbrock function in PETSc was implemented to accept command line arguments to change the condition parameter `alpha`

(the default value is 100.0), we can change this value with `mpirun -n 5 examples/rosenbrock/rosenbrock optimize -log_view -alpha 50.0`

.

PETSc includes an optimization solver TAO, which requires the user to provide functions for evaluation of the objective function and gradients wrt parameters, very much like Stan requirement for external functions. So for this specific Rosenbrock example `petsc_rosenbrock.hpp`

was adapted from TAO’s tutorial for Rosenbrock funciton optimization. If someone is familiar with TAO then it won’t take much effort to adapt the code to be compatible with the current interface. Of course the opposite direction (Stan -> PETSc) has more steep learning curve as it’s expected to be able to use PETSc and be ready to define the `multiply_adjoint_jacobian`

method (that is implementing the gradient computations) manually.

Current example uses `stan::math::adj_jac_apply`

, but it’s also possible to use the new `reverse_pass_callback`

.

To use `adj_jac_apply`

user has to implement a functor (in the future there could be a general templated functor with variadic arguments, etc.) with `operator()`

and `multiply_adjoint_jacobian`

methods.

In `operator()`

Eigen Vectors are transformed into PETSc Vector and passed to the PETSc side of computation, the result is converted back to Eigen type.

In `multiply_adjoint_jacobian`

the input vector is restored from autodiff memory stack, `adj`

vector is converted into PETSc type and PETSc’s pullback function (`solve_adjoint`

in this case) is called and the result is transformed back into Eigen type.

Example for a scalar function with vector valued input (\mathbb{R}^N \rightarrow \mathbb{R}) https://github.com/IvanYashchuk/stan-math-petsc/blob/44e9052e9c4ff1cbf014e1c56a45ac7bf9f98349/stan/math/rev/functor/petsc_functor.hpp#L19

The relevant conversion functions are defined here.

In parallel mode all PETSc side of things happen in parallel and PETSc takes care of all necessary communications between the processes. In Stan side of things duplicated computations are performed on all processes so that calls to PETSc world are synchronized.

The main components of PETSc library is

- KSP, a linear solver for
`Ax=b`

problems. Derivatives for that are not difficult to implement in PETSc (see 2.3.1 Matrix inverse product here`https://people.maths.ox.ac.uk/gilesm/files/NA-08-01.pdf`

). - SNES, a nonlinear solver for
`F(u)=0`

problems. Derivatives can be obtained using the adjoint approach. - TS, a timestepping solver for
`F(u̇, u, t) = G(u, t)`

problems. Derivatives can be calculated using`TS-Adjoint`

.

Feedback and questions welcome!

Thanks! 🙂