Defining a target distribution using Python model output

jackelsey · September 21, 2021, 5:46pm

I’ve written a biochemistry model in Python, and I’d like to perform a Monte Carlo experiment using HMC to constrain the parameters of that model to those that produce realistic-looking model output. We’ve figured out how to to assign a value from 0 to 1 to every point in parameter space based on how “realistic” the corresponding model output looks, but calculating that value requires running the model’s Python code (which in turn numerically integrates a complicated system of ODEs).

After searching the forum, reading the user’s guide, and watching this video, I’ve concluded that it isn’t possible to do this with Stan. Am I mistaken? Is there a different software package that you think would work? This is my first time working with MCMC methods, and any advice would be greatly appreciated.

mike-lawrence · September 22, 2021, 1:22pm

Yeah, you’d have to rewrite your simulations in C to use Stan. If you want to re-use your python code I’d look at PyMC3 and possibly tensorflow. If you look at the latter, report back on whether it is possible to call non-tf python code; I think tensorflow’s python-API-atop-C-core architecture might leave you in a similar place as Stan with respect to using non-tf python code (or, @charlesm93: know off-hand if TF can use non-TF python in the target?)

jackelsey · September 22, 2021, 2:03pm

When I was looking at the other options out there, TensorFlow seemed like it might work. The argument describing the target distribution in tfp.mcmc.HamiltonianMonteCarlo is specified in the API documentation as a “Python callable”.

jsocolar · September 22, 2021, 2:15pm

I think there exist tools that will compile python code to c++. I don’t know much about them, and I don’t know if they produce great c++ code, but if your goal is just to have a c++ implementation that Stan can work with (and not necessarily a well-optimized c++implementation), there might be some possibility of doing it automatically.

charlesm93 · September 22, 2021, 3:00pm

TensorFlow Probability would be a good option, although the interface experience isn’t the same as with Stan. It definitely took a bit of getting used to. I’m not saying it’s better or worse, just different. With TFP, you need to make sure your Python code is differentiable if you use a gradient-based algorithm. So something like vanilla Numpy would not work; on the other hand Jax, which seamlessly supports many Numpy functions and has autodiff, would. You can write your function using TensorFlow, but I find Jax easier to use.

Big caveat: Jax’s ODE support is somewhat limited. You only have access to a non-stiff integrator with adjoint differentiation. Stan supports stiff and non-stiff solvers, and both forward and adjoint differentiation. There’s a PR for a non-stiff solver, which I expect will get in anytime (might already be available).

Bigger caveat: Jax’s ODE doesn’t return an error message when it fails. See No error message when `odeint` fails · Issue #7326 · google/jax · GitHub.

I haven’t tried PyMC3. I think it allows you to specify your code in Python, but you probably need to use their autodiff library. I’m not sure what their ODE support is.

Topic		Replies	Views
A big stochastic model in Stan Modeling	20	2068	October 22, 2017
Animating MCMC with Python Modeling mcmc , interpret-results , visualisation	4	708	August 16, 2021
TensorFlow probability c.f. Stan / comment General	4	807	June 7, 2021
Education-oriented, functional, interpreted Stan/HMC implementation Project Proposals education	19	1934	February 14, 2022
Rienman Manifold HMC / MALA? (for multivariate probit model) General	4	777	March 29, 2023

Defining a target distribution using Python model output

Related topics