I just wanted to point people to this super useful notebook and gist from Ricardo Vieira, one of the PyMC developers, which outlines the relationship between how PyMC and Stan specify models, including a lot of tricks for getting Stan-like flexibility into the graphical-modeling-centric world of PyMC.
It spun out of a discussion on the PyMC discourse:
The idea of keeping the language but swapping backends is cool. The GPU support in JAX is great and enables huge programs. Variational inference and normalizing flows can really expand Bayesian modeling to big data.
When I look at what I want today for any size of data, I think it’s useful to look at the strengths and issues we currently have with Stan and other PPLs. One of these days I’ll wire-frame what my current ideal PPL would look like. Probably something closer to SlicStan, but with a few additions:
Stuff that is probably doable today:
Turning AD on/off for some things, user-defined derivatives, and higher-order AD.
Allow the user to apply AD to their own functions to get estimates. When using an extended Kalman filter, the Jacobian is needed; instead of hand-supplying it, just call a jacobian function. The same goes for “score” models (see Informant (statistics) - Wikipedia) that take the derivative of the log-likelihood into account. This is similar to what our ODE solver does: pausing AD and then using that info again in the tape.
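As a toy illustration of the “just call a jacobian function” idea, here is a minimal forward-mode AD sketch (dual numbers) applied to a made-up EKF state transition. All names are hypothetical, not any PPL’s actual API:

```python
# Minimal forward-mode AD via dual numbers, used to get an EKF Jacobian
# without hand-deriving it. Everything here is illustrative only.
import math

class Dual:
    """Forward-mode AD value: a number plus its derivative (tangent)."""
    def __init__(self, val, tan=0.0):
        self.val, self.tan = val, tan

    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.val + other.val, self.tan + other.tan)
    __radd__ = __add__

    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.val * other.val,
                    self.tan * other.val + self.val * other.tan)
    __rmul__ = __mul__

def sin(x):
    return Dual(math.sin(x.val), math.cos(x.val) * x.tan)

def jacobian(f, x):
    """Jacobian of f: R^n -> R^n via one forward pass per input."""
    n = len(x)
    cols = []
    for i in range(n):
        seed = [Dual(xj, 1.0 if j == i else 0.0) for j, xj in enumerate(x)]
        cols.append([out.tan for out in f(seed)])
    # Transpose columns into rows: J[r][c] = d f_r / d x_c.
    return [[cols[c][r] for c in range(n)] for r in range(len(cols[0]))]

# Hypothetical EKF state transition: x' = x + dt*v, v' = v + dt*sin(x).
def transition(state, dt=0.1):
    x, v = state
    return [x + dt * v, v + dt * sin(x)]

J = jacobian(transition, [0.0, 1.0])
print(J)  # [[1.0, 0.1], [0.1, 1.0]] since cos(0) = 1
```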
Easy automatic optimizations like what PyTensor and Aesara do: turning log(1 + x) into log1p(x), etc.
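The log1p rewrite matters because the naive form loses all precision for tiny x; a small self-contained demo:

```python
# Why an optimizer would rewrite log(1 + x) as log1p(x): for tiny x the
# naive form loses everything because 1 + x rounds back to 1.0 in doubles.
import math

x = 1e-18
naive = math.log(1 + x)    # 1 + 1e-18 == 1.0 in double precision
stable = math.log1p(x)     # computed without ever forming 1 + x

print(naive)   # 0.0 -- the small term was rounded away
print(stable)  # ~1e-18, correct to machine precision
```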
Stuff that needs more research:
Automatic reparameterization
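The classic case such a tool would handle is rewriting a centered hierarchical parameter into its non-centered form. A sketch of the rewrite, with made-up names:

```python
# What an automatic reparameterizer might do for a hierarchical model:
# the centered form theta ~ Normal(mu, tau) becomes
#   theta_raw ~ Normal(0, 1);  theta = mu + tau * theta_raw,
# which has much friendlier geometry for HMC when tau is small.
# All names here are illustrative only.
import random

random.seed(1)
mu, tau = 2.0, 0.01

# Centered: sample theta directly (hard geometry as tau -> 0).
theta_centered = random.gauss(mu, tau)

# Non-centered: sample a standardized variable, then transform.
theta_raw = random.gauss(0.0, 1.0)
theta_noncentered = mu + tau * theta_raw

# Both are draws from Normal(mu, tau); only the parameterization differs.
print(theta_centered, theta_noncentered)
```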
Much, much, much faster usage of posterior draws: approximations that save only a fraction of the data but are about 95% accurate. The hard problem here is the joint distribution of the parameters. Univariate summaries are easy, but condensing the joint movements of the variables is hard. If it can be done, though, you could then tell the program, in one line, to recompute using the full draws. The idea is to iterate super fast.
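A rough sketch of what a compressed representation could look like, under the (big) assumption that the joint is summarized well by its first two moments; everything here is hypothetical:

```python
# "Compressed" posterior draws: keep only the mean vector and covariance
# matrix of the joint instead of all draws. A sketch only -- a real tool
# would need something richer than a Gaussian summary.
import random

random.seed(0)

# Fake 2-d correlated "posterior draws": b tracks a, so the joint matters.
draws = []
for _ in range(10_000):
    a = random.gauss(0.0, 1.0)
    b = a + random.gauss(0.0, 0.5)
    draws.append((a, b))

def compress(draws):
    """Reduce draws to first and second moments of the joint."""
    n = len(draws)
    mean = [sum(col) / n for col in zip(*draws)]
    cov = [[sum((d[i] - mean[i]) * (d[j] - mean[j]) for d in draws) / n
            for j in range(2)] for i in range(2)]
    return {"mean": mean, "cov": cov}

summary = compress(draws)
# The off-diagonal captures the joint movement: cov(a, b) is about 1.0
# here because b = a + noise.
print(summary["cov"][0][1])
```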
A Stan-like syntax to construct programs/models over the draws. Right now I hate that it is super awkward to do matrix math and stuff over the draws. Make this easy!
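One way the “write it once, run it over the draws” idea could look, sketched with a hypothetical `over_draws` helper:

```python
# Lift an ordinary computation so it runs per posterior draw, instead of
# hand-writing loops over the draws. `over_draws` is a made-up name.
def over_draws(fn):
    """Lift fn(draw, ...) -> value to act on a list of draws."""
    def lifted(draws, *args, **kwargs):
        return [fn(draw, *args, **kwargs) for draw in draws]
    return lifted

# Ordinary math, written once, for a single draw {"alpha": ., "beta": .}.
def predict(draw, x):
    return draw["alpha"] + draw["beta"] * x

# Fake draws from a regression posterior.
draws = [{"alpha": 1.0, "beta": 2.0},
         {"alpha": 1.0, "beta": 2.5},
         {"alpha": 2.0, "beta": 1.5}]

posterior_predict = over_draws(predict)
print(posterior_predict(draws, x=2.0))  # [5.0, 6.0, 5.0]
```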
Composing Stan models. Using posterior fits as priors. Maybe using message passing like RxInfer.jl does, but combining it with some HMC inference? I don’t know how to do this, but it would be cool to have.
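The “posterior fits as priors” piece can at least be crudely faked today by moment matching the marginal; a sketch (the real problem, as above, is the joint, not the marginals):

```python
# Summarize model 1's posterior draws for a parameter by moment matching
# to a normal, then use that as the prior in model 2. A crude sketch only.
import math
import random

random.seed(0)

# Pretend these are posterior draws of `beta` from model 1.
beta_draws = [random.gauss(0.3, 0.1) for _ in range(5_000)]

n = len(beta_draws)
mean = sum(beta_draws) / n
sd = math.sqrt(sum((b - mean) ** 2 for b in beta_draws) / (n - 1))

# Model 2 would then declare, in Stan-like pseudocode:
#   beta ~ normal(mean, sd);
print(round(mean, 2), round(sd, 2))
```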