Brmspy: Python-first access to brms (cmdstanr backend, ArviZ output)

Hi all. I wanted to share a Python interface I’ve been building for brms that may be useful for anyone who works across both R and Python environments.

brmspy provides a Python-side API for fitting brms models via cmdstanr, returning results directly as ArviZ InferenceData while still exposing the underlying brmsfit object when needed.

It’s designed for production pipelines where models are authored in the brms formula language but the rest of the workflow (data processing, prediction, dashboards, etc.) is Python-based.


What it does

  • Calls brms functions through rpy2 with correct parameter translation
  • Delegates all modeling logic to real brms. No Python-side reimplementation, no divergence from native behavior. Opinionated wrappers that rebuild formulas or stancode in Python inevitably drift from brms and accumulate their own bugs.
  • Preserves full brms parameter names
    (e.g. b_Intercept, b_x, sd_group__Intercept, etc.)
  • Returns ArviZ InferenceData by default for downstream analysis in Python
  • Still provides access to the R-side brmsfit object (model.r)
  • Includes helpers like prior(), formula(), make_stancode(), and summary utilities
  • Tested against multiple brms workflows (priors, prediction functions, stancode generation, etc.)
  • Works with several algorithms. Tested with NUTS sampling as well as full-rank and mean-field variational inference.
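
To illustrate what the preserved names look like from the Python side, here is a minimal sketch that sorts brms-style parameter names into broad classes by prefix. The helper classify_brms_param is hypothetical and not part of brmspy. The prefix rules follow the usual brms naming scheme (b_ for population-level effects, sd_<group>__<term> for group-level standard deviations, cor_ for group-level correlations, r_ for group-level effects) and only cover those common cases.

```python
def classify_brms_param(name: str) -> tuple[str, dict[str, str]]:
    """Classify a brms-style parameter name by prefix (hypothetical helper)."""
    if name.startswith("b_"):
        # Population-level ("fixed") effect, e.g. b_Intercept
        return "population", {"term": name[2:]}
    if name.startswith("sd_"):
        # Group-level standard deviation, e.g. sd_patient__Intercept
        group, _, term = name[3:].partition("__")
        return "group_sd", {"group": group, "term": term}
    if name.startswith("cor_"):
        # Group-level correlation, e.g. cor_patient__Intercept__x
        group, _, terms = name[4:].partition("__")
        return "group_cor", {"group": group, "terms": terms}
    if name.startswith("r_"):
        # Group-level effect; the bracketed index part is kept as-is
        return "group_effect", {"raw": name[2:]}
    return "other", {"raw": name}


for n in ["b_Intercept", "b_x", "sd_group__Intercept", "lp__"]:
    print(n, classify_brms_param(n))
```

Because the names are left untouched, this kind of prefix matching is enough to select, say, all population-level coefficients when passing var_names to ArviZ.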

I’m using this in production (several daily pipelines), so the focus has been stability and predictable behavior rather than feature breadth.


Test coverage & stability

The current release has:

  • 77% test coverage (Python-side)
  • 54 tests covering priors, model fitting, get_stancode, summary paths, predictions, and no-sampling modes
  • Continuous testing across Python 3.10–3.14

The interface is still early (0.1.x), but the core fitting, prior handling, and prediction paths have been hardened through use in production systems.


Example

pip install brmspy

from brmspy import brms, prior

epilepsy = brms.get_brms_data("epilepsy")

model = brms.fit(
    formula="count ~ zAge + zBase * Trt + (1|patient)",
    data=epilepsy,
    family="poisson",
    priors=[
        prior("normal(0, 1)", "b"),
        prior("exponential(1)", "sd", group="patient"),
        prior("student_t(3, 0, 2.5)", "Intercept"),
    ],
    chains=4,
    iter=2000
)

# Python-side analysis
idata = model.idata

Documentation

Docs:
https://kaitumisuuringute-keskus.github.io/brmspy/

Source:
https://github.com/kaitumisuuringute-keskus/brmspy/

PyPI:
https://pypi.org/project/brmspy/

Very cool! Thanks for letting us know about this.

Did some work on installation. The 0.1.9 release adds prebuilt runtimes for brms + cmdstanr + CmdStan, which bring installation on a clean machine down to ~20–60 seconds (with a decent internet connection) instead of the usual ~20–30 minutes of compiling from source.

All binaries are built in GitHub Actions and signed with build attestations, so they can be verified as having come from the Actions workflow.

Quickstart:

from brmspy import brms

brms.install_brms(use_prebuilt_binaries=True)

This downloads and activates a precompiled set of R libraries and a compiled CmdStan environment. The default versions will always be the latest stable versions of brms and cmdstanr. In case of incompatibilities (e.g. R 4.5 and cmdstanr 0.8 on Windows), beta releases are used instead.

Platforms and requirements

Targeted toolchain / runtime assumptions:

R: R >= 4.0

Linux (x86_64)

  • glibc >= 2.27 (Ubuntu 18.04+, Debian 10+, RHEL 8+)
  • g++ >= 9.0

macOS (Apple Silicon)

  • Xcode Command Line Tools (xcode-select --install)
  • clang >= 11.0

Windows (x86_64)

  • Rtools 4.x with MinGW toolchain
  • g++ >= 9.0
  • The installer calls cmdstanr::check_cmdstan_toolchain() and installs the matching Rtools version when needed.
  • For R 4.5 the prebuilt binary contains cmdstanr 0.9

Current coverage

  • Prebuilt runtimes are currently built and tested for R 4.5 on:
    • Linux x86_64
    • macOS arm64
    • Windows x86_64
  • Runtimes for R 4.0–4.4 will be built next using the same pipeline (tomorrow, very likely).
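
As a rough illustration of the platform coverage listed above, the following sketch checks whether the current machine matches one of the three prebuilt targets. The function names and the PREBUILT_TARGETS table are hypothetical and only mirror the list in this post; brmspy's own compatibility check may look at more than this (for example the R version and toolchain).

```python
import platform

# Targets named in this post, as (platform.system(), normalized machine).
# This table is an assumption for illustration, not brmspy's real logic.
PREBUILT_TARGETS = {
    ("Linux", "x86_64"),
    ("Darwin", "arm64"),
    ("Windows", "x86_64"),
}


def normalize_machine(machine: str) -> str:
    """Map common aliases (Windows reports AMD64) onto one spelling."""
    m = machine.lower()
    return {"amd64": "x86_64", "aarch64": "arm64"}.get(m, m)


def has_prebuilt_target(system: str, machine: str) -> bool:
    """Return True if (system, machine) is one of the listed targets."""
    return (system, normalize_machine(machine)) in PREBUILT_TARGETS


# Check the machine this script is running on.
print(has_prebuilt_target(platform.system(), platform.machine()))
```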

Each runtime is tested automatically. I have also tested the Linux and macOS binaries manually in various environments.

If you want to benchmark the cold-start time in a clean environment, here is a Colab notebook that demonstrates the prebuilt path:

Colab demo notebook – brmspy prebuilt runtime

Currently built runtimes can be viewed HERE

I’m actively moving towards a stable public API release. Currently implementing easy-to-use wrappers around common brms functions.

0.1.12

New Features

  • Added save_rds() for saving brmsfit or generic R objects.
  • Added load_rds_fit() for loading saved brmsfit objects and returning a FitResult with attached InferenceData.
  • Added load_rds_raw() for loading arbitrary R objects from RDS files.
  • Added fit alias brm.

Families

  • Added brmspy.families module with Python wrappers for brmsfamily() and family().
  • Implemented keyword-argument wrappers for the following families:
    student, bernoulli, beta_binomial, negbinomial, geometric,
    lognormal, shifted_lognormal, skew_normal, exponential, weibull,
    frechet, gen_extreme_value, exgaussian, wiener,
    Beta, dirichlet, logistic_normal, von_mises, asym_laplace, cox,
    hurdle_poisson, hurdle_negbinomial, hurdle_gamma, hurdle_lognormal,
    hurdle_cumulative, zero_inflated_beta, zero_one_inflated_beta,
    zero_inflated_poisson, zero_inflated_negbinomial,
    zero_inflated_binomial, zero_inflated_beta_binomial,
    categorical, multinomial, cumulative, sratio, cratio, acat.

Priors

  • Added default_prior() for retrieving default priors for a model formula and dataset (pd.DataFrame).
  • Added get_prior() for inspecting prior structure before fitting (pd.DataFrame).

API Organization

  • Reorganized brms wrappers into modular files under brmspy/brms_functions/
    (brm, diagnosis, families, formula, io, prediction, prior, stan).

Internal / Typing

  • Added RListVectorExtension protocol for return types that wrap R list-like structures.
    Enables automatic extraction of underlying R objects in py_to_r and kwargs_r.

Slightly expanded example:

Repo: GitHub - kaitumisuuringute-keskus/brmspy: Python-first access to R’s brms with proper parameter names, ArviZ support, and cmdstanr performance. The easiest way to run brms models from Python.

Future stable release (0.2)

There are a couple of things to do before the public API is frozen and I can actually recommend using the library without pinning a version:

  • Expand diagnostics functions: fixef, ranef, loo, loo_compare, posterior_summary, validate_newdata
  • Criterion support: add_criterion, add_loo, add_waic, add_ic
  • Proper error model (get rid of generic exceptions, make sure all R errors are readable)
  • Test coverage 85%+ (critical functions first)
  • R dependencies test coverage 80%+
  • Runtimes built for R 4.0–4.5 on all three major operating systems (Windows only has 4.5 at the moment).

Thanks for the update, definitely keep us posted. Hopefully continuing to post updates here will also help get you some new users.

Glad to hear updates are welcome :) I’ll still keep posts to bigger releases rather than every small change.

Right now I’m doing a bigger refactor for the first “stable” API release - about 95% there, mostly chasing edge cases with the R environment and dependency management. That won’t land today, so here’s yesterday’s larger update and a few QoL things that are coming with the stable release.

Yesterday’s release added:

  • summary() - full summary object with pretty printing (previously only fixed effects)
  • fixef()
  • ranef()
  • posterior_summary()
  • prior_summary()
  • loo()
  • loo_compare()
  • validate_newdata()
  • call() - a generic wrapper to call arbitrary brms or R functions

On the quality-of-life side I’m trying to stay as close as possible to brms’ formula interface.

Both of these will be valid formula definitions in the stable release:

from brmspy import bf, set_rescor, nlf, lf

formula1 = (
    bf("y ~ 1") +
    nlf("sigma ~ a * exp(b * x)") +
    lf("a ~ x", "b ~ z + (1|g)", dpar="sigma")
)

formula2 = bf("""
mvbind(tarsus, back) ~
    sex + 
    hatchdate + 
    (1|p|fosternest) + 
    (1|q|dam)
""") + set_rescor(rescor=True)

There’s no heavy Python-side machinery behind the + operator - it just delegates to R’s +:

@classmethod
def _formula_parse(cls, r: Union[str, robjects.ListVector]) -> "FormulaResult":
    from .helpers.conversion import r_to_py
    if isinstance(r, str):
        r_fun = cast(Callable, robjects.r("brms::bf"))
        r = cast(robjects.ListVector, r_fun(r))
    return cls(r=r, dict=cast(dict, r_to_py(r)))


def __add__(self, other):
    if isinstance(other, (robjects.ListVector, str)):
        # For performance reasons, this shouldn't do a full parse here.
        other = FormulaResult._formula_parse(other)

    if not isinstance(other, FormulaResult):
        raise ValueError(
            "When adding values to formula, they must be FormulaResult "
            "or parseable to FormulaResult"
        )

    plus = cast(Callable, robjects.r("function (a, b) a + b"))
    combo = plus(self.r, other.r)

    return FormulaResult._formula_parse(combo)

brmspy update: 0.2.0 is on PyPI.

The main changes are:

  • new formula DSL that mirrors brms’ multivariate / distributional interface
  • a refactored runtime / installation layer
  • removal of the embedded loo/add_criterion helpers in favour of ArviZ
  • The public API is now stable: parameters, functions, etc. won't be removed without notice. They will first be deprecated and only removed after a while. Internals (underscored modules or methods) may change without warning.

Repo: https://github.com/kaitumisuuringute-keskus/brmspy
Docs: https://kaitumisuuringute-keskus.github.io/brmspy/

For now I’m done with major changes and will mainly be focusing on bugfixes, performance and memory improvements.


Breaking changes

  • Removed loo, loo_compare, add_criterion. In embedded R mode I kept running into hard crashes around add_criterion + large models (especially on Windows). The intended path now is to use arviz.loo / arviz.compare on the .idata attribute.
  • install_brms(use_prebuilt_binaries=...) has been renamed to install_brms(use_prebuilt=...).
  • The “canonical” names going forward are brm and bf. Older aliases fit / formula are still exported but effectively deprecated.
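
If existing pipeline code still passes the old keyword, a thin shim on the caller's side can translate it during the migration. This is only a sketch: translate_install_kwargs is a hypothetical helper in your own code, not something brmspy provides, and the only fact taken from the release notes is the rename itself.

```python
import warnings


def translate_install_kwargs(kwargs: dict) -> dict:
    """Return a copy of kwargs with the pre-0.2.0 keyword renamed."""
    out = dict(kwargs)
    if "use_prebuilt_binaries" in out:
        if "use_prebuilt" in out:
            raise TypeError(
                "pass only one of use_prebuilt / use_prebuilt_binaries"
            )
        warnings.warn(
            "use_prebuilt_binaries was renamed to use_prebuilt in 0.2.0",
            DeprecationWarning,
            stacklevel=2,
        )
        out["use_prebuilt"] = out.pop("use_prebuilt_binaries")
    return out


# Demo: silence the warning here so only the translated dict is printed.
with warnings.catch_warnings():
    warnings.simplefilter("ignore", DeprecationWarning)
    print(translate_install_kwargs({"use_prebuilt_binaries": True}))
    # prints {'use_prebuilt': True}
```

The translated dictionary can then be unpacked into the install call (for example brms.install_brms(**translated)), which keeps old configuration files working while they are being updated.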

Formula DSL + multivariate example

The formula side is now much closer to brms. You can build multivariate, distributional and non-linear models in Python using the same structure and just hand them straight to brms under the hood.

Runnable example on Google Colab.

A concrete example (BTdata multivariate model), side by side:

Python (brmspy):

from brmspy import brms, bf, lf, set_rescor
from brmspy import skew_normal, gaussian
import arviz as az

df = brms.get_data("BTdata", package="MCMCglmm")

bf_tarsus = (
    bf("tarsus ~ sex + (1|p|fosternest) + (1|q|dam)")
    + lf("sigma ~ 0 + sex")
    + skew_normal()
)

bf_back = (
    bf("back ~ s(hatchdate) + (1|p|fosternest) + (1|q|dam)")
    + gaussian()
)

fit = brms.brm(
    bf_tarsus + bf_back + set_rescor(False),
    data=df,
    chains=2,
    cores=2,
    control={"adapt_delta": 0.95},
    silent=2,
    refresh=0,
)

for resp in fit.idata.posterior_predictive.data_vars:
    print(resp)
    print(az.loo(fit.idata, var_name=resp))
    print()

R (brms):

library(brms)
library(MCMCglmm)
library(loo)

df <- get(data("BTdata", package = "MCMCglmm"))

bf_tarsus <- bf(
  tarsus ~ sex + (1 | p | fosternest) + (1 | q | dam)
) + lf(sigma ~ 0 + sex) + skew_normal()

bf_back <- bf(
  back ~ s(hatchdate) + (1 | p | fosternest) + (1 | q | dam)
) + gaussian()

fit <- brm(
  bf_tarsus + bf_back + set_rescor(FALSE),
  data = df,
  chains = 2,
  cores = 2,
  control = list(adapt_delta = 0.95)
)

loo_tarsus <- loo(fit, resp = "tarsus")
loo_back   <- loo(fit, resp = "back")

On the Python side, all the “interesting” analysis is intended to go through ArviZ using the InferenceData:

import arviz as az

az.plot_ppc(fit.idata, var_names=["tarsus"])
az.plot_ppc(fit.idata, var_names=["back"])

loo_tarsus = az.loo(fit.idata, var_name="tarsus")
loo_back   = az.loo(fit.idata, var_name="back")

Formula objects are just thin wrappers around real brms formula objects. The + operator calls an R function function(a, b) a + b and re-parses the result, so behaviour should match brms as closely as possible.


Runtime / installation changes

The runtime / installation layer got a full cleanup:

  • split into _config, _r_env, _platform, _install, _state, etc., so importing brmspy doesn’t mutate the R environment

  • activate_runtime() now checks a manifest and a system fingerprint before touching .libPaths() or cmdstan paths, and either completes or rolls back

  • brmspy.runtime.status() returns a dataclass with:

    • active runtime path and version
    • system fingerprint / toolchain info
    • prebuilt compatibility
    • installed versions of brms, cmdstanr, rstan
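
The "either completes or rolls back" behaviour can be pictured with a small, generic sketch. Everything here is hypothetical: the env dictionary stands in for the real R state (.libPaths() and the cmdstan path), and activate, transactional and the manifest keys are made-up names, not brmspy internals. The point is only the ordering: validate first, then mutate, and restore the snapshot if anything fails midway.

```python
from collections.abc import Iterator
from contextlib import contextmanager


@contextmanager
def transactional(state: dict) -> Iterator[dict]:
    """Restore `state` to its prior contents if the body raises."""
    snapshot = dict(state)
    try:
        yield state
    except Exception:
        state.clear()
        state.update(snapshot)
        raise


def activate(env: dict, manifest: dict, fingerprint: str) -> None:
    """Hypothetical activation: check compatibility, then switch paths."""
    # Refuse before touching anything if the runtime targets another system.
    if manifest.get("fingerprint") != fingerprint:
        raise RuntimeError("runtime was built for a different system")
    with transactional(env):
        env["lib_paths"] = [manifest["r_lib"]]
        env["cmdstan_path"] = manifest["cmdstan"]  # a KeyError here rolls back


env = {"lib_paths": ["/usr/lib/R/library"]}
broken_manifest = {"fingerprint": "linux-x86_64", "r_lib": "/opt/rt/Rlib"}
try:
    activate(env, broken_manifest, "linux-x86_64")
except KeyError:
    pass
print(env)  # env is unchanged: the partial update was rolled back
```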

There’s also a generic get_data() helper that loads example datasets from any installed R package, alongside the older get_brms_data().


Windows caveats

Windows still behaves differently:

  • Installing or reinstalling R packages / cmdstanr in the same embedded R session after they’ve been used is unreliable because DLLs stay locked and some packages don’t detach cleanly.

  • Because of that, automatic reuse of a previously used prebuilt runtime is disabled on Windows. The intended pattern there is:

    • start a fresh Python process
    • call activate_runtime() explicitly
    • run models, avoid doing heavy install work after the runtime has been used
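
One way to get the "fresh Python process" part from an existing pipeline is to launch the model run in a child interpreter. The sketch below uses only the standard library; run_in_fresh_process is a hypothetical helper, and the child script shown is a placeholder. In practice the child script would call activate_runtime() and fit the models, so no DLLs are held by the parent process.

```python
import subprocess
import sys


def run_in_fresh_process(code: str, timeout: float = 3600.0) -> str:
    """Run a Python snippet in a new interpreter and return its stdout."""
    result = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        timeout=timeout,
        check=True,  # raise CalledProcessError if the child fails
    )
    return result.stdout


# Placeholder child script; a real one would activate the runtime
# and run the models entirely inside the child process.
print(run_in_fresh_process("print('child ok')"))
```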

There is an optional install_rtools=True flag on install_brms() which runs the official Rtools installer, but it’s off by default.