CmdStanCache: caches Stan MCMC for quicker model iterations

Hi Python users of Stan,

I find this library quite useful so I thought I’d share it:

It caches Stan MCMC results so you can iterate on a model more quickly. It wraps CmdStanPy and recognises when the code or data is near-identical to a previous run (only comments or whitespace changed), in which case the cached result is returned.

Usage:

model = """
data {
  int N;
}
parameters {
  array[N] real<lower=-10.0, upper=10.0> x;
}
model {
  for (i in 1:N-1) {
    target += -2 * (100 * square(x[i+1] - square(x[i])) + square(1 - x[i]));
  }
}
"""
data = dict(N=2)

import cmdstancache

stan_variables, method_variables = cmdstancache.run_stan(
        model,
        data=data,
        # any other CmdStanModel.sample() parameters go here
        seed=42
)

Now comes the trick:

  • If you run this code twice, the second time the stored result is read.
  • If you add or modify a code comment, the same result is returned without having to resample.
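To make the idea concrete, here is a minimal sketch of this kind of caching. It is not the cmdstancache implementation: `normalize_code`, `cache_key` and `cached_run` are hypothetical names, the cache is just an in-memory dict, and the assumption is that the key is built from the normalized code, the data and the sampling arguments.

```python
import hashlib
import json
import re


def normalize_code(code):
    """Drop // comments and collapse whitespace.

    Simplified on purpose: it ignores /* */ and # comments and does not
    protect string literals.
    """
    without_comments = re.sub(r"//[^\n]*", "", code)
    return " ".join(without_comments.split())


def cache_key(code, data, **sample_kwargs):
    """Hash the normalized code together with the data and sampler arguments."""
    payload = json.dumps(
        {"code": normalize_code(code), "data": data, "kwargs": sample_kwargs},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


_cache = {}  # hypothetical in-memory store; a real cache would persist to disk


def cached_run(code, data, sampler, **sample_kwargs):
    """Call sampler(code, data, **sample_kwargs) only if the key is new."""
    key = cache_key(code, data, **sample_kwargs)
    if key not in _cache:
        _cache[key] = sampler(code, data, **sample_kwargs)
    return _cache[key]


if __name__ == "__main__":
    def slow_sampler(code, data, **kwargs):
        # Stand-in for the expensive MCMC run.
        print("sampling...")
        return {"x": [0.0] * data["N"]}

    stan = "parameters { real x; } model { x ~ normal(0, 1); }"
    cached_run(stan, {"N": 2}, slow_sampler, seed=42)               # samples
    cached_run(stan + " // a note", {"N": 2}, slow_sampler, seed=42)  # cache hit
```

Because the sampling arguments are part of the key in this sketch, changing any of them (the seed, for example) produces a new key and therefore a fresh run.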

More technical details here: cmdstancache · PyPI


Seems useful.
Normalizing a Stan model is tricky because there are several kinds of comments. It looks like this only handles the // style, not /* comment */ or # comment. The problem with the # style is that such a line can also be an #include directive (which is a problem of its own, since you then need to decide what to do if the #include-d file has changed!)
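For illustration, a rough sketch of a stripper that handles all three comment styles while leaving #include lines and string literals alone. This is not what cmdstancache does; `strip_stan_comments` is a hypothetical helper, and it assumes an #include directive starts its own line.

```python
import re

# Alternatives are tried left to right, so strings and #include lines
# are recognised before anything that looks like a comment.
_TOKEN = re.compile(
    r'''
    (?P<string>"[^"\n]*")                   # string literal, kept as-is
  | (?P<include>^[ \t]*\#include[^\n]*)     # include directive, kept as-is
  | (?P<block>/\*.*?\*/)                    # block comment
  | (?P<line>(?://|\#)[^\n]*)               # line comment, either style
    ''',
    re.VERBOSE | re.DOTALL | re.MULTILINE,
)


def strip_stan_comments(code):
    """Replace comments with a single space; keep strings and #include lines."""
    def repl(match):
        if match.group("string") is not None or match.group("include") is not None:
            return match.group(0)
        return " "
    return _TOKEN.sub(repl, code)


if __name__ == "__main__":
    example = (
        "data { int N; } // size\n"
        "/* multi\nline */\n"
        "#include helpers.stan\n"
        "# old style comment\n"
        'model { print("a // b"); }\n'
    )
    print(strip_stan_comments(example))
```

Replacing each comment with a space rather than nothing avoids accidentally gluing two tokens together, for example in `int/* c */N`.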

Yep, cmdstancache is staying on the safe side on that.

Included files aren’t considered at the moment. You can always force a rerun by incrementing the seed.
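If someone wanted to account for included files, one possible approach (purely a sketch, not a cmdstancache feature) would be to hash the contents of every #include-d file and mix that into the cache key. `include_fingerprint` below is a hypothetical helper; it assumes the includes resolve relative to a single directory and does not follow nested includes.

```python
import hashlib
import re
from pathlib import Path

# Matches lines such as:  #include helpers.stan   or   #include "helpers.stan"
_INCLUDE = re.compile(r'^[ \t]*#include[ \t]+"?([^"\n]+?)"?[ \t]*$', re.MULTILINE)


def include_fingerprint(code, include_dir="."):
    """Return a SHA-256 hex digest over the names and contents of included files."""
    digest = hashlib.sha256()
    for name in _INCLUDE.findall(code):
        path = Path(include_dir) / name
        digest.update(name.encode("utf-8"))
        digest.update(path.read_bytes())
    return digest.hexdigest()
```

Any edit to an included file then changes the fingerprint, so a key that incorporates it would no longer match the cached entry.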
