Non-model Stan code in production software?

patrickjmc · December 7, 2021, 4:52pm

I’m scoping a project where I need to compute functions of random variables, such as a sum of Student T-distributed variables with different parameters. The parameters come as given from a different process so I don’t actually need to fit a model from data.

What I actually want is to write a python program such that I can output these posteriors:

def get_support(t_mean_0, t_sd_0, t_mean_1, t_sd_1):
  # stan code
  #
  # python code to extract descriptions of samples
  return lower_bound, upper_bound, posterior_mean

Is (py) stan suitable for this use case, or would I be better off using something like PyMC3?

yizhang · December 7, 2021, 5:34pm

Maybe just use generated quantities based on fixed(your given) parameters?

patrickjmc · December 8, 2021, 2:29pm

That seems like the right approach to take.

I’ve had a difficult time finding examples of running just this without a data or a model block, since it seems to often be used for posterior predictive checks of fit models. Is there a good reference for a use case like mine?

yizhang · December 8, 2021, 4:54pm

See 15.3 Sampling without parameters | Stan Reference Manual.

patrickjmc · December 8, 2021, 9:44pm

Ah, so this reproduces a place where I’ve been getting stuck.

If I copy/paste from that example into my python file:

import stan

stancode = """
data {
  real<lower=0, upper=1> theta;
  int<lower=0> K;
  int<lower=0> N;
}
generated quantities {
  array[N] int<lower=0, upper=K> y;
  for (n in 1:N) {
    y[n] = binomial_rng(K, theta);
  }
}
"""
stan_model = stan.build(stancode, data={'theta':0.5,'K':2,'N':10})
fit = stan_model.sample(num_chains=4, num_samples=1000)
df = fit.to_frame()

I get errors:

Building: found in cache, done.
Sampling: 100%, done.
Messages received during sampling:
  Gradient evaluation took 7e-06 seconds
  1000 transitions using 10 leapfrog steps per transition would take 0.07 seconds.
  Adjust your expectations accordingly!
  Exception initializing step size.
  Posterior is improper. Please check your model.
  Gradient evaluation took 2e-06 seconds
  1000 transitions using 10 leapfrog steps per transition would take 0.02 seconds.
  Adjust your expectations accordingly!
  Exception initializing step size.
  Posterior is improper. Please check your model.
  Gradient evaluation took 1e-06 seconds
  1000 transitions using 10 leapfrog steps per transition would take 0.01 seconds.
  Adjust your expectations accordingly!
  Exception initializing step size.
  Posterior is improper. Please check your model.
  Gradient evaluation took 1e-06 seconds
  1000 transitions using 10 leapfrog steps per transition would take 0.01 seconds.
  Adjust your expectations accordingly!
  Exception initializing step size.
  Posterior is improper. Please check your model.
Traceback (most recent call last):
  File "/Users/pmccarthy/Desktop/pystan/student.py", line 17, in <module>
    fit = stan_model.sample(num_chains=4, num_samples=1000)
  File "/Users/pmccarthy/.pyenv/versions/pystan/lib/python3.9/site-packages/stan/model.py", line 89, in sample
    return self.hmc_nuts_diag_e_adapt(num_chains=num_chains, **kwargs)
  File "/Users/pmccarthy/.pyenv/versions/pystan/lib/python3.9/site-packages/stan/model.py", line 108, in hmc_nuts_diag_e_adapt
    return self._create_fit(function=function, num_chains=num_chains, **kwargs)
  File "/Users/pmccarthy/.pyenv/versions/pystan/lib/python3.9/site-packages/stan/model.py", line 311, in _create_fit
    return asyncio.run(go())
  File "/Users/pmccarthy/.pyenv/versions/3.9.6/lib/python3.9/asyncio/runners.py", line 44, in run
    return loop.run_until_complete(main)
  File "/Users/pmccarthy/.pyenv/versions/3.9.6/lib/python3.9/asyncio/base_events.py", line 642, in run_until_complete
    return future.result()
  File "/Users/pmccarthy/.pyenv/versions/pystan/lib/python3.9/site-packages/stan/model.py", line 293, in go
    fit = stan.fit.Fit(
  File "/Users/pmccarthy/.pyenv/versions/pystan/lib/python3.9/site-packages/stan/fit.py", line 95, in __init__
    assert draw_index == num_samples_saved
AssertionError

What am I doing wrong?

stijn · December 9, 2021, 10:21am

I am not a pystan user but from the docs I gather that you need stan_model.fixed_param instead of stan_model.sample.

That’s what the guide is referring to with this quote.

For this model, the sampler must be configured to use the fixed-parameters setting because there are no parameters. Without parameter sampling there is no need for adaptation and the number of warmup iterations should be set to zero

yizhang · December 9, 2021, 4:29pm

The model looks alright so it should work. I don’t know pystan but at least cmdstan will be able to see that the parameter block is missing and switch to fixed_param run. Perhaps in pystan you need to manually specify that?

mitzimorris · December 9, 2021, 5:54pm

or switch to CmdStanPy which will also figure out what’s going on and do the right thing.

patrickjmc · December 9, 2021, 6:25pm

@stijn is spot on, I changed the line to
fit = stan_model.fixed_param(num_chains=4, num_samples=1000)
and it works. Without reading all the pystan docs it wasn’t obvious from any examples that it would be a different call.

Thanks for everyone’s help!

Topic		Replies	Views
How to use a fitted stan model for production (get posterior predictive distributions) in python Interfaces rstan , posterior-predictive , python	1	48	October 14, 2024
Generated Quantities for prediction data General	6	1095	July 28, 2020
Running stan in R with only generated quantities RStan	7	2471	May 27, 2021
Bayesian inference in Stan Modeling	40	5807	July 18, 2017
Using Stan for MLE-like estimation? Modeling	5	1382	September 4, 2019

Non-model Stan code in production software?

Related topics