Having inference flow "one way" in a model with conditional sampling

I have a model in which one part describes the behavior of a system and a second part describes a measurement of that system. Consider this contrived example:

data {
  int<lower=0> n;
  array[n] int<lower=0, upper=1> X;  // direct observations of the system
  array[n] int<lower=0, upper=1> Y;  // measurements, mediated by meas
}

parameters {
  real<lower=0, upper=1> sys;  // system behavior
  real meas;                   // measurement quality
}

model {
  sys ~ beta(1, 1);
  meas ~ normal(1, 0.01);

  X ~ bernoulli(sys);
  Y ~ bernoulli_logit(sys * meas);
}

I want to use Stan to generate samples (\textrm{sys}_i, \textrm{meas}_i) not from the joint posterior, but rather such that \textrm{sys}_i \sim P(\textrm{sys} | X) and \textrm{meas}_i \sim P(\textrm{meas} | X, Y, \textrm{sys}_i).

This produces a “one way” flow of inference: X affects our estimate of \textrm{sys} and \textrm{meas}, but Y doesn’t affect our estimate of \textrm{sys}.

My reason for wanting this is that \textrm{meas} represents the quality of a measurement. I never want an unexpected Y measurement to change how I think the system behaves (represented by \textrm{sys}); I just want it to change the inferred quality of the measurement.

It’s easy to sample \textrm{sys}_i \sim P(\textrm{sys} | X) in Stan, but sampling \textrm{meas}_i \sim P(\textrm{meas} | Y, \textrm{sys}_i) would seem to require one MCMC run for each \textrm{sys}_i sample, which quickly becomes infeasible.

Is there a way to do this sort of conditional sampling in Stan?

Sorry this got missed before.

There’s no way to do this directly in a single Stan model. We are working on a feature that would enable this in certain cases; here’s the preliminary design doc: https://github.com/stan-dev/design-docs/blob/0035-pre-model-gqs/designs/0035-pre-model-gqs.md

What you’re asking for is known as “cut” in the Bayesian literature. It’s common in pharma models when you have a very good metabolic model (pharmacokinetics) but a crummy disease model (pharmacodynamics) and you don’t want information to flow from the disease model and mess up the metabolic parameters. See, e.g., Martyn Plummer’s paper (he’s the author of JAGS): https://statmath.wu.ac.at/research/talks/resources/Plummer_WU_2015.pdf

One way you can do this, albeit with a lot of computation, is to take draws

\textrm{sys}^{(n)} \sim p(\textrm{sys} \mid X)

for n = 1, \ldots, N. Then, for each n, use \textrm{sys}^{(n)} as data in a second model that samples \textrm{meas} conditional on it. If you combine the \textrm{meas} draws from all N runs, you get the inferences you want.
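
For the example above, the first stage is just the original model with the meas prior and the Y likelihood removed; the second stage moves sys from parameters to data. A minimal sketch of the second-stage program:

data {
  int<lower=0> n;
  array[n] int<lower=0, upper=1> Y;
  real<lower=0, upper=1> sys;  // one posterior draw sys^(n) from the first stage
}

parameters {
  real meas;
}

model {
  meas ~ normal(1, 0.01);
  Y ~ bernoulli_logit(sys * meas);  // sys is fixed, so Y only informs meas
}

Running this once per draw \textrm{sys}^{(n)} and pooling the resulting \textrm{meas} draws yields \textrm{meas}_i \sim P(\textrm{meas} \mid Y, \textrm{sys}_i) as described above.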

This brute-force approach is known as “multiple imputation” in statistics.

It’s yet another in the class of “not quite, but almost Bayesian” methods.

Really appreciate the response, Bob; this is super useful.

I have been tinkering with some external C++ to get a similar thing working for within-model bootstrapping, so that the (continuous) weights for the observations are regenerated in each draw. Not for any particular use case, just more along the lines of “can it be done”. Here is roughly how I approached it:
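
In outline, the Stan side looks something like the following (bb_weights is my own name for the external function; its C++ body is omitted here, and it gets compiled in by declaring the function without a definition and building with stanc’s --allow-undefined flag):

functions {
  // declared here, defined in external C++: returns a fresh
  // Dirichlet(1, ..., 1) draw each time it is called, so the
  // observation weights change from draw to draw
  vector bb_weights(int n);
}
data {
  int<lower=0> n;
  array[n] int<lower=0, upper=1> X;
}
parameters {
  real<lower=0, upper=1> sys;
}
model {
  vector[n] w = bb_weights(n);  // weights sum to 1; scale by n below
  sys ~ beta(1, 1);
  for (i in 1:n)
    target += n * w[i] * bernoulli_lpmf(X[i] | sys);  // weighted log likelihood
}

(Whether redrawing the weights inside the log density plays nicely with HMC is exactly the “can it be done” part.)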

The Bayesian bootstrap is easier; that one you can do within a single Stan program. I wrote a section of the User’s Guide on just this:

https://mc-stan.org/docs/stan-users-guide/bootstrap.html#bayesian-bootstrap-and-bagging
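
The core idea, in the spirit of that chapter (not its exact code), adapted to the X part of the model above: draw uniform Dirichlet weights once per run in transformed data and weight the log likelihood.

data {
  int<lower=0> n;
  array[n] int<lower=0, upper=1> X;
}
transformed data {
  // one Dirichlet(1, ..., 1) draw per run; the Bayesian bootstrap
  // replaces multinomial resampling with these continuous weights
  vector[n] w = dirichlet_rng(rep_vector(1, n));
}
parameters {
  real<lower=0, upper=1> sys;
}
model {
  sys ~ beta(1, 1);
  for (i in 1:n)
    target += n * w[i] * bernoulli_lpmf(X[i] | sys);  // weights average 1
}

Each run uses a single weight draw, so you rerun the program many times and pool the draws to get the bootstrap distribution.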

Yes, easier, but not exactly what I was aiming for. Maybe the approaches are more or less equivalent? (Though I think the Bayesian cut version I wrote would generally give wider posteriors.) In any case, continued discussion of the bootstrap in a Bayesian context is a bit off-topic for this thread.

I hope the suggested pre-model generated quantities feature will eventually be implemented! Maybe @scorbett can take some inspiration from my code for his use case. The C++ parts of my code were written with lots of back-and-forth with ChatGPT (I recommend giving it relevant documentation to work with).

Another approach would be the one used by @ajnafa and @andrewheiss: first a model for estimating sys, and then a second model that shoehorns those estimates in when estimating meas.
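
For the example here, that might look something like the following sketch (assuming the first model’s posterior for sys is summarized by a mean and standard deviation passed in as data; this only approximates the cut, since Y can still nudge sys within its plug-in prior):

data {
  int<lower=0> n;
  array[n] int<lower=0, upper=1> Y;
  real sys_mean;         // posterior mean of sys from the first model
  real<lower=0> sys_sd;  // posterior sd of sys from the first model
}
parameters {
  real<lower=0, upper=1> sys;
  real meas;
}
model {
  sys ~ normal(sys_mean, sys_sd);  // plug-in "prior" carrying the first model's posterior
  meas ~ normal(1, 0.01);
  Y ~ bernoulli_logit(sys * meas);
}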