When it's necessary: how to do model cutting right? (reference to a statmodeling post)

Premise

In the real world, starting to model from data plus its uncertainty is much better than using the data alone (uncertainty-free), for example when the inference has been done by third parties, or when joint modelling is not possible or really impractical.

I really enjoyed the post https://statmodeling.stat.columbia.edu/2020/01/08/how-to-cut-using-stan-if-you-must/

In particular, this comment by Ben:

We cut in all models, at some level. No way we have good, probabilistic models all the way down to the sensor level. At some point when we take measurements, we write down numbers — those are cuts.

And this one by Pierre E. Jacob:

Your suggested first solution to approximate the cut distribution (“basically, you’d first fit model 1 and get posterior simulations, then approx those simulations by a mixture of multivariate normal or t distributions, then use that as a prior for model 2”) would not approximate the cut distribution, if I understand it correctly. This would approximate the standard posterior distribution, with an error coming from a mixture being an imperfect representation of the first posterior. It would still be standard Bayesian inference since the parameter on which you put a prior gets updated in the second stage.
The point of the "cut" is that some marginal distributions would not get updated. A long-and-dirty way (but conceptually simple) of getting an approximation of the cut is as follows: perform the 'second stage' posterior simulation, conditional on many independent draws from the 'first stage' posterior, and finally put all the draws together.
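To make that concrete, here is a minimal rstan sketch of that long-and-dirty approach. Everything here is hypothetical: stage1.stan infers a parameter phi from data y1, stage2.stan declares phi as data and infers theta from data y2.

```r
library(rstan)

# Hypothetical two-stage setup: stage1.stan infers phi from y1;
# stage2.stan takes phi as fixed data and infers theta from y2.
m1 <- stan_model("stage1.stan")
m2 <- stan_model("stage2.stan")

fit1 <- sampling(m1, data = list(N1 = length(y1), y1 = y1))
phi_draws <- extract(fit1)$phi  # one stage-1 draw of phi per element

# Cut: run the second stage conditional on many independent
# stage-1 draws of phi, then pool all the resulting theta draws.
idx <- sample(seq_along(phi_draws), 100)
theta_cut <- unlist(lapply(idx, function(i) {
  fit2 <- sampling(m2,
                   data = list(N2 = length(y2), y2 = y2,
                               phi = phi_draws[i]),
                   chains = 1, iter = 600, warmup = 500, refresh = 0)
  extract(fit2)$theta
}))
# theta_cut, pooled across stage-1 draws, approximates the cut
# distribution: phi is never updated by the second-stage likelihood.
```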

Question

If the second model is really simple and works perfectly with optimizing from rstan, is it an OK solution to run the model once for each draw from the previous model (or for each data-uncertainty combination) and merge the point estimates to obtain a pseudo-posterior distribution?
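Concretely, I mean something like this sketch (reusing the hypothetical models and stage-1 draws from above):

```r
# Proposed shortcut: one point estimate of theta per stage-1 draw,
# pooled into a pseudo-posterior. optimizing() returns the (penalized)
# maximum likelihood estimate as a named vector in $par.
idx <- sample(seq_along(phi_draws), 500)
theta_hat <- sapply(idx, function(i) {
  opt <- optimizing(m2, data = list(N2 = length(y2), y2 = y2,
                                    phi = phi_draws[i]))
  opt$par["theta"]
})
hist(theta_hat)  # spread comes only from stage-1 uncertainty;
                 # stage-2 posterior uncertainty is collapsed to a point
```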

Thanks a lot


I am far from an expert on this, but my gut instinct is that the proposed method (optimizing the second model for each posterior sample of the first model) would usually be somewhat better than just fitting something like the second model directly, and sometimes possibly better than just fitting something like the first model directly. IMHO it would very likely be inferior to the approximation by Pierre E. Jacob (or the neural-net approximation mentioned elsewhere), and certainly inferior to the full model.

I am also quite sure that in unfavorable conditions the method would break terribly (e.g. when the second model is very poorly identified, so a lot of uncertainty would get ignored by running optimizing). I would not use it unless I absolutely had to.

I would expect @betanalpha to vehemently object to the proposed method and share some geometric insight into why it has problems :-)

