Averaging an estimate across mis-specified models

James_Savage · March 23, 2018, 8:16pm

Hey, dumb question:

I have average treatment effects which are a function of model j's parameters theta_j and covariates X. Say there are three models: ATE_1 = f(theta_1, X); ATE_2 = g(theta_2, X); ATE_3 = h(theta_3, X). Each model is mis-specified in a relatively benign way: the precise non-linear relationship between X and some outcome y is being modeled incorrectly. There are no unobserved variables; OLS will consistently estimate the ATE but won’t be very (statistically) efficient.

The idea is that the three candidate models all capture a part of the troublesome non-linearity, but not all aspects.

Now say I’ve fit the three models separately and have posteriors for the three average treatment effects. That is, I’ve not fit this as a mixture. Let’s say I have some probability that each model is correct (say, based on relative likelihood or Bayes factor or something). What I want is a posterior for my weighted (across models) average treatment effect.

Is it possible to obtain a weighted average treatment effect by creating new “draws” of the weighted treatment effect across all MCMC draws from the three models? My intuition is that this won’t work, as there is no correlation in the MCMC draws across the three models. But I want to just check I’m not overthinking things. I’ve been doing some simulation tests, and it does seem to be consistent. But I’m concerned that I’m fooling myself about its statistical efficiency because I’m averaging across uncorrelated draws (within draw, across models).

Cheers,
Jim
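
One way to write out the two constructions in the question, as a sketch only. It assumes model weights $w_k$ that sum to one, posterior means $\mu_k$ and variances $\sigma_k^2$ for each model's ATE, and draws that are independent across models; none of this notation is from the original post.

$$
\begin{aligned}
\text{per-draw average:}\quad & \widetilde{\mathrm{ATE}}^{(s)} = \sum_{k=1}^{3} w_k\,\mathrm{ATE}_k^{(s)}, &
\operatorname{Var} &= \sum_{k} w_k^2\,\sigma_k^2, \\
\text{mixture:}\quad & p(\mathrm{ATE}\mid y) = \sum_{k=1}^{3} w_k\,p(\mathrm{ATE}_k \mid y), &
\operatorname{Var} &= \sum_{k} w_k\,\sigma_k^2 + \sum_{k} w_k\,(\mu_k - \bar\mu)^2,
\end{aligned}
\qquad \bar\mu = \sum_{k} w_k\,\mu_k .
$$

Both have mean $\bar\mu$, which would be consistent with the simulations looking fine on average. The per-draw average has the smaller variance, because $w_k^2 \le w_k$ and the between-model term is missing, so under these assumptions it would understate the uncertainty relative to the mixture.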

bgoodri · March 23, 2018, 8:25pm

https://projecteuclid.org/euclid.ba/1516093227

anon75146577 · March 23, 2018, 8:35pm

I’m not sure if stacking as we implemented it will work here (although it may well - I genuinely have no idea). In particular, we need to stack predictive distributions. Why? Because we use LOO to compute the score function that we optimize to find the weights.

Now, I guess you could then use this stacked predictive distribution to compute a new ATE, but I’m not sure if that’s what you want. @avehtari: Do you have any thoughts?

avehtari · March 23, 2018, 9:55pm

Bayesian stacking as described in our paper will give you weights that optimize the expected predictive performance of the weighted posterior predictive distributions. Given the weights, you can think of it as having a mixture model, and thus you can easily compute any predictive quantities. You can install loo 2.0 with the stacking_weights function from https://github.com/stan-dev/loo/tree/new-psis (I’ll finish a vignette for pseudo-BMA and stacking weights tomorrow).

I don’t understand your worry about not having correlation, so there is a possibility that you want something else.
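
A minimal Python sketch of the "think of it as a mixture" idea above, under stated assumptions: the posterior draws and the weights are simulated placeholders (in practice the weights would come from something like the stacking_weights function mentioned above), and `mixture_draws` is a hypothetical helper, not part of loo. It picks a model for each output draw with probability equal to its weight, then takes a draw from that model's posterior, and compares the result with a per-draw weighted average.

```python
import numpy as np


def mixture_draws(draws_by_model, weights, n_out, rng):
    """Resample draws from a weighted mixture of per-model posteriors.

    Hypothetical helper for illustration only.
    """
    weights = np.asarray(weights, dtype=float)
    if len(weights) != len(draws_by_model):
        raise ValueError("need one weight per model")
    if not np.isclose(weights.sum(), 1.0):
        raise ValueError("weights must sum to 1")
    # Choose which model each output draw comes from.
    model_idx = rng.choice(len(draws_by_model), size=n_out, p=weights)
    out = np.empty(n_out)
    for k, draws in enumerate(draws_by_model):
        mask = model_idx == k
        # Resample (with replacement) from model k's posterior draws.
        out[mask] = rng.choice(np.asarray(draws), size=int(mask.sum()), replace=True)
    return out


rng = np.random.default_rng(0)

# Placeholder ATE posteriors for three models (assumed values, not real fits).
ate_draws = [
    rng.normal(1.0, 0.20, 4000),
    rng.normal(1.5, 0.30, 4000),
    rng.normal(0.8, 0.25, 4000),
]
weights = [0.5, 0.3, 0.2]  # assumed model weights

mix = mixture_draws(ate_draws, weights, n_out=20000, rng=rng)

# Per-draw weighted average across models, for comparison.
per_draw_avg = sum(w * d for w, d in zip(weights, ate_draws))

print("mixture:      mean %.3f  sd %.3f" % (mix.mean(), mix.std()))
print("per-draw avg: mean %.3f  sd %.3f" % (per_draw_avg.mean(), per_draw_avg.std()))
```

With these placeholder inputs the two summaries should agree on the mean while the mixture has the wider spread, since it keeps the disagreement between models as part of the uncertainty. Whether the mixture is the quantity actually wanted here is the open question in the thread.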
