Fully Bayesian Bootstrap

I came across an excellent blog post by @andrewheiss, which builds on work by @ajnafa. You can find it here: https://www.andrewheiss.com/blog/2021/12/20/fully-bayesian-ate-iptw/.

Their approach to fully Bayesian inverse probability weighting (IPW) consists of two main steps. First, they fit a weight model on its own to generate a posterior distribution of weights. Then, while sampling the outcome model, they incorporate this uncertainty by assigning each draw of the outcome model a corresponding row of the posterior weight matrix.
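In stripped-down form, I understand the structure like this (a minimal R sketch; the normal approximation to the weight model's posterior and the weighted difference in means are my stand-ins for the actual Stan models in the post):

```r
set.seed(1)

## Simulated observational data: confounder x, treatment z, outcome y
n <- 500
x <- rnorm(n)
z <- rbinom(n, 1, plogis(0.8 * x))
y <- 1 + 2 * z + 1.5 * x + rnorm(n)

## Step 1: weight model -> posterior draws of the weights. A normal
## approximation around the logistic MLE stands in for the real
## posterior (the blog post fits this model in Stan).
fit_w <- glm(z ~ x, family = binomial())
beta_draws <- MASS::mvrnorm(1000, coef(fit_w), vcov(fit_w))

## Step 2: each outcome-model draw gets its own row of weights. A
## weighted difference in means stands in for the outcome model.
X <- cbind(1, x)
ate_draws <- apply(beta_draws, 1, function(beta) {
  p <- drop(plogis(X %*% beta))            # propensity scores, this draw
  w <- ifelse(z == 1, 1 / p, 1 / (1 - p))  # IPT weights, this draw
  weighted.mean(y[z == 1], w[z == 1]) - weighted.mean(y[z == 0], w[z == 0])
})
c(mean = mean(ate_draws), sd = sd(ate_draws))
```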

Reading this reminded me of the Bayesian bootstrap, also known as the fractional-random-weight bootstrap. In contrast to the "ordinary" bootstrap, the Bayesian bootstrap uses continuous weights that sum to N: typically a vector is sampled from a Dirichlet distribution with a concentration parameter alpha of all ones and then scaled by N. To test the idea, I ran it on the Bearing Cage field-failure data, which is also used in Xu et al.'s (2020) paper. Interestingly, I obtained nearly identical precision to what is reported there.
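Here is a minimal sketch of what I mean, on a toy weighted-mean problem rather than the censored Weibull model actually used for the bearing-cage data:

```r
set.seed(2)

n <- 50
y <- rexp(n, rate = 0.1)  # toy data; the real example is a censored Weibull fit

## Fractional weights: N * Dirichlet(1, ..., 1), drawn by normalizing
## unit-rate gamma (i.e. exponential) variates
rdirw <- function(n) {
  g <- rgamma(n, shape = 1)
  n * g / sum(g)  # strictly positive, sums to n
}

## Fractional-random-weight bootstrap distribution of the weighted mean
boot_means <- replicate(4000, weighted.mean(y, rdirw(n)))
quantile(boot_means, c(0.025, 0.975))  # interval comparable to a posterior CI
```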

To be honest, I find this quite fascinating, but my main question is: would this result be considered novel, or is it trivial?

A slightly different approach to bootstrapping is documented in the repository for Paananen et al.'s (2021) work, in a case study titled "Importance Weighted Moment Matching for Fast Bootstrapping." Note that the case study only considers the "ordinary" bootstrap and does not explore fractional weights; when I experimented with the method in the past, it also worked well with fractional weights. You can find it here: https://htmlpreview.github.io/?https://github.com/topipa/iter-mm-paper/blob/master/case_studies/IWMM_BS.html.
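The underlying trick, stripped of the moment-matching correction, is that a bootstrap weight vector w only rescales the pointwise log-likelihood terms, so draws from the original (unweighted) posterior can be importance-weighted with log ratios sum_i (w_i - 1) * log p(y_i | theta). A minimal sketch for a conjugate normal-mean model, where the posterior can be sampled directly:

```r
set.seed(3)

## Normal model with known sigma and a flat prior: the posterior of mu
## is Normal(ybar, sigma^2 / n), so we can sample it directly.
n <- 40; sigma <- 1
y <- rnorm(n, mean = 2, sd = sigma)
mu_draws <- rnorm(2000, mean(y), sigma / sqrt(n))

## Pointwise log-likelihood matrix, draws x observations
ll <- sapply(y, function(yi) dnorm(yi, mu_draws, sigma, log = TRUE))

## One fractional bootstrap replicate via importance weighting:
## log r(theta) = sum_i (w_i - 1) * log p(y_i | theta)
g <- rgamma(n, 1); w <- n * g / sum(g)
log_r <- drop(ll %*% (w - 1))
r <- exp(log_r - max(log_r)); r <- r / sum(r)  # self-normalized ratios

## Importance-weighted summaries of mu under this bootstrap replicate
mu_b <- sum(r * mu_draws)
c(mean = mu_b, sd = sqrt(sum(r * (mu_draws - mu_b)^2)))
```

Raw ratios like these degrade as the weights get extreme, which is exactly what the Pareto smoothing and moment matching in the case study address.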

References:

Paananen, T., Piironen, J., Bürkner, P.-C., & Vehtari, A. (2021). Implicitly adaptive importance sampling. Statistics and Computing, 31, 16.

Xu, L., Gotwalt, C., Hong, Y., King, C. B., & Meeker, W. Q. (2020). Applications of the fractional-random-weight bootstrap. The American Statistician, 74(4), 345–358.


The approach illustrated in that blog post was my first attempt at tackling Bayesian estimation of IPTW. Unfortunately, it doesn't scale well and runs into sampling issues pretty easily. I developed a more scalable solution that samples the weights at the design stage and only requires passing their location and scale as input data. You can find the working paper, code, and simulations here: https://github.com/ajnafa/Latent-Bayesian-MSM.
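In very rough outline (a simplified sketch, not the actual implementation in the repo), the design stage yields a per-unit location and scale for the log weights, and the outcome stage regenerates the weights from just those two vectors instead of carrying a full draws-by-N weight matrix:

```r
set.seed(4)

n <- 200
## Hypothetical design-stage output: posterior location and scale of
## each unit's log IPT weight (placeholder numbers for illustration)
logw_loc <- rnorm(n, 0, 0.5)
logw_scale <- rep(0.2, n)

## At each iteration of the outcome stage, the weights are regenerated
## from the two summary vectors rather than looked up in a stored matrix
one_iteration_weights <- function() exp(rnorm(n, logw_loc, logw_scale))
summary(one_iteration_weights())
```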


Thank you for sharing your insights and providing the link to your additional work!

I completely understand how the scaling issue can arise, especially given how quickly the weight matrix grows. In the case of bootstrapping, it seems reasonable to sample the weights within each draw, since they are independent of everything else in the model. However, Stan's built-in Dirichlet RNG is not allowed in the model block, so the parser would complain loudly; one would have to rely on the external C++ trick instead. It would be interesting to experiment further with the sampling issues and to explore different bootstrapping schemes, such as the clustered bootstrap.
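For the clustered case, the scheme I have in mind (a sketch, nothing I have validated yet) draws the Dirichlet weights at the cluster level and assigns each observation its cluster's weight, so whole clusters get up- or down-weighted together:

```r
set.seed(5)

n_clusters <- 10
cluster_id <- sample(n_clusters, 200, replace = TRUE)

## Fractional weights at the cluster level: K * Dirichlet(1, ..., 1)
g <- rgamma(n_clusters, shape = 1)
w_cluster <- n_clusters * g / sum(g)

## Each observation inherits its cluster's weight
w_obs <- w_cluster[cluster_id]
tapply(w_obs, cluster_id, unique)  # constant within each cluster
```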

I intend to further investigate this topic when I have the opportunity.


The bootstrapping angle here is quite interesting. I'm not aware of any existing approaches that handle the weights via a C++ routine similar to the one we originally used for the posterior IPTW weights in @andrewheiss's blog post.

I use the fractional-weights approach to the Bayesian bootstrap fairly often when working with the g-formula for estimating PATEs, as in this example, but I haven't experimented with it much beyond that. I'd be interested to see how one could generalize it to time-series or hierarchical contexts; the last time I tried doing that in Stan, I found it less than straightforward. In stripped-down form, the combination looks roughly like the sketch below.
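A minimal sketch, with lm() standing in for the Bayesian outcome model (so the Dirichlet weights here only capture uncertainty in the covariate distribution, not in the model parameters):

```r
set.seed(6)

n <- 300
x <- rnorm(n)
z <- rbinom(n, 1, plogis(0.5 * x))
y <- 1 + 2 * z + z * x + 1.5 * x + rnorm(n)  # treatment effect varies with x
dat <- data.frame(y, z, x)

fit <- lm(y ~ z * x, data = dat)  # stand-in for the Bayesian outcome model

## g-formula: predicted outcomes for every unit under z = 1 and z = 0
mu1 <- predict(fit, transform(dat, z = 1))
mu0 <- predict(fit, transform(dat, z = 0))

## Bayesian bootstrap over the empirical covariate distribution: each
## Dirichlet(1, ..., 1) draw reweights the standardization step
pate_draws <- replicate(2000, {
  g <- rgamma(n, 1)
  sum((g / sum(g)) * (mu1 - mu0))
})
quantile(pate_draws, c(0.025, 0.5, 0.975))
```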