Validity of Bayesian Stacking and Pseudo-BMA Weights for Complex Structural Models

I’m currently working with a series of complex structural models (hierarchical, implemented in Stan) that involve non-linear calculations and intricate dependencies between variables. While exploring model evaluation techniques, I came across Bayesian stacking and Pseudo-BMA weights. These methods seem appealing, but I’m wondering:

Are Bayesian stacking and Pseudo-BMA weights valid and reliable when applied to complex, non-linear models?

If anyone has experience or insights into their use in similar contexts, particularly in terms of assumptions or potential pitfalls, I’d greatly appreciate your input!

They are valid as long as the cross-validation computation works. The default PSIS-LOO computation may fail for very flexible models, but the Pareto k diagnostic will tell you if that is the case. Reliability is relative, and depending on your models it is possible that nothing is as reliable as you would like, but @yuling has used Bayesian stacking successfully for quite complex models, too. Based on a quick web search, there are also several R packages that use Bayesian stacking for quite complex models. Bayesian stacking is likely to give better results than Pseudo-BMA (as demonstrated in the Bayesian stacking paper).
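To make the difference between the two weighting schemes concrete, here is a minimal sketch in plain Python. It assumes you already have a matrix of pointwise log predictive densities (one row per observation, one column per model), for example from PSIS-LOO after checking the diagnostics. The function names `pseudo_bma_weights` and `stacking_weights` and the toy numbers are made up for illustration. The stacking weights are found with a simple EM-style fixed-point update, which is not necessarily how any particular package solves the optimization, and the Pseudo-BMA version shown is the plain one without Bayesian bootstrap.

```python
import math


def pseudo_bma_weights(lpd):
    """Plain Pseudo-BMA: softmax of the summed pointwise log densities.

    lpd[i][k] is the log predictive density of observation i under model k.
    """
    n_models = len(lpd[0])
    elpd = [sum(row[k] for row in lpd) for k in range(n_models)]
    top = max(elpd)  # subtract the maximum for numerical stability
    unnorm = [math.exp(e - top) for e in elpd]
    total = sum(unnorm)
    return [u / total for u in unnorm]


def stacking_weights(lpd, n_iter=2000):
    """Stacking: maximise sum_i log(sum_k w_k * exp(lpd[i][k])) over the simplex.

    Uses the EM update for mixture weights with fixed components
    (an illustrative choice of optimiser, assumed here for simplicity).
    """
    n_obs = len(lpd)
    n_models = len(lpd[0])
    # Rescale each row by its maximum; this shifts the objective by a
    # constant and leaves the optimal weights unchanged.
    dens = []
    for row in lpd:
        top = max(row)
        dens.append([math.exp(v - top) for v in row])
    w = [1.0 / n_models] * n_models
    for _ in range(n_iter):
        acc = [0.0] * n_models
        for row in dens:
            mix = sum(wk * d for wk, d in zip(w, row))
            for k in range(n_models):
                acc[k] += w[k] * row[k] / mix
        w = [a / n_obs for a in acc]
    return w


# Hypothetical toy data: model 0 predicts 6 observations well,
# model 1 predicts the other 4 well.
toy_lpd = [[-1.0, -3.0]] * 6 + [[-3.0, -1.0]] * 4

bma = pseudo_bma_weights(toy_lpd)
stack = stacking_weights(toy_lpd)
print("Pseudo-BMA weights:", bma)
print("Stacking weights:  ", stack)
```

On this toy input Pseudo-BMA puts roughly 0.98 of the weight on model 0 because its total score is higher, while stacking gives it roughly 0.63 and keeps a substantial weight on model 1, since the combination predicts the four remaining observations better than model 0 alone.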