Hello, I’m an environmental engineer rather than a statistician. Does anyone have experience with, or advice on, metamodeling for Bayesian inference? I think there is a broad use case in the engineering profession, though I have not found any examples. A search for “metamodeling” turns up many examples aimed at computational efficiency, but none aimed at Bayesian inference. I’ll briefly explain a hypothetical example and outline some potential solutions.

As a hypothetical example, think of a highly non-linear USGS or Army Corps model that has been developed for a specific purpose, say groundwater modeling. It would be impractical to recode the model in Stan, although the model could be run repeatedly over a range of input parameters. I’d like to build a deterministic metamodel from the resulting input-output matrix and then use it to fit real-world data (i.e., condition on observed outputs to estimate the input parameters).
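To make the input-output matrix concrete, here is a rough sketch in Python of how I imagine generating it. Everything named here is my own placeholder: `simulator` is a toy function standing in for the real external model, `latin_hypercube` is a hand-rolled space-filling design, and the parameter bounds are made up.

```python
import numpy as np

def simulator(theta):
    # Hypothetical stand-in for the external model: 2 inputs -> 2 outputs.
    a, b = theta
    return np.array([np.sin(2.0 * a) + b, a * np.exp(b)])

def latin_hypercube(n, bounds, rng):
    # One sample per stratum in every dimension, shuffled independently.
    bounds = np.asarray(bounds, dtype=float)
    d = bounds.shape[0]
    strata = np.column_stack([rng.permutation(n) for _ in range(d)])
    u = (strata + rng.random((n, d))) / n          # points in [0, 1)^d
    return bounds[:, 0] + u * (bounds[:, 1] - bounds[:, 0])

rng = np.random.default_rng(0)
bounds = [(0.0, 1.0), (0.0, 1.0)]                  # assumed parameter ranges
X = latin_hypercube(50, bounds, rng)               # input matrix, one run per row
Y = np.array([simulator(x) for x in X])            # matching output matrix
```

In practice each row of `X` would be written to the model's input files and each row of `Y` read back from its output, so the loop would be replaced by whatever batch mechanism the real model supports.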

So the task is to identify a metamodel representation that is both faithful to the original model and explorable by MCMC. A linear regression model would be straightforward to implement, but would not capture the non-linearities well enough for my purpose. A latent Gaussian process model could work, although I could use some advice on appropriate kernels for emulating a deterministic model when the data density varies across a multidimensional input space. I’ve considered k-nearest neighbors, decision trees and linear interpolation, but I’m assuming that the discontinuities and kinks they introduce would be difficult for MCMC.
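As a rough illustration of the Gaussian process option, here is a minimal sketch in Python with NumPy only. It rests on several assumptions of mine: `simulator` is a toy stand-in for the real model, the kernel is a squared exponential with hand-picked (not fitted) hyperparameters, only the GP posterior mean is used as the deterministic metamodel, the prior is uniform on the unit square, the noise level is known, and a plain random-walk Metropolis sampler takes the place of Stan.

```python
import numpy as np

def simulator(theta):
    # Hypothetical stand-in for the external model: 2 inputs -> 2 outputs.
    a, b = theta
    return np.array([np.sin(2.0 * a) + b, a * np.exp(b)])

def sq_exp_kernel(A, B, lengthscale, variance):
    # Squared-exponential kernel between the rows of A and the rows of B.
    diff = (A[:, None, :] - B[None, :, :]) / lengthscale
    return variance * np.exp(-0.5 * np.sum(diff**2, axis=-1))

class GPEmulator:
    """GP posterior mean with fixed hyperparameters, used as a smooth surrogate."""
    def __init__(self, X, Y, lengthscale=0.5, variance=1.0, jitter=1e-6):
        self.X, self.lengthscale, self.variance = X, lengthscale, variance
        self.mean = Y.mean(axis=0)
        K = sq_exp_kernel(X, X, lengthscale, variance) + jitter * np.eye(len(X))
        self.alpha = np.linalg.solve(K, Y - self.mean)

    def predict(self, theta):
        Ks = sq_exp_kernel(np.atleast_2d(theta), self.X,
                           self.lengthscale, self.variance)
        return (self.mean + Ks @ self.alpha)[0]

# Input-output matrix from an 8 x 8 grid of simulator runs.
g = np.linspace(0.0, 1.0, 8)
X = np.array([(a, b) for a in g for b in g])
Y = np.array([simulator(x) for x in X])
emulator = GPEmulator(X, Y)

# Synthetic "field data": 20 noisy observations at a known parameter value.
rng = np.random.default_rng(1)
theta_true = np.array([0.3, 0.6])
sigma = 0.05
y_obs = simulator(theta_true) + sigma * rng.standard_normal((20, 2))

def log_post(theta):
    # Uniform prior on [0, 1]^2 plus Gaussian likelihood through the emulator.
    if np.any(theta < 0.0) or np.any(theta > 1.0):
        return -np.inf
    resid = y_obs - emulator.predict(theta)
    return -0.5 * np.sum(resid**2) / sigma**2

# Random-walk Metropolis, started at the design point that best matches the data.
theta = X[np.argmin(np.sum((Y - y_obs.mean(axis=0))**2, axis=1))]
lp = log_post(theta)
chain = np.empty((5000, 2))
for i in range(len(chain)):
    prop = theta + 0.01 * rng.standard_normal(2)
    lp_prop = log_post(prop)
    if np.log(rng.random()) < lp_prop - lp:
        theta, lp = prop, lp_prop
    chain[i] = theta

post_mean = chain[1000:].mean(axis=0)   # discard the first 1000 draws as burn-in
```

Because the surrogate is smooth and differentiable, the same posterior could in principle be written in Stan by passing `X` and the precomputed `alpha` weights in as data. The sketch ignores emulator uncertainty and does not address kernel choice under uneven data density, which is the part I am least sure about.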

Hopefully this is a general use case that could benefit others. Thoughts and advice are welcome, thanks!