I am working on a project that is modeling life expectancies for several countries and years.
We are trying to assess, using a Bayesian approach, the model uncertainty of an estimate (e.g., the effect of GDP on life expectancy) — not only with respect to the specification within a particular class of model (say, OLS) but also with respect to the choice among different model classes (e.g., OLS versus Poisson when predicting mortality rates).
For instance, I run an OLS model and a Poisson model, predict values, and estimate the first difference for a change in the key independent variable. I get a distribution of those differences from each model.
I could combine those distributions, weighting them by a performance measure. Does that make sense? It would be an ensemble process, but focused not on prediction per se but on the first differences for the variable of interest.
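To make the idea concrete, here is a minimal sketch of combining the two first-difference posteriors by resampling draws in proportion to performance-based weights. The draws and the weights are entirely illustrative (in practice the draws would come from posterior simulation and the weights from a predictive score or stacking):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical posterior draws of the first difference from each model
# (in practice, obtained by posterior predictive simulation).
fd_ols = rng.normal(loc=1.2, scale=0.3, size=4000)
fd_poisson = rng.normal(loc=0.9, scale=0.4, size=4000)

# Hypothetical performance-based weights; they must sum to 1.
weights = np.array([0.6, 0.4])

# Combine the two posteriors by resampling draws from each model
# in proportion to its weight.
n_total = 4000
counts = rng.multinomial(n_total, weights)
combined = np.concatenate([
    rng.choice(fd_ols, size=counts[0], replace=True),
    rng.choice(fd_poisson, size=counts[1], replace=True),
])

# Summarize the mixed first-difference distribution.
print(combined.mean(), np.percentile(combined, [2.5, 97.5]))
```

The resulting mixture reflects both within-model posterior uncertainty and the between-model disagreement in the first differences.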
Does anyone have experience trying to do something like this?
Sorry, my question was not clear. What we are trying to do is some kind of model averaging using different classes of models (linear, non-linear, Box-Cox). I haven't seen applications of that kind of model averaging or stacking using a Bayesian approach.
We don't know how to combine these models because they transform the dependent variable (e.g., Box-Cox), so using WAIC/LOO wouldn't seem appropriate. We can always recover the original scale of the data to assess model GOF and predictive error using MSE, but we are not sure how to weight the models.
In machine learning, for instance, one strategy is to use the predictions of the individual models as predictors of the outcome and obtain weights via linear regression or random forests. But we would like to get an idea of model uncertainty, and our dataset is not big (~200 observations, countries by year).
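A minimal sketch of that machine-learning stacking recipe, under made-up data: regress the outcome on the models' (ideally out-of-sample) predictions with non-negative weights, then normalize to a convex combination. All names and numbers here are illustrative:

```python
import numpy as np
from scipy.optimize import nnls

rng = np.random.default_rng(1)

# Hypothetical setup: ~200 observations (countries by year), an outcome y,
# and out-of-sample predictions from two candidate models.
n = 200
y = rng.normal(70.0, 5.0, size=n)             # e.g., life expectancy
pred_ols = y + rng.normal(0.0, 1.0, size=n)   # predictions from model 1
pred_pois = y + rng.normal(0.5, 1.5, size=n)  # predictions from model 2

# Stacking: non-negative least squares on the stacked predictions,
# then normalize the weights to sum to 1.
X = np.column_stack([pred_ols, pred_pois])
w, _ = nnls(X, y)
w = w / w.sum()
print(w)
```

This gives point-estimate weights only; it does not by itself propagate model uncertainty, which is why the Bayesian stacking route discussed below is attractive for a small dataset.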
LOO (and WAIC, though there is no need to use WAIC if you can do PSIS-LOO) is appropriate if you take the Jacobian of that transformation into account when computing the log predictive density. Once you have the LOO log score, you can do Bayesian stacking as described in the above-mentioned article.
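To illustrate the Jacobian adjustment: if the model is fit to the Box-Cox transform z = (y^λ − 1)/λ, then dz/dy = y^(λ−1), so the log density on the original scale is the transformed-scale log density plus (λ − 1) log y. A small sketch with made-up values (the normal model and λ = 0.5 are purely illustrative):

```python
import numpy as np
from scipy.stats import norm

def boxcox(y, lam):
    """Box-Cox transform for lam != 0."""
    return (y**lam - 1.0) / lam

def logp_original_scale(y, lam, logp_z):
    """Transformed-scale log density plus the log-Jacobian
    (lam - 1) * log(y) of the Box-Cox transform."""
    return logp_z + (lam - 1.0) * np.log(y)

# Toy check with a normal model on the transformed scale.
lam = 0.5
y = np.array([50.0, 70.0, 80.0])
z = boxcox(y, lam)
logp_z = norm.logpdf(z, loc=z.mean(), scale=z.std())
logp_y = logp_original_scale(y, lam, logp_z)
print(logp_y)
```

Computing the pointwise log-likelihood this way, on the common original scale, makes PSIS-LOO scores from models with different outcome transformations comparable, and those scores can then feed the stacking-weight optimization.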