Model uncertainty using Stan

sdaza · January 18, 2018, 4:37pm

Hello all,

I am working on a project that is modeling life expectancies for several countries and years.

We are trying to use a Bayesian approach to assess model uncertainty in an estimate (e.g., the effect of GDP on life expectancy). The uncertainty we care about comes not only from the specification within a particular class of model (let’s say OLS) but also from the choice among different model classes (e.g., OLS versus Poisson when predicting mortality rates).

For instance, I run an OLS and Poisson model. I predict values and estimate the first difference of a change in the key independent variable. I get distributions for those differences.

I could combine those distributions weighting them by a performance measure. Does that make sense? It would be an ensemble process, but not focused on prediction per se but on the first differences for the variable of interest.
Does anyone have experience trying to do something like this?

Any ideas would be much appreciated!

Thank you so much,
Sebastian

Bob_Carpenter · February 6, 2018, 2:23am

By “OLS” do you mean linear regression? Usually optimization with least squares doesn’t give you uncertainty.

I’m not sure what you mean here.

You can, in general, define derived quantities in terms of other random variables in a Stan model and get posterior uncertainty.
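
As a rough illustration of such a derived quantity, a first difference can be computed draw by draw, which yields posterior draws of the difference itself. This is only a sketch: the draws below are simulated stand-ins for posterior draws (in practice they would come from a fitted model, for example via rstan::extract), and the names alpha, beta, x0 and delta are hypothetical.

```r
# Simulated stand-ins for posterior draws of a Poisson (log-link) model.
# In practice these would come from the fitted model.
set.seed(1)
n_draws <- 4000
alpha <- rnorm(n_draws, mean = -4.0, sd = 0.05)  # intercept draws (hypothetical)
beta  <- rnorm(n_draws, mean = -0.3, sd = 0.02)  # GDP coefficient draws (hypothetical)

x0    <- 1.0  # baseline value of the predictor (hypothetical)
delta <- 0.5  # size of the change in the predictor (hypothetical)

# First difference of the expected rate, computed once per draw,
# so the result is itself a set of posterior draws.
rate_lo <- exp(alpha + beta * x0)
rate_hi <- exp(alpha + beta * (x0 + delta))
first_diff <- rate_hi - rate_lo

# Posterior summary of the derived quantity.
fd_summary <- quantile(first_diff, probs = c(0.025, 0.5, 0.975))
print(fd_summary)
```

The same calculation can be written in a Stan generated quantities block, in which case the draws of the difference come out of the fit directly.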

sdaza · February 6, 2018, 3:18pm

Thanks for your answer, Bob.

OLS = linear regression.

Sorry, my question was not clear. What we are trying to do is some kind of model averaging using different classes of models (linear, non-linear, Box-Cox). I haven’t seen applications of that kind of model averaging or stacking using a Bayesian approach.

We don’t know how to combine these models because they transform the dependent variable (e.g., Box-Cox), so using WAIC/LOO wouldn’t be appropriate. We can always recover the original scale of the data to assess model GOF and predictive error using MSE, but we are not sure how to weight the models.

In Machine Learning, for instance, a strategy is to use predictions of individual models as predictors for the outcome, and get weights using linear regression or random forests. But we would like to get an idea of model uncertainty and our dataset is not big (~ 200 observations, countries by year).

Any ideas or suggestions?

Thank you so much!
Sebastian

avehtari · February 6, 2018, 8:56pm

How about Bayesian stacking? See “Using Stacking to Average Bayesian Predictive Distributions (with Discussion)”.

LOO (and WAIC, though there is no need to use WAIC if you can do PSIS-LOO) is appropriate if you take the Jacobian of that transformation into account when computing the log predictive density. Once you have the LOO log score, you can do Bayesian stacking as described in the article mentioned above.
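
For reference, the Jacobian adjustment follows from the change-of-variables formula. As a sketch, write z = g(y) for the transformed outcome and p_Z for the predictive density that the model defines on the transformed scale (these symbols are introduced here for illustration and do not come from the thread). Assuming y > 0 and a fixed nonzero Box-Cox parameter λ:

```latex
\begin{align*}
\log p_Y(y) &= \log p_Z\bigl(g(y)\bigr) + \log\left|\frac{dg}{dy}(y)\right| \\
g(y) = \log y: \quad \log p_Y(y) &= \log p_Z(\log y) - \log y \\
g(y) = \frac{y^{\lambda}-1}{\lambda}: \quad \log p_Y(y) &= \log p_Z\!\left(\frac{y^{\lambda}-1}{\lambda}\right) + (\lambda - 1)\log y
\end{align*}
```

With this term added to each pointwise log predictive density, every model is scored on the scale of y, so the LOO log scores become comparable across transformations.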

sdaza · February 7, 2018, 6:15pm

Thanks for your reply!

That means that if I do something like:

m1: y ~ x1 + x2
m2: log(y) ~ x1 + x2 # dependent variable transformation

can I get model weights using the following?

log_lik_list = list()
log_lik_list[[1]] = extract(m1)[["log_lik"]]
log_lik_list[[2]] = extract(m2)[["log_lik"]]

loo::model_weights(log_lik_list, method = "stacking")

Thank you!

avehtari · February 7, 2018, 6:23pm

No. You need to take into account the Jacobian of the log(y) transformation when computing the log predictive density. See my comment in the thread “Model comparison via PSIS-LOO for 3-level hierarchical growth model” (post #4).
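
For the two models in the question, the correction might be sketched as follows. This is an illustration under stated assumptions only: the log_lik matrices are simulated rather than extracted from fitted models, y is a made-up positive outcome, and the names ll_m1, ll_m2 and ll_m2_yscale are hypothetical. The key step is subtracting log(y[i]) from column i of the pointwise log-likelihood of the model fitted to log(y). The final call assumes an installed loo release in which the stacking function is named loo_model_weights.

```r
set.seed(2)
n_obs   <- 50
n_draws <- 1000
y <- rlnorm(n_obs, meanlog = 4, sdlog = 0.1)  # made-up positive outcome

# Stand-ins for posterior draws of location and scale in each model
# (hypothetical; real draws would come from the fitted models).
mu1 <- rnorm(n_draws, mean(y), 0.1)
s1  <- rep(sd(y), n_draws)
mu2 <- rnorm(n_draws, mean(log(y)), 0.01)
s2  <- rep(sd(log(y)), n_draws)

# Pointwise log-likelihood matrices: draws in rows, observations in columns.
ll_m1 <- sapply(y, function(yi) dnorm(yi, mu1, s1, log = TRUE))
ll_m2 <- sapply(y, function(yi) dnorm(log(yi), mu2, s2, log = TRUE))

# Jacobian correction: log p(y) = log p(log y) - log(y), column by column.
ll_m2_yscale <- sweep(ll_m2, 2, log(y), FUN = "-")

# Sanity check: the corrected values equal the lognormal log density of y.
ll_check <- sapply(y, function(yi) dlnorm(yi, mu2, s2, log = TRUE))
stopifnot(isTRUE(all.equal(ll_m2_yscale, ll_check)))

# Both matrices are now on the scale of y and can be stacked,
# provided the loo package is installed.
if (requireNamespace("loo", quietly = TRUE)) {
  w <- loo::loo_model_weights(list(ll_m1, ll_m2_yscale), method = "stacking")
  print(w)
}
```

In a real analysis the subtraction can also be done inside the Stan program, by computing log_lik for the log(y) model as the normal log density of log(y[i]) minus log(y[i]).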

sdaza · February 7, 2018, 6:35pm

Great, thanks!

Bob_Carpenter · February 7, 2018, 7:35pm

You can just make a mixture model.

avehtari · February 7, 2018, 8:40pm

Section 4.3 of the stacking paper demonstrates that a mixture model can perform badly with small data (at least in the M-open case).

Bob_Carpenter · February 7, 2018, 8:46pm

Thanks for the pointer. I’ll take a look.
