Testing against holdout data

timdisher · September 20, 2017, 1:12pm

I am using a train/test split for a binomial three-level model in rstanarm (observations nested in babies nested in pregnancies). If I were to compare two models, am I correct that the process in rstanarm would be to use:

log_lik(train, newdata = test)

Average the results by column, add them by row, and then compare the two models to each other?

bgoodri · September 20, 2017, 3:47pm

To get it to match up with the elpd_loo estimate, I think you would just do mean(rowSums(log_lik(train, newdata = test)) / nrow(model.frame(train)))

timdisher · September 20, 2017, 4:02pm

Great, this makes a lot of sense. Thank you.

avehtari · September 20, 2017, 5:51pm

I think log_sum_exp should be used
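
To spell out what that would look like: log_lik returns a matrix with one row per posterior draw and one column per observation, so the holdout log predictive density for each test observation is the log of the mean over draws of exp(log_lik), computed stably with log_sum_exp. The sketch below is only an illustration under assumptions: `ll` is a simulated stand-in for `log_lik(train, newdata = test)`, and `log_sum_exp` is a small base-R helper defined here, not an rstanarm function.

```r
# Hypothetical helper: numerically stable log(sum(exp(x)))
log_sum_exp <- function(x) {
  m <- max(x)
  m + log(sum(exp(x - m)))
}

# Simulated stand-in for log_lik(train, newdata = test):
# S posterior draws (rows) by N test observations (columns)
set.seed(1)
S <- 1000
N <- 5
ll <- matrix(rnorm(S * N, mean = -1, sd = 0.5), nrow = S, ncol = N)

# Pointwise log predictive density: log of the mean likelihood over draws
lpd_pointwise <- apply(ll, 2, log_sum_exp) - log(nrow(ll))

# Sum over test observations to get the holdout elpd for this model
elpd_test <- sum(lpd_pointwise)
```

Computing `elpd_test` the same way for each model on the same test set gives totals that can be compared directly. Averaging the log-likelihood values themselves (without exponentiating first) gives a different, smaller quantity.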

bgoodri · September 20, 2017, 5:56pm

Right
