Model comparison (two models run on two different N sizes)

I fitted two models on the same dataset but with different N sizes. The way I handle missing observations explains the different N. Now I want to see which model is better. I consulted ChatGPT and got this response.

“Pareto Smoothed Importance Sampling (PSIS): PSIS is a method used to estimate the out-of-sample predictive performance of a model. It addresses the issue of comparing models with different sample sizes by adjusting for the different levels of uncertainty introduced by the smaller dataset. You can calculate PSIS-LOO (leave-one-out) or PSIS-WAIC (Widely Applicable Information Criterion) to compare the models”.

But when I checked the loo package, it seems what ChatGPT is saying is not true. I'm not so sure, though. Can you please confirm whether what ChatGPT is saying is correct? Please also give any suggestions, besides multiple imputation, for how I can compare these models.

Thank you.

Your question doesn’t have enough information for me to understand how you might go about comparing these models. However, suppose we can define the prediction task you want to evaluate with leave-one-out CV as restricted to the shared (nonmissing) data common to the two approaches, and suppose both approaches admit a factorization such that the pointwise likelihood is well defined. Then you can use the log-likelihood matrix over the shared observations to evaluate and compare the two models’ leave-one-out predictive performance on the shared data via PSIS-LOO.
ChatGPT is wrong if it’s talking about the sample sizes of the datasets used to fit the models, but it is basically correct if it’s talking about the MCMC sample sizes drawn from the posterior distribution.
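To make the matrix route concrete, here is a minimal, hypothetical sketch using the loo package. The model objects (fit_a, fit_b), the index vectors that pick out the shared observations, and the assumption of 4 chains of equal length are all placeholders for whatever your actual setup looks like.

library(loo)

# Pointwise log-likelihood matrices (posterior draws x observations),
# restricted to the same shared observations in the same order.
ll_a <- log_lik(fit_a)[, shared_idx_a]
ll_b <- log_lik(fit_b)[, shared_idx_b]

# Relative MCMC efficiencies (assuming 4 chains of equal length here).
r_eff_a <- relative_eff(exp(ll_a), chain_id = rep(1:4, each = nrow(ll_a) / 4))
r_eff_b <- relative_eff(exp(ll_b), chain_id = rep(1:4, each = nrow(ll_b) / 4))

# PSIS-LOO over the shared observations only, then compare.
loo_a <- loo(ll_a, r_eff = r_eff_a)
loo_b <- loo(ll_b, r_eff = r_eff_b)
loo_compare(loo_a, loo_b)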

Thank you for your help. I’m new to Bayesian methods and this is my first project as a master’s student. I don’t have a solid background in mathematics and statistics either, but I’m willing to build one now.

Here are my two models. fit_1 is my main model and has all the predictors. However, it has fewer groups (CLASS) because information on DNA_Dynamut_deltaG, BLOSUM, LIG_Dynamut_deltaG, DNA_PPI_RSA, and DNA_mCSM_Stability_deltaG is missing for the other groups. Therefore, it has a smaller sample size.

My second model (fit_2) has all the groups (CLASS). However, it has only a few predictors (CONSURF, dimerization_affected, DNA_binding_affected). We have complete information for all groups, but only for these three predictors. Therefore, it has a larger sample size than the previous model.

My problem now is to decide whether the model with only a few predictors but more groups (fit_2) is as good as the model with all the predictors but fewer groups (fit_1). I want to judge performance in terms of goodness of fit and out-of-sample predictive performance. For goodness of fit, I used the average Bayesian posterior predictive p-value, and there was not much difference between the two models. For out-of-sample predictive performance, I wanted to use LOOIC with loo_compare, but it did not work because of the different N.

If you have suggestions, please guide me.

fit_1 <- brm(
  pDST_Resistance | trials(Number_mutation) ~ CONSURF + BLOSUM +
    DNA_binding_affected + DNA_Dynamut_deltaG + LIG_Dynamut_deltaG +
    DNA_PPI_RSA + dimerization_affected + DNA_mCSM_Stability_deltaG +
    (1 | CLASS),
  data = model2,
  family = binomial(link = "logit"))

fit_2 <- brm(
  pDST_Resistance | trials(Number_mutation) ~ CONSURF +
    DNA_binding_affected + dimerization_affected + (1 | CLASS),
  data = model2,
  family = binomial(link = "logit"))

If desired, you can compare the leave-one-out predictive performance over just the shared portion of the data. There is no good way to evaluate out-of-sample predictive performance for the observations whose covariate values in model2 you do not know. To compare over the shared data, you will use the log-likelihood matrices for just the shared data points. Using brms::loo.brmsfit, you can achieve this by passing the shared data to the newdata argument, as in the sketch below.
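For example, here is a minimal sketch along those lines; the complete.cases() filter and the name shared are my assumptions about how you would identify the rows of model2 with no missing predictor values.

# Rows of model2 with complete information on every predictor used by fit_1
# (an assumption about how the shared data are identified).
shared <- model2[complete.cases(model2[, c("CONSURF", "BLOSUM",
                                           "DNA_binding_affected",
                                           "DNA_Dynamut_deltaG",
                                           "LIG_Dynamut_deltaG",
                                           "DNA_PPI_RSA",
                                           "dimerization_affected",
                                           "DNA_mCSM_Stability_deltaG")]), ]

# PSIS-LOO for both models, evaluated on the same shared observations.
loo_1 <- loo(fit_1, newdata = shared)
loo_2 <- loo(fit_2, newdata = shared)

# Compare expected log predictive density over the shared data.
loo_compare(loo_1, loo_2)

Because both loo objects are then computed for exactly the same observations, loo_compare no longer complains about different N, and the comparison is of expected log pointwise predictive density over the shared data.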


Makes sense. Thank you so much. I really appreciate it.