Please also provide the following information in addition to your question:
- Operating System: Windows 10
- brms Version: 2.8
I fit a model which has 7 influential points with pareto_k > 0.7.
Now, as explained here, the reloo option allows refitting the model once for each of these points, each time leaving that observation out.
And, of course, this would allow us to compare several models and choose the one that fits best.
What confuses me, however, is that the output of reloo() is a loo object instead of a brmsfit. We can compare this newly obtained loo object to the loo objects of other models and choose the best fit. But what I was wondering is whether it is possible to somehow save the refit model (in my case, refit 7 times) and use it to make predictions, instead of the original model (which has 7 influential points).
I am confused because this is what happens with k-fold: the model is refit, say, 10 times, and the kfold object is then passed to the kfold_predict function to make predictions.
I looked into the loo_predict function, thinking that I could probably use it, just like kfold_predict, but that doesn’t seem to be the case.
Or do we only use the reloo-part to calculate elpd more precisely by refitting the model for the influential points, and, then, once we compare the models and identify the best one, just use the original model (with the influential points, without any refitting), for predictions?
I hope my question isn’t too confusing.
kfold_predict was just a nice little add-on I wrote because people repeatedly requested it.
kfold also does not return a brmsfit object but a kfold object, which may also contain
a list of fitted model objects (if the user requests it) in addition to a lot of other stuff.
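As a rough illustration of the point about kfold objects, a minimal sketch could look like this. The simulated data and the names `dat`, `fit`, `kf`, and `kfp` are assumptions made for the example; `save_fits = TRUE` is the kfold() argument that requests the refitted models be kept.

```r
library(brms)

set.seed(1)
# Hypothetical simulated data, used only to make the sketch self-contained
dat <- data.frame(x = rnorm(100))
dat$y <- 1 + 0.5 * dat$x + rnorm(100)

fit <- brm(y ~ x, data = dat)

# Refit the model 10 times, each time holding out one fold;
# save_fits = TRUE asks kfold() to keep the refitted models in the result
kf <- kfold(fit, K = 10, save_fits = TRUE)
class(kf)  # a kfold object, not a brmsfit

# Predictions for the held-out observations, based on the saved fits
kfp <- kfold_predict(kf)
str(kfp, max.level = 1)
```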
reloo returns a loo object because you simply replace the pointwise approximate ELPD contributions of problematic observations with their exact counterparts, obtained from fitting the
model without this specific observation. That’s more or less what the function (currently) does.
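The replacement step described above can be sketched in base R. All numbers are made up for illustration, and the vector names (`elpd_psis`, `pareto_k`, `elpd_exact`) are hypothetical, not brms internals; in practice each exact value would come from a refit of the model with that observation left out.

```r
# Made-up pointwise ELPD contributions from the approximate LOO computation
elpd_psis <- c(-1.2, -0.9, -3.5, -1.1, -4.0, -1.0)
# Made-up Pareto k diagnostics for the same six observations
pareto_k <- c(0.10, 0.20, 0.85, 0.30, 0.95, 0.05)

# Observations whose approximation is flagged as unreliable
problematic <- which(pareto_k > 0.7)

# Hypothetical exact values, one per problematic observation; each would
# come from refitting the model with that single observation held out
elpd_exact <- c(-3.9, -4.6)

# Replace only the problematic contributions; all others stay unchanged
elpd_fixed <- elpd_psis
elpd_fixed[problematic] <- elpd_exact

# The overall ELPD estimate is the sum of the pointwise contributions
sum(elpd_psis)
sum(elpd_fixed)
```

Nothing in this procedure produces a new fitted model for the full data, which is consistent with the result still being a loo object.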
Thanks for the quick reply!
So if I would like to use my model for predictions (and it has just a few influential points),
I would just use it as it is, and not refit it (unless there is a large number of influential points and the system suggests kfold)?
This probably depends on what kinds of predictions you want to actually make.
For example, if I would like to see how well my model does in predicting new data (not used for fitting the model), do I have to use kfold, even if the number of influential points is really small (e.g. less than 10)?
Do you want to predict for prediction’s sake (actually predicting stuff) or to evaluate/compare model fit? In the latter case, the loo/kfold methods already give you exactly that.
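For the model-comparison case, a minimal brms sketch might look like the following. The simulated data, the two families, and the object names are assumptions chosen for illustration; `loo(..., reloo = TRUE)` refits the model for any observations above the Pareto k threshold before the comparison.

```r
library(brms)

set.seed(1)
# Hypothetical simulated data with a few shifted responses
dat <- data.frame(x = rnorm(100))
dat$y <- 1 + 0.5 * dat$x + rnorm(100)
dat$y[1:3] <- dat$y[1:3] + 8

# Two candidate models that differ only in the response distribution
model1 <- brm(y ~ x, data = dat, family = student())
model2 <- brm(y ~ x, data = dat, family = gaussian())

# Approximate LOO; observations with high Pareto k (if any) get exact refits
loo1 <- loo(model1, reloo = TRUE)
loo2 <- loo(model2, reloo = TRUE)

# Compare the models by their estimated ELPD
loo_compare(loo1, loo2)
```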
Let’s say I have 2 models, model1 and model2, both of which have some influential points. I use the reloo function to refit them, so that I can compare their ELPDs.
Now, let’s say model1 turns out to be “better”. Then, if I would like to use it to make predictions for some new data (not used for fitting the model), do I just use the original model1 (without any refitting)?