Is there an easier way to integrate projpred with a stanfit object?

I’m finding it hard to use projpred with a stanfit object. My main obstacle is using the cv_varsel function in the K-fold scenario. According to the documentation, I need to specify the cvfun argument of the init_refmodel function, but I can’t work out how to write that function from the description alone: https://mc-stan.org/projpred/reference/init_refmodel.html

I looked at the source code of init_refmodel; there, the cvfun function is defined based on objects created by rstanarm or brms. Can anyone provide an example of how to get the K-fold predictive projection method working from a stanfit object created by RStan?

Welcome to the community!

I haven’t used projpred with Stan as-is (only with brms). I’m sure @avehtari can provide an answer :)

Does the “Custom reference models” section in the projpred quick start help?

Nope. The reference model listed there works in the LOO scenario, but I want a custom reference model that works in the K-fold scenario. According to the description of the init_refmodel function, I need to pass a cvfun to init_refmodel. But based only on the description of cvfun, it is hard to write that function without any examples.

ping @AlejandroCatalina

There are two examples in the projpred source code that use init_refmodel() to define reference models for rstanarm and brms models (see for example https://github.com/stan-dev/projpred/blob/181606a416471f8145976e8b048d5213b9fb6ab3/R/refmodel.R#L116). By following those functions, you’ll see how to set up the various parameters expected by init_refmodel().

Having said that, I agree that documentation could be improved on that front! :)

Thanks! That really helps!

Hi, sorry for the late reply!

I got the documentation message and will try to improve it!

Then, to the question at hand: the key reason projpred is not so straightforward for a general stanfit object is that we basically don’t know anything about the model behind it, so we can’t provide a general predfun, and the same goes for cvfun. As @mcol said, there are some examples in the code where these functions are implemented for some models, and that helps (thanks for the link). I’ll try to improve the documentation so it’s clearer.
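
As a rough illustration of what a user-supplied predfun could look like, here is a minimal base-R sketch for a Gaussian linear model. Everything in it is an assumption for illustration: the draws are simulated rather than taken from a real stanfit (with RStan they would come from something like as.matrix() on the fit, using whatever parameter names your model defines), and the convention that predfun maps an n x p predictor matrix to an n x S matrix of expected values should be checked against the init_refmodel documentation for your installed projpred version, since the expected arguments and scale may differ between releases.

```r
# Simulated stand-ins for posterior draws (assumption: a real workflow would
# extract these from the stanfit object instead).
set.seed(1)
S <- 200                                                  # number of posterior draws
p <- 3                                                    # number of predictors
alpha_draws <- rnorm(S, mean = 1, sd = 0.1)               # intercept draws, length S
beta_draws  <- matrix(rnorm(S * p, sd = 0.2), nrow = S)   # S x p coefficient draws
sigma_draws <- abs(rnorm(S, mean = 1, sd = 0.05))         # residual sd draws (the dispersion)

# Hypothetical predfun: takes an n x p predictor matrix and returns an
# n x S matrix of expected values, one column per posterior draw.
predfun <- function(xt) {
  lin <- as.matrix(xt) %*% t(beta_draws)   # n x S linear predictor without intercept
  sweep(lin, 2, alpha_draws, "+")          # add each draw's intercept to its column
}

x_new <- matrix(rnorm(5 * p), nrow = 5)    # five new observations
mu <- predfun(x_new)
dim(mu)                                    # rows = observations, columns = draws
```

The point is only that the model-specific knowledge (which parameters are coefficients, which one is the dispersion) lives in code you write, which is why it cannot be derived automatically from an arbitrary stanfit.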

In the case of cvfun, you have to provide a function that builds a model per fold and returns the mean prediction for that fold (predfun) and its dispersion parameter. If you built your model with K-fold in mind, you have probably already built separate fits per fold and can provide the cvfits argument instead (basically each cvfit is a fold-refmodel). Anyway, I’ll try to improve the documentation and provide more examples.
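
The per-fold structure described above can be sketched in base R as follows. This is a sketch under stated assumptions, not the projpred implementation: fit_fold() is a hypothetical helper, and it uses lm() with a normal approximation purely as a stand-in so the example runs without Stan (in a real workflow this is where the Stan model would be refitted on the training rows and its draws extracted). The return shape, a list with one element per fold where each element holds a predfun and a dis, follows the description in this post; the exact interface may differ between projpred releases, so check the cvfun entry in the init_refmodel documentation for your version.

```r
set.seed(42)
n <- 60
x <- cbind(x1 = rnorm(n), x2 = rnorm(n))
y <- as.numeric(1 + x %*% c(0.5, -0.3) + rnorm(n, sd = 0.7))
K <- 5
folds <- sample(rep(seq_len(K), length.out = n))   # fold index (1..K) per observation

# Hypothetical stand-in for refitting the reference model on a training subset.
# Returns S approximate "posterior" draws of intercept, slopes, and residual sd.
fit_fold <- function(x_train, y_train, S = 100) {
  fit <- lm(y_train ~ x_train)
  R <- chol(vcov(fit))                                  # upper triangular, t(R) %*% R = vcov
  z <- matrix(rnorm(S * length(coef(fit))), nrow = S)
  draws <- sweep(z %*% R, 2, coef(fit), "+")            # S x (p + 1) coefficient draws
  list(alpha = draws[, 1],
       beta  = draws[, -1, drop = FALSE],
       sigma = rep(sigma(fit), S))                      # dispersion held fixed across draws
}

# cvfun sketch: takes the fold vector, refits once per fold on the data outside
# that fold, and returns a list of K elements, each with predfun and dis.
cvfun <- function(folds) {
  lapply(seq_len(max(folds)), function(k) {
    train <- folds != k
    d <- fit_fold(x[train, , drop = FALSE], y[train])
    list(
      predfun = function(xt) sweep(as.matrix(xt) %*% t(d$beta), 2, d$alpha, "+"),
      dis = d$sigma
    )
  })
}

cvfits <- cvfun(folds)
held_out <- folds == 1
mu_1 <- cvfits[[1]]$predfun(x[held_out, , drop = FALSE])  # predictions for fold 1's held-out rows
dim(mu_1)
```

If the per-fold fits already exist, the same list could be built directly from them instead of refitting inside cvfun, which is the situation the cvfits argument is meant for.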

Hope this clarifies the issue a bit more!
