Hi dear Stan community!

I would like to perform Projection Predictive Feature Selection - more specifically the rankig of the predictors - and using the `cmdstanr`

interface to handle R code, Stan code, compliations…

While it is possible to do directly with `brms`

and `rstanarm`

I do not know how to couple `projpred`

with `cmdstanr`

. Any help?

I tag here @jonah, @fweber144 and @avehtari because of their activity on the relevant packages for my work.

Since you are not happy with `brms`

, I guess you have a model that is not easy to make with `brms`

, which may mean that it’s not easy for projection predictive approach. Can you tell more about your model, so I can tell whether the method itself is suitable, before we discuss implementation details?

Dear @avehtari, I am not confident with the grammar for the formula of `brms`

and maybe my model could be easily traslated. However I feel confident in writing directly the `stan`

code from scratch to add more and more complexity form the first prototypal model up to the final implementation with multiple hierarchical levels.

So, long story short:

my response variable y (daily average temperature over decades) is normally distributed and show strong periodicity (seasonality) along years. To catch this behavior I decided to model the latent mean \mu as a Fourier series of n terms:

With n \rightarrow \infty this is a gaussian process with a periodic kernel, but GP here are too costly in terms of computation.

P in my case is fixed and the periodic terms can be precomputed and the problem is similar to a multiple linear regression with 2N+1 terms. First prototypes show that with just N=2 i have good results in approximating my data with no convergence issues, good EBFMI and R-hat very close to one.

But I would like not to be arbitrary in choosing N, so I would like to check N between overfitting (N too high) and a simplistic model (N too low). And so `projpred`

seems to me the best way to go… What do you think?

If the goal is to choose just N, then projpred is likely to be more complicated than needed, and you could use, e.g. LOO-CV.

If you are using a prior on A_n and B_n which is derived from the limiting GP, then you can’t overfit when N increases, you just get closer to that GP. See e.g. Appendix B in Practical Hilbert space approximate Bayesian Gaussian processes for probabilistic programming | Statistics and Computing for the equations for periodic GP, and Stan implementation in Birthdays case study. The paper I mentioned, discuss also how to diagnose whether there are enough basis functions or too many.

Thanks @avehtari . I will read the Practical Hilbert space again. I posted some code here some months ago about HIlbert Space approximation but this approach is definitely an inexhaustible mine of information!

For this problem so mine I will proceed with the stacking of the predictive distributions increasing n ad see how it’s going. Then I will use the HSGP with the periodic kernel! Thanks a lot!