Inquiry on the article: Efficient leave-one-out cross-validation for Bayesian non-factorized normal and Student-t models

Dear all,

I am Alejandro, and I am currently writing a paper that involves the stacking methodology for prediction with different models. At the moment I am working with the multivariate normal distribution and non-independent data. I read the paper “Efficient leave-one-out cross-validation for Bayesian non-factorized normal and Student-t models”, of which @avehtari is one of the authors. I understood that there is a closed-form expression for the leave-one-out predictive distribution; specifically, for my case, p\left(y_i \mid y_{-i}, \theta\right) is univariate normal with parameters \tilde{\mu}_i=\mu_i+\sigma_{i,-i} \Sigma_{-i}^{-1}\left(y_{-i}-\mu_{-i}\right) and
\tilde{\sigma}_i=\sigma_{i i}-\sigma_{i,-i} \Sigma_{-i}^{-1} \sigma_{-i, i}
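To make the algebra concrete, here is a minimal NumPy sketch (Python rather than R, purely to illustrate the formulas) of the conditional mean and variance of one component of a multivariate normal given all the others; note that the conditional variance subtracts the quadratic term \sigma_{i,-i} \Sigma_{-i}^{-1} \sigma_{-i,i}:

```python
import numpy as np

def conditional_normal(mu, Sigma, y, i):
    """Mean and variance of p(y_i | y_{-i}) for y ~ N(mu, Sigma).

    Implements mu_tilde_i  = mu_i + Sigma_{i,-i} Sigma_{-i}^{-1} (y_{-i} - mu_{-i})
    and        var_tilde_i = Sigma_{ii} - Sigma_{i,-i} Sigma_{-i}^{-1} Sigma_{-i,i}.
    """
    n = len(mu)
    rest = np.array([j for j in range(n) if j != i])
    Sigma_rest = Sigma[np.ix_(rest, rest)]          # Sigma_{-i}
    cross = Sigma[i, rest]                          # sigma_{i,-i}
    # Solve once against both right-hand sides instead of inverting Sigma_{-i}
    sol = np.linalg.solve(Sigma_rest,
                          np.column_stack([y[rest] - mu[rest], cross]))
    mu_tilde = mu[i] + cross @ sol[:, 0]
    var_tilde = Sigma[i, i] - cross @ sol[:, 1]
    return mu_tilde, var_tilde
```

For example, with \Sigma = [[1, 0.5], [0.5, 1]], \mu = 0, and y_2 = 1, the conditional for y_1 has mean 0.5 and variance 0.75.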

I was recovering the stacking weights using the loo_model_weights(x) function from the loo package. The thing is, I realized I was passing as x the matrices generated by p\left(y_i \mid y_{-i}, \theta\right) instead of the matrices of pointwise log-likelihoods p\left(y_i \mid \theta, M_k\right), as indicated in the package documentation.

I am pretty sure I am wrong; I suppose I got confused. So, my question is: is there a way of doing stacking with loo where the input is the matrices generated by p\left(y_i \mid y_{-i}, \theta\right), or should I do the stacking manually? I suppose this is possible, since the problem is basically to solve

\max _{w \in \mathcal{S}_K} \frac{1}{n} \sum_{i=1}^n \log \sum_{k=1}^K w_k p\left(y_i \mid y_{-i}, M_k\right)
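If one does go the manual route, that simplex-constrained maximization can be solved directly. Below is a hedged Python/SciPy sketch (not the loo package's implementation, which also uses Pseudo-BMA-style regularization options) that optimizes the stacking objective over the simplex via a softmax parameterization:

```python
import numpy as np
from scipy.optimize import minimize

def stacking_weights(lpd):
    """Stacking weights from an n x K matrix of log LOO predictive
    densities, lpd[i, k] = log p(y_i | y_{-i}, M_k).

    Maximizes (1/n) sum_i log sum_k w_k exp(lpd[i, k]) over the simplex,
    using an unconstrained softmax parameterization w = softmax(z).
    """
    n, K = lpd.shape

    def neg_obj(z):
        w = np.exp(z - z.max())
        w /= w.sum()
        # log sum_k w_k p_ik, computed stably row-wise (log-sum-exp trick)
        a = lpd + np.log(w)
        m = a.max(axis=1, keepdims=True)
        return -np.mean(m[:, 0] + np.log(np.exp(a - m).sum(axis=1)))

    res = minimize(neg_obj, np.zeros(K), method="BFGS")
    w = np.exp(res.x - res.x.max())
    return w / w.sum()
```

With two models that are each better on symmetric halves of the data, the weights come out at 0.5 each, as expected.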

and I already know how to compute p\left(y_i \mid y_{-i}, M_k\right). Am I wrong? Is there an easier way of doing it?

I would appreciate any help,

Thanks in advance,

Best regards,

Two things.

First, you might want to check out @yuling et al. on stacking; it's more or less the same authors as the LOO paper:

Second, with an autoregressive time series like this, you probably want leave-future-out evaluation. Luckily, @avehtari et al. have you covered there, too, with their “Leave future out” paper:

https://www.tandfonline.com/doi/full/10.1080/00949655.2020.1783262

You can combine the analytic equation for the multivariate normal part with importance sampling for the other parameters, as discussed in Section 3 of the paper “Efficient leave-one-out cross-validation for Bayesian non-factorized normal and Student-t models” (check the steps there). You should pass the matrix of \log p(y_i \mid y_{-i}, \theta^{(s)}) values to the loo package.
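As a hypothetical sketch of how that S × n matrix can be assembled (again in Python rather than R, and covering only the analytic conditional-normal part, not the importance-sampling correction), assuming `mu_draws` (S × n) and `Sigma_draws` (S × n × n) hold posterior draws of the mean vector and covariance matrix:

```python
import numpy as np

def loo_log_lik_matrix(y, mu_draws, Sigma_draws):
    """S x n matrix with entries log p(y_i | y_{-i}, theta^(s)),
    using the analytic conditional-normal formulas for each draw."""
    S, n = mu_draws.shape
    out = np.empty((S, n))
    for s in range(S):
        mu, Sigma = mu_draws[s], Sigma_draws[s]
        for i in range(n):
            rest = np.r_[0:i, i + 1:n]
            cross = Sigma[i, rest]                  # sigma_{i,-i}
            sol = np.linalg.solve(Sigma[np.ix_(rest, rest)],
                                  np.column_stack([y[rest] - mu[rest], cross]))
            m = mu[i] + cross @ sol[:, 0]           # tilde mu_i
            v = Sigma[i, i] - cross @ sol[:, 1]     # tilde sigma_i^2
            # univariate normal log density at y_i
            out[s, i] = -0.5 * (np.log(2 * np.pi * v) + (y[i] - m) ** 2 / v)
    return out
```

In R, the resulting matrix (one row per posterior draw, one column per observation) is the shape that loo() and loo_model_weights() expect as their log-likelihood input.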