Leave-future-out cross-validation for time-series models

Thanks for the clear questions; they will help us improve the paper and the vignette.

The AR model can be written with f shown explicitly, and this is what is done, e.g., in the Stan code produced by brms.
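For concreteness, here is one way to write this down (just a sketch of a Gaussian AR(1), not necessarily the exact brms parameterization). With the latent mean f_t shown explicitly,

f_t = \alpha + \phi y_{t-1}, \quad y_t | f_t, \theta \sim \mathrm{Normal}(f_t, \sigma), \quad \theta = (\alpha, \phi, \sigma),

and with f_t substituted away so that only the observations appear,

y_t | y_{t-1}, \theta \sim \mathrm{Normal}(\alpha + \phi y_{t-1}, \sigma).

The second form is what makes the factorization below immediate.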

Your questions raise another question: what does factorizable mean?

If we don’t represent f explicitly, the AR model is still factorizable for LFO purposes, as the joint likelihood is
p(y_1|\theta)\,p(y_2|y_1,\theta)\,p(y_3|y_1,y_2,\theta)\cdots p(y_T|y_1,\ldots,y_{T-1},\theta),
which, e.g., in the case of AR(1) reduces to
p(y_1|\theta)\,p(y_2|y_1,\theta)\,p(y_3|y_2,\theta)\cdots p(y_T|y_{T-1},\theta).
These are exactly the terms we are interested in. This factorization is the reason we can easily use (Pareto smoothed) importance sampling: the contribution of each additional observation is easy to compute. We are restricted to adding these contributions in time order, which is not a problem, as in LFO we want to do that anyway. For LOO we have a challenge if we can’t represent the model with terms y_i | f_i, as discussed in the non-factorizable paper.
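To make concrete what "the contribution of each additional observation is easy to compute" means, here is a minimal Python sketch (hypothetical helper names, a Gaussian AR(1) observation model, and plain self-normalized importance sampling rather than the full PSIS-LFO procedure from the paper): the raw log importance ratios for reusing a posterior fitted on the first i observations are just sums of the one-step-ahead conditional log densities, added in time order.

```python
import numpy as np
from scipy.special import logsumexp
from scipy.stats import norm

def ar1_cond_logpdf(y_t, y_prev, alpha, phi, sigma):
    """log p(y_t | y_{t-1}, theta) for a Gaussian AR(1); works on arrays of draws."""
    return norm.logpdf(y_t, loc=alpha + phi * y_prev, scale=sigma)

def lfo_is_log_ratios(y, i, j, draws):
    """Raw log importance ratios for reusing posterior draws fitted on y[:i]
    after the observed data have grown to y[:j] (i < j, 0-based indexing).
    Each additional observation contributes one conditional log density,
    and the contributions are accumulated in time order."""
    log_r = np.zeros(len(draws["alpha"]))
    for t in range(i, j):
        log_r += ar1_cond_logpdf(y[t], y[t - 1],
                                 draws["alpha"], draws["phi"], draws["sigma"])
    return log_r

def approx_log_pred(y, i, j, draws):
    """Self-normalized IS estimate of log p(y_{j+1} | y_{1:j}) using draws
    conditioned only on y_{1:i}. (In the paper the raw ratios would be
    Pareto smoothed, and the model refitted when the Pareto k is large.)"""
    log_r = lfo_is_log_ratios(y, i, j, draws)
    log_lik_next = ar1_cond_logpdf(y[j], y[j - 1],
                                   draws["alpha"], draws["phi"], draws["sigma"])
    return logsumexp(log_r + log_lik_next) - logsumexp(log_r)
```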

The efficient Kalman filter computation is based on each term in the time-ordered factorization being simple. The Kalman filter could be used to compute these terms in cases where it is applicable. Maybe we should have used a non-Gaussian observation model (although then you could ask why we don’t use an EKF or UKF, and we would again want to extend the model in ways that are easy in Stan and MCMC but get more difficult with *KF-type algorithms, and then you could ask why we don’t use particle filters, and then I would say we don’t have those implemented in Stan).
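As an illustration of that point (a generic Gaussian local-level model, not the model in the paper or vignette), each Kalman filter step produces the one-step-ahead predictive density p(y_t | y_{1:t-1}, \theta), i.e., exactly one term of the time-ordered factorization, in closed form:

```python
import numpy as np
from scipy.stats import norm

def kalman_one_step_log_pred(y, sigma_obs, sigma_state, m0=0.0, P0=1e6):
    """One-step-ahead log predictive densities log p(y_t | y_{1:t-1}, theta)
    for a Gaussian local-level model:
        x_t = x_{t-1} + eta_t,   eta_t ~ Normal(0, sigma_state)
        y_t = x_t + eps_t,       eps_t ~ Normal(0, sigma_obs)."""
    m, P = m0, P0                      # filtering mean and variance of the state
    log_pred = np.empty(len(y))
    for t, y_t in enumerate(y):
        P_pred = P + sigma_state**2    # predict the state forward one step
        S = P_pred + sigma_obs**2      # predictive variance of y_t
        log_pred[t] = norm.logpdf(y_t, loc=m, scale=np.sqrt(S))
        K = P_pred / S                 # Kalman gain
        m = m + K * (y_t - m)          # update the state with the new observation
        P = (1 - K) * P_pred
    return log_pred
```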

Your questions show that we need to be more careful with the term factorizable, and with making clear when we have explicit latent values f and when we don’t.
