- The Pareto diagnostic and Pareto smoothed importance sampling (PSIS, https://arxiv.org/abs/1507.02646) are used to diagnose and improve inference when using `stan_glm` with `algorithm='optimizing'`. A demonstration of timings with n=100,000 and p=100, for Gaussian and logistic regression, is at https://avehtari.github.io/RAOS-Examples/BigData/bigdata.html
- Using PSIS and importance resampling also means that we can use PSIS-LOO when `algorithm='optimizing'` (I recommend increasing the number of draws from the default value, and for bigger n you may need the latest loo from GitHub, which has one over-strict check loosened)
- In addition to `optimizing`, these work for `meanfield` and `fullrank` ADVI, but so far we have not seen any example where these would be better than optimizing or MCMC.
- There is also a 4x speedup for GLMs and GAMs (with all inference algorithms) with normal (when n<=p; the OLS trick was already used for n>p), bernoulli, poisson and neg_binomial_2 families, using the compound GLM functions previously implemented in Stan math by Matthijs Vákár
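
The optimizing workflow from the first two items might look roughly like the sketch below. The simulated data, the sizes `n` and `p`, and the choice `draws = 4000` are my own illustrative assumptions, not values from the linked demonstration. `draws` is passed through to rstan's `optimizing`, and the sketch assumes an rstanarm version that includes the PSIS and importance resampling changes described above.

```r
library(rstanarm)

# Simulated logistic regression data (sizes chosen only for illustration)
set.seed(1)
n <- 1000
p <- 5
X <- matrix(rnorm(n * p), n, p)
beta <- rnorm(p)
y <- rbinom(n, 1, plogis(drop(X %*% beta)))
fake <- data.frame(y = y, X)  # predictor columns are named X1, ..., X5

# Normal approximation at the posterior mode; draws = 4000 is an
# assumed value, picked only to be larger than the default
fit <- stan_glm(y ~ ., data = fake, family = binomial(),
                algorithm = "optimizing", draws = 4000)
print(fit)

# PSIS-LOO computed from the approximate posterior draws
loo_fit <- loo(fit)
print(loo_fit)
```

If the Pareto diagnostic reported with the fit is high, the normal approximation is probably unreliable for that model, and falling back to MCMC (the default `algorithm = "sampling"`) is the safer choice.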