It would be interesting to know whether anyone is using algorithm='optimizing' with stan_glm() for faster inference. If you do, this experimental rstanarm branch gives you the diagnostics described in Yao, Vehtari, Simpson, and Gelman (2018), "Yes, but Did It Work?: Evaluating Variational Inference", Thirty-fifth International Conference on Machine Learning, PMLR 80:5577-5586, and loo() works, too.
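For concreteness, here is a minimal sketch of how such a fit could be set up. It assumes the wells data shipped with rstanarm and centered predictors along the lines of Gelman and Hill's wells example; the exact transformations are an assumption chosen only to match the variable names in the summary below, and the extra khat/n_eff diagnostics require the experimental branch mentioned above.

library(rstanarm)

# Assumed data preparation (centered versions of the wells predictors);
# this is a sketch, not the branch's own example code.
data(wells, package = "rstanarm")
wells$y             <- wells$switch
wells$c_dist100     <- with(wells, dist / 100 - mean(dist / 100))
wells$c_log_arsenic <- with(wells, log(arsenic) - mean(log(arsenic)))
wells$c_educ4       <- with(wells, educ / 4 - mean(educ / 4))

fit <- stan_glm(
  y ~ c_dist100 + c_log_arsenic + c_educ4 +
    c_dist100:c_educ4 + c_log_arsenic:c_educ4,
  family = binomial(link = "logit"),
  data = wells,
  algorithm = "optimizing"
)

summary(fit)  # prints the estimates and the mcse/khat/n_eff diagnostics shown below
loo(fit)      # PSIS-LOO works with the optimizing fit, too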
Model Info:

 function:     stan_glm
 family:       binomial [logit]
 formula:      y ~ c_dist100 + c_log_arsenic + c_educ4 + c_dist100:c_educ4 +
                   c_log_arsenic:c_educ4
 algorithm:    optimizing
 priors:       see help('prior_summary')
 observations: 3020
 predictors:   6
Estimates:
                       Median MAD_SD  2.5%   25%   50%   75% 97.5%
(Intercept)             0.34   0.04   0.26  0.31  0.34  0.36  0.41
c_dist100              -1.00   0.11  -1.22 -1.08 -1.00 -0.93 -0.80
c_log_arsenic           0.91   0.07   0.78  0.86  0.91  0.96  1.04
c_educ4                 0.18   0.04   0.11  0.15  0.18  0.21  0.25
c_dist100:c_educ4       0.34   0.11   0.14  0.27  0.34  0.42  0.56
c_log_arsenic:c_educ4   0.07   0.07  -0.08  0.02  0.07  0.11  0.20
Diagnostics:
                       mcse khat n_eff
(Intercept)            0.00 0.14 2905
c_dist100              0.00 0.12 2847
c_log_arsenic          0.00 0.12 2909
c_educ4                0.00 0.07 2994
c_dist100:c_educ4      0.00 0.12 2828
c_log_arsenic:c_educ4  0.00 0.09 2911
For each parameter, mcse is the Monte Carlo standard error, n_eff is a crude measure of effective sample size, and khat is the Pareto k diagnostic for importance sampling (usually good performance when khat < 0.7).
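As a side note, the khat diagnostic can be illustrated outside rstanarm with the loo package's psis() function. The toy example below is only meant to show what the Pareto shape estimate refers to, not what the experimental branch does internally.

library(loo)

# Hedged illustration: khat is the estimated Pareto shape of the importance
# ratios between the target density p and the approximation q, computed here
# for a toy one-dimensional example with known densities.
set.seed(1)
draws      <- rnorm(4000, mean = 0, sd = 1.1)               # draws from q
log_q      <- dnorm(draws, mean = 0, sd = 1.1, log = TRUE)  # log q(theta)
log_p      <- dnorm(draws, mean = 0, sd = 1.0, log = TRUE)  # log p(theta)
log_ratios <- log_p - log_q

ps <- psis(log_ratios, r_eff = NA)  # Pareto smoothed importance sampling
pareto_k_values(ps)                 # khat; values below 0.7 indicate reliable IS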
Computed from 4000 by 3020 log-likelihood matrix

         Estimate   SE
elpd_loo  -1937.9 17.2
p_loo         6.1  0.2
looic      3875.9 34.4
------
Monte Carlo SE of elpd_loo is 0.0.
All Pareto k estimates are good (k < 0.5).
See help('pareto-k-diagnostic') for details.
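If you want to inspect the Pareto k estimates behind that message yourself, the loo package provides accessors for them (again assuming the fit object from the sketch above).

loo_fit <- loo(fit)
print(loo_fit)           # the summary shown above
pareto_k_table(loo_fit)  # counts of khat estimates falling in each range
plot(loo_fit)            # diagnostic plot of the khat values by observation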