Does the stepsize setting also need improvement in the current ADVI implementation?
The Robust VI paper's motivation is that ∆ELBO is too noisy and not scale-optimized to be used as a stopping rule (see @avehtari's more detailed summary).
I thought that even once the updated stopping rule is implemented, the same problem could remain in how the stepsize (eta in the code) is set. We might be trying to fix the 'noisy objective function' problem (mostly this part) with a fixed stepsize that is itself the result of an operation on the 'noisy objective function'.
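To make the concern concrete, here is a minimal toy sketch, not Stan's ADVI code. It assumes a one-dimensional quadratic stand-in for the ELBO, Gaussian noise on both the objective and its gradient, a plain eta / sqrt(t) step decay, and a hypothetical grid of candidate stepsizes; all function names are made up for illustration. Each candidate eta is run for a few iterations and the one with the best single noisy objective estimate is kept, so the selection itself depends on the noise.

```python
import random
from collections import Counter

# Hypothetical grid of candidate stepsizes (an assumption for this sketch).
ETAS = [100.0, 10.0, 1.0, 0.1, 0.01]

def noisy_elbo(lam, rng, noise_sd):
    """Toy stand-in for a Monte Carlo ELBO estimate: -(lam - 3)^2 plus noise."""
    d = lam - 3.0
    return -(d * d) + rng.gauss(0.0, noise_sd)

def noisy_grad(lam, rng, noise_sd):
    """Noisy gradient of the toy objective."""
    return -2.0 * (lam - 3.0) + rng.gauss(0.0, noise_sd)

def run_fixed_eta(eta, rng, noise_sd, n_iter=50):
    """Stochastic gradient ascent with step eta / sqrt(t), scored by one noisy estimate."""
    lam = 0.0
    for t in range(1, n_iter + 1):
        lam += eta / t ** 0.5 * noisy_grad(lam, rng, noise_sd)
    return noisy_elbo(lam, rng, noise_sd)

def pick_eta(seed, noise_sd):
    """Pick the candidate eta whose short run ends with the best noisy score."""
    rng = random.Random(seed)
    scores = {eta: run_fixed_eta(eta, rng, noise_sd) for eta in ETAS}
    return max(scores, key=scores.get)

def selection_counts(noise_sd, n_seeds=200):
    """Count how often each eta is selected across independent seeds."""
    return Counter(pick_eta(seed, noise_sd) for seed in range(n_seeds))

if __name__ == "__main__":
    print("low noise: ", selection_counts(noise_sd=0.001))
    print("high noise:", selection_counts(noise_sd=2.0))
```

In this toy setup, a small noise_sd should give the same eta for every seed, while a larger noise_sd can make the choice flip between neighbouring candidates. That is the sense in which a fixed stepsize is still the output of an operation on a noisy objective.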
I have thought of two possibilities:

improvement not needed:
Since the relative scale is the problem (see the quote from the paper below), optimizing the stopping rule given the fixed step size might be enough?
"More generally, sometimes the objective estimates are too noisy relative to the chosen step size η ,"
improvement needed:
If so, could I ask for opinions on how to improve it? I guess the Monte Carlo standard error suggested in the paper cannot be applied here, since the stepsize, unlike the solution \lambda_{t}, does not form a Markov chain (or does it?).
Please let me know if I am missing something, thanks!