More details about the model itself will likely be needed to make a proper comparison. Without further information, I can say that this is a very small number of HMC samples, so it may not be enough to tune the method parameters, and you may not be getting good samples from the posterior. If that's the case, comparing speed would be meaningless.

Maybe not, but unless the model is trivial, you should also check whether the estimates are comparable between methods, and take that accuracy into account along with the speed, not just how fast you get some output.

Here is my explanation for why I don't expect that. By default, ADVI runs the stochastic optimization for a maximum of 10k iterations and estimates the ELBO every 100th iteration. With your additional options elbo_samples=1000 grad_samples=10 this means 100k log_prob and 100k log_prob_grad evaluations. Your HMC/NUTS options state the default 1000 warmup iterations and 250 post-warmup iterations. If we assume that the gradient computation dominates, HMC/NUTS could use 80 leapfrog steps per iteration and still be faster. 80 leapfrog steps per HMC/NUTS iteration is already quite a lot, indicating that the posterior is likely to be far from normal or very high dimensional, which would imply that the ADVI approximation would be bad even with more computation time. Even if you used the default ADVI options, HMC/NUTS could use 8 leapfrog steps per iteration, which would be fine for many posteriors that are not too far from normal.
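The back-of-the-envelope arithmetic above can be sketched as follows (assuming gradient evaluations dominate the runtime; the option values are the ones quoted in this thread, not general defaults except where noted):

```python
# ADVI with elbo_samples=1000, grad_samples=10, max 10k iterations,
# ELBO estimated every 100th iteration
advi_iters = 10_000
elbo_evals = (advi_iters // 100) * 1000   # 100k log_prob evaluations
grad_evals = advi_iters * 10              # 100k log_prob_grad evaluations

# HMC/NUTS with the default 1000 warmup + 250 post-warmup iterations
nuts_iters = 1000 + 250

# Leapfrog steps per iteration NUTS could afford at equal gradient cost
breakeven_steps = grad_evals / nuts_iters
print(breakeven_steps)                    # 80.0

# Same comparison with the ADVI default grad_samples=1
default_grad_evals = advi_iters * 1
print(default_grad_evals / nuts_iters)    # 8.0
```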

See more about diagnosing ADVI and improved black-box variational algorithms in:

These improved algorithms are not (yet) available in Stan, and even if they were, it's unlikely that they could beat HMC/NUTS in speed while reaching similar accuracy. Stochastic optimization with a Monte Carlo estimated ELBO is hard.