PyMC3 lets you do this, as does Edward. I don’t think anyone has ever evaluated what happens when you mix HMC/NUTS with other samplers; if there are evaluations, I’d very much like to see them. Our concern is that HMC/NUTS is so sensitive to its tuning parameters that naive alternation will lead to highly biased samples, because NUTS won’t be able to adapt a step size or mass matrix that lets it move effectively.
You can’t get a sparsity-inducing prior in full Bayes for continuous parameters, because a continuous prior density puts zero probability on a parameter being exactly zero. You need something like spike-and-slab, which gives finite probability mass to the zero solution. So something like L1 regularization (a Laplace prior, aka double-exponential) won’t lead to a sparse posterior the way it leads to sparse penalized MLE (i.e., MAP) point estimates.
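
To make the Laplace point concrete, here is a minimal sketch in plain NumPy. It assumes a toy model that is not from the discussion above: a single observation `y ~ Normal(theta, 1)` with prior `theta ~ Laplace(0, b)`. The function names, the observed value, and the random-walk Metropolis sampler are all illustrative choices, not anything PyMC3, Edward, or Stan does internally. The posterior mode is a soft-thresholded version of `y` and can be exactly zero, while the posterior draws essentially never are.

```python
import numpy as np


def map_estimate(y, b):
    """Posterior mode for y ~ Normal(theta, 1), theta ~ Laplace(0, b).

    Minimizing 0.5 * (y - theta)**2 + abs(theta) / b gives the
    soft-thresholding rule with threshold 1 / b.
    """
    return float(np.sign(y) * max(abs(y) - 1.0 / b, 0.0))


def log_posterior(theta, y, b):
    """Unnormalized log posterior density of the toy model."""
    return -0.5 * (y - theta) ** 2 - abs(theta) / b


def metropolis_draws(y, b, n_draws=5000, step=1.0, seed=0):
    """Random-walk Metropolis draws (hypothetical helper, illustration only)."""
    rng = np.random.default_rng(seed)
    theta = y  # start at the observation, away from zero
    draws = np.empty(n_draws)
    for i in range(n_draws):
        proposal = theta + step * rng.normal()
        log_ratio = log_posterior(proposal, y, b) - log_posterior(theta, y, b)
        if np.log(rng.uniform()) < log_ratio:
            theta = proposal
        draws[i] = theta
    return draws


y_obs, scale = 0.3, 1.0  # assumed toy values
theta_map = map_estimate(y_obs, scale)
draws = metropolis_draws(y_obs, scale)
frac_zero = float(np.mean(draws == 0.0))

print("MAP estimate:", theta_map)
print("fraction of posterior draws exactly zero:", frac_zero)
print("posterior mean of draws:", draws.mean())
```

The mode lands on zero because the observation is smaller in magnitude than the threshold `1 / b`, but the posterior itself is a continuous density, so any summary based on draws (mean, median, intervals) stays away from exact zero.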