Scalable Bayesian multilevel modeling

@wds15: agreed, we need a well-written and tested solver. What about KINSOL, as discussed in previous posts? Are there reasons to prefer IDAS to KINSOL?

You’re right that which autodiff mode we use plays a crucial part. Here, differentiation is done via the implicit function theorem, combined with a few analytical results. Only the Jacobian of the system, f, with respect to the parameters gets computed with autodiff: with rev-mode this takes > 40 s, with fwd-mode < 1 s. But I think we can do better with rev mode and the right initial cotangent, per @betanalpha’s suggestions. This takes one sweep, see details here.

Parallelization in general would be great – and one of the benefits of KINSOL. But even “before”, there are still steps to improve the optimization. In any case, down the road we want to use the solver from KINSOL or IDAS.