I think we’re on the same page here, other than the point I keep making: nothing requires the log density to be a joint density over data and parameters. It’s just interpreted as a density over the parameters up to proportionality.
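Concretely, here’s a minimal sketch in Stan (the data and parameter names are just for illustration) of the point that only proportionality matters:

    data {
      int<lower=0> N;
      vector[N] y;
    }
    parameters {
      real mu;
    }
    model {
      // either statement defines the same posterior for mu;
      // neither needs to be a normalized joint density over (y, mu)
      target += normal_lpdf(y | mu, 1);      // full log likelihood term
      // target += -0.5 * dot_self(y - mu);  // same posterior, constants dropped
    }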
No, but it’d be useful to know which kinds of data we could impute as missing data: only modeled data. It might protect against accidental missing data leading to sampling improper densities.
If we have minibatches, the distinction would cut across modeled data (the data themselves) and unmodeled data (the sizes of the minibatches).
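To make that distinction concrete, here’s a hypothetical sketch of a data block with comments marking which declarations could ever be imputed (the names are made up; current Stan doesn’t mark modeled vs. unmodeled data at the declaration level):

    data {
      int<lower=1> N;            // unmodeled: a size, never imputable
      int<lower=1> batch_size;   // unmodeled: minibatch size
      vector[N] y;               // modeled: the only thing we could impute if missing
    }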
Yes. You can implement them both as submodels. You probably wouldn’t implement the simple case that way, since it’s also implementable as beta ~ normal(mu, sigma); that’s what would get swapped out for non_centered_normal(beta, mu, sigma); if you wanted to write it in parallel. This is the case Maria’s thesis was largely designed to address, so there are examples there, too.
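For reference, here’s roughly what non_centered_normal(beta, mu, sigma) would have to expand to in today’s Stan, assuming it just swaps in the usual non-centered reparameterization (a sketch, not the proposed submodel syntax):

    data {
      int<lower=1> N;
    }
    parameters {
      real mu;
      real<lower=0> sigma;
      vector[N] beta_raw;                        // auxiliary standard-normal parameter
    }
    transformed parameters {
      vector[N] beta = mu + sigma * beta_raw;    // implies beta ~ normal(mu, sigma)
    }
    model {
      beta_raw ~ std_normal();                   // centered version: beta ~ normal(mu, sigma)
    }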
The math library doesn’t need to change. We could start distributing libraries of submodels as well as libraries of functions. Even better if we could compile and link them independently. I’ve been reading some things by Bob Harper et al. recently on submodules in ML and the principles of submodules in general that I find very compelling, and it includes discussion of independent compilation of submodules (which we could do; there really isn’t anything magic about scope going on, despite some lingering confusion due to the ill-formed first example I presented).
This should be a separate issue, but I can’t resist any bait (I’d be a short-lived fish).
As you know, the hooks are there in the code to make this easy—all the Jacobians are done in one step in the log density function and could be stored in a separate value.
I’ve wanted to have jacobian += in the transformed parameters block from the get-go, but there is a lot of opposition to that, both because it’d be confusing and because we’d never use it.
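To be concrete, here’s a sketch of how the Jacobian adjustment has to be written today versus what a jacobian += statement might look like (the jacobian += line is hypothetical syntax, shown commented out):

    parameters {
      real log_sigma;
    }
    transformed parameters {
      real<lower=0> sigma = exp(log_sigma);
      // proposed: jacobian += log_sigma;   // log |d sigma / d log_sigma|
    }
    model {
      target += log_sigma;      // today: the Jacobian adjustment lives in the model block
      sigma ~ lognormal(0, 1);  // prior stated on the transformed scale
    }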
What’s the use case?