The Monster Model | Hierarchical ODE models

Hm, just a short, not-yet-bread-worthy update. I’ll have to work on something else for now, but will revisit this at some later point. Everything that follows has been tested on just a few simulated data sets, with the true parameters drawn from the population posterior as specified in Table 1 of the original paper, so take it with a grain of salt:

  • The linear ODE has a closed-form solution (see e.g. Wikipedia); a sketch follows this list.
  • The linear ODE with the original parametrization actually fits (almost) fine, with and without truncation. Large measurement errors make things significantly easier than precise measurements. However, for “precise” (log-std < 0.1) measurements there is some unexpected “true” multimodality in the population-std parameters. “True” in the sense that we observe different modes in the marginal distributions and chains get stuck in those modes, yet Rhat(lp__) is fine (< 1.01). There may be a bug in my code.
  • I’m actually not sure whether the reduced linear ODE fits better or worse than the original ODE. It fails or works on the same problems. I seem to remember it working better than the original ODE, but maybe it just got lucky before?
  • Both linear ODEs fit on the order of seconds to minutes, depending on the magnitude of the measurement noise. The difference in runtime is due to a much larger n_leapfrog__ for precise measurements, as the posterior is then “more constrained”.
  • For any of the ODEs, the marginal distributions of many of the parameters do not change much between prior and posterior.
  • To cheaply fit the full nonlinear ODE we can in principle make use of a Strang splitting. In practice there are some floating-point overflow issues, so instead of splitting the flow into two exact flows, I currently use the exact flow of the linear part plus an approximation of the flow of the Michaelis-Menten part obtained via a single step of the implicit Euler method, which can be written down explicitly (see the sketch after this list).
  • We can monitor the convergence of our splitting method by recomputing lp__ for the whole fit with more sub-steps (e.g. using the `log_prob_grad` method for CmdStan, mimicking its `generate_quantities` method). From this we can estimate n_eff and decide whether we need more sub-steps; a sketch follows below. Across the whole prior we need quite a few sub-steps to be sufficiently precise, but the prior is very wide. Across the whole posterior we need significantly fewer steps. How many steps we need depends on the parameter values (larger VMI and KMI require more sub-steps, because a linear approximation gets worse) and on the magnitude of the measurement noise (larger measurement noise allows coarser approximations).
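
As a reference for the first bullet, here is a minimal sketch of the closed-form solution in my notation (not the paper’s), assuming the linear part is written as the affine system $\dot x(t) = A x(t) + b$ with invertible $A$ and initial state $x(0) = x_0$:

$$
x(t) = e^{tA} x_0 + A^{-1}\left(e^{tA} - I\right) b .
$$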
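
The splitting itself, again in my notation: over a sub-step of length $h$, Strang splitting composes the flows of the linear part and of the Michaelis-Menten part as

$$
\Phi_h \approx \Phi^{\text{lin}}_{h/2} \circ \Phi^{\text{MM}}_{h} \circ \Phi^{\text{lin}}_{h/2},
$$

where $\Phi^{\text{lin}}$ is the exact matrix-exponential flow from above. Writing the Michaelis-Menten part as the scalar ODE $\dot c = -V_{\max}\, c / (K_M + c)$ for the metabolizing compartment, a single implicit Euler step solves

$$
c_h = c_0 - h\, \frac{V_{\max}\, c_h}{K_M + c_h},
$$

a quadratic in $c_h$ with the explicit nonnegative root

$$
c_h = \frac{1}{2}\left(c_0 - K_M - h V_{\max} + \sqrt{\left(c_0 - K_M - h V_{\max}\right)^2 + 4 K_M c_0}\right),
$$

which avoids evaluating the exact Michaelis-Menten flow (presumably the source of the overflow issues mentioned above).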
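
A sketch of how the sub-step diagnostic could look, assuming we already have, per draw, the coarse lp__ that the sampler actually used and a refined lp recomputed with more sub-steps (function and variable names are mine; the recomputation itself would go through CmdStan’s log_prob machinery as mentioned above):

```python
import numpy as np

def importance_ess(lp_coarse, lp_fine):
    """Treat the coarse posterior as the proposal and the refined posterior
    as the target, and return the effective sample size of the implied
    importance weights (Kish formula).

    lp_coarse, lp_fine: arrays of shape (n_draws,) holding the log-density
    of each draw under the coarse and refined approximations.
    """
    log_w = lp_fine - lp_coarse
    log_w -= np.max(log_w)               # stabilize before exponentiating
    w = np.exp(log_w)
    return w.sum() ** 2 / (w ** 2).sum()

# Hypothetical usage: if refining barely changes lp, the weights are nearly
# constant and ess ~ n_draws; a small ess says we need more sub-steps.
rng = np.random.default_rng(0)
lp_coarse = rng.normal(size=1000)
lp_fine = lp_coarse + rng.normal(scale=0.01, size=1000)  # tiny discretization error
print(importance_ess(lp_coarse, lp_fine))  # close to 1000
```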

There are a few more ideas I’ll try when I come back to this problem.

Ah, I forgot this:

  • To cheaply explore the qualitative behavior of the posterior of the nonlinear model, we may just use a fixed number of sub-steps for both data generation and fitting. Fitting this approximation of the full nonlinear model, which hopefully behaves qualitatively like the true nonlinear model, then again takes on the order of seconds to minutes.

Oh, I also forgot to attach the obligatory excessively large picture:

[Figure: prior (blue) vs. posterior (red) marginal distributions. The population-std of Pba misbehaves. There are two divergences that could be eliminated, but I broke the part of my code that does this.]

Almost everything should be as in the paper, except for:

  • The prior on the population-stds, which is lognormal with std 1/2; the picture looks slightly worse with the original priors.
  • All timepoints are equidistant, including the exposure start and stop times (this allows us to reuse the matrix exponential; see the sketch after this list), and blood and air concentrations are measured at all timepoints.
  • The data are synthetic, obtained via bdf with abstol=reltol=1e-12.
  • There’s no truncation.
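
Regarding the reuse of the matrix exponential: a minimal, self-contained sketch with a toy system (the matrix, input, and spacing below are made up, not the paper’s), assuming equidistant timepoints so that the propagator and the affine increment are computed only once:

```python
import numpy as np
from scipy.linalg import expm

# Toy affine linear ODE  x' = A x + b  stepped exactly over a fixed grid.
A = np.array([[-1.0, 0.2],
              [0.1, -0.5]])  # hypothetical linear system matrix
b = np.array([0.3, 0.0])     # constant input (e.g. during exposure)
dt = 0.5                     # common spacing of all timepoints
n_steps = 10

P = expm(dt * A)                              # propagator, computed once
c = np.linalg.solve(A, (P - np.eye(2)) @ b)   # affine part A^{-1}(P - I) b

x = np.zeros(2)
trajectory = [x.copy()]
for _ in range(n_steps):
    # Exact step of the affine ODE; because exposure start/stop fall on the
    # same grid, only b (and hence c) changes there, while P is reused.
    x = P @ x + c
    trajectory.append(x.copy())
```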

(The prior on the measurement noise appears different from the one in the paper because we sample from the prior, and this doesn’t work with the original prior. For fitting we use the original prior.)

We used 8 sub-steps, which appears to have been enough. The true parameters (black vertical lines) seem to be recovered quite nicely.

Fitting took ~3 minutes. Rhat(lp__) < 1.01, E-BFMI ≈ 1.
