@bbbales2 & @wds15 have for quite a while lamented the lack of application of ode_adams in their modeling & development practice. This is partially my fault, for being terse in the documentation and procrastinating afterwards, and partially due to the bias of the models used in Stan’s testing & benchmarks, which is really a bias of the fields Stan has touched so far. I’ve started to amend the issue in the documentation, and here I’d like to give an example in which ode_adams triumphs.
So far Stan’s ODE development routinely involves only a handful of examples: the predator-prey model by @Bob_Carpenter, an epidemiology model as found in posteriordb, a PKPD model, a textbook oscillation model, and the Lorenz attractor model. The model below is not really more sophisticated; in fact, one can argue it is simpler for being linear:

y'' + ty = 0

This is Airy’s equation, and in code it’s just
functions {
  vector airy(real t, vector y) {
    return [y[2], -t * y[1]]';
  }
}
Assuming we want to infer y_0 given y observed at t=90,91,\dots,100:
data {
  int n;          // 11 data points from t=90 to 100
  real ts[n];
  vector[n] yobs;
  real rtol;      // 1e-10
  real atol;      // 1e-10
}
and also want to ensure that the accuracy at these points is at the scale of 1e-6. Considering how frequently the equation is applied in optics and electromagnetics, this is not too much to ask for instrumentation. Since the equation has an analytical solution, we can backtrack to determine that the tolerance required of the ODE solvers should be at least 1e-10. This happens to be the default of the BDF & Adams solvers.
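For reference, since the initial conditions below are y(0) = y_0 and y'(0) = 0, the exact solution can be written in terms of the standard Airy functions \mathrm{Ai} and \mathrm{Bi} (whose Wronskian is 1/\pi):

y(t) = \pi y_0 \left[\mathrm{Bi}'(0)\,\mathrm{Ai}(-t) - \mathrm{Ai}'(0)\,\mathrm{Bi}(-t)\right]

Comparing this closed form against the numerical solution at t=90,\dots,100 is how the 1e-10 tolerance figure above can be checked.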
parameters {
  real<lower=0> y0;
  real<lower=0> sigma;
}
transformed parameters {
  vector[2] yinit = [y0, 0.0]';
  vector[2] y[n] = ode_adams_tol(airy, yinit, 0.0, ts, rtol, atol, 1000000); // or ode_bdf_tol / ode_rk45_tol
}
model {
  y0 ~ cauchy(0, 0.5);
  sigma ~ normal(0, 0.01);
  for (i in 1:n) {
    yobs[i] ~ normal(y[i][1], sigma);
  }
}
Simulated data based on y0 = 1.0, sigma = 0.01:
n <- 11
ts <- c(90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100)
yobs <- c(
-0.27204644, 0.26198631, -0.25835151, 0.22486253, -0.16963251, 0.09098462,
0.02549337, -0.14729033, 0.23571404, -0.27632096, 0.25950532)
rtol <- 1.e-10
atol <- 1.e-10
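For completeness, here is a sketch of how data like the above can be regenerated, assuming the gsl R package for the Airy functions; the original simulation script and its seed are not shown, so the seed below is made up and exact values will differ:

library(gsl)
set.seed(987)  # hypothetical seed, not the one used for the data above
y0 <- 1.0
sigma <- 0.01
ts <- 90:100
# closed-form solution of y'' = -t * y with y(0) = y0, y'(0) = 0
ytrue <- pi * y0 * (airy_Bi_deriv(0) * airy_Ai(-ts) - airy_Ai_deriv(0) * airy_Bi(-ts))
yobs <- ytrue + rnorm(length(ts), 0, sigma)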
Fitting from inits y0 = 1.1, sigma = 0.1, we get essentially the same fit regardless of the integrator used.
However, the time performance of the three integrators is very different: adams takes less than half the time of bdf. To see why, we print out the order used in each Adams time step (keep in mind that BDF & Adams are variable-order methods, BDF with orders 1-5 and Adams with orders 1-12) in a fixed_param run:
> fit <- mod$sample(data = "airy.data.R", init="airy.init.R", iter_sampling=1, fixed_param=TRUE, seed=987)
Running MCMC with 1 chain...
Chain 1 Last order: 9
Chain 1 Last order: 9
Chain 1 Last order: 9
Chain 1 Last order: 9
Chain 1 Last order: 9
Chain 1 Last order: 9
Chain 1 Last order: 9
Chain 1 Last order: 9
Chain 1 Last order: 9
Chain 1 Last order: 9
Chain 1 Last order: 9
Chain 1 Last order: 7
Chain 1 Last order: 7
Chain 1 Last order: 7
Chain 1 Last order: 7
Chain 1 Last order: 7
Chain 1 Last order: 7
Chain 1 Last order: 7
Chain 1 Last order: 7
Chain 1 Last order: 7
Chain 1 Last order: 7
Chain 1 Last order: 7
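(The “Last order” lines are not part of Stan’s standard output; they presumably come from a patched build that queries CVODES’s CVodeGetLastOrder after each step and prints the result.)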
This shows Adams consistently using orders much higher than rk45’s and than BDF’s maximum, so it can take fewer steps to reach the same accuracy. It is this ability to bump up to high orders on smooth, non-stiff problems that makes Adams suited to problems requiring high accuracy, such as in orbital dynamics and oceanography. Some simulations in these fields run over long durations (where will Mercury be in year 2050?), and one must tighten the tolerance to avoid accumulation of local errors.
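The same qualitative gap is easy to reproduce outside Stan. Below is a minimal sketch using R’s deSolve package, whose "adams" and "bdf" methods wrap ODEPACK’s lsode, the same linear-multistep families that CVODES implements; this is an analogue of, not the benchmark behind, the timings above:

library(deSolve)
airy <- function(t, y, parms) list(c(y[2], -t * y[1]))
yinit <- c(1, 0)
times <- seq(0, 100, by = 0.5)
# Adams (non-stiff, max order 12) vs BDF (stiff, max order 5) at tight tolerances
system.time(ode(yinit, times, airy, NULL, method = "adams", rtol = 1e-10, atol = 1e-10))
system.time(ode(yinit, times, airy, NULL, method = "bdf",   rtol = 1e-10, atol = 1e-10))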
The model & data in this post can be found in Torsten’s example models.