I have been working with large systems of ordinary differential equations in a systems modelling context, similar to the ‘system dynamics’ literature of Jay Forrester, John Sterman et al. System dynamics models often include a large number of ODEs (20+ is not uncommon) and parameters, and data for some of the state variables may never be available/have a high proportion of missingness. For example, see the C-ROADS model (https://www.climateinteractive.org/tools/c-roads/).
Solving large systems of ODEs in Stan obviously takes a long time (up to days/weeks in certain cases with a lot of data), although I have managed to increase their speed by ensuring all parameters are on a similar scale and amending the tolerances etc.
I was reading this paper comparing dynamic Bayesian Markov models to ‘gold standard’ Bayesian ODEs (https://bmcmedresmethodol.biomedcentral.com/articles/10.1186/s12874-018-0541-7), where the authors essentially converted the system of ODEs they were working with to difference equations, and fit the models using WinBugs (for the ODE model) and JAGS. They found that the discretised dynamic Markov model gave comparable estimates to the Bayesian ODE, with a massive speed increase.
I changed some of my Stan programs to difference equations rather than using integrate_ode_rk45
, and also saw a massive speed increase - one model on simulated data went from running for ~2 days to under 20 minutes, with no signs of convergence problems (no diveregent transitions, good n_eff) or any real decrease in accuracy of the results.
As a small example, I have attached a logistic model coded as an ODE system and as a difference equation/discrete-time system (DTS), showing that the estimates are reasonably comparable.
The ODE function looks like:
// logisitc ode model
functions{
real[] m(real t, real[] z, real[] theta, real[] x_r, int[] x_i){
real dydt = z[1] * theta[1] * (1 - z[1]/theta[2]);
return{ dydt };
}
}
and the DTS system looks like:
// logisitc dts model
functions{
real[] m_dts(int T, real[] z, real[] theta){
real n[T];
n[1] = z[1];
for(t in 1:(T-1)){
n[t+1] = n[t] + n[t] * theta[1] * (1 - n[t]/theta[2]);
}
return n[];
}
}
Here is the ODE output, where the true parameters were y_init = 1
, sigma = 0.1
, theta[1] = 0.2
and theta[2] = 200
:
mean se_mean sd 2.5% 25% 50% 75% 97.5% n_eff Rhat
y_init[1] 0.95 0.00 0.04 0.86 0.92 0.95 0.98 1.03 2042 1
sigma 0.11 0.00 0.01 0.10 0.11 0.11 0.12 0.13 2807 1
theta[1] 0.20 0.00 0.00 0.20 0.20 0.20 0.21 0.21 1966 1
theta[2] 196.98 0.05 2.74 191.52 195.08 197.02 198.85 202.32 3069 1
and here is the DTS output:
mean se_mean sd 2.5% 25% 50% 75% 97.5% n_eff Rhat
y_init[1] 1.19 0.00 0.05 1.10 1.16 1.19 1.23 1.30 2035 1
sigma 0.11 0.00 0.01 0.10 0.11 0.11 0.12 0.13 3375 1
theta[1] 0.22 0.00 0.00 0.21 0.22 0.22 0.22 0.23 1695 1
theta[2] 195.61 0.05 2.62 190.47 193.88 195.56 197.40 200.59 2620 1
And here are the model predictions (thin black lines - predictions are without added noise) plotted against the simulated noisy data (red dots) and actual simulated model (thick blue line):
There are some drops in accuracy, even in this small model, but I have not found them to be particularly worrying, even in larger models where differences between continuous and discrete time modelling may be more evident (i.e. models where ordering of events has a potentially large effect on the outputs).
I’d be really interested to hear people’s opinions about the costs and benefits of this approach over using the ODE solvers in Stan. Obviously, using the ODE solvers is the gold standard, but when there is a potentially 10-20 times speed up in model runs for a relatively small drop in accuracy, my thinking is that it could be a useful tool.
I will try to look into more complex models in a more systematic fashion too.
r-logistic-example.R (1.7 KB) logistic-ode.stan (805 Bytes) logistic-dts.stan (760 Bytes)
Thanks!