Solving ODE in Stan: different computational times declared by R and Stan

anthony_462 · December 11, 2019, 12:06pm

My ordinary differential equation (ODE) system contains 386 compartments with all but one rates fixed. I assigned a normal prior for the non-fixed rate “theta” and runned my ODE system for 120 timesteps. All the rates are known: the model is not fitted. I only used the bayesian (and Stan) framework in order to vary theta according to its prior. To have an idea of the computational time needed by Stan (and more particularly the integrate_ode_rk45), runned the model for 1 iteration, no warm-up and 1 chain. Surpringly, I found a big difference in the time declared by R and by Stan! The processing time calculated by R is 32 min, but when looking at Stan message, the elapsed time is only 275 secondes (approx. 4 min) (see below). According to me, these two times should be approximatively the same, as the .stan file has been already compilled before this simulation. Moreover, as I specified only 1 iteration, the ODE system needs to be solved only a few times until one value of the theta is accepted. I already solved this ODE system in R and this should not take 32 minutes.
As it is a quite complex model, I cannot unfortunately provide the code.

time.start_nuts1 <- Sys.time()
fit = sampling(mod3, data = list_data, init = 0, chains = 1, warmup = 0, iter = 1)

SAMPLING FOR MODEL 'diff1c' NOW (CHAIN 1).
Chain 1: 
Chain 1: Gradient evaluation took 171.198 seconds
Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 1.71198e+006 seconds.
Chain 1: Adjust your expectations accordingly!
Chain 1: 
Chain 1: 
Chain 1: WARNING: No variance estimation is
Chain 1:          performed for num_warmup < 20
Chain 1: 
Chain 1: Iteration: 1 / 1 [100%]  (Sampling)
Chain 1: 
Chain 1:  Elapsed Time: 0 seconds (Warm-up)
Chain 1:                275.864 seconds (Sampling)
Chain 1:                275.864 seconds (Total)
Chain 1: 
Warning messages:
1: There were 1 divergent transitions after warmup. Increasing adapt_delta above 0.8 may help. See
http://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup 
2: Examine the pairs() plot to diagnose sampling problems
 
3: Bulk Effective Samples Size (ESS) is too low, indicating posterior means and medians may be unreliable.
Running the chains for more iterations may help. See
http://mc-stan.org/misc/warnings.html#bulk-ess 
4: Tail Effective Samples Size (ESS) is too low, indicating posterior variances and tail quantiles may be unreliable.
Running the chains for more iterations may help. See
http://mc-stan.org/misc/warnings.html#tail-ess 
time.end_nuts1 <- Sys.time()
time.end_nuts1 - time.start_nuts1

EDIT : When I try to run the model with all parameters fixed (no “theta”) using the “Fixed_param” algorithm in the sampling() function, I also end up with big difference in computational times calculated in R and Stan:
time.start_nuts1 <- Sys.time()
fit = sampling(mod2, data = list_data, init = 0, chains = 1, warmup = 0, iter = 1, seed=13219, algorithm = “Fixed_param”)

SAMPLING FOR MODEL 'diff1b' NOW (CHAIN 1).
Chain 1: Iteration: 1 / 1 [100%]  (Sampling)
Chain 1: 
Chain 1:  Elapsed Time: 0 seconds (Warm-up)
Chain 1:                1.551 seconds (Sampling)
Chain 1:                1.551 seconds (Total)
Chain 1: 
time.end_nuts1 <- Sys.time()
time.end_nuts1 - time.start_nuts1
Time difference of 7.666541 mins

wds15 · December 11, 2019, 12:43pm

Whow!

That’s huge.

Make sure that the only parameter “theta” you pass into the ODE solver is the one rate which you want to fit. Nothing more. You will very likely need the stiff solvers. If appropriate, then make the initial state non-varying (so it must be data).

However, I am not really sure if you get anywhere useful. Maybe try ADVI or optimization (if that’s doable in your case)… but full Bayes will be daunting to do.

The best route here is to simplify… to maybe 4-5 equations at most… really… otherwise you suffer.

anthony_462 · December 11, 2019, 1:51pm

Thanks wds15. Actually, I don’t fit the model, I just specify a prior for theta and run 1 iterations, that’s it. See my updated question. My question is really about the difference about the processing time declared by R and by Stan.

wds15 · December 11, 2019, 1:55pm

Ah. Are you returning a lot of data from the Stan program? Rstan is slow with serialising the data.

How does this work out if you use cmdstan and use “time” to measure the execution time?

anthony_462 · December 11, 2019, 2:13pm

Two array of size 120x386 are returned. I’m unfortunately not familiar with cmdstan, do you have an example to know how it work to measure time?

wds15 · December 11, 2019, 2:25pm

the issue is then most likely that Rstan is doing super inefficient serialisation which you cannot avoid. So you would need to get started with cmdstan (or use cmdstanr).

anthony_462 · December 11, 2019, 4:15pm

I tried with cmdstan and got the same issue. I also tried to remove the “theta” (using the “Fixed_param” algorithm), but the issue remains (see my edited question). I’m really wondering what Rstan do after (and before) the iteration to spend so much time.

Topic		Replies	Views
Fine tuning for polynomial Posteriors Modeling performance	1	37	December 3, 2024
Stan and cmdstan running slow Modeling	8	717	March 24, 2022
Slow hierarchical ODE model Modeling rstan	3	525	January 25, 2023
Costs/benefits of solving large ODE systems as systems of difference equations/in discrete time Modeling performance	9	2507	November 13, 2019
ODE with random effects running slow Modeling	0	287	November 9, 2022

Solving ODE in Stan: different computational times declared by R and Stan

Related topics