Zero chain movement/mixing in forced ODE model

If your forcing function is step-like, then integrate step-wise through the system. That is a bit more tedious, but this is actually the preferred solution for the case of step-function inputs.

I rediscovered this the hard way, but I’m glad to hear it’s something that’s already known. Is it documented anywhere? I haven’t found any other reference to it. If it’s not already documented, I would suggest giving it some prominence, given that it has the potential to cause very mysterious behavior.

All the best,

Robert Dodier

1 Like

Personally, I think it’s great that people dig out old threads to add information, that way other people can find everything in one place. But not quite sure what the actual policy is there…

You are certainly not the only one that fell into this trap. What did happen in your case? Insane runtime scalings when adding more intervals as e.g. here? And/Or runtime was fine, but the chains were stationary?

I do wonder whether it’s mainly the bad runtime scaling or also the bad gradient approximation. I’ll link back and forth to `log_prob_grad` method for CmdStan mimicking its `generate_quantities` method

Hi Funko, thanks for your reply. Since you asked, what happened, from what I can tell, is that the ODE solver (both rk45 and adams are susceptible, adams less so than rk45) was evaluating the derivatives at time points that were spaced relatively far apart, because the solution was very smooth in those regions. However, there are forcing events which can happen in a short time window, and if the forcing events are between two derivative evaluations, the ODE solver doesn’t see it at all, and the solution is computed as if that event never happened.

The observable symptom was that the solution sometimes had all the bumps (effects of the forcing events) and sometimes some of the bumps were missing. The likelihood function in this model is effectively just the mean-square error between the computed solution and some given data points, so when bumps are included or excluded, it makes for jumps in the MSE, which could occur for only slightly different parameter values.

So the likelihood is bumpy, which, I surmise, leads to problems in sampling. The sampling behavior I observed was that each chain would get stuck at one parameter value, and the sampler would take microscopic steps (several orders of magnitude too small) around that.

I was calling the ODE solver one time with a list of times for which the output is required (1152 intervals of 5 minutes each). I printed out the times at which the derivatives were evaluated and that’s how I found the solver was skipping over some event times (events happen within one 5 minute window). I changed it so that the ODE solver was called 1152 times, once for each 5 minute interval, with the previous solution as the initial conditions and asking for output at only one time, namely t + 5 minutes.

When I call the solver once for each interval, that ensures all events are detected, and the likelihood function is smooth, and the chains are happy.

I wonder how to help people avoid this problem. Maybe the ODE solver can have a secondary output which is the list of times at which the derivatives were evaluated? Just thinking out loud here. The problem I see is that if people supply a list of times for which they want the solutions, they might incorrectly assume that the derivatives will be evaluated at those times – that is the incorrect assumption which I made. Maybe there is some way to advertise that’s not the case.

Thanks for your interest, and all the best.

Robert Dodier

1 Like

I thought we have some text for this in the doc by now.

You did the right thing… though you pay for it if the initial condition at t0 is data only and not a parameter being inferred. In that case you are enlarging the ODE system by integrating one-by-one time point and restarting. What you could do instead is do again just one big solve, but output in addition around the events the solution of the ODE. Here is a good discussion about discontinuities:

https://computing.llnl.gov/projects/sundials/usage-notes

1 Like

I thought we have some text for this in the doc by now.

The documentation for the ODE solvers seems like the right place for it, but at present the documentation (9.2 Ordinary differential equation (ODE) solvers | Stan Functions Reference) says only that times is “solution times”, and that the return value is the “state of the system at every time in specified in the times argument”. So if one is sufficiently enlightened, as I am now, one knows that doesn’t imply the derivatives are evaluated at those times. I guess I am thinking maybe it should be more explicit, for the benefit of less enlightened readers.

What you could do instead is do again just one big solve, but output in addition around the events the solution of the ODE.

Not sure what you mean. It sounds like you are suggesting making one call to the solver and asking for output around the event times, but that’s what I did before, and it didn’t have the expected result.

Yes, this is what I mean. Maybe you need to output before, at and just after the event a time-point to make the ODE integrator really „see“ the jump. Did you try only RK45 or also the CVODES based solvers?

The ODE doc got improved and this will become available with 2.27

As I read the usage notes, this does not guarantee that the rhs actually gets evaluated at those times:

After a successful return from the solver function, the dense output functions return the interpolated solution at any time t

But whether or not the rhs gets evluated probably depends on the implementation stan-side?

Maybe you need to output before, at and just after the event a time-point to make the ODE integrator really „see“ the jump.

No, that’s exactly what didn’t work, and can’t work, according to the SUNDIALS page referenced before. The times array which I was supplying to the solver included the points at which there are discontinuities in the derivatives, and the solver missed them, because the solution was smooth before and after those times, so it believed (quite reasonably) that it didn’t need to evaluate in that region, and calculated the output value at those times by interpolation.

I tried rk45 and adams solvers, and rk45 is more susceptible to this problem, and adams less so, but instead of leaving it to chance it seems better to guarantee that the derivatives will be evaluated by restarting the integration at the discontinuities. This is what I ended up doing, and also it’s what’s recommended by the SUNDIALS page.

Interesting. It is for sure the safe thing to restart the ODE integrator at discontinuous events. I think I now recall in that I made my discontinuous events (dosing events of drugs given to patients) less discontinuous by controlling it with a width of a delta peak. When choosing the delta peak wide enough and having some time-points around the event, then everything worked for me. That’s probably problem specific. Anyway, I am glad it now works for you.

ODEs are tricky!

I’m working in a very similar domain; the event is physiological response to carbohydrate intake. I also considered stretching the event out over time so that the solver would be more likely to see it. However, I didn’t want to leave things to chance, and so I tried to figure out a way to be sure the event was noticed, and that’s how I ended up stopping and restarting the solution.

If you smear out the event, have the integrator spit out times around the event and have tolerances tight enough, then it should work ok is what I expect…but restarting is safest, but comes at a performance penalty potentially.

This sounds unreasonable, does it not?

Wonder whether it isn’t possible to remove that penalty…

1 Like

If use cvodes’ restart function the penalty is very modest. It’s just Stan doesn’t implement that.

My 2 cents

  • there’s now a cash-karp (ode_ckrk) integrator that handles near-discontinuity better than rk45.
  • instead of adding events through time points one can also code the RHS function using if-else to condition its behavior, but
  • there’s no reason to believe the solver will handle the solution immediately after impulse input properly if one doesn’t control the tolerance and time scale accordingly. Current implementation assumes initial time step 1 and will try to accommodate the requested accuracy as much as it can, but with impulse forcing it often fails.
  • this is an issue we should try to automate as much as possible. Even though we can ask user to strategically place time around discontinuity, it’s only possible for static events not dynamical events. Cvodes supports event location it’s just Stan hasn’t implemented it.
1 Like