Try to lower your absolute tolerance if your problem allows that. Also try the Adams solver for the backward solve. That helped in a few of my problems.
I tried reducing both absolute and relative tolerance to 10^{-3}, and changed to the Adams solver as you suggested. I am still getting no sampling and the following error:
'CVODES: CVode At t = 80 and h = -1.73212e-151, the error test failed repeatedly or with |h| = hmin. Error code: -3'
I am starting to think that the problem might be too ill-posed with the data I am trying to fit it with.
How about simulated data?
@wds15 Yes. I think that would also be a good step.
The difficulty of what I am trying to do is that I don’t know what is the underlying process. More specifically, I don’t know which variables affect which, and I would like to, through some variable selection process, choose those variables and interactions that best describe the process of each variable.
Now, in order to build such simulated data, I think I probably have to come up with a (sparse) matrix of coefficients, simulate the ode’s forward and then try to infer that matrix.
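That forward-simulation step is straightforward to prototype outside Stan. A minimal Python sketch (all dimensions, sparsity patterns, and coefficients here are made up for illustration, and the interaction and exogenous terms are omitted for brevity):

```python
import numpy as np
from scipy.integrate import solve_ivp

rng = np.random.default_rng(0)
N = 4  # toy dimension; the real problem has N = 14

# Sparse "true" coefficient matrix: a few nonzero entries, everything else zero
theta = np.zeros((N, N))
theta.flat[rng.choice(N * N, size=3, replace=False)] = rng.normal(size=3)
theta -= np.eye(N)  # damping on the diagonal keeps the toy system stable

def rhs(t, y):
    # Linear part only; the interaction (eta) and exogenous (gamma) terms
    # of the full model are left out to keep the sketch short
    return theta @ y

sol = solve_ivp(rhs, (0.0, 5.0), rng.normal(size=N),
                t_eval=np.linspace(0.0, 5.0, 50))
```

The simulated trajectories in `sol.y` (plus noise) would then be the data you try to recover `theta` from.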
What do you know about the underlying process?
I know the class of functions that might have generated the process. Specifically, I know that the only possible function that could have generated any particular variable is the following:
\frac{dy_{i}}{dt}=\theta'y+\eta \hat{y} + \gamma x
Where \hat{y} are the interaction terms, and x are some “exogenous” variables. I also know that the process is sparse. So most of the elements of \theta, \eta, and \gamma are 0.
Hm, so just to clarify because I’m a little bit confused by the single index on the left hand side which does not appear anywhere else:
You have an N-dimensional (N=14 is fixed?) state vector y \in \mathbb{R}^N which (using Einstein’s summation convention) evolves according to the ODE
\dot{y}_i(t)=\theta_{ij}y_j(t)+\eta_{ijk}y_j(t)y_k(t)+\gamma_{i,\mu}x_\mu
where the indices i,j,k run from 1 to N, \mu runs from 1 to some other natural number M, \theta is a (sparse) constant NxN matrix, \eta is a (sparse) constant NxNxN “tensor”, \gamma is a (sparse) constant NxM matrix and x is a (constant?) M-dimensional vector?
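For what it’s worth, that right-hand side is compact to express with `np.einsum`. A hedged Python sketch, with made-up dimensions and random coefficients just to show the contractions:

```python
import numpy as np

N, M = 14, 3  # N from the problem; M is made up here
rng = np.random.default_rng(1)
theta = rng.normal(size=(N, N))   # N x N matrix
eta = rng.normal(size=(N, N, N))  # N x N x N tensor
gamma = rng.normal(size=(N, M))   # N x M matrix
y = rng.normal(size=N)
x = rng.normal(size=M)

# dy_i/dt = theta_ij y_j + eta_ijk y_j y_k + gamma_imu x_mu
dydt = (np.einsum('ij,j->i', theta, y)
        + np.einsum('ijk,j,k->i', eta, y, y)
        + np.einsum('im,m->i', gamma, x))
```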
That is correct. I am sorry, I was a bit too informal in my explanation. x is not necessarily constant, but we don’t care about its dynamics. In fact we can control x, and whenever we act on x there will be an (unknown) effect on y. We do know, though, that all the matrices (and the tensor) of the dynamics are sparse. So it might be the case that only 3 or 4 of the \eta_{ijk} are non-zero out of all 14^{2} possible parameters for each \dot{y}_{i}(t).
So x(t) is given/prescribed?
The ODE does sound simple enough, but fitting it seems quite difficult. Carrying along all potentially (effectively) zero parameters will be very expensive, but a priori you also cannot just set anything to zero, or can you? Maybe you could start with just a few non-zero entries and then incrementally add more? But this also becomes a combinatorial nightmare, doesn’t it…
There’s nothing else you know about the matrices and tensors? Symmetry, antisymmetry or positivity?
Yes, x(t) is prescribed. And yes, you hit the nail on the head: if I were to try the problem with only a few combinations at a time, it becomes a nightmare quite fast. Unfortunately, I can’t set anything to zero a priori.
No additional information about the matrices and tensors. I do have more than one measurement. So for the system at hand I measured the process with different starting points and different x(t) 20 times.
Huh, how horrible.
For the ODE part you will have to somehow get the adjoint solver to work, but even then it will probably take forever…
I guess in principle it should be possible to adaptively remove non-zero entries, but none of the interfaces is equipped to handle this…
Thank you very much for your answer! (no sarcasm) It really tells me that maybe I am not looking in the right place, or that I have to somehow try to reduce the problem to several smaller problems.
If you are interested in how this problem has been tackled in the past, there is an interesting paper about it: “Learning stable and predictive structures in kinetic systems: Benefits of a causal approach” (https://arxiv.org/pdf/1810.11776.pdf). The thing is that they use a frequentist approach, and they do, in fact, try out several small ODEs. I wanted to tackle the problem from a Bayesian perspective.
I guess there is also the possibility of using a different inference method (such as VI) that might be less precise, but considerably faster. Then I would have to get out of the Stan world.
In any case, thank you very much for taking the time, discussing with me, and suggesting different approaches. I have learned a lot these last couple of days.
This might actually be the case. I think Stan also lacks support for sparse matrices, which would probably become crucial if you want to extend the method to e.g. metabolic networks as in the paper you linked.
Hm, Stan’s ADVI would not be applicable?
Well, I am looking for challenging models to fit. I guess your model would qualify ;)
I didn’t know ADVI was still supported. I am using PyStan, where that support stopped in version 3.2 (see “Upgrading to Newer Releases” in the pystan 3.2.0 documentation). Maybe I should take a look at CmdStan.
With respect to challenging problems. The data and all the rest of the details are available here: https://learningbydoingcompetition.github.io/
Interesting. But the problem specification suffers from the same index confusion as your formulation above, doesn’t it?
I think equation 7 of the whitepaper is more or less clear. It has a formulation similar to yours, without taking into account the control variables (what I called x but in their case is U).
Well yes, it’s just that the index i again only appears on the left hand side, which of course wouldn’t make a lot of sense.
Just digging this thread up to tell you that this is not correct, which might help you solve the problem (though probably not using Stan).
In particular, because the ODE is derived from mass-action kinetics, the following holds:
- \theta_{i,j} \ge 0 if i \ne j
- \eta_{i,j,k} \ge 0 if i \ne j and i \ne k.
Furthermore, if you do not allow species to “magically” introduce mass into the system you have at least
- \theta_{i,i} \le 0,
but I think magic is allowed.
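Those sign constraints are simple to encode as a check on candidate coefficients. A minimal Python sketch (the function name and shapes are illustrative, not from any library):

```python
import numpy as np

def satisfies_mass_action_signs(theta, eta):
    """Check the sign constraints implied by mass-action kinetics:
    off-diagonal theta >= 0, diagonal theta <= 0, and
    eta[i, j, k] >= 0 whenever i != j and i != k."""
    N = theta.shape[0]
    off_diagonal = ~np.eye(N, dtype=bool)
    if np.any(theta[off_diagonal] < 0):
        return False
    if np.any(np.diag(theta) > 0):
        return False
    i, j, k = np.indices(eta.shape)
    constrained = (i != j) & (i != k)
    return not np.any(eta[constrained] < 0)
```

In a Stan model one could instead build the constraints in directly, e.g. by declaring the constrained entries with a `lower=0` bound.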
@Funko_Unko Thank you for the information! Are you participating too?
Hm, I don’t know. Looks like I would need some serious computational resources, but currently I’m restricted to a sad little notebook.
The problem is of course massively scalable, both the fitting with 12x20 independent ODEs and then the optimization with 12x50 independent ODEs.
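Since the trajectories are independent, the solves parallelize trivially. A hedged Python sketch of that structure (toy sizes, made-up shared dynamics; the real problem would use the 12x20 and 12x50 counts above):

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor
from scipy.integrate import solve_ivp

rng = np.random.default_rng(0)
N = 4        # toy state dimension
n_traj = 6   # toy number of independent trajectories

# Made-up shared linear dynamics, damped so the toy system stays stable
theta = -np.eye(N) + 0.1 * rng.normal(size=(N, N))

def solve_one(y0):
    # Each trajectory is an independent ODE solve, so they can run concurrently
    return solve_ivp(lambda t, y: theta @ y, (0.0, 5.0), y0,
                     t_eval=np.linspace(0.0, 5.0, 20))

y0s = rng.normal(size=(n_traj, N))
with ThreadPoolExecutor(max_workers=4) as pool:
    sols = list(pool.map(solve_one, y0s))
```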