How to best handle a large forcing vector for an ODE system

Hi Stan team,

I’m working on a project that compares the fit of multiple soil ecology ODE models to a time series vector of some biomass measurements. These ODE systems would receive forcing input from temperature along the lines of the following pseudocode (the actual system is more complicated and unnecessary to show in this case):

real[] some_ode(real t, real[] y, real[] theta, real[] x_r, int[] x_i) {
    real dydt[3]; 
    dydt[1] = -theta[1] * y[1] + theta[2] * y[2] + theta[3] * (theta[4] - temp_vector[t]) * y[3]; 
    dydt[2] = theta[1]* y[1] - 2 * theta[2] * y[2];
    dydt[3] = theta[2] * y[2] - theta[3] * (theta[4] - temp_vector[t]) * y[3];
    return dydt;
}

t and consequently temp_vector are relatively long vectors with hundreds of thousands of entries. I was wondering what the best way of incorporating temp_vector into the ODE system code would be. As individual elements in x_r, that would be too inefficient and tedious to code. However, I will not be able to include temp_vector as an element in x_r because it would be an array and not a real. Is there an efficient means of implementing this forcing term in Stan?

Thank you for your help!

Hi,

t is a real so it cannot be used as an index for an array or vector. What you probably need to do is to define a function temp_vector(t). Even if you only request a solution at times that are integer, the numerical integrator evaluates the derivative, dydt are various time steps in between.

Do you have an expression for temp_vector, or only an array of value? In the latter case, you may need to extrapolate but that depends on the specifics of your model.

1 Like

Hi @charlesm93, nice to see you again! We interacted at some point 3 - 4 years back where you gave me some advice about ODE’s in Stan, so thanks again for that.

So, in this case, I’m required by the leads on this project to figure out a way to fetch the temperatures from an array of value. I do not have an expression for temp_vector. I could fit it very specifically via a lot of splines, but that also seems very inefficient.

Hmm, is there any possible way to hack something together to bring the vector in and then afterward, some ceiling or floor function that could be used to map the real t to an int? If not, I’ll probably have to use something other than Stan for this project.

1 Like

Hi!

This is a common problem… below is a function which gives you back the index i in a sorted vector (of time-points) for which x > sorted_i just holds.

int find_interval_elem(real x, vector sorted, int start_ind) {
  int res;
  int N;
  int max_iter;
  real left;
  real right;
  int left_ind;
  int right_ind;
  int iter;

  N = num_elements(sorted);

  if(N == 0) return(0);

  left_ind  = start_ind;
  right_ind = N;

  max_iter = 100 * N;
  left  = sorted[left_ind ] - x;
  right = sorted[right_ind] - x;

  if(0 <= left)  return(left_ind-1);
  if(0 == right) return(N-1);
  if(0 >  right) return(N);

  iter = 1;
  while((right_ind - left_ind) > 1  && iter != max_iter) {
    int mid_ind;
    real mid;
    // is there a controlled way without being yelled at with a
    // warning?
    mid_ind = (left_ind + right_ind) / 2;
    mid = sorted[mid_ind] - x;
    if (mid == 0) return(mid_ind-1);
    if (left  * mid < 0) { right = mid; right_ind = mid_ind; }
    if (right * mid < 0) { left  = mid; left_ind  = mid_ind; }
    iter = iter + 1;
  }
  if(iter == max_iter)
    print("Maximum number of iterations reached.");
  return(left_ind);
}

So you can first find the index of the respective time and then you use that to pull out the right value of y.

I used the function a lot and I think I tested it a while ago, but do treat this function with caution and consider to test yourself.

Sebastian

1 Like

Thanks, @wds15.

So, I think my remaining issue is just still getting the temp_vector into the ODE function code in the first place if x_r only takes reals and not vectors. Have any folks had experience bringing in forcing vectors into ODE code?

to_array_1d(temp_vector) ?

@wds15, sorry, I’m a bit confused. Are you saying that to_array_1d(temp_vector) should allow me to convert to an array of reals that could then be assigned as an element in x_r?

Also, @wds15, I’m guessing from your posts that you’ve implemented forcing in ODE systems that you’ve fit with Stan before? If so, I’m definitely interested in taking a look at your previous code if you’re willing to share for some inspiration. If not, no problem.

Correct.

Have a look at my Stancon Helsinki 2018 contribution…and there is a discourse thread of mine on odes and forcing functions floating around on the forum.

EDIT: And I am pretty sure @charlesm93 has also some material on forcing functions for ODEs.

Sebastian, Thank you for the heads up. So, for pseudocode, was something along the lines of this a reasonable way to conceptualize this?

functions{

int index_search(real x, int max_index) {
    real rounded_x = ceil(x);
    int iter = 0;
    int stop = 0;
    int out;
    while (stop < 1 && iter < max_index) {
        if (rounded_x == iter) {
            out == iter;
            stop = 1;
        } else {
            out == max_index; //If final iteration, x should equal max_index, and iter would grow until = max_index.
        }
        iter = iter + 1;
        return out;
    }
//I was thinking I could use a simpler function for my index search because I did 
//not need to match a dose, as in your code, so I used something based on 
//@Christopher-Peterson's code snippet he shared a while back in a different post. 
}

real[] some_ode(real t, real[] y, real[] theta, real[] x_r, int[] x_i) {
    real dydt[3]; 
    int t_index;
    real temp_array = x_r[1]  //in this pseudocode, I assigned the temperature array as x_r[1]

    vector[N_t] temp_vector = to_vector(temp_array);

    t_index = index_search(t, N_t);

    dydt[1] = -theta[1] * y[1] + theta[2] * y[2] + theta[3] * (theta[4] - temp_vector[t_index]) * y[3]; 
    dydt[2] = theta[1]* y[1] - 2 * theta[2] * y[2];
    dydt[3] = theta[2] * y[2] - theta[3] * (theta[4] - temp_vector[t_index]) * y[3];
    return dydt;
}
}

...
data {
int<lower=1> N_t // Number of time points and forcing points
vector[N_t] temp_vector
...
}
transformed data {
x_r[1] = to_array_1d(temp_vector)
...
}

...

Possibly there are logic errors in my pseudocode for finding the t index. Was just thinking I needed a different interval search function since I was not looking for an index corresponding to sorted increasing values.

Here’s the thread @wds15 is talking about. It has code and examples from the two of us:

I wrote a poster on the subject and an implementation for Torsten (a extension of Stan for pharmacometrics), which I have here. I also wrote a manuscript, but the journal I submitted it to was not interested.

@andrewgelman spoke about writing a case study paper or extending documentation on ODE-based models, following this episode. It could be an extensive update on the (relatively simple) StanCon article I wrote on ODE-based models in 2017, and an opportunity to pool everyone’s expertise.

The code looks fine, though Sebastian can take a closer look.

The question to raise is how do you want to do the extrapolation: here, it looks like you’re rounding t, so essentially, your function over \mathbb R is a step-function. If this is your intention, the above scheme should work.

Thanks, @charlesm93, the intent was indeed a step-function.

On a related note, I and my advisor have a pre-print in the journal Biogeosciences right now which used RStan and loo to fit two soil carbon models to univariate soil CO2 data from a meta-analysis and was heavily helped along by @Bob_Carpenter’s soil C example (https://mc-stan.org/users/documentation/case-studies/soil-knit.html). Current project involves a much larger multivariate data set and more complex ODE models. Coming from the applied side, my expertise is scant, but I’d happy to help how I can with ODE-based models in Stan review or documentation to give back to this community.

Much appreciation again for your help, @charlesm93 and @wds15!

Edit: @charlesm93, it is a bummer the journal was not interested. Looks like a very pertinent and useful project. Did you consider submitting the manuscript to a different journal?

Here’s the link to Sebastian’s article, really worth the read.

On a related note, I and my advisor have a pre-print in the journal Biogeosciences

Based on the abstract, this looks very interesting. Bob and I did spend some time on a carbon model way back when, the concept is very neat.

Edit: @charlesm93, it is a bummer the journal was not interested. Looks like a very pertinent and useful project.

I still got a conference poster and funding to implement the code in Torsten. I don’t think it was a great research paper. The idea was not novel and the autodiff analysis, if new, not relevant to most users – the idea here being to decouple a large ODE into a forcing function and a reduced ODE. We also used the same trick for steady states calculation, which is neat. I expect these results will be included into a general paper on Torsten, or a technical appendix thereof.