Help fitting ODEs in Stan

Hi all,
I’m working on a toy example to estimate parameters for the Droop equations:

dP/dt  = V*(P0 - P) - rho*(P / (Kp + P))*X
dQp/dt = rho*(P / (Kp + P)) - mu*(Qp - q0)
dX/dt  = X*mu*(1 - q0/Qp) - d*X

I can solve the equations in Stan for a given set of parameters easily enough, then add noise to those solutions to simulate measurements (I’ve cross-checked the solver against scipy.integrate.odeint and it gives the same answer). The issue is that when I turn around and try to estimate the parameters, the model stalls and doesn’t fit well (and I often get “Max number of iterations exceeded” errors). Does anyone have any advice on what I’m doing wrong? Code below:

#%%
import numpy as np
import pystan as pyst
import matplotlib.pyplot as plt
from scipy.integrate import odeint


#%% Set up the parameters and initial conditions
V = 1
P0 = 2
rho = 0.7
Kp = 0.2
mu = 0.7
q0 = 0.3
d = 0.5

init = [5, 1, 2]
params = (V, P0, rho, Kp, mu, q0, d)

t = np.arange(0, 100, 1)
t0 = t[0]
ts = t[1:]
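
#%% (Sketch) Cross-check the right-hand side with scipy's odeint, as mentioned
# above. Note: droop_rhs is a helper name introduced here for illustration,
# not part of the original post.
def droop_rhs(y, t, V, P0, rho, Kp, mu, q0, d):
    P, Qp, X = y
    dP = V*(P0 - P) - rho*(P / (Kp + P))*X
    dQp = rho*(P / (Kp + P)) - mu*(Qp - q0)
    dX = X*mu*(1 - q0/Qp) - d*X
    return [dP, dQp, dX]

y_check = odeint(droop_rhs, init, t, args=params)  # should match the Stan solution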

#%% STAN SOLVER
droop_stan_s = """
functions{
    real[] droop(real t,
                 real[] y,
                 real[] theta,
                 real[] x_r,
                 int[] x_i){
                    real dydt[3];
                    dydt[1] = theta[1]*(theta[2] - y[1]) - theta[3]*(y[1] / (theta[4] + y[1]))*y[3];
                    dydt[2] = theta[3]*(y[1] / (theta[4] + y[1])) - theta[5]*(y[2] - theta[6]);
                    dydt[3] = y[3]*theta[5]*(1 - theta[6]/y[2]) - theta[7]*y[3]; 
                    return dydt;
                 }
}
data{
    int<lower=1> T;
    real t0;
    real ts[T];
    real y0[3];
    real theta[7];
}
transformed data{
    real x_r[0];
    int x_i[0];
}
transformed parameters{
}
generated quantities{
    real yhat[T,3];
    yhat = integrate_ode_rk45(droop, y0, t0, ts, theta, x_r, x_i);
}
"""
droop_comp_s = pyst.StanModel(model_code=droop_stan_s)

#%% Simulate data
solver_dict = {'T': len(ts), 't0': t0, 'ts': ts, 'y0': init, 'theta': params}
sim_data = droop_comp_s.sampling(data=solver_dict,
                                 algorithm='Fixed_param',
                                 iter=1, chains=1)
yhat = sim_data.extract('yhat')['yhat']

plt.plot(ts, yhat[0,:,0], label='P')
plt.plot(ts, yhat[0,:,1], label='Qp')
plt.plot(ts, yhat[0,:,2], label='X')
plt.legend()
plt.show()

# add noise
yhat_obs = yhat + np.random.normal(0, 0.1, yhat.shape)

plt.plot(ts, yhat_obs[0,:,0], 'o', label = 'P')
plt.plot(ts, yhat_obs[0,:,1], 'o', label='Qp')
plt.plot(ts, yhat_obs[0,:,2], 'o', label='X')
plt.legend()
plt.show()



#%% STAN estimator
droop_stan_e = """
functions{
    real[] droop(real t,
                 real[] y,
                 real[] theta,
                 real[] x_r,
                 int[] x_i){
                    real dydt[3];
                    dydt[1] = theta[1]*(theta[2] - y[1]) - theta[3]*(y[1] / (theta[4] + y[1]))*y[3];
                    dydt[2] = theta[3]*(y[1] / (theta[4] + y[1])) - theta[5]*(y[2] - theta[6]);
                    dydt[3] = y[3]*theta[5]*(1 - theta[6]/y[2]) - theta[7]*y[3]; 
                    return dydt;
                 }
}
data{
    int<lower=1> T;
    real y[T,3];
    real t0;
    real ts[T];
}
transformed data{
    real x_r[0];
    int x_i[0];
}
parameters{
    real<lower=0> y0[3];
    real theta[7];
    vector<lower=0>[3] sigma_y;
}
transformed parameters{
}
model{
    real yhat[T,3];
    yhat = integrate_ode_rk45(droop, y0, t0, ts, theta, x_r, x_i);
    for (t in 1:T){
        y[t] ~ normal(yhat[t], sigma_y);
    }
    
    sigma_y ~ cauchy(0, 2.5);
    theta ~ normal(0,1);
    y0 ~ normal(0,1);
}
"""
droop_comp_e = pyst.StanModel(model_code=droop_stan_e)

#%%
estimate_dict = {'T': len(ts), 't0': t0, 'ts': ts, 'y': yhat_obs[0]}
droop_output = droop_comp_e.sampling(data=estimate_dict,
                                   iter=10)

Informational Message: The current Metropolis proposal is about to be rejected because of the following issue:
Exception: Max number of iterations exceeded (1000000).  (in 'unknown file name' at line 34)

If this warning occurs sporadically, such as for highly constrained variable types like covariance matrices, then the sampler is fine,
but if this warning occurs often then your model may be either severely ill-conditioned or misspecified.


Also, can someone tell me what x_r and x_i are? I’ve read the docs, but I’m confused about what they are and why they’re set to a 0-array (and sometimes x_i is set to a 1-array).

Thanks, I appreciate the help!


One quick thought: try using a lognormal distribution for your likelihood statement, i.e.

model{
    // priors go here
    sigma_y ~ lognormal(-1, 1); // for example
    for (t in 1:T)
        y[t] ~ lognormal(log(yhat[t]), sigma_y); // vectorized over the three states
}

I am finding that Stan doesn’t like it when (a) you estimate parameters for ODE models where the state variables get really big and variable (e.g. orders of magnitude of difference between them), and (b) you use a Gaussian likelihood, which has support from negative infinity to positive infinity, for strictly positive state variables.

Ensure that your theta parameters have the right constraints, e.g. lower bounds of zero, and upper bounds of 1 for proportions.
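
For example, a minimal sketch using the parameter names from your model (treating d as the proportion-like one is my assumption here):

parameters{
    real<lower=0> rho;         // rates and concentrations: positive
    real<lower=0> Kp;
    real<lower=0> mu;
    real<lower=0> q0;
    real<lower=0, upper=1> d;  // proportion-like, so bounded in [0, 1]
}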

The x_r and x_i arguments are just data arrays (real and integer, respectively). For instance, in some of my ODE models I don’t need to estimate certain parameters (we know them already), so I put them in the x_r or x_i arrays to pass them to the ODE solver.
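
For example, a sketch of fixing a known constant (say your dilution rate V) as data rather than a parameter:

transformed data{
    real x_r[1];
    int x_i[0];
    x_r[1] = 1.0;   // known dilution rate, passed to the solver but not estimated
}

Inside the ODE function you then read it back with x_r[1] instead of a theta entry.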


And also consider using the stiff solver, which is called integrate_ode_bdf. For PyStan this means installing Sundials.
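
It is a drop-in change, since the signature matches the RK45 call:

yhat = integrate_ode_bdf(droop, y0, t0, ts, theta, x_r, x_i);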

Hey @cgoold and @wds15, thanks for the help. I tried the lognormal distribution; it didn’t seem to help too much. I haven’t tried Sundials, because that involves recompiling Stan, which I’m not sure I want to do just yet. Placing bounds on the parameters helped. Here’s what I’ve got:

#%%
import numpy as np
import pystan as pyst
import matplotlib.pyplot as plt
from scipy.integrate import odeint
import pandas as pd


#%% Set up the parameters and initial conditions
rho = 0.7
Kp = 2
mu = 0.7
q0 = 0.3
d = 0.5

init = [5, 1, 2]
params = (rho, Kp, mu, q0, d)

t = np.arange(0, 20, 1)
t0 = t[0]
ts = t[1:]


#%% STAN SOLVER
droop_stan_s = """
functions{
    real[] droop(real t,
                 real[] y,
                 real[] theta,
                 real[] x_r,
                 int[] x_i){
                    real dydt[3];
                    dydt[1] = 1*(1 - y[1]) - theta[1]*(y[1] / (theta[2] + y[1]))*y[3];
                    dydt[2] = theta[1]*(y[1] / (theta[2] + y[1])) - theta[3]*(y[2] - theta[4]);
                    dydt[3] = y[3]*theta[3]*(1 - theta[4]/y[2]) - theta[5]*y[3]; 
                    return dydt;
                 }
}
data{
    int<lower=1> T;
    real t0;
    real ts[T];
    real y0[3];
    real theta[5];
}
transformed data{
    real x_r[0];
    int x_i[0];
}
transformed parameters{
}
generated quantities{
    real yhat[T,3];
    yhat = integrate_ode_rk45(droop, y0, t0, ts, theta, x_r, x_i);
}
"""
droop_comp_s = pyst.StanModel(model_code=droop_stan_s)


#%% Simulate data
solver_dict = {'T': len(ts), 't0': t0, 'ts': ts, 'y0': init, 'theta': params}
sim_data = droop_comp_s.sampling(data=solver_dict,
                                 algorithm='Fixed_param',
                                 iter=1, chains=1)
yhat = sim_data.extract('yhat')['yhat']
yhat_obs = yhat + np.random.normal(0, 0.1, yhat.shape)

plt.plot(ts, yhat[0,:,0], 'b-', ts, yhat_obs[0,:,0], 'bo', label='P')
plt.plot(ts, yhat[0,:,1], 'r-', ts, yhat_obs[0,:,1], 'ro', label='Qp')
plt.plot(ts, yhat[0,:,2], 'k-', ts, yhat_obs[0,:,2], 'ko', label='X')
plt.legend()
plt.savefig('/home/nate/Desktop/data.png')
plt.show()

[figure: simulated trajectories (lines) with noisy observations (points) for P, Qp and X]

#%% STAN estimator
droop_stan_e = """
functions{
    real[] droop(real t,
                 real[] y,
                 real[] theta,
                 real[] x_r,
                 int[] x_i){
                    real dydt[3];
                    dydt[1] = 1*(1 - y[1]) - theta[1]*(y[1] / (theta[2] + y[1]))*y[3];
                    dydt[2] = theta[1]*(y[1] / (theta[2] + y[1])) - theta[3]*(y[2] - theta[4]);
                    dydt[3] = y[3]*theta[3]*(1 - theta[4]/y[2]) - theta[5]*y[3];
                    return dydt;
                 }
}
data{
    int<lower=1> T;
    real y[T,3];
    real t0;
    real ts[T];
}
transformed data{
    real x_r[0];
    int x_i[0];
}
parameters{
    real<lower=0> y0[3];
    real<lower=0> rho;
    real<lower=0> Kp;
    real<lower=0> mu;
    real<lower=0> q0;
    real<lower=0, upper=1> d;
    vector<lower=0>[3] sigma_y;
}
transformed parameters{
    real yhat[T,3];
    real theta[5];
    theta[1] = rho;
    theta[2] = Kp;
    theta[3] = mu;
    theta[4] = q0;
    theta[5] = d;
    
    yhat = integrate_ode_rk45(droop, y0, t0, ts, theta, x_r, x_i);
    
}
model{
    for(t in 1:T){
        y[t] ~ lognormal(log(yhat[t]), sigma_y);
    }
    
    sigma_y ~ lognormal(-1, 1);
    y0 ~ normal(0,1);
    rho ~ normal(0,1);
    Kp ~ normal(0,1);
    mu ~ normal(0,1);
    q0 ~ normal(0,1);
    d ~ normal(0,1);
}
generated quantities{
}
"""
droop_comp_e = pyst.StanModel(model_code=droop_stan_e)

#%%
estimate_dict = {'T': len(ts), 't0': t0, 'ts': ts, 'y': yhat_obs[0]}
droop_output = droop_comp_e.sampling(data=estimate_dict,
                                   iter=1000, chains=4)

Gradient evaluation took 0.00069 seconds
1000 transitions using 10 leapfrog steps per transition would take 6.9 seconds.
Adjust your expectations accordingly!

Gradient evaluation took 0.000793 seconds
1000 transitions using 10 leapfrog steps per transition would take 7.93 seconds.
Adjust your expectations accordingly!

Gradient evaluation took 0.000711 seconds
1000 transitions using 10 leapfrog steps per transition would take 7.11 seconds.
Adjust your expectations accordingly!

Gradient evaluation took 0.011008 seconds
1000 transitions using 10 leapfrog steps per transition would take 110.08 seconds.
Adjust your expectations accordingly!

Iteration:   1 / 1000 [  0%]  (Warmup)
Informational Message: The current Metropolis proposal is about to be rejected because of the following issue:
Exception: lognormal_lpdf: Location parameter[3] is nan, but must be finite!  (in 'unknown file name' at line 48)

Informational Message: The current Metropolis proposal is about to be rejected because of the following issue:
Exception: lognormal_lpdf: Location parameter[3] is -inf, but must be finite!  (in 'unknown file name' at line 48)

Informational Message: The current Metropolis proposal is about to be rejected because of the following issue:
Exception: integrate_ode_rk45: parameter vector[3] is inf, but must be finite!  (in 'unknown file name' at line 43)

Iteration: 100 / 1000 [ 10%]  (Warmup)
Iteration: 200 / 1000 [ 20%]  (Warmup)
Informational Message: The current Metropolis proposal is about to be rejected because of the following issue:
Exception: Max number of iterations exceeded (1000000).  (in 'unknown file name' at line 43)

If this warning occurs sporadically, such as for highly constrained variable types like covariance matrices, then the sampler is fine,
but if this warning occurs often then your model may be either severely ill-conditioned or misspecified.

Iteration: 300 / 1000 [ 30%]  (Warmup)
Iteration: 400 / 1000 [ 40%]  (Warmup)
Iteration: 500 / 1000 [ 50%]  (Warmup)
Iteration: 501 / 1000 [ 50%]  (Sampling)
Iteration: 600 / 1000 [ 60%]  (Sampling)
Iteration: 700 / 1000 [ 70%]  (Sampling)
Iteration: 800 / 1000 [ 80%]  (Sampling)
Iteration: 900 / 1000 [ 90%]  (Sampling)
Iteration: 1000 / 1000 [100%]  (Sampling)

 Elapsed Time: 21.0428 seconds (Warm-up)
               23.1567 seconds (Sampling)
               44.1995 seconds (Total)

 Elapsed Time: 20.9352 seconds (Warm-up)
               12.3998 seconds (Sampling)
               33.335 seconds (Total)

 Elapsed Time: 44.4743 seconds (Warm-up)
               17.4008 seconds (Sampling)
               61.875 seconds (Total)

 Elapsed Time: 34.9841 seconds (Warm-up)
               18.2123 seconds (Sampling)
               53.1963 seconds (Total)

droop_output.plot()


#%%
theta = pd.DataFrame(droop_output.extract('theta')['theta'],
                     columns = ('rho', 'Kp', 'mu', 'q0', 'd'))

plt.errorbar(params, theta.mean(), yerr=theta.std(), fmt='o')
plt.plot([0,2], [0,2], 'k:')
plt.savefig('/home/nate/Desktop/params.png')
plt.close()

[figure: posterior means ± SD of the parameters vs. true values, with 1:1 line]

y0 = pd.DataFrame(droop_output.extract('y0')['y0'],
                  columns=['P', 'Qp', 'X'])
plt.errorbar(init, y0.mean(), yerr=y0.std(), fmt='o')
plt.plot([0,5], [0,5], 'k:')
plt.savefig('/home/nate/Desktop/inits.png')
plt.close()

[figure: posterior means ± SD of the initial conditions vs. true values, with 1:1 line]

The process still seems slow, taking about a minute to run 1000 iterations. Is that just normal for ODE models, since the numerical solver itself is expensive?

Also, any suggestions for helping estimate Kp? The half-saturation constant is the only parameter that is consistently underestimated by the model (though it is usually not all that biologically interesting anyway).

All of this is building up to another question: how to accurately estimate a compartment when data are only available for some of the compartments (i.e., how can I accurately estimate the parameters, as well as Qp itself, when I only have data for X and P?). But that’s a separate question.


By all means - this system can easily become stiff when you integrate it in Stan, due to the extreme parameter values tried out during warmup. It’s better to let Stan explore these regions with a stiff-capable solver than to try to nail the ranges down tightly.

It’s unfortunate that for PyStan it’s such a pain to get Sundials going.


I am looking into this for some of my ODE models at the moment, and it really depends on the model structure and how easy the parameters are to estimate. If the state variables/compartments I have data on are highly predictive of the state variables without data, it doesn’t really matter - good predictions are still possible. However, even with data, some of my state variables and their parameters can only be estimated with considerable uncertainty.

Best thing to do is keep simulating from your model, removing state variable data incrementally, and seeing how the model runs.
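
As a rough sketch of that loop (assuming the compiled models and simulation objects from your code above):

# Refit the estimator to datasets simulated at increasing noise levels and
# compare the posteriors; dropping compartments works the same way once the
# likelihood only references the observed states.
for noise_sd in (0.05, 0.1, 0.2):
    y_sim = yhat + np.random.normal(0, noise_sd, yhat.shape)
    fit = droop_comp_e.sampling(
        data={'T': len(ts), 't0': t0, 'ts': ts, 'y': y_sim[0]},
        iter=1000, chains=4)
    print(noise_sd, fit.extract('theta')['theta'].mean(axis=0))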


@cgoold Thanks, your answer makes sense. I can show you how it’s working on mine. Here, I modified the code to use only the P and X data in the likelihood, and the model estimates Qp with no constraints:

droop_stan_e2 = """
functions{
    real[] droop(real t,
                 real[] y,
                 real[] theta,
                 real[] x_r,
                 int[] x_i){
                    real dydt[3];
                    dydt[1] = 1*(1 - y[1]) - theta[1]*(y[1] / (theta[2] + y[1]))*y[3];
                    dydt[2] = theta[1]*(y[1] / (theta[2] + y[1])) - theta[3]*(y[2] - theta[4]);
                    dydt[3] = y[3]*theta[3]*(1 - theta[4]/y[2]) - theta[5]*y[3];
                    return dydt;
                 }
}
data{
    int<lower=1> T;
    real y[T,3];
    real t0;
    real ts[T];
}
transformed data{
    real x_r[0];
    int x_i[0];
}
parameters{
    real<lower=0> y0[3];
    real<lower=0> rho;
    real<lower=0> Kp;
    real<lower=0> mu;
    real<lower=0> q0;
    real<lower=0, upper=1> d;
    vector<lower=0>[3] sigma_y;
}
transformed parameters{
    real yhat[T,3];
    real theta[5];
    theta[1] = rho;
    theta[2] = Kp;
    theta[3] = mu;
    theta[4] = q0;
    theta[5] = d;
    
    yhat = integrate_ode_rk45(droop, y0, t0, ts, theta, x_r, x_i);
    
}
model{
    for(t in 1:T){
        y[t,1] ~ normal(yhat[t,1], sigma_y[1]);
        y[t,3] ~ normal(yhat[t,3], sigma_y[3]);
    }
    
    sigma_y ~ cauchy(0, 2.5);
    y0 ~ normal(0,1);
    rho ~ normal(0,1);
    Kp ~ normal(0,1);
    mu ~ normal(0,1);
    q0 ~ normal(0,1);
    d ~ normal(0,1);

}
generated quantities{
}
""" 

This model still fits the parameters fairly well (except for Kp, still):

[figure: posterior parameter estimates vs. true values]

and it fits both P and X well:

[figures: posterior fits overlaid on the P and X data]

But Qp is estimated with a high degree of uncertainty:

[figure: posterior estimate of the Qp trajectory, showing very wide uncertainty]

Here it’s basically uninformative. How that changes with more data, or tighter relationships between the state variables, I still need to sort out. I’ve noticed this isn’t uncommon: some of my friends were using ecosystem models to get at the same idea, and their unknown compartments were also more or less completely uncertain. I had hoped I could do better in a simpler model.

@wds15, I’ll work on seeing whether the stiff ODE solver helps. It’d be great if PyStan came with Sundials support automatically, but I understand why it doesn’t.


I don’t know about you, but the models I am applying (systems models accounting for biological, social and economic processes) are often run in simulation experiments without any data fitting or parameter estimation routines, because the latter procedures are very difficult. For me, then, communicating the uncertainty is part-and-parcel of the modelling approach.

Also, take a look at the missing data chapter in the Stan manual, in particular the multivariate problems, which could be applied to ODEs where only some compartment data is available.
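
For instance, here is a sketch of restricting the likelihood to the observed compartments via an index array passed as data (n_obs and obs_idx are names made up here, not from the manual):

data{
    int<lower=1> n_obs;
    int<lower=1, upper=3> obs_idx[n_obs];   // e.g. {1, 3} for P and X only
    ...
}
model{
    for (t in 1:T)
        for (k in 1:n_obs)
            y[t, obs_idx[k]] ~ normal(yhat[t, obs_idx[k]], sigma_y[obs_idx[k]]);
}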

Well, I’m not sure in my case, because I rarely see ecologists do this kind of thing. I’ve only ever seen one person try something like it before (my friend fitting ecosystem models); they had a custom MCMC sampler but still ran into the same issue where the unknown compartment was completely uncertain.

Thanks for the heads up, I’ll look at the missing data part. That could be a good solution. I’m just glad to know I’m not completely off base here.

Hi, good question. Edit: not really a question, but I can still answer. :)

We use Python tools to compile C++ and C code, and the current implementation for Sundials (in PyStan) means that each of the Sundials files is compiled every time a model is compiled. This increases the compilation and linking time, so it is kept out of the main distribution.
Also, there are (or were) problems with mixing .c and .cpp files under clang, because the Python tools automatically use the same compiler flags for both file types: c++1y will throw an error with clang if used on a .c file.

I have updated the procedure so that Sundials works with 2.19 (https://github.com/stan-dev/pystan/pull/637), and you can install it in the following way:

git clone --recursive https://github.com/stan-dev/pystan
cd pystan
git checkout sundials_update
pip install .

We are now looking into precompiling the needed library files (TBB needs this, so at the same step we can also compile Sundials and parts of the Stan library), so in the future everything should work faster and the full ecosystem should be usable by default.
