Estimating the parameters of the Differential Equations using Stan

Hello everyone!
I am using Stan.jl and DifferentialEquations.jl to estimate the parameters of the differential equations. In particular, I am taking LotkaVolterra Differential Equation which has 4 parameters.
As the starting step, I am estimating all the 4 parameters using the code below:

using OrdinaryDiffEq, ParameterizedFunctions, Mamba, Stan
f4 = @ode_def_nohes LotkaVolterraTest4 begin
  dx = a*x - b*x*y
  dy = -c*y + d*x*y
end a=>1.5 b=>1.0 c=>3.0 d=>1.0
u0 = [1.0,1.0]
tspan = (0.0,10.0)
prob4 = ODEProblem(f4,u0,tspan)
sol = solve(prob4,Tsit5())
t = collect(linspace(1,10,10))
using RecursiveArrayTools # for VectorOfArray
randomized = VectorOfArray([(sol(t[i]) + .01randn(2)) for i in 1:length(t)])
data = convert(Array,randomized)
const parameter_estimation_lotka_volterra4_model = "
functions {
  real[] sho(real t,real[] y,real[] theta,real[] x_r,int[] x_i) {
    real dydt[2];
    dydt[1] = theta[1]*y[1] - theta[2]*y[1]*y[2];
    dydt[2] = -theta[3]*y[1] - theta[4] * y[1]* y[2];
    return dydt;
    }
  }
data {
  real y0[2];
  int<lower=1> T;
  real y[T,2];
  real t0;
  real ts[T];
}
transformed data {
  real x_r[0];
  int x_i[0];
}
parameters {
  vector<lower=0>[2] sigma;
  real theta[4];
}
model{
  real y_hat[T,2];
  sigma ~ inv_gamma(2,3);
  theta[1] ~ normal(1.5, 1);
  theta[2] ~ normal(1.0, 1);
  theta[3] ~ normal(3.0, 1);
  theta[4] ~ normal(1.0, 1);
  y_hat = integrate_ode_rk45(sho, y0, t0, ts, theta, x_r, x_i);
  for (t in 1:T){
    y[t] ~ normal(y_hat[t], sigma);
    }
}
"
stanmodel = Stanmodel(num_samples=1000, num_warmup=1000, name="parameter_estimation_lotka_volterra4_model", model=parameter_estimation_lotka_volterra4_model);
const parameter_estimation_lotka_volterra4_data = Dict("y0"=>prob4.u0, "T" => size(t)[1], "y" => data', "t0" => tspan[1], "ts"=>t)
rc4, sim4 = stan(stanmodel, [parameter_estimation_lotka_volterra4_data], "/Users/ayush/tmp"; CmdStanDir=CMDSTAN_HOME)

which is working properly.
However when I am trying to estimate only 1 parameter (namely a) using the code below:

const parameter_estimation_lotka_volterra1_model = "
functions {
  real[] sho(real t,real[] y,real[] theta,real[] x_r,int[] x_i) {
    real dydt[2];
    dydt[1] = theta[1] * y[1] - 1.0*y[1] * y[2];
    dydt[2] = 3.0*y[1] - 1.0*y[1] * y[2];
    return dydt;
    }
  }
data {
  real y0[2];
  int<lower=1> T;
  real y[T,2];
  real t0;
  real ts[T];
}
transformed data {
  real x_r[0];
  int x_i[0];
}
parameters {
  vector<lower=0>[2] sigma;
  real theta[1];
}
model{
  real y_hat[T,2];
  sigma ~ inv_gamma(2, 3);
  theta[1] ~ normal(1.5, 1);
  y_hat = integrate_ode_rk45(sho, y0, t0, ts, theta, x_r, x_i);
  for (t in 1:T){
    y[t] ~ normal(y_hat[t], sigma);
    }
}
"
using OrdinaryDiffEq, ParameterizedFunctions, Mamba, Stan
f1 = @ode_def_nohes LotkaVolterraTest1 begin
  dx = a*x - b*x*y
  dy = -c*y + d*x*y
end a=>1.5 b=1.0 c=3.0 d=1.0
u0 = [1.0,1.0]
tspan = (0.0,10.0)
prob1 = ODEProblem(f1,u0,tspan)
sol = solve(prob1,Tsit5())
t = collect(linspace(1,10,10))
using RecursiveArrayTools # for VectorOfArray
randomized = VectorOfArray([(sol(t[i]) + .01randn(2)) for i in 1:length(t)])
data = convert(Array,randomized)
stanmodel = Stanmodel(num_samples=2, num_warmup=2, name="parameter_estimation_lotka_volterra1_model", model=parameter_estimation_lotka_volterra1_model);
const parameter_estimation_lotka_volterra1_data = Dict("y0"=>prob1.u0, "T" => size(t)[1], "y" => data', "t0" => tspan[1], "ts"=>t)
rc1, sim1 = stan(stanmodel, [parameter_estimation_lotka_volterra1_data], "/Users/ayush/tmp"; CmdStanDir=CMDSTAN_HOME)

I am getting the following error:
ERROR: Return code = -5
Stacktrace:
[1] #stan#4(::Type{T} where T, ::Bool, ::Bool, ::String, ::Function, ::Stan.Stanmodel, ::Array{Dict{String,Any},1}, ::String) at /Users/ayush/.julia/v0.6/Stan/src/main/stancode.jl:160
[2] (::Stan.#kw##stan)(::Array{Any,1}, ::Stan.#stan, ::Stan.Stanmodel, ::Array{Dict{String,Any},1}, ::String) at ./:0

Unrecoverable error evaluating the log probability at the initial value.
Any help would be appreciated. TIA :)

I think your second equation might be messed up. I think there should be a term that looks like: -theta[3] * y[2] corresponding to death of predators/wolves however you think about it.

I looked at the diff of the two files as well and it seems like there may have been a sign flip on the theta[3] parameter as well:

< dydt[1] = theta[1]*y[1] - theta[2]*y[1]*y[2];
< dydt[2] = -theta[3]*y[1] - theta[4] * y[1]* y[2];
---
> dydt[1] = theta[1] * y[1] - 1.0*y[1] * y[2];
> dydt[2] = 3.0*y[1] - 1.0*y[1] * y[2];

Btw, it’s a good idea to keep your Stan models in separate files from the code that actually runs them.

Hope that helps!

Thanks for the reply @bbbales2 :)
The actual value of theta[3] is 3 which I have hard coded in my model because this time I only want to estimate a single parameter (namely theta[1]).

How do we estimate just one parameter of Lotka Volterra Differential Equation when we already know the other 3?

How do we estimate just one parameter of Lotka Volterra Differential Equation when we already know the other 3?

You do exactly what you did. I just think there’s a problem with your ODE that is probably making it blow up or something. Or did you mean something else?

dydt[2] = 3.0*y[1] - 1.0*y[1] * y[2];

I just don’t think that the first bit is right, and the sign on the second term doesn’t make sense either (if you have more rabbits (y[1]) and more wolves (y[2]), the wolf population (the second term) should grow).

Yeah that’s off. It doesn’t match the model that generated the data.

Should be a + on the theta[4]. In theory Stan should be able to solve that though, but in practice it seems too sensitive to the prior to handle it.

The sign on the first is correct and the generative equation gives a periodic solution.

http://app.juliadiffeq.org/ode;settings=eyJkaWZmRXFUZXh0IjoiZHggPSBhKnggLSBiKngqeVxuZHkgPSAtYyp5ICsgZCp4KnkiLCJwYXJhbWV0ZXJzIjoiYT0xLjUsIGI9MSwgYz0zLCBkPTEiLCJ0aW1lU3BhbiI6WzAsMTBdLCJpbml0aWFsQ29uZGl0aW9ucyI6IjEuMCwgMS4wIiwic29sdmVyIjoiVHNpdDUiLCJ2YXJzIjoiWzp4LCA6eV0iLCJ0aXRsZSI6IlRoZSBMb3RrYS1Wb2x0ZXJyYSBFcXVhdGlvbnM6IE1vZGVsIG9mIFJhYmJpdHMgYW5kIFdvbHZlcyJ9

Silly me! It worked, thank you very much @bbbales2 :)
Now, I am trying the truncated version where I am limiting the values from which sampling can be done.

using OrdinaryDiffEq, ParameterizedFunctions, Mamba, Stan
# See page 82 for truncation
f4 = @ode_def_nohes LotkaVolterraTest4 begin
  dx = a*x - b*x*y
  dy = -c*y + d*x*y
end a=>1.5 b=>1.0 c=>3.0 d=>1.0
u0 = [1.0,1.0]
tspan = (0.0,10.0)
prob4 = ODEProblem(f4,u0,tspan)
sol = solve(prob4,Tsit5())
t = collect(linspace(1,10,10))
using RecursiveArrayTools # for VectorOfArray
randomized = VectorOfArray([(sol(t[i]) + .01randn(2)) for i in 1:length(t)])
data = convert(Array,randomized)
const parameter_estimation_lotka_volterra4_truncated_model = "
functions {
  real[] sho(real t,real[] y,real[] theta,real[] x_r,int[] x_i) {
    real dydt[2];
    dydt[1] = theta[1]*y[1] - theta[2]*y[1]*y[2];
    dydt[2] = -theta[3]*y[1] + theta[4] * y[1]* y[2];
    return dydt;
    }
  }
data {
  real y0[2];
  int<lower=1> T;
  real y[T,2];
  real t0;
  real ts[T];
}
transformed data {
  real x_r[0];
  int x_i[0];
}
parameters {
  vector<lower=0>[2] sigma;
  real theta[4];
}
model{
  real y_hat[T,2];
  sigma ~ inv_gamma(2, 3);
  theta[1] ~ normal(1.5, 1) T[0.5, 2.5];
  theta[2] ~ normal(1.0, 1) T[0.2, 2.1];
  theta[3] ~ normal(3.0, 1) T[2.0, 4.1];
  theta[4] ~ normal(1.0, 1) T[0.2, 2.1];
  y_hat = integrate_ode_rk45(sho, y0, t0, ts, theta, x_r, x_i);
  for (t in 1:T){
    y[t] ~ normal(y_hat[t], sigma);
    }
}
"
stanmodel = Stanmodel(num_samples=2, num_warmup=2, name="parameter_estimation_lotka_volterra4_truncated_model", model=parameter_estimation_lotka_volterra4_truncated_model);
const parameter_estimation_lotka_volterra4_truncated_data = Dict("y0"=>prob4.u0, "T" => size(t)[1], "y" => data', "t0" => tspan[1], "ts"=>t)
rc4, sim4 = stan(stanmodel, [parameter_estimation_lotka_volterra4_truncated_data], "/Users/ayush/tmp"; CmdStanDir=CMDSTAN_HOME)

I am getting the following error:
ERROR: Return code = -5
Rejecting initial value:
Log probability evaluates to log(0), i.e. negative infinity.
Stan can’t start sampling from this initial value.

@ChrisRackauckas What a neat little webapp. I’m going to bookmark that.

@KrishnaKanhaiya Ah, if you truncate your distributions like that, I think you also need the corresponding constraints on the parameters. Truncating the likelihood statement like that does not keep the sampler from trying to sample weird places. What it’s telling you is that it keeps finding probability zero parameterizations for its initial guesses (which are uniform [-2, 2]). Given one of your constraints is greater than 2, it’ll never actually initialize haha.

If you want different constraints for different thetas, you’ll need to code those as separate variables. Something like:

real<lower = 0.5, upper = 2.5> theta1;

But really priors are preferable to constraints. It’s tempting to put constraints on model parameters if they are trying to go off to some crazy place. That’s a mistake though – there’s usually a lot to learn from this about the model/data! And if you put hard constraints you might miss something interesting or bias your inference is a really harsh way.

That said fitting Lotka-Volterra might be kinda hard considering the reason folks study it is cause it’s so sensitive to stuff.

2 Likes