Persistent rejection of initial value and failure to initialise when fitting ODE model

Hello! I’m having a lot of trouble fitting a model that (on the surface) should be pretty easy to fit…

To make a long story short, I began with Antia et al.'s (1994) model of within-host population dynamics, which looks like this:
dP/dt = r*P - k*P*I
dI/dt = p*I*(P / (P + o))

We can simulate this model deterministically, as follows:

library(deSolve)  # provides lsoda()

Antia_Model <- function(t, y, p1){
  r <- p1[1]; k <- p1[2]; p <- p1[3]; o <- p1[4] 
  P <- y[1]; I <- y[2]
  # r = Replication rate of parasite (0.1-10)
  # p = Max. growth rate of immune system (1.0)
  # k = Rate of destruction of parasite by immune system (10^(-3))
  # o = Parasite density at which growth rate of IS is half its max. (10^3)
  dP = r*P - k*P*I
  dI = p*I*(P/(P + o))
  list(c(dP, dI))
}
r <- 0.2; k <- 0.001; p <- 1; o <- 1000 
parms <- c(r, k, p, o)
P0 <- 1; I0 <- 1
N0 <- c(P0, I0)
TT <- seq(0, 50, 0.1)
results <- lsoda(N0, TT, Antia_Model, parms, verbose = TRUE)
P <- results[,2]; I <- results[,3];

Which gives us this:
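
For reference, the trajectories can be plotted along these lines (a quick sketch that assumes the results matrix returned by the lsoda() call above):

# Deterministic trajectories of P (parasites) and I (immune cells).
matplot(results[, 1], results[, 2:3], type = "l", lty = 1,
        col = c("firebrick", "steelblue"),
        xlab = "Time", ylab = "Density")
legend("topleft", legend = c("P (parasites)", "I (immune cells)"),
       col = c("firebrick", "steelblue"), lty = 1)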

And we can generate demographically stochastic data as follows:

library(GillespieSSA)  # provides ssa()

x0 <- c(P = 1, I = 1)
a <- c("P*r",
       "k*P*I", 
       "p*I*(P/(P + o))")
nu <- matrix(c(+1, -1, 0,
               0, 0, +1), nrow = 2, byrow = TRUE)

r <- 0.2; k <- 0.001; p <- 1; o <- 1000
parms1 <- c(r = r, k = k, p = p, o = o)
tf = 50
method <- "OTL"
simName <- "Antia"
set.seed(6)
Antia_SLC <- suppressWarnings(ssa(x0, a, nu, parms1, tf, method, simName,
                                  verbose = FALSE, 
                                  consoleInterval = 1, 
                                  censusInterval = 1, 
                                  maxWallTime = 30, 
                                  ignoreNegativeState = TRUE)) 
Antia_Sim <- Antia_SLC$data 
Antia_Sim <- as.data.frame(Antia_Sim)
colnames(Antia_Sim) <- c("Time", "Parasites", "ImmuneCells")

Which gives us this:

The gist is, I’m trying to fit the model to the demographically stochastic data, and running into unforeseen problems.

Here’s the model I’m using to do the fitting (I have incredibly tight priors for troubleshooting reasons…):

functions {
  real[] dz_dt(real t,       
               real[] z,     
               real[] theta, 
               real[] x_r,  
               int[] x_i) {
    real P = z[1];
    real I = z[2];

    real r = theta[1];  
    real k = theta[2];
    real p = theta[3];
    real o = theta[4];

    real dP_dt = r*P - k*P*I;
    real dI_dt = p*I*(P/(P + o));
    return { dP_dt, dI_dt };
  }
}
data {
  int<lower = 0> N;           
  real ts[N];                 
  real y_init[2];             
  real<lower = 0> y[N, 2];    
}
parameters {
  real<lower = 0> r; 
  real<lower = 0, upper = 1> k;
  real<lower = 0> p;
  real<lower = 0> o;
  real<lower = 0> z_init[2];  
  real<lower = 0> sigma[2];   
}
transformed parameters {
  real z[N, 2]
    = integrate_ode_rk45(dz_dt, z_init, 0, ts, {r, k, p, o},
                         rep_array(0.0, 0), rep_array(0, 0),
                         1e-5, 1e-3, 5e2);
}
model {
  r ~ uniform(0, 0.5); // r = 0.2
  k ~ uniform(0, 0.1); // k = 0.001
  p ~ uniform(0, 2); // p = 1
  o ~ uniform(990, 1010); // o = 1000
  sigma ~ lognormal(-1, 1);
  z_init ~ normal(1, 1);
  for (m in 1:2) {
    y_init[m] ~ lognormal(log(z_init[m]), sigma[m]);
    y[ , m] ~ lognormal(log(z[, m]), sigma[m]);
  }
}
generated quantities {
  real y_init_rep[2];
  real y_rep[N, 2];
  for (m in 1:2) {
    y_init_rep[m] = lognormal_rng(log(z_init[m]), sigma[m]);
    for (n in 1:N)
      y_rep[n, m] = lognormal_rng(log(z[n, m]), sigma[m]);
  }
}

And here’s what I’m doing to implement the model:

library(rstan)
library(here)

model <- stan_model(here("./03_Fit/Antia_Fit.stan")) # Which is just where I keep the above model

N <- length(Antia_Sim$Time) - 1
ts <- 1:N
y_init <- as.numeric(Antia_Sim[1, 2:3])      # census at the first time point
y <- as.matrix(Antia_Sim[2:(N + 1), 2:3])    # censuses at the remaining time points
y <- cbind(y[ , 1], y[ , 2])                 # drop the column names
Data <- list(N = N, ts = ts, y_init = y_init, y = y)

fit <- sampling(model,  data = Data,  chains = 2,  iter = 1000,  cores = 2,  seed = 1)

When I try to run the fitting line, both chains pretty immediately fail to initialise:

Chain 2: Rejecting initial value:
Chain 2:   Log probability evaluates to log(0), i.e. negative infinity.
Chain 2:   Stan can't start sampling from this initial value.
Chain 2: 
Chain 2: Initialization between (-2, 2) failed after 100 attempts. 
Chain 2:  Try specifying initial values, reducing ranges of constrained values, or reparameterizing the model.

Does anyone know what might be causing this error? I know it’s a pretty standard one, but this seems like a pretty straightforward question and I don’t know what I’m doing wrong.

Any help would be incredibly appreciated :)

Try

  1. increase the maximum number of steps from 500 to 10^4, so the ODE integrator can cope with difficult parameter draws.
  2. try out the bdf integrator.
  3. give Stan initial values which, to start with, are equal to the true values (see the sketch after this list).
  4. make Stan warm up more slowly by setting the initial step size to a smaller value, e.g. stepsize = 0.01.
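
For point 3, a minimal sketch in rstan (this assumes the model and Data objects from your post; the ODE parameters are the values you used to simulate the data, and the sigma starting values are just a guess):

init_fun <- function() {
  # Start each chain at the data-generating parameter values.
  list(r = 0.2, k = 0.001, p = 1, o = 1000,
       z_init = c(1, 1), sigma = c(0.5, 0.5))
}

fit <- sampling(model, data = Data, chains = 2, iter = 1000, cores = 2,
                init = init_fun, seed = 1)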

I hope this puts you on track.

Sebastian

I think the issue is your priors being zero at the initialization points. You’ll either have to constrain your parameters to match the priors, or use priors which are nonzero on all of the parameter domain.

(sorry for being brief, I’m on my way somewhere)
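
To see why the random inits land where those priors are zero, here is a quick illustration in R (it relies only on Stan's documented default of drawing initial values uniformly on (-2, 2) on the unconstrained scale):

# A <lower = 0> parameter is exp()-transformed, so a random init lands in:
exp(c(-2, 2))   # 0.1353353 7.3890561
# For o, declared real<lower = 0> but given a uniform(990, 1010) prior, every
# such init has zero prior density, so the log probability is log(0) = -Inf.
# The uniform(0, 0.5) and uniform(0, 0.1) priors on r and k reject most random
# inits for the same reason.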

You need a very good reason to use uniform priors for the ODE parameters. For example, the specified priors for r and k overlap, while in reality you may want r >> k. Imagine r happens to be approximately equal to k: you end up with P' = r*P*(1 - I), and with I(t = 0) = 1, as in the script, the ODE solution starts out essentially constant.

@wds15: Hi Sebastian, thanks so much for your response! I just have a couple clarifying questions, if that’s okay.

  1. That makes sense––thank you! I’ve also changed my relative and absolute tolerances (from 10^-5 and 10^-3 to 10^-10 and 10^-10) to match the stiff solver you recommended.
  2. Yes, this sounds like a good idea––done!
  3. I’ve tried that and it doesn’t seem to do much… I’ll try again.
  4. This also makes sense, but I’m not sure how to implement that change. Would that be in the sampling() function? I can’t find much of anything online.

Thanks so much again! It’s not working yet, but my fingers are crossed!

@Niko: Hi Niko, thanks so much for your on-the-go response! Yes, I was wondering about this… A lot of my parameters are fairly close to 0, which might pose a problem. I’d like to try to adjust my priors to make them non-zero, but I guess I’m not sure about the best way to go about that? Do you have any ideas? Thanks so much again!

@yizhang: Ahh, this makes sense. Since, in the parameter space, r can range between 0.1 and 10, and k can range between 0 and 1, they’ll have to overlap a bit, but maybe I’ll try to keep the bulk of each distribution away from the other, for example? I’ll try that and update the post :) Thank you!

EDIT:

Okay, I’ve implemented your suggestions to the best of my ability, and the problem persists :(
This is my updated model:

functions {
  real[] dz_dt(real t,       
               real[] z,     
               real[] theta, 
               real[] x_r,  
               int[] x_i) {
    real P = z[1];
    real I = z[2];

    real r = theta[1];  
    real k = theta[2];
    real p = theta[3];
    real o = theta[4];

    real dP_dt = r*P - k*P*I;
    real dI_dt = p*I*(P/(P + o));
    return { dP_dt, dI_dt };
  }
}
data {
  int<lower = 0> N;           
  real ts[N];                 
  real y_init[2];             
  real<lower = 0> y[N, 2];    
}
parameters {
  real<lower = 0> r; 
  real<lower = 0, upper = 1> k;
  real<lower = 0> p;
  real<lower = 0> o;
  real<lower = 0> z_init[2];  
  real<lower = 0> sigma[2];   
}
transformed parameters {
  real z[N, 2]
    = integrate_ode_bdf(dz_dt, z_init, 0, ts, {r, k, p, o},
                         rep_array(0.0, 0), rep_array(0, 0),
                         1e-10, 1e-10, 1e4);
}
model {
  r ~ normal(1, 1); // r = 0.2
  k ~ lognormal(-5, 1); // k = 0.001
  p ~ normal(1, 1); // p = 1
  o ~ normal(1000, 1); // o = 1000
  sigma ~ lognormal(-1, 1);
  z_init[1] ~ normal(1, 1); // Initial value of P
  z_init[2] ~ normal(1, 1); // Initial value of I
  for (m in 1:2) {
    y_init[m] ~ lognormal(log(z_init[m]), sigma[m]);
    y[ , m] ~ lognormal(log(z[, m]), sigma[m]);
  }
}
generated quantities {
  real y_init_rep[2];
  real y_rep[N, 2];
  for (m in 1:2) {
    y_init_rep[m] = lognormal_rng(log(z_init[m]), sigma[m]);
    for (n in 1:N)
      y_rep[n, m] = lognormal_rng(log(z[n, m]), sigma[m]);
  }
}

If you have any further suggestions, I’d love to hear them!
Re: @wds15, I’d like to make the step size smaller, but I’m just not sure how. Other than that, the priors seem like a likely point of issue…

Thank you again for all your help :)

The step size can be set in rstan via the control argument, which you pass as a list; see the rstan documentation for details.

@wds15,

Oh, I see––thanks so much! In case anyone ends up reading this, I’ll put it here:

sampling(model, data = data, chains = 2, iter = 2000, cores = 2, control = list(stepsize = 0.01), seed = 1)

As a quick update, I’ve had luck switching from the rk45 solver to the bdf solver, decreasing my tolerances, and increasing the maximum number of steps when fitting different ODE models, which is great! I haven’t had luck with providing a list of initial (true) values, which is weird because that’s a really pervasive piece of advice.

@Niko and @yizhang ,
Back to the model I originally asked about: unfortunately, nothing’s worked. I have no idea what the problem could be; this should be pretty straightforward, but after implementing all your suggestions and scouring the internet for more, I’m still where I started, getting this error immediately after beginning sampling:

Rejecting initial value:
  Log probability evaluates to log(0), i.e. negative infinity.
  Stan can't start sampling from this initial value.

Initialization between (-2, 2) failed after 100 attempts. 
  Try specifying initial values, reducing ranges of constrained values, or reparameterizing the model.
[1] "Error in sampler$call_sampler(args_list[[i]]) : Initialization failed."

Thanks so much for all your help.

Any chance you could upload the data so I won’t need to install lsoda?

Yes! Here it is! :)

Antia_StochSim.csv (1004 Bytes)


The last two rows of your data have P equal to zero, which is rejected by the lognormal likelihood. The model (see the updated version below) works after I removed these rows.
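
In R, that filtering step could look something like this (a sketch assuming the Antia_Sim data frame from earlier in the thread; the object name is just illustrative):

# The lognormal likelihood has no support at 0, so drop censuses where either
# population has hit zero before building the Stan data.
Antia_Sim_Pos <- subset(Antia_Sim, Parasites > 0 & ImmuneCells > 0)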

The model is also updated to the new Stan syntax (arrays, the new ODE interface, etc.); see the Stan user's guide for details.

functions {
  vector dz_dt(real t, vector z, real r, real k, real p, real o) {
    real P = z[1];
    real I = z[2];

    real dP_dt = r*P - k*P*I;
    real dI_dt = p*I*(P/(P + o));
    return [ dP_dt, dI_dt ]';
  }
}
data {
  int<lower = 0> N;
  array[N] real<lower=0> ts;
  vector<lower=0>[2] y_init;
  array[N] vector<lower = 0>[2] y;
}
parameters {
  real<lower = 0> r;
  real<lower = 0, upper = 1> k;
  real<lower = 0> p;
  real<lower = 0> o;
  vector<lower = 0>[2] z_init;
  vector<lower = 0>[2] sigma;
}
transformed parameters {
  array[N] vector<lower=0>[2] z
  = ode_bdf_tol(dz_dt, z_init, 0, ts, 1e-8, 1e-8, 10000, r, k, p, o);
}
model {
  r ~ normal(1, 1); // r = 0.2
  k ~ lognormal(-5, 1); // k = 0.001
  p ~ normal(1, 1); // p = 1
  o ~ normal(1000, 1); // o = 1000
  sigma ~ lognormal(-1, 1);
  z_init[1] ~ normal(1, 1); // Initial value of P
  z_init[2] ~ normal(1, 1); // Initial value of I
  y_init ~ lognormal(log(z_init), sigma);
  for (i in 1:N) {
    y[i] ~ lognormal(log(z[i]), sigma);
  }
}

The modified data (“data.json”)

{
  "y_init": [1, 1],
  "N": 38,
  "ts": [1.79156001723049, 4.77115097443962, 7.67329746576085, 9.06827951143375, 10.3322216672961, 11.3924573467385, 12.6953388273251, 13.7914651781496, 14.9012504628246, 16.0009128293822, 17.1737619715379, 18.1773972819716, 19.2345730155911, 20.5195776131101, 21.5281455721025, 22.573465964267, 23.6129529680249, 24.6384116624291, 25.6528376200041, 26.6613676481726, 27.6729893293437, 28.6757763225696, 29.6816924935247, 30.6894151142125, 31.7145500585685, 32.7154245526956, 33.7810565434248, 34.799986634436, 35.8372502039475, 36.9004844446448, 37.9403016470873, 39.0062779061335, 40.039998952172, 41.0554900045709, 42.0714919759721, 43.1556516142503, 45.7814252753685, 47.4008352318612],
  "y": [
    [2, 1],
    [3, 1],
    [4, 1],
    [6, 1],
    [7, 1],
    [11, 1],
    [15, 1],
    [16, 1],
    [21, 1],
    [27, 1],
    [31, 1],
    [40, 1],
    [53, 1],
    [63, 1],
    [81, 1],
    [101, 1],
    [115, 1],
    [137, 1],
    [155, 1],
    [189, 1],
    [224, 1],
    [278, 1],
    [340, 1],
    [406, 1],
    [498, 2],
    [593, 5],
    [670, 12],
    [709, 18],
    [718, 23],
    [648, 34],
    [527, 52],
    [292, 78],
    [152, 92],
    [86, 100],
    [31, 106],
    [14, 108],
    [4, 115],
    [2, 120]
  ]
}

The sampling script:

library("cmdstanr")
mod <- cmdstan_model("./ode.stan")
mod$sample(data="./data.json", init=list(list(r=0.2, k=0.001, p=1, o=1000, z_init=c(1,1), sigma=c(0.4,0.5))), chains=1)

Note that I only checked that the model samples past the initialization stage, not that the fit is valid.


Oh wow––thank you so much! This works! Can’t believe I missed this… I have an argument in the data generating function to stop the simulation when a population dies out, but clearly that’s not sufficient. Thank you so much for all your help and perseverance :)
