Rejecting initial value

fitting-issues

#1

Hello everybody. I have started using Stan for the first time for a project involving parameter inference in an ODE system (Toggle Switch), but I am having some issues that I do not fully understand unfortunately. Due to the type of data that I am using to fit the model there are two different inputs to the system that change through time (I am stopping and restarting the ODE solver at each switching time), introduced into the solver by x_r. I have tried the code without fitting to ensure that everything works as expected but once I introduce the fitting into the model section I keep receiving this error when using the non-stiff solver integrate_ode_rk45:


Rejecting initial value:
_ Error evaluating the log probability at the initial value._
Exception: normal_lpdf: Location parameter is nan, but must be finite! (in ‘model345468825766_Model2_RStan_BayesianInference_SingleExperiments’ at line 242)

Rejecting initial value:
_ Error evaluating the log probability at the initial value._
Exception: integrate_ode_rk45: initial state[3] is nan, but must be finite! (in ‘model345468825766_Model2_RStan_BayesianInference_SingleExperiments’ at line 228)

Initialization between (-2, 2) failed after 100 attempts. _
_ Try specifying initial values, reducing ranges of constrained values, or reparameterizing the model.

[1] “Error in sampler$call_sampler(args_list[[i]]) : Initialization failed.”
[1] “error occurred during calling the sampler; sampling not done”


and this when using the stiff solver integrate_ode_bdf:


Unrecoverable error evaluating the log probability at the initial value.
Exception: CVode failed with error flag -4 (in ‘model1c387ceb3644_Model2_RStan_BayesianInference_SingleExperiments’ at line 228)

_[1] "Error in sampler$call_sampler(args_list[[i]]) : " _
[2] " Exception: CVode failed with error flag -4 (in ‘model1c387ceb3644_Model2_RStan_BayesianInference_SingleExperiments’ at line 228)"
[1] “error occurred during calling the sampler; sampling not done”


Sometimes, with the non-stiff solver even the rejection of some initial values the code manages to continue, however I do not think that this should happen (let me know if I am wrong please). Also, after checking the code and try different things I have noticed that if when the input values are 0 (introduced through x_r) I modified to be a value close to zero (1e-10) the errors disappear, which I am not sure why.

Since this is my first post, please let me know if something it is not clear or you need some more information about anything. Thanks for your help. My code is at it follows:


functions{
  // Function containing the ODE to be used for the inference
  real[] Toogle_one(real t, real[] y, real[] p, real[] x_r, int[] x_i){
    // Inputs
    real u_IPTG = x_r[1];
    real u_aTc = x_r[2];
    // Parameters
    real k_IPTG = p[1];
    real k_L_pm0 = p[2];
    real k_L_pm = p[3];
    real theta_T = p[4];
    real theta_aTc = p[5];
    real n_aTc = p[6];
    real n_T = p[7];
    real k_T_pm0 = p[8];
    real k_T_pm = p[9];
    real theta_L = p[10];
    real theta_IPTG = p[11];
    real n_IPTG = p[12];
    real n_L = p[13];
    
    //Equations
    real dInd_dt[3];
    dInd_dt[1] = k_IPTG*(x_r[1]-y[1]);
    dInd_dt[2] = ((1/0.1386)*(k_L_pm0+(k_L_pm/(1+((y[3]/theta_T)*1/(1+(x_r[2]/theta_aTc)^n_aTc))^n_T))))-0.0165*y[2];
    dInd_dt[3] = ((1/0.1386)*(k_T_pm0+(k_T_pm/(1+((y[2]/theta_L)*1/(1+(y[1]/theta_IPTG)^n_IPTG))^n_L))))-0.0165*y[3];
    //RESULTS
    return dInd_dt;
  }
  
  vector SteadyState(vector init, vector p, real[] x_r, int[] x_i){
    vector[2] alpha;
    // Parameters
    real k_IPTG = p[1];
    real k_L_pm0 = p[2];
    real k_L_pm = p[3];
    real theta_T = p[4];
    real theta_aTc = p[5];
    real n_aTc = p[6];
    real n_T = p[7];
    real k_T_pm0 = p[8];
    real k_T_pm = p[9];
    real theta_L = p[10];
    real theta_IPTG = p[11];
    real n_IPTG = p[12];
    real n_L = p[13];
    // Equations
    alpha[1] = ((1/0.1386)*(k_L_pm0+(k_L_pm/(1+((init[2]/theta_T)*1/(1+(x_r[2]/theta_aTc)^n_aTc))^n_T))))/0.0165;
    alpha[2] = ((1/0.1386)*(k_T_pm0+(k_T_pm/(1+((init[1]/theta_L)*1/(1+(x_r[1]/theta_IPTG)^n_IPTG))^n_L))))/0.0165;
    // Results
    return alpha;
  }
  
}

data {
  // INPUTS
  int<lower = 0> tsl; // Length of time vector
  real ts[tsl]; // Total time vector
  int tsmax; // Last element of the time vector
  real preIPTG; // Pres values
  real preaTc;
  int Nsp; // Number of event switching points (including initial and final)
  int evnT[Nsp]; // Event switching points (including initial and final)
  real inputs[(Nsp-1)*2]; // Inputs as IPTG, aTc, IPTG, aTc, ...
  real IPTG[Nsp-1]; // Input values at each event
  real aTc[Nsp-1];
  
  // OBSERVABLES
  int<lower=0> stsl; // Number of sampling times
  int sts[stsl]; // Sampling times
  real GFPmean[stsl]; // Mean and standard deviations for proteins
  real RFPmean[stsl];
  real<lower=0> GFPstd[stsl];
  real<lower=0> RFPstd[stsl];
  
  int tonil; // Initial time for steady state
  real toni[tonil]; // Steady state time vector
  
}

transformed data {
  int nParms = 13; // Number of parameters of the model
  int Neq = 3; // Total number of equations of the model
  int Nevents = Nsp-1; // Number of events
  int x_i[0]; // Empty x_i object (needs to be deffined)
  real x_r[(Nevents)*2]=inputs; // Input values for each event ordered as IPTG, aTc, IPTG, aTc, ...
  real ivss[Neq-1]; // Initial experimental values for the calculation of the steady state ordered as IPTG, aTc
  real pre[2]; // Input values during the ON incubation ordered as IPTG, aTc
  
  ivss[1] = RFPmean[1];
  ivss[2] = GFPmean[1];
  
  pre[1] = preIPTG;
  pre[2] = preaTc;
  
}

parameters {
    real<lower=0> k_IPTG;
    real<lower=0> k_L_pm0;
    real<lower=0> k_L_pm;
    real<lower=0> theta_T;
    real<lower=0,upper=100> theta_aTc;
    real<lower=0> n_aTc;
    real<lower=0> n_T;
    real<lower=0> k_T_pm0;
    real<lower=0> k_T_pm;
    real<lower=0> theta_L;
    real<lower=0,upper=1> theta_IPTG;
    real<lower=0> n_IPTG;
    real<lower=0> n_L;

}

transformed parameters {
  real<lower=0> theta[nParms];
  theta[1] = k_IPTG;
  theta[2] = k_L_pm0;
  theta[3] = k_L_pm;
  theta[4] = theta_T;
  theta[5] = theta_aTc;
  theta[6] = n_aTc;
  theta[7] = n_T;
  theta[8] = k_T_pm0;
  theta[9] = k_T_pm;
  theta[10] = theta_L;
  theta[11] = theta_IPTG;
  theta[12] = n_IPTG;
  theta[13] = n_L;

}

model {
  real y_hat[tsl,Neq]; // Object that will include the solution for the ODEs for all the events
  real initialV[Neq]; // Initial value for the ODE each event
  int i; // Increasing index for the inputs
  vector[2] y_al; // Vector that will include the solutio of the algebraic solution for the steady state of the model
  real Y0[Neq]; // Initial values for the ODEs variables at the first event
  real ssv[tonil,Neq];
  // Priors definition 
  k_IPTG ~ normal(0.202,0.099);
  k_L_pm0 ~ normal(0.1515,0.07425);
  k_L_pm ~ normal(50.5,24.75);
  theta_T ~ normal(151.5,74.25);
  theta_aTc ~ normal(50,25);
  n_aTc ~ normal(2.5,1.25);
  n_T ~ normal(2.5,1.25);
  k_T_pm0 ~ normal(0.505,0.2475);
  k_T_pm ~ normal(5.05,2.475);
  theta_L ~ normal(151.5,74.25);
  theta_IPTG ~ normal(0.5,0.25);
  n_IPTG ~ normal(2.5,1.25);
  n_L ~ normal(2.5,1.25);

  // Calculation of initial guesses
  y_al = SteadyState(to_vector(ivss), to_vector(theta), pre, x_i); // Calculation of initial guesses for steady state
  Y0[1] = preIPTG;
  Y0[2] = y_al[1];
  Y0[3] = y_al[2];
  ssv = integrate_ode_rk45(Toogle_one, Y0,0,toni,theta,pre, x_i); // ON incubation calculation for the steady state
  Y0 = ssv[tonil];

  // ODE simulation
  initialV = Y0;
  i = 1;
    // Loop over the number of events
    for (q in 1:Nevents){
      int itp = evnT[q];  // Initial time points of each event
      real itp2 = evnT2[q];
      int lts = num_elements(ts[(evnT[q]+1):(evnT[q+1]+1)]);  // Length of the time series for each event
      real part1[lts,Neq]; // Temporary object that will include the solution of the ODE for each event at each loop
    
      if (q == 1){part1 = integrate_ode_rk45(Toogle_one, initialV,itp,ts[(evnT[q]+1):(evnT[q+1]+1)],theta,to_array_1d(inputs[i:(i+1)]), x_i);}
      else{part1 = integrate_ode_rk45(Toogle_one, initialV,(itp-1e-7),ts[(evnT[q]+1):(evnT[q+1]+1)],theta,to_array_1d(inputs[i:(i+1)]), x_i);}
    
      initialV = part1[lts];
      i=i+2;
      // Introduction of the result of part1 into the object y_hat
      for (y in (itp+1):(itp+lts)){
        y_hat[(y),]=(part1)[(y-itp),];
      };
    };
  
  for (t in 1:stsl){
    RFPmean[t] ~ normal(y_hat[(sts[t]+1),2],RFPstd[t]);
    GFPmean[t] ~ normal(y_hat[(sts[t]+1),3],GFPstd[t]);
  }
}


#2

This is not unusual for ODE models we’ve seen in the past.

First, yes it’s OK to reject and keep going. This is what’ll happen if you start with something like x ~ exp(1) where x is declared to be real—it’ll keep generating values until it gets a positive one from which it can start. Same thing in ODEs—it’ll keep going until it finds one it can process.

You can help that along two ways.

The crude way to do it is to provide initial values.

The better approach is to reparameterize. For example, fi you have a variable that is in the (-0.001, 0.001) range and values of 1 or -1 screw things up, you can recode as:

parameters {
  real theta_raw;

transformed parameters {
  real theta = theta_raw * 0.001;

Then, when theta_raw is initialized in (-2, 2), theta will be initialized in in (-0.002, 0.002).

We’re going to roll out direct support for this kind of linear transform in 2.19.


#3

Thanks for the explanation @Bob_Carpenter, I will try to reparameterize and see if things improve during the inference. However, it is good to know that some rejections are not unusual.
Also, as I referred in my last comment, is there any explanation to the fact that when the forced inputs introduced are never exactly 0 (close to) there is no apparent rejection at all? Or it is a problem that might come from my code?


#4

It would depend on the inputs.

Anyway, once I thought a bit more about what you were doing, it looks like you just have a regression for a bunch of variances.

The standard thing to do would be to use a log link, like in a GLM, so you apply the inverse transform to take the unconstrained linear predictor into the appropriate constrained space. Here, that’s just

vector[K] vars = exp(x * beta);

where x is the matrix of predictors and beta the coefficient vector.


#5

Yes, after some checks I noticed that the problem arises for some input values (particularly 0).
Also, I am afraid that I am not fully following you mean exactly as variances?