Fitting issue while fitting differential equations, compartmental model

Hey,

I am currently trying to fit a system of differential equations to some data using Stan. I am fairly sure that the model should be able to describe the data properly, but it only works sometimes: I cannot run it with multiple chains at all, because I always get the error message that the E-BFMI is zero for at least one chain. Running it with only one chain sometimes gives me a fit and sometimes the same error message. When I do get a fit it looks very good, but the problem is that it does not work every time and I do not understand why that is the case.
I am using PyStan, and the system of differential equations I am trying to fit looks like this:
\frac{dT_C}{dt} = -k_1 T_C + k_2 T_P
\frac{dT_P}{dt} = k_1 T_C - k_2 T_P - V_M T_P
The Stan model I am using is the following:

functions {
  real[] sho(real t,
             real[] y,
             real[] theta,
             real[] x_r,
             int[] x_i) {
    real T_C = y[1];
    real T_P = y[2];
    real k_1 = theta[1];
    real k_2 = theta[2];
    real V_M = theta[3];
    real dydt[2];
    dydt[1] = -k_1*T_C + k_2*T_P;
    dydt[2] = k_1*T_C - k_2*T_P - V_M*T_P;
    return dydt;
  }
}
data {
  int<lower=1> T;              // number of observation time points
  int<lower=1> D;              // number of parameters with informative priors
  int<lower=1> K;              // number of data sets fitted jointly
  real y[K,T];                 // observations
  real t0;                     // initial time
  real y0[K,2];                // initial states (T_C, T_P) for each data set
  real ts[T];                  // observation times
  real parameters_log[D];      // lognormal prior locations for theta[1:D]
  real std_log[D];             // lognormal prior scales for theta[1:D]
  real volume_init[2];         // not used in the model block below
  real volume;                 // not used in the model block below
}
transformed data {
  real x_r[0];  // no real-valued data needed by the ODE right-hand side
  int x_i[0];   // no integer data needed by the ODE right-hand side
}
parameters {
  real<lower=0> sigma;     // relative noise level
  real<lower=0> theta[4];  // k_1, k_2, V_M, and the scaling used below
}
model {
  real y_hat_log[T,2];
  real y_hat_log_volume[T];
  sigma ~ normal(0.08, 0.01);
  for (d in 1:D){
    theta[d] ~ lognormal(parameters_log[d], std_log[d]);
    }
  for (k in 1:K){
    y_hat_log = integrate_ode_rk45(sho, y0[k], t0, ts, theta, x_r, x_i);
    for (t in 1:T){
      y_hat_log_volume[t] = y_hat_log[t, 1]/theta[4];
    }
    for (t in 1:T){
      y[k,t] ~ normal(y_hat_log_volume[t], sigma*sqrt(y_hat_log_volume[t]^2));
    }
  }
}

This is what the data I am trying to describe looks like. I am fitting all of the data sets at the same time to get one set of parameters out. Fitting each one individually gives me the same error as fitting them all at the same time.

It would be very nice if someone has an idea of how I could get this to run more reliably, or sees an error that I am not seeing at the moment.

Best regards,
akkarin09

Do the fitting troubles remain if you initialize your chains near where you think the solution is?

Yes, the fitting trouble remains if I initialize with values that give me a good fit.

@bbbales2
Doesn't this sigma*sqrt(y_hat_log_volume[t]^2) just return sigma*fabs(y_hat_log_volume[t])?

Yeah good catch on the sqrt(x^2) thing.

Ouch, what sort of errors are you getting?

Does the error go away if you set one of the thetas to the true values?

I used the sqrt(x^2) because it fixed a problem I had. If I just used sigma*y_hat_log_volume[t], I got this error message:
RuntimeError: Exception: normal_lpdf: Scale parameter is -3.9192e-09, but must be > 0! (in 'unknown file name' at line 56)
which referred to this line,
y[k,t] ~ normal(y_hat_log_volume[t], sigma*y_hat_log_volume[t]);
I am not sure if there is a better way of solving this problem than by doing what I did.

I tried running it again without the sqrt(x^2) and got 3 chains to work, but again not every time. I am still sometimes getting the E-BFMI = 0 message after sampling.

I am getting the same error if I fix one of my thetas.

Seeing that the error scales with the state, i.e. is multiplicative, shouldn't it then be lognormally distributed?
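For illustration, a minimal sketch of what a multiplicative-error observation model could look like in the model block, assuming the data and the predictions stay strictly positive (the lognormal is only one way to encode this):

    for (t in 1:T) {
      // sketch only: lognormal noise whose spread scales with the prediction
      y[k,t] ~ lognormal(log(y_hat_log_volume[t]), sigma);
    }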

I’ll try the model out in a bit.

It works with simulated data?

What is the reason to use an ODE solver here? This ODE system should be solvable analytically. Mathematica is your friend. You will love the speed of the analytic solution.
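For reference, the system in the first post is linear with constant coefficients, so the solution is a sum of two exponentials whose rates are the eigenvalues of the coefficient matrix:
\frac{d}{dt}\begin{pmatrix} T_C \\ T_P \end{pmatrix} = \begin{pmatrix} -k_1 & k_2 \\ k_1 & -(k_2 + V_M) \end{pmatrix} \begin{pmatrix} T_C \\ T_P \end{pmatrix},
\qquad \lambda_{\pm} = \frac{-(k_1 + k_2 + V_M) \pm \sqrt{(k_1 + k_2 + V_M)^2 - 4 k_1 V_M}}{2}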

An alternative to try is the matrix exponential, given that the coefficients are constant.
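As a rough sketch (not the posted code), assuming the same theta ordering (k_1, k_2, V_M, scaling) as in the original model, the rk45 call could be replaced by something like:

    matrix[2,2] A;
    A[1,1] = -theta[1];             // -k_1
    A[1,2] = theta[2];              //  k_2
    A[2,1] = theta[1];              //  k_1
    A[2,2] = -theta[2] - theta[3];  // -(k_2 + V_M)
    for (t in 1:T) {
      // exact solution of the linear system at time ts[t]
      vector[2] y_t = matrix_exp(A * (ts[t] - t0)) * to_vector(y0[k]);
      y_hat_log_volume[t] = y_t[1] / theta[4];
    }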

What are the typical scales? ~1e5 for the initial conditions, and for the parameters (prior mean + std)? Also, definitely try what @wds15 said; the negative values are probably an artifact of the numerical integration, which should be avoided with an analytical solution.

Run the integrator with 1e-6 absolute tolerance and just add 1e-3 to all the outputs from the integrator. This assumes that 1e-3 is sufficiently close to 0 in your problem given your scales. That will avoid the negativity problems.
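A minimal sketch of that suggestion, using the signature of integrate_ode_rk45 that takes explicit relative tolerance, absolute tolerance, and maximum number of steps (the 1e-3 offset is only an example and has to be negligible on the scale of your data):

    y_hat_log = integrate_ode_rk45(sho, y0[k], t0, ts, theta, x_r, x_i,
                                   1e-6, 1e-6, 10000);
    for (t in 1:T) {
      // small positive offset so the likelihood scale cannot go negative
      y_hat_log_volume[t] = (y_hat_log[t,1] + 1e-3) / theta[4];
    }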

This problem could be solved analytically, but once I move to more complex versions of it, that is no longer possible. Then I would need to fit it like this, and if the simple case is not working properly, I do not even want to try the more complex version.

I have not tried the matrix exponential. It might work in this case but, again, if I move to a more complex version I am not able to use it anymore, since I am going to lose my linearity.

Thanks for the fix for the negativity problem. It also helped with my fitting; I got it to run on four chains.

My initial conditions are around 5e+3, and for the parameters I use (0.1, 0.06, 0.03) as means with standard deviations of (0.005, 0.005, 0.005), because I know they are quite close to values that can fit my data.

Maybe just assume that these parameters are known? That will massively speed up your model. The question you need to ask yourself is whether the uncertainty is worth propagating. Fixing 3 parameters in the ODE means solving 6 fewer ODE equations (the sensitivity states) in your case.
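If you wanted to try that, a minimal sketch (with a hypothetical data name k_fixed, not the posted code) would be to pass the known rates in as data and only keep the remaining quantities as parameters:

    data {
      // ... existing declarations ...
      real<lower=0> k_fixed[3];   // hypothetical: known values for k_1, k_2, V_M
    }
    parameters {
      real<lower=0> sigma;
      real<lower=0> theta4;       // only the scaling is still estimated (still needs a prior)
    }
    model {
      real theta[4] = {k_fixed[1], k_fixed[2], k_fixed[3], theta4};
      // ... ODE solve and likelihood exactly as before ...
    }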

Good idea, and then you can also ask for pointers here.

Was that regularization enough? With these priors, there shouldn’t be issues anymore, or are there?

Will you also use these priors later? I guess they will be wider?

I don't want to fix them because I want to get the values from the fit. All of the values are "unknown" to me, and I want to find them using my fit.

It looks like that might have been enough. I haven't encountered any E-BFMI errors since then, but the run time varies a lot. I am not sure if that's on my end or not, but sometimes the sampling finishes in something between 20 s and 100 s, and at other times it can run for over an hour with nothing happening.

I'm really not the expert, but to me it sounds like your chains may occasionally get stuck in (unphysical) metastable wells in the posterior landscape, which, to add insult to injury, stress the numerical integrators due to excessive stiffness, resulting in the large runtime variations.

See this thread, Fitting ODE models: best/efficient practices?, and most importantly the first link in the first post, which points to a case study by @charlesm93. Maybe the steps taken in the case study can help you?

Edit: Ah wait, but you say you initialize your chains near the correct values?

It is probably worth doing prior predictive checks to evaluate how the prior is working.

Comment the likelihood out of your model and run it (keep the ODE solve in).

If the model still doesn’t consistently run efficiently, then there’s something we need to do about the prior.

If the model does run efficiently, then look at the prior predictions and make sure that we're setting this model up to be easy to fit (and that the prior predictions aren't predicting stuff we really don't expect).
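For concreteness, a sketch of what the "comment out the likelihood" version of the model block could look like, leaving everything else as in the original:

    model {
      real y_hat_log[T,2];
      sigma ~ normal(0.08, 0.01);
      for (d in 1:D)
        theta[d] ~ lognormal(parameters_log[d], std_log[d]);
      for (k in 1:K) {
        // keep the ODE solve so its cost and stability are still exercised
        y_hat_log = integrate_ode_rk45(sho, y0[k], t0, ts, theta, x_r, x_i);
        // y[k,t] ~ normal(...);  // likelihood commented out for the prior check
      }
    }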


Yeah, it might get stuck sometimes.
Thank you for the recommendation. I haven't had the time to read it yet, but I will look into it.
My initial values should be pretty close to the correct values.

When I removed the likelihood from my code, it gave me results every time I ran it. I guess that shows that the priors are fine with the model.

For the prior predictions, I ran the model in the generated quantities section with my priors as my parameters, and the result looks reasonably close to what I would like it to look like in the end. It might just be like @Funko_Unko said, that I am getting stuck in some wells in the posterior landscape.

Yeah I guess so. Your model doesn’t have many parameters, so you could try making pairs plots of all the draws, coloring draws by chain, and looking for the differences (or traceplots too).

Either way the goal would be trying to figure out if chains indeed are going to different places.

You could also try setting different combinations of parameters to known values to try to find a minimum model that has these problems.