ODE solver errors & model parameterisations

Hello,

I am working on an age-structured Bayesian SEIR transmission model, fitted to daily mortality/incidence counts. The benchmark is a SEIR model without age structure, yielding the estimated posterior number of new COVID-19 cases in time. The benchmark experiment utilized an Euler forward integration solver, and the posterior outputs agreed with the posterior outputs using the trapezoidal rule.

During the development stage for the age-structured version (max 4 age groups for now), I have implemented several tricks in an effort to scale up my ODE-based model:

  1. Reparameterize the ODE by solving the ODEs for deviations from the baseline, see Sections 5.2-5.3 of this paper.
  2. Work with scaled versions of the number of individuals within each compartment wrt the overall population.
  3. Specify the tolerances manually to rel_tol = 1e-08, abs_tol = 1e-09; these values agree with the order of the ODE solutions. See Stan forum post 1 and forum post 2.
  4. Ensure positivity of the ODE solution: I add 10*abs_tol to the ODE solution, in an attempt to be above the absolute error at which I run the integrator [Stan forum post:]
  5. Then, I play arround with max_num_steps, starting from 1e3 to 1e8, etc.

Since I do not know in advance if the system of ODEs is stiff or not, I try the the RK45 and BDF solvers. I always start with a small MCMC run (~100 iters), then go for the main MCMC run with a setting like:

  • No initial values provided.
  • 4 parallel chains
  • Warmup iterations = 500
  • Post-warmup iterations = 500
  • adapt_delta = 0.8
  • max_treedepth = 19

Question 1 (ODE solver)
I’ve been receiving the CVode error Exception: CVode failed with error flag -4. for one of the Markov chains. Can someone please point out to a source where there is an explanation for all these error messages? I cannot find them in the Documentation for cvode v5.7.0, unless I am missing something.
E.g. Error in integrate_ode_bdf, CVode error flag -1 indicates that the solver reaches maximum time step limit before finding an accurate solution, but this is only explained by @yizhang here.

Question 2 (Parameterisations)
In my different MCMC runs I end up with divergences (around 1/3 of the overall post-warmup iterations), accompanied by the related message to reparameterize the model.
I want to understand what kind of parameterisations are meant in the context of disease transmission model, where the model parameters are constrained by physical reality?
For instance: I assume a fixed incubation and recovery period and I explore a time-varying transmission rate. Previous work supports a parameterization by sampling from a Standardized Normal, then performing a non-centered parameterization (NCP) to recover the transmission rate that enters the ODE.

From experience, are there any other parameterizations that can help the sampler? Eg. it is sensible to assume a Half-Normal or Half-Cauchy prior for some of my parameters. Let y_i \in \mathbb{R}^+, i = 1, \ldots K be the parameters of interest stored in a vector, with
y_i \sim N_+(\mu_i, \sigma_y)~~\sigma_y \sim N_+(0, 5),~~\mu_i\in \mathbb{R}^+
and \mu_i\in \mathbb{R}^+, i = 1, \ldots K is the data-defined location.

The Reparameterisation section of the Stan User Guide does not point out to NCPs on constrained spaces. Suggestions include an extra transformation, see this post but this may change the meaning of the prior, which I wish to avoid happening.

Tagging users @yizhang, @wds15, @srossell, @bbbales2, @storopoli, @Bob_Carpenter based on previous posts.

Computational setting:
RStan 2.21.3
WIndows 10

Thanks.

1 Like

In Stan 2.29 the error msg is enhanced with more details. Flag -4 indicates convergence failure. See CVODES manual appendix “CVODES constants”. Note that “CVODES” and “CVODE” are two different libraries.

Unfortunately it’s practically impossible to make suggestions without knowing more about the model structure. It seems you attempt to use non-centered parameterization but have concerns on the prior. Have you tried prior predictive checks?

Note sure if this is what you’re looking for, but for positive parameters one of the ways you can do something analogous to the non-centered parameterization is instead of

parameters {
   vector<lower-0>[N] beta;
}
model {
   beta ~ lognormal(log(mu), sigma);
}

you can do

parameters {
   vector[N] beta_raw;
}
transformed parameters {
   vector<lower-0>[N] beta = exp(mu + beta_raw);
}
model {
   beta_raw ~ std_normal();
}

Also, 1e-9 seems like a pretty strict tolerance. Do you really need that much error control? A rule of thumb I use is to try to get the solver error a couple orders of magnitude below the SD of the observations. Although I still think choosing your ODE solver error so that your posterior bias is acceptable is an area that needs to be researched more.

@yizhang

Unfortunately it’s practically impossible to make suggestions without knowing more about the model structure. It seems you attempt to use non-centered parameterization but have concerns on the prior.

At this stage, the transmission model I am working with assumes that the population is divided into 3 age groups, for example (0-39, 40-64, 65+). I wish to account for uncertainty and introduce variance in the contact structure while estimating transmission. This leads to 3*3 = 9 parameters. Letting \mu be a synthetic contact matrix available from previous studies, I have \mu =

           0-39     40-64       65+
0-39  7.0502055 0.7511938 0.2052035
40-64 0.8685895 3.0219170 0.2861569
65+   0.3837518 0.4628150 1.2482333

For computational reasons, I then vectorise \mu and introduce it to the stan program via the data block. I define the following priors

\sigma_{cm} \sim Cauchy_+(0, 5)\\ cm_{sample} \sim N_+(\mu, \sigma_{cm}).

Letting A = 3, p_sigmaCM = c(0,5), The relevant parts of the Stan program are:

data {							  
int<lower = 1> A;  // Number of age groups
real mu[A*A]; // Vectorized contact matrix; first A values correspond to number of contact between age class 1 and other classes, etc.
real p_sigmaCM[2]; // Cauchy[Location, scale] on sigma_CM
}

parameters {
  real<lower = p_sigmaCM[1]> sigmaCM;
  real<lower = 0> cm_sample[A*A]; // Sampled contact matrix (vectorized)

model {
  sigmaCM   ~ cauchy(p_sigmaCM[1], p_sigmaCM[2]);
  cm_sample ~ normal(mu, sigmaCM); 
}

While the effective sample sizes are satisfactory, the posterior summaries show elevated SDs compared to the mean and I wish to explore if a NCP would help.

@arya

Note sure if this is what you’re looking for, but for positive parameters one of the ways you can do something analogous to the non-centered parameterization is instead of

Thanks for the suggestion. I tried a simple check in R for entry [1,1] of the matrix, before proceeding to an actual MCMC run:

library(extraDistr)
cm_observed <- 7.0502055
sigma_cm    <- extraDistr::rhcauchy(1000000, 5)
cm_ncp      <- rlnorm(1000000, meanlog = log(cm_observed), sdlog = mean(sigma_cm))

> mean(cm_ncp)
[1] 3.790893e+116

> log( mean(cm_ncp) )
[1] 268.4325

These values are way off, I’m afraid.