Hello,
I am working on an age-structured Bayesian SEIR transmission model, fitted to daily mortality/incidence counts. The benchmark is a SEIR model without age structure, yielding the estimated posterior number of new COVID-19 cases in time. The benchmark experiment utilized an Euler forward integration solver, and the posterior outputs agreed with the posterior outputs using the trapezoidal rule.
During the development stage for the age-structured version (max 4 age groups for now), I have implemented several tricks in an effort to scale up my ODE-based model:
- Reparameterize the ODE by solving the ODEs for deviations from the baseline, see Sections 5.2-5.3 of this paper.
- Work with scaled versions of the number of individuals within each compartment wrt the overall population.
- Specify the tolerances manually to rel_tol = 1e-08, abs_tol = 1e-09; these values agree with the order of the ODE solutions. See Stan forum post 1 and forum post 2.
- Ensure positivity of the ODE solution: I add 10*abs_tol to the ODE solution, in an attempt to be above the absolute error at which I run the integrator [Stan forum post:]
- Then, I play arround with max_num_steps, starting from 1e3 to 1e8, etc.
Since I do not know in advance if the system of ODEs is stiff or not, I try the the RK45 and BDF solvers. I always start with a small MCMC run (~100 iters), then go for the main MCMC run with a setting like:
- No initial values provided.
- 4 parallel chains
- Warmup iterations = 500
- Post-warmup iterations = 500
- adapt_delta = 0.8
- max_treedepth = 19
Question 1 (ODE solver)
I’ve been receiving the CVode error Exception: CVode failed with error flag -4.
for one of the Markov chains. Can someone please point out to a source where there is an explanation for all these error messages? I cannot find them in the Documentation for cvode v5.7.0, unless I am missing something.
E.g. Error in integrate_ode_bdf, CVode error flag -1
indicates that the solver reaches maximum time step limit before finding an accurate solution, but this is only explained by @yizhang here.
Question 2 (Parameterisations)
In my different MCMC runs I end up with divergences (around 1/3 of the overall post-warmup iterations), accompanied by the related message to reparameterize the model.
I want to understand what kind of parameterisations are meant in the context of disease transmission model, where the model parameters are constrained by physical reality?
For instance: I assume a fixed incubation and recovery period and I explore a time-varying transmission rate. Previous work supports a parameterization by sampling from a Standardized Normal, then performing a non-centered parameterization (NCP) to recover the transmission rate that enters the ODE.
From experience, are there any other parameterizations that can help the sampler? Eg. it is sensible to assume a Half-Normal or Half-Cauchy prior for some of my parameters. Let y_i \in \mathbb{R}^+, i = 1, \ldots K be the parameters of interest stored in a vector, with
y_i \sim N_+(\mu_i, \sigma_y)~~\sigma_y \sim N_+(0, 5),~~\mu_i\in \mathbb{R}^+
and \mu_i\in \mathbb{R}^+, i = 1, \ldots K is the data-defined location.
The Reparameterisation section of the Stan User Guide does not point out to NCPs on constrained spaces. Suggestions include an extra transformation, see this post but this may change the meaning of the prior, which I wish to avoid happening.
Tagging users @yizhang, @wds15, @srossell, @bbbales2, @storopoli, @Bob_Carpenter based on previous posts.
Computational setting:
RStan 2.21.3
WIndows 10
Thanks.