I’ve finally submitted my first paper that uses Stan, an attempt to model excess mortality in Canada over a span of fourteen years. Who knows if or when it’ll get published, but in the meantime the preprint and source code are public. The source code will probably be of most interest here. Most notably, I wrote two different Stan “libraries” for cubic spline interpolation, both of which include routines for integrals and derivatives.
https://gitlab.com/hjhornbeck/excess_deaths_canada_2010_2023/-/blob/main/models/nr_splines.stan
https://gitlab.com/hjhornbeck/excess_deaths_canada_2010_2023/-/blob/main/models/mitchell_splines.stan
There’s also Stan code to handle heaps, reservoir sampling, and double-precision summation. I could have sworn there was a central repository for sharing Stan code, but all of my searches have come up empty, so this will have to do for now. Most of the repository carries a CC BY-NC-SA license, but a lot of the simpler low-level functions are CC0.
As for the paper itself, here is the abstract from the preprint:
Understanding the causes of death in a population is vital to guide effective health policy. In this paper, we build a novel population-level model of mortality in Canada. It is capable of handling multiple causes of death simultaneously, has a more intuitive interpretation than prior models, allows for variable temporal resolution, and incorporates uncertainty into its outputs. Our model estimates that the official estimates of influenza only capture one in twenty of all deaths associated with influenza, and official COVID-19 mortality is half of all COVID-19-associated mortality. It finds younger demographics appear more susceptible to COVID-19 mortality than estimated, suggests there are multiple periodic patterns to mortality for all demographics, and that the per-capita rate of mortality would have decreased if not for influenza, COVID-19, and drug poisonings. We also examine and critique our model in depth, explore alternative parametrizations, and find it is weakest at capturing susceptibility to influenza and COVID-19-linked mortality. Full results and source code are public.
I don’t know how useful the rest of the Stan code will be. I think there are some nifty ideas in there, but this was also my first time working with Stan, and my statistical knowledge is largely self-taught. There’s probably some silly or embarrassing stuff as well (at one point I thought log_sum_exp() wasn’t vectorized), but I am only human. Critiques are welcome, of course; it’s tough to improve without peer review.
