This is a nice initiative! Below I give some feedback.
Overall impressions
In the current climate of increasingly complex models which are all flawed in some obvious way, it's refreshing to see someone take a step back, pick a particular (single!) goal and go for it with a model of appropriate complexity. Kudos!
I very much like the idea that while the (first) model cannot estimate the true size of the epidemic, it can at least help us estimate whether control measures are working and by how much.
I guess the conclusion that "these early containment strategies have not yet stopped domestic transmission of the virus" is consistent with the notion that early intervention AND massive testing are what's needed for substantial reduction of transmission.
The disclaimer about the association being just an association at this point is very apt; there is another factor that's important in early-epidemic behaviour: stochasticity. With a few introductions, you can get massively different outcomes for the same system if you "run different seeds".
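To illustrate the point about seeds, here is a toy sketch (not the paper's model): a Poisson-offspring branching process started from a handful of introductions. Everything here (five introductions, R0 = 1.5, ten generations) is an arbitrary choice for the demo; only the random seed changes between runs.

```r
## Toy branching process: each case produces Poisson(R0) offspring.
## All parameter values are made up for illustration.
simulate_outbreak <- function(seed, n_intro = 5, R0 = 1.5, n_gen = 10) {
  set.seed(seed)
  cases <- n_intro   ## cases in the current generation
  total <- n_intro   ## cumulative outbreak size
  for (g in seq_len(n_gen)) {
    cases <- sum(rpois(cases, lambda = R0))
    total <- total + cases
    if (cases == 0) break  ## outbreak went extinct
  }
  total
}

## Same system, different seeds: final sizes vary wildly,
## from early extinction to hundreds of cases.
sapply(1:5, simulate_outbreak)
```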
Technical comments
-
The use of the word "mutated" is technically incorrect. The virus mutates. It just does. Whether the mutations it accumulates lead to appreciable differences in infectivity and transmissibility is determined by a host of factors. Moreover, except for very peculiar situations, it is safe to assume that whatever differences in dynamics are observed between countries are caused by (i) stochastic factors owing to the timing of introduction and the presence/absence of large gatherings, (ii) population characteristics such as early/late institution of social distancing, age structure, etc., or both.
-
Is this "However, the number of tests is increasing in Ict conditional on a country's willingness to test people. That is, regardless of how much a country wants to test people, as the outbreak grows the number of tests will increase though at very different rates." a reasonable assumption? Many countries simply lack the resources to test at any appreciable rate, right?
-
I didn't get how test imperfection is incorporated into the model. Is it "just" through the constraining of the intercept \alpha_3?
-
I'm not sure I get the figure (in the Estimation section) where most of the population is infected for a while and then the proportion drops down again. I don't suppose it is meant to be the same as the I(t) curve in a (normalised) SIR model, which shows the prevalence. But if it's not that, I don't understand what it is.
-
I know you want a simple model and I appreciate why that is. Depending on exactly what data you have at your disposal, however, you could include the reporting delay explicitly in the model relatively easily: https://onlinelibrary.wiley.com/doi/epdf/10.1002/sim.8303. Note I haven't thought in depth about this suggestion and it might very well be vacuous.
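One generic way to picture this (a sketch under my own assumptions, not necessarily the construction in the linked paper): expected *reported* counts as a convolution of the true incidence with an assumed reporting-delay distribution. All quantities below, including the delay distribution and the hypothetical incidence curve, are made up for illustration.

```r
## Assumed reporting-delay distribution: a discretised Poisson with mean
## 5 days, truncated at 14 days and renormalised (an arbitrary choice).
delay_pmf <- dpois(0:14, lambda = 5)
delay_pmf <- delay_pmf / sum(delay_pmf)

## Hypothetical true incidence: exponential growth over 50 days.
true_inc <- round(10 * exp(0.1 * (1:50)))

## Expected reported counts on day t: sum over possible delays d of
## (true incidence on day t - d) x Pr(delay = d).
expected_reported <- sapply(seq_along(true_inc), function(t) {
  d <- 0:min(14, t - 1)
  sum(true_inc[t - d] * delay_pmf[d + 1])
})
```

Early on, expected reported counts lag well behind true incidence, which is the distortion an explicit delay term would absorb.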
-
I'd put a vertical line at 1 in the random-effects plot of the proportion tested versus infected.
-
When you model \log q_{\text{ct}} - \log I_{\text{ct}}, can you not derive the distribution directly instead of approximating it by a standard normal? I say this because if you change things to, say, a Uniform(0.1, 0.59), you don't get a standard normal as the "best fit" anymore.
Here's some R code for you to play with in case you'd like to model Z_{\text{ct}} = \log q_{\text{ct}} - \log I_{\text{ct}} explicitly:
## Density of Y = log(U), U ~ Uniform(a, b):
## f_Y(y) = exp(y)/(b - a) for log(a) <= y <= log(b)
f_y <- function(y, a, b, log = FALSE) {
  lc <- log(b - a)
  ans <- y - lc
  if (!log) ans <- exp(ans)
  return(ans)
}
#
## Exact variance of Y = log(U) from E[Y^2] and E[Y]
## (antiderivatives obtained by integration by parts)
get_Y_var <- function(a, b) {
  EXsq <- (b * (log(b)^2 - 2 * log(b) + 2) - a * (log(a)^2 - 2 * log(a) + 2)) / (b - a)
  EX <- (b * (log(b) - 1) - a * (log(a) - 1)) / (b - a)
  ans <- EXsq - (EX)^2
  return(ans)
}
##
## Density of the ratio W = U1/U2, U1 and U2 i.i.d. Uniform(a, b):
## f_W(w) = (M^2 - m^2)/(2 (b - a)^2) on (a/b, b/a),
## with m = max(a, a/w) and M = min(b, b/w)
analytic_W <- function(w, a, b) { ## assumes i.i.d.
  c0 <- all(w > a/b, w < b/a) ## support indicator
  k <- (b - a) * (b - a)
  m <- max(a, a/w)
  M <- min(b, b/w)
  soln <- function(L, U) ((U * abs(U)) - (L * abs(L))) / (2 * k)
  d0 <- soln(L = m, U = M)
  dens <- c0 * d0
  return(dens)
}
analytic_W <- Vectorize(analytic_W)
##
## Density of Z = log(W) = log(U1) - log(U2), by change of variables:
## f_Z(z) = f_W(exp(z)) * exp(z)
f_z <- function(z, a, b, log = FALSE) {
  ans <- log(analytic_W(exp(z), a, b)) + z
  if (!log) ans <- exp(ans)
  return(ans)
}
f_z <- Vectorize(f_z)
###############
## Monte Carlo check of f_y
M <- 1e6
a <- 0.01
b <- 0.5
Y <- log(runif(M, min = a, max = b))
hist(Y, probability = TRUE)
curve(f_y(x, a, b), min(Y), max(Y), add = TRUE, lwd = 2)
abline(v = c(log(a), log(b)), lwd = 2, lty = 2)
integrate(function(x) f_y(x, a, b), log(a), log(b)) ## should be ~1
###
## Monte Carlo check of f_z against the normal approximation
Y1 <- log(runif(M, min = a, max = b))
Y2 <- log(runif(M, min = a, max = b))
Z <- Y1 - Y2
#
hist(Z, breaks = 50, probability = TRUE)
curve(f_z(x, a, b), min(Z), max(Z), col = 2, add = TRUE) ## exact density
curve(dnorm(x, 0, sqrt(2 * get_Y_var(a, b))), min(Z), max(Z), col = 3, add = TRUE) ## normal approx.
abline(v = c(log(a) - log(b), log(b) - log(a)), lwd = 2, lty = 2)
integrate(function(x) f_z(x, a, b), log(a) - log(b), log(b) - log(a)) ## should be ~1
integrate(function(x) x^2 * f_z(x, a, b), log(a) - log(b), log(b) - log(a)) ## exact variance of Z
2 * get_Y_var(a, b) ## matches the integral above
var(Z) ## Monte Carlo estimate
Typos, nitpicks, etc.
- "recieve" is misspelt
- "as the virus increases" -> "as the number of cases increases" / "as the epidemic progresses"
- Is the term "binomial proportion" technically sound? If X ~ Binomial(N, p), I don't feel comfortable calling X a "binomial proportion" of N; X is a count, and X/N would be the proportion. But I might be very wrong.
- " Stan, a Markov Chain Monte Carlo sampler" Stan is a language. Dynamic HMC is a sampler.