Hierarchical Gompertz model

Hi @mevers - I dont’ mind that, but i’m basically done with it now because non-model problems came up.
@martinmodrak Yes I discovered that ymax needed to be a parameter and I did that using a missing value approach with success! Thanks to you both for your help! Good luck @mevers with your project!

Hello,

I’ve been following this thread for a while because I’m trying to build the same model for spanish regions but for cumulative deaths. I’m a beginner in bayesian stuff but work myself pretty well in R language.

Following @jroon 's code I arrived to this model for spanish data:

library(tidyverse)
library(brms)

raw <- read_csv("spain_data.csv")

formula <- casos ~ A * exp( -exp( -(k * (days - delay) ) ) )

form_mult <- bf(formula,
                A ~ 1 + (1 | ccaa),
                k ~ 1 + (1 | ccaa),
                delay ~ 1 + (1 | ccaa), 
                nl = TRUE)

priors <- c(
  prior(normal(0, 13000), nlpar = "A", lb=200),
  prior(normal(.1, .05), nlpar = "k", lb=0),
  prior(normal(5, 20), nlpar = "delay", lb=0),
  prior(student_t(3, 0, 5000), class = "sigma"),
  prior(student_t(4, 3000, 5000), class = "sd", group = "ccaa", nlpar = "A"),
  prior(student_t(4, 0, 0.03), class = "sd", group = "ccaa", nlpar = "k")
)

mod <- brm(form_mult, 
           data = raw,
           prior = priors, 
           seed = 1234,
           family = gaussian("identity"),
           iter = 4000,
           chains = 4, 
           cores=4, 
           sample_prior = "no", 
           control = list(adapt_delta = 0.99,
                          max_treedepth = 12))

This code gave me this results:

It works well I guess, but the CI seems to me too optimistic for the bigger regions. It looks like the CI are more or less the same size in all regions and that doesn’t seem reasonable to me. Is there a way to correct this and let de CI in each region be different? I thought that the key was in this line of code but I’m not sure I understood what this prior means: prior(student_t(4, 3000, 5000), class = "sd", group = "ccaa", nlpar = "A")

Of course, if you see something wrong with the code (probably there is) please tell me.

Thank you all in advance.

spain_data.csv (21.0 KB)

2 Likes

Arhg… I had a long answer typed up and hit a wrong button and lost it. Ok bullet point version:

  1. I tried to run your model as you posted it using data you posted and it took a long time and there were many divergences and my fit was very different. Is your graph from the exact model you posted? (wondering if I need to update something on my system here).

  2. The model doesn’t include any term for the size of any of the areas and so the CI are mostly governed be the number of datapoints for each area and since your data has 46 points for every area this is why CI similar by area.

  3. Another problem with this model is that it doesn’t allow for the fact that different areas could be at different points in the S curve. As it happens all the Spanish areas are approaching the asymptote so in your plot this doesn’t manifest any problem, but imagine one area was still in the exponential growth phase near the bottom of the S - this model would force and S shape on it even though that area is in effect yet only an exponential curve.

For these reasons and others I moved to a reparameterised model Hierarchical Gompertz model However I gave up on the whole idea before perfecting the model. For some of the issues raised in the thread here, but also because I became more wary about the data that is available for various countries for various reasons I came to the conclusion that do do any useful models would need better data than count data.

Thanks for your reply.

About 1) yes the plot was made with that same model but with the values gotten from the predict() function. Is this right or should I use the conditional_effects() function? I didn’t get correctly the way you control the ‘scales’ argument for the facets, so I did it the long way:

# Fitted
pred <- as.data.frame(predict(mod, probs = c(0.01, 0.99), re_formula = NULL))

colnames(pred)[3] <- "low"
colnames(pred)[4] <- "upp"
pred$ccaa <-  raw$ccaa
pred$days <- raw$days
pred$observed <- raw$casos
pred$pob <- raw$pob
pred$low[pred$low < 0] <- 0

pred %>% 
  ggplot(aes(x = days, y = observed)) + 
  geom_point(aes(color = dentro), size = 0.4) +
  geom_ribbon(aes(ymin = low, ymax = upp ), alpha = 0.3) +
  facet_wrap(~ccaa, scales = "free_y") 

About the time taken for the model to fit, well… I thought it was normal. The divergences had to do with Maximum treedepth exceeding? I did get those but I read the Stan documentation and read that those weren’t so worrying and got to do with sampling efficiency.

  1. How would I include this term for the size of each are? as a prior or do I have to change the formula? Can you elaborate a bit?

  2. When I first tried this model with data for Madrid (so just one area) and left 15 observations out of the sample, I got this which I thought at the moment was a pretty great prediction:

The fitted values for the in sample data points don’t seem like been forced as an S shape. The dashed line (the out of sample prediction) does look like an S, but because it looks like it got right the prediction.

When I fit this model just for one area tweaking the priors (maybe not the way their supposed, I guess, I’m just starting with the bayesian framework and probably doing things incorrectly) I get really impressive results. But again I’m probably doing something wrong.

I’ll try with the reparameterised model you present there. I read that a few days ago but didn’t understand exactly what a “missing value approach” is. I’ll look that up. Just to be clear the formula for that new equation should be wrote in R this way? isn’t Ymax the same as A (the upper asymptote)?:

formula <- casos ~ ymax * exp(exp( -(k * (tmax - d) - exp(-k*(t-d)) ) ) )

Thank you very much for taken your time answering a bayes noob xD

Oh yes that produced the same plot for me. Strange!

Don’t think you can. I didn’t think of a way anyhow. What I came to realise is the Gompertz model is fundamentally a bad model for the task at hand for the reasons we are talking about and probably others.

Yes but that curve was already fairly well advanced. Also because its not hierarchical its borrowing information from other areas and so this one area is less constrained that it would be in a hierarchical model. If you look at some of my monstrosity-plots earlier in the thread you can see this effect happening. The fact that your current model is so slow and gets many divergences is a warning that the model is badly specified - even if the plots look nice.

I could not get any model to work well on the hierarchical level. No you are probably not doing something wrong - its just a badly posed problem I concluded. Ultimately also, if we did get this approach to work, its not really very different to the IHME model which has received alot of criticism: COVID-19
Behind these predictions is a generalized loistic model that is quite similar really: Methods - Curve Fit Part of the reason I stopped working on this is I realisd I could not make a model as good as IHME did, and they got alot of negetive attention over this model.

Sorry - this reflects my confusion at various points. Ymax is the last observed Y which is different to A - its a bad choice of name but my understanding was shifting as I was doing it. By “missing value approach” I meant I treated Ymax as a missing value of Y, which was necessary for the reasons martin pointed out. This was tricky to get work and looking at code now its a real mess, if it was clean i’d share it, but I’d need a couple of hours to sort it out and I dont’ have time right now

Also a noob here please remember!!

Here, for a curve considered, t is varying, while d is a fixed unknown one.
Also, we usually say x and y are non-identifiable when both x and y are parameters.
t is not parameter, while d is a parameter.

How identifability is an issue?

I share with you the model we were building a couple of weeks ago. The last thing we tried to do was to fit a function on the c parameter to add more flexibility. Although predictions improved for some countries (we’ve been testing for the last 2 weeks), for others are pretty bad. Also, interpretability is almost gone by this time-based function on c. This is due to fact that c is a relative measure with respect to A, and it will increase when the curve is close to the flattening part.

How do you know it is not a model problem? I am not sure which model you are referring to, but I don’t think the models presented here are flexible enough to evaluate the performance of the HMC. For the Gibbs sampler we need the full conditional distributions, right?

Yes, I agree with you that Gompertz is not flexible enough. Clearly the measures taken by governments and other factors influenced the curve for the past weeks. But I can’t see how we can evaluate HMC over this.

Please read our paper:
It seems that our problem is similar

We have used the Richard’s curve: please read our paper if interested. Richard’s curve is actually more frequently used in describing infection trajectory in the literature.

1 Like

Please read our published paper on PLOS ONE.
We also shared working code.

2 Likes