Hierarchical Gompertz model

jroon · March 27, 2020, 5:27pm

Would it be easier only to model an effect on one parameter?

Are you having any divergences in the model ?

a represents the upper asymptote if I’m not mistaken? Shouldn’t it be different by country ?
There is an issue with identifiability in these models - see section on sigmoid curves here: Identifying non-identifiability

jroon · March 27, 2020, 6:02pm

What data are you looking for?

Juan_Ignacio_de_Oyarbide · March 27, 2020, 6:58pm

It has different meanings, since they represent different things on the function.

No, I am not having any divergence.

It is different by country, but they share information through the common intercept.

Juan_Ignacio_de_Oyarbide · March 28, 2020, 10:56am

I don’t see the hierarchy. You are not pooling the A_i's since they come from different priors N(0,10000). A complete pooling model is the one that sets A∼N(0,10000), a no-pooling model A_i∼N(0,10000) and a partial-pooling model or hierarchical model A_i∼N(\gamma,\sigma_A) and you have to define hyperpriors on \gamma and \sigma_A.

jroon · March 28, 2020, 11:04am

Sorry Juan I made a mistake in that formula - it is actually only one pooled prior in the code:

A_group ~ normal(0, 10000);

That model fits well! I run into trouble when adding hierarchy to the other parameters because of identifiability issues.

By the way this might be the shortest and yet clearest explanation I’ve seen yet on pooling. Thanks!

jroon · March 28, 2020, 11:11am

If this is useful for Ireland:
March 12th partial restrictions brought in.
March 28th (today!) more restrictions brought in.
More information here: https://www.citizensinformation.ie/en/health/covid19_overview.html

I’m sure you can get case counts from usual sources

Juan_Ignacio_de_Oyarbide · March 28, 2020, 11:12am

A_group is a vector so it is not a complete pooled model. If you want to put a hierarchy on A, the model should be the following

A_i\sim N(\gamma_{group}, \sigma_A)

\gamma_{group} \sim N(0,10000)

So every level of A_i shares information through \gamma_{group} (the global intercept), and the variation across the A_i is captured by \sigma_A.
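To make "sharing information through \gamma_{group}" concrete, here is a minimal Python sketch of shrinkage in a toy normal-normal model. It is not the Gompertz model from this thread: it assumes each group has n noisy observations with known noise sd `sigma_y`, and that `gamma` and `sigma_a` are known. All names and numbers are hypothetical and chosen only for illustration.

```python
def partial_pooling_mean(y_bar, n, sigma_y, gamma, sigma_a):
    """Conditional posterior mean of one group-level A_i in a toy model:
    y_ij ~ N(A_i, sigma_y), A_i ~ N(gamma, sigma_a), all variances known."""
    data_precision = n / sigma_y**2      # information coming from the group's own data
    prior_precision = 1.0 / sigma_a**2   # information coming from the shared prior
    weight = data_precision / (data_precision + prior_precision)
    return weight * y_bar + (1.0 - weight) * gamma


# Hypothetical numbers: group mean 100, shared intercept 50.
y_bar, n, sigma_y, gamma = 100.0, 5, 10.0, 50.0
for sigma_a in (0.1, 10.0, 1000.0):
    # Small sigma_a pulls A_i towards gamma (close to complete pooling);
    # large sigma_a leaves A_i near its own data (close to no pooling).
    print(sigma_a, partial_pooling_mean(y_bar, n, sigma_y, gamma, sigma_a))
```

In a full hierarchical model \gamma_{group} and \sigma_A are themselves estimated, so the amount of shrinkage is learned from the data rather than fixed as it is here.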

Juan_Ignacio_de_Oyarbide · March 28, 2020, 11:18am

It could be nice to somehow measure the speed of testing and the health system response. But very hard to put it in a factor.

jroon · March 28, 2020, 11:20am

So, in essence, I wanted each A to have its own mean, and no, I don’t want to share information across levels of A. I’m not convinced that it should be pooled a priori. In this parameterisation, A represents the upper asymptote, which is very different by country and also multifactorial, with many factors unknown to us. I don’t think it makes sense to share information across this parameter. I do think it makes sense to pool on the k parameter of my model, as this relates more closely to the properties of the virus itself.

jroon · March 28, 2020, 11:21am

It seems to be really hard to get good data on the number of tests run - I think countries are not doing as many as they want and so are reluctant to share.

Juan_Ignacio_de_Oyarbide · March 28, 2020, 11:22am

Yes, sure. It’s just I thought you were talking about a Hierarchical model :).

The A is the big question and concern that we all have right now.

jroon · March 28, 2020, 11:42am

Yes but it doesn’t need to be fully hierarchical on every variable! Given the huge differences in A between countries I think it would be borrowing too much information from the few countries with really large case counts.

Juan_Ignacio_de_Oyarbide · March 28, 2020, 11:55am

You can check the picture I uploaded before about A. I still don’t get how those posteriors are generated but you get very different HDIs.
On the other hand, it would be reasonable to put a dependence structure between c and A to capture the reaction from governments to flatten the curve. I don’t know how to do this.

jroon · April 1, 2020, 6:01pm

Martin - how did you come up with the substitution/transformation? I had pen and paper out to try to work out an analogous transformation for the following model but I got hopelessly stuck:

Y = A e^{-\exp(-k(t - d))}

Identifiability issues occur when (t - d) = 0 (I believe)

Edit: My code for the model is in this thread + post: Hierarchical Gompertz model

Edit 2: note t is data = time in days

mevers · April 1, 2020, 9:57pm

@jroon I think in your case of the three-parameter Gompertz function

f(t) = A \exp{\left(-\exp{\left(-k(t - d)\right)}\right)}\,,

there are two potential non-identifiability “problem areas”:

1. As you mentioned, the case t = d, which corresponds to f(d) = A e^{-1}, and
2. the asymptotic limit for large values of t, \lim_{t\to\infty} f(t) = A.

Both areas leave k and d unidentifiable.
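The two cases can be checked numerically. The following is a small Python sketch with made-up values for A, k and d (not estimates from any data in this thread): at t = d the curve equals A/e for every k, and far beyond d it is close to A whatever k is.

```python
import math


def gompertz(t, A, k, d):
    """Three-parameter Gompertz curve f(t) = A * exp(-exp(-k * (t - d)))."""
    return A * math.exp(-math.exp(-k * (t - d)))


A, d = 1000.0, 30.0  # hypothetical asymptote and shift
for k in (0.05, 0.2, 1.0):
    at_d = gompertz(d, A, k, d)          # equals A / e regardless of k
    late = gompertz(d + 200.0, A, k, d)  # near A regardless of k
    print(k, at_d, late)
```

So a data point at exactly t = d, or data lying only on the upper plateau, does not by itself distinguish between different values of k.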

jroon · April 2, 2020, 10:07am

Regarding 2 - sorry, I should have said - t is data (time). In my use it’s the number of days the Covid outbreak has been happening, so it’s nowhere close to infinity, thankfully!

martinmodrak · April 3, 2020, 7:24am

No magic trick I fear - I just thought hard about what can be learned from the data and what cannot, did quite a bit of math and then tested a bunch of things until I found the one that worked. Also simulating data and seeing how changes in the parameters change the shape of the curve helped me get some insight.

If I get it right, t is known while A, k, d are parameters, right? So treating Y = f(A, k, d, t), one way to go about this might be to take t_{min}, t_{max} as the min/max t you observe and try to use y_{min} = f(A, k, d, t_{min}) and y_{max} = f(A, k, d, t_{max}) as new parameters - if you can solve for the other parameters given y_{min}, y_{max} - but I am really just guessing here. The value at the midpoint of the observed t range might also be a good parameter. The point is that such values are by construction well constrained by data (but might be impractical because it is hard to derive the parameters you need from them).

EDIT: To be a bit more specific, since Gompertz is also a sigmoid curve. Here are some specific ideas I used with the logistic sigmoid:

- If I observe only the start of the curve, the upper plateau (A) is not determined. So instead we use the value at midpoint of observed data as a parameter. When I observe the upper plateau, this would roughly correspond to A/2, but it is constrained by data even when I only see the lower plateau.
- If the inflection point of the curve is far from the observed data range, I only see the lower or the upper plateau, i.e. almost constant function. In other words if the inflection point is far from observed data, it has little influence on the actual shape of the curve. In this case the “slope” of the curve is also not informed by the data. To overcome this, we used the location of the inflection point on the x-axis as another parameter AND we put an informative prior on it to constrain it “close” to the range of observed data. This makes the slope somewhat identified as well.
  - This changes the interpretation of the model fit! If the posterior for the inflection point has notable mass outside the observed data range, we have to be aware that this part of the posterior is influenced almost exclusively by the prior and the actual inflection point can be much further from the observed data range than what the posterior might suggest. In this case the fitted slope is also just a consequence of the prior.

For the logistic sigmoid those resulted in reasonably neat formulae, not sure if that would be the case for Gompertz, so other tricks might be necessary.
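For what it is worth, the y_{min}, y_{max} idea above does seem to have a closed form for the Gompertz curve if A is kept as the third parameter. The Python sketch below is an illustration under that assumption, not something worked out in this thread. It uses \log(-\log(y/A)) = -k(t - d), which is linear in t, so two curve values give k and d. The function name `k_d_from_endpoints` and the numbers are hypothetical.

```python
import math


def gompertz(t, A, k, d):
    """Three-parameter Gompertz curve f(t) = A * exp(-exp(-k * (t - d)))."""
    return A * math.exp(-math.exp(-k * (t - d)))


def k_d_from_endpoints(A, y_min, y_max, t_min, t_max):
    """Recover k and d from the curve values at t_min and t_max.

    Requires 0 < y_min < y_max < A and t_min < t_max.
    """
    if not (0.0 < y_min < y_max < A) or not (t_min < t_max):
        raise ValueError("need 0 < y_min < y_max < A and t_min < t_max")
    # log(-log(y / A)) equals -k * (t - d) on the curve.
    log_u_min = math.log(-math.log(y_min / A))
    log_u_max = math.log(-math.log(y_max / A))
    k = (log_u_min - log_u_max) / (t_max - t_min)
    d = t_min + log_u_min / k
    return k, d


# Round trip with made-up parameter values.
A, k, d = 1000.0, 0.1, 20.0
t_min, t_max = 5.0, 40.0
y_min, y_max = gompertz(t_min, A, k, d), gompertz(t_max, A, k, d)
print(k_d_from_endpoints(A, y_min, y_max, t_min, t_max))
```

Whether this parameterisation actually samples better than (A, k, d) would need testing, since A itself remains poorly determined when only the early part of the curve is observed.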

jroon · April 4, 2020, 10:00am

Great answer thanks - lots to think on!!

jroon · April 5, 2020, 4:22pm

Ok just to check my thinking here before I model this:
Given my Gompertz function, where t = time in days:

Y = f(t) = A e^{-\exp(-k(t - d))}

I said let:

y_{max} = A e^{-\exp(-k(t_{max} - d))}

Applied some algebra and got:

f(t) = y_{max} \cdot e^{\exp(-k(t_{max} - d)) - \exp(-k(t - d))}

where y_{max}, t and t_{max} are all taken from data, meaning I now have a 2-parameter model.

Am I thinking along the right lines here? Presumably, after fitting this, I would then calculate A in the generated quantities block using the middle equation here?
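The algebra can be sanity-checked numerically. This Python sketch, with made-up parameter values, compares the original curve with the y_{max} form and recovers A by inverting the y_{max} equation, A = y_{max} e^{\exp(-k(t_{max} - d))}. The helper names are hypothetical.

```python
import math


def gompertz(t, A, k, d):
    """Original form: f(t) = A * exp(-exp(-k * (t - d)))."""
    return A * math.exp(-math.exp(-k * (t - d)))


def gompertz_ymax(t, y_max, k, d, t_max):
    """Same curve written in terms of its value y_max at t_max."""
    return y_max * math.exp(math.exp(-k * (t_max - d)) - math.exp(-k * (t - d)))


def asymptote_from_ymax(y_max, k, d, t_max):
    """Invert y_max = A * exp(-exp(-k * (t_max - d))) for A."""
    return y_max * math.exp(math.exp(-k * (t_max - d)))


A, k, d, t_max = 1000.0, 0.1, 20.0, 40.0  # made-up values
y_max = gompertz(t_max, A, k, d)
for t in (0.0, 10.0, 20.0, 40.0):
    print(t, gompertz(t, A, k, d), gompertz_ymax(t, y_max, k, d, t_max))
print(asymptote_from_ymax(y_max, k, d, t_max))
```

The two forms agree up to floating-point error, so A can be computed after fitting from the posterior draws of y_{max}, k and d.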

@mevers - I hope you don’t mind me part-invading your thread!

Edit: tests for single-country fits (i.e. no hierarchy) look good so far! Will update as I progress!
Edit2: Algebra incorrect deriving last equation - fixed now.

martinmodrak · April 14, 2020, 7:33am

Hi,
it looks roughly good (didn’t check the algebra), but y_{max} will be another parameter, not taken from data - if you take y_{max} from data, you are ignoring the observation noise for this value.

Does that make sense?
