Hierarchical Gompertz model

maxbiostat · March 25, 2020, 1:29pm

Ok, I’m going to stick my neck out here:

I find it very unlikely that the parameters can just be anything. The restrictions are needed in order for the curve to make sense.

That happens with probability zero, no?

jroon · March 25, 2020, 1:34pm

I don’t see a principled reason that k cannot be lower than zero !?

No. X is a positive integer so if T is positive number if it can definitely happen that (X - t) =0 or very very close to 0.

andre.pfeuffer · March 25, 2020, 1:59pm

pairsplot

I am refering now to:
A * exp(-exp(-k * (X - T)))

and not the wikipedia version, where restriction makes sense or not, I don’t won’t this discuss yet.

if k < 0 the curve is decreasing, but ok.
And the pairs plot shows the unconstrained parameters. Do you see any pathologies?

Max_Mantei · March 25, 2020, 2:01pm

I didn’t say this! I’ve put the exp in the equation, so that the mean and random effects can both vary freely on the log scale. k and delay are still strictly positive in this specification. Sorry, if that was unclear.

andre.pfeuffer · March 25, 2020, 2:31pm

-c * (k * X) I wonder about the reference to this expression in your code?

jroon · March 25, 2020, 3:36pm

Sorry guys I’ve alot going on today hard to keep up with this too. Thank you all for your input.

Sorry I misunderstood you above so!

I don’t understand what your question is ?

andre.pfeuffer · March 25, 2020, 8:02pm

[quote=“jroon, post:38, topic:13724”]

May I ask to specify the reference to the expression above, it’s in the code you’ve posted.
It’s something I don’t understand.

jroon · March 25, 2020, 8:35pm

Sorry purely a language thing - I don’t know what “specify the reference to the expression above” means!!

Juan_Ignacio_de_Oyarbide · March 27, 2020, 9:08am

Hi everyone,

I wrote a vectorised version of the Hierarchical Gompertz model, but it is not in brms. It is currently working but there are still problems to solve in terms of fit. I don’t know if you want to discuss it here even that it is not in brms.

data {
  int<lower=0> N;
  int<lower=0> M;
  int<lower=0> K_a;
  int<lower=0> K_c;
  // int<lower=0> n_continents;
  vector[N] y;
  vector[N] day;
  vector[M] day_projected;
  int<lower=0> K; //population effects
  matrix[N,K] design_matrix_1;
  matrix[K,K_a] design_matrix_a;
  matrix[K,K_c] design_matrix_c;
  // matrix[K,n_continents] design_matrix_a_intercept;
  matrix[M,K] design_matrix_proj;
}
parameters {
  vector<lower=0>[K] a;
  real<lower=0> a_intercept;
  real<lower=0> sigma_a;
  
  vector<lower=0>[K] b;
  real<lower=0> b_intercept;
  real<lower=0> sigma_b;
  
  vector<lower=0>[K] c;
  real<lower=0> c_intercept;
  real<lower=0> sigma_c;
  
  vector<lower=0>[K_a] a_2;
  // real<lower=0> a_intercept;
  // real<lower=0> sigma_a;
  
  vector<lower=0>[K_c] c_2;
  // real<lower=0> c_intercept;
  // real<lower=0> sigma_c;
  
  real<lower=0> sigma;
}
transformed parameters{
  vector[N] a_index= design_matrix_1 * a;
  vector[N] b_index = design_matrix_1 * b;
  vector[N] c_index = design_matrix_1 * c;

  vector[N] mu= a_index .* exp(- b_index .* exp(-c_index .* day));
  
  // vector[K] a_intercept_index= design_matrix_a_intercept * a_intercept;
}
model {
//priors
target+= normal_lpdf(a|a_intercept + design_matrix_a * a_2 ,sigma_a);
target+= normal_lpdf(a_intercept|6,3);
target+= cauchy_lpdf(sigma_a|0,3);

target+= normal_lpdf(b|b_intercept,sigma_b);
target+= normal_lpdf(b_intercept|5,10);
target+= cauchy_lpdf(sigma_b|0,25);

target+= normal_lpdf(c|c_intercept + design_matrix_c * c_2,sigma_c);
target+= normal_lpdf(c_intercept|0.1,0.5);
target+= cauchy_lpdf(sigma_c|0,1);

target+= student_t_lpdf(sigma|5,0,1.5);

target+= lognormal_lpdf(y|mu,sigma);
}
generated quantities{
 real yrep[N]= lognormal_rng(mu,sigma);
 real yproj[M]= lognormal_rng(design_matrix_proj*a .* exp(-design_matrix_proj*b .* exp(-design_matrix_proj*c .* day_projected)),sigma);
}
´´´

jroon · March 27, 2020, 10:18am

Hi @Juan_Ignacio_de_Oyarbide yes for sure welcome - in recent posts we moved to coding in Stan directly anyhow. What fit problems are you having ?

Also I don’t understand the design matrix coding you are doing - I see that also in brms generated code. What data do you pass into design_matrix_1 for example ?

Juan_Ignacio_de_Oyarbide · March 27, 2020, 10:32am

Well, first I think I should improve the plots so any idea is welcomed. The outcomes are the following:

In the second level regression (state level), I am trying to analyze the effect of late lockdown on the “a” and “c” parameters. The point is that we only have experienced the cases of China (very early lockdown) and Italy (we don’t see that have clues about the reaching point).
In the previous plots the trend is explosive for some countries, US, France, UK, Germany, but I am not able to explain where it comes from. The predictors are just static because they are demographic factors. The lockdown effect is measured by days between the first case and the day of lockdown, so it is how much it affects the late decision of social distance, but not really observed. If the model is extrapolating from the very short lockdown in China (and 80.000 confirmed cases), then maybe that’s why we get such explosive curves (not that unreasonable though).
Anyways I think it is worth of discussion the priors and predictors on “a”, as well as any data available that we could get.

Juan_Ignacio_de_Oyarbide · March 27, 2020, 10:42am

The design matrix is just the X of this product X\beta. It has N rows corresponding to the number of observations and K columns corresponding to the number of parameters. If you have \beta_i where i=1,...,9 and n=1,....1000, which means 10 parameters (maybe 2 predictors with 4 levels in each and 1 intercept) and 1000 observations.

You can generate this matrix with the function model.matrix(~ predictor1+ predictor2, data).

jroon · March 27, 2020, 11:34am

Not sure I can answer your questions I’m haven’t been able to make a hierarchical model work at all - my Stan is just not good enough. One thought though is are you assuming the same lockdown effect across countries or a hierarchical one. Every country is doing slight variations of lockdown at different times so effect may vary ?

Juan_Ignacio_de_Oyarbide · March 27, 2020, 12:47pm

Yes, the effect is linear on the number of days from first case to day of lockdown. It’s giving the penalization on the mean of “a” for each extra day after first case. But I think it’s not doing much, I changed the lockdown date of China to see what happens and the results are nearly the same. I don’t get where the varying effects of “a” are coming from, given that there are no much to learn from the data (no flat curves whatsoever). There are 19 countries in the model so far:

jroon · March 27, 2020, 12:48pm

My best attempt at a hierarchical model - trying to add hierarchy just to the A term:

data {
  int<lower=1> N;  // number of observations
  vector[N] Y;    // dependent variable
  vector[N] X;    // independent variable
  int<lower=1> N_group;   // number of groups
  int groups[N];  //  group assigments
}

parameters {
  vector[N] A;  // population-level effects
  real <lower=0>k;  // population-level effects
  real delay;  // population-level effects

  vector[N_group] A_group; // Vector of group effects
  
  real <lower=0> sigma;  // residual SD
}

transformed parameters{
  vector[N] mu; // population mean
  vector[N] nlp_A;

  for (n in 1:N) {
    // initialize linear predictor term
    nlp_A[n] = A_group[ groups[n] ] * A[n];
    // compute non-linear predictor value
    mu[n] = nlp_A[n] * exp( -exp( -(k * (X[n] - delay))) );
  }
}

model {

  // priors
  nlp_A ~ normal(0, 100);
  A_group ~ normal(0, 100);
  k ~ normal(0, 20);
  delay ~ normal(0, 20);
  sigma ~ student_t(3, 0, 1000);
  

  // likelihood
  Y ~ normal(mu, sigma);

}

Loads of divergences. suspect I’ve misspecified something but its so long since I worked direct with Stan code can’t figure it out. Anyone see a problem?

Juan_Ignacio_de_Oyarbide · March 27, 2020, 1:17pm

There are a few things to review. But you can add me to skype and I can explain it to you. I’d also recommend you to write the model down.

maxbiostat · March 27, 2020, 1:25pm

I would second that. It’s important to bounce back and forth between Stan code and maths. Kudos to all of you for collaborating in this time of crisis!

jroon · March 27, 2020, 1:49pm

In that model I’m trying to do this:

cum_cases ~ A * exp( -exp( -(k * (day - delay) ) ) ),
             A ~ 1 + (1 | Country.Region) ,
             k ~ 1 ,
             delay ~ 1

(in brms equation format because I don’t know how to do Latex here)

@Juan_Ignacio_de_Oyarbide thanks for offer but i’m on a work zoom call here can’t split my brain quite that much 😅 - also I’m using a slightly different parametrisation than you are - but its basically equivalent.

@maxbiostat - I just can’t bear to look at anymore badly extrapolated exponential fits that everyone with a copy of Excel seems to be making 🙈🙈

Max_Mantei · March 27, 2020, 3:34pm

You can use Latex like this: $latex expression$ , for inline and

$$
multi-line
latex expression
$$

for multi-line (without ticks obviously). Your model would be something like

\begin{align} \mu &= Ae^{-\exp(-k(\text{day} - \text{delay}))} \\ \log(A) &\sim \text{Normal}(\mu_A, \sigma_A) \text{ ...or whatever} \\ y &\sim \text{Lognormal}(\mu, \sigma) \text{ ...or whatever} \\ &\text{...priors for k and delay} \end{align}

Cheers!
Max

jroon · March 27, 2020, 4:05pm

Thanks Max.

My current model is:

\mu = A_ie^{-exp(-k(day - delay))} \\Y \sim N(\mu, \sigma) \\ A \sim N(0, 10000) \\k \sim N(0,10) \\ delay \sim N(0,10) \\\sigma \sim Student_t(3, 0, 1000)

(I hope I correctly depicted the hierarchy in the equation).

My code for this is:

data {
  int<lower=1> N;  // number of observations
  vector[N] Y;    // dependent variable
  vector[N] X;    // independent variable
  int<lower=1> N_group;   // number of groups
  int groups[N];  //  group assigments
}

parameters {
  real <lower=0>k;  // population-level effects
  real delay;  // population-level effects

  vector[N_group] A_group; // Vector of group effects
  
  real <lower=0> sigma;  // residual SD
}

transformed parameters{
  vector[N] mu; // population mean

  for (n in 1:N) {
    // initialize linear predictor term
    // compute non-linear predictor value
    mu[n] = A_group[groups[n]] * exp( -exp( -(k * (X[n] - delay))) );
  }
}

model {

  // priors
  A_group ~ normal(0, 10000);
  k ~ normal(0, 20);
  delay ~ normal(0, 20);
  sigma ~ student_t(3, 0, 1000);
  

  // likelihood
  Y ~ normal(mu, sigma);

}

Note the code it different to previous post I changed it.

Edit - actually - this code works now after I loosened the priors 😇
Edit2 - fixed formula above thanks to error pointed out by @Juan_Ignacio_de_Oyarbide

Topic		Replies	Views
Fit Bayesian hierarchical model with three levels using brms Modeling brms	3	702	March 24, 2021
Hierarchical CAR model in brms Modeling	5	1417	June 21, 2021
Erratic running time for hierarchical poisson model brms	4	500	May 11, 2020
Bayesian hierarchical model - how to spot convergence issues early Modeling rstan , fitting-issues , epidemiology , covid-19	1	490	October 19, 2022
Hierarchical Non-linear Gompertz growth curve in BRMS Modeling hypothesis-testing , brms	2	834	March 28, 2023

Hierarchical Gompertz model

Related topics