Hierarchical Gompertz model

Hi. I’m trying to fit a hierarchical growth model to Coronavirus case counts. I can fit a simple model to a single country with reasonable results, but as soon as I try to introduce country as a new level the results no longer make sense. I attach a file with data for a few countries: corona_counts.csv (10.5 KB) and the following should hopefully illustrate the problem.

library(tidyverse)
library(brms)
library(bayesplot)

case_counts <- read_csv("corona_counts.csv")
case_counts$Country.Region <- as.factor(case_counts$Country.Region)

### Fit single country
form_1 <- bf(cum_cases ~ A * exp( -exp( -(k * (day - delay) ) ) ),
             A ~ 1 ,
             k ~ 1 ,
             delay ~ 1 ,
             nl = TRUE)

priors1 <- c(
  prior(lognormal(0.65, 0.5), nlpar = "A", lb=0),
  prior(lognormal(0.65, 0.5), nlpar = "k", lb=0),
  prior(lognormal(0.65, 0.5), nlpar = "delay", lb=0),
  prior(student_t(5, 0, 1.5), class = "sigma"))

mod1 <- brm(form_1, data = case_counts %>% filter(Country.Region == "China"),
           prior = priors1, seed = 1234,
           family = lognormal(link = "identity", link_sigma = "identity"),
           chains = 4, cores=4, sample_prior = "no")
summary(mod1)
plot(conditional_effects(mod1), points = TRUE)

This gives me a not great but not totally out of the ballpark fit:

However when I try to make a hierarchical model by country:

### Fit multiple countries with hierarchical model
form_mult <- bf(cum_cases ~ A * exp( -exp( -(k * (day - delay) ) ) ),
             A ~ 1 + (1 | Country.Region),
             k ~ 1 + (1 | Country.Region),
             delay ~ 1 + (1 | Country.Region),
             nl = TRUE)

modmult <- brm(form_mult, data = case_counts,
           prior = priors1, seed = 1234,
           family = lognormal(link = "identity", link_sigma = "identity"),
           chains = 4, cores=4, sample_prior = "no")
summary(modmult)


conditions <- make_conditions(case_counts, "Country.Region")
plot(conditional_effects(modmult, conditions = conditions ), points = TRUE)

Things are rather ugly:


It appears to give me the same linear fit no matter the country.

Where am I going wrong here? Any tips appreciated!

  • Operating System: Catalina
  • brms Version: 2.12
1 Like

Could it be that Country.Region is from type <chr> and not a factor?

Please add the library specification to your R code next time. It makes life so much easier.
I tried to figure out which one you are using and assume read_csv is from package readr,
but we cannot not be 100% sure.

1 Like

Hi Andre - sorry I was pretty tired when I wrote the query and forgot to put in the library() commands. I added those above now.
I also tried changing Country.Region to a factor - no difference in results.

 modmult <- make_stancode(form_mult, data = case_counts,
             prior = priors1, seed = 1234,
             family = lognormal(link = "identity", link_sigma = "identity"),
             chains = 4, cores=4, sample_prior = "no")

Let’s look at the generated code:

  for (n in 1:N) {
    // compute non-linear predictor values
    mu[n] = nlp_A[n] * exp( - exp( - (nlp_k[n] * (C_1[n] - nlp_delay[n]))));
  }

and

  if (!prior_only) {
    target += lognormal_lpdf(Y | mu, sigma);
  }

In your case the likelihood should not be log_normal,
but target += log(mu).

I don’t have too much experience with brms, so it requires to change the family in your specification.
You may define an own Gompertz_lpdf and use instead of log_normal.

1 Like

Why do you say this? I chose lognormal based on the example here: Use domain knowledge to review prior distributions | mages' blog
The logic being lognormal outrules negative values for cumulative cases.
Also note the lognormal works fine for the single country - things only go wrong for multiple countries in hierarchical model.

I’m afraid I don’t know how to define my own Gompertz_lpdf 😢

I see, I thought the Gompertz as a distribution function instead of Bernoulli.
I was totally off track. (But, there was no specification.)

The parameter mu must be on log-scale.

nlform2 <- bf(loss_ratio ~ log(ulr * (1 - exp(-(dev_year*phi)^omega))),
             ulr ~ 1 + (1|AY), omega ~ 1, phi ~ 1, 
             nl = TRUE)
m2 <- brm(nlform2, data = loss, 
family = lognormal(link = "identity", link_sigma = "log"),

See the log after ~.

1 Like

Ok I tried the following for single country and multiple countries respectively

form_1 <- bf(cum_cases ~ log( A * exp( -exp( -(k * (day - delay) ) ) ) ),
             A ~ 1 ,
             k ~ 1 ,
             delay ~ 1 ,
             nl = TRUE)

mod1 <- brm(form_1, data = case_counts %>% filter(Country.Region == "China"),
           prior = priors1, seed = 1234,
           family = lognormal(link = "identity", link_sigma = "log"),
           chains = 4, cores=4, sample_prior = "no")

# Multiple countries as hierarchical model
form_mult <- bf(cum_cases ~ log( A * exp( -exp( -(k * (day - delay) ) ) ) ),
             A ~ 1 + (1 | Country.Region),
             k ~ 1 + (1 | Country.Region),
             delay ~ 1 + (1 | Country.Region),
             nl = TRUE)

modmult <- brm(form_mult, data = case_counts,
           prior = priors1, seed = 1234,
           family = lognormal(link = "identity", link_sigma = "log"),
           chains = 4, cores=4, sample_prior = "no")

It made the single country model conditional effects fit much, much worse, while the mulitple country model improved just a little:


Edit: But I note that the multiple countries fit is the same regardless of country - i.e. the blue line looks identical in every country. Have I misspecified the hierarchy in the model somehow ?

Just wondering - could priors be a factor here? Do I need to change the priors for the hierarchical version ?

I am not so familar with brms, maybe @paul.buerkner can help us.

1 Like

I would start using a Gaussian() instead of a lognormal() family. This is of course not entirely accurate as it predicts negative values for small case counts but should make it way more easier to set up the model in general. Once that works, you can still go ahead and expend it to more appropriate families/distributions.

1 Like

Ok I tried these:

mod1 <- brm(form_1, data = case_counts %>% filter(Country.Region == "China"),
           prior = priors1, seed = 1234,
           family = gaussian(),
           chains = 4, cores=4, sample_prior = "no")

modmult <- brm(form_mult, data = case_counts,
           prior = priors1, seed = 1234,
           family = gaussian(),
           chains = 4, cores=4, sample_prior = "no")

And the post pred plots look worse again:

I must say I’m confused as to why you are advising a Gaussian family. The first model above performs well for a single country with the lognormal family. I don’t understand why the hierarchical version doesn’t produce different conditional effects by country which surely should be true regardless of family ?

The log-normal family is numerically tougher and might have additional pathologies that interact non-linearly with the conditional mean specification. Simplifying is almost surely always the way.

I tend to shy away from brms questions because I really don’t use the package and am not very familiar with its syntax, but let’s give this a try: can you (i) show the output for the estimates of the parameters in both cases (single country and hierarchical) and (ii) show the output of make_stancode() for both models? I find it easier to read what the Stan code is doing.

3 Likes

I believe, the log() around the formula is no longer necessary for the Gaussian() family.

2 Likes

Ok so I fit a single country as follows:

### Fit single country
form_1 <- bf(cum_cases ~ A * exp( -exp( -(k * (day - delay) ) ) ),
             A ~ 1 ,
             k ~ 1 ,
             delay ~ 1 ,
             nl = TRUE)

priors1 <- c(
  prior(lognormal(0.65, 0.5), nlpar = "A", lb=0),
  prior(lognormal(0.65, 0.5), nlpar = "k", lb=0),
  prior(lognormal(0.65, 0.5), nlpar = "delay", lb=0),
  prior(student_t(5, 0, 1.5), class = "sigma"))

mod1 <- brm(form_1, data = case_counts %>% filter(Country.Region == "China"),
           prior = priors1, seed = 1234,
           family = gaussian,
           chains = 4, cores=4, sample_prior = "no")

Stancode is:

> stancode(mod1)
// generated with brms 2.12.0
functions {
}
data {
  int<lower=1> N;  // number of observations
  vector[N] Y;  // response variable
  int<lower=1> K_A;  // number of population-level effects
  matrix[N, K_A] X_A;  // population-level design matrix
  int<lower=1> K_k;  // number of population-level effects
  matrix[N, K_k] X_k;  // population-level design matrix
  int<lower=1> K_delay;  // number of population-level effects
  matrix[N, K_delay] X_delay;  // population-level design matrix
  // covariate vectors for non-linear functions
  vector[N] C_1;
  int prior_only;  // should the likelihood be ignored?
}
transformed data {
}
parameters {
  vector<lower=0>[K_A] b_A;  // population-level effects
  vector<lower=0>[K_k] b_k;  // population-level effects
  vector<lower=0>[K_delay] b_delay;  // population-level effects
  real<lower=0> sigma;  // residual SD
}
transformed parameters {
}
model {
  // initialize linear predictor term
  vector[N] nlp_A = X_A * b_A;
  // initialize linear predictor term
  vector[N] nlp_k = X_k * b_k;
  // initialize linear predictor term
  vector[N] nlp_delay = X_delay * b_delay;
  // initialize non-linear predictor term
  vector[N] mu;
  for (n in 1:N) {
    // compute non-linear predictor values
    mu[n] = nlp_A[n] * exp( - exp( - (nlp_k[n] * (C_1[n] - nlp_delay[n]))));
  }
  // priors including all constants
  target += lognormal_lpdf(b_A | 0.65, 0.5)
    - 1 * lognormal_lccdf(0 | 0.65, 0.5);
  target += lognormal_lpdf(b_k | 0.65, 0.5)
    - 1 * lognormal_lccdf(0 | 0.65, 0.5);
  target += lognormal_lpdf(b_delay | 0.65, 0.5)
    - 1 * lognormal_lccdf(0 | 0.65, 0.5);
  target += student_t_lpdf(sigma | 5, 0, 1.5)
    - 1 * student_t_lccdf(0 | 5, 0, 1.5);
  // likelihood including all constants
  if (!prior_only) {
    target += normal_lpdf(Y | mu, sigma);
  }
}
generated quantities {
}

Results are:

> summary(mod1)
 Family: gaussian 
  Links: mu = identity; sigma = identity 
Formula: cum_cases ~ A * exp(-exp(-(k * (day - delay)))) 
         A ~ 1
         k ~ 1
         delay ~ 1
   Data: case_counts %>% filter(Country.Region == "China") (Number of observations: 54) 
Samples: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
         total post-warmup samples = 4000

Population-Level Effects: 
                Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
A_Intercept         2.19      1.16     0.73     5.16 1.00     5317     3138
k_Intercept         2.19      1.19     0.73     5.14 1.00     4740     3055
delay_Intercept     2.16      1.15     0.74     5.08 1.00     4522     2962

Family Specific Parameters: 
      Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sigma 59736.57   5678.82 49850.16 71878.55 1.00     4358     2940

Samples were drawn using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).

And here are the conditional effects plot and a pairs plot for good measure:

Any thoughts ?

The scale of ‘day’ is likely too big to go though exponential twice without causing numerical instability. Try to model day / 100 or weeks or months, basically scale the unit down a little by a linear transformation. At least this would be my next try.

3 Likes

Ok changed to weeks - little difference:

> summary(mod1)
 Family: gaussian 
  Links: mu = identity; sigma = identity 
Formula: cum_cases ~ A * exp(-exp(-(k * (weeks - delay)))) 
         A ~ 1
         k ~ 1
         delay ~ 1

Population-Level Effects: 
                Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
A_Intercept         2.16      1.13     0.74     4.99 1.00     4647     2898
k_Intercept         2.17      1.10     0.75     4.97 1.00     4708     2901
delay_Intercept     2.18      1.18     0.73     5.07 1.00     4545     3217

Family Specific Parameters: 
      Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sigma 59785.03   5531.32 50140.05 71966.85 1.00     4470     3057

Conditional effects plot looks exactly the same so I won’t post it.

Are the parameters’ priors appropriate? Based on your y-axis, I doubt that a prior of lognomal(0.65, 0.5) is sensible.

1 Like

I could probably say something more intelligent here, but let’s try something basic. Can you experiment with

gompertz <- function(time, A, k, delay){
   lf <- log(A) -exp(-(k * (weeks - delay)))
  ans <- exp(lf)
  return(ans)
}

by trying a few values for the parameters until you get something that at least resembles the observed data? @paul.buerkner’s suggestion of keeping things on a unit scale is a great one. It’ll theoretically make prior specification easier (though not easy as this is a non-linear model) and help sampling to boot.

EDIT: Also, your specification doesn’t seem to match the one here. It might be they’re algebraically equivalent, but I don’t see yet.

@paul.buerkner priors were tuned for the original model with lognormal family but changing them for this gaussian doesn’t seem to affect anything.

@maxbiostat sorry I should have said at the start the model I’m using is equation 1 in this paper (as a starting point): https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0178691#sec005

Just a reminder - in my first post I was happy with the single country plot with the lognormal family model. the problem is only with the hierarchical version which gives crappy conditional plots, and I’ve just noticed there is a Neals funnel in the pairs plot between some parameters.

For this model introducing using the lognormal family (bear with me) and introducing a random effect by country for one parameter:

form_mult <- bf(cum_cases ~ A * exp( -exp( -(k * (day - delay) ) ) ),
             A ~ 1,
             k ~ 1,
             delay ~ 1 + (1 | ID | Country.Region),
             nl = TRUE)

mult_priors <- c(
  prior(lognormal(0.65, 0.5), nlpar = "A", lb=0),
  prior(lognormal(0.65, 0.5), nlpar = "k", lb=0),
  prior(lognormal(0.65, 0.5), nlpar = "delay", lb=0),
  prior(student_t(5, 0, 1), class = "sigma"), 
  prior(student_t(5, 0, 1), class = "sd", group = "Country.Region", nlpar = "delay"))

modmult <- brm(form_mult, data = case_counts,
           prior = mult_priors, seed = 1234,
           family = lognormal(link = "identity", link_sigma = "log"),
           chains = 4, cores=4, sample_prior = "only")
summary(modmult)

I get the following output (EDIT: at time of posting I didnt’ realise this model was set to draw from priors only!):

> summary(modmult)
 Family: lognormal 
  Links: mu = identity; sigma = identity 
Formula: cum_cases ~ A * exp(-exp(-(k * (day - delay)))) 
         A ~ 1
         k ~ 1
         delay ~ 1 + (1 | ID | Country.Region)
   Data: case_counts (Number of observations: 256) 
Samples: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
         total post-warmup samples = 4000

Group-Level Effects: 
~Country.Region (Number of levels: 5) 
                    Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(delay_Intercept)     0.95      0.89     0.03     3.19 1.00     3858     1671

Population-Level Effects: 
                Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
A_Intercept         2.16      1.13     0.73     5.19 1.00     8786     2828
k_Intercept         2.17      1.15     0.71     5.14 1.00     6966     3180
delay_Intercept     2.16      1.14     0.72     5.11 1.00     9355     2750

Family Specific Parameters: 
      Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sigma     0.95      0.95     0.03     3.30 1.00     4668     2093

Samples were drawn using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).

With the following condition and pairs plots

Looks like a funnel between vars sd_Country.Region__delay_Intercept and r_Country.Region__delay[China,Intercept] ?

Are you sure? What happens if you change the log-normal priors to something like


mult_priors <- c(
  prior(lognormal(0, 1), nlpar = "A", lb=0),
  prior(lognormal(0, 1), nlpar = "k", lb=0),
  prior(lognormal(0, 1), nlpar = "delay", lb=0),
  prior(student_t(5, 0, 1), class = "sigma"), 
  prior(student_t(5, 0, 1), class = "sd", group = "Country.Region", nlpar = "delay"))

?

Notice that A is in a very different scale to k_G and T_i (delay). Can you show simulations from the prior? It should be possible with brms, but I’m not familiar with the specifics.

2 Likes

OMG I just noticed that last post WAS only simulations from the prior 🤦‍♂️ Sorry guys I’m doing too many things at once and stressed out. Ok that in that case the priors look too restrictive.

@maxbiostat thank you for suggestion. Here is simulation from prior only - its better than the one above but still too tight. I’ll have another think about the priors. Thanks.

1 Like