Difficulties with logistic population growth model

I’ve been struggling with this model for awhile and have finally gotten around to making some fake data to post that’s similar to the real data I want to analyze.fake_stan_data.txt (308.0 KB)

I’m getting a ton of divergences when I run Stan, I’m guessing because the model is complex and maybe because I need stronger priors. I’ve tried the non-central parametrization of the hierarchical model, tried reparametrizing the Student t distribution as a normal gamma mixture (as described in the Stan manual).

I’m sure one answer is to reparametrize the model in some way, though I haven’t found any good suggestions on how to do that, but one issue with that is in my real case I have specific information I want to bring in with my priors and if I reparametrize the growth curves it could become more difficult to do that. The form I’d like to work with is:

f(x) = \frac{A}{1 + exp\left (\frac{4D(M - x)}{A}\right )}

Where A is the curve’s right asymptote, D is the curve’s the maximum derivative, and M is the location of the sigmoid curve. Things I know about my real data are that 1) all values are between 0 and 1, though the right asymptote is not necessarily 1; 2) D is greater than 0; 3) the experiment will lead to hierarchical data–each individual will yield two growth curves.

Below is the Stan code I wrote to try to make inferences about these parameters. My questions are, is there a better way to parametrize this that would still allow me to use what I know about parameters A, M and D? Is there a way to specify priors to find the breaking point (I’ve tried various way to tightening priors with little effect)?

data{
  int nptid;
  int nobs;
  int ptid_visit[nobs];
  int ptid[nobs];
  int visit[nobs];
  vector[nobs] conc;
  vector[nobs] val;
  vector[nptid] with_cat;
}

parameters{
  real a;
  real m;
  real s;
  vector[nptid] a1_raw;
  real<lower=0> a1_sd;
  vector[nptid] m1_raw;
  real<lower=0> m1_sd;
  vector[nptid] s1_raw;
  real<lower=0> s1_sd;
  vector[2] a2_raw[nptid];
  real<lower=0> a2_sd;
  vector[2] m2_raw[nptid];
  real<lower=0> m2_sd;
  vector[2] s2_raw[nptid];
  real<lower=0> s2_sd;
  real<lower=0> error;
}

transformed parameters{
  vector[nptid] a1 = a1_sd * a1_raw;
  vector[nptid] m1 = m1_sd * m1_raw;
  vector[nptid] s1 = s1_sd * m1_raw;
  vector[2] a2[nptid];
  vector[2] m2[nptid];
  vector[2] s2[nptid];
  vector[2] asym[nptid];
  vector[2] xmid[nptid];
  vector[2] slope[nptid];
  vector[2] scale[nptid];
  
  for(i in 1:nptid){
    a2[i] = a2_sd * a2_raw[i];
    m2[i] = m2_sd * m2_raw[i];
    s2[i] = s2_sd * s2_raw[i];
    for(j in 1:2){
      asym[i,j] = inv_logit(a + a1[i] + a2[i,j]);
      xmid[i,j] = m + m1[i] + m2[i,j];
      slope[i,j] = exp(s + s1[i] + s2[i,j]);
      scale[i,j] = (4 * slope[i,j]) / asym[i,j];
    }
  }
}

model{
  vector[nobs] mu;
  
  a ~ normal(0,2);
  m ~ normal(0,4);
  s ~ normal(0,1);
  
  a1_raw ~ std_normal();
  m1_raw ~ std_normal();
  s1_raw ~ std_normal();
  
  for(i in 1:nptid){
    a2_raw[i] ~ std_normal();
    m2_raw[i] ~ std_normal();
    s2_raw[i] ~ std_normal();
  }
  
  a1_sd ~ student_t(5,0,1);
  m1_sd ~ student_t(5,0,2);
  s1_sd ~ student_t(5,0,0.5);
  
  a2_sd ~ student_t(5,0,1);
  m2_sd ~ student_t(5,0,2);
  s2_sd ~ student_t(5,0,0.5);
  
  error ~ student_t(5,0,1);
  
  for(i in 1:nobs){
    mu[i] = asym[ptid[i],visit[i]] / (1 + exp((xmid[ptid[i],visit[i]] - conc[i]) * scale[ptid[i],visit[i]]));
  }
  
  val ~ normal(mu, error);
}
1 Like

These two threads go into some detail with the problems with similar models.

In general, the problem is that the parameters are not very strongly identified. For instance, within the exponent, every increase in D can be offset by a decrease in A. That means that if you don’t have enough data (in a subgroup) to estimate the plateau, A, D is not identified either. One option could be to only allow for hierarchical parameters on D or A but not for both if you can justify it. Another option is to reparametrize in terms of A and ratio = D/A. If A > 0, then the maximum derivative scaled by the maximum value could make sense.

Me and @stemangiola eventually ended up using a different reparametrization of the sigmoid for a very similar use case (there is a paper in final stages of preparation including that):

f(x) = \frac{y_0 (1 + exp( \beta M))}{1 + exp(\beta (M - x ))}

in this parametrization, y_0 = f(0) and \beta = \frac{4D}{A} is “slope” parameter. Those parameters are in our experience mostly well identified. Note that, we centered and scaled x before fitting the model, so y_0 is the value at the midpoint of data, which tends to be well constrained by the data (if you want to avoid centering, then use “value at midpoint” as a parameter). Another important point is that your prior should constrain M to not be far from the observed range of x, as once the midpoint drifts far from the data, your inferences become insensitive to changes to M. If your model then fits M outside of the observed range of x, you should interpret it as “midpoint not reached by the data” and avoid stating any conclusions about the actual value of M (in this case, exact D is also not determined and exact A can only be determined if M \ll x)

Finally, when x is scaled and centered, you can put very sensible priors on \beta - you just need to let it go large enough to allow for almost instantaneous changes \beta \sim N(0,5) has worked quite nice for me.

Note: please check/re-derive my formulas before using them in a model, I have done silly mistakes in posts here in the past.

Also I would totally recommend to first write a model that fits a single curve (from simulated data), debug it really well and incorporate it into a bigger model once you understand it over and out.

3 Likes

I was thinking of replying but you got first :) Great explanation.

Indeed, we have spent some time to come up with a robust parametrisation of the generalised sigmoid function, that would perform for linear, exponential and sigmoid trends, also in case of slope 0.

Fell free to go with it. I will link the article when possible.

1 Like

Has this appeared anywhere yet?

On the edge of a big publication… I have to complete some more analyses and we’ll get there! ;)

Will be published before end of the year. Let me know if you need some info to implement, but I think the formula has been already given by Martin.

1 Like