Sampler returns the same value for each parameter in Weibull Model

We use Stan to do non-linear regression on summary datasets. Specifically, we're having an issue fitting our Weibull model, y = a + (1 - a) \times (1 - e^{-c \times x^{b}}).
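
For reference, here is that mean function as a small Python sketch (the function name is ours):

```python
import numpy as np

def weibull_mean(x, a, b, c):
    """Mean response: a + (1 - a) * (1 - exp(-c * x**b))."""
    x = np.asarray(x, dtype=float)
    return a + (1.0 - a) * (1.0 - np.exp(-c * x ** b))
```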


Our Stan model is below. The priors for the parameters a, b, and c are all uniform distributions, with bounds [0, 1], [0, 15], and [0, 50] respectively; in this case pwr_lbound is 1.

```stan
data {
  int<lower=0> len; 
  array[len] real<lower=0> y;
  array[len] real<lower=0> n;
  array[len] real<lower=0> x;
  real pwr_lbound;
  array[2] real p_a; // prior for a
  array[2] real p_b; // prior for b
  array[2] real p_c; // prior for c
}
parameters {
  real<lower=0, upper=1> a;
  real<lower=pwr_lbound> b;
  real<lower=0> c;
}
model {
  a ~ uniform(p_a[1], p_a[2]);
  b ~ uniform(p_b[1], p_b[2]);
  c ~ uniform(p_c[1], p_c[2]);
  for (i in 1 : len) {
    real theta;
    theta = a + (1 - a) * (1 - exp(-c * x[i] ^ b));
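    // binomial log-pmf written out with lgamma (the lgamma terms
    // are log(choose(n, y))) so that y and n can be real-valued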
    target += lgamma(n[i] + 1) - lgamma(y[i] + 1) - lgamma(n[i] - y[i] + 1)
              + y[i] * log(theta) + (n[i] - y[i]) * log(1 - theta);
  }
}
```
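
For integer y that target increment is exactly the binomial log-pmf; a quick sanity check in Python (function name is ours):

```python
import numpy as np
from scipy.special import gammaln
from scipy.stats import binom

# Same expression as the model's target += statement.
def loglik(y, n, theta):
    return (gammaln(n + 1) - gammaln(y + 1) - gammaln(n - y + 1)
            + y * np.log(theta) + (n - y) * np.log(1 - theta))

# Matches scipy's binomial log-pmf for integer y.
assert np.isclose(loglik(3, 8, 0.4), binom.logpmf(3, 8, 0.4))
```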

This works well for 99% of the datasets we've used so far, but recently we've run into an issue with a few of them. For instance, the following dataset triggers it:

```
x     n  y
0     8  0
0.03  8  0
0.1   8  1
0.22  8  2
0.56  8  3
1     8  8
```
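
A minimal cmdstanpy script that fits this dataset, roughly how we run it (the .stan file name here is illustrative):

```python
from cmdstanpy import CmdStanModel

data = {
    "len": 6,
    "x": [0, 0.03, 0.1, 0.22, 0.56, 1],
    "n": [8, 8, 8, 8, 8, 8],
    "y": [0, 0, 1, 2, 3, 8],
    "pwr_lbound": 1,
    "p_a": [0, 1],   # uniform prior bounds for a
    "p_b": [0, 15],  # uniform prior bounds for b
    "p_c": [0, 50],  # uniform prior bounds for c
}

model = CmdStanModel(stan_file="weibull.stan")  # illustrative path
fit = model.sample(data=data, save_warmup=True)
print(fit.summary())
```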

By the time the sampler finishes warmup it seems to be stuck in some sort of loop, so all the sampled iterations have exactly the same value for each parameter. I've attached the CSV output (with save_warmup on) here as well.
weibull-20231023120901.csv (1.9 MB)


Among other problems, this causes the MCSE, N_Eff, and R_hat values in the summary to be NaN:

```
           Mean  MCSE        StdDev        2.5%         5%        50%        95%      97.5%  N_Eff  N_Eff/s  R_hat
lp__  -3.275210   NaN  1.225730e-13   -3.275210  -3.275210  -3.275210  -3.275210  -3.275210    NaN      NaN    NaN
a      0.182526   NaN  1.654290e-14    0.182526   0.182526   0.182526   0.182526   0.182526    NaN      NaN    NaN
b     12.949200   NaN  8.526800e-13   12.949200  12.949200  12.949200  12.949200  12.949200    NaN      NaN    NaN
c     36.967500   NaN  2.366190e-12   36.967500  36.967500  36.967500  36.967500  36.967500    NaN      NaN    NaN
```

Does anyone have advice on how to avoid this sort of issue? We're using Stan 2.33.0 and cmdstanpy 1.2.0 inside a Linux Docker container, if that matters for any reason.

For this dataset and your likelihood function, the likelihood is maximized at the returned c value for all values of a and b. Given that maximizing c, the likelihood is maximized at the returned b value for all values of a. And if you graph the likelihood for each x/y pair as a function of a, with b and c fixed at those maximizing values, the average of the a values at the peaks is the returned a value. The reverse is also true: given the returned c value, the likelihood is maximized at the returned a value for all values of b, and if you graph the likelihood for each x/y pair as a function of b, with a and c fixed at those maximizing values, the average of the b values at the peaks is the returned b value.
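
Here is a rough numeric version of that check in Python, holding two parameters at the stuck values from your summary and scanning the third (the grid bounds are mine, chosen to match your priors):

```python
import numpy as np
from scipy.special import gammaln

x = np.array([0, 0.03, 0.1, 0.22, 0.56, 1.0])
n = np.full(6, 8.0)
y = np.array([0, 0, 1, 2, 3, 8.0])

def loglik(a, b, c):
    theta = a + (1 - a) * (1 - np.exp(-c * x ** b))
    theta = np.clip(theta, 1e-12, 1 - 1e-12)  # guard log(0) at x = 0 and x = 1
    return np.sum(gammaln(n + 1) - gammaln(y + 1) - gammaln(n - y + 1)
                  + y * np.log(theta) + (n - y) * np.log(1 - theta))

a0, b0, c0 = 0.182526, 12.9492, 36.9675  # the stuck values from the summary

# Hold two parameters at the stuck values and scan the third:
cs = np.linspace(0.1, 50, 2000)
print("argmax over c:", cs[np.argmax([loglik(a0, b0, c) for c in cs])])
bs = np.linspace(1, 15, 2000)
print("argmax over b:", bs[np.argmax([loglik(a0, b, c0) for b in bs])])
```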

I suspect that these parameter values correspond to the inside of a very steep-walled valley in the negative log-likelihood (at the true global minimum?) for the given parameter space. If this were MLE, I suspect this is where the optimization would stop. But the fact that your results have essentially no noise at all suggests the prior is contributing next to nothing to the posterior; you essentially have delta distributions at the sampled parameter values. What bounds are being read into the program for those uniform distributions? Are they all 0 and 1? Because then each prior density would be 1, its log would be 0, and the priors would contribute nothing to the posterior.

You can prove to yourself what I said about the a, b, and c values by graphing your likelihood for each x/y pair on desmos.com: substitute each parameter in turn for the plot variable x, and use sliders for the remaining two parameters to see how the maxima of the likelihood change.
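
If you'd rather stay in Python than desmos, here is a self-contained matplotlib sketch of the same idea, plotting the per-observation log-likelihoods as functions of a with b and c held at the stuck values:

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.special import gammaln

x = np.array([0, 0.03, 0.1, 0.22, 0.56, 1.0])
n = np.full(6, 8.0)
y = np.array([0, 0, 1, 2, 3, 8.0])
b0, c0 = 12.9492, 36.9675  # the stuck values from the summary

a = np.linspace(0.001, 0.999, 500)
for xi, ni, yi in zip(x, n, y):
    theta = np.clip(a + (1 - a) * (1 - np.exp(-c0 * xi ** b0)), 1e-12, 1 - 1e-12)
    ll = (gammaln(ni + 1) - gammaln(yi + 1) - gammaln(ni - yi + 1)
          + yi * np.log(theta) + (ni - yi) * np.log(1 - theta))
    plt.plot(a, ll, label=f"x={xi}")
plt.xlabel("a")
plt.ylabel("per-observation log-likelihood")
plt.legend()
plt.show()
```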