GLFP (Weibull Mixture) model

I have simulated breakdown stress data for some components.

I tested whether a Weibull or a lognormal distribution fits the data better. Please see the figures below. The figures show the median and 95\% prediction intervals in black. The green lines are the non-parametric fit (observations).

It looks like the Weibull distribution fits poorly in the lower tail compared to the lognormal distribution. That is, the Weibull fit overestimates the probability of failure at stresses below (approximately) 45.

From a visual check alone, I think the lognormal fit is better.

However, I am interested in the lower tail of the breakdown stress distribution, and I believe the lognormal distribution is too optimistic. For example, a 95\% prediction interval for the 0.0001\% quantile under the lognormal model is [26.0,28.2], compared to [10.4,13.7] under the Weibull model.

I have also read (no references at hand, sorry) that the lognormal distribution can provide estimates that are too optimistic (in this case predicting that the 0.0001\% quantile is much higher than it actually is).

Write the 95\% prediction interval for the 0.0001\% quantile as [\tau_l, \tau_u]. The most serious error in my decision problem would be a component failing at a stress s^* < \tau_l. For this reason, I would like a conservative estimate of the 0.0001\% quantile.

Generally, the Weibull model is preferred over the lognormal model when small quantiles are of interest since the Weibull model provides a more conservative estimate. However, for my data, I think these estimates are too conservative (and wrong) because the model fits poorly in the region I am most interested in.

I then tested whether a generalized limited failure population (GLFP) model would work. This model is a mixture of Weibull distributions. My hope was that such a mixture would fit better in the lower tail and would also provide more conservative estimates than the lognormal fit. The cdf of the GLFP distribution is written as

F(t; \pi,\alpha_1, \beta_1, \alpha_2, \beta_2) = 1-(1-\pi F_1(t;\alpha_1, \beta_1))(1-F_2(t;\alpha_2, \beta_2)),

where F_1(t;\alpha_1, \beta_1) and F_2(t;\alpha_2, \beta_2) are Weibull cdfs, \alpha_1, \alpha_2 are scale parameters, \beta_1, \beta_2 are shape parameters, and \pi is the proportion of defective units in the population.
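For concreteness, here is a sketch of how this cdf (and the corresponding density) could be written in a Stan functions block; note that Stan's weibull functions are parameterized as shape then scale, i.e. (\beta, \alpha) in my notation:

```stan
functions {
  // GLFP log density: f(t) = pi * f1(t) * (1 - F2(t)) + (1 - pi * F1(t)) * f2(t)
  // Stan's weibull_* functions take (shape, scale) = (beta, alpha).
  real glfp_lpdf(real t, real p, real alpha_1, real beta_1, real alpha_2, real beta_2) {
    return log_sum_exp(
      log(p) + weibull_lpdf(t | beta_1, alpha_1) + weibull_lccdf(t | beta_2, alpha_2),
      log1m(p * exp(weibull_lcdf(t | beta_1, alpha_1))) + weibull_lpdf(t | beta_2, alpha_2));
  }
  // GLFP log cdf: F(t) = 1 - (1 - pi * F1(t)) * (1 - F2(t))
  real glfp_lcdf(real t, real p, real alpha_1, real beta_1, real alpha_2, real beta_2) {
    return log1m_exp(log1m(p * exp(weibull_lcdf(t | beta_1, alpha_1)))
                     + weibull_lccdf(t | beta_2, alpha_2));
  }
}
```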

I had trouble estimating the parameters of this mixture distribution. I played around with this distribution and found a decent fit to the data when (\alpha_1,\beta_1,\alpha_2,\beta_2) = (42.5,100, 44.7,14.3).

I tried refitting the GLFP model in Stan with priors centred around the above values that I found worked well (this is probably not how one should construct prior distributions but I think the model parameters were having identifiability issues). I used the following prior distributions:

\alpha_1 \sim N(40, 1) \\
\alpha_2 \sim N(44, 1) \\
\beta_1 \sim N(100, 1) \\
\beta_2 \sim N(14, 1)
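Roughly, the Stan program I am fitting looks like the sketch below (it relies on the glfp_lpdf function sketched above; I have not written out my prior on \pi here, so the beta(1, 1) below is only a placeholder):

```stan
data {
  int<lower=1> N;
  vector<lower=0>[N] t;            // breakdown stresses
}
parameters {
  real<lower=0, upper=1> p;        // pi: proportion of defective units
  real<lower=0> alpha_1;           // scale of the first Weibull
  real<lower=alpha_1> alpha_2;     // scale of the second Weibull (alpha_2 >= alpha_1)
  real<lower=0> beta_1;            // shape of the first Weibull
  real<lower=0> beta_2;            // shape of the second Weibull
}
model {
  // priors as listed above; the prior on p is a placeholder
  alpha_1 ~ normal(40, 1);
  alpha_2 ~ normal(44, 1);
  beta_1 ~ normal(100, 1);
  beta_2 ~ normal(14, 1);
  p ~ beta(1, 1);
  // GLFP likelihood
  for (n in 1:N)
    target += glfp_lpdf(t[n] | p, alpha_1, beta_1, alpha_2, beta_2);
}
```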

I have included the GLFP fit below. It fits much better than the Weibull fit in the lower tail and provides more conservative estimates than the lognormal fit. It looks a little strange to me, since there is a kink around 40, but otherwise it is a decent fit all round. I only care about the lower tail anyway. A 95\% prediction interval for the 0.0001\% quantile under the GLFP model is [13.1, 16.3].

My concerns are that:

(1) \alpha_1 \approx \alpha_2. If I fit the model with \alpha_1 = \alpha_2 and use \alpha \sim N(40,5) as the prior, the model runs fine (no divergent transitions, etc.), but the fit is poor in the lower tail (the important part). Therefore, I need to estimate both parameters, which are similar in size but different (a sketch of this variant is given after this list).
(2) I tried using slightly more diffuse priors for \beta_1 and \beta_2, but this caused the fit to be poor. More specifically, I tried \beta_1 \sim N(100,2) and \beta_2 \sim N(14,2). The model ran with no divergent transitions, etc., but did not fit the data well.
(3) I tried using more diffuse priors on \alpha_1 and \alpha_2, but this caused the model to fit very poorly. More specifically, I tried \alpha_1 \sim N(40,4) and \alpha_2 \sim N(44,4).
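The common-scale variant from point (1) is roughly the following (reusing the glfp_lpdf function and data block above; I am assuming here that the \beta priors and the placeholder prior on \pi stay as before):

```stan
parameters {
  real<lower=0, upper=1> p;   // pi: proportion of defective units
  real<lower=0> alpha;        // shared scale: alpha_1 = alpha_2 = alpha
  real<lower=0> beta_1;
  real<lower=0> beta_2;
}
model {
  alpha ~ normal(40, 5);
  beta_1 ~ normal(100, 1);
  beta_2 ~ normal(14, 1);
  p ~ beta(1, 1);             // placeholder prior on pi
  for (n in 1:N)
    target += glfp_lpdf(t[n] | p, alpha, beta_1, alpha, beta_2);
}
```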

Please see the final three figures below.

The above three points show that the mixture model is very sensitive to the choice of priors. I had to use trial and error to find parameter values that looked reasonable, and these values were then used to construct very strict priors; any deviation from these priors causes the model to fit poorly.

Moreover, I have many simulated datasets. The GLFP model that worked well (with the initial strict priors) often diverges (or fits poorly) for these alternative datasets (since the priors are so strict).

I don’t think I should construct new priors for each dataset: I have too many datasets, and even if I only had a few, I think this would be frowned upon.

I am also worried about the stability of the GLFP model.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

A summary:

I have some simulated data. I am interested in modelling the 0.0001\% quantile of this data. I tried a Weibull model and a lognormal model. The lognormal model fits the data much better in the lower tail, but I am worried that its estimate of the 0.0001\% quantile is too optimistic (I think it should be lower). I have read that the lognormal distribution can be too optimistic. I would prefer to use the Weibull distribution to model such a small quantile (as it is much more conservative); however, the Weibull fits poorly in the lower tail. I then tried a GLFP model (Weibull mixture). This model fits okay if I use very strict priors. I am worried about the stability of this model, since any small change in any of the prior distributions causes the model to fit poorly.

(1) Does anyone have any thoughts about the mixture model I am using and what I can do about its sensitivity?
(2) Is there an alternative to the mixture approach that would be more stable? For example, could I tell Stan that I would like to sacrifice some model accuracy in the upper tail to gain more accuracy in the lower tail?

Note: I have read the Mixture Model Documentation. I already use a constraint to enforce \alpha_2 \geq \alpha_1.
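For reference, a sketch of how the \alpha_2 \geq \alpha_1 constraint can be declared in Stan (the first option is what the model sketch above uses; a positive_ordered vector is an equivalent alternative):

```stan
parameters {
  // Option 1: parameter-dependent lower bound
  real<lower=0> alpha_1;
  real<lower=alpha_1> alpha_2;   // enforces alpha_2 >= alpha_1

  // Option 2 (equivalent): an ordered vector with alpha[1] <= alpha[2]
  // positive_ordered[2] alpha;
}
```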






Sorry this didn’t get answered, @JoLee. And I’m afraid I’m just here to say these are hard questions to answer and I don’t have the experience with extreme tail estimation to provide any guidance at all.

How did you find versions that fit well in the first place? Was the problem the identifiability of the mixture components, or the priors themselves?


Thank you for the reply.

Initially, I tried to fit the model with fairly diffuse priors and insisted that \alpha_2 > \alpha_1. The model had over 1000 divergent transitions, so I just played around with values in R to see which could work (plotting curves over the nonparametric (green) estimates).

The (single) Weibull fit gave estimates for \alpha and \beta; I think I just set \alpha_1 = \alpha_2 = \alpha and \beta_1 = \beta_2 = \beta, then varied the values a little until the curve fit the lower tail well.

I then set strict priors around these values and the model ran with no divergent transitions. This resulted in the first GLFP plot above. The fit looked a little strange.

After this I wanted to see if the model would run with less strict priors. These fits were poor.

Are there any useful diagnostic plots that would highlight whether it is an identifiability problem?

When using failure-time models, and especially when the goal is to estimate small quantiles, it is always a good idea to plot on appropriate probability axes (e.g., such that a lognormal or Weibull distribution will plot as a straight line). This is especially important because it provides a clearer visualization of the extrapolation you are doing. Check Chapter 6 of our SMRD2 book.

There can certainly be estimability problems in estimating the proportion of defective units in the population (pi). There is a nice example in my 1987 LFP paper in Technometrics. It may be harder to visualize the identifiability problem with the GLFP. Profile likelihood plots are my favorite tool for studying identifiability in models with five or fewer parameters. Then you might need a tight prior only on pi, and you could experiment with different priors to see how changes affect the inferences of interest.

Bill Meeker
