Using Weibull PDFs to Model the Bathtub Curve for Failure Rate Prediction

Hello Stanians, I’m still a novice at modeling, so please excuse my lack of structured definitions.

I’m trying to fit a Poisson model to predict failures in shipboard engines. I’ve come across some interesting ways to represent the failure rate, notably the [Bathtub Curve]. So I implemented this in my Poisson model by modeling \lambda as the sum of two Weibull PDFs, one representing early failures and the other wear-out failures. Additionally, to incorporate known features, such as the relative size of the engine system and the unique ship types, the wear-out term is multiplied by an additional coefficient, which consists of a linear combination of features and parameters.

The full model would be as follows:

y \sim \textrm{Poisson}(\lambda_{ship, age}) \\
\log \lambda_{ship, age} = \textrm{failure\_form}(\alpha_{ship}, \beta_{ship}, age) + wear_{ship} \times \textrm{failure\_form}(\gamma_{engine}, \delta_{engine}, age) \\
wear_{ship} = complexity_{ship} \times engine\_count_{ship} \times \eta_{engine} - \theta_{ship} \, \ln relative\_displacement_{ship} \\
\textrm{failure\_form}(shape, scale, age) = \min\!\left(\frac{shape}{scale}\left(\frac{age}{scale}\right)^{shape - 1} e^{-(age/scale)^{shape}},\ 1.5\right) \\
\alpha \sim \textrm{normal}(0.5, 0.3), \quad \alpha \in [0, \infty) \\
\beta \sim \textrm{normal}(1.2, 0.05), \quad \beta \in [0, \infty) \\
\gamma \sim \textrm{normal}(2.5, 0.5), \quad \gamma \in [0, \infty) \\
\delta \sim \textrm{normal}(1.2, 0.15), \quad \delta \in [0, \infty) \\
\eta \sim \textrm{normal}(0.5, 0.7) \\
\theta \sim \textrm{normal}(0.2, 0.3)

complexity, engine_type, engine_count, and relative_displacement are features unique to each data point. Age is normalized to [0, 1] and used as the x value for the Weibull PDF (failure_form).
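To check that the two components behave as intended, here is a quick numerical sanity check of failure_form (a Python sketch mirroring the Stan function, with the prior means plugged in): the early-failure component should decay with age while the wear-out component grows.

```python
import math

def failure_form(shape, scale, age):
    """Weibull PDF capped at 1.5, as in the Stan model."""
    pdf = (shape / scale) * (age / scale) ** (shape - 1) \
          * math.exp(-((age / scale) ** shape))
    return min(pdf, 1.5)

# early failures: shape < 1 gives a decaying curve (it hits the 1.5 cap near age 0)
early_young, early_old = failure_form(0.5, 1.2, 0.01), failure_form(0.5, 1.2, 0.5)
# wear-out: shape > 1 gives an increasing curve on age in [0, 1]
wear_young, wear_old = failure_form(2.5, 1.2, 0.1), failure_form(2.5, 1.2, 0.9)
print(early_young, early_old, wear_young, wear_old)
```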

I gave presumably quite strong priors for the Weibull PDF parameters, since I know roughly that the early and wear-out curves follow an exponentially decaying and increasing shape, respectively.

The parameters in wear (eta, theta) are set so that the output is on a reasonable scale relative to the observed data, verified through prior checks.
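The prior checks can be sketched like this (an illustrative Python version, not my exact script; the feature values for the single observation are made up):

```python
import math
import random

random.seed(42)

def trunc_normal_pos(mu, sigma):
    """Rejection-sample normal(mu, sigma) truncated to (0, inf)."""
    while True:
        x = random.gauss(mu, sigma)
        if x > 0:
            return x

def failure_form(shape, scale, age):
    return min((shape / scale) * (age / scale) ** (shape - 1)
               * math.exp(-((age / scale) ** shape)), 1.5)

rates = []
for _ in range(1000):
    alpha = trunc_normal_pos(0.5, 0.3)   # early-failure shape
    beta  = trunc_normal_pos(1.2, 0.05)  # early-failure scale
    gamma = trunc_normal_pos(2.5, 0.5)   # wear-out shape
    delta = trunc_normal_pos(1.2, 0.15)  # wear-out scale
    eta   = random.gauss(0.5, 0.7)       # wear parameter 1
    theta = random.gauss(0.2, 0.3)       # wear parameter 2
    # hypothetical feature values for one observation
    complexity, engine_count, rel_disp, age = 1.0, 2, 0.8, 0.7
    wear = complexity * engine_count * eta - theta * math.log(rel_disp)
    log_lam = failure_form(alpha, beta, age) + wear * failure_form(gamma, delta, age)
    rates.append(math.exp(log_lam))

rates.sort()
print(rates[500], rates[int(0.99 * len(rates))])  # median and upper tail of the implied rate
```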

The model implemented in Stan is the following:

functions {
    real failure_form(real shape, real scale, real age) {
        return fmin((shape / scale) * pow(age / scale, shape - 1) * exp(-pow(age / scale, shape)), 1.5);
    }
}

data {
    int N; // number of observed data values
    int engine_types; // number of unique engines in engine_type
    int y[N]; // observed data
    real complexity[N]; // feature1
    real<lower=0,upper=1> age[N]; // age at failure
    int engine_type[N]; // unique engine type
    real relative_displacement[N]; // feature2
    int engine_count[N]; // feature3
    int ship_number[N]; // ship type
    int ship_number_max; // number of unique ship types

    int N_pred;
    real<lower=0, upper=1> age_pred[N_pred];
    int engine_type_pred[N_pred];
    real complexity_pred[N_pred];
    real relative_displacement_pred[N_pred];
    int engine_count_pred[N_pred];
    int ship_number_pred[N_pred];
}

parameters {
    real<lower=0> alpha[ship_number_max];
    real<lower=0> beta[ship_number_max];
    real<lower=0> gamma[engine_types];
    real<lower=0> delta[engine_types];

    real eta[engine_types];
    real theta[ship_number_max];
}

transformed parameters {
    real wear[N];
    real lambda[N];
    
    for(i in 1:N){
        wear[i] = complexity[i] * engine_count[i] * eta[engine_type[i]] - theta[ship_number[i]] * log(relative_displacement[i]);

        lambda[i] = failure_form(alpha[ship_number[i]], beta[ship_number[i]], age[i])
                    + wear[i] * failure_form(gamma[engine_type[i]], delta[engine_type[i]], age[i]);

        lambda[i] = exp(lambda[i]);
    }
}

model {
    gamma ~ normal(2.5, 0.5);   // weibull shape (wear-out)
    delta ~ normal(1.2, 0.15);  // weibull scale (wear-out)
    eta ~ normal(0.5, 0.7);     // wear parameter 1

    alpha ~ normal(0.5, 0.3);   // weibull shape (early)
    beta ~ normal(1.2, 0.05);   // weibull scale (early)
    theta ~ normal(0.2, 0.3);   // wear parameter 2

    y ~ poisson(lambda);
}

generated quantities{
    int y_post_pred[N_pred];
    for(i in 1:N_pred){

        real wear_pred = complexity_pred[i] * engine_count_pred[i] * eta[engine_type_pred[i]] - theta[ship_number_pred[i]] * log(relative_displacement_pred[i]);

        real lambda_pred = failure_form(alpha[ship_number_pred[i]], beta[ship_number_pred[i]], age_pred[i]) + wear_pred * failure_form(gamma[engine_type_pred[i]], delta[engine_type_pred[i]], age_pred[i]);

        y_post_pred[i] = poisson_rng(exp(lambda_pred));
    }
}

I’m following Michael Betancourt’s Bayesian Workflow process. All went well up to the point where samples from the prior were used to fit the model. However, when the observed data are fitted, all sorts of errors occur. The most common problem is divergences, with rates varying from 50% to 100% depending on how I set my priors. The Rhat values are also terrible, with practically no parameter anywhere near 1.
Additionally, PyStan sometimes refuses to sample, reporting “Initialization failed” with messages along the lines of “Rejecting initial value: Gradient evaluated at the initial value is not finite. Stan can’t start sampling from this initial value.”
Trace plots of some parameters show that the chains do not mix at all and, from my interpretation, seem to get “stuck” around certain values:
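To build intuition for what those Rhat values mean, here is a hand-rolled split-R-hat (the diagnostic behind the numbers) run on synthetic chains rather than the actual fit: well-mixed chains give values near 1, while chains stuck at different values blow the statistic up.

```python
import math
import random
import statistics

def split_rhat(chains):
    """Classic split-R-hat: split each chain in half, then compare
    between-half variance to within-half variance."""
    halves = []
    for c in chains:
        n = len(c) // 2
        halves += [c[:n], c[n:2 * n]]
    n = len(halves[0])
    means = [statistics.mean(h) for h in halves]
    variances = [statistics.variance(h) for h in halves]
    w = statistics.mean(variances)       # within-chain variance
    b = n * statistics.variance(means)   # between-chain variance
    var_hat = (n - 1) / n * w + b / n
    return math.sqrt(var_hat / w)

random.seed(0)
# four chains sampling the same distribution vs four chains stuck at different values
mixed = [[random.gauss(0, 1) for _ in range(200)] for _ in range(4)]
stuck = [[random.gauss(mu, 1) for _ in range(200)] for mu in (0, 5, -5, 10)]
print(split_rhat(mixed), split_rhat(stuck))  # roughly 1 vs well above 1
```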

So my questions are,

  1. Is my approach of modeling \lambda as the sum of two Weibull PDFs a valid method, both computationally and inferentially? I’m assuming this is the culprit, since I found no examples of using user-defined functions in such a way. Additionally, since the x range is bounded to [0, 1] but y is not, I used a very crude method of forcing an upper limit with fmin(), about which I am, frankly, very skeptical.
  2. What are some better ways to identify what’s causing the problem? I can see from pair plots of the shape and scale parameters that something is severely wrong with the sampling (see plot below), but I can’t wrap my head around finding a fix, or even making a reasonably correct diagnosis.
    [Screenshot: pair plot of shape and scale parameters]

(https://en.wikipedia.org/wiki/Bathtub_curve)


Hi and welcome to the community!

First, I’d like to apologize that it took so long to reply to your post. Sometimes things slip through the cracks. Next, I have to apologize that I won’t be able to help you directly, but I have a feeling @mike-lawrence might be able to do that :)


Would my approach of modeling \lambda with the sum of two Weibull PDF be a valid method both computationally and inferentially?

I don’t think so. The bathtub curve concept seems to reflect a three-component mixture model (though, as you have done, it can be simplified to a two-component mixture if you don’t need the uniform component), so that would be the best way to model the data, if I understand the scenario correctly.
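As a toy illustration of why a mixture gives the bathtub shape (made-up parameters, not your data): the hazard f(t)/S(t) of a 50/50 mixture of a Weibull with shape < 1 (early failures) and one with shape > 1 (wear-out) first falls and then rises.

```python
import math

def weibull_pdf(k, lam, t):
    return (k / lam) * (t / lam) ** (k - 1) * math.exp(-((t / lam) ** k))

def weibull_surv(k, lam, t):
    return math.exp(-((t / lam) ** k))

def mixture_hazard(t, w=0.5, k1=0.5, k2=3.0, lam=1.0):
    """Hazard f(t)/S(t) of a two-component Weibull mixture."""
    f = w * weibull_pdf(k1, lam, t) + (1 - w) * weibull_pdf(k2, lam, t)
    s = w * weibull_surv(k1, lam, t) + (1 - w) * weibull_surv(k2, lam, t)
    return f / s

# falls from early ages, bottoms out, then rises as wear-out takes over
print(mixture_hazard(0.05), mixture_hazard(0.5), mixture_hazard(1.0))
```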
