Hello Stanians, I’m still a novice at modeling so please mind my lack of structured definitions.

I’m trying to fit a Poisson model to predict failures in shipboard engines. I’ve come across some interesting methods to represent failure rate, notably the [Bathtub Curve]. So I implemented the functionality within my Poisson model by modeling \lambda as the sum of two Weibull PDFs, each representing early and wear-out failures. Additionally, in order to incorporate known features, such as the relative size proportion of the engine system and unique shiptypes, an additional coefficient is multiplied to wear-out, which consists of a linear combination of features and parameters.

The full model would be as follows:

y \sim \textrm{poisson}(\lambda_{ship,age}) \\ \textrm{log} \ \lambda_{ship, age} = \textrm{failure_form}(\alpha_{ship}, \beta_{ship}, age) + wear_{ship, age} \times \textrm{failure_form}(\gamma_{engine}, \delta_{engine}, age) \\ wear_{ship} = complexity_{ship} \times engine\_count_{ship} \times \eta_{engine} - \theta_{ship} \ \textrm{ln} \ relative\_displacement \\ \textrm{failure_form}(shape, scale, age) = \textrm{min}(\frac{shape}{scale}(\frac{age}{scale})^{shape - 1} e^{-(\frac{age}{scale})^{shape}}, 1.5) \\ \alpha \sim \textrm{normal}(0.5, 0.3), \, \alpha \in [0, \infty) \\ \beta \sim \textrm{normal}(1.2, 0.05), \, \beta \in [0, \infty) \\ \gamma \sim \textrm{normal}(2.5, 0.5), \, \gamma \in [0, \infty) \\ \delta \sim \textrm{normal}(1.2, 0.15), \, \delta \in [0, \infty) \\ \eta \sim \textrm{normal}(0.5, 0.7) \\ \theta \sim \textrm{normal}(0.2, 0.3)

*complexity*, *engine_type*, *relative_displacement* are features unique to each data. Age is normalized to [0, 1], which is used as the x value for Weibull PDF(failure_form).

I gave presumably quite strong priors for the Weibull PDF parameters, since I know that vaguely early and wear curves respectively follow an exponentially decaying/increasing shape.

Parameters in *wear*(eta, theta) are set to have the output in a reasonable scale respective to the observed data, through prior checks.

The model implemented in stan would be the following:

```
functions{
real failure_form(real shape, real scale, real age){
return fmin((shape/scale) * pow(age/scale, shape-1) * exp(-pow(age/scale, shape)), 1.5);
}
}
data {
int N; // number of observed data values
int engine_types; // number of unique engines in engine_type
int y[N]; // observed data
real complexity[N]; // feature1
real<lower=0,upper=1> age[N]; // age at failure
int engine_type[N]; // unique engine type
real relative_displacement[N]; // feature2
int engine_count[N]; // feature3
int ship_number[N]; // ship type
int ship_number_max; // number of unique ship types
int N_pred;
real<lower=0, upper=1> age_pred[N_pred];
int engine_type_pred[N_pred];
real complexity_pred[N_pred];
real relative_displacement_pred[N_pred];
int engine_count_pred[N_pred];
int ship_number_pred[N_pred];
}
parameters {
real<lower=0> alpha[ship_number_max];
real<lower=0> beta[ship_number_max];
real<lower=0> gamma[engine_types];
real<lower=0> delta[engine_types];
real eta[engine_types];
real theta[ship_number_max];
}
transformed parameters {
real wear[N];
real lambda[N];
for(i in 1:N){
wear[i] = complexity[i] * engine_count[i] * eta[engine_type[i]] - theta[ship_number[i]] * log(relative_displacement[i]);
lambda[i] = failure_form(alpha[ship_number[i]], beta[ship_number[i]], age[i]) + wear[i] * failure_form(gamma[engine_type[i]], delta[engine_type[i]], age[i])
lambda[i] = exp(lambda[i]);
}
}
model {
for(i in 1:engine_types){
gamma[i] ~ normal(2.5, 0.5); //weibull shape
delta[i] ~ normal(1.2, 0.15); // weibull scale
eta[i] ~ normal(0.5, 0.7); //wear parameter 1
}
for(i in 1:ship_number_max){
alpha[i] ~ normal(0.5, 0.3); //weibull shape
beta[i] ~ normal(1.2, 0.05); //weibull scale
theta[i] ~ normal(0.2, 0.3); //wear parameter 2
}
for(i in 1:N){
y[i] ~ poisson(lambda[i]);
}
}
generated quantities{
int y_post_pred[N_pred];
for(i in 1:N_pred){
real wear_pred = complexity_pred[i] * engine_count_pred[i] * eta[engine_type_pred[i]] - theta[ship_number_pred[i]] * log(relative_displacement_pred[i]);
real lambda_pred = failure_form(alpha[ship_number_pred[i]], beta[ship_number_pred[i]], age_pred[i]) + wear_pred * failure_form(gamma[engine_type_pred[i]], delta[engine_type_pred[i]], age_pred[i]);
y_post_pred[i] = poisson_rng(exp(lambda_pred));
}
}
```

I’m following Michael Betancourt’s Bayesian Workflow process. All went well up to samples from prior were used to fit the model. However, when the observed data is fitted, all sorts of errors would occur. The most common problem would be divergences, which varied from 50% to 100% divergence rate, depending on how I set my priors. Observing Rhat values would give obscene results, with practically zero parameters being remotely near 1.

Additionally, PyStan sometimes will refuse to sample, while spewing out “Initialization Failed”, with messages along the lines of, “Rejecting initial value: Gradient evaluated at the initial value is not finite. Stan can’t start sampling from this initial value.”.

Trace plots of some parameters, show that chains do not mix at all, and from my interpretation, seem like they get “stuck” around certain values:

So my questions are,

- Would my approach of modeling \lambda with the sum of two Weibull PDF be a valid method both computationally and inferentially? I’m assuming this would be the culprit since I found no examples of using user defined functions in such a way. Additionally, since the X range is bounded to [0, 1], but Y is not, I just used a very crude method of forcing an upper limit with fmin(), which is, frankly, very very skeptical.
- What are some better ways to identify what’s causing the problem? I can see from pair plots of shape, scale parameters that clearly something’s severely wrong with sampling(see plot below), but can’t wrap my head around on figuring out a fix for it, or perhaps making a somewhat correct diagnostic at all.