Over the past six years, my coauthors and I have used Stan/rstan for dozens of applications and many of these applications have involved hierarchical Bayesian (HB) models. We now appreciate the many advantages of the HB approach relative to our previous use (starting in the 1990s) of lme and nlme. The development of Stan models has generally gone smoothly. For the past few weeks, however, we have been struggling with a particular application. We have been trying to reproduce the HB results that Val Johnson and others got in their 1999 discussion of our paper on the random fatigue limit (RFL) model. Our model/data runs fine (no warnings) but gives answers that do not agree with the HB analysis in the Johnson_etal1999Technometrics.pdf discussion and maximum likelihood results in the PascualMeeker1999Technometrics.pdf paper (which do agree reasonably well).
One thing that is different in the RFL model (compared with our other HB applications) is that the RFL has a lognormal distribution (our other applications always used a normal (sometimes multivatiate) distribution for the random parameter(s)). We suspect that the distribution of the random RFL parameter (gamma) is not being specified correctly because of the nonlinear (antilog) transformation.
We use
gamma = exp(mu_log_gamma + Z_log_gamma[n]sigma_log_gamma);
where Z_log_gamma[n] is (for each observation in the data set) NOR(0,1) and this followed by
mu = beta0 + beta1log(stress[n] - gamma);
which is the location parameter of the log-location-scale failure-time distribution. That is the RFL model in a nutshell. The Stan model carefully does the right thing (we believe) when gamma>stress[n].
The estimates of gamma and the distribution of gamma should be centered around 220 [somewhat below the smallest level of stress with failures (270 MPa)]. But, using the median of the marginal posterior) exp(mu_log_gamma) = exp(1.702) = 5.48. We tried specifying tight prior distributions around the correct results (from Johnson et al.) but the joint posterior just goes to the same wrong answers (it just takes longer to get there).
We experimented with lots of alternatives like specifying the lognormal distribution directly and then we thought there might be a subtle need for a Jacobian correction, but nothing we have tried worked (at least in the way we tried to implement).
I put example codes in
To reproduce our results, all one has to do is clone the repo, open an instance of R in the folder, and execute all of the commands in the file RunRFL.R. Copies of the RFL paper and discussion are also in the repo.
Any help or other advice would be welcome.
Bill Meeker