Bayesian hierarchical models for manufacturing processes (aeronautics)

Hello everyone,

I am new to the Bayesian “world” and I have a few fundamental questions regarding hierarchical models. But before going any further, I thought it would be useful to introduce myself. I am a materials engineer working for an aeronautical company. One of my responsibilities is to ensure that the raw materials we receive from our suppliers comply with our requirements. On a monthly basis, I receive data from mechanical tests performed on the products our suppliers manufacture. In my department, I am the lucky guy who performs the statistical analysis, and I admit that before I bumped into Bayesian methods, I was mainly using frequentist methods: hypothesis tests such as the two-sample t-test, parameter estimation, and the calculation of tolerance intervals (99th percentile with 95% confidence). In my field, Bayesian methods are nonexistent, and I would like to start introducing them to answer statistical questions in a more rigorous and elegant way.

I was drawn to Bayesian hierarchical models because I could use prior knowledge from legacy data to answer questions when there is not much new data available. In fact, the mechanical testing our suppliers perform is destructive and costly!

So here are the questions:

1) Is there a limit on the number of levels in a hierarchical model in Stan?

Typically, the parts we receive from our suppliers can be grouped in batches. But if we consider the whole process, the batches can themselves be grouped into larger batches. Additionally, a part can be manufactured by different suppliers - so it would be interesting to take this into account.
Is there any consensus on how many levels a hierarchical model can have?

2) Can I use hierarchical models to monitor deviations in the products I receive from my supplier?
Sometimes there can be shifts (the average moves significantly) or trends in the mechanical properties of the products we receive from our suppliers.

However, if I assume exchangeability at every level of my hierarchical model, will I be able to spot any shift or trend? If not, how do I capture it?

3) Can I use Stan to model the 99th quantile?
The question says it all - but in aeronautics we are very interested in estimating the 99th quantile of a mechanical property to set our safety margins. After I have built my hierarchical model, and assuming a Gaussian distribution at every level, can I generate the distribution of the 99th quantile using Stan?

Many thanks !
J-C

Nope! At least, I’ve never seen anything explicitly showing that deeper hierarchies fail in any way.
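
To make the "more levels" idea concrete, here is a minimal simulation sketch of a four-level nesting (supplier, lot, batch, part) with a Gaussian at every level, as in the question. It is plain Python rather than Stan, and the function name, level names, and all numeric values are made up for illustration. Each extra level is just one more draw centred on its parent's mean, which is why nothing structural stops you from adding depth.

```python
import random

def simulate_nested(n_suppliers=2, n_lots=3, n_batches=4, n_parts=5,
                    grand_mean=1000.0, sd_supplier=20.0, sd_lot=10.0,
                    sd_batch=5.0, sd_part=2.0, seed=1):
    """Simulate parts nested in batches, in lots, in suppliers.

    All defaults are hypothetical illustration values, not real data.
    """
    rng = random.Random(seed)
    rows = []
    for s in range(n_suppliers):
        mu_supplier = rng.gauss(grand_mean, sd_supplier)   # level 1
        for l in range(n_lots):
            mu_lot = rng.gauss(mu_supplier, sd_lot)        # level 2
            for b in range(n_batches):
                mu_batch = rng.gauss(mu_lot, sd_batch)     # level 3
                for _ in range(n_parts):
                    y = rng.gauss(mu_batch, sd_part)       # level 4: one test result
                    rows.append((s, l, b, y))
    return rows

rows = simulate_nested()
print(len(rows), "simulated test results")
```

In practice the limit tends to be data rather than software: a level with only two or three groups gives very little information about that level's standard deviation, so its prior matters a lot.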

For sure! You’d be doing this in the generated quantities block, using the quantile function (inverse CDF) of whatever distribution you’re modelling as generating your data. If you post an example model, I can show you how to add this.
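
As a rough sketch of what that computation amounts to in the Gaussian case: for each posterior draw of the mean and standard deviation, the 99th percentile is mu + z * sigma, where z is the standard normal 0.99 quantile. The Python below applies that draw by draw. The "posterior draws" here are fake numbers generated only so the example runs; with a real fit you would use the draws Stan returns, or compute the same expression inside generated quantities (Stan's `inv_Phi` gives z).

```python
import random
from statistics import NormalDist

def q_draws(mu_draws, sigma_draws, p=0.99):
    """One quantile value per posterior draw, assuming Gaussian data."""
    z = NormalDist().inv_cdf(p)  # standard normal quantile, about 2.326 for p = 0.99
    return [m + z * s for m, s in zip(mu_draws, sigma_draws)]

# Fake "posterior draws" (hypothetical values, for illustration only).
rng = random.Random(0)
mu_draws = [rng.gauss(1000.0, 3.0) for _ in range(4000)]
sigma_draws = [abs(rng.gauss(12.0, 1.0)) for _ in range(4000)]

q99 = sorted(q_draws(mu_draws, sigma_draws))
lo, med, hi = q99[int(0.025 * len(q99))], q99[len(q99) // 2], q99[int(0.975 * len(q99))]
print(f"99th percentile: median {med:.1f}, 95% interval ({lo:.1f}, {hi:.1f})")
```

The result is a whole distribution for the quantile, so you can report an interval or a one-sided bound on it, which is close in spirit to a tolerance interval. In a hierarchical model you also have to decide which mu and sigma you mean: an existing batch, or a new batch drawn from the population.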

This is an interesting one. Let’s assume you’re dealing with a single supplier, with multiple batches and multiple samples per batch. I’ll also assume that the thing you’re interested in is the magnitude of the noise term, and that you’re modelling it as a fixed value within each batch but allowing it to vary across batches. A first approach would be to apply partial pooling to the noise term. For noise terms this is usually done on the log scale: you fit a standard hierarchical model with a mean log-noise and a variability term reflecting batch-to-batch variability in log-noise, then exponentiate a given batch’s log-noise to get its noise. With this approach, you still get a posterior on the noise for a specific batch, and you can watch how this posterior evolves as you add new batches. The partial pooling will pull any improved batches back toward the previous mean log-noise, so this would be a somewhat conservative approach to detecting improvements in batches. There might be other approaches too; I’ll have to think about it a bit.
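
To illustrate the "pull back toward the mean" effect, here is a back-of-the-envelope sketch in Python, not a Stan model. It treats the population mean log-noise and the between-batch spread as known (a real hierarchical fit would estimate them) and uses the large-sample approximation Var(log s) ≈ 1 / (2(n - 1)) for a sample standard deviation s from n Gaussian observations. The function name and all numbers are hypothetical.

```python
import math

def shrunk_batch_sd(sample_sd, n, mu_log_sd, tau):
    """Precision-weighted (normal-normal) estimate of one batch's noise SD.

    sample_sd : observed SD within the batch
    n         : number of samples in the batch
    mu_log_sd : population mean of log-noise (assumed known here)
    tau       : between-batch SD of log-noise (assumed known here)
    """
    y = math.log(sample_sd)
    se2 = 1.0 / (2.0 * (n - 1))            # approximate variance of log(sample_sd)
    w = (1.0 / se2) / (1.0 / se2 + 1.0 / tau ** 2)
    return math.exp(w * y + (1.0 - w) * mu_log_sd)

# A batch that looks twice as tight as the historical level (illustrative numbers).
small = shrunk_batch_sd(sample_sd=1.0, n=6, mu_log_sd=math.log(2.0), tau=0.3)
large = shrunk_batch_sd(sample_sd=1.0, n=60, mu_log_sd=math.log(2.0), tau=0.3)
print(f"n=6: {small:.2f}, n=60: {large:.2f}")
```

With few samples the estimate stays close to the historical level, and it moves toward the observed value as the batch gets larger. That is the conservative behaviour described above: a small batch has to look very different before the model believes it.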

These sorts of tail quantities are hard to get right, both in terms of estimation (here and here have info) and in terms of modeling.

So yeah you can get answers for arbitrary intervals but it’s up to you to validate/understand if they’re useful.

The voting stuff has lots of hierarchy. This might give you ideas about the scope of what is done: http://www.stat.columbia.edu/~gelman/research/published/misterp.pdf

Edit: I shouldn’t imply the total scope though – that paper is fairly old and voting isn’t the only application. It’s a scope though.

Not that applicable to your post but I thought you’d appreciate this blog post on Bayesian decision theory given your role:

https://twiecki.io/blog/2019/01/14/supply_chain/

It’s not using Stan but pretty simple to port over.
