- Operating System: Windows 10 64-bit
- brms Version: 2.4.0
I am unsure how best to model a data set from a crude experiment comparing the performance of two plastic molding tools used to manufacture widgets. Each widget is tested and the result is recorded as either a pass or a fail.
Tool 1 has two molding cavities, A2 and B2, meaning that two widgets can be produced at once. Tool 2 has 16 cavities, but they are paired according to the letter prefix; for example, C1 and C2 are more similar to each other than C1 and F1. Tool 2 was also evaluated under two operating conditions: “previous” and “new”. However, I only have data for five out of 16 cavities under the “previous” condition. Pass and fail counts grouped by Cavity are shown below:
> Tool Condition Cavity n_pass n_fail
> 1 Control A2 177 2
> 1 Control B2 168 1
> 2 New C1 159 1
> 2 New C2 160 1
> 2 New D1 154 1
> 2 New D2 176 3
> 2 New F1 147 0
> 2 New F2 168 0
> 2 New G1 166 0
> 2 New G2 166 0
> 2 New P1 163 1
> 2 New P2 172 0
> 2 New R1 135 2
> 2 New R2 175 1
> 2 New S1 169 4
> 2 New S2 171 2
> 2 New T1 176 2
> 2 New T2 148 0
> 2 Previous C1 138 2
> 2 Previous C2 23 0
> 2 Previous F1 30 0
> 2 Previous F2 21 1
> 2 Previous G1 96 0
Based on the number of failures, I want to determine the best tool and operating condition, as well as the performance of each cavity.
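For reference, here is a minimal base R sketch that enters the table above as a data frame and computes the raw failure proportion for each tool and condition. The names `widgets`, `n_total`, `by_group`, and `fail_rate` are only illustrative, and these proportions are unmodelled summaries, not estimates from any fitted model.

```r
# Cavity-level counts transcribed from the table above.
widgets <- data.frame(
  Tool      = c(rep("1", 2), rep("2", 21)),
  Condition = c(rep("Control", 2), rep("New", 16), rep("Previous", 5)),
  Cavity    = c("A2", "B2",
                "C1", "C2", "D1", "D2", "F1", "F2", "G1", "G2",
                "P1", "P2", "R1", "R2", "S1", "S2", "T1", "T2",
                "C1", "C2", "F1", "F2", "G1"),
  n_pass    = c(177, 168,
                159, 160, 154, 176, 147, 168, 166, 166,
                163, 172, 135, 175, 169, 171, 176, 148,
                138, 23, 30, 21, 96),
  n_fail    = c(2, 1,
                1, 1, 1, 3, 0, 0, 0, 0,
                1, 0, 2, 1, 4, 2, 2, 0,
                2, 0, 0, 1, 0)
)
widgets$n_total <- widgets$n_pass + widgets$n_fail

# Raw (unmodelled) failure proportion per Tool/Condition group.
by_group <- aggregate(cbind(n_fail, n_total) ~ Tool + Condition,
                      data = widgets, FUN = sum)
by_group$fail_rate <- by_group$n_fail / by_group$n_total
print(by_group)
```

With so few failures per cavity, these raw proportions are very noisy, which is part of why I am looking at a hierarchical model.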
Since the data is available as pass/fail, my first thought was to use logistic regression. However, I’m unsure how to specify the model correctly given the imbalanced, nested structure of the data. My first attempt was:
formula = Fail ~ Tool + (1|Tool/Condition/Cavity)
fit = brm(formula, data=data, family=bernoulli, warmup=1000, iter=2000, chains=4, control=list(adapt_delta=0.99))
where Fail is a binary variable with 1 indicating a failure and 0 a pass.
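To make the data layout concrete, the sketch below shows one way the per-cavity counts could be expanded into one row per widget with a 0/1 `Fail` column. It uses only two rows of the table for illustration; `counts`, `idx`, and `long` are assumed names, and the ordering of failures within a cavity is arbitrary.

```r
# Two rows from the table, used only to illustrate the expansion.
counts <- data.frame(
  Tool      = c("1", "2"),
  Condition = c("Control", "New"),
  Cavity    = c("A2", "D2"),
  n_pass    = c(177, 176),
  n_fail    = c(2, 3)
)

# Repeat each cavity's row once per tested widget.
idx  <- rep(seq_len(nrow(counts)), times = counts$n_pass + counts$n_fail)
long <- counts[idx, c("Tool", "Condition", "Cavity")]

# Mark the failures: n_fail ones followed by n_pass zeros for each cavity,
# in the same cavity order as the repeated rows above.
long$Fail <- unlist(Map(function(f, p) c(rep(1L, f), rep(0L, p)),
                        counts$n_fail, counts$n_pass))
rownames(long) <- NULL

# Cross-tabulate to confirm the counts survived the expansion.
print(table(long$Cavity, long$Fail))
```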
This led to 46 divergent transitions and 800 transitions exceeding max treedepth. I’m currently rerunning the model with an adapt_delta of 0.999 and max_treedepth of 20, but I suspect that I may not have enough data to do all the inference I want, and it would be difficult to come up with strong priors. Is my model correctly specified? Is there a better way of approaching this problem?
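The rerun mentioned above would look roughly like the following sketch. It assumes the brms package is installed and that `data` is the long-format data frame with the 0/1 `Fail` column; `sampler_control` and `refit_stricter` are hypothetical names used only here, and the formula is carried over unchanged from my first attempt.

```r
# Stricter sampler settings for the rerun: a higher adapt_delta forces smaller
# step sizes, and max_treedepth raises the cap on trajectory length.
sampler_control <- list(adapt_delta = 0.999, max_treedepth = 20)

# Hypothetical helper wrapping the same model as the first attempt.
# Nothing is fitted until the function is called with a data frame.
refit_stricter <- function(data) {
  brms::brm(
    Fail ~ Tool + (1 | Tool/Condition/Cavity),
    data    = data,
    family  = brms::bernoulli(),
    warmup  = 1000,
    iter    = 2000,
    chains  = 4,
    control = sampler_control
  )
}
```

Calling `refit_stricter(data)` would start sampling; with a tree depth cap of 20 each iteration can be far slower than with the default settings.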
Thanks!