Gage R&R - Residuals and Repeatability

Hi,

I have tried to run something similar to Gage R&R and I have got the following results.

According to Gage R&R, the residuals should correspond to the repeatability, which here should be 20.90. However, since there are 4 products (the intercept and 3 others), the question is whether the repeatability is the same across the 4 products. Can I find the residuals for each product?
I am tempted to say that the repeatability for the base is 13.318 (the MAD_SD for the intercept), but I am not sure that is correct.

Ultimately, I am asking whether there is any link between the residual Std.Dev and the predictors’ MAD_SD.

Thanks for your time.

Hi,
unfortunately, I don’t really understand the question - it uses a lot of terms that seem domain-specific. Here is what I understood well:

You should be able to get the predicted means for each observation and posterior sample via posterior_epred - comparing to the actual observations should then give you residuals per observation and sample, which you can then group in whichever way you like. I however can’t say to what extent this corresponds to any quantity you actually care about.
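As a rough sketch of that computation, here it is in plain Python rather than R, so that it does not lean on any rstanarm specifics. The `epred` matrix, the observations `y`, and the `product` labels are all made-up values; `epred` only stands in for a draws-by-observations matrix of the kind posterior_epred returns, and `residual_rms_by_group` is a hypothetical helper name.

```python
from collections import defaultdict
from math import sqrt

# Hypothetical stand-in for posterior_epred output:
# one row per posterior draw, one column per observation.
epred = [
    [10.0, 10.5, 20.0, 19.5],
    [11.0, 10.0, 21.0, 20.5],
    [9.0, 11.0, 19.0, 20.0],
]
y = [12.0, 9.0, 25.0, 16.0]       # observed values (made up)
product = ["A", "A", "B", "B"]    # grouping variable (made up)


def residual_rms_by_group(epred, y, groups):
    """For each draw, compute residuals (observed - predicted) and their
    root-mean-square within each group.
    Returns {group: [rms for draw 1, rms for draw 2, ...]}."""
    out = defaultdict(list)
    for draw in epred:
        by_group = defaultdict(list)
        for pred, obs, g in zip(draw, y, groups):
            by_group[g].append(obs - pred)
        for g, res in by_group.items():
            out[g].append(sqrt(sum(r * r for r in res) / len(res)))
    return dict(out)


rms_draws = residual_rms_by_group(epred, y, product)
for g, values in rms_draws.items():
    print(g, [round(v, 3) for v in values])
```

This gives one residual spread per product and per draw, so each product ends up with a whole distribution of values rather than a single number. Whether that matches what Gage R&R calls repeatability is a separate question.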

Residual standard deviation is AFAIK modelled by the sigma parameter. MAD_SD is AFAIK just computed from the posterior samples of the relevant parameters (in the output you’ve shown it is there only for sigma).
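To illustrate what "computed from the posterior samples" means, here is a small Python sketch. It assumes MAD_SD is the median absolute deviation of the draws scaled by 1.4826 (the default constant of R's `mad`), and the draws themselves are made up.

```python
from statistics import median


def mad_sd(draws, constant=1.4826):
    """Median absolute deviation around the median, scaled so that it is
    comparable to a standard deviation for normally distributed draws."""
    center = median(draws)
    return constant * median(abs(d - center) for d in draws)


# Made-up posterior draws of a single parameter (e.g. sigma).
sigma_draws = [19.0, 20.0, 21.0, 22.0, 40.0]
print(mad_sd(sigma_draws))
```

The point is that this is a summary of the spread of one parameter's draws, so it describes uncertainty about that parameter and is not a residual of any kind.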

As I am less familiar with stan_lmer, I’d ask @jonah to double check my answer is correct.

Best of luck with your model!


Hi,

I am going to follow your suggestion because it sounds reasonable:

You should be able to get the predicted means for each observation and posterior sample via posterior_epred - comparing to the actual observations should then give you residuals per observation and sample, which you can then group in whichever way you like.

However, do you think that the MAD_SDs of the predictor’s levels and the residuals from posterior_epred (after comparing the predictions with the actual observations) should be related?

Thanks for your time.

No, I don’t see any reason they should be related in some interpretable way. Residuals should IMHO be primarily related to how well the model “explains” the data (i.e. primarily whether the predictors - as used by the model - contain a lot of information about the outcome), while the MAD_SD of a model parameter represents how certain you are about the exact value of the parameter, which is primarily a function of the amount of data you have.

So you can have a lot of data but a model that doesn’t “explain” the outcome well (small MAD_SD, big residuals), or a model that “explains” the outcome quite well but is fit to a small dataset (big MAD_SD, small residuals).
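A toy illustration of those two cases in plain Python, using an intercept-only "model" (just the mean). As an assumption for the sake of a simple example, the standard error of the mean is used as a rough frequentist analogue of the parameter's MAD_SD; the data are made up and deterministic.

```python
from math import sqrt
from statistics import mean, stdev


def summarize(data):
    """Return (residual SD around the fitted mean, standard error of the mean)."""
    m = mean(data)
    residual_sd = stdev(data, m)
    return residual_sd, residual_sd / sqrt(len(data))


# Large but noisy dataset: values alternate between 80 and 120.
big_noisy = [100 + 20 * (-1) ** i for i in range(10000)]
# Tiny but tight dataset.
small_tight = [98.0, 102.0, 99.0, 101.0]

big_resid, big_se = summarize(big_noisy)
small_resid, small_se = summarize(small_tight)

print(f"big/noisy:   residual SD {big_resid:.2f}, SE of mean {big_se:.2f}")
print(f"small/tight: residual SD {small_resid:.2f}, SE of mean {small_se:.2f}")
```

The large noisy dataset has the bigger residuals but the smaller uncertainty about the mean, and the small tight dataset is the other way round, so neither quantity tells you the other.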


Yeah, sounds correct to me!