Pairs plots are super handy

Howdy! This is not a question. Just an interesting example that could be helpful/memorable.

I just thought I would share an interesting modeling anecdote to encourage newbies (and everyone) to use pairs plots.
Someone sent me a brms model file from a multiple regression model fit. Here are the pairs plots of the coefficients.

Well, that’s the most perfect correlation that I have ever seen in a pairs plot. So perfect, in fact, that it was as if the two predictors were the same. And indeed, when I pulled the data from the model and checked, two columns had the same data but different labels.
Now of course you don’t need a pairs plot to see this. You might say, “Well, they should have plotted all the raw data first.” Well, they did, but then they made some simple transforms to create some additional variables. Apparently some of the code got copied. Anyhow, it’s an easy mistake to make and to miss in the raw data, but also easy to catch in a regression. One could compare the coefficients and standard errors with those of the other predictors and see that something was wrong, but for detecting the scale of the problem, the pairs plot was perfect.
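For what it’s worth, the data-side check can be sketched in a few lines. This is just a minimal illustration in plain Python, not what I actually ran: the function name `find_duplicate_columns`, the column names, and the numbers are all made up, and it assumes the data is available as a mapping from column name to a list of values.

```python
from itertools import combinations


def find_duplicate_columns(columns, tol=0.0):
    """Return pairs of column names whose values match element-wise.

    `columns` maps a column name to a list of numbers; `tol` is the largest
    absolute difference that still counts as "the same".
    """
    duplicates = []
    for (name_a, a), (name_b, b) in combinations(columns.items(), 2):
        if len(a) == len(b) and all(abs(x - y) <= tol for x, y in zip(a, b)):
            duplicates.append((name_a, name_b))
    return duplicates


# Hypothetical data: `log_dose` was meant to be a transform of `dose`,
# but a copy-paste slip left it identical to `log_age`.
data = {
    "age": [23.0, 35.0, 47.0, 59.0],
    "dose": [1.0, 2.0, 4.0, 8.0],
    "log_age": [3.14, 3.56, 3.85, 4.08],
    "log_dose": [3.14, 3.56, 3.85, 4.08],
}

print(find_duplicate_columns(data))  # [('log_age', 'log_dose')]
```

Running something like this after the transform step, rather than only on the raw data, is what would have caught the problem here.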
This is an extreme/obvious real-world example that’s sort of fun to come across, but I’ve found that in general it’s a good idea to always look at a pairs plot for various combinations of the parameters in your model. It can be pretty helpful/revealing, even when you don’t have a warning that suggests looking at one.
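If there are too many parameters to eyeball every panel, a rough numerical companion to the pairs plot is to compute pairwise correlations of the posterior draws and flag the extreme ones. The sketch below is only an illustration under stated assumptions: the draws are invented, the names `b_x1`, `b_x2`, `b_x3` and the 0.95 threshold are hypothetical, and it assumes you have already extracted the draws into a mapping from parameter name to a list of values. When two predictors are identical, the likelihood only constrains the sum of their coefficients, so the draws trade off against each other and the correlation heads toward -1.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences of draws.

    Assumes neither sequence is constant (otherwise the denominator is zero).
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def flag_correlated(draws, threshold=0.95):
    """Return (name_a, name_b, r) for parameter pairs with |r| >= threshold."""
    flagged = []
    for (name_a, a), (name_b, b) in combinations(draws.items(), 2):
        r = pearson(a, b)
        if abs(r) >= threshold:
            flagged.append((name_a, name_b, r))
    return flagged


# Made-up draws for illustration only; b_x2 is constructed as 1.0 - b_x1,
# mimicking two coefficients whose sum is pinned down but nothing else.
draws = {
    "b_x1": [0.2, 1.1, -0.4, 0.9, 0.3],
    "b_x2": [0.8, -0.1, 1.4, 0.1, 0.7],
    "b_x3": [0.5, 0.4, 0.6, 0.3, 0.7],
}

for name_a, name_b, r in flag_correlated(draws):
    print(f"{name_a} vs {name_b}: r = {r:.3f}")
```

A correlation coefficient only picks up linear dependence, so this is a screening aid and not a replacement for looking at the plot itself, which also shows funnels, banana shapes, and multimodality.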
