Multivariate model interpreting residual correlation and group correlation

Ah, I see. This stuff can get confusing, so don’t be afraid to ask for clarifications.

If you only cared about how well the measurements agree in the experiment you just did, you wouldn’t need a statistical model - you could just compute the correlation matrix (or a similar measure) between the observed A, B, C and D and be done. All there is to know about the experiment you ran has been observed. But I guess you wouldn’t be happy with that approach - and I guess that is because you actually want to generalize. The phrase “how well the measurements agree” implies that you care about some general “truth”. You care about generalization - to other centers, other times from baseline, other subjects. If you don’t care about generalization, I honestly think there is little reason to do statistical modelling - just describe the data you’ve seen - that’s also good (a statistical model can also be a tool to just describe data, but then you IMHO start to play by almost the same rules as if you considered generalizations, and the distinction is only of theoretical interest).
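
Just to make that concrete - if a purely descriptive summary were enough, it really is a one-liner (observed_data and the column names below are placeholders for your actual data frame):

# Correlation matrix of the observed measurements - a purely descriptive summary
# (observed_data and the column names are placeholders, adapt to your data)
cor(observed_data[, c("A", "B", "C", "D")])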

One way (but not the only one) to think about these generalizations is to imagine hypothetical new experiments/measurements. Looking at individual coefficients can be seen as a special case of this. If I have a simple linear regression y ~ x and inspect the fitted coefficient for a binary predictor x, then I am looking at how big a difference between the averages of y for the two groups I would see in a hypothetical new set of measurements with no noise/unlimited data. However, for more complex models the individual coefficients can correspond to quite bizarre hypothetical quantities.
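
To make the “coefficient as a hypothetical contrast” reading concrete, here is a toy sketch with completely made-up data (the names are mine, not from your setup):

# Made-up data: a binary predictor shifting the mean of y by roughly 2
toy <- data.frame(x = rep(c(0, 1), each = 50))
toy$y <- rnorm(100, mean = 2 * toy$x)
fit_toy <- brm(y ~ x, data = toy)
# The posterior summary of the x coefficient is the model's guess at the
# noise-free difference between the two group means in hypothetical new data
fixef(fit_toy)["x", ]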

But the framework of hypothetical future experiments is helpful because it lets us be explicit about the details of what you care about - I wrote about this at Inferences marginal of random effects in multi-level models - #3 by martinmodrak, so I won’t repeat it here.

As a (slightly contrived) example of how to use predictions - and one reason why it can matter - consider this regression:

library(brms)
# Two observations, fit an intercept-only model with default priors
d <- data.frame(x = c(3, 5))
fit <- brm(x ~ 1, data = d)
fit

The summary of the parameters is:

Population-Level Effects: 
          Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept     3.96      1.25     1.36     6.46 1.00     1164     1313

Family Specific Parameters: 
      Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sigma     2.13      1.51     0.65     6.06 1.00     1242     1257

But that’s hiding an important part of the picture - the estimates for Intercept and sigma are not independent - the data is consistent with the Intercept being around 4 and sigma small, but also with sigma large and the Intercept anywhere in a wide range.

pairs(fit)
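
If you prefer a number to a plot, you can also compute the correlation of the posterior draws directly (this assumes a reasonably recent brms where as_draws_df is available):

draws <- as_draws_df(fit)
# Correlation between the draws of the Intercept and sigma - nonzero means the
# two parameter summaries cannot be read independently of each other
cor(draws$b_Intercept, draws$sigma)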

For some inferences, this doesn’t really matter. I can interpret the Intercept as the mean of a hypothetical future measurement with unlimited data. Let’s compare this with a prediction for a “big” experiment:

dummy_data <- data.frame(.dummy = rep(1, length.out = 1000)) # Data frame with 1000 rows, to predict 1000 observations, normally this would house covariates
pred <- posterior_predict(fit, newdata = dummy_data)
samples_mean <- rowMeans(pred)
mean(samples_mean)
# 3.961717
quantile(samples_mean, c(0.025,0.975))
#     2.5%    97.5% 
# 1.345227 6.462462 

Yay, we recovered the summary for the Intercept. But what if we care about just a single new value? How big could it get? The Intercept is likely not bigger than 6.46, but sigma can plausibly get to 6.06. How likely is it that both are large? There is no straightforward way to see that from the parameter summaries, but we can use predictions.

samples_single <- posterior_predict(fit, newdata = data.frame(.dummy = 1)) 
mean(samples_single)
# [1] 3.962272
quantile(samples_single, c(0.025,0.975))
#      2.5%     97.5% 
# -1.663526  9.813133 
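
And since those are draws from the joint predictive distribution, you can directly answer questions like “how likely is a single new value to exceed 8?” (the threshold 8 is arbitrary, just to illustrate the kind of question):

# Predictive probability that one new observation exceeds an (arbitrary) threshold
mean(samples_single > 8)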

This dependency between parameters is usually not that strong in practice, but it is a similar issue to the one you are facing. Here, there are two sources of variability that cannot be fully understood separately. In your case, there are two sources of correlation, and considering them separately is weird, as you noted. The easiest way to consider them together is to use predictions.
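
For completeness, a rough sketch of what “use predictions” could look like in your multivariate case - I am guessing at the formula, the grouping term and the names (A, B, C, D, subject, your_data, new_data), so treat this as a template rather than a recipe:

# Hypothetical model - adapt the formula, grouping and data to your actual setup
# fit_mv <- brm(bf(mvbind(A, B, C, D) ~ 1 + (1 | p | subject)) + set_rescor(TRUE),
#               data = your_data)

# For multivariate models, posterior_predict returns a draws x observations x responses array
pred_mv <- posterior_predict(fit_mv, newdata = new_data, allow_new_levels = TRUE)

# For each draw, the correlation between predicted A and B across observations -
# this combines the group-level and residual correlations into one quantity
cor_AB <- apply(pred_mv, 1, function(x) cor(x[, "A"], x[, "B"]))
quantile(cor_AB, c(0.025, 0.5, 0.975))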

Hope that clarifies more than confuses.
