I am trying to fit a multivariate nonlinear model that describes the reaction norms of phytoplankton gross photosynthesis and respiration rates to temperature. The data are rate measurements at different temperatures for 18 species of marine phytoplankton from this study. The relationship of each rate to temperature (K) is described by the following formula:
$$
\text{rate} = c \cdot \frac{\exp\left(\frac{E_a}{8.62 \times 10^{-5}} \left(\frac{1}{T_c} - \frac{1}{K} \right)\right)}{1 + \exp\left(\frac{E_h}{8.62 \times 10^{-5}} \left(\frac{1}{T_h} - \frac{1}{K} \right)\right)}
$$
where T_c is a constant reference temperature (set to 293.15 K in the model below) and 8.62 × 10^-5 eV K^-1 is the Boltzmann constant.
So my model has the following form:
bf(
  rateGP ~ c * exp(Ea / (8.62 * 10^-5) * (1/293.15 - 1/K)) /
    (1 + exp(Eh / (8.62 * 10^-5) * (1/Th - 1/K))),
  c ~ 1 + (1 | species),
  Ea ~ 1 + (1 | species),
  Eh ~ 1 + (1 | species),
  Th ~ 1 + (1 | species),
  sigma ~ 1 + (1 | species),
  nl = TRUE
) +
  bf(
    rateR ~ c * exp(Ea / (8.62 * 10^-5) * (1/293.15 - 1/K)) /
      (1 + exp(Eh / (8.62 * 10^-5) * (1/Th - 1/K))),
    c ~ 1 + (1 | species),
    Ea ~ 1 + (1 | species),
    Eh ~ 1 + (1 | species),
    Th ~ 1 + (1 | species),
    sigma ~ 1 + (1 | species),
    nl = TRUE
  ) +
  set_rescor(FALSE)
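For reference, fitting this formula would look roughly like the sketch below. The data frame dat, the prior values, and the bounds are placeholders I made up for illustration, not values from the study; brms requires explicit priors on nonlinear parameters, so these would need to be chosen per data set.

```r
library(brms)

# 'bform' is the multivariate formula above; 'dat' is assumed to be a data
# frame with columns rateGP, rateR, K (temperature in Kelvin), and species.
# Placeholder priors -- nonlinear parameters need explicit priors in brms.
priors <- c(
  prior(normal(1, 1),     nlpar = "c",  resp = "rateGP", lb = 0),
  prior(normal(0.6, 0.5), nlpar = "Ea", resp = "rateGP", lb = 0),
  prior(normal(3, 2),     nlpar = "Eh", resp = "rateGP", lb = 0),
  prior(normal(300, 10),  nlpar = "Th", resp = "rateGP"),
  prior(normal(1, 1),     nlpar = "c",  resp = "rateR",  lb = 0),
  prior(normal(0.6, 0.5), nlpar = "Ea", resp = "rateR",  lb = 0),
  prior(normal(3, 2),     nlpar = "Eh", resp = "rateR",  lb = 0),
  prior(normal(300, 10),  nlpar = "Th", resp = "rateR")
)

fit <- brm(bform, data = dat, prior = priors, cores = 4)
```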
My main motivation for reanalyzing these data is that I would like to generate ecologically plausible combinations of parameter values, to use in simulations of multispecies communities. I think one way to do that would be by sampling draws from the posterior. I would like to understand the implications of (not) estimating group-level correlations for the different parameters.
Am I right in thinking that, if the data imply correlations between the different parameters, these will influence the joint posterior even if I am not explicitly estimating them, so that draws from the posterior would be constrained by these correlations? And that if I instead wanted to generate predictions from this model, explicitly estimating the correlations (~ 1 + (1|p|species) across all parameters) would be necessary for the predictions to be constrained by them?
Does that make sense?
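For concreteness, the correlated version would replace each (1 | species) term with (1 | p | species), where p is an arbitrary shared label that tells brms to estimate correlations among all group-level effects carrying the same label (sketched here for one response only):

```r
# Sketch of the correlated-group-effects version of the rateGP part;
# 'p' is an arbitrary label shared across terms, not a variable in the data.
bf(
  rateGP ~ c * exp(Ea / (8.62 * 10^-5) * (1/293.15 - 1/K)) /
    (1 + exp(Eh / (8.62 * 10^-5) * (1/Th - 1/K))),
  c ~ 1 + (1 | p | species),
  Ea ~ 1 + (1 | p | species),
  Eh ~ 1 + (1 | p | species),
  Th ~ 1 + (1 | p | species),
  nl = TRUE
)
```

With the shared label, brms estimates a full correlation matrix for the species-level effects of these four parameters instead of treating them as independent.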
If I understand your question correctly, you will want to explicitly model the correlations to ensure that all of the variation in the model is being properly assigned. When we ran a correlation-recovery analysis on our simulated behavioral data, explicitly modeling the correlation systematically improved the recoverability of the parameters of interest, and it also recovered the correlations themselves better than post-hoc analyses based on posterior point estimates.
Thank you for your reply. I am prone to wishful thinking about what a model does under the hood…
So far, explicitly modeling these correlations has been a major pain in my own posterior, which is why I am inclined to ignore them. The model does capture the variation among species, which makes me wonder about the gains of the added complexity.
But if I understand correctly, drawing samples from a fitted model that does not have correlated group-level terms is no better than converting the means and confidence intervals reported in the paper into means and standard errors and generating samples from those (leaving aside any differences in estimates that come from using a different modeling framework).
Edit: I forgot that the paper does not report group level uncertainty. So this was not a pointless exercise.
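As a sketch of how per-species parameter sets could be sampled from the fitted model: the species-level value of each nonlinear parameter is the population-level intercept plus that species' group-level offset. The variable names below follow brms's usual naming for multivariate nonlinear models but are assumptions here, and "some.species" is a made-up level; check the real names with variables(fit).

```r
library(posterior)

draws <- as_draws_df(fit)

# Assumed names -- verify with variables(fit). Species-specific c for the
# GP response = population intercept + species offset, per posterior draw.
c_gp <- draws$b_rateGP_c_Intercept +
  draws$`r_species__rateGP_c[some.species,Intercept]`
```

Because each row of draws is one joint posterior draw, assembling all four parameters row-wise preserves whatever correlations the joint posterior contains.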
This is an example where some very accessible intuition goes a long way towards sophisticated understanding.
In the event that groups are informed by lots and lots of data, modeling the correlations explicitly doesn’t matter much (just as modeling the margins as random effects doesn’t matter much). Shrinkage becomes irrelevant because the true value is very strongly identified by the data.
In less data-rich cases, when modeled correlations are strong, the upshot is that the joint posterior gets shrunk towards the line of correlation (consider what happens if the correlation is fixed to one).
In cases where there is a moderate or strong correlation in the generating process, and the correlation is modeled, the posterior replicates should look like samples from the generating correlation. If the correlation is not modeled, the posterior sample correlation between the fitted intercepts will tend to underestimate the true correlation. This holds even when not predicting to new, unobserved groups: it holds for any group with less than a whole lot of data (again, with a whole lot of data the position of the intercept pair is effectively fixed by the data, and shrinkage becomes irrelevant).
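A related attenuation effect can be seen without fitting anything, in a small base-R simulation (illustrative numbers only): when each margin is estimated independently, per-group noise dilutes the sample correlation between the estimates, and only pooling information across margins, as a modeled correlation does, can counteract that.

```r
# True species effects for two parameters are correlated; independent noisy
# per-species estimates of each attenuate that correlation.
set.seed(1)
n    <- 18                                 # number of species
S    <- matrix(c(1, 0.8, 0.8, 1), 2, 2)    # true correlation of 0.8
true <- matrix(rnorm(2 * n), n, 2) %*% chol(S)          # correlated effects
est  <- true + matrix(rnorm(2 * n, sd = 0.7), n, 2)     # noisy estimates

cor(true)[1, 2]  # sample correlation among the true effects
cor(est)[1, 2]   # attenuated on average, since the noise is independent
```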