Regression of slopes of slopes in brms: is this possible?

I want to make a model which does two separate regressions, each with species-level effects, and then does a (phylogenetic) regression on those two slopes (with a pair of slope values for each species).
b ~ beta0 + beta1*year // Year regression on b
beta0 + beta1 ~ 1 | species // Make it species-level
a ~ hatbeta0 + hatbeta1*year // Year regression on a
hatbeta0 ~ 1 | species // Make it species-level
hatbeta1 ~ 1 + beta1 // Do a regression on the two slopes

Is this possible in brms? I wrote a Stan version of this model, but I’m in integration hell right now trying to get it work, and I wondered if maybe I’m rebuilding the wheel here.

This is what I have in the way of brms code. (My data frame is species,year,b,a.)

brm(
  bf(
    b ~ beta0 + beta1*year,
    beta0 + beta1 ~ 1 | species,
    nl=TRUE
  ) + bf(
    a ~ hatbeta0 + hatbeta1*year,
    hatbeta0 ~ 1 | species,
    hatbeta1 ~ 1 + beta1,
    nl=TRUE
  ) + set_rescor(FALSE),
  data=species_measures,
  chains=1
)

But it doesn’t like that I have beta1 in in the second bf and defined in the first bf… If I put them in the same bf then it suddenly doesn’t recognize a as data and the whole thing breaks.

I’m very excited to hear what you all have to say!

(More formally, this is the model:

Let b_i be b measurements associated with year observations y_{i}, and a_j be a measurements associated with year observations \hat y_{i}. For each of these s_{i}\in\{1,\cdots,n\} and \hat s_{j}\in\{1,\cdots,n\} are the corresponding species indices, for n species total. Then let \beta_{0,k},\beta_{1,k},\sigma_{k} and \hat\beta_{0,k},\hat\beta_{1,k},\hat\sigma_{k} be species-level intercepts, slopes, and standard deviations for each species, for b and a measurements respectively.

We want to fit the parameters B_0,B_1, and \Sigma (eventually \Sigma will be the combination of a Pagel’s lambda parameter, a scaling coefficient, and a phylogenetic distance matrix; for now, it is simply a scalar).

The model is as follows:
b_i\sim\mathcal N(\beta_{0,s_{i}}+\beta_{1,s_{i}}y_{i}, \sigma_{i})
a_j\sim\mathcal N(\hat \beta_{0,\hat s_{j}}+\hat \beta_{1,\hat s_{j}}\hat y_{j}, \hat\sigma_{j})
\hat\beta_k\sim\mathcal N(B_0+B_1\beta_k,\Sigma)

I care about the estimation of B_1.)

I think you want something like
b ~ year + (1 || species) + (year | g1 | species)
a ~ year + (1 || species) + (year | g1 | species)

Explanation: your
b ~ beta0 + beta1*year, beta0 + beta1 ~ 1 | species
is equivalent to
b ~ year + (1 || species) + (year || species)

where || indicates that we don’t want the random intercept to be modeled as correlated with the random slope of year. (At least I think this is true; I’m not 100% sure whether beta0 + beta1 ~ 1 | species models the effects as correlated or uncorrelated. I’ll give the model where everything is correlated at the end of this post.)

The | g1 | random effect specification is telling brms to model the random slopes as correlated across the formulas. Here g1 is just an arbitrary non-empty character string identifying a set of random effects that share a grouping variable to be modeled as correlated. These correlated random effects are equivalent to what you are hoping to achieve by modeling one random effect as though it’s regressed on another, under the assumption that both effects are normally distributed.

If you want a model where the intercepts are also correlated with the slopes, do
b ~ year + (1 | g1 | species) + (year | g1 | species)
a ~ year + (1 | g1 | species) + (year | g1 | species)

or equivalently

b ~ year + (1 + year | g1 | species)
a ~ year + (1 + year | g1 | species)

Edit:
The above gives something that should recreate the model you wrote down in pseudocode, but I don’t know of a clear way in brms to extend this to a phylogenetic regression. I’ll mull that over, but it might be easiest in raw Stan.

2 Likes

I think (x | g) is expanded to (1 + x | g), so you’d need (0 + year | g1 | species), right?

I think adding the genetic covariance is straight-forward in principle:

b ~ year + (1 + year | g1 | gr(species, cov = phy))
a ~ year + (1 + year | g1 | gr(species, cov = phy))

although the phylogenetics vignette warns that models with phylogenetic slopes will be slow to sample from.

1 Like

Thank you both for all of your advice! I’m very new to brms, and there is clearly a lot of syntax I’m not familiar with yet!

One small question, and one bigger one:

#1 Would it be unadvisable to skip the overall year slope, and just go for

b ~ (1 || species) + (0 + year |p| species)
a ~ (1 || species) + (0 + year |p| species)

I don’t want any of the species-signal to get filtered out as “overall year slope.” I’m not sure it would be a problem if it did (assuming it did it uniformly of course), but at the very least it feels like an extra parameter to fit, and I can’t think of why I would actively want it to do this. (Does it help standardize the species-level slopes?)

#2 I saw the |p| notation when I was reading through the multivariate/multilevel modeling vignette and wondered if that would be the right choice here, but I was concerned that “correlated group-level effects” wasn’t the same as a linear regression on the group-level effects. Behind the scenes, it is doing the same thing? (For instance, could I pull out a posterior for the slope from this correlation?) It sounds like this is what you’re saying, but I just wanted to check!

(Special thank you to @jsocolar who has now answered all three of my questions about this friggin’ model. That’s a hat-trick!)