# Latent mean centering (latent covariate models) in brms

Hi all,

I am trying to latent cluster-mean center my predictors in a multilevel model instead of using the observed means to center them, in order to avoid bias in the resulting parameter estimates.

(If we’re able to solve this, I’d write it up properly and submit it to AMPPS as a practical tutorial, and I’d be more than happy to have coauthors! 😃 I have a longer blog post draft exploring this, massively WIP but with code and data, here.)

## Background

The issue relates to centering observed predictor variables in hierarchical models of multiple people’s data in order to separate within-person and between-person associations. If one uses the raw values of observed predictors in hierarchical models, the resulting coefficients will be mixtures of within- and between-person associations. This is not optimal because psychologists, like me, want to know what is going on within a person. As a solution, psychologists have long known to person-mean center their variables in order to study within-person (causal) processes that are isolated from between-person associations (e.g., Enders & Tofighi, 2007).
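To make observed person-mean centering concrete, here is a minimal base-R sketch. The tiny data frame and column names are made up for illustration; they mirror the kind of data shown later in this post:

```r
# Observed person-mean centering: split a time-varying predictor into a
# between-person part (each person's observed mean) and a within-person
# part (the occasion-level deviation from that mean).
d <- data.frame(
  person = rep(1:2, each = 5),
  dep    = c(0.43, -0.68, -1.49, -0.74, -0.52, 0.68, 1.49, 0.03, -1.02, 1.07)
)
d$dep_b <- ave(d$dep, d$person)  # between-person: observed person means
d$dep_c <- d$dep - d$dep_b       # within-person: deviations from the mean

# The decomposition is exact (dep = dep_c + dep_b), and the within-person
# deviations sum to zero for each person.
```

The within-person column `dep_c` is what would go into the model as the “purely within-person” predictor; the whole point of this thread is that with few observations per person, the observed mean `dep_b` is a noisy stand-in for the latent person mean.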

However, especially if you have few observations per person, centering on the observed mean will bias your parameter estimates, because the actual mean is not an observed quantity but a latent one that should be appropriately estimated. The solution, latent person-mean centering (Asparouhov & Muthén, 2010; McNeish & Hamaker, 2020), is available in the commercial MPlus software. (The relevant equations are in McNeish and Hamaker, which I’ve found online.)

The problem, of course, is that it’s not in Stan, it’s not in brms, and most people don’t have access to MPlus because it is not FOSS. My goal is to figure out how to latent-mean center predictor variables in hierarchical models of longitudinal data with brms, so that the procedure, which provides unbiased estimates, would be available to as many researchers as possible.

## Analysis

Say I have many measurements over time from many individuals on their urge to smoke and depression, and I am interested in how depression and urge to smoke on a previous occasion predict the urge to smoke within-person. Here’s some data from McNeish & Hamaker, 2020 (my blog code loads and processes this automatically in R if you’d like to follow along):

| person | time | urge | dep |
|-------:|-----:|------:|------:|
| 1 | 1 | 0.34 | 0.43 |
| 1 | 2 | -0.48 | -0.68 |
| 1 | 3 | -4.44 | -1.49 |
| 1 | 4 | -4.19 | -0.74 |
| 1 | 5 | -0.91 | -0.52 |
| 2 | 1 | 1.65 | 0.68 |
| 2 | 2 | 0.31 | 1.49 |
| 2 | 3 | 0.46 | 0.03 |
| 2 | 4 | -1.09 | -1.02 |
| 2 | 5 | 1.67 | 1.07 |

(Just showing a few observations for 2 people in that table.) A knee-jerk multilevel model of this is

\begin{align*} \text{Urge}_{ti} &\sim N(\mu_{ti}, \sigma^2), \\ \mu_{ti} &= \bar{\alpha} + \alpha_i + (\bar{\varphi}+\varphi_i)\text{Urge}_{(t-1)i} + (\bar{\beta} + \beta_i)\text{Dep}_{ti}, \\ (\alpha_i, \varphi_i, \beta_i) &\sim MVN(\pmb{0}, \Sigma), \end{align*}

where \(\text{Urge}_{ti}\) is the urge to smoke at time t for person i. We then have population-level effects (those with bars) and person-specific deviations (those without). As noted above, if we use the observed \(\text{Urge}_{(t-1)i}\) and \(\text{Dep}_{ti}\), the resulting parameter estimates will be biased. (Specifically, they will suffer from Nickell’s bias (negative bias in autoregressive effects) and Lüdtke’s bias (bias in other time-varying effects) [McNeish & Hamaker, 2020].)
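For reference, that knee-jerk model with raw observed predictors is just a plain multilevel formula. A sketch in brms/lme4-style syntax, using the variable names from the data above (`urge1` being the lagged urge, as created later in the post); this is the biased version the post argues against:

```r
# Naive multilevel AR(1) model with observed, uncentered predictors.
# The coefficients on urge1 and dep mix within- and between-person
# associations (Nickell's and Lüdtke's biases).
f_naive <- urge ~ 1 + urge1 + dep + (1 + urge1 + dep | person)
```

Everything that follows is about replacing the raw `urge1` and `dep` here with their latent-mean-centered counterparts.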

Latent mean centering pivots on recognizing that (McNeish & Hamaker, 2020, p. 618)

\begin{align*} \text{Urge}^n_{(t-1)i} &= \text{Urge}^c_{(t-1)i} + \text{Urge}^b_i, \\ \text{Dep}^n_{ti} &= \text{Dep}^c_{ti} + \text{Dep}^b_i: \end{align*}

The observed values of urge to smoke (on a previous occasion) and depression are sums of the latent person means (with superscripts b) and their occasion-level deviations (superscripts c).

## Code

In case you didn’t notice, I am following the MPlus tutorial of McNeish & Hamaker (2020), whose MPlus code for this model is here. It is not at all transparent to me, but maybe someone here understands it:

I have tried to replicate this in brms with the following code (see my blog for reproducible code):

```r
m4_data <- d %>%
  # Create lagged variable
  group_by(person) %>%
  mutate(urge1 = lag(urge)) %>%
  ungroup()
```

```r
m4_latent_formula <- bf(
  urge ~ alpha + phi*(urge1 - alpha) + beta*(dep - depb),
  alpha ~ 1 + (1 | person),  # Latent intercept of urge
  # urge1b ~ 1 + (1 | person),  # This is alpha! Weird & awesome (M&H2020 p.618)
  depb ~ 1 + (1 | person),  # Latent mean of depression
  phi ~ 1 + (1 | person),   # Slope of latent mean-person centered lagged urge on urge
  beta ~ 1 + (1 | person),  # Slope of latent mean-person centered depression on urge
  nl = TRUE
) +
  gaussian()
```

```r
# Hacky way to set some pretty random priors
p <- get_prior(m4_latent_formula, data = m4_data) %>%
  mutate(
    prior = case_when(
      class == "b" & coef == "Intercept" ~ "normal(0, 1.5)",
      class == "sd" & coef == "Intercept" ~ "student_t(7, 0, 1)",
      TRUE ~ prior
    )
  )

m4_latent <- brm(
  m4_latent_formula,
  data = m4_data,
  prior = p,
  control = list(adapt_delta = 0.99),
  file = "m4_latent"
)
```


(I specified the above code based on @martinmodrak’s suggestion here. I haven’t used the nonlinear syntax before, so this is probably quite wrong, for other reasons too, no doubt!) The model samples well, but the estimates don’t quite match those reported in the MPlus tutorial of McNeish & Hamaker (2020):

| variable | median | q2.5 | q97.5 |
|---|---:|---:|---:|
| b_alpha_Intercept | -0.11 | -0.32 | 0.09 |
| b_depb_Intercept | -0.10 | -0.22 | 0.04 |
| b_phi_Intercept | 0.21 | 0.18 | 0.25 |
| b_beta_Intercept | 0.78 | 0.61 | 0.94 |
| sd_alpha_Intercept | 0.56 | 0.41 | 0.76 |
| sd_depb_Intercept | 0.00 | 0.00 | 0.05 |
| sd_phi_Intercept | 0.02 | 0.01 | 0.03 |
| sd_beta_Intercept | 0.75 | 0.57 | 0.99 |
| sigma | 1.14 | 1.10 | 1.19 |

The estimates for phi and beta are the same, but depb and alpha are way off. The variances seem very similar, though. I don’t know why; I think I’ve specified the right model. Maybe MPlus has weird priors?

Help needed :)

## Footnote

This issue has been discussed here before, with no resolution:


I don’t see anything obviously wrong with that, but I also cannot really invest a lot of time into this. One possibility is that brms by default mean-centers predictors (at least in linear models; I’m not sure about non-linear ones, but if it does, you’d see it in the generated Stan code), while MPlus might not do that.

In any case, in a Bayesian workflow we have tools to check whether our model is implemented correctly, and this involves simulation, with SBC being IMHO the most comprehensive way to do it.

We even have a vignette showing how to use SBC with brms models. What you want is to write a custom generator that should match the model you believe you have implemented (generating data from the brms model won’t work with non-linear models and is not very useful for your case anyway). If there’s a mismatch SBC will let you know. Note that you need to make sure that you have sensible priors.

In parallel, you may want to try running the MPlus code with multiple, longer chains (possibly thinned) to see whether it has actually converged.

Good luck with the project!

Awesome, thank you! I’ll definitely take a look at SBC.

This is a cool project. Are the data publicly available?

I might be overlooking something, but it doesn’t look like depb is actually modelled as the person-mean of dep. Maybe it is induced through the first line? Either way, I wonder if adding dep ~ depb would work.

Hi @simonbrauer 👋,

The code and data are in the blog post (there’s a link to the data, but the script also downloads it automatically). I found the McNeish and Hamaker paper online here.

Looking at equation 4b in McNeish and Hamaker, I think you might be right that we’d also need to model dep, but I don’t know how I’d write that in the model.

I think if this requires modelling dep, it might not be possible to do with brms because you cannot pass parameters across formulas in a way that I think that might require.

Here’s my naive first try:

```r
m4_latent_formula <- bf(
  urge ~ alpha + phi*(urge1 - alpha) + beta*(dep - depb),
  alpha ~ 1 + (1 | person),  # Latent intercept of urge
  dep ~ 0 + depb,
  depb ~ 1 + (1 | person),  # Latent mean of depression
  phi ~ 1 + (1 | person),   # Slope of latent mean-person centered lagged urge on urge
  beta ~ 1 + (1 | person),  # Slope of latent mean-person centered depression on urge
  nl = TRUE
) +
  gaussian()
```

```r
p <- get_prior(m4_latent_formula, data = m4_data) %>%
  mutate(
    prior = case_when(
      class == "b" & coef == "Intercept" ~ "normal(0, 1.5)",
      class == "sd" & coef == "Intercept" ~ "student_t(7, 0, 1)",
      TRUE ~ prior
    )
  )
```

```
Error: The following variables can neither be found in 'data' nor in 'data2':
'depb'
```


OK, let’s try to model it as a missing variable somehow (I’m not familiar with the mi() syntax):

```r
m4_latent_formula <- bf(
  urge ~ alpha + phi*(urge1 - alpha) + beta*(dep - depb),
  alpha ~ 1 + (1 | person),  # Latent intercept of urge
  dep ~ 0 + mi(depb),
  depb | mi() ~ 1 + (1 | person),  # Latent mean of depression
  phi ~ 1 + (1 | person),   # Slope of latent mean-person centered lagged urge on urge
  beta ~ 1 + (1 | person),  # Slope of latent mean-person centered depression on urge
  nl = TRUE
) +
  gaussian()

m4_data <- m4_data %>%
  mutate(depb = NaN)

p <- get_prior(m4_latent_formula, data = m4_data) %>%
  mutate(
    prior = case_when(
      class == "b" & coef == "Intercept" ~ "normal(0, 1.5)",
      class == "sd" & coef == "Intercept" ~ "student_t(7, 0, 1)",
      TRUE ~ prior
    )
  )
```

```
Error: Variables in 'mi' terms should also be specified as response variables in the model. See ?mi. Error occurred for 'depb'.
```


What do you think?


Are correlations between the random effects modeled in the brms model? In Mplus, I think they are not unless this is specified in the code. If they are included in only one version, that could explain the differences.

Hi @Nic,

I didn’t include the random effect correlations because I wanted to exactly replicate the McNeish and Hamaker analysis (they set the covariances to zero).

Thanks for pointing me to the data. I’ve been fiddling around with brms without much luck. Adding dep ~ depb, dep ~ 0 + depb, etc. always threw an error stating that depb couldn’t be found in the dataset.

I did want to check my intuition with hard-coded Stan. Once I added the equivalent of dep ~ depb, I produced very similar results to McNeish and Hamaker. (Ignore me being a bad Bayesian and not putting priors on all of the parameters.) So I think that’s the most likely culprit.

model.stan (1.7 KB)


That’s really cool! Thanks.

I was afraid that there would have to be a Stan solution to this, rather than a brms one. I’ll take a closer look at this (next week hopefully), and see if I can find a way to add the term into brm().

Otherwise, I think I might be able to hack it together with stanvar(). I’ll update here when I have some progress.

(I’d personally be OK with a Stan solution, but my whole motivation here is to make latent mean centering more available to applied research, because in my field it is currently basically hidden in MPlus and most people wouldn’t be comfortable writing Stan code.)


I think that this should work, as long as you add the response pattern for depb on dep. For example, when using brms to write a CFA (https://scottclaessens.github.io/blog/2020/brmsLV/) you see that you have to specify priors for the regressions of the latent variable on the observed one.

I think that if you specify the prior for the response dep, it should work.


I completely agree. If it can be elegantly done in brms then that would be a huge asset to the average user. Hopefully @Mauricio_Garnier-Villarre 's idea will work out.


I’ve been trying to do that in the edited blog entry but have run into some problems. This might be my unfamiliarity with mi(), but you might not be able to specify the terms in the way that’s required for this model even using that syntax.

However, I did find this discussion which seems related (@Stephen_Martin ?)

I’ll try to dig deeper and will post back when I have updates. Thanks all for your comments and suggestions! 💚


@matti

I am checking the Stan syntax by @simonbrauer that replicated the results, and I see that the relevant section in the model block is this, right?

```stan
vector[M] alpha_di = alpha_d0 + alpha_di_raw * gamma_ad;
vector[N] dep_t_center = dep_t - alpha_di[I];
```


So, you want to use dep_t_center as the predictor. With a phantom latent variable like this you probably need to set some identification constraint with brms. Could you point me to your brms code? What parameter would be alpha_di? I might be able to see how to identify it with mi().

There is a group of us in the Netherlands who were interested in figuring this out as well. We have mostly reaction time data, so it might offer a nice robustness check across data types/models, and we are really motivated to figure this out in a program other than Mplus (we’ve done initial things in Mplus, so we have results to check against).

Hi, thanks for helping out!

The source code for my blog post draft is here. The first proper attempt at this model is in this section, where I first try (and fail with) the mi() syntax (here’s a link to the line in the source code).

I’m not sure if I understand the Stan syntax; alpha_di is the latent person-mean of “depression” in this example, right? That is called depb in my version of the brms syntax (the use of mi() here is not right, and probably the other lines that include it are also wrong):

```r
m4_latent_formula <- bf(
  urge ~ alpha + phi*(urge1 - alpha) + beta*(dep - mi(depb)),
  alpha ~ 1 + (1 | person),
  dep ~ 0 + mi(depb),
  depb | mi() ~ 1 + (1 | person),
  phi ~ 1 + (1 | person),
  beta ~ 1 + (1 | person),
  nl = TRUE
) +
  gaussian()
```


Does that help?

Hi @e.m.mccormick 👋, welcome to the Stan forum!

I think we’re (well @Mauricio_Garnier-Villarre and @simonbrauer are, I’m poking in the dark really 😆) making some progress on a brms implementation in this thread. As you’ll notice, several Stan implementations are already available, so you might be interested in checking those out. It would be great to make this work for your data as well! (But if your N per person is quite large the benefits of this over using simple observed means should be increasingly small.)

Coincidentally I am moving to the Netherlands soon, perhaps we can chat longitudinal models over coffee some time.


I think the solution is something like this, where the mean of depb is passed to the first regression. But I haven’t figured out how to pass alphadep from one bf() to be used in the other formula.

```r
m4_latent_formula <- bf(
  urge ~ alpha + phi*(urge1 - alpha) + beta*(dep - alphadep),
  alpha ~ 1 + (1 | person),
  dep ~ 0 + mi(depb),
  phi ~ 1 + (1 | person),
  beta ~ 1 + (1 | person),
  nl = TRUE
) +
  bf(
    depb | mi() ~ alphadep,
    alphadep ~ 1 + (1 | person),
    nl = TRUE
  ) +
  gaussian() + set_rescor(FALSE)
```


PS: I am also in the Netherlands, in Amsterdam, we can meet sometime


Thanks! I think I see how that might work.

I’ll try this out as well and see what I can do. I believe it is not possible to pass parameters between separate bf() formulas, so it might need some kind of a workaround. I’ll keep you posted.

Would be lovely to have a coffee and chat about Stan 👍. Is everyone here in the NL? 😆

You’ve got it right.

- `alpha_ui`: Person-specific mean of urge to smoke
- `alpha_di`: Person-specific mean of depressive symptoms
- `phi_i`: Person-specific autoregressive effect of latent-mean-centered urge to smoke
- `beta_i`: Person-specific effect of latent-mean-centered depressive symptoms

I’m not an experienced brms user. But conceptually it seems like you should be able to set up the same model for both urge and dep, just flipped (as in the formulas below). It’s not working for me, so obviously what is conceptually possible is not actually possible with such a simple switcheroo.

```r
m4_latent_formula <- bf(
  # urge model
  urge ~ alphaU + phiU*(urge1 - alphaU) + betaU*(dep - alphaD),
  alphaU ~ 1 + (1 | person),
  phiU ~ 1 + (1 | person),
  betaU ~ 1 + (1 | person),

  # dep model
  dep ~ alphaD + phiD*(dep1 - alphaD) + betaD*(urge - alphaU),
  alphaD ~ 1 + (1 | person),
  phiD ~ 1 + (1 | person),
  betaD ~ 1 + (1 | person),
  nl = TRUE
) +
  gaussian()
```


It seems like you could almost do it in double-long form, where each row is an outcome variable (urge vs. dep) nested in time nested in person. But even if that were possible, I don’t think it would solve the initial goal of having a simple brms solution to a complex model.

I think that’s exactly right. You could write that model out as a multivariate brms model, and in that case you cannot use parameters from one outcome’s model in the model for the other outcome.

And yes, if you stack the data into “double long”, you could probably use an index variable approach (or interactions) to get out the required terms. I’ll have to try it out!
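For concreteness, here is a small base-R sketch of that “double long” reshaping (the data values and column names just mirror the toy table earlier in the thread); whether brms can then pick out the right terms per outcome via an index variable is the open question:

```r
# Stack urge and dep into one outcome column 'y', with an index variable
# 'outcome' identifying which variable each row belongs to.
d <- data.frame(
  person = rep(1:2, each = 2),
  time   = rep(1:2, times = 2),
  urge   = c(0.34, -0.48, 1.65, 0.31),
  dep    = c(0.43, -0.68, 0.68, 1.49)
)
d_long <- reshape(
  d,
  direction = "long",
  varying   = c("urge", "dep"),
  v.names   = "y",
  timevar   = "outcome",
  times     = c("urge", "dep")
)
# d_long now has one row per outcome per time per person.
```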

I’ll look into this. Cheers!