Estimating a Latent Difference Score - change predicting change

Jesse_Fagan · May 29, 2020, 12:59pm

Hello all,

I have a very rich dataset composed of observational communications (e.g. email data), psychometric surveys, and individual outcomes (e.g. salary, performance reviews). All of it was recorded on two occasions 12 months apart. My hypotheses are centered on “change predicts change”. That is, a change in the communication network should be associated with a change in individual outcomes.

To estimate the effect of change in one variable on change in another variable from one occasion to the next, I’ve been using a Latent Difference Score (LDS) estimated with lavaan. This has served pretty well, but it has some substantial limitations. For one, it’s difficult to estimate interactions in structural equation models. I also want to estimate different effects for each division in the organization, which implies a multi-level model, but lavaan isn’t well-suited for this. For now I’ve been splitting the sample, but I don’t like that approach.

Here is a discussion of latent difference scores for further reading:
Gollwitzer, M., Christ, O., & Lemmer, G. 2014. Individual differences make a difference: On the use and the psychometric properties of difference scores in social psychology. European Journal of Social Psychology , 44(7): 673–682.

And this is a figure of the model. The latent “delta” is the estimated latent difference score.

Here is some simulated example data.

library(tibble)  # provides tibble()
set.seed(42)
N <- 100
x1 <- rnorm(N, mean = 3, sd = 1)
d <- tibble(id = 1:N, x1 = x1)
d$x2 <- d$x1 + rnorm(N, mean = 3.2, sd = 1)
d$sx <- NA_real_

In lavaan specifying this model is straightforward. The sx is the latent difference score (slope of x).

library(lavaan)
m1 <- '# set the autoregressive path to 1
x2 ~ 1 * x1
# define the latent variable
sx =~ 1 * x2
# means (no mean for the non-baseline exogenous covariates)
sx ~ 1
x1 ~ 1
x2 ~ 0
# exogenous covariances
x1 ~~ x1 + sx
sx ~~ sx
# disturbances set to 0
x2 ~~ 0 * x2
'
f1_lv <- lavaan(m1, d, fixed.x = FALSE)
summary(f1_lv)

This fits well. You can then estimate an LDS for each measure and build a structural model of the paths between the estimated LDSs to test the hypotheses.

> mean(predict(f1_lv)[,'sx'])
[1] 3.112516
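
Because the autoregressive path and the loading are both fixed to 1 and the x2 disturbance is fixed to 0, the model above implies x2 = x1 + sx exactly. With a single indicator per occasion, the latent difference should therefore coincide with the observed difference. A minimal base R check on the same simulation, offered as a sketch (the name dx is introduced here for illustration and is not part of the lavaan model):

```r
# Same simulation as above, written with base R only
set.seed(42)
N <- 100
x1 <- rnorm(N, mean = 3, sd = 1)
x2 <- x1 + rnorm(N, mean = 3.2, sd = 1)

# Observed difference score (illustrative name, not from the model)
dx <- x2 - x1

# With a zero disturbance on x2, these sample moments correspond to the
# sx intercept, the sx variance, and the x1 ~~ sx covariance
# (ML estimates divide by N rather than N - 1, so small differences are expected)
c(mean = mean(dx), var = var(dx), cov_x1 = cov(x1, dx))
```

If that reading is right, the latent variable only adds something over a plain difference score once each occasion has several indicators, so that measurement error can be separated from true change.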

When I try to translate this into brms, I don’t have the same success.

library(brms)
f1_formula <- bf(x2 ~ mi(sx) + x1 + 0) +
  bf(x1 ~ 1) +
  bf(sx | mi() ~ 1 + x1) +
  set_rescor(rescor = FALSE)
get_prior(f1_formula, data = d)
f1_prior <- prior(constant(1), resp = 'x2', coef = 'x1') +
  prior(constant(1), resp = 'x2', coef = 'misx') +
  prior(normal(0, 0.001), resp = 'x2', class = 'sigma')
f1 <- brm(formula = f1_formula,
          prior = f1_prior,
          data = d)
mcmc_plot(f1)

I’m asking if anyone has a good idea of how to translate the lavaan code to brms (or RStan; if I have to learn the low-level code, I guess this is a good time to get started).

Alternatively, is there a better way to examine these hypotheses? Does change predict change?

For instance here’s another simulated dataset of two variables. In this case the change in x impacts the change in y. A bigger change in x causes a bigger change in y. How would you model something like this?

Thank you much for reading!

library(tidyverse)  # tibble, dplyr, tidyr, purrr, stringr
set.seed(42)
# 100 subjects
N <- 100
d <- tibble(id = 1:N, 
            x1 = rnorm(N, mean = 3, sd = 1)) %>% 
  mutate(x2 = x1 + rnorm(N, mean = 1, sd = 1)) %>% 
  mutate(xdiff = x2 - x1) %>% 
  mutate(y1 = rnorm(N, mean = 3, sd = 1)) %>% 
  mutate(y2 = y1 + rnorm(N, mean = xdiff, sd = 1)) %>% 
  select(-xdiff) %>% 
  gather('k', 'value', -id) %>% 
  mutate(variable = map_chr(str_split(k, pattern = ''), 1)) %>% 
  mutate(time = map_chr(str_split(k, pattern = ''), 2)) %>% 
  select(-k) %>% 
  spread(variable, value)

> glimpse(d)
Observations: 200
Variables: 4
$ id   <int> 1, 1, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6, 7, 7, 8, 8, 9, 9, 10, 10, 11, 11, 1…
$ time <chr> "1", "2", "1", "2", "1", "2", "1", "2", "1", "2", "1", "2", "1", "2", "…
$ x    <dbl> 4.370958, 6.571924, 2.435302, 4.480053, 3.363128, 3.359920, 3.632863, 6…
$ y    <dbl> 0.9990708, 3.1954154, 3.3337772, 6.1387705, 4.1713251, 4.2071074, 5.059…
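
One minimal way to look at this second simulation is to regress the observed change in y on the observed change in x. The sketch below assumes observed difference scores are acceptable, which means it ignores measurement error; the names wide, dx, dy, and fit_change are illustrative. It rebuilds the data in wide form with base R rather than reusing the long tibble above.

```r
# Rebuild the simulated data in wide form (base R only)
set.seed(42)
N <- 100
x1 <- rnorm(N, mean = 3, sd = 1)
x2 <- x1 + rnorm(N, mean = 1, sd = 1)
y1 <- rnorm(N, mean = 3, sd = 1)
y2 <- y1 + rnorm(N, mean = x2 - x1, sd = 1)

# Observed change scores for each subject
wide <- data.frame(dx = x2 - x1, dy = y2 - y1)

# Change-on-change regression: the dx slope estimates how much
# y changes per unit change in x
fit_change <- lm(dy ~ dx, data = wide)
coef(fit_change)
```

The data-generating process sets the true slope to 1 and the true intercept to 0, so the dx coefficient should land near 1 up to sampling noise.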

andrjohns · May 29, 2020, 1:41pm

Have you looked into the blavaan package? That will let you use lavaan syntax, but run the models with Stan on the back end.

Jesse_Fagan · May 31, 2020, 9:23am

Thanks for the tip! I checked it out and it looks like a really great package.

I don’t know that it solves the problem though. Many of the challenges are still there. I need to be able to compare the effects across multiple groups and introduce some interaction effects. SEM is very clunky and convoluted when it comes to that, which is why I was hoping for a solution using RStan or brms.

I think I may rephrase my question again with another post but leave lavaan and latent variable estimation out of it. I really want to focus on two-occasion / two-wave data and estimates of how change in one variable impacts change in another variable.
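
As a rough illustration of the group comparison described here, a change-on-change regression can let the slope differ by division through an interaction term. Everything in this sketch is assumed for illustration: the division labels, the per-division slopes in true_slope, and the simulated data are hypothetical and not taken from the real dataset.

```r
# Hypothetical data: three divisions with different change-on-change slopes
set.seed(1)
N <- 300
division <- factor(sample(c("A", "B", "C"), N, replace = TRUE))
true_slope <- c(A = 0.2, B = 0.6, C = 1.0)  # assumed values
dx <- rnorm(N, mean = 1, sd = 1)
dy <- unname(true_slope[as.character(division)]) * dx + rnorm(N, sd = 1)
dat <- data.frame(division, dx, dy)

# The interaction gives each division its own slope (no pooling)
fit_div <- lm(dy ~ dx * division, data = dat)

# Division A is the reference level; B and C add their interaction terms
b <- coef(fit_div)
slopes <- c(A = b[["dx"]],
            B = b[["dx"]] + b[["dx:divisionB"]],
            C = b[["dx"]] + b[["dx:divisionC"]])
slopes
```

This estimates each division separately. A multilevel version would instead treat the slopes as varying effects, for example with a brms formula term such as (1 + dx | division), which partially pools divisions toward a common slope.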

Thank you!

erognli · May 31, 2020, 6:31pm

I have coded an analysis using latent difference scores in Stan, with the latent difference scores serving as a predictor in an ordinal regression model. Code is available here: https://osf.io/7xt86/

Might not be completely relevant to your case, my application was for informant discrepancies. But I’d be happy to have a look at your code if you give it a go directly in Stan.

abartonicek · May 31, 2020, 9:35pm

I know pretty much nothing about SEMs, but wouldn’t a simple model predicting individual outcomes from time, communication, and a time-by-communication interaction also answer your question? It’d be easier to fit and to expand with random effects. I’d imagine that estimating both non-linear effects and random effects at the same time might lead to some very weird posterior geometry, and may require very strong assumptions/priors.
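
A minimal sketch of that kind of model, using the second simulated dataset from the original post rebuilt in long form with base R. The names long and fit_long are illustrative, and the plain lm() call is a simplification: it ignores that each subject contributes two rows, which a random intercept per id would address.

```r
# Rebuild the two-wave simulation and stack it into long form
set.seed(42)
N <- 100
x1 <- rnorm(N, mean = 3, sd = 1)
x2 <- x1 + rnorm(N, mean = 1, sd = 1)
y1 <- rnorm(N, mean = 3, sd = 1)
y2 <- y1 + rnorm(N, mean = x2 - x1, sd = 1)

long <- data.frame(
  id   = rep(seq_len(N), times = 2),
  time = factor(rep(c("1", "2"), each = N)),
  x    = c(x1, x2),
  y    = c(y1, y2)
)

# Outcome predicted from time, communication (x), and their interaction
fit_long <- lm(y ~ time * x, data = long)
coef(fit_long)
```

Note that the time2:x coefficient measures how the x-y association differs between the two occasions, which is related to, but not the same as, a slope of change in y on change in x.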

Jesse_Fagan · June 1, 2020, 10:58am

That’s very interesting. I’ll look through it. I’m still getting used to Stan syntax, so it may be some time before I can parse this enough to understand what it’s doing. Thank you!

erognli · June 1, 2020, 1:12pm

Glad it might be useful! Feel free to ask for explanations of the code here or by PM. I’m sure it’s not written in the most comprehensible way. I’ve gotten tons of help from this community, so it feels nice to be able to pay the favor forward.
