Latent Mediation

Hey there!

I am just starting to get my feet wet with Stan and am running into a few issues. I am ultimately trying to build a model with 1 measured predictor X, 5 latent mediators M (each with 5 measured indicators), and 1 measured outcome Y. I have been working my way up through simpler models: first with all 5 mediators directly measured, and then with only 1 latent mediator. For each of those the chains converge, n_eff looks good, Rhat looks good, and the estimates are reasonable.

I start running into problems when I try to extend past 1 latent mediator. With 2 latent mediators the chains mix, but my mean estimates of the regressions between X and M and between M and Y, while often about the right magnitude, sometimes come out with the wrong sign. Once I try to go to 3 latent mediators, I cannot get the chains to mix at all.

So here is my R code simulating data for 2 latent mediators (note that the data are simulated using the lavaan package):

library(lavaan) # for simulateData()
library(rstan)  # for stan()

#################################################
# Simulate the Data
#################################################
{
  set.seed(1234)

  beta=c(0.59,0.36,0)    
  lambda = c(0.7, 0.7, 0.7, 0.7, 0.7)  
  
  N=1000 # Number of observations
  J=2   # Number of Mediators
  K=5   # Number of Indicators
  
  # SEM population model: 2 latent mediators with 5 indicators each
  pop_model = paste('
# Measurement models for the latent mediators
     M1 =~ ', lambda[1], '*I1.1 + 
           ', lambda[2], '*I1.2 + 
           ', lambda[3], '*I1.3 + 
           ', lambda[4], '*I1.4 + 
           ', lambda[5], '*I1.5
           
     M2 =~ ', lambda[1], '*I2.1 + 
           ', lambda[2], '*I2.2 + 
           ', lambda[3], '*I2.3 + 
           ', lambda[4], '*I2.4 + 
           ', lambda[5], '*I2.5

# Residuals
     X  ~~ 1*X
     
     I1.1 ~~ 1*I1.1
     I1.2 ~~ 1*I1.2
     I1.3 ~~ 1*I1.3
     I1.4 ~~ 1*I1.4
     I1.5 ~~ 1*I1.5
     
     I2.1 ~~ 1*I2.1
     I2.2 ~~ 1*I2.2
     I2.3 ~~ 1*I2.3
     I2.4 ~~ 1*I2.4
     I2.5 ~~ 1*I2.5

     Y  ~~ 1*Y
                           
# Regressions 
  M1 ~ ',beta[1],'*X
  M2 ~ ',beta[2],'*X

  Y ~ ',beta[3],'*X + 
     ',beta[1],'*M1 + 
     ',beta[2],'*M2  

')
  
  # Simulate the data 
  med_data = simulateData(pop_model, sample.nobs=N)
  
}

#################################################
# Run the Stan Model
#################################################
{
  # Prepare data for Stan
  
  # Split the simulated indicators into an N x K matrix for each mediator
  I1 <- array(dim = c(N, K))
  I2 <- array(dim = c(N, K))
  for (k in 1:K) {
    I1[, k] <- med_data[, paste0("I1.", k)]
    I2[, k] <- med_data[, paste0("I2.", k)]
  }
  
  stan_data <- list(N = N, J = J, K = K, X = med_data$X, Y = med_data$Y,
                    I1 = I1,
                    I2 = I2
                    )
  
  # Fit the model
  options(mc.cores = parallel::detectCores())
  fit <- stan(file = "med2BL_noreg_ex.stan", data = stan_data, iter = 2000, chains = 4)
}

And then here is my Stan code

data {
  int<lower=1> N;            // Number of observations
  int<lower=1> J;            // Number of mediators (passed from R; not used directly below)
  int<lower=1> K;            // Number of indicators per mediator (5 for both M1 and M2)
  matrix[N, K] I1;           // Indicator data for mediator 1
  matrix[N, K] I2;           // Indicator data for mediator 2
  vector[N] X;               // Independent variable
  vector[N] Y;               // Dependent variable
}

parameters {
  vector[K] alpha1;          // Intercepts for indicators of M1
  vector[K] alpha2;          // Intercepts for indicators of M2
  vector[K] lambda1;         // Loadings for indicators of M1
  vector[K] lambda2;         // Loadings for indicators of M2
  real beta_a1;             // Effect of X on M1
  real beta_a2;             // Effect of X on M2
  real beta_b1;             // Effect of M1 on Y
  real beta_b2;             // Effect of M2 on Y
  real beta_c;              // Direct effect of X on Y
  vector[N] eta1;            // Latent scores for M1
  vector[N] eta2;            // Latent scores for M2
  real<lower=0> sigma_Y;     // Std dev of Y
  real<lower=0> sigma_I1[K]; // Std dev of indicators of M1
  real<lower=0> sigma_I2[K]; // Std dev of indicators of M2
}

model {
  // Priors
  alpha1 ~ normal(0, 1);
  alpha2 ~ normal(0, 1);
  
  lambda1 ~ normal(0, 1);
  lambda2 ~ normal(0, 1);
  
  beta_a1 ~ normal(0, 1);
  beta_a2 ~ normal(0, 1);
  beta_b1 ~ normal(0, 1);
  beta_b2 ~ normal(0, 1);
  beta_c ~ normal(0, 1);
  
  sigma_Y ~ normal(0, 1);
  sigma_I1 ~ cauchy(0, 2);
  sigma_I2 ~ cauchy(0, 2);
  
  // Measurement models for M1 and M2
  for (k in 1:K) {
    I1[, k] ~ normal(alpha1[k] + lambda1[k] * eta1, sigma_I1[k]);
    I2[, k] ~ normal(alpha2[k] + lambda2[k] * eta2, sigma_I2[k]);
  }
  
  
  // Structural model
  eta1 ~ normal(beta_a1 * X, 1); // M1 influenced by X
  eta2 ~ normal(beta_a2 * X, 1); // M2 influenced by X
  Y ~ normal(beta_c * X + beta_b1 * eta1 + beta_b2 * eta2, sigma_Y); // Y influenced by X, M1, and M2
}

Any help would be greatly appreciated

Welcome to the Discourse. You might have more luck if you constrain the first loading for each latent variable to 1 and then estimate the latent variances. You can do this by declaring one fewer loading per mediator in the parameters block and then reconstructing the full vector of loadings (with the first element fixed to 1) in the transformed parameters block, as in the sketch below.
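
For your two-mediator model, that might look something like this (an untested sketch reusing your variable names; lambda1_free, lambda2_free, psi1, and psi2 are names I am introducing here for the free loadings and the now-estimated latent residual SDs):

parameters {
  // ... alpha1, alpha2, the betas, eta1, eta2, sigma_Y, sigma_I1, sigma_I2 as before ...
  vector[K-1] lambda1_free;   // Loadings for indicators 2..K of M1
  vector[K-1] lambda2_free;   // Loadings for indicators 2..K of M2
  real<lower=0> psi1;         // Residual SD of M1 (estimated instead of fixed to 1)
  real<lower=0> psi2;         // Residual SD of M2
}

transformed parameters {
  vector[K] lambda1 = append_row(1, lambda1_free);  // First loading fixed to 1
  vector[K] lambda2 = append_row(1, lambda2_free);
}

model {
  lambda1_free ~ normal(0, 1);
  lambda2_free ~ normal(0, 1);
  psi1 ~ normal(0, 1);
  psi2 ~ normal(0, 1);

  // ... other priors and the measurement model stay as you have them ...

  eta1 ~ normal(beta_a1 * X, psi1);
  eta2 ~ normal(beta_a2 * X, psi2);
  Y ~ normal(beta_c * X + beta_b1 * eta1 + beta_b2 * eta2, sigma_Y);
}

Fixing the first loading pins down the scale and sign of eta1 and eta2; right now the signs of the loadings and the latent scores can flip together without changing the likelihood, which is consistent with the correct-magnitude, wrong-direction estimates you are seeing.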

You might also look at blavaan. Since you’re using normal distributions for the likelihood throughout, you’re well set up to use the more efficient formulation that marginalizes out the latent variables. You could do this with your own Stan code of course, but blavaan makes it much easier.
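
If you want to sanity-check against blavaan, a minimal version would be something along these lines (an untested sketch; the chain and iteration settings are just placeholders):

library(blavaan)

med_model <- '
  M1 =~ I1.1 + I1.2 + I1.3 + I1.4 + I1.5
  M2 =~ I2.1 + I2.2 + I2.3 + I2.4 + I2.5

  M1 ~ X
  M2 ~ X
  Y  ~ X + M1 + M2
'

bfit <- bsem(med_model, data = med_data, target = "stan",
             n.chains = 4, burnin = 1000, sample = 1000)
summary(bfit)

The target = "stan" backend (the default in recent blavaan versions, as far as I know) works with the marginal likelihood, latent variables integrated out, which is the formulation I had in mind above.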

Thanks so much for the suggestion and the quick response. I will definitely try moving where the constraint is. I have used blavaan in the past for a similar model; this current code is just a first step with Stan before I dive into a much more complex model in which I will be using non-normal priors. I am just trying to get this simpler model working first so that I am only troubleshooting one problem at a time.
