Identification problems in a one-factor latent variable model

I am fitting a simple, one-factor latent variable model in Stan and comparing the results to the output of lavaan, but I am running into problems estimating the model under different identification constraints: in particular, setting the latent variable to be standard normal and estimating all factor loadings and intercepts.

Consider J measured variables y_{ij} for observations i = 1 ... N and variables j = 1 ... J, each with its own intercept parameter \alpha_j and factor loading \beta_j on latent variable x:

y_{ij} = \alpha_j + \beta_j * x_i + \epsilon_{ij}, where \epsilon_{ij} \sim N(0, \sigma_j)

where x is normally distributed:

x_i \sim N(\mu_x, \sigma_x)

The residual covariances between the y_j are assumed to be zero.

The most common constraint to ensure identification is to fix one of the \beta_j to 1, fix the location of the latent variable to 0, and estimate the variance of the latent variable. I have implemented this in Stan, and it works well:

data{
  int<lower=0> N;
  int<lower=0> J;
  matrix[N, J] y;
}

parameters{
  vector[J] alpha;
  vector[J-1] beta;
  vector<lower=0>[J] sigma;
  vector[N] x;
  real<lower=0> x_sigma;
}

transformed parameters{
  // the fixed parameter: first loading pinned to 1 for identification
  vector[J] beta_fix;
  beta_fix[1] = 1;
  for(j in 1:J-1){
    beta_fix[j+1] = beta[j];
  }
}

model{
  matrix[N,J] mu;
  x_sigma ~ normal(0, 1);
  alpha ~ normal(0, 10);
  beta ~ normal(0, 1);
  sigma ~ normal(0, 1);

  // latent variable
  x ~ normal(0, x_sigma);

  for(i in 1:N){
    for(j in 1:J){
      mu[i,j] = alpha[j] + beta_fix[j] * x[i];
      y[i,j] ~ normal(mu[i,j], sigma[j]);
    }
  }
}

Running this model on some data from the lavaan package, with both lavaan and Stan:

require(lavaan)
data("HolzingerSwineford1939")

# lavaan model
HS.model <- 'visual  =~ x1 + x2 + x3'

lavaan_fit1 <- cfa(HS.model, data=HolzingerSwineford1939, meanstructure=TRUE)

summary(lavaan_fit1)

# the Stan model
require(rstan)
y <- as.matrix(HolzingerSwineford1939[,7:9])
N <- dim(y)[1]
J <- dim(y)[2]

stan_data <- list(
  y = y, N = N, J = J
)

# fixed indicator/beta coefficient model
stan_fit_ff <- stan(file = "Stan-LV-fixed-indicator.stan", 
                 data = stan_data)

print(stan_fit_ff, pars=c("alpha","beta","sigma","x_sigma"))

A different parameterization is to fix both the latent variable’s location to 0 and its scale to 1, and to estimate all of the indicator coefficients.

I have tried implementing this in Stan, but the \beta_j coefficients rarely converge, with effective sample sizes often as low as 2. Occasionally the chains do converge and I get estimates identical to lavaan's. The model code and the R code to run it are below.

data{
  int<lower=0> N;
  int<lower=0> J;
  matrix[N, J] y;
}

parameters{
  vector[J] alpha;
  vector[J] beta;
  vector<lower=0>[J] sigma;
  vector[N] x;
}

transformed parameters{
  // no fixed parameters
}

model{
  matrix[N,J] mu;
  alpha ~ normal(0, 10);
  beta ~ normal(0, 1);
  sigma ~ normal(0, 1);

  // latent variable
  x ~ normal(0, 1);

  for(i in 1:N){
    for(j in 1:J){
      mu[i,j] = alpha[j] + beta[j] * x[i];
      y[i,j] ~ normal(mu[i,j], sigma[j]);
    }
  }
}

require(lavaan)
data("HolzingerSwineford1939")

# lavaan model
HS.model <- 'visual  =~ x1 + x2 + x3'

lavaan_fit2 <- cfa(HS.model, data=HolzingerSwineford1939, meanstructure=TRUE, std.lv=TRUE)

summary(lavaan_fit2)

# the Stan model
require(rstan)
y <- as.matrix(HolzingerSwineford1939[,7:9])
N <- dim(y)[1]
J <- dim(y)[2]

stan_data <- list(
  y = y, N = N, J = J
)

# standardized latent variable
stan_fit_std <- stan(file = "Stan-LV-std.stan", 
                 data = stan_data)

print(stan_fit_std, pars=c("alpha","beta","sigma"))

Does anyone know why the second model is not converging well, and what I could do about it?

Thanks for your help!

Hello!

I took a look at your problem, but I have no obvious answer. From my experience with multiple factors, I know that constraining the loading matrix is unavoidable, but I am not well versed in factor analysis in general.

I would suggest trying blavaan, which will produce Stan code you can inspect (if you set the sampler to “stan” and the file-writing option to TRUE).
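Something like this should do it; this is just a sketch, and I believe the relevant bcfa() arguments are target and mcmcfile, but check the blavaan documentation for your version:

require(blavaan)

data("HolzingerSwineford1939", package = "lavaan")

# same one-factor model, fitted through blavaan's Stan backend;
# mcmcfile = TRUE should write the generated Stan program to disk
HS.model <- 'visual =~ x1 + x2 + x3'

blavaan_fit <- bcfa(HS.model, data = HolzingerSwineford1939,
                    target = "stan", mcmcfile = TRUE)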

Lucas

I would try to fix the sign of the first beta with something like

parameters{
  vector[J] alpha;
  real<lower = 0> beta_fix;
  vector[J-1] beta_unfix;
  vector<lower=0>[J] sigma;
  vector[N] x;
}

transformed parameters{
  // the constrained parameter: first loading forced positive
  vector[J] beta;
  beta[1] = beta_fix;
  for(j in 1:J-1){
    beta[j+1] = beta_unfix[j];
  }
}

EDIT : removed x_sigma

Thank you @ldeschamps!

I had actually forgotten that some programs instead constrain the first \beta_j to be positive. When I do this in JAGS and run the models above, I get the same results as lavaan with std.lv = TRUE.

JAGS code:

model{
  for(i in 1:N){
    for(j in 1:J){
      y[i,j] ~ dnorm(mu[i,j], 1/sigma[j]^2)
      mu[i,j] <- alpha[j] + beta[j] * x[i]
    }

    # latent variable
    x[i] ~ dnorm(0, 1)
  }

  # keeping beta[1] positive
  beta[1] ~ dnorm(0, .001)T(0,)

  for(j in 2:J){
    beta[j] ~ dnorm(0, .001)
  }
  for(j in 1:J){
    alpha[j] ~ dnorm(0, 1/10^2)
    sigma[j] ~ dunif(0, 50)
  }
}

However, trying to do this in Stan doesn’t work either…

data{
  int<lower=0> N;
  int<lower=0> J;
  matrix[N, J] y;
}

parameters{
  vector[J] alpha;
  real<lower=0> beta1; // the positivity-constrained loading, kept as a separate parameter for clarity
  vector[J-1] beta; // the non-constrained betas
  vector<lower=0>[J] sigma;
  vector[N] x;
}

transformed parameters{
  // no fixed parameters
}

model{
  matrix[N,J] mu;
  alpha ~ normal(0, 10);
  beta1 ~ normal(0, 10);
  beta ~ normal(0, 10);
  sigma ~ normal(0, 5);

  // latent variable
  x ~ normal(0, 1);

  for(i in 1:N){
    mu[i,1] = alpha[1] + beta1 * x[i];
    mu[i,2] = alpha[2] + beta[1] * x[i];
    mu[i,3] = alpha[3] + beta[2] * x[i];

    for(j in 1:J){
      y[i,j] ~ normal(mu[i,j], sigma[j]);
    }
  }
}

Whether fixing the first loading to be positive identifies the model well depends strongly on the relationship between the latent variable and y1. E.g., if that loading is near zero, then the latent variable contributes almost nothing to the likelihood of y1, and the other loadings can freely flip between negative and positive values.

A more stable approach is to force all of the betas to be positive, and to use a method factor for anything that is negatively scaled. I.e., start from theory: if your indicators are positively scaled, then force their betas to be positive. If some indicators are negatively scaled, then reverse them, force their loadings to be positive, and add a method factor to those items. A sketch of the all-positive constraint follows.
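In Stan, the all-positive constraint is just a lower bound on the whole loading vector. A minimal sketch, adapting the standardized-latent-variable model from above (note that the normal(0, 1) prior on beta becomes a half-normal under the constraint):

data{
  int<lower=0> N;
  int<lower=0> J;
  matrix[N, J] y;
}

parameters{
  vector[J] alpha;
  vector<lower=0>[J] beta;   // all loadings constrained positive
  vector<lower=0>[J] sigma;
  vector[N] x;
}

model{
  alpha ~ normal(0, 10);
  beta ~ normal(0, 1);       // half-normal, given the lower bound
  sigma ~ normal(0, 1);

  // standardized latent variable
  x ~ normal(0, 1);

  for(i in 1:N){
    for(j in 1:J){
      y[i,j] ~ normal(alpha[j] + beta[j] * x[i], sigma[j]);
    }
  }
}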


Thanks @Stephen_Martin! I tried the all-positive loading constraint a few days ago and it worked great. It's interesting, because I have rarely seen that constraint: most papers I have read about identification/rotational invariance in latent variable models constrain a single loading to be positive in the standardized-latent-variable parameterization. Thanks again!

@Stephen_Martin Could you possibly clarify how one would implement this?

There are many ways, and ‘method factor’ should be a searchable term.

But one way could be as simple as literally imposing an additional latent factor on the negative-worded items, just to ‘control’ for any exogenous wording-direction effect.
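As a hypothetical sketch in Stan: assume the reverse-coded items have already been reflected and sit in the first J_neg columns of y. J_neg, lambda, and m are names invented for this example, and the method factor is assumed orthogonal to the substantive factor.

data{
  int<lower=0> N;
  int<lower=0> J;
  int<lower=0, upper=J> J_neg;   // number of (reflected) reverse-coded items, placed first in y
  matrix[N, J] y;
}

parameters{
  vector[J] alpha;
  vector<lower=0>[J] beta;        // substantive loadings, all positive
  vector<lower=0>[J_neg] lambda;  // method-factor loadings, positive to pin that factor's sign too
  vector<lower=0>[J] sigma;
  vector[N] x;                    // substantive factor
  vector[N] m;                    // method factor
}

model{
  alpha ~ normal(0, 10);
  beta ~ normal(0, 1);
  lambda ~ normal(0, 1);
  sigma ~ normal(0, 1);

  x ~ normal(0, 1);
  m ~ normal(0, 1);   // assumed orthogonal to x

  for(i in 1:N){
    for(j in 1:J){
      real mu_ij = alpha[j] + beta[j] * x[i];
      if(j <= J_neg){
        mu_ij += lambda[j] * m[i];   // method factor only hits the reverse-coded items
      }
      y[i,j] ~ normal(mu_ij, sigma[j]);
    }
  }
}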