How to deal with within-chain sign flipping?

tomas.rossetti · November 7, 2022, 7:30pm

I’m having chains that flip back and forth between signs (or, at least, they look like that). As far as I can tell, my model is identified so there shouldn’t be issues on that front. How can I ensure that my chains will be stable without imposing sign constraints?

I’m estimating a hybrid logistic regression with individual-level effects. The model has two components: a CFA that generates latent variables, and a logistic regression that uses those latent variables as predictors (together with other variables). The individual-level effects are introduced through the latent variables x_i^* in the intercepts of v_i. (Note that, in the actual code, I’m setting one element in \alpha to zero so the model is identified. This means that one row of \Gamma is also equal to zero.)

The model looks something like this:

\begin{align} y_i &\sim \mathtt{categorical\_logit}(v_i) \\ v_i &= \alpha_i + X_i \beta \\ \alpha_i &= \bar{\alpha} + \Gamma x_i^* \\ \therefore v_i &= \left(\bar{\alpha} + X_i \beta\right) + \left(\Gamma x_i^*\right) \\ I_i &= D x_i^* + \eta_i \\ x_i^* &\sim N(0, \Omega) \end{align}

where i indexes the decision maker, y_i are observed choices made by I, I_i are de-meaned psychometric indicators. Matrix D contains factor loadings and \eta_i are measurement errors.

Most parameters look OK except for \Gamma. Here are the chains for one element:

To generate \alpha_i I used a standard normal variable that I then transformed using the covariance matrix (in a nutshell, z ~ std_normal() and \Gamma x_i^* is z * L_Omega * Gamma). Those variables also looks strange:

Oddly, the chains for \alpha_i mix properly.

I think the math is correct and the model is identified, so there must be some kind of issue with sign-switching between z and What can I do to make sure that my chains are stable?

Here’s the code:

data {
  int<lower=1> N; // n of observations (number of individuals times number of choice scenarios per individual)
  
  int<lower=2> J; // n of alternatives
  int<lower=1> K; // n of choice covariates
  int<lower=0,upper=J> choices[N]; // choices
  matrix[J,K] X[N]; // [N, J, K] array of choice attributes
  int task_individual[N]; // for each choice situation N, who was the corresponding individual?
  
  int<lower=1> I; // n of individuals
  int<lower=1> M; // n of factors/latent variables
  int<lower=1> P; // n of observed indicators
  vector[P] Ind[I]; // matrix[N,P] of indicators matrix[I,P] Ind; // matrix[N,P] of indicators
  int<lower=0, upper=1> D_skeleton[P,M]; // which LVs are related to which indicators? 1 if yes, 0 if not
  int Dvec_length; // equal to sum(D_skeleton)
}

transformed data {
  matrix[I,P] Ind_center;
  matrix[P,P] Ind_scatter;

  int constrained_loading[M] = find_firstLoading(P, M, D_skeleton); // for each LV, the index of the indicator that will be constrained to always be positive

  for(p in 1:P) {
    Ind_center[,p] = Ind[,p] - mean(Ind[,p]);
  }

  Ind_scatter = Ind_center' * Ind_center; // observed covariance of indicators
}

parameters {
  vector[J-1] alpha_raw; // J-1 free alphas, one alpha is always = 0
  vector[K] beta;
  matrix[J-1,M] Gamma_raw; // J-1 free rows, one is always = 0
  
  matrix[I,M] z; // individual-level random disturbances of latent variables
  
  vector[Dvec_length] Dvec; // factor loadings in vector form that will then be converted into a matrix
  vector<lower=0>[P] psi; // variance of indicators
  cholesky_factor_corr[M] L_Omega; // cholesky factor of the covariance of LVs. (Actually a correlation so the variance of the LVs = 1)
}

transformed parameters {
  matrix[I,J] alpha_indiv = append_col(
    rep_vector(0, I),
    rep_matrix(alpha_raw', I) + z * L_Omega * Gamma_raw'
    ); // alpha_indiv is equal to the mean alpha + Gamma * LV, where LV = L_Omega * z[i]
  
  matrix[P,M] D_raw = Dvec_to_D(P, M, D_skeleton, Dvec); // convert to a matrix of factor loadings. See the function below.
}

model {
  matrix[P, P] Ind_cov = add_diag(tcrossprod(D_raw * L_Omega), psi);
  
  alpha_raw ~ normal(0, 2);
  beta ~ normal(0, 2);
  to_vector(Gamma_raw) ~ normal(0, 1);
  
  to_vector(z) ~ std_normal();
  
  Dvec ~ normal(1, 2);
  psi ~ inv_gamma(1, 2);
  L_Omega ~ lkj_corr_cholesky(4);
  
  Ind_scatter ~ wishart(I, Ind_cov);
  
  for (i in 1:N) {
    choices[i] ~ categorical_logit(alpha_indiv[task_individual[i]]' + X[i] * beta);
  }
}

generated quantities {
  matrix[M,M] Omega;
  matrix[P,M] D;

  Omega = multiply_lower_tri_self_transpose(L_Omega);
  
  // Deal with sign flipping
  D = D_raw;
  for (m in 1:M) {
    if (D_raw[constrained_loading[m],m] < 0) {
      D[,m] = -1 * D_raw[,m];
      Omega[m] = -1 * Omega[m];
      Omega[,m] = -1 * Omega[,m];
    }
  }
}

Topic		Replies	Views
Artifacts of sign-fixing in models with sign ambiguity Modeling	9	1115	May 14, 2018
Latent factor loadings Modeling fitting-issues , specification	20	3548	January 23, 2021
Chains don't mix where I think they should Modeling	2	48	September 8, 2024
Troubleshooting chain mixing Modeling	2	985	June 16, 2021
Warning : Chains in stan are not mixed Modeling rstan , fitting-issues , specification	2	562	August 16, 2020

How to deal with within-chain sign flipping?

Related topics