# How to deal with within-chain sign flipping?

I’m having chains that flip back and forth between signs (or, at least, they look like that). As far as I can tell, my model is identified so there shouldn’t be issues on that front. How can I ensure that my chains will be stable without imposing sign constraints?

I’m estimating a hybrid logistic regression with individual-level effects. The model has two components: a CFA that generates latent variables, and a logistic regression that uses those latent variables as predictors (together with other variables). The individual-level effects are introduced through the latent variables x_i^* in the intercepts of v_i. (Note that, in the actual code, I’m setting one element in \alpha to zero so the model is identified. This means that one row of \Gamma is also equal to zero.)

The model looks something like this:

\begin{align} y_i &\sim \mathtt{categorical\_logit}(v_i) \\ v_i &= \alpha_i + X_i \beta \\ \alpha_i &= \bar{\alpha} + \Gamma x_i^* \\ \therefore v_i &= \left(\bar{\alpha} + X_i \beta\right) + \left(\Gamma x_i^*\right) \\ I_i &= D x_i^* + \eta_i \\ x_i^* &\sim N(0, \Omega) \end{align}

where i indexes the decision maker, y_i are observed choices made by I, I_i are de-meaned psychometric indicators. Matrix D contains factor loadings and \eta_i are measurement errors.

Most parameters look OK except for \Gamma. Here are the chains for one element:

To generate \alpha_i I used a standard normal variable that I then transformed using the covariance matrix (in a nutshell, z ~ std_normal() and \Gamma x_i^* is z * L_Omega * Gamma). Those variables also looks strange:

Oddly, the chains for \alpha_i mix properly.

I think the math is correct and the model is identified, so there must be some kind of issue with sign-switching between z and What can I do to make sure that my chains are stable?

Here’s the code:

data {
int<lower=1> N; // n of observations (number of individuals times number of choice scenarios per individual)

int<lower=2> J; // n of alternatives
int<lower=1> K; // n of choice covariates
int<lower=0,upper=J> choices[N]; // choices
matrix[J,K] X[N]; // [N, J, K] array of choice attributes
int task_individual[N]; // for each choice situation N, who was the corresponding individual?

int<lower=1> I; // n of individuals
int<lower=1> M; // n of factors/latent variables
int<lower=1> P; // n of observed indicators
vector[P] Ind[I]; // matrix[N,P] of indicators matrix[I,P] Ind; // matrix[N,P] of indicators
int<lower=0, upper=1> D_skeleton[P,M]; // which LVs are related to which indicators? 1 if yes, 0 if not
int Dvec_length; // equal to sum(D_skeleton)
}

transformed data {
matrix[I,P] Ind_center;
matrix[P,P] Ind_scatter;

int constrained_loading[M] = find_firstLoading(P, M, D_skeleton); // for each LV, the index of the indicator that will be constrained to always be positive

for(p in 1:P) {
Ind_center[,p] = Ind[,p] - mean(Ind[,p]);
}

Ind_scatter = Ind_center' * Ind_center; // observed covariance of indicators
}

parameters {
vector[J-1] alpha_raw; // J-1 free alphas, one alpha is always = 0
vector[K] beta;
matrix[J-1,M] Gamma_raw; // J-1 free rows, one is always = 0

matrix[I,M] z; // individual-level random disturbances of latent variables

vector[Dvec_length] Dvec; // factor loadings in vector form that will then be converted into a matrix
vector<lower=0>[P] psi; // variance of indicators
cholesky_factor_corr[M] L_Omega; // cholesky factor of the covariance of LVs. (Actually a correlation so the variance of the LVs = 1)
}

transformed parameters {
matrix[I,J] alpha_indiv = append_col(
rep_vector(0, I),
rep_matrix(alpha_raw', I) + z * L_Omega * Gamma_raw'
); // alpha_indiv is equal to the mean alpha + Gamma * LV, where LV = L_Omega * z[i]

matrix[P,M] D_raw = Dvec_to_D(P, M, D_skeleton, Dvec); // convert to a matrix of factor loadings. See the function below.
}

model {
matrix[P, P] Ind_cov = add_diag(tcrossprod(D_raw * L_Omega), psi);

alpha_raw ~ normal(0, 2);
beta ~ normal(0, 2);
to_vector(Gamma_raw) ~ normal(0, 1);

to_vector(z) ~ std_normal();

Dvec ~ normal(1, 2);
psi ~ inv_gamma(1, 2);
L_Omega ~ lkj_corr_cholesky(4);

Ind_scatter ~ wishart(I, Ind_cov);

for (i in 1:N) {
choices[i] ~ categorical_logit(alpha_indiv[task_individual[i]]' + X[i] * beta);
}
}

generated quantities {
matrix[M,M] Omega;
matrix[P,M] D;

Omega = multiply_lower_tri_self_transpose(L_Omega);

// Deal with sign flipping
D = D_raw;
for (m in 1:M) {