The issue of adding random slopes in non-central parameterization

zrn · February 5, 2024, 8:38am

Hello everyone! I’ve encountered some issues while using rstan. My model is an intermediate model with two crossed random factors. Upon recommendation, I employed the non-central parameterization method to improve the fitting performance and avoid warnings. Initially, I constructed a model with only random intercepts (code provided below), and at this point, all coefficients appeared normal. However, when I added random slopes to this model, none of the coefficients for fixed effects were significant. I’m wondering why this is happening and how I can improve it. Any help would be greatly appreciated.

randomalpha="data{
int<lower=1> N;
vector[N] c1;
vector[N] c2;
vector[N] aq;
vector[N] cr;
vector[N] adt;
int<lower=1> J;
int<lower=1, upper=J> Group[N];
int<lower=1> G;
int<lower=1, upper=G> Group2[N];
}
parameters{
real alphaaq;
real beta1aq;
real beta2aq;
real<lower=0> sigma_alphaaq;
real<lower=0> sigma_beta1aq;
real<lower=0> sigma_beta2aq;
real<lower=0> sigmaaq;
real alphacr;
real beta1cr;
real beta2cr;
real<lower=0> sigma_alphacr;
real<lower=0> sigma_beta1cr;
real<lower=0> sigma_beta2cr;
real<lower=0> sigmacr;
real alphaadt;
real beta1adt;
real beta2adt;
real beta3adt;
real beta4adt;
real<lower=0> sigma_alphaadt;
real<lower=0> sigma_beta1adt;
real<lower=0> sigma_beta2adt;
real<lower=0> sigma_beta3adt;
real<lower=0> sigma_beta4adt;
real<lower=0> sigma_adt;
cholesky_factor_corr[3] Omega_chol1;
cholesky_factor_corr[3] Omega_chol2;
vector<lower=0>[3] sigma_ranef1;
vector<lower=0>[3] sigma_ranef2;
matrix[J,3] gamma1_raw; // random effects for id
matrix[G,3] gamma2_raw; // random effects for topic
vector[3] mu1;
vector[3] mu2;
}
transformed parameters{
vector[J] gamma1alphaaq;
vector[J] gamma1alphacr;
vector[J] gamma1alpha;
vector[G] gamma2alphaaq;
vector[G] gamma2alphacr;
vector[G] gamma2alpha;
vector[N] mu_adt;
vector[N] mu_aq;
vector[N] mu_cr;
matrix[J,3] gamma1 = rep_matrix(mu1’, J) + gamma1_raw * diag_pre_multiply(sigma_ranef1, Omega_chol1)‘;
matrix[G,3] gamma2 = rep_matrix(mu2’, G) + gamma2_raw * diag_pre_multiply(sigma_ranef2, Omega_chol2)';
matrix[3,3] DC1 = diag_pre_multiply(sigma_ranef1, Omega_chol1);
matrix[3,3] DC2 = diag_pre_multiply(sigma_ranef2, Omega_chol2);
for (j in 1:J){
gamma1alphaaq[j] = gamma1[j,1];
gamma1alphacr[j] = gamma1[j,2];
gamma1alpha[j]= gamma1[j,3];
}
for (g in 1:G) {
gamma2alphaaq[g] = gamma2[g,1];
gamma2alphacr[g] = gamma2[g,2];
gamma2alpha[g]= gamma2[g,3];
}
for (n in 1:N) {
mu_aq[n] = alphaaq + gamma1alphaaq[Group[n]] + gamma2alphaaq[Group2[n]]+
(beta1aq) * c1[n] +
(beta2aq) * c2[n];
mu_cr[n] = alphacr + gamma1alphacr[Group[n]] + gamma2alphacr[Group2[n]]+
(beta1cr) * c1[n] +
(beta2cr) * c2[n];
mu_adt[n] = alphaadt + gamma1alpha[Group[n]] + gamma2alpha[Group2[n]]+
(beta1adt) * c1[n] +
(beta2adt) * c2[n] +
(beta3adt) * aq[n] +
(beta4adt) * cr[n];
}
}model{
sigma_alphaaq ~ student_t(5, 0, 2);
sigma_beta1aq ~ student_t(5, 0, 2);
sigma_beta2aq ~ student_t(5, 0, 2);
alphaaq ~ normal(0, sigma_alphaaq);
beta1aq ~ normal(0, sigma_beta1aq);
beta2aq ~ normal(0, sigma_beta2aq);
sigmaaq ~ student_t(5, 0, 2);
sigma_alphacr ~ student_t(5, 0, 2);
sigma_beta1cr ~ student_t(5, 0, 2);
sigma_beta2cr ~ student_t(5, 0, 2);
alphacr ~ normal(0, sigma_alphacr);
beta1cr ~ normal(0, sigma_beta1cr);
beta2cr ~ normal(0, sigma_beta2cr);
to_vector(gamma1_raw) ~ normal(0, 1);
to_vector(gamma2_raw) ~ normal(0, 1);
sigmacr ~ student_t(5, 0, 2);
sigma_alphaadt ~ student_t(5, 0, 2);
sigma_beta1adt ~ student_t(5, 0, 2);
sigma_beta2adt ~ student_t(5, 0, 2);
sigma_beta3adt ~ student_t(5, 0, 2);
sigma_beta4adt ~ student_t(5, 0, 2);
alphaadt ~ normal(0, sigma_alphaadt);
beta1adt ~ normal(0, sigma_beta1adt);
beta2adt ~ normal(0, sigma_beta2adt);
beta3adt ~ normal(0, sigma_beta3adt);
beta4adt ~ normal(0, sigma_beta4adt);
sigma_adt ~ student_t(5, 0, 2);
sigma_ranef1 ~ student_t(5, 0, 2);
sigma_ranef2 ~ student_t(5, 0, 2);
Omega_chol1 ~ lkj_corr_cholesky(2.0);
Omega_chol2 ~ lkj_corr_cholesky(2.0);
aq ~ normal(mu_aq, sigmaaq);
cr ~ normal(mu_cr, sigmacr);
adt ~ normal(mu_adt, sigma_adt);
mu1 ~ normal(0, 2);
mu2 ~ normal(0, 2);
// for (j in 1:J)
// gamma1[j] ~ multi_normal_cholesky(mu1, DC1);
// for (g in 1:G)
// gamma2[g] ~ multi_normal_cholesky(mu2, DC2);
}
generated quantities{
real naiveIndEffect1;
real naiveIndEffect2;
real naiveIndEffect3;
real naiveIndEffect4;
real totalEffect;
naiveIndEffect1=beta1aqbeta3adt;
naiveIndEffect2=beta1crbeta4adt;
naiveIndEffect3=beta2aqbeta3adt;
naiveIndEffect4=beta2crbeta4adt;
totalEffect=naiveIndEffect1+naiveIndEffect2+naiveIndEffect3+naiveIndEffect4+beta1adt
+beta2adt;
}
"

randombeta model:

randombeta="data{
int<lower=1> N;
vector[N] c1;
vector[N] c2;
vector[N] aq;
vector[N] cr;
vector[N] adt;
int<lower=1> J;
int<lower=1, upper=J> Group[N];
int<lower=1> G;
int<lower=1, upper=G> Group2[N];
}
parameters{
real alphaaq;
real beta1aq;
real beta2aq;
real<lower=0> sigma_alphaaq;
real<lower=0> sigma_beta1aq;
real<lower=0> sigma_beta2aq;
real<lower=0> sigmaaq;
real alphacr;
real beta1cr;
real beta2cr;
real<lower=0> sigma_alphacr;
real<lower=0> sigma_beta1cr;
real<lower=0> sigma_beta2cr;
real<lower=0> sigmacr;
real alphaadt;
real beta1adt;
real beta2adt;
real beta3adt;
real beta4adt;
real<lower=0> sigma_alphaadt;
real<lower=0> sigma_beta1adt;
real<lower=0> sigma_beta2adt;
real<lower=0> sigma_beta3adt;
real<lower=0> sigma_beta4adt;
real<lower=0> sigma_adt;
cholesky_factor_corr[11] Omega_chol1;
cholesky_factor_corr[11] Omega_chol2;
vector<lower=0>[11] sigma_ranef1;
vector<lower=0>[11] sigma_ranef2;
matrix[J,11] gamma1_raw;
matrix[G,11] gamma2_raw;
vector[11] mu1;
vector[11] mu2;
}
transformed parameters{
vector[J] gamma1alphaaq;
vector[J] gamma1alphacr;
vector[J] gamma1alpha;
vector[J] gamma1beta1aq;
vector[J] gamma1beta2aq;
vector[J] gamma1beta1cr;
vector[J] gamma1beta2cr;
vector[J] gamma1beta1adt;
vector[J] gamma1beta2adt;
vector[J] gamma1beta3adt;
vector[J] gamma1beta4adt;
vector[G] gamma2alphaaq;
vector[G] gamma2alphacr;
vector[G] gamma2alpha;
vector[G] gamma2beta1aq;
vector[G] gamma2beta2aq;
vector[G] gamma2beta1cr;
vector[G] gamma2beta2cr;
vector[G] gamma2beta1adt;
vector[G] gamma2beta2adt;
vector[G] gamma2beta3adt;
vector[G] gamma2beta4adt;
vector[N] mu_adt;
vector[N] mu_aq;
vector[N] mu_cr;
matrix[J,11] gamma1 = rep_matrix(mu1’, J) + gamma1_raw * diag_pre_multiply(sigma_ranef1, Omega_chol1)‘; // random effects for id
matrix[G,11] gamma2 = rep_matrix(mu2’, G) + gamma2_raw * diag_pre_multiply(sigma_ranef2, Omega_chol2)';
matrix[11,11] DC1 = diag_pre_multiply(sigma_ranef1, Omega_chol1);
matrix[11,11] DC2 = diag_pre_multiply(sigma_ranef2, Omega_chol2);
for (j in 1:J){
gamma1alphaaq[j] = gamma1[j,1];
gamma1alphacr[j] = gamma1[j,2];
gamma1alpha[j]= gamma1[j,3];
gamma1beta1aq[j]= gamma1[j,4];
gamma1beta1cr[j]= gamma1[j,5];
gamma1beta1adt[j]= gamma1[j,6];
gamma1beta2aq[j]= gamma1[j,7];
gamma1beta2cr[j]= gamma1[j,8];
gamma1beta2adt[j]= gamma1[j,9];
gamma1beta3adt[j]= gamma1[j,10];
gamma1beta4adt[j]= gamma1[j,11];
}
for (g in 1:G) {
gamma2alphaaq[g] = gamma2[g,1];
gamma2alphacr[g] = gamma2[g,2];
gamma2alpha[g]= gamma2[g,3];
gamma2beta1aq[g]= gamma2[g,4];
gamma2beta1cr[g]= gamma2[g,5];
gamma2beta1adt[g]= gamma2[g,6];
gamma2beta2aq[g]= gamma2[g,7];
gamma2beta2cr[g]= gamma2[g,8];
gamma2beta2adt[g]= gamma2[g,9];
gamma2beta3adt[g]= gamma2[g,10];
gamma2beta4adt[g]= gamma2[g,11];
}
for (n in 1:N) {
mu_aq[n] = alphaaq + gamma1alphaaq[Group[n]] + gamma2alphaaq[Group2[n]]+
(beta1aq+gamma1beta1aq[Group[n]]+gamma2beta1aq[Group2[n]]) * c1[n] +
(beta2aq+gamma1beta2aq[Group[n]]+gamma2beta2aq[Group2[n]]) * c2[n];
mu_cr[n] = alphacr + gamma1alphacr[Group[n]] + gamma2alphacr[Group2[n]]+
(beta1cr+gamma1beta1cr[Group[n]]+gamma2beta1cr[Group2[n]]) * c1[n] +
(beta2cr+gamma1beta2cr[Group[n]]+gamma2beta2cr[Group2[n]]) * c2[n];
mu_adt[n] = alphaadt + gamma1alpha[Group[n]] + gamma2alpha[Group2[n]]+
(beta1adt+gamma1beta1adt[Group[n]]+gamma2beta1adt[Group2[n]]) * c1[n] +
(beta2adt+gamma1beta2adt[Group[n]]+gamma2beta2adt[Group2[n]]) * c2[n] +
(beta3adt+gamma1beta3adt[Group[n]]+gamma2beta4adt[Group2[n]]) * aq[n] +
(beta4adt+gamma1beta4adt[Group[n]]+gamma2beta4adt[Group2[n]]) * cr[n];
}
}
model{
sigma_alphaaq ~ student_t(5, 0, 2);
sigma_beta1aq ~ student_t(5, 0, 2);
sigma_beta2aq ~ student_t(5, 0, 2);
alphaaq ~ normal(0, sigma_alphaaq);
beta1aq ~ normal(0, sigma_beta1aq);
beta2aq ~ normal(0, sigma_beta2aq);
sigmaaq ~ student_t(5, 0, 2);
sigma_alphacr ~ student_t(5, 0, 2);
sigma_beta1cr ~ student_t(5, 0, 2);
sigma_beta2cr ~ student_t(5, 0, 2);
alphacr ~ normal(0, sigma_alphacr);
beta1cr ~ normal(0, sigma_beta1cr);
beta2cr ~ normal(0, sigma_beta2cr);
to_vector(gamma1_raw) ~ normal(0, 1);
to_vector(gamma2_raw) ~ normal(0, 1);
sigmacr ~ student_t(5, 0, 2);
sigma_alphaadt ~ student_t(5, 0, 2);
sigma_beta1adt ~ student_t(5, 0, 2);
sigma_beta2adt ~ student_t(5, 0, 2);
sigma_beta3adt ~ student_t(5, 0, 2);
sigma_beta4adt ~ student_t(5, 0, 2);
alphaadt ~ normal(0, sigma_alphaadt);
beta1adt ~ normal(0, sigma_beta1adt);
beta2adt ~ normal(0, sigma_beta2adt);
beta3adt ~ normal(0, sigma_beta3adt);
beta4adt ~ normal(0, sigma_beta4adt);
sigma_adt ~ student_t(5, 0, 2);
sigma_ranef1 ~ student_t(5, 0, 2);
sigma_ranef2 ~ student_t(5, 0, 2);
Omega_chol1 ~ lkj_corr_cholesky(2.0);
Omega_chol2 ~ lkj_corr_cholesky(2.0);
aq ~ normal(mu_aq, sigmaaq);
cr ~ normal(mu_cr, sigmacr);
adt ~ normal(mu_adt, sigma_adt);
mu1 ~ normal(0, 2);
mu2 ~ normal(0, 2);
// for (j in 1:J)
// gamma1[j] ~ multi_normal_cholesky(mu1, DC1);
// for (g in 1:G)
// gamma2[g] ~ multi_normal_cholesky(mu2, DC2);
}
generated quantities{
real naiveIndEffect1;
real naiveIndEffect2;
real naiveIndEffect3;
real naiveIndEffect4;
real avgIndEffect1_1;
real avgIndEffect1_2;
real avgIndEffect2_1;
real avgIndEffect2_2;
real avgIndEffect3_1;
real avgIndEffect3_2;
real avgIndEffect4_1;
real avgIndEffect4_2;
real totalEffect;
naiveIndEffect1=beta1aq * beta3adt;
avgIndEffect1_1=beta1aq * beta3adt+DC1[4,10];
avgIndEffect1_2=beta1aq * beta3adt+DC2[4,10];
naiveIndEffect2=beta1cr * beta4adt;
avgIndEffect2_1=beta1cr * beta4adt+DC1[5,11];
avgIndEffect2_2=beta1cr * beta4adt+DC2[5,11];
naiveIndEffect3=beta2aq * beta3adt;
avgIndEffect3_1=beta2aq * beta3adt+DC1[7,10];
avgIndEffect3_2=beta2aq * beta3adt+DC2[7,10];
naiveIndEffect4=beta2cr * beta4adt;
avgIndEffect4_1=beta2cr * beta4adt+DC1[8,11];
avgIndEffect4_2=beta2cr * beta4adt+DC2[8,11];
totalEffect=naiveIndEffect1+naiveIndEffect2+naiveIndEffect3+naiveIndEffect4+beta1adt
+beta2adt;
}
"

Bob_Carpenter · February 12, 2024, 11:23pm

Hi, @zrn. Did you sort out your issue? It looks like you marked your reply as an answer and then deleted it.

If by significant, you mean having a posterior interval bounded away from zero, then the question is why you would expect the coefficients to be significant in this way? My suggestion would be to simulate data and see if your model recovers it.

I’d also strongly recommend putting your Stan programs in separate files with indentation. That way, line number based errors are easy to resolve.

You can define mu_aq in a one-linear using vectorization.

The beta1adt etc. have centered parameterizations which may prove difficult to fit. You can switch to non-centered by changing the declaration to:

real<multiplier=sigma_beta1adt> beta1adt;

zrn · February 13, 2024, 5:41am

@ Bob_Carpenter Unfortunately, the problem has not been solved. Thank you for your reply! I put the code for the centered parameterized and non-centered parameterized versions in Why The effects unexpectedly became smaller after employing non-central parameterization？. I am puzzled why the significance of the effect is so different in the two versions of the model.
I really need help：（

zrn · February 13, 2024, 5:29pm

I removed

vector[11] mu1;
vector[11] mu2;
mu1 ~ normal(0, 2);
mu2 ~ normal(0, 2);

And changed

matrix[J,11] gamma1 = rep_matrix(mu1’, J) + gamma1_raw * diag_pre_multiply(sigma_ranef1, Omega_chol1) ; // random effects for id
matrix[G,11] gamma2 = rep_matrix(mu2’, G) + gamma2_raw * diag_pre_multiply(sigma_ranef2, Omega_chol2)';

to

matrix[J, 11] gamma1 = gamma1_raw * diag_pre_multiply(sigma_ranef1, Omega_chol1)‘;
matrix[G, 11] gamma2 = gamma2_raw * diag_pre_multiply(sigma_ranef2, Omega_chol2)’;

And I obtained mostly significant effects (similar to the case with only random intercept version). Does this imply that mu has an impact on gamma and I should delete it?
result with mu：

result without mu：

Topic		Replies	Views
Why The effects unexpectedly became smaller after employing non-central parameterization？ Modeling rstan , specification , cmdstanr , brms	5	365	February 13, 2024
Trouble with non-central parameterization in meta-analysis Modeling fitting-issues , meta-analysis	8	1099	February 6, 2018
Sigma / correlation matrix and non-centered parameterization of random effects WITHOUT Cholesky factorization Modeling	4	777	January 31, 2022
Centered or non-centered parametrization for random effects Modeling	15	5868	July 15, 2017
Apparently equivalent noncentered parameterization, different outcomes Modeling rstan , fitting-issues	2	109	August 8, 2024

The issue of adding random slopes in non-central parameterization

Related topics