Hi Stan community,
I am fitting a model in Stan where a parameter of interest needs to be derived from products of elements in matrix X and assigned to matrix Y, so that its covariance with the other parameters in matrix Y can be estimated. In the transformed parameters block I specified the following (example):
matrix[Nind,3] Y; // Estimates per individual derived from model equation for parameter 1 & 2
matrix[Nind,2] X; // Estimates per individual derived from model equation
for (ind in 1:Nind)
Y[ind,3] = (constant^2 * X[ind,1]) + X[ind,2]; // estimate parameter Y3
This seems to work fine for deriving the expected values of Y3 from the estimates produced by the model equation.
However, the covariances/correlations estimated in the generated quantities block deviate a lot from the values set in my simulations. I couldn’t find any other posts describing a similar issue, so is there something I’m doing wrong in the model specification?
All the best,
Corné
Can you maybe share the rest of the code and simulation? There’s not much to go on here!
What you did provide can be written as the following one-liner:
Y[ , 3] = constant^2 * X[ , 1] + X[ , 2];
This only defines the third column of the matrix Y.
Hi @Bob_Carpenter,
I should have done so in the first place, but I thought the error originated in the small piece of code in my previous post. The full model code is pasted below; the matrix product loop is in the transformed parameters block. The two correlations that are mis-estimated compared to the simulated data are cor_P2 and cor_P3, which are based on i[,3], the column derived by the for loop. I have attached the simulated data below; let me know if you have any further questions.
Sim_PsiPhiW_CdG.csv (2.3 MB)
data {
int<lower=1> No; // total number of observations
int<lower=1> individual[No]; // individual identity for each observation
int<lower=1> opponent1[No]; // identity of opponent 1 for each observation
int<lower=1> opponent2[No]; // identity of opponent 2 for each observation
int<lower=1> Na; // number of individuals
real xi[No]; // focal x
real xjk[No]; // social partners x
real z[No]; // phenotypic observations
}
parameters {
real mu_x; // intercept x
real mu; // overall intercept
real psi; // psi slope for z1
// i-matrix components
vector<lower=0>[3] sd_I; // permanent individual standard deviation
matrix[Na,3] iz; // standardised permanent individual effects
cholesky_factor_corr[3] LI; // Cholesky factor of the permanent individual (I-matrix) correlations
// xe-matrix components
vector<lower=0>[2] sd_xe; // individual standard deviation
matrix[Na,2] ixe; // standardised individual effects
cholesky_factor_corr[2] Lxe; // Cholesky factor of the xe-matrix correlations
// Sigmas likelihood function
real<lower=0> sigma_ez; // residual sigma
real<lower=0> sigma_ex; // residual sigma x
}
transformed parameters {
// Make I-matrix
matrix[Na,3] i; // permanent individual effects
i = iz * diag_pre_multiply(sd_I, LI)';
// Make xe-matrix
matrix[Na,2] xe; // permanent individual effects
xe = ixe * diag_pre_multiply(sd_xe, Lxe)';
// Expected trait values for each observation
vector[No] e_x; // predicted values x
vector[No] z_exp; // expected phenotypic values
// Expected trait values for each observation
for (o in 1:No)
e_x[o] = mu_x + xe[individual[o],1]; // errors in variable model for x
// Partition the phenotypes into I_DIE & I_IIE
for (o in 1:No)
z_exp[o] = mu + i[individual[o],1] + (psi + i[individual[o],2])*xjk[o] + xe[opponent1[o],2] + xe[opponent2[o],2];
// Derive phi/social impact per individual for z1
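// (note: this assignment overwrites the third column of i, which was filled above from iz, sd_I and LI)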
for (ind in 1:Na)
i[ind,3] = (psi^2 * xe[ind,1]) + xe[ind,2];
}
model {
// Priors
mu_x ~ normal(0, 5); // Weakly informative prior on the intercept of x
mu ~ normal(0, 5); // Weakly informative prior on the intercept
psi ~ normal(0, 5); // Weakly informative prior on the slope
sd_I ~ exponential(3); // Weakly informative prior on the SDs; positivity is already enforced by the <lower=0> declaration
sd_xe ~ exponential(3);
// Random effect definition (i.e. lower levels of the hierarchy of effects)
to_vector(iz) ~ normal(0,1); // implies i ~ normal(0,sd_I)
to_vector(ixe) ~ normal(0,1);
LI ~ lkj_corr_cholesky(3); // Prior on the I-matrix correlations
Lxe ~ lkj_corr_cholesky(3); // Prior on the xe-matrix correlations
// Likelihood for z (univariate normal residuals)
for (o in 1:No)
z[o] ~ normal(z_exp[o], sigma_ez);
//Errors in variable likelihood
for (o in 1:No)
xi[o] ~ normal(e_x[o], sigma_ex);
}
generated quantities {
//Variances
real var_DIE = sd_I[1]^2;
real var_psi = sd_I[2]^2;
real var_epsilon = sd_xe[2]^2;
real var_phi = sd_I[3]^2;
real var_x = sd_xe[1]^2;
// Residual variation
real var_res = sigma_ez^2;
// Total variation
real<lower=0> var_P = var_DIE + var_psi + var_epsilon + var_x + var_res;
// Repeatabilities
real<lower=0> rep_DIE = var_DIE/var_P; // rep DIE
real<lower=0> rep_psi = var_psi/var_P; // rep psi
real<lower=0> rep_epsilon = var_epsilon/var_P; // rep epsilon
real<lower=0> rep_phi = var_phi/var_P; // rep phi
// Correlations
matrix[3, 3] Omega_P;
Omega_P = LI * LI';
real cor_P1 = Omega_P[1,2];
real cor_P2 = Omega_P[1,3];
real cor_P3 = Omega_P[2,3];
matrix[2, 2] Omega_xe;
Omega_xe = Lxe * Lxe';
real cor_xe = Omega_xe[1,2];
}
Now that I’ve reread the original question, are you checking the coverage of posterior intervals? It takes a lot of data to get tight posteriors on covariance matrices in dimensions higher than a handful.
Can you simulate data according to the model and recover the right correlations?
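For example, here is a minimal, untested sketch of a standalone program that generates individual effects with the same construction the model assumes (sd_I_true, Omega_true, L_true, i_sim, and the 0.5 values are placeholders, not taken from your simulation):
data {
  int<lower=1> Na; // number of individuals
}
generated quantities {
  vector[3] sd_I_true = [0.5, 0.5, 0.5]'; // placeholder "true" SDs
  matrix[3, 3] Omega_true = [[1, 0.5, 0.5],
                             [0.5, 1, 0.5],
                             [0.5, 0.5, 1]]; // placeholder "true" correlations
  matrix[3, 3] L_true = cholesky_decompose(Omega_true);
  matrix[Na, 3] i_sim; // simulated correlated individual effects
  for (ind in 1:Na)
    i_sim[ind] = multi_normal_cholesky_rng(rep_vector(0, 3),
                                           diag_pre_multiply(sd_I_true, L_true))';
}
If you feed effects generated like this through the rest of your data-generating process and refit, the posterior for Omega_P should cover 0.5; if it doesn't, the problem is in the model rather than in the simulation.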
P.S. You can declare and define on one line:
matrix[2, 2] Omega_xe = Lxe * Lxe';
This can be sped up:
for (o in 1:No)
z[o] ~ normal(z_exp[o], sigma_ez);
to
z ~ normal(z_exp, sigma_ez);
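The same vectorization applies to the errors-in-variables likelihood, using the declarations already in the model:
xi ~ normal(e_x, sigma_ex);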
Thanks @Bob_Carpenter for the reply!
The correlation coefficients that I set in the simulations are all 0.5; however, the model fit yields correlation coefficients close to zero, with a 95% CI that ranges from ~-0.6 to ~0.6. The variance attributed to i[,3] also doesn’t match the simulated value, although when I recalculate that variance by hand according to the rules of variance it does match the simulated variance. The simulation has ~6000 observations for 245 individuals (the level at which the correlations are estimated), and for the other correlations the 95% CI deviates by about 0.1 to 0.2 from the median correlation coefficient. I have never simulated data according to the model to recover the right correlations; what exactly do you mean by that?
Interestingly, when I extract the values of the i matrix from the posterior, i[,3] looks to be estimated correctly. Furthermore, the correlations of i[,1] with i[,3] and of i[,2] with i[,3] are around 0.5 when computed as simple correlations of the posterior medians. So perhaps something is going wrong because i[,3] is assigned at a later stage? Would you know of a way to resolve this?
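(One way to check this within the fit itself would be to monitor the realized correlation between columns of i for each draw; an untested sketch that could go in generated quantities, where cor_i13 is just a made-up name:)
real cor_i13 = dot_product(col(i, 1) - mean(col(i, 1)), col(i, 3) - mean(col(i, 3)))
               / ((Na - 1) * sd(col(i, 1)) * sd(col(i, 3)));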
P.S. Thanks for the other tips! Is there a way to also vectorise the model equations? I tried this before, but failed because I assign individual values to the matrices, which need to be indexed by the individual identifier (i.e. individual or opponent id).
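(For reference, Stan's multiple indexing with the integer-array identifiers should handle this kind of per-observation lookup; an untested sketch of what the vectorised lines in transformed parameters might look like, given the declarations above:)
e_x = mu_x + xe[individual, 1];
z_exp = mu + i[individual, 1]
        + (psi + i[individual, 2]) .* to_vector(xjk)
        + xe[opponent1, 2] + xe[opponent2, 2];
i[, 3] = psi^2 * xe[, 1] + xe[, 2]; // same pattern as the earlier one-liner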