Thanks for your feedback. I have some questions though. The model in my previous post is as close a translation as possible of the BUGS code in my original post. It does not have fewer variables than the Stan model, yet it runs more than 10 times faster in BUGS and 250-300 times faster in JAGS. It is therefore important for me to understand why my adaptation of it is so much slower in Stan. You commented that I should use fewer parameters, ditching either `y`, or `eta` and `xi`. I am not sure I understand what you mean by that. Do you mean I should declare them as local variables within the model block instead? Or did you mean these variables can be dropped from the model altogether? I completely agree that at least `y` should be dropped as a parameter, since it is only there for fitting purposes. The latent variables `eta` and `xi` are strictly speaking not necessary as monitored parameters either. However, I do not see how I can remove them from the model entirely.
Now, defining these three latent variables locally in the model block, similar to what is described in chapter 35.5 of the User's Guide, seems possible. However, if I declare any of these variables locally rather than as parameters, I get the following error when attempting to fit the model:
`Exception: normal_lpdf: Random variable is nan, but must be not nan!`
According to this answer to a similar problem, the error occurs because the local variable is used before it has ever been assigned a value. So the question becomes: if I am not to declare the aforementioned latent variables as parameters, how do I assign values to them?
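If my current understanding is correct (an assumption on my part, happy to be corrected), the reason is that `~` in Stan only adds a term to the log density; it never draws or assigns a value, so a local variable on the left-hand side of `~` remains undefined:

```stan
model {
  real eta_loc;           // local variable: declared but never assigned, so nan
  eta_loc ~ normal(0, 1); // only does target += normal_lpdf(eta_loc | 0, 1),
                          // which is evaluated at nan and throws the exception
}
```

Parameters, by contrast, are assigned by the sampler before the model block runs, which is presumably why the error disappears when these variables are declared in the parameters block.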
Anyway, I have rewritten the model as shown below. It now runs in ~45 min, so it is more efficient than the original model. Suggested alternative local-variable declarations of `y`, `eta`, and `xi` are left in as comments.
data {
  int<lower=0> N;                      // number of observations
  int<lower=1> P;                      // number of variables
  array[N, P] int<lower=1, upper=5> z; // observed data
  array[P] ordered[6] thd;             // thresholds
}
transformed data {
  cov_matrix[4] R = diag_matrix(rep_vector(8.0, 4)); // prior covariance matrix for xi
  vector[4] u = rep_vector(0, 4);                    // prior mean for xi
  array[N] vector[P] L;
  array[N] vector[P] U;
  for (i in 1:N) {
    for (j in 1:P) {
      L[i, j] = thd[j, z[i, j]];
      U[i, j] = thd[j, z[i, j] + 1];
    }
  }
}
parameters {
  array[21] real lam;                     // pattern coefficients
  row_vector[4] gam;                      // coefficients
  vector<lower=0>[P] psi;                 // precisions of y
  real<lower=0> psd;                      // precision of eta
  cov_matrix[4] phi;
  array[N] vector<lower=L, upper=U>[P] y; // latent continuous responses
  array[N] vector[4] xi;                  // latent variables
  vector[N] eta;                          // latent variables
}
model {
  // Local variables
  vector[P] mu;
  // vector[4] xi; // latent variables
  vector[P] sgm = 1.0 ./ psi;      // elementwise; plain `1/psi` is not valid Stan
  real sgd = 1 / psd;              // currently unused
  matrix[4, 4] phx = inverse(phi); // currently unused
  // priors on precisions
  psi ~ gamma(10, 8);
  // priors on loadings and coefficients (indices skip the fixed marker variables)
  {
    array[21] int idx = {2, 4, 5, 6, 7, 8, 9, 11, 12, 13, 14, 15,
                         17, 18, 20, 21, 22, 23, 24, 25, 26};
    for (k in 1:21)
      lam[k] ~ normal(0.8, 4.0 * psi[idx[k]]);
  }
  psd ~ gamma(10, 8);
  gam ~ normal(0.6, 4 * psd);
  phi ~ wishart(30, R);
  for (i in 1:N) {
    xi[i] ~ multi_normal(u, phi);
    // structural equation model
    real nu = gam * xi[i];
    // real eta;
    eta[i] ~ normal(nu, psd);
    real dthat = eta[i] - nu; // structural residual, currently unused
    // measurement equation model
    mu[1] = eta[i];
    mu[2] = lam[1] * eta[i];
    mu[3] = xi[i, 1];
    mu[4] = lam[2] * xi[i, 1];
    mu[5] = lam[3] * xi[i, 1];
    mu[6] = lam[4] * xi[i, 1];
    mu[7] = lam[5] * xi[i, 1];
    mu[8] = lam[6] * xi[i, 1];
    mu[9] = lam[7] * xi[i, 1];
    mu[10] = xi[i, 2];
    mu[11] = lam[8] * xi[i, 2];
    mu[12] = lam[9] * xi[i, 2];
    mu[13] = lam[10] * xi[i, 2];
    mu[14] = lam[11] * xi[i, 2];
    mu[15] = lam[12] * xi[i, 2];
    mu[16] = xi[i, 3];
    mu[17] = lam[13] * xi[i, 3];
    mu[18] = lam[14] * xi[i, 3];
    mu[19] = xi[i, 4];
    mu[20] = lam[15] * xi[i, 4];
    mu[21] = lam[16] * xi[i, 4];
    mu[22] = lam[17] * xi[i, 4];
    mu[23] = lam[18] * xi[i, 4];
    mu[24] = lam[19] * xi[i, 4];
    mu[25] = lam[20] * xi[i, 4];
    mu[26] = lam[21] * xi[i, 4];
    for (j in 1:P) {
      // real y;
      y[i, j] ~ normal(mu[j], psi[j]) T[L[i, j], U[i, j]];
      real ephat = y[i, j] - mu[j]; // measurement residual, currently unused
      // ephat[i, j] = y[i, j] - mu[j];
    }
  } // end of i
} // end of model
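One further change I may try (a sketch only, not yet benchmarked, so take it with a grain of salt): vectorizing the latent-variable statements and using the Cholesky parameterization of the multivariate normal, so that `phi` is factored once per log-density evaluation instead of once per observation:

```stan
model {
  matrix[4, 4] L_phi = cholesky_decompose(phi);
  vector[N] nu;
  xi ~ multi_normal_cholesky(u, L_phi); // one vectorized statement for all N
  for (i in 1:N)
    nu[i] = gam * xi[i];
  eta ~ normal(nu, psd);                // vectorized over observations
  // ... measurement equations as before ...
}
```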
Regarding the `blavaan` model: while this program would certainly be helpful for fitting this model, I am currently trying to learn how to write efficient SEMs in Stan for training purposes. There is not a considerable amount of literature on Bayesian SEM, but I found the works of Lee (2007) and Song & Lee (2012), which provide code for fitting SEMs in WinBUGS. Ultimately, I want to estimate a larger and more complex model, which, unfortunately, is beyond the current capabilities of `blavaan`.