Hello!
I have a question about fitting a DAG in Stan that I cannot answer myself. I have the following graph, with 5 latent variables, one exogenous variable and 2 responses.
I would like to check whether the d-separations implied by the DAG are supported by my data. I have two constraints: I want the latent variables to be estimated jointly with the regressions, and I want the same latent variables to be shared by every regression.
Would it make sense to fit the different regressions together in a single Stan model? It seems a little strange to me, but I cannot think of another way. Consider, for example, the d-separations
Fert _||_ GDD | Moss2
Fert _||_ Vasc1
GDD _||_ Vasc1
I would then write code similar to:
transformed parameters {
  // Final loading matrices
  matrix[SF, DF] L_Fert = fill_loadings(DF, SF, L_lower_Fert, L_diag_Fert);
  // Declare linear predictors
  matrix[N, SF] log_mu_Fert = FS_Fert * L_Fert';
}
model {
  // Parameters
  /// Factor scores
  for (d in 1:DF) target += normal_lpdf(FS_Fert[, d] | 0, sigma_FS_Fert[d]);
  //// Fertility
  target += std_normal_lpdf(L_lower_Fert); // Lower diagonal loadings of fertility
  target += std_normal_lpdf(L_diag_Fert);  // Diagonal loadings of fertility
  // Likelihood for latent variables
  // Fertility
  for (s in 1:SF) target += normal_lpdf(Fert[, s] | log_mu_Fert[, s], sigma_Fert[s]);
  // Likelihood for regressions
  // d-separation
  // Fert1 _||_ GDD | Moss2, P, T
  target += normal_lpdf(FS_Fert[, 1] | a_Fert_GDD + b_Fert_GDD * sc_GDD + b_Fert_GDD_Moss2 * FS_Moss[, 2], sigma_Fert_GDD);
  // Fert1 _||_ Vasc1 | P, T
  target += normal_lpdf(FS_Fert[, 1] | a_Fert_Vasc1 + b_Fert_Vasc1 * FS_Vasc[, 1], sigma_Fert_Vasc1);
  // GDD _||_ Vasc1 | P, SWC, T
  target += lognormal_lpdf(GDD | a_GDD_Vasc1 + b_GDD_Vasc1 * FS_Vasc[, 1] + b_GDD_Vasc1_SWC * sc_SWC, sigma_GDD_Vasc1);
}
My intuition is that something may be wrong with simply stacking likelihood contributions for the same variables and the same parameters. Could the added dimensions make the posterior surface very flat? I suspect that working through the basic probability of the joint distribution would clarify what this approach implies, but I do not think I can do that on my own.
Does anyone have any insight into the validity of this approach, or any ideas for evaluating d-separations with latent variables?
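The furthest I can get is the following tentative sketch of what the stacking does to a single factor score. The notation is my own shorthand, not something from the model: f_i stands for FS_Fert[i,1], sigma_0 for sigma_FS_Fert[1], m_{1i} and sigma_1 for the mean and scale of the Fert/GDD regression, and m_{2i} and sigma_2 for those of the Fert/Vasc1 regression. I leave out the measurement likelihood for Fert, which also involves f_i through log_mu_Fert.

```latex
\begin{aligned}
\log p(f_i \mid \cdot)
  &= \log \mathcal{N}(f_i \mid 0, \sigma_0)
   + \log \mathcal{N}(f_i \mid m_{1i}, \sigma_1)
   + \log \mathcal{N}(f_i \mid m_{2i}, \sigma_2)
   + \dots \\[4pt]
-\tfrac{1}{2}\left[
    \frac{f_i^2}{\sigma_0^2}
  + \frac{(f_i - m_{1i})^2}{\sigma_1^2}
  + \frac{(f_i - m_{2i})^2}{\sigma_2^2}
\right]
  &= -\frac{\tau}{2}\,(f_i - \bar m_i)^2 + c_i, \\[4pt]
\tau &= \frac{1}{\sigma_0^2} + \frac{1}{\sigma_1^2} + \frac{1}{\sigma_2^2},
\qquad
\bar m_i = \frac{1}{\tau}\left(
    \frac{m_{1i}}{\sigma_1^2} + \frac{m_{2i}}{\sigma_2^2}
\right),
\end{aligned}
```

Here c_i does not depend on f_i. If this is right, then as a function of f_i alone the three statements act like one normal term with precision tau and a precision-weighted mean, that is, a product of three densities on the same quantity rather than one generative distribution for it. What I cannot work out is what this implies for the regression coefficients and scales, since the normalising constants of the three terms still depend on them.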
Thank you very much, have a great day!
Lucas
EDIT: Added information to the code.