# Identifying variable influence in SEM

Consider the following SEM (note that the graphic below doesn’t follow all conventional SEM DAG rules; it is more of a dependency graph):

A common latent loads on two derived latents that in turn connect to two outcomes. The weights are constrained to `lower=-1, upper=1` (with `r1` further constrained to 0–1 for identifiability) and are thus interpretable as correlations, with their squares quantifying the proportion of variability in each derived latent attributable to the common factor.
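As a quick sanity check on that interpretation, here is a small Python simulation (the value of `r1` is arbitrary, chosen just for illustration) verifying that the weighted combination keeps unit variance and that the weight equals the correlation with the common factor:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000
r1 = 0.6  # example constrained weight in (0, 1)

common = rng.standard_normal(n)
res1 = rng.standard_normal(n)
# derived latent: weighted sum of common factor and residual
mu1 = common * r1 + res1 * np.sqrt(1 - r1**2)

print(np.var(mu1))                     # ~1: the derived latent stays standardized
print(np.corrcoef(common, mu1)[0, 1])  # ~r1: the weight is the correlation
```

So `r1**2` (here 0.36) is the share of the derived latent’s variance attributable to `common`.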

The Stan code for this model:

``````stan
data{
  int n_id ;
  int n_obs ;
  matrix[n_obs,n_id] y1 ;
  matrix[n_obs,n_id] y2 ;
}
parameters{
  real<lower=0> noise1 ;
  real<lower=0> noise2 ;
  real<lower=0,upper=1> r1 ;
  unit_vector[2] r2 ;  // for unconstrained SEM correlations, unit_vector makes for less by-hand compute
  vector[n_id] common ;
  vector[n_id] res1 ;
  vector[n_id] res2 ;
}
model{
  noise1 ~ weibull(2,1) ;
  noise2 ~ weibull(2,1) ;
  res1 ~ std_normal() ;
  res2 ~ std_normal() ;
  common ~ std_normal() ;
  vector[n_id] mu1 = (
    common * r1
    + res1 * sqrt(1-pow(r1,2)) // have to compute this term by-hand for constrained correlations
  ) ;
  vector[n_id] mu2 = (
    common * r2[1]
    + res2 * r2[2] // for unconstrained SEM correlations, unit_vector gives us this term automatically
  ) ;
  for(i_id in 1:n_id){
    y1[,i_id] ~ normal( mu1[i_id] , noise1 ) ;
    y2[,i_id] ~ normal( mu2[i_id] , noise2 ) ;
  }
}
``````
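To see why the `unit_vector` parameterization removes the by-hand `sqrt(1 - r^2)` term: any point on the unit circle already satisfies `r2[1]^2 + r2[2]^2 = 1`, so the derived latent keeps unit variance automatically, for any angle. A quick Python illustration (the angle `theta` is an arbitrary hypothetical value; Python’s `r2[0]`/`r2[1]` correspond to Stan’s 1-based `r2[1]`/`r2[2]`):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000
theta = 2.2  # arbitrary angle; any theta yields a valid unit vector
r2 = np.array([np.cos(theta), np.sin(theta)])  # plays the role of Stan's unit_vector[2]

common = rng.standard_normal(n)
res2 = rng.standard_normal(n)
mu2 = common * r2[0] + res2 * r2[1]

print(np.var(mu2))                     # ~1 for any theta, no sqrt(1 - r^2) needed
print(np.corrcoef(common, mu2)[0, 1])  # ~r2[0], which can land anywhere in [-1, 1]
```

This is why the unconstrained correlation `r2[1]` can range over the full [-1, 1] interval while the variance decomposition stays coherent.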

Now, consider an extension of this model with two more observed variables: one modelled directly by a single new latent, and the other modelled as a combination of that new latent plus the `common` latent from the previous model:

With Stan code:

``````stan
data{
  int n_id ;
  int n_obs ;
  matrix[n_obs,n_id] y1 ;
  matrix[n_obs,n_id] y2 ;
  matrix[n_obs,n_id] y3 ;
  matrix[n_obs,n_id] y4 ;
}
parameters{
  real<lower=0> noise1 ;
  real<lower=0> noise2 ;
  real<lower=0> noise3 ;
  real<lower=0> noise4 ;
  real<lower=0,upper=1> r1 ;
  unit_vector[2] r2 ;
  vector[n_id] res1 ;
  vector[n_id] res2 ;
  vector[n_id] common ;
  vector[n_id] mu3 ;
  vector[n_id] res4 ;
  unit_vector[3] r_mu4 ;
}
model{
  noise1 ~ weibull(2,1) ;
  noise2 ~ weibull(2,1) ;
  noise3 ~ weibull(2,1) ;
  noise4 ~ weibull(2,1) ;
  res1 ~ std_normal() ;
  res2 ~ std_normal() ;
  common ~ std_normal() ;
  mu3 ~ std_normal() ;
  res4 ~ std_normal() ;
  vector[n_id] mu1 = (
    common * r1
    + res1 * sqrt(1-pow(r1,2))
  ) ;
  vector[n_id] mu2 = (
    common * r2[1]
    + res2 * r2[2]
  ) ;
  vector[n_id] mu4 = (
    common * r_mu4[1]
    + mu3 * r_mu4[2]
    + res4 * r_mu4[3]
  ) ;
  for(i_id in 1:n_id){
    y1[,i_id] ~ normal( mu1[i_id] , noise1 ) ;
    y2[,i_id] ~ normal( mu2[i_id] , noise2 ) ;
    y3[,i_id] ~ normal( mu3[i_id] , noise3 ) ;
    y4[,i_id] ~ normal( mu4[i_id] , noise4 ) ;
  }
}
``````

When the data `y3` are low-noise, the model is well identified and it’s possible to interpret `r_mu4[1]^2` as the r-square for predicting `mu4` from `common` alone, and `r_mu4[1]^2 + r_mu4[2]^2` as the r-square for predicting `mu4` from both `common` and `mu3`.
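That decomposition can be checked directly by simulation (no model fitting involved; the unit-vector values below are hypothetical, chosen so the entries square-sum to 1):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 500_000
r = np.array([0.5, 0.7, np.sqrt(1 - 0.5**2 - 0.7**2)])  # a valid unit_vector[3]

common = rng.standard_normal(n)
mu3 = rng.standard_normal(n)
res4 = rng.standard_normal(n)
mu4 = common * r[0] + mu3 * r[1] + res4 * r[2]

def r_square(X, y):
    # R^2 from least-squares projection of y onto the columns of X
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return 1 - resid.var() / y.var()

print(r_square(common[:, None], mu4))                 # ~ r[0]^2 = 0.25
print(r_square(np.column_stack([common, mu3]), mu4))  # ~ r[0]^2 + r[1]^2 = 0.74
```

The independence of `common`, `mu3`, and `res4` is what makes the r-squares simply additive here.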

However, when `y3` is high-noise, I think an identifiability issue creeps in: `mu3` begins to take on a “random noise” character and can start to exchange roles with `res4` as the random-noise residual. For model fitting this doesn’t seem to pose much of an issue, merely inducing a negative correlation between `r_mu4[2]` and `r_mu4[3]` that HMC can handle, BUT it does render the r-square derivations/interpretations above invalid, at least as far as I can discern. That is, if the “added value” of `mu3` for predicting `mu4` is of critical importance, then when `mu3` itself is not well constrained by the `y3` data, `mu3` will trade roles in the SEM with the residual `res4` and thereby yield inflated (or at least non-meaningful) values for its influence on `y4`.
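The role exchange can be made concrete: once `mu3` is unconstrained by `y3`, both `mu3` and `res4` are just standard-normal draws, so the implied distribution of `mu4` depends on `r_mu4[2]` and `r_mu4[3]` only through `r_mu4[2]^2 + r_mu4[3]^2`. A quick check with two hypothetical unit vectors that split that sum of squares differently:

```python
import numpy as np

# two unit vectors sharing r[0] and r[1]^2 + r[2]^2,
# but splitting the mu3/res4 loadings differently
r_a = np.array([0.6, 0.8, 0.0])
r_b = np.array([0.6, 0.0, 0.8])

rng = np.random.default_rng(3)
n = 500_000
common = rng.standard_normal(n)
mu3 = rng.standard_normal(n)   # unconstrained by data: effectively a std-normal draw
res4 = rng.standard_normal(n)

mu4_a = common * r_a[0] + mu3 * r_a[1] + res4 * r_a[2]
mu4_b = common * r_b[0] + mu3 * r_b[1] + res4 * r_b[2]

# same variance and same covariance with common in both settings,
# so y4 alone cannot distinguish them
print(np.var(mu4_a), np.var(mu4_b))                              # both ~1
print(np.cov(common, mu4_a)[0, 1], np.cov(common, mu4_b)[0, 1])  # both ~0.6
```

Since both settings imply the same Gaussian law for `mu4`, the data carry no information about the split, which is exactly the ridge of posterior indeterminacy described above.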

Is this non-identifiability circumventable in any obvious way other than to collect enough `y3` data to properly constrain `mu3`?

What was the motivation for using a `weibull(2,1)` prior on the noise standard deviations? That its mass is concentrated near 1? Why not use a `student_t` distribution? Did you make a pairs plot and look at the correlations between the sampled parameters?
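For reference, a quick check of where `weibull(2,1)` actually puts its mass (closed-form math only, standard library): its CDF is 1 − exp(−x²), its mean is Γ(1.5) ≈ 0.886, and its mode is √(1/2) ≈ 0.707, so the bulk of the mass sits somewhat below 1, with a soft shoulder pushing away from 0:

```python
import math

shape, scale = 2.0, 1.0  # the weibull(2, 1) prior used for the noise parameters

def weibull_cdf(x):
    # CDF of the Weibull distribution: 1 - exp(-(x/scale)^shape)
    return 1 - math.exp(-((x / scale) ** shape))

mean = scale * math.gamma(1 + 1 / shape)             # ~0.886
mode = scale * ((shape - 1) / shape) ** (1 / shape)  # ~0.707
print(mean, mode)
print(weibull_cdf(1.0))                     # ~0.632: about 63% of mass below 1
print(weibull_cdf(2.0) - weibull_cdf(0.2))  # ~0.94: bulk of mass between 0.2 and 2
```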