Consider the following SEM model (the graphic below doesn’t follow all conventional SEM DAG rules; it’s more of a dependency graph):

Wherein a common latent weighs on two derived latents that in turn connect to two outcomes. The weights are constrained to `lower=-1, upper=1` (with `r1` constrained to 0–1 for identifiability) and are thus interpretable as correlations, with their squares quantifying the proportion of variability in the derived latent attributable to the common factor.
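
To see why: under the standard-normal scaling used in the code below, with `common` and `res1` independent,

$$
\operatorname{Var}(\mu_1) = r_1^2 + \left(1 - r_1^2\right) = 1,
\qquad
\operatorname{Cor}(\mu_1, \mathrm{common}) = r_1,
$$

so $r_1^2$ is exactly the share of $\operatorname{Var}(\mu_1)$ attributable to the common factor.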

The code for this model:

```
data{
  int n_id ;
  int n_obs ;
  matrix[n_obs,n_id] y1 ;
  matrix[n_obs,n_id] y2 ;
}
parameters{
  real<lower=0> noise1 ;
  real<lower=0> noise2 ;
  real<lower=0,upper=1> r1 ;
  unit_vector[2] r2 ; // for unconstrained SEM correlations, unit_vector makes for less by-hand compute
  vector[n_id] common ;
  vector[n_id] res1 ;
  vector[n_id] res2 ;
}
model{
  noise1 ~ weibull(2,1) ;
  noise2 ~ weibull(2,1) ;
  res1 ~ std_normal() ;
  res2 ~ std_normal() ;
  common ~ std_normal() ;
  vector[n_id] mu1 = (
    common .* r1
    + res1 .* sqrt(1-pow(r1,2)) // have to compute this term by-hand for constrained correlations
  ) ;
  vector[n_id] mu2 = (
    common .* r2[1]
    + res2 .* r2[2] // for unconstrained SEM correlations, unit_vector gives us this term automatically
  ) ;
  for(i_id in 1:n_id){
    y1[,i_id] ~ normal( mu1[i_id] , noise1 ) ;
    y2[,i_id] ~ normal( mu2[i_id] , noise2 ) ;
  }
}
```
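
Not part of the model above, but for completeness, a minimal `generated quantities` sketch one could append to expose the implied correlations and r-squares as derived quantities (the names `rsq1`, `cor2`, `rsq2` are just placeholders), using the fact that a `unit_vector` satisfies `r2[1]^2 + r2[2]^2 = 1`:

```
generated quantities{
  // share of Var(mu1) attributable to common (r1 is already a correlation)
  real rsq1 = square(r1) ;
  // r2 lies on the unit circle, so r2[1] is the (signed) correlation
  // and r2[2]^2 = 1 - r2[1]^2 is the residual share
  real cor2 = r2[1] ;
  real rsq2 = square(r2[1]) ;
}
```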

Now, consider an extension of this model in which there are two more observed variables, one modelled directly by a single new latent and the other modelled as a combination of that new latent plus the `common` latent of the previous model:

With Stan code:

```
data{
  int n_id ;
  int n_obs ;
  matrix[n_obs,n_id] y1 ;
  matrix[n_obs,n_id] y2 ;
  matrix[n_obs,n_id] y3 ;
  matrix[n_obs,n_id] y4 ;
}
parameters{
  real<lower=0> noise1 ;
  real<lower=0> noise2 ;
  real<lower=0> noise3 ;
  real<lower=0> noise4 ;
  real<lower=0,upper=1> r1 ;
  unit_vector[2] r2 ;
  vector[n_id] res1 ;
  vector[n_id] res2 ;
  vector[n_id] common ;
  vector[n_id] mu3 ; // new latent, measured directly by y3
  vector[n_id] res4 ;
  unit_vector[3] r_mu4 ; // loadings of mu4 on common, mu3, and its own residual
}
model{
  noise1 ~ weibull(2,1) ;
  noise2 ~ weibull(2,1) ;
  noise3 ~ weibull(2,1) ;
  noise4 ~ weibull(2,1) ;
  res1 ~ std_normal() ;
  res2 ~ std_normal() ;
  common ~ std_normal() ;
  mu3 ~ std_normal() ;
  res4 ~ std_normal() ;
  vector[n_id] mu1 = (
    common .* r1
    + res1 .* sqrt(1-pow(r1,2))
  ) ;
  vector[n_id] mu2 = (
    common .* r2[1]
    + res2 .* r2[2]
  ) ;
  vector[n_id] mu4 = (
    common .* r_mu4[1]
    + mu3 .* r_mu4[2]
    + res4 .* r_mu4[3]
  ) ;
  for(i_id in 1:n_id){
    y1[,i_id] ~ normal( mu1[i_id] , noise1 ) ;
    y2[,i_id] ~ normal( mu2[i_id] , noise2 ) ;
    y3[,i_id] ~ normal( mu3[i_id] , noise3 ) ;
    y4[,i_id] ~ normal( mu4[i_id] , noise4 ) ;
  }
}
```

When the data `y3` are low-noise, the model is well-identified and it’s possible to interpret `r_mu4[1]^2` as the r-square for the prediction of `mu4` from the `common` variable alone, and `r_mu4[1]^2 + r_mu4[2]^2` as the r-square for the prediction of `mu4` using both `common` and `mu3`.
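
For reference, a sketch of a `generated quantities` block one could append to the extended model to compute those r-squares per posterior draw (the names `rsq_common` and `rsq_common_mu3` are arbitrary):

```
generated quantities{
  // r-square for predicting mu4 from common alone
  real rsq_common = square(r_mu4[1]) ;
  // r-square for predicting mu4 from common and mu3 together
  real rsq_common_mu3 = square(r_mu4[1]) + square(r_mu4[2]) ;
}
```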

However, when `y3` is high-noise, I *think* an identifiability issue creeps in whereby `mu3` begins to take on a “random noise” character and can thereby start to exchange roles with `res4` in reflecting a random-noise residual. For model fitting this doesn’t seem to pose much of an issue, merely inducing a negative correlation between `r_mu4[2]` and `r_mu4[3]` that HMC can handle, BUT it does render the above-noted derivations/interpretations of r-square invalid, at least as far as I can discern. That is, if the “added value” of `mu3` to the prediction of `mu4` is of critical importance, then when `mu3` itself is not well constrained by the `y3` data, `mu3` will trade roles in the SEM with the residual `res4` and thereby yield inflated (or at least non-meaningful) values for its influence on `y4`.
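
One way to formalize that intuition (a sketch, under the limiting assumption that `y3` is so noisy that `mu3` is essentially just its independent standard-normal prior): conditional on `common`,

$$
\operatorname{Var}(\mu_4 \mid \mathrm{common})
= r_{\mathrm{mu4},2}^{2}\,\operatorname{Var}(\mu_3) + r_{\mathrm{mu4},3}^{2}\,\operatorname{Var}(\mathrm{res}_4)
= r_{\mathrm{mu4},2}^{2} + r_{\mathrm{mu4},3}^{2}
= 1 - r_{\mathrm{mu4},1}^{2},
$$

so in that limit the data only pin down the sum $r_{\mathrm{mu4},2}^{2} + r_{\mathrm{mu4},3}^{2}$, not how it is split between `mu3` and `res4`, which is consistent with the negative posterior correlation noted above.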

Is this non-identifiability circumventable in any obvious way, other than to collect enough `y3` data to properly constrain `mu3`?