I am interested in fitting a single common factor model in stan using multilevel regression rather than a multivariate approach (see Factor analysis - Wikipedia for background).
Data Generating Model
Suppose each participant i has some unobserved “true” factor score (\theta_i). The goal is to estimate these using a set of J observed scores (y_{1i}, y_{2i}, ... , y_{Ji}) measured for each participant.
For each observed score for participant i’s on item j (y_{ij}), I generate data from the following factor model, where \lambda_1, \lambda_2, ..., \lambda_J are parameters we want to estimate. I’ve scaled the responses on each item (y_{j}) so that they have a mean of 0 and SD of 1.
For identification i assume that the true scores follow a standard normal distribution
\theta_i \sim \text{Normal}(0,1)
y_{ij} = \lambda_j \times \theta_i + \epsilon_{ij}
Also assume that \sigma^2_{y_{ij}} = 1, \sigma^2_{\theta_{j}} = 1, cov(\theta_{j}\epsilon_{ij})= 0
\sigma^2_{y_{ij}} = 1 =\lambda_j^2 + \sigma^2_{\epsilon_{ij}}
\sigma^2_{\epsilon_{ij}} = 1 - \lambda_j^2
Stan Model
I am using the following hierarchical model:
\theta_i \sim \text{Normal}(0,1)
y_{ij} \sim \text{Normal}(\lambda_j \theta_i , 1 - \lambda_j^2 )
data {
int<lower=0> n;
int<lower=0> pps_n;
int<lower=0> item_n;
array[n] int<lower=1, upper=item_n> item;
array[n] int<lower=1, upper=pps_n> pps;
vector[n] y;
}
parameters {
vector[pps_n] theta; // true scores
vector<lower=0, upper=1>[item_n] lambda; // loadings
}
model {
vector[n] mu;
vector[n] e_sd;
theta ~ normal(0, 1);
//lambda ~ beta(.65,1);
for (i in 1:n){
mu[i] = lambda[item[i]] * theta[pps[i]];
e_sd[i] = sqrt(1 - lambda[item[i]]^2);
}
y ~ normal(mu, e_sd);
}
}
The sample estimates seem to converge on my simulated parameters when using a large simulated sample size, but I keep getting errors (Warning: 7 of 7 chains had an E-BFMI less than 0.2.) - so i am wondering if my model could be reparameterized in some way? I’m worried there is something i have perhaps overlooked.
I can find guidance online on fitting item response theory models or factor models using the multivariate normal distribution, but not quite this model.
Simulation code: MiscellaneousRCode/stan_example at master · giac01/MiscellaneousRCode (github.com)
Example Data
> dat
$n
[1] 200
$pps_n
[1] 50
$item_n
[1] 4
$item
[1] 1 2 3 4 1 2 3 4 1 2 3 4 1 2 3 4 1 2 3 4 1 2 3 4 1 2 3 4 1 2 3 4 1 2 3 4 1 2 3 4 1 2 3 4 1 2 3 4 1 2 3 4 1 2 3 4 1 2 3 4 1 2 3 4 1 2 3 4 1 2 3 4 1 2 3 4 1 2 3 4 1 2 3 4
[85] 1 2 3 4 1 2 3 4 1 2 3 4 1 2 3 4 1 2 3 4 1 2 3 4 1 2 3 4 1 2 3 4 1 2 3 4 1 2 3 4 1 2 3 4 1 2 3 4 1 2 3 4 1 2 3 4 1 2 3 4 1 2 3 4 1 2 3 4 1 2 3 4 1 2 3 4 1 2 3 4 1 2 3 4
[169] 1 2 3 4 1 2 3 4 1 2 3 4 1 2 3 4 1 2 3 4 1 2 3 4 1 2 3 4 1 2 3 4
$pps
[1] 1 1 1 1 2 2 2 2 3 3 3 3 4 4 4 4 5 5 5 5 6 6 6 6 7 7 7 7 8 8 8 8 9 9 9 9 10 10 10 10 11 11 11 11 12 12 12 12 13 13 13 13 14 14 14 14
[57] 15 15 15 15 16 16 16 16 17 17 17 17 18 18 18 18 19 19 19 19 20 20 20 20 21 21 21 21 22 22 22 22 23 23 23 23 24 24 24 24 25 25 25 25 26 26 26 26 27 27 27 27 28 28 28 28
[113] 29 29 29 29 30 30 30 30 31 31 31 31 32 32 32 32 33 33 33 33 34 34 34 34 35 35 35 35 36 36 36 36 37 37 37 37 38 38 38 38 39 39 39 39 40 40 40 40 41 41 41 41 42 42 42 42
[169] 43 43 43 43 44 44 44 44 45 45 45 45 46 46 46 46 47 47 47 47 48 48 48 48 49 49 49 49 50 50 50 50
$y
[1] 0.026993060 0.294678438 0.504879733 0.838063411 -0.064088946 2.135580858 -0.102658228 0.431988262 -0.637015147 0.829512463 -0.996151098 -1.172257558
[13] -0.547533963 -0.296305725 -0.931871638 0.132416422 1.642639759 0.194497232 1.118328557 0.956495918 -0.310032813 0.515213705 -1.037700683 1.393111976
[25] -0.066293502 0.797577852 -0.935548272 -0.719678258 -0.142148393 -0.707048459 1.064033885 -1.085824814 -0.811128998 -0.518701709 -1.245711539 -1.242911556
[37] -0.037287432 0.338006972 0.065670432 -0.481413775 0.739121455 -1.549421883 1.592945823 0.156952157 0.675959428 -0.575613220 0.100065338 1.144788985
[49] 0.203182220 0.016312288 1.094622448 -1.648530696 0.230619598 -2.295092814 0.380279399 -0.315248921 -0.331125608 -0.687076351 0.404710970 -0.038534007
[61] 0.281532518 0.715078091 -1.425030335 -1.151016981 -0.187804492 0.776883927 1.076076376 1.690008957 0.697057058 1.165423111 -1.997176896 -0.831881308
[73] -0.455523243 -0.584201492 0.393350755 -0.255847690 -1.040633917 -1.101931637 0.661327615 -0.883798199 0.716656937 0.211037794 -1.563880433 0.966980798
[85] -0.582617281 -0.399425564 -0.060571701 1.473233891 0.227961969 2.343480726 -1.686202457 0.015734558 1.114514997 -0.712103496 0.031194180 1.446668581
[97] -1.673851859 1.227383369 0.796922764 -0.973365207 0.605591599 -0.101965655 -0.727886304 -1.206806676 -0.556067045 0.858942458 0.921756082 -1.163805082
[109] 1.859318210 1.038252385 1.089760225 1.609051927 -2.118524593 -2.309381226 -1.139945318 0.768823310 0.179539302 -0.062310755 -0.067278419 0.088842584
[121] 0.874852839 1.512254586 0.757874720 0.947273258 -0.813710985 -0.403531497 -0.136977139 -0.997688339 0.483562032 -0.867146646 -0.422424119 0.186733548
[133] -0.824017861 -1.179733762 -0.005953189 0.126401855 1.163641753 0.801800304 -0.043334431 1.568972827 -1.371857977 0.154702529 1.117423071 -0.521789561
[145] 2.946917504 0.252095390 0.365654201 -0.441102137 -1.433798849 -0.351040775 -1.086362052 -1.785369041 -0.124113300 0.498320681 1.791852973 0.506375434
[157] -0.628787515 0.800924879 -0.665474402 0.889657871 -0.238493159 1.376453226 0.571182247 0.034757974 1.737456561 0.606837749 0.678950573 1.137989338
[169] -0.859358493 -0.288164957 -0.515627109 -1.512983638 -1.763944833 0.103986638 -0.326272991 0.057044237 -1.310907039 0.294874018 0.156111141 -1.648735228
[181] 0.301116343 -1.261611563 -0.172779808 -0.064322524 -0.093500569 -1.024119434 0.626138072 1.386070237 1.404419811 -0.387153066 2.363686265 -0.718435361
[193] 0.810834107 -1.543641639 -2.269518803 0.682420125 0.100678757 -0.653388345 -0.162460478 0.224488114