# Fitting a Bayesian Factor Analysis Model in Stan

I learned about Stan a couple days ago. I’m using R version 4.0.2 and RStudio 1.3.1056 on macOS Catalina version 10.15.6. I have installed the rstan package version 2.21.2.

Issue I want to reproduce the results from a GitHub page dated from 2015 and titled " Fitting a Bayesian Factor Analysis Model in Stan" (found here).
The authors have supplied code, but some of the code does not work on my machine.

``````data {
int<lower=1> N;                // number of
int<lower=1> P;                // number of
matrix[N,P] Y;                 // data matrix of order [N,P]
int<lower=1> D;              // number of latent dimensions
}
transformed data {
int<lower=1> M;
vector[P] mu;
mu <- rep_vector(0.0,P);
}
parameters {
vector[M] L_t;   // lower diagonal elements of L
vector<lower=0>[D] L_d;   // lower diagonal elements of L
vector<lower=0>[P] psi;         // vector of variances
real<lower=0>   mu_psi;
real<lower=0>  sigma_psi;
real   mu_lt;
real<lower=0>  sigma_lt;
}
transformed parameters{
cov_matrix[P] Q;   //Covariance mat
{
int idx1;
int idx2;
real zero;
zero <- 0;
for(i in 1:P){
for(j in (i+1):D){
idx1 <- idx1 + 1;
L[i,j] <- zero; //constrain the upper triangular elements to zero
}
}
for (j in 1:D) {
L[j,j] <- L_d[j];
for (i in (j+1):P) {
idx2 <- idx2 + 1;
L[i,j] <- L_t[idx2];
}
}
}
Q<-L*L'+diag_matrix(psi);
}
model {
// the hyperpriors
mu_psi ~ cauchy(0, 1);
sigma_psi ~ cauchy(0,1);
mu_lt ~ cauchy(0, 1);
sigma_lt ~ cauchy(0,1);
// the priors
L_d ~ cauchy(0,3);
L_t ~ cauchy(mu_lt,sigma_lt);
psi ~ cauchy(mu_psi,sigma_psi);
//The likelihood
for( j in 1:N)
Y[j] ~ multi_normal(mu,Q);
}
``````

Here is the R code up to the problematic part.

``````library("rstan")
library("parallel")

fa.data <-list(P=P,N=N,Y=Y,D=D)

# a function to generate intial values that are slightly jittered for each chain.
init_fun = function() {
init.values<-list(L_t=rep(0,24)+runif(1,-.1,.1),
L_d=rep(.5,D)+runif(1,-.1,.1),
psi=rep(.2,P)+runif(1,-.1,.1),
sigma_psi=0.15+runif(1,-.1,.1),
mu_psi=0.2++runif(1,-.1,.1),
sigma_lt=0.5+runif(1,-.1,.1),
mu_lt=0.0+runif(1,-.1,.1))
return(init.values);
}

#compile the model
fa.model<- stan("fa.stan",
data = fa.data,
chains =0,
pars=c("L","psi","sigma_psi","mu_psi","sigma_lt","mu_lt"))
``````

I receive the following messages after running the code:

``````DIAGNOSTIC(S) FROM PARSER:
Info: assignment operator <- deprecated in the Stan language; use = instead.
Info: integer division implicitly rounds to integer. Found int division: D * D - 1 / 2
Positive values rounded down, negative values rounded up or down in platform-dependent way.
.
.
.
the number of chains is less than 1; sampling not done
Warning message:
In readLines(file, warn = TRUE) :
incomplete final line found on '...fa.stan'
``````

So I changed updated every case of `<-` to `=`. I added a blank line to the fa.stan file at the very end. I also changed `chains=0` to `chains=4`. Running the code after making these changes gave me the following messages:

``````SAMPLING FOR MODEL 'fa' NOW (CHAIN 1).
Chain 1: Unrecoverable error evaluating the log probability at the initial value.
Chain 1: Exception: []: accessing element out of range. index -2147483647 out of range; expecting index to be between 1 and 24; index position = 1L_t  (in 'model8d30668ceee6_fa' at line 40)

[1] "Error in sampler\$call_sampler(args_list[[i]]) : "
[2] "  Exception: []: accessing element out of range. index -2147483647 out of range; expecting index to be between 1 and 24; index position = 1L_t  (in 'model8d30668ceee6_fa' at line 40)"
error occurred during calling the sampler; sampling not done
.
.
.
[[1]]
Stan model 'fa' does not contain samples.

[[2]]
Stan model 'fa' does not contain samples.

[[3]]
Stan model 'fa' does not contain samples.

[[4]]
Stan model 'fa' does not contain samples.

Warning message:
In .local(object, ...) :
some chains had errors; consider specifying chains = 1 to debug
``````

So I changed `chains=4` to `chains=1` , re-ran the code and received this message:

``````.
.
.
SAMPLING FOR MODEL 'fa' NOW (CHAIN 1).
Chain 1: Unrecoverable error evaluating the log probability at the initial value.
Chain 1: Exception: []: accessing element out of range. index -2147483647 out of range; expecting index to be between 1 and 24; index position = 1L_t  (in 'model8d30668ceee6_fa' at line 40)

[1] "Error in sampler\$call_sampler(args_list[[i]]) : "
[2] "  Exception: []: accessing element out of range. index -2147483647 out of range; expecting index to be between 1 and 24; index position = 1L_t  (in 'model8d30668ceee6_fa' at line 40)"
error occurred during calling the sampler; sampling not done
``````

The author of the page was able to get the code to run somehow. I don’t understand how to properly address any of the messages I was given. I am hoping that someone can help me get this code working.

Hi Anthony,

The error:

``````Exception: []: accessing element out of range. index -2147483647
``````

Indicates that the index being used to access an element of a container is larger than the dimensions of a given container. The size of the index: `index -2147483647`, indicates that the index variable is likely not properly initialised. We can see that occurring with this stan code because of the variables: `idx1` and `idx2`:

``````{
int idx1;
int idx2;
real zero;
zero <- 0;
for(i in 1:P){
for(j in (i+1):D){
idx1 <- idx1 + 1;
L[i,j] <- zero; //constrain the upper triangular elements to zero
}
}
for (j in 1:D) {
L[j,j] <- L_d[j];
for (i in (j+1):P) {
idx2 <- idx2 + 1;
L[i,j] <- L_t[idx2];
}
}
}
``````

When declaring a new variable in Stan, it automatically gets initialised to `nan`, which for integers looks a lot like: -2147483647. In the Stan code, both `idx1` and `idx2` are declared without assigning any values to them, so they get initialised to that large negative number. When the indexes get incremented:

``````      idx2 <- idx2 + 1;
``````

This is just adding 1 to the large negative number, so the index continues to stay way out of bounds. I believe you need to initialise the indexes to zero:

``````  int idx1 = 0;
int idx2 = 0;
``````

But I haven’t looked at the model too closely, so double-check that that would be right

1 Like

Wow… I spent a lot of time verifying whether the code inside the for-loops made sense and didn’t once think about that. I’ve made the change and have attempted to re-run the code. I’m not receiving any of those messages and it seems like it is in the process of sampling.

Many thanks for the speedy response! Will update once/if the run completes.

1 Like

That seemed to do the trick! I was able to run the rest of the code from the GitHub page. The results differ greatly, but I may need to rerun everything after a fresh restart of RStudio. Thanks again!