Convergence problems with factor loadings in a multivariate autoregressive model

Hi everyone,

I am trying to implement the MAR/VAR model of Ovaskainen et al. (2017). I have simplified the model by assuming species only interact with individuals of their own species (i.e. no interspecific interactions). I have focused on estimating the variance-covariance matrix, assuming a multiplicative gamma shrinkage prior on the factor loadings (Bhattacharya and Dunson 2011). I have used some code from the Stan forum, mainly from here.
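
To restate the Stan code below in equations: the likelihood is

Y_t \sim \mathrm{MultiNormal}\big(r + \mathrm{diag}(a)\,Y_{t-1},\ \Omega\big), \qquad \Omega = \Lambda\Lambda^T + \mathrm{diag}(\sigma^2),

where \Lambda (lam in the code) is the sp \times nf matrix of factor loadings that carries the multiplicative gamma shrinkage prior, and \sigma^2 are the species-specific residual variances.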

Here is the Stan code:

data {
  int<lower=0> sp;   // Number of species
  int<lower=0> T;    // Number of time steps
  int<lower=0> nf;   // Number of latent variables
  vector[sp] Y[T];   // The outcome matrix

}
parameters {
  real<lower=0> a1;            // shape for delta[1]
  real<lower=0> a2;            // shape for delta[2:nf]
  vector[sp] r;                // intrinsic growth rates
  vector[sp] a;                // density dependence (intraspecific)
  vector<lower=0>[nf] delta;   // multiplicative gamma components (theta is their cumulative product)
  matrix<lower=0>[sp,nf] phi;  // local shrinkage parameters (Bhattacharya & Dunson)
  matrix[sp,nf] lam;           // factor loadings
  vector<lower=0>[sp] s2i;     // inverse residual variances
}
transformed parameters{
  vector<lower=0>[nf] theta;
  vector<lower=0>[sp] s2;
  matrix[sp,sp] omega;
  
  theta[1] = delta[1];
  for(l in 2:nf)
     theta[l] = theta[l-1] * delta[l];
     
  s2 = 1.0 ./ s2i;
  
  omega = diag_matrix(s2) + lam * lam';  // residual covariance: diagonal plus low-rank loadings part
}
model {
  // priors
  a1 ~ gamma(2.0, 1.0);
  a2 ~ gamma(2.0, 1.0);
  r ~ normal(0,1);
  a ~ normal(0,1);
  
  delta[1] ~ gamma(a1, 1.0);
  for (h in 2:nf)
      delta[h] ~ gamma(a2, 1.0);

  for (s in 1:sp)
      for (l in 1:nf)
          phi[s, l] ~ gamma(3.0 / 2.0, 3.0 / 2.0);

  // note: the second argument of normal() in Stan is the standard deviation
  for (s in 1:sp)
      for (l in 1:nf)
          lam[s, l] ~ normal(0.0, 1.0 / phi[s, l] / theta[l]);

  for (s in 1:sp)
      s2i[s] ~ gamma(1.0, 0.3);
  
  //Likelihood
  for (t in 2:T)
      Y[t] ~ multi_normal(r + diag_matrix(a) * Y[t-1], omega);
}
generated quantities{
  vector[sp] sde;
  matrix[sp, sp] Rho;
  sde = sqrt(s2);
  for(i in 1:sp)
    for(j in 1:sp)
      Rho[i,j] = omega[i,j]/sqrt(omega[i,i] * omega[j,j]);
}

And here is the R code that simulates the data and fits the model:

# Clean the environment
rm(list = ls())

# Load libraries
library(rstan)
library(mvtnorm)
library(rethinking)
library(truncnorm)

# Simulate a multispecies time series following a Gompertz model;
# species biomass is simulated on the log scale
Gompertz.sim <- function(init.pop, r, alpha, time, sp, Rho, sd_noise){
  metacommunity <- matrix(NA, nrow = time, ncol = sp)
  metacommunity[1, ] <- init.pop 
  for (t in 2:time) {
    metacommunity[t, ] <- r + alpha * metacommunity[t-1, ] +
      t(rmvnorm2(1, Mu = rep(0, sp), sigma = diag(sd_noise), Rho = Rho))
  }
  return(metacommunity)
}


P0 <- 1                  # Initial population biomass for all species (log scale)
sp <- 10                 # Number of species
t <- 200                 # Number of time steps
r <- rnorm(sp, 0.65, 0)  # Intrinsic growth rate for all species
alpha <- rtruncnorm(sp, a = -Inf, b = 0.95, mean = 0.94, sd = 0.1) # Density dependence for all sp

# Process error standard deviation for all sp
sd_noise <- matrix(0, nrow = sp, ncol = sp)   
diag(sd_noise) <- rlnorm(sp, -1.5, 0.1)       

Rho <- rlkjcorr(1, sp, 2) # Correlation matrix
Sigma <- sd_noise %*% Rho %*% sd_noise  # Variance-covariance matrix

# Simulate the community dynamics 
Sim.data <- Gompertz.sim(init.pop = P0, r = r, alpha = alpha, time = t, sp = sp, sd_noise = sd_noise, Rho = Rho)

# Prepare the stan data
dat <- list(Y = Sim.data, 
            sp = ncol(Sim.data), 
            T = nrow(Sim.data),
            nf = 2)

# Estimate the posterior
MAR_model <- stan(file = "MAR_Ovas.stan",
                  data = dat,
                  chains = 4, 
                  cores = 4,
                  iter = 2000,
                  control = list(adapt_delta = 0.9))

The model recovers the true parameters fairly well. All the parameters seem to converge (good Rhat, n_eff and traceplots) except for the factor loadings lam, which only converge if I increase the number of iterations from 2000 to 15000.

I am new to Stan and latent variable models. Any ideas on how to make my model specification more efficient? Or any other suggestions on how to estimate the factor loadings with different priors?

Thanks,
Alfonso

Take a look at the post Help with factor analysis (latent variable model) and the latent factor model in the Stan manual. That post references @rfarouni's blog post http://rfarouni.github.io/2015-04-26-fa/ which has a lot more detail.

That post shows how to write the multivariate normal in terms of Cholesky factors. You can also remove some of the double loops through vectorization, for example:

// your formulation
  for (s in 1:sp)
      for (l in 1:nf)
          phi[s, l] ~ gamma(3.0 / 2.0, 3.0 / 2.0);

// faster: this takes phi[s] as a row_vector of length nf and places
// the gamma prior on each element of it
  for (s in 1:sp)
      phi[s] ~ gamma(3.0 / 2.0, 3.0 / 2.0);
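
If you want to go even further, you can drop the loops entirely. A small sketch (same priors, and the same scale expression as in your lam statement; phi[s]' and lam[s]' are the transposed rows of those matrices):

// fully vectorized: one statement for all of phi,
// and one statement per species for the loadings
  to_vector(phi) ~ gamma(3.0 / 2.0, 3.0 / 2.0);

  for (s in 1:sp)
      lam[s]' ~ normal(0.0, inv(phi[s]' .* theta));  // elementwise sd = 1 / (phi * theta)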

Thanks, that helped a lot.

Here is the Stan model:

data {
  int<lower=1> S;   // Number of species
  int<lower=1> T;   // Number of years
  int<lower=1> D;   // Number of latent dimensions
  vector[S] Y[T];   // The outcome matrix
}
transformed data{
  int<lower=1> M;
  M = D * (S - D) + D * (D - 1) / 2;  // number of free off-diagonal loadings in L
}
parameters {
  vector[S] r;   // Vector of intrinsic growth rates
  vector[S] a;              // Vector of density-dependence parameters
  vector[M] L_t;            // free lower-triangular (off-diagonal) elements of L
  vector<lower=0>[D] L_d;   // diagonal elements of L, constrained positive
  vector<lower=0>[S] psi;   // residual variances
  real<lower=0>   mu_psi;
  real<lower=0>  sigma_psi;
  real   mu_lt;
  real<lower=0>  sigma_lt;
}
transformed parameters{
  cholesky_factor_cov[S, D] L;   // lower-triangular factor loadings matrix
  cov_matrix[S] Omega;           // covariance matrix
  {
    int idx;
    idx = 0;
    for (i in 1:(D-1)) {
      for (j in (i+1):D) {
        L[i, j] = 0;   // constrain the upper-triangular elements to zero
      }
    }
    for (j in 1:D) {
      L[j, j] = L_d[j];
      for (i in (j+1):S) {
        idx = idx + 1;
        L[i, j] = L_t[idx];
      }
    }
  }
  Omega = L * L' + diag_matrix(psi);
}
model {
// the hyperpriors 
  mu_psi ~ cauchy(0, 1);
  sigma_psi ~ cauchy(0,1);
  mu_lt ~ cauchy(0, 1);
  sigma_lt ~ cauchy(0,1);
// the priors 
  L_d ~ cauchy(0,3);
  L_t ~ cauchy(mu_lt,sigma_lt);
  psi ~ cauchy(mu_psi,sigma_psi);
  r ~ normal(0,1);
  a ~ normal(0,1);
  
//The likelihood
  for (t in 2:T)
      Y[t] ~ multi_normal(r + diag_matrix(a) * Y[t-1], Omega);
}
generated quantities{
  matrix[S, S] Rho;
  vector[S] sde;
  for(i in 1:S)
    for(j in 1:S)
      Rho[i,j] = Omega[i,j]/sqrt(Omega[i,i] * Omega[j,j]);
  
  sde = sqrt(diagonal(Omega));
}

The model runs well, but I wonder if there are any changes that could be made to increase the efficiency.

Use multi_normal_cholesky directly since you have L and psi. See the discussion above at Help with factor analysis (latent variable model) or search the manual for the Cholesky parameterization. You can also set up the lagged Y in transformed data and make a single vectorized call to multi_normal_cholesky, as in the sketch below.
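
Something along these lines, perhaps (just a sketch, not tested; it reuses S, T, Y, r and a from your model and assumes L_Omega is a Cholesky factor of the covariance, as discussed below):

transformed data {
  vector[S] Y_lag[T-1];   // outcomes at time t-1
  vector[S] Y_cur[T-1];   // outcomes at time t
  for (t in 1:(T-1)) {
    Y_lag[t] = Y[t];
    Y_cur[t] = Y[t + 1];
  }
}
...
model {
  vector[S] Mu[T-1];
  ...
  for (t in 1:(T-1))
    Mu[t] = r + a .* Y_lag[t];   // elementwise, same as diag_matrix(a) * Y_lag[t]
  Y_cur ~ multi_normal_cholesky(Mu, L_Omega);   // one vectorized call
}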

From what I understood, \Sigma = LL^T + \Psi, where L is the matrix of factor loadings with the constraints from @rfarouni's blog. In Help with factor analysis (latent variable model) they used multi_normal_cholesky for the factor scores, not for the factor loadings. The Ld in their code is the product of the factor-score sds and the factor-score correlation matrix, whereas the L in my code is the factor loadings, i.e. what is called lambda in their code. So are you suggesting to write

model {
    for (t in 2:T)
        Y[t] ~ multi_normal_cholesky(r + diag_matrix(a) * Y[t-1], L);
}

In this case I don’t know what to do with psi

Or

transformed parameters {
    L_s = diag_pre_multiply(psi, L);
}
model {
    for (t in 2:T)
        Y[t] ~ multi_normal_cholesky(r + diag_matrix(a) * Y[t-1], L_s);
}

But L is not the Cholesky factor of a correlation matrix.

I am sure I am missing something conceptually.

And thanks a lot for your help,
Alfonso

I think just something like

transformed parameters {
...
Omega = add_diag(multiply_lower_tri_self_transpose(L), psi);
L_Omega = cholesky_decompose(Omega);
...
}
model { 
...
 for (t in 2:T)
      Y[t] ~ multi_normal_cholesky(r + a .* Y[t-1], L_Omega);  // a .* Y[t-1] is elementwise, same as diag_matrix(a) * Y[t-1]
...
}
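
Note that psi is already folded into the covariance before the decomposition,

\Omega = LL^T + \mathrm{diag}(\psi), \qquad L_\Omega L_\Omega^T = \Omega,

so L_Omega is the Cholesky factor of the full residual covariance, which is exactly what multi_normal_cholesky expects.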

Hey :)

You should definitely look at Mauricio’s code in this thread:

It has been of great help to me!

Lucas

Hey,

It seems add_diag has not been included in rstan, but it is available in the Stan math library. Any idea on how to load it into my workspace?

So, instead of the add_diag version, I did:

transformed parameters {
...
  Omega = L * L' + diag_matrix(psi);
  L_Omega = cholesky_decompose(Omega);
...
}
model { 
...
  for (t in 2:T)
      Y[t] ~ multi_normal_cholesky(r + diag_matrix(a) * Y[t-1], L_Omega);
...
}

The model runs faster than before. However, I realised that in both cases, multi_normal and multi_normal_cholesky, the model has low n_eff and high Rhat for the elements of L when I increase the number of species or the number of latent dimensions. Traceplots of L:

It looks like some of the elements of L are bimodal. I used the same constraints for the factor loadings L as @ldeschamps. So the difference I see is that you model the latent factor scores FS_UNC together with L_UNC to calculate Mu, whereas I use the factor loadings and psi to estimate the variance-covariance matrix. Is that right? If I am to include FS_UNC in my model, I am not sure how to use it. Should I write something like

...
transformed parameters {
  matrix[T, S] Mu;
...
  for (t in 2:T)
    Mu[t] = r + diag_matrix(a) * Y[t-1] + FS_UNC * L_UNC';
}

If so, I am not sure what covariance argument to use in multi_normal:

...
  for (t in 2:T)
      Y[t] ~ multi_normal(Mu[t], ????);
}
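
My only guess (and I may well be wrong) is that, if the factor scores enter the mean, then conditional on them only the diagonal residual variances psi should remain, something like:

  for (t in 2:T)
      Y[t] ~ multi_normal(Mu[t], diag_matrix(psi));   // residual covariance given the scores?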

Thanks a lot for the help :)

Cheers,
Alfonso