The largest R-hat is NA with non-centered multivariate autoregressive model

Alfonso · April 30, 2020, 2:42am

Hello everyone,

I am fitting a multivariate autoregressive model, MAR, (also known as vector autoregressive model, VAR) to simulated data from its generative process. The model is from Ives et al. 2003. I am assuming that species only interact with individuals of their species, setting interspecific competition to 0.

Here is the Stan code (it takes around 2 minutes to run in my laptop)

data {
  int<lower = 0> sp;  // Number of species
  int<lower = 0> T;   // Number of years
  vector[sp] Y[T];   // The outcome matrix
}
parameters {
  vector[sp] r;                    // Vector of intrinsic growth rates
  vector[sp] a;                    // Vector with density dependence  parameters
  cholesky_factor_corr[sp] L_Rho;  // Cholesky corr matrix for species correlations
  vector<lower=0>[sp] sde;         // Species sd
}
transformed parameters{
  matrix[sp,sp] L_S;
  L_S = diag_pre_multiply(sde, L_Rho);   // Variance covariance matrix 
}
model {
  // Priors 
  a ~ normal(0,1);
  r ~ normal(0,1);
  L_Rho ~ lkj_corr_cholesky(2);
  sde ~ exponential(1);
  
  //Likelihood
  for(t in 2:T){
    Y[t] ~ multi_normal_cholesky(r + diag_matrix(a) * Y[t-1], L_S);
  }
}
generated quantities {
  matrix[sp,sp] Rho;
  matrix[sp,sp] S;
  Rho = multiply_lower_tri_self_transpose(L_Rho);  // Return the correlation matrix 
  S = L_S*L_S';                                    // Return variance covariance matrix
}

And here is the R code simulating the data

# Clean the environment
rm(list = ls())

# Load libraries
library(rstan)
library(mvtnorm)
library(rethinking)
library(truncnorm)

# simulate time series following a gompertz curve. Simulates species biomass on a log scale
Gompertz.sim <- function(init.pop, r, alpha, time, sp, Rho, sd_noise){
  metacommunity <- matrix(NA, nrow = time, ncol = sp)
  metacommunity[1,] <- init.pop 
  for (t in 2:time) {
    metacommunity[t,]<- r + alpha*metacommunity[t-1,] + t(rmvnorm2(1, Mu = rep(0, sp), sigma = diag(sd_noise), Rho = Rho))
  }
  return(metacommunity)
}


P0<- 1            # Initial population biomass for all species on log scale
sp <- 5           # Number of species
t <- 200          # Number of timesteps 
r <- rep(0.5, sp) # intrinsict grwoth rate for all species
alpha <- rtruncnorm(sp, a = -Inf, b = 0.95, mean = 0.94, sd = 0.1) # Density dependence for all sp

# Process error standard deviation for all sp
sd_noise <- matrix(0, nrow = sp, ncol = sp)   
diag(sd_noise) <- rlnorm(sp, -1.5, 0.1)       

Rho <- rlkjcorr(1, sp, 2) # Correlation matrix
Sigma <- sd_noise%*%Rho%*%sd_noise # variance-covariance matrix

# Simulate the community dynamics 
Sim.data <- Gompertz.sim(init.pop = P0, r = r, alpha = alpha, time = t, sp = sp, sd_noise = sd_noise, Rho = Rho)

# Prepare the stan data
dat<-list(Y = Sim.data, 
          sp = ncol(Sim.data), 
          T=nrow(Sim.data))

# Estimate the posterior
MAR_model <- stan(file = "MAR_model.stan",
                  data = dat,
                  chains = 4, 
                  cores = 4,
                  iter = 3000)

When I run the model I get the next warning messages:

Warning messages:
1: The largest R-hat is NA, indicating chains have not mixed.
Running the chains for more iterations may help. See
http://mc-stan.org/misc/warnings.html#r-hat
2: Bulk Effective Samples Size (ESS) is too low, indicating posterior means and medians may be unreliable.
Running the chains for more iterations may help. See
http://mc-stan.org/misc/warnings.html#bulk-ess
3: Tail Effective Samples Size (ESS) is too low, indicating posterior variances and tail quantiles may be unreliable.
Running the chains for more iterations may help. See
http://mc-stan.org/misc/warnings.html#tail-ess

However, when I check the n_eff and Rhat, I get values indicating convergence. I suspect it has something to do with the L_Rho upper triangular elements being 0 and having NaN for their se_mean, n_eff and Rhat. However, I implemented the non-centered parameterization with Cholesky factors as I learnt it in Richard McElreath’s book, Statistical Rethinking. Indeed, I have ran a model from the book that worked correctly in late 2019, but now it is producing the same warning messages I encounter with the model above.

Running on
R version 3.6.3
rstan version 2.19.3

Any ideas?

Many thanks,
Alfonso

bgoodri · April 30, 2020, 3:46am

It is probably fine, although you could call monitor(MAR_model) to confirm that none of the actual parameters have high R-hat or low Bulk / Tail effective sample size.

Alfonso · April 30, 2020, 4:54am

Thanks for the quick reply. Yes the model seems to be converging fine. All the Rhat are below 1.05 and the bulk_Ess and Tail_Ess are high, even for all the L_Rho parameters. I wonder why the warning message is appearing, though.

bgoodri · April 30, 2020, 5:11am

It isn’t supposed to, but it checks for R_hat < 1.05 which is FALSE if any of the R_hat are NA.

Alfonso · April 30, 2020, 6:07am

That makes sense. I didn’t encounter this before. Now other models that converged without this warnings are producing this warnings. I wonder if it may be due to a minor bug in the last rstan version.

bgoodri · April 30, 2020, 10:04pm

Yeah, it is treating 1 < NA as FALSE.

T_nick · May 4, 2020, 11:16am

Same problem here. I guess it is an issue of 2.19.3. It didn’t occur in the previous one

martinmodrak · May 5, 2020, 10:11am

Could you (@Alfonso and @T_nick) try running your models with CmdStanR (https://mc-stan.org/cmdstanr/articles/cmdstanr.html)? The CRAN Rstan version is old-ish while CmdStanR can use the latest version, so it would be useful to see if the problem persists with the latest version. Thanks for letting us now!

Alfonso · May 6, 2020, 3:32am

I have installed the package from github but when I run install_cmdstan() I get

Removing the existing installation of CmdStan…
Latest CmdStan release is v2.23.0
Installing CmdStan v2.23.0 in C:/Users/alfon/Documents/.cmdstanr
Downloading cmdstan-2.23.0.tar.gz from GitHub…
Download complete
Unpacking archive…
Building CmdStan binaries…
Error in rethrow_call(c_processx_exec, command, c(command, args), stdin, :
Command ‘mingw32-make.exe’ not found @win/processx.c:983 (processx_exec)
Type .Last.error.trace to see where the error occured

and with .Last.error.trace
Stack trace:

cmdstanr:::install_cmdstan(overwrite = T)
cmdstanr:::build_cmdstan(dir_cmdstan, cores, quiet, timeout)
processx::run(make_cmd(), args = c(paste0("-j", cores), “build”), …
process$new(command, args, echo_cmd = echo_cmd, wd = wd, windows_verbatim_args = windows_verbatim_arg …
.subset2(public_bind_env, “initialize”)(…)
processx:::process_initialize(self, private, command, args, stdin, …
rethrow_call(c_processx_exec, command, c(command, args), stdin, …

x Command ‘mingw32-make.exe’ not found @win/processx.c:983 (processx_exec)

Not sure how to fix this.

P.S. I am working on windows 10

martinmodrak · May 6, 2020, 7:09am

This might be a PATH issue, do you have RTools on your path? My Rtools are located in C:\Rtools and my path has: C:\Rtools\bin;c:\Rtools\mingw_64\bin; I roughly remember removing the c:\Rtools\mingw_32\bin from PATH at some point, but not sure what problem I was trying to solve or if that ended up helping (but I don’t have the directory in there now).

T_nick · May 27, 2020, 11:20pm

Hi @martinmodrak, apologies for the late reply. I just tried cmdstanR, I got no warnings everything looks normal!

Topic		Replies	Views
Convergence problems with factor loadings in multivariante autoregressive model Modeling rstan , fitting-issues , specification , ecology	11	1586	July 31, 2020
Help with Multivariate Autoregressive Model with random effects Modeling rstan , specification , performance , multivariate-normal , autoregressive-model	2	184	April 26, 2024
Multivariate logistic model and Cholesky correlation matrix Modeling rstan , techniques	3	221	December 3, 2024
Multivariate analysis Modeling rstan , fitting-issues , specification	4	541	April 22, 2021
Generalized linear latent variable model in Stan Modeling	3	3793	October 9, 2017

The largest R-hat is NA with non-centered multivariate autoregressive model

Related topics