Hello everyone,
I am fitting a multivariate autoregressive model, MAR, (also known as vector autoregressive model, VAR) to simulated data from its generative process. The model is from Ives et al. 2003. I am assuming that species only interact with individuals of their species, setting interspecific competition to 0.
Here is the Stan code (it takes around 2 minutes to run in my laptop)
data {
int<lower = 0> sp; // Number of species
int<lower = 0> T; // Number of years
vector[sp] Y[T]; // The outcome matrix
}
parameters {
vector[sp] r; // Vector of intrinsic growth rates
vector[sp] a; // Vector with density dependence parameters
cholesky_factor_corr[sp] L_Rho; // Cholesky corr matrix for species correlations
vector<lower=0>[sp] sde; // Species sd
}
transformed parameters{
matrix[sp,sp] L_S;
L_S = diag_pre_multiply(sde, L_Rho); // Variance covariance matrix
}
model {
// Priors
a ~ normal(0,1);
r ~ normal(0,1);
L_Rho ~ lkj_corr_cholesky(2);
sde ~ exponential(1);
//Likelihood
for(t in 2:T){
Y[t] ~ multi_normal_cholesky(r + diag_matrix(a) * Y[t-1], L_S);
}
}
generated quantities {
matrix[sp,sp] Rho;
matrix[sp,sp] S;
Rho = multiply_lower_tri_self_transpose(L_Rho); // Return the correlation matrix
S = L_S*L_S'; // Return variance covariance matrix
}
And here is the R code simulating the data
# Clean the environment
rm(list = ls())
# Load libraries
library(rstan)
library(mvtnorm)
library(rethinking)
library(truncnorm)
# simulate time series following a gompertz curve. Simulates species biomass on a log scale
Gompertz.sim <- function(init.pop, r, alpha, time, sp, Rho, sd_noise){
metacommunity <- matrix(NA, nrow = time, ncol = sp)
metacommunity[1,] <- init.pop
for (t in 2:time) {
metacommunity[t,]<- r + alpha*metacommunity[t-1,] + t(rmvnorm2(1, Mu = rep(0, sp), sigma = diag(sd_noise), Rho = Rho))
}
return(metacommunity)
}
P0<- 1 # Initial population biomass for all species on log scale
sp <- 5 # Number of species
t <- 200 # Number of timesteps
r <- rep(0.5, sp) # intrinsict grwoth rate for all species
alpha <- rtruncnorm(sp, a = -Inf, b = 0.95, mean = 0.94, sd = 0.1) # Density dependence for all sp
# Process error standard deviation for all sp
sd_noise <- matrix(0, nrow = sp, ncol = sp)
diag(sd_noise) <- rlnorm(sp, -1.5, 0.1)
Rho <- rlkjcorr(1, sp, 2) # Correlation matrix
Sigma <- sd_noise%*%Rho%*%sd_noise # variance-covariance matrix
# Simulate the community dynamics
Sim.data <- Gompertz.sim(init.pop = P0, r = r, alpha = alpha, time = t, sp = sp, sd_noise = sd_noise, Rho = Rho)
# Prepare the stan data
dat<-list(Y = Sim.data,
sp = ncol(Sim.data),
T=nrow(Sim.data))
# Estimate the posterior
MAR_model <- stan(file = "MAR_model.stan",
data = dat,
chains = 4,
cores = 4,
iter = 3000)
When I run the model I get the next warning messages:
Warning messages:
1: The largest R-hat is NA, indicating chains have not mixed.
Running the chains for more iterations may help. See
http://mc-stan.org/misc/warnings.html#r-hat
2: Bulk Effective Samples Size (ESS) is too low, indicating posterior means and medians may be unreliable.
Running the chains for more iterations may help. See
http://mc-stan.org/misc/warnings.html#bulk-ess
3: Tail Effective Samples Size (ESS) is too low, indicating posterior variances and tail quantiles may be unreliable.
Running the chains for more iterations may help. See
http://mc-stan.org/misc/warnings.html#tail-ess
However, when I check the n_eff and Rhat, I get values indicating convergence. I suspect it has something to do with the L_Rho upper triangular elements being 0 and having NaN for their se_mean, n_eff and Rhat. However, I implemented the non-centered parameterization with Cholesky factors as I learnt it in Richard McElreathās book, Statistical Rethinking. Indeed, I have ran a model from the book that worked correctly in late 2019, but now it is producing the same warning messages I encounter with the model above.
Running on
R version 3.6.3
rstan version 2.19.3
Any ideas?
Many thanks,
Alfonso