Hi everyone,

I am trying to implement a MAR/VAR model by Ovaskainen et al. 2017. I have simplified the model by assuming species only interact with individuals of their own species (i.e. no interspecific interactions). I have focused on estimating the variance-covariance matrix assuming a multiplicative gamma prior on the factor loadings (Bhattacharya and Dunson 2011). I have used some code from the Stan forum, mainly from here.

Here is the stan code:

data {
int<lower=0> sp;  // Number of species
int<lower=0> T;   // Numer of time steps
int <lower=0> nf; // Number of latent variables
vector[sp] Y[T];   // The outcome matrix

}
parameters {
real<lower=0> a1;
real<lower=0> a2;
vector[sp] r;
vector[sp] a;
vector<lower=0>[nf] delta;
matrix<lower=0>[sp,nf] phi;
matrix[sp,nf] lam;
vector<lower=0>[sp] s2i;
}
transformed parameters{
vector<lower=0>[nf] theta;
vector<lower=0>[sp] s2;
matrix[sp,sp] omega;

theta[1] = delta[1];
for(l in 2:nf)
theta[l] = theta[l-1] * delta[l];

s2 = 1.0 ./ s2i;

omega = diag_matrix(s2) + lam *lam';
}
model {
// priors
a1 ~ gamma(2.0, 1.0);
a2 ~ gamma(2.0, 1.0);
r ~ normal(0,1);
a ~ normal(0,1);

delta[1] ~ gamma(a1, 1.0);
for (h in 2:nf)
delta[h] ~ gamma(a2, 1.0);

for (s in 1:sp)
for (l in 1:nf)
phi[s, l] ~ gamma(3.0 / 2.0, 3.0 / 2.0);

for (s in 1:sp)
for (l in 1:nf)
lam[s, l] ~ normal(0.0, 1.0 / phi[s, l] / theta[l]);

for (s in 1:sp)
s2i[s] ~ gamma(1.0, 0.3);

//Likelihood
for (t in 2:T)
Y[t] ~ multi_normal(r + diag_matrix(a) * Y[t-1], omega);
}
generated quantities{
vector[sp] sde;
matrix[sp, sp]Rho;
sde = sqrt(s2);
for(i in 1:sp)
for(j in 1:sp)
Rho[i,j] = omega[i,j]/sqrt(omega[i,i] * omega[j,j]);
}



And here is the R code simulating the data and fitting the model

# Clean the environment
rm(list = ls())

library(rstan)
library(mvtnorm)
library(rethinking)
library(truncnorm)

# simulate time series following a gompertz curve. Simulates species biomass on a log scale
Gompertz.sim <- function(init.pop, r, alpha, time, sp, Rho, sd_noise){
metacommunity <- matrix(NA, nrow = time, ncol = sp)
metacommunity[1,] <- init.pop
for (t in 2:time) {
metacommunity[t,]<- r + alpha*metacommunity[t-1,] + t(rmvnorm2(1, Mu = rep(0, sp), sigma = diag(sd_noise), Rho = Rho))
}
return(metacommunity)
}

P0<- 1            # Initial population biomass for all species on log scale
sp <- 10           # Number of species
t <- 200          # Number of timesteps
r <- rnorm(sp, 0.65, 0) # intrinsict grwoth rate for all species
alpha <- rtruncnorm(sp, a = -Inf, b = 0.95, mean = 0.94, sd = 0.1) # Density dependence for all sp

# Process error standard deviation for all sp
sd_noise <- matrix(0, nrow = sp, ncol = sp)
diag(sd_noise) <- rlnorm(sp, -1.5, 0.1)

Rho <- rlkjcorr(1, sp, 2) # Correlation matrix
Sigma <- sd_noise%*%Rho%*%sd_noise # variance-covariance matrix

# Simulate the community dynamics
Sim.data <- Gompertz.sim(init.pop = P0, r = r, alpha = alpha, time = t, sp = sp, sd_noise = sd_noise, Rho = Rho)

# Prepare the stan data
dat<-list(Y = Sim.data,
sp = ncol(Sim.data),
T=nrow(Sim.data),
nf = 2)

# Estimate the posterior
MAR_model <- stan(file = "MAR_Ovas.stan",
data = dat,
chains = 4,
cores = 4,
iter = 2000,


The model recovers the true parameter fairly well. All the parameters seem to converge (good Rhat, n_eff and traceplots) except for the factor ladings lam, which do fine if I increase the number of iterations from 2000 to 15000.

I am new to Stan and latent variable models. Any ideas how to make my model specification more efficient? Or, any other suggestion on how to estimate the factor loadings using other priors?

Thanks,
Alfonso

Take a look at this post Help with factor analysis (latent variable model) and the latent factor model in the Stan manual. That post references @rfarouni blog http://rfarouni.github.io/2015-04-26-fa/ which has a lot more detail.

That post shows how to write the multi normal in terms of cholesky factors. Also you can remove some of the double loops by vectorization such as

// your formulation
for (s in 1:sp)
for (l in 1:nf)
phi[s, l] ~ gamma(3.0 / 2.0, 3.0 / 2.0);

// faster, this takes phi[s] as a vector length nf and places
// the gamma prior on each element of the vector
for (s in 1:sp)
phi[s] ~ gamma(3.0 / 2.0, 3.0 / 2.0);

1 Like

Thanks, that helped a lot.

Here is the stan model:

data {
int<lower=1> S;  // Number of Secies
int<lower=1> T;   // Numer of years
int <lower=1> D; // Number of latent dimensions
vector[S] Y[T];   // The outcome matrix
}
transformed data{
int<lower=1> M;
}
parameters {
vector[S] r;   // Vector of intrinsic growth rates
vector[S] a;   // matrix with competitive parameter
vector[M] L_t;   // lower diagonal elements of L
vector<lower=0>[D] L_d;   // lower diagonal elements of L
vector<lower=0>[S] psi;         // vector of variances
real<lower=0>   mu_psi;
real<lower=0>  sigma_psi;
real   mu_lt;
real<lower=0>  sigma_lt;
}
transformed parameters{
cov_matrix[S] Omega;   //Covariance mat
{
int idx;
idx = 0;
for(i in 1:(D-1)){
for(j in (i+1):D){
L[i,j] = 0; //constrain the upper triangular elements to zero
}
}
for(j in 1:D) {
L[j,j] = L_d[j];
for(i in (j+1):S) {
idx = idx + 1;
L[i,j] = L_t[idx];
}
}
}
Omega=L*L'+diag_matrix(psi);
}
model {
// the hyperpriors
mu_psi ~ cauchy(0, 1);
sigma_psi ~ cauchy(0,1);
mu_lt ~ cauchy(0, 1);
sigma_lt ~ cauchy(0,1);
// the priors
L_d ~ cauchy(0,3);
L_t ~ cauchy(mu_lt,sigma_lt);
psi ~ cauchy(mu_psi,sigma_psi);
r ~ normal(0,1);
a ~ normal(0,1);

//The likelihood
for (t in 2:T)
Y[t] ~ multi_normal(r + diag_matrix(a) * Y[t-1], Omega);
}
generated quantities{
matrix[S, S]Rho;
vector[S] sde;
for(i in 1:S)
for(j in 1:S)
Rho[i,j] = Omega[i,j]/sqrt(Omega[i,i] * Omega[j,j]);

sde = sqrt(diagonal(Omega));
}



The model runs well, but I wonder if there is any changes that can be done to increase the efficiency.

Use multi_normal_cholesky directly since you have L and psi. See the above discussion at Help with factor analysis (latent variable model) or search the manual for cholesky parameterization. You can also set the lagged Y in transformed data and have a vectorized call to the multi_normal_cholesky.

For what I understood \Sigma=LL^T+\Psi where L is the matrix of factor loadings with the constrains from @rfarouni blog. In Help with factor analysis (latent variable model) they used multi_normal_cholesky for the factor scores, not for the factor loadings. The Ld in their code is the product between the factor scores sd and the factor scores correlation, whereas the L in my code is the factor loadings, what is defined as lambda in their code. So you are suggesting to write

model{
multi_normal_cholesky(r + diag_matrix(a) * Y[t-1], L)
}


In this case I don’t know what to do with psi

Or

transformed parameters{
L_s = diag_pre_multiply(psi, L)
}
model {
multi_normal_cholesky(r + diag_matrix(a) * Y[t-1], L_s)
}


But L is not from a correlation matrix

I am sure I am missing something conceptually.

And thanks a lot for your help,
Alfonso

I think just something like

transformed parameters {
...
L_Omega = cholesky_decompose(Omega);
...
}
model {
...
for (t in 2:T)
Y[t] ~ multi_normal_cholesky(r + diag_pre_multiply(a, Y[t-1]), L_Omega);
...
}|


Hey :)

You should definitly look at Mauricio’s code in this thread :

It has been of great help to me!

Lucas

Hey,

It seems add_diag has not been included into rstan, but it is available on Stan math library. Any idea on how to load that into my workspace?

I did:

transformed parameters {
...
Omega = L*L'+diag_matrix(psi);
L_Omega = cholesky_decompose(Omega);
...
}
model {
...
for (t in 2:T)
Y[t] ~ multi_normal_cholesky(r + diag_matrix(a)*Y[t-1], L_Omega);
...
}


The model runs faster than before. However, I realised that for both cases multi_normal and multi_normal_cholesky, the model has low n_eff and Rhat for the L when I increase the number of species or the the number of latent dimensions. L Traceplots:

It looks like some of the L are bimodal. I used the same constrains for the factor loadings L as @ldeschamps. So the difference I see is that you model the latent factors FS_UNC with L_UNC to calculate Mu, while I use the factor loadings and psi to estimate the variance-covariance matrix. Is that right? If I am to include FS_UNC in my model I am not sure how to use it. should I

...
transformed parameters{
matrix[T,S] Mu;
...
for(t in 2:N) Mu [t] = r + diag_matrix(a) * Y[t-1] + FS_UNC * L_UNC';
}


If so I am not sure what argument use in multi_normal

...
for (t in 2:T)
Y[t] ~ multi_normal(Mu[t], ????);
}


Thanks a lot for the help :)

Cheers,
Alfonso

Hi there,

After some time I got back to this problem. I implemented the sign correction in the generated quantities block. I tried first with a model not including the latent factors, only the factor loadings, similar to my model specifiaction above. The model converges and retrieves the true parameters.

Here is the stan code for that model (mar_lat_noF.stan)

data {
int<lower=1> S;  // Number of Secies
int<lower=1> T;   // Numer of years
int <lower=1> D; // Number of latent dimensions
vector[S] Y[T];   // The outcome matrix
}
transformed data{
int<lower=1> M;
}
parameters {
vector[S] r;   // Vector of intrinsic growth rates
vector[S] a;   // matrix with competitive parameter
vector[M] L_t;   // lower diagonal elements of L
vector[D] L_d;   // diagonal elements of L
vector<lower=0>[S] psi;         // vector of variances
}
transformed parameters{
cov_matrix[S] Omega;   // Covariance matrix
matrix[S,S] L_Omega;

{
int idx;
idx = 0;
for(i in 1:(D-1)){
for(j in (i+1):D){
L[i,j] = 0; //constrain the upper triangular elements to zero
}
}
for(j in 1:D) {
L[j,j] = L_d[j];
for(i in (j+1):S) {
idx = idx + 1;
L[i,j] = L_t[idx];
}
}
}

Omega = L*L';
for (i in 1:rows(Omega))
Omega[i,i] = Omega[i,i] + psi[i];
L_Omega = cholesky_decompose(Omega);
}
model {
// the priors
L_d ~ normal(0,1);
L_t ~ normal(0,1);
psi ~ exponential(2);
r ~ normal(0,1);
a ~ normal(0,1);

//The likelihood
for (t in 2:T)
Y[t] ~ multi_normal_cholesky(r + diag_matrix(a)*Y[t-1], L_Omega);
}
generated quantities{
matrix[S,S] Rho; // Correlation matrix
vector[S] sde;   // species' sd

for(d in 1:D){
if(L[d,d]<0){
L_invar[,d] = -1 * L[,d];
}
}
for(i in 1:S)
for(j in 1:S)
Rho[i,j] = Omega[i,j]/sqrt(Omega[i,i] * Omega[j,j]);

sde = sqrt(diagonal(Omega));
}


Both Mauricio’s code and @ldeschamps’s code have explicitly included the latent factors in their models. I tried to implement a stan model including them (mar_lat.stan), but assuming no correlation between latent factors:

data {
int<lower=1> S;  // Number of Secies
int<lower=1> T;   // Numer of years
int <lower=1> D; // Number of latent dimensions
vector[S] Y[T];   // The outcome matrix
}
transformed data{
int<lower=1> M;

}
parameters {
vector[S] r;   // Vector of intrinsic growth rates
vector[S] a;   // matrix with competitive parameter
vector[M] L_t;   // lower diagonal elements of L
vector[D] L_d;   // diagonal elements of L
vector<lower=0>[S] psi;         // vector of variances
matrix[D,T-1] F;

}
transformed parameters{
matrix[S,T-1] rand_eff;
vector[S] Mu[T-1];

{
int idx;
idx = 0;
for(i in 1:(D-1)){
for(j in (i+1):D){
L[i,j] = 0; //constrain the upper triangular elements to zero
}
}
for(j in 1:D) {
L[j,j] = L_d[j];
for(i in (j+1):S) {
idx = idx + 1;
L[i,j] = L_t[idx];
}
}
}
rand_eff = L*F;
for(t in 1:(T-1)){
Mu[t,] = r + diag_matrix(a)*Y[t] + rand_eff[,t];
}
}
model {
// the priors
L_d ~ normal(0,1);
L_t ~ normal(0,1);
psi ~ exponential(2);
r ~ normal(0,1);
a ~ normal(0,1);
to_vector(F) ~ normal(0,1);

//The likelihood
for (t in 2:T){
for(s in 1:S){
Y[t,s] ~ normal(Mu[t-1,s], psi[s]);
}
}
}
generated quantities{
matrix[S,S] Omega;
matrix[S, S] Rho;
vector[S] sde;
matrix[S,D] L_invar = L;
matrix[D,T-1] F_invar = F;

for(d in 1:D){
if(L[d,d]<0){
L_invar[,d] = -1 * L[,d];
F_invar [,d] = -1 * F[,d];
}
}
Omega = L_invar*L_invar'+diag_matrix(psi);
for(i in 1:S)
for(j in 1:S)
Rho[i,j] = Omega[i,j]/sqrt(Omega[i,i] * Omega[j,j]);

sde = sqrt(diagonal(Omega));
}


It didn’t go well. I had divergent transitions, low n_eff, big Rhats and low bfmi in the four chains (same fate when increasing adapt_delta and max_treedepth). The true parameter values for r and a were retrieved by the model, but the variance-covariance matrix Omega was not.

and here the summary:

               mean   sd  5.5% 94.5% n_eff Rhat4
L_invar[1,1]   0.09 0.03  0.04  0.13   531  1.00
L_invar[1,2]   0.00 0.00  0.00  0.00   NaN   NaN
L_invar[1,3]   0.00 0.00  0.00  0.00   NaN   NaN
L_invar[2,1]   0.12 0.04  0.06  0.17   325  1.02
L_invar[2,2]   0.08 0.04  0.01  0.14   152  1.02
L_invar[2,3]   0.00 0.00  0.00  0.00   NaN   NaN
L_invar[3,1]  -0.03 0.05 -0.10  0.05   236  1.00
L_invar[3,2]   0.05 0.07 -0.07  0.14   197  1.01
L_invar[3,3]   0.10 0.04  0.02  0.16    90  1.05
L_invar[4,1]   0.05 0.08 -0.09  0.18   120  1.03
L_invar[4,2]   0.16 0.12 -0.09  0.29    80  1.06
L_invar[4,3]  -0.15 0.13 -0.28  0.15    87  1.05
L_invar[5,1]  -0.16 0.03 -0.20 -0.11   576  1.01
L_invar[5,2]  -0.03 0.05 -0.10  0.04   263  1.01
L_invar[5,3]   0.02 0.04 -0.04  0.08   212  1.02
L_invar[6,1]   0.16 0.04  0.09  0.21   213  1.02
L_invar[6,2]   0.04 0.07 -0.07  0.14   175  1.02
L_invar[6,3]  -0.09 0.04 -0.14 -0.01    82  1.06
L_invar[7,1]   0.08 0.05 -0.01  0.16   124  1.02
L_invar[7,2]  -0.10 0.08 -0.19  0.05    81  1.05
L_invar[7,3]   0.04 0.10 -0.17  0.17   137  1.04
L_invar[8,1]  -0.05 0.03 -0.10  0.00   646  1.00
L_invar[8,2]  -0.04 0.04 -0.09  0.03   270  1.01
L_invar[8,3]  -0.04 0.03 -0.09  0.01   152  1.03
L_invar[9,1]   0.03 0.06 -0.07  0.12   166  1.00
L_invar[9,2]   0.05 0.11 -0.15  0.21   156  1.02
L_invar[9,3]   0.19 0.07  0.04  0.25    43  1.09
L_invar[10,1] -0.11 0.04 -0.16 -0.05   387  1.00
L_invar[10,2] -0.05 0.05 -0.13  0.04   222  1.01
L_invar[10,3] -0.06 0.04 -0.11  0.01   106  1.03


All of this makes me think I have not specified the model correctly.

I tried another model where I include correlation among factors (mar_lat2.stan), with similar failure:

data {
int<lower=1> S;  // Number of Secies
int<lower=1> T;   // Numer of years
int <lower=1> D; // Number of latent dimensions
vector[S] Y[T];   // The outcome matrix
}
transformed data{
int<lower=1> M;
vector[D] FS_mu;           // vector of latent factor means
vector<lower=0>[D] FS_sd;  // vector of latent factor sd
for(m in 1:D){
FS_mu[m] = 0;
FS_sd[m] = 1;
}
}
parameters {
vector[S] r;   // Vector of intrinsic growth rates
vector[S] a;   // matrix with competitive parameter
vector[M] L_t;   // lower diagonal elements of L
vector[D] L_d;   // diagonal elements of L
vector<lower=0>[S] psi;         // vector of variances
matrix[D,T-1] F;
cholesky_factor_corr[D] Rho_F;
}
transformed parameters{
matrix[D,D] L_F;
matrix[S,T-1] rand_eff;
vector[S] Mu[T-1];

L_F = diag_pre_multiply(FS_sd, Rho_F);
{
int idx;
idx = 0;
for(i in 1:(D-1)){
for(j in (i+1):D){
L[i,j] = 0; //constrain the upper triangular elements to zero
}
}
for(j in 1:D) {
L[j,j] = L_d[j];
for(i in (j+1):S) {
idx = idx + 1;
L[i,j] = L_t[idx];
}
}
}
rand_eff = L*F;
for(t in 1:(T-1)){
Mu[t,] = r + diag_matrix(a)*Y[t] + rand_eff[,t];
}
}
model {
// the priors
L_d ~ normal(0,1);
L_t ~ normal(0,1);
psi ~ exponential(2);
r ~ normal(0,1);
a ~ normal(0,1);
Rho_F ~ lkj_corr_cholesky(2);

//The likelihood
for (t in 2:T){
F[,t-1] ~ multi_normal_cholesky(FS_mu, L_F);
for(s in 1:S){
Y[t,s] ~ normal(Mu[t-1,s], psi[s]);
}
}
}
generated quantities{
matrix[S,S] Omega;
matrix[S, S] Rho;
vector[S] sde;
matrix[S,D] L_invar = L;
matrix[D,T-1] F_invar = F;
matrix[D,D] phi;

phi = multiply_lower_tri_self_transpose(Rho_F);

for(d in 1:D){
if(L[d,d]<0){
L_invar[,d] = -1 * L[,d];
F_invar [,d] = -1 * F[,d];
}
}
Omega = L_invar*phi*L_invar'+diag_matrix(psi);
for(i in 1:S)
for(j in 1:S)
Rho[i,j] = Omega[i,j]/sqrt(Omega[i,i] * Omega[j,j]);

sde = sqrt(diagonal(Omega));
}


Here is the R script simulating data and fitting the model

# Clean the environment
rm(list = ls())

library(rstan)
library(mvtnorm)
library(rethinking)
library(truncnorm)

# simulate time series following a gompertz curve. Simulates species biomass on a log scale
Gompertz.sim <- function(init.pop, r, alpha, time, sp, Rho, sd_noise){
metacommunity <- matrix(NA, nrow = time, ncol = sp)
metacommunity[1,] <- init.pop
for (t in 2:time) {
metacommunity[t,]<- r + alpha*metacommunity[t-1,] +
t(rmvnorm2(1, Mu = rep(0, sp),
sigma = diag(sd_noise),
Rho = Rho))
}
return(metacommunity)
}

set.seed(2020)

P0<- 1            # Initial population biomass for all species on log scale
sp <- 10           # Number of species
t <- 100          # Number of timesteps
r <- rnorm(sp, 0.65, 0.1) # intrinsict grwoth rate for all species
alpha <- rtruncnorm(sp, a = -Inf, b = 0.95, mean = 0.9, sd = 0.1) # Density dependence for all sp

# Process error standard deviation for all sp
sd_noise <- matrix(0, nrow = sp, ncol = sp)
diag(sd_noise) <- rlnorm(sp, -1.5, 0.1)

Rho <- rlkjcorr(1, sp, 2) # Correlation matrix
Sigma <- sd_noise%*%Rho%*%sd_noise # variance-covariance matrix

# Simulate the community dynamics
Sim.data <- Gompertz.sim(init.pop = P0, r = r, alpha = alpha, time = t, sp = sp, sd_noise = sd_noise, Rho = Rho)

# Prepare the stan data
dat<-list(Y = Sim.data,
S = ncol(Sim.data),
T=nrow(Sim.data),
D = 3)

# Estimate the posterior
MAR_model <- stan(file = "mar_lat.stan", #file = "mar_lat2.stan" #file = "mar_lat_noF.stan"
data = dat,
chains = 4,
cores = 4,
iter = 4000,
control = list(adapt_delta = 0.8, max_treedepth = 10))


Any ideas?

Cheers,
Alfonso

Hi Alfonso!

I am myself dealing with similar trouble : chains sometimes diverging without being completly messed up.

I read the paper describing the sign-changing method :

Merkle EC , Wang T . 2018 . Bayesian latent variable models for the analysis of experimental psychology data. Psychonomic Bulletin and Review 25 : 256–270.

Another trick is to define the non-identified prior as hierarchical with estimated variances (or covariances), and then expand it by scaling by this standard-deviation.

I find this approach quite promising, you may give a shot at it!

Cheers, Lucas

1 Like

Put a normal(0,1) prior on scores and a wider prior on loadings
OR
They estimate the variance of scores and put a normal(0,1) prior on loadings.

Nice to see you should have good temporal meta-community data :)

Good success!
Lucas

Hi Lucas,

Do you mean something like this:

parameters{
...
real<lower=0> sigma_lt;
real<lower=0> sigma_ld;
...
}
...
model {
// hyperpriors
sigma_lt ~ exponential(1);
sigma_ld ~ exponential(1);
// Priors
L_d ~ normal(0,sigma_ld);
L_t ~ cauchy(0,sigma_lt);
psi ~ exponential(2);
r ~ normal(0,1);
a ~ normal(0,1);
...


I specifed those priors because for the first model (mar_lat_noF.stan) I was still having some issues for some simulations when using cauchy priors. Changing them to normal(0,10) gives a wider prior and still seems to work fine for few species and when I don’t include the factors in the model. For the models with the factors included… well, disaster persist. That brings me to a question. If I use the model were the factors have been integrated out, Y_t \sim MVN(\mu, \Lambda \Phi \Lambda' + \Psi) (N.B. \Phi = I), can I still use the sign change method only for the loadings? All I have seen about it is with the model with the factors in.

Cheers,
Alfonso