Multivariate logistic model and Cholesky correlation matrix

Hi! I’m using Stan to fit a multivariate logistic mixed model to a dataset of fish species. The responses are the presence (1) or absence (0) of 5 species, and the covariates include salinity, temperature, depth, etc. I searched the forum for solutions to my problem, but the ones I found use probit regression rather than logistic regression. I wrote a model that fits my data, but as I’m pretty new to RStan I don’t know whether it is actually correct and makes sense. The main problem is accounting for the correlation between the species.


data {
  int<lower=0> N; // number of observations
  int<lower=0> K; // number of species
  int<lower=0> P; // number of covariates
  int<lower=0, upper=1> y[N,K];   // response variable
  vector[P] x[N]; // covariates
}

parameters {
  matrix[K, P] beta; // coefficients for covariates (and intercept)
  cholesky_factor_corr[K] L_Omega;
  vector[K] z[N];
}

transformed parameters {
  vector[K] x_beta[N];
  for (n in 1:N)
    x_beta[n] = beta * x[n];
}

model {
  L_Omega ~ lkj_corr_cholesky(4);
  to_vector(beta) ~ normal(0, 5);

  z ~ multi_normal_cholesky(x_beta, L_Omega);

  for (i in 1:N)
    y[i] ~ bernoulli_logit(z[i]);
}

generated quantities {
  corr_matrix[K] Omega;
  Omega = multiply_lower_tri_self_transpose(L_Omega);
}

At first glance, your approach seems reasonable, but I don’t have time to check it in detail. It’s not how I would code it, but that is just my own style.

Can you vectorize the for loop?
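One possible way to do that, as an untested sketch rather than a checked solution: pass the covariates as a single matrix and build all the latent values with matrix products. Note the assumptions here, which are not in your code: `x` is declared as `matrix[N, P]` instead of `vector[P] x[N]`, and `z_raw` is a new name for standard-normal draws that replace the direct `multi_normal_cholesky` statement (a non-centered parameterization, which implies the same distribution for `z`). The array declarations follow the syntax in your post; Stan 2.33 and later need the `array[N, K] int` form instead.

```stan
data {
  int<lower=0> N;                 // number of observations
  int<lower=0> K;                 // number of species
  int<lower=0> P;                 // number of covariates
  int<lower=0, upper=1> y[N, K];  // presence/absence per observation and species
  matrix[N, P] x;                 // assumption: covariates passed as one matrix
}

parameters {
  matrix[K, P] beta;               // coefficients for covariates (and intercept)
  cholesky_factor_corr[K] L_Omega;
  matrix[N, K] z_raw;              // hypothetical name: standard-normal draws
}

transformed parameters {
  // Row n equals (beta * x[n] + L_Omega * z_raw[n]')', so each row is
  // multivariate normal with mean beta * x[n] and correlation L_Omega * L_Omega'.
  matrix[N, K] z = x * beta' + z_raw * L_Omega';
}

model {
  L_Omega ~ lkj_corr_cholesky(4);
  to_vector(beta) ~ normal(0, 5);
  to_vector(z_raw) ~ std_normal();

  // to_array_1d flattens y in row-major order; to_vector is column-major,
  // so z is transposed first to line the two orderings up.
  to_array_1d(y) ~ bernoulli_logit(to_vector(z'));
}

generated quantities {
  corr_matrix[K] Omega = multiply_lower_tri_self_transpose(L_Omega);
}
```

On the R side this means passing `x` as an N-by-P matrix rather than a list of vectors. Whether the non-centered form samples better than the original depends on the data, so it is worth comparing both.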

Here are some resources to help:

Sorry, I can’t help beyond these pointers.