Multivariate logistic model and Cholesky correlation matrix

Hi! I’m using Stan to fit a multivariate logistic mixed model to a dataset on fish species. The responses are the presence (1) or absence (0) of 5 species, and the covariates include salinity, temperature, depth, etc. I searched the forum for solutions to my problem, but I only found ones using probit regression, not logistic regression. I wrote a model that runs on my data, but as I’m pretty new to RStan I don’t know whether it is actually correct and makes sense. The main problem is accounting for the correlation between the species.


data {
  int<lower=0> N; // number of observations
  int<lower=0> K; // number of species
  int<lower=0> P; // number of covariates
  array[N, K] int<lower=0, upper=1> y; // presence/absence responses
  array[N] vector[P] x; // covariates
}

parameters {
  matrix[K, P] beta; // coefficients for covariates (and intercept)
  cholesky_factor_corr[K] L_Omega; // Cholesky factor of the species correlation matrix
  array[N] vector[K] z; // latent log-odds for each observation
}

transformed parameters {
  array[N] vector[K] x_beta; // linear predictor for each observation
  for (n in 1:N)
    x_beta[n] = beta * x[n];
}

model {
  L_Omega ~ lkj_corr_cholesky(4);
  to_vector(beta) ~ normal(0, 5);

  z ~ multi_normal_cholesky(x_beta, L_Omega);

  for (i in 1:N)
    y[i] ~ bernoulli_logit(z[i]);
}

generated quantities {
  corr_matrix[K] Omega;
  Omega = multiply_lower_tri_self_transpose(L_Omega);
}

At first glance, your methods seem reasonable, but I don’t have time to check. It’s also not how I would code it, but that’s just a difference in style.

Can you vectorize the for loop?
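For instance, something along these lines might work (a rough sketch, untested; y_flat and z_flat are names I’m making up):

transformed data {
  // flatten y row by row so a single vectorized likelihood statement can be used
  array[N * K] int y_flat;
  for (n in 1:N)
    for (k in 1:K)
      y_flat[(n - 1) * K + k] = y[n, k];
}

model {
  vector[N * K] z_flat; // log-odds flattened in the same order as y_flat
  for (n in 1:N)
    z_flat[((n - 1) * K + 1):(n * K)] = z[n];

  L_Omega ~ lkj_corr_cholesky(4);
  to_vector(beta) ~ normal(0, 5);
  z ~ multi_normal_cholesky(x_beta, L_Omega);

  // one vectorized statement instead of the loop over observations
  y_flat ~ bernoulli_logit(z_flat);
}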

Here are some resources to help:

Sorry I cannot help more than these pointers.

Bumping this because I’m doing something similar, and it seems better to continue here than to start a new thread.

I’m wondering how this worked out for you? I am playing with something similar and came up with nearly the same model. There are two differences in how I approached it. First, I converted the outcome to a one-dimensional array and the predicted log-odds to a vector so that the likelihood can be vectorized, as @Richard_Erickson suggested.

Second, your model for the log-odds:

L_Omega ~ lkj_corr_cholesky(4);
z ~ multi_normal_cholesky(x_beta, L_Omega);

forces the standard deviation of the errors in the log-odds to be 1. I think you would want to add parameters to estimate the standard deviations, and it seems like they would probably be smaller than 1 (e.g., a standard normal centered at 0 on the logit scale implies a 95% interval of roughly 0.12 to 0.88 on the probability scale). I’m also wondering whether allowing uncertainty in the log-odds (as I have done too) makes it difficult to get precise estimates of the coefficients, since it allows more parameter values to be compatible with the data. Maybe putting fairly strong regularizing priors on the standard deviations would be a good idea?
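To spell out the arithmetic behind that interval: a standard normal has a central 95% interval of $\pm 1.96$ on the logit scale, and

$$\operatorname{logit}^{-1}(-1.96) = \frac{1}{1 + e^{1.96}} \approx 0.123, \qquad \operatorname{logit}^{-1}(1.96) = \frac{1}{1 + e^{-1.96}} \approx 0.877,$$

so even a unit-scale error term already spreads the implied probabilities across most of the unit interval.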

Interested in any advice folks have. Here is how I’ve done it:

Thanks!

data{
  int<lower=1> K; // number of covariates
  int<lower=1> D; // number of outcome variables
  int<lower=0> N; // number of observations
  array[N, D] int<lower=0,upper=1> y; // array of binary outcomes
  array[N] vector[K] x; // covariates
}
transformed data{
  // make y one-dimensional so that bernoulli_logit() can be vectorized
  array[N*D] int y_vec;
  {
    int j = 1;
    for(n in 1:N){
      for(d in 1:D){
        y_vec[j] = y[n,d];
        j += 1;
      }
    }
  }
}
parameters{
  matrix[D, K] beta; // matrix of coefficients
  cholesky_factor_corr[D] L_Omega; // Cholesky factor of the correlation matrix of the log-odds
  matrix[N,D] z; // standardized errors (non-centered parameterization)
  vector<lower=0>[D] sigma; // standard deviations of the log-odds errors
}
transformed parameters{
  vector[N*D] l_odds; // vector of log-odds
  array[N] vector[D] mu_odds; // array of vectors carrying the log-odds
  // compute the scaled Cholesky factor once, outside the loop
  matrix[D, D] L_Sigma = diag_pre_multiply(sigma, L_Omega);
  for(i in 1:N){
    // calculate the covarying log-odds
    mu_odds[i] = beta * x[i] + L_Sigma * z[i]';
  }
  {
    // make the log-odds one-dimensional (so that they match y_vec) so
    // bernoulli_logit() can be vectorized
    int g = 1;
    for(n in 1:N){
      for(d in 1:D){
        l_odds[g] = mu_odds[n, d];
        g += 1;
      }
    }
  }
}
model{
  to_vector(beta) ~ normal(0, 1); // prior on coefficients
  L_Omega ~ lkj_corr_cholesky(2); // prior on correlation matrix
  to_vector(z) ~ normal(0, 1); // prior on z-score of errors
  sigma ~ exponential(2); // prior on standard deviation of errors
  
  // likelihood
  y_vec ~ bernoulli_logit(l_odds);
}
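One possible simplification, in case it’s useful: if the linear predictors are stored as a D-by-N matrix, to_vector() reads the matrix column by column, which matches the row-major order in which y_vec is filled above, so the second flattening loop can be dropped. A sketch under that assumption (mu is a name I’m introducing):

transformed parameters{
  matrix[D, N] mu; // column n holds the log-odds for observation n
  {
    matrix[D, D] L_Sigma = diag_pre_multiply(sigma, L_Omega);
    for(n in 1:N){
      mu[:, n] = beta * x[n] + L_Sigma * z[n]';
    }
  }
}
model{
  to_vector(beta) ~ normal(0, 1);
  L_Omega ~ lkj_corr_cholesky(2);
  to_vector(z) ~ normal(0, 1);
  sigma ~ exponential(2);

  // to_vector(mu) yields (obs 1: outcomes 1..D, obs 2: outcomes 1..D, ...),
  // exactly the order in which y_vec was filled in transformed data
  y_vec ~ bernoulli_logit(to_vector(mu));
}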

Sorry I didn’t see this earlier. You probably want to scale the correlation matrix up into a covariance matrix as described in the regression chapter of our User’s Guide. There you’ll see that you want to scale with a vector sigma of positively constrained scales to get a Cholesky factor for a full covariance matrix, which will look like L_Sigma = diag_pre_multiply(sigma, L_Omega).
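Concretely, applied to the first model in this thread, that could look something like the following sketch, with a vector<lower=0>[K] sigma added to the parameters block (the exponential(1) prior on sigma is just an illustrative placeholder):

parameters {
  matrix[K, P] beta;
  cholesky_factor_corr[K] L_Omega;
  vector<lower=0>[K] sigma; // per-species scales for the log-odds errors
  array[N] vector[K] z;
}

model {
  // scale the correlation factor into a covariance factor
  matrix[K, K] L_Sigma = diag_pre_multiply(sigma, L_Omega);

  L_Omega ~ lkj_corr_cholesky(4);
  sigma ~ exponential(1); // placeholder prior on the scales
  to_vector(beta) ~ normal(0, 5);
  z ~ multi_normal_cholesky(x_beta, L_Sigma);

  for (i in 1:N)
    y[i] ~ bernoulli_logit(z[i]);
}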

It looks like you have a parameter z[k, n] for each binary observation y[k, n]. That may prove challenging to fit, because the likelihood is going to try to push each z[k, n] toward positive or negative infinity depending on whether y[k, n] is 1 or 0.
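One thing that often helps with that kind of geometry is a non-centered parameterization of z, along the lines of the model posted above. It won’t remove the weak identification, but it usually gives the sampler an easier posterior to work with. A sketch, combining it with the scaled covariance factor from the previous comment (z_raw is a name I’m introducing):

parameters {
  matrix[K, P] beta;
  cholesky_factor_corr[K] L_Omega;
  vector<lower=0>[K] sigma;
  array[N] vector[K] z_raw; // standard normal innovations
}

transformed parameters {
  array[N] vector[K] z; // latent log-odds
  {
    matrix[K, K] L_Sigma = diag_pre_multiply(sigma, L_Omega);
    for (n in 1:N)
      z[n] = beta * x[n] + L_Sigma * z_raw[n];
  }
}

model {
  L_Omega ~ lkj_corr_cholesky(4);
  sigma ~ exponential(1); // placeholder prior
  to_vector(beta) ~ normal(0, 5);
  for (n in 1:N) {
    z_raw[n] ~ std_normal();
    y[n] ~ bernoulli_logit(z[n]);
  }
}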