Joint Modeling of Mixed Outcome Types

Hi all!

Broad question:
In Stan, is it possible to jointly model a multivariate response vector of mixed data types?

Specifics:
I borrow heavily from these two papers for the generalities:
Gueorguieva_Agresti_2001.pdf (1.3 MB)
Zhang_EtAl_2014.pdf (1.5 MB)

Suppose I have an observed response vector Y_i = (C_i, O_i) where each individual/row in my dataset has a series of continuous measurements C and discrete ordinal scores O. Separately, these models are rather straightforward (although certainly not trivial) in Stan.

In the continuous case C, we can simply build an MVN model with a mean vector \mathbf{\mu} and covariance structure \Sigma; the MVN code here demonstrates an example from my own use case. In the ordinal case O, one can use @bgoodri 's parameterization here of a truncated MVN (tMVN) via the GHK algorithm.
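For context, the separate continuous piece on its own is just a standard multivariate normal model; a stripped-down sketch (placeholder names, not my actual model) looks roughly like this:

data {
  int<lower=1> N; // individuals
  int<lower=1> J; // continuous outcomes per individual
  vector[J] C[N]; // each element is one individual's continuous measurements
}
parameters {
  vector[J] mu; // common mean vector (in my real model this depends on x)
  vector<lower=0>[J] tau; // outcome-specific scales
  cholesky_factor_corr[J] L; // Cholesky factor of the correlation matrix
}
model {
  mu ~ normal(0, 5);
  tau ~ exponential(1);
  L ~ lkj_corr_cholesky(2);
  C ~ multi_normal_cholesky(mu, diag_pre_multiply(tau, L));
}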

The crux of this post is whether I can combine these two models into a single model. That is, instead of modeling the continuous and ordinal outcomes separately, can they be modeled jointly in a way that captures the correlation/relationship between the outcomes, with each outcome depending on some predictor x? Following Zhang et al. from above, we assume the continuous variables C_i follow an MVN with some mean function \mu and covariance structure \Sigma. We also assume the ordinal variables O_i follow a multivariate probit model with continuous normal latent variables ZO_i, again with some mean \mu and covariance \Sigma. Thus I have a final response vector Z_i = (C_i, ZO_i). At the end of the day I want to model Z_i \sim MVN(\mu_i, \Sigma).
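Written out with cutpoints (my own notation: \gamma_0 < \gamma_1 < \dots < \gamma_K with \gamma_0 = -\infty and \gamma_K = +\infty), the joint model I am after is

Z_i = (C_i, ZO_i) \sim MVN(\mu_i, \Sigma), \qquad O_{ij} = k \iff \gamma_{k-1} < ZO_{ij} \le \gamma_k,

i.e. the observed ordinal score is just the interval into which the latent normal ZO_{ij} falls.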

This is where I am stuck. Looking through the forum, @CerulloE has adapted the probit code for both binary and polychotomous responses. My question is whether I can extend or adapt it to incorporate continuous and ordinal (both binary and polychotomous) responses.

Thank you!

You can do this with a normal copula. Code for the mixed discrete-continuous case is at Helpful Stan Functions: Mixed Discrete-Continuous Gaussian Copula Functions.

It does not have ordinal outcomes yet. @andrjohns ported this to brms and has a branch where you can use brms R code to model the mixed outcomes. He may have added ordinal. Anyway, you can follow the idea and code up the ordinal marginal to use with the normal marginals.


Thanks @spinkney! The key limitation with the mixed-type outcome modelling is that it requires the CDF for each outcome type. Unfortunately, Stan does not yet have an ordinal CDF.

It's something that I'm planning on adding eventually, but I can't say for certain when I'll get to it.

Thank you, @spinkney and @andrjohns !

So I will first admit that I'm a novice at the usage and implementation of copulas - apologies if I make no sense or am way off base with my train of thought.

I've been scanning your Helpful Stan Functions as well as your example code for a general multi_normal copula here. Please correct me if I am wrong, but essentially the workflow is as follows:

  1. Use the normal marginal function from your link on the continuous variables in my dataset - this returns an array of two N x J matrices, where rtn[1] holds the standardized (centered) values and rtn[2] holds the marginal log-likelihood contributions.

  2. Use some sort of "ordinal" marginal function on the ordinal variables in my dataset. Now, I know you said this doesn't technically exist yet. But looking at @bgoodri 's tMVN/GHK parameterization and comparing it to some of your other marginals, like poisson and bernoulli, it appears I want the variable z, which represents my latent random variable, and the operation on bound, which supplies the jacobian adjustment. So, as above, I end up with an array of two matrices holding the latent variable z and the jacobian adjustment (a rough sketch of what I have in mind is below, after the loop pseudocode).

  3. The outputs of the normal marginal and the "ordinal" marginal can then be fed as arguments to your centered_gaussian_copula_cholesky_lpdf log-probability function.

3b) Essentially the construction is like your code here, except for the likelihood statement: my for loop would be something like:

for (n in 1:N) {
  y[n, cont] = normal_marginal(...);
  y[n, ord] = ordinal_marginal(...); // here I mean z and bound from the tMVN code
}

y ~ multi_normal_cholesky_copula(L)
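To make step 2 concrete, the hypothetical ordinal_marginal I have in mind (my own naming, completely untested) would mimic the structure of your discrete marginals: the observed category pins the latent uniform to an interval given by the ordered-probit CDF, a nuisance uniform u_raw (declared in the parameters block) places the point inside that interval, and the log interval width is the adjustment. For simplicity this sketch shares one set of cutpoints across the ordinal columns:

// Hypothetical ordinal (ordered-probit) marginal - a sketch only, not tested.
// y: observed categories in 1..K; mu_glm: latent means; u_raw: nuisance
// uniforms in (0,1); cuts: ordered cutpoints of length K - 1.
// As with the other marginals, rtn[1] holds the latent standard-normal values
// and rtn[2] the jacobian/probability adjustments.
matrix[] ordinal_marginal(int[,] y, matrix mu_glm, matrix u_raw, vector cuts) {
  int N = rows(mu_glm);
  int J = cols(mu_glm);
  int K = num_elements(cuts) + 1;
  matrix[N, J] rtn[2];

  for (j in 1:J) {
    for (n in 1:N) {
      real Lb;
      real Ub;
      // bounds on the latent uniform implied by the observed category
      if (y[n, j] == 1) {
        Lb = 0;
      } else {
        Lb = Phi(cuts[y[n, j] - 1] - mu_glm[n, j]);
      }
      if (y[n, j] == K) {
        Ub = 1;
      } else {
        Ub = Phi(cuts[y[n, j]] - mu_glm[n, j]);
      }
      // map the nuisance uniform into (Lb, Ub) and back to the normal scale
      rtn[1][n, j] = inv_Phi(Lb + u_raw[n, j] * (Ub - Lb));
      rtn[2][n, j] = log(Ub - Lb);
    }
  }
  return rtn;
}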

Am I close at all?

Yes! Take a look at this more fleshed out example by @andrjohns: Gaussian Copulas · Issue #1317 · paul-buerkner/brms · GitHub

@andrjohns As I slowly try to figure out how to code up a CDF for probit outcomes (maybe similar to @bgoodri 's multi-probit code??) I was perusing here Seemingly unrelated / Multivariate Probit Regression #1366 and here Gaussian Copulas #1317.

Correct me if I am wrong, but is it possible that in the coming weeks you'll have a probit/ordinal CDF?
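In the meantime, my rough understanding is that for an ordered-probit outcome the CDF can be written by hand as a user-defined function; an untested sketch (function name and signature are mine):

// Hypothetical Pr(O <= k) for an ordered-probit outcome with linear predictor
// eta and ordered cutpoints c of length K - 1 (a sketch, not tested).
real ordered_probit_cdf(int k, real eta, vector c) {
  int K = num_elements(c) + 1;
  if (k <= 0) return 0;
  if (k >= K) return 1;
  return Phi(c[k] - eta);
}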

@spinkney @andrjohns

Quick syntax question as I manipulate some of your copula code; I paste the full model below. I have not gotten the ordinal probit coded up yet, but in the meantime I am testing on continuous data to make sure I have the code up and running there first. Note, this is not 'efficient' or anything - just testing some code.

That said, I am just adjusting small parts for my own use. This includes my own mean function \mu (instead of the linear model), and I also want to customize the sd \sigma - basically I want sigma to vary as a function of x, i.e. heteroscedasticity. I've gotten this to work in general models like here. My attempt is in the model block of the code below (I write HELP in the comments so those spots stand out :)). I know I need to change that area AND the normal_marginal function from above, because as written it only takes a vector of 2 SDs (it assumes constant noise). A sketch of the direction I am trying to go is at the very end of this post, after the model code.

You'll see in some of the commented areas that I've attempted a few things, including declaring a new matrix. I always get an error about incorrect syntax, either in the normal_marginal function or at my initial declaration in the model block.

I hope all of this makes sense!

functions{
  matrix chol2inv2(matrix L) {
    int K = rows(L);
    matrix[K, K] K_identity = diag_matrix(rep_vector(1.0, K));
    matrix[K, K] L_inv = mdivide_left_tri_low(L, K_identity);
    matrix[K, K] rtn;

    if (K == 0) {
      return L;
    }

    if (K == 1) {
      rtn[1, 1] = inv_square(L[1, 1]);
    } else {
      for (k in 1:K) {
        rtn[k, k] = dot_self(tail(L_inv[ : , k], K + 1 - k));
        for (j in (k + 1):K) {
          int Kmj = K - j;
          rtn[k, j] = dot_product(tail(L_inv[ : , k], Kmj + 1),
                                  tail(L_inv[ : , j], Kmj + 1));
          rtn[j, k] = rtn[k, j];
        }
      }
    }

    return rtn;
  }
  
  real multi_normal_cholesky_copula_lpdf(matrix U, matrix L) {
    int N = rows(U);
    int J = cols(U);
    matrix[J, J] Gammainv = chol2inv2(L);
    return -N * sum(log(diagonal(L))) - 0.5 * sum(add_diag(Gammainv, -1.0) .* crossprod(U));
  }
  
  real centered_gaussian_copula_cholesky_(matrix[,] marginals, matrix L, int[] order) {
    // Extract dimensions of final outcome matrix
    int N = rows(marginals[1][1]);
    int J = rows(L);
    matrix[N, J] U;

    // Iterate through marginal arrays, concatenating the outcome matrices by column
    // and aggregating the log-likelihoods (from continuous marginals) and jacobian
    // adjustments (from discrete marginals)
    real adj = 0;
    int pos = 1;
    for (m in 1 : size(marginals)) {
      int curr_cols = cols(marginals[m][1]);

      U[ : , pos : (pos + curr_cols - 1)] = marginals[m][1];

      adj += sum(marginals[m][2]);
      pos += curr_cols;
    }

    // Return the sum of the log-probability for copula outcomes and jacobian adjustments
    return multi_normal_cholesky_copula_lpdf(U[ : , order] | L) + adj;
  }
  
  matrix[] normal_marginal(matrix y, matrix mu_glm, vector sigma) {
    int N = rows(mu_glm);
    int J = cols(mu_glm);
    matrix[N, J] rtn[2];
    //matrix[N, J] sigma2; HELP
    // Initialise the jacobian adjustments to zero, as vectorised lpdf will be used
    rtn[2] = rep_matrix(0, N, J);

    for (j in 1 : J) {
      // standardize each column by its own sd and accumulate its log-likelihood
      rtn[1][ : , j] = (y[ : , j] - mu_glm[ : , j]) / sigma[j];
      rtn[2][1, j] = normal_lpdf(y[ : , j] | mu_glm[ : , j], sigma[j]);
    }

    return rtn;
}
}
data{
  int<lower=1> N; // total number of obs
  vector[N] y1; // first continuous response
  vector[N] y2; // second continuous response
  vector[N] x;
  int<lower=1> nresp; // number of responses
}
transformed data{
  int Outcome_Order[2]; // ordering of the outcome columns in the copula
  matrix[N,2] Y_gaussian; // response matrix
  Outcome_Order[1] = 1;
  Outcome_Order[2] = 2;
  Y_gaussian[ : , 1] = y1;
  Y_gaussian[ : , 2] = y2;
}
parameters{
  real <lower=0> a1;
  real <lower=0> a2;
  real <lower=0> r1;
  real <lower=0> r2;
  real b1;
  real b2;
  real <lower=0> kappa1;
  real <lower=0> kappa2;
  real <lower=0> sigma_y1;
  real <lower=0> sigma_y2;
  cholesky_factor_corr[nresp] Lrescor;
}
model{
  // priors
  a1 ~ normal(0,10); // half-normal
  r1 ~ normal(0,1); // half-normal
  b1 ~ normal(0,10);
  a2 ~ normal(0,10); // half-normal
  r2 ~ normal(0,1); // half-normal
  b2 ~ normal(0,10);
  kappa1 ~ normal(0,1); // half-normal
  kappa2 ~ normal(0,1); // half-normal
  sigma_y1 ~ exponential(2);
  sigma_y2 ~ exponential(2);
  Lrescor ~ lkj_corr_cholesky(1);
  
  // likelihood 
  vector[N] mu1 = a1 * pow(x, r1) + b1; // pow() applied elementwise to the vector x
  vector[N] mu2 = a2 * pow(x, r2) + b2;
  //vector[N] s1 = sigma_y1*(1+kappa1*x); HELP
  //vector[N] s2 = sigma_y2*(1+kappa2*x); HELP
  
  matrix[N,2] Mu_gaussian;
  
  vector[2] sigma = transpose([sigma_y1, sigma_y2]);
  Mu_gaussian[ : , 1] = mu1;
  Mu_gaussian[ : , 2] = mu2;
  //matrix[N,2] sigma; HELP
  //sigma[ : , 1] = s1; HELP
  //sigma[ : , 2] = s2; HELP
  
  target += centered_gaussian_copula_cholesky_({normal_marginal(Y_gaussian, Mu_gaussian, sigma)}, Lrescor, Outcome_Order);
}
generated quantities{
  corr_matrix[nresp] Rescor = multiply_lower_tri_self_transpose(Lrescor);
}
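
For completeness, the direction I am trying to go with the HELP areas is roughly the following (an untested sketch, names are mine): let the marginal take an N x 2 matrix of SDs instead of a length-2 vector, and build that matrix from s1 and s2 in the model block.

// Hypothetical heteroscedastic variant of normal_marginal (a sketch, untested):
// sigma is now an N x J matrix, one sd per observation and outcome.
matrix[] normal_marginal_het(matrix y, matrix mu_glm, matrix sigma) {
  int N = rows(mu_glm);
  int J = cols(mu_glm);
  matrix[N, J] rtn[2];
  rtn[2] = rep_matrix(0, N, J);

  for (j in 1:J) {
    rtn[1][ : , j] = (y[ : , j] - mu_glm[ : , j]) ./ sigma[ : , j];
    rtn[2][1, j] = normal_lpdf(y[ : , j] | mu_glm[ : , j], sigma[ : , j]);
  }
  return rtn;
}

// ... and in the model block, in place of the constant-sd version:
vector[N] s1 = sigma_y1 * (1 + kappa1 * x);
vector[N] s2 = sigma_y2 * (1 + kappa2 * x);
matrix[N, 2] Sigma_het;
Sigma_het[ : , 1] = s1;
Sigma_het[ : , 2] = s2;
target += centered_gaussian_copula_cholesky_({normal_marginal_het(Y_gaussian, Mu_gaussian, Sigma_het)}, Lrescor, Outcome_Order);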