Correlated observations that aren't normal

What are my options for modeling correlated observations that aren’t normal?

The situation is I have observations y_{ij} where I’ve made I measurements each time observing J things.

I can build a model for each J that works fine. So like I could do:

y_{i1} \sim \text{log_normal}(...) \\ y_{i2} \sim \text{neg_binomial}(...) \\ ...

but y_{i1} and y_{i2} are correlated and I want to model that somehow. What are the options?


My first idea would be to use J observation-level latent variables, model those as multivariate normal, and plug them into the formulae that generate the parameters for your J responses. So:

L \sim N_j(0,\Sigma)

Where L has i rows and j columns, and \Sigma is a J \times J correlation matrix. Additionally, you’d want to fit J scale parameters \sigma_j as well.

The for each response for some location parameter µ:
µ_{ij} = f(L_{ij}\sigma_j).

This would cause non-normal covariance at the scale of observations.


A toy example in cmdstanr. I had to put a pretty strong prior on sigma1 to keep it from getting stuck at zero, since in this toy example the residuals and random effects aren’t well identified otherwise.

data {
  int<lower=1> N;
  int<lower=1> J;
  real y1[N];
  int y2[N];

parameters {
  // variance parameters
  real<lower=0> sigma1;
  real<lower=0> phi2;

  // latent variables
  vector[J] L[N];
  cholesky_factor_corr[J] cholcor;
  // scale of latent variables
  vector<lower=0>[J] sigmaL;

model {
  phi2 ~ exponential(2); // negative binomial scale
  sigma1 ~ gamma(6,10); // lognormal scale
  sigmaL ~ exponential(1); // scales of latent variables
  cholcor ~ lkj_corr_cholesky(2); // correlations of latent variables
  L ~ multi_normal_cholesky(rep_vector(0,J),cholcor); // latent variables
  y1 ~ lognormal(to_vector(L[1:N,1])*sigmaL[1],sigma1); // y1 is lognormal
  y2 ~ neg_binomial_2(exp(to_vector(L[1:N,2])*sigmaL[2]),phi2); // y2 is negative binomial

generated quantities {
real y1out[N] = lognormal_rng(to_vector(L[1:N,1])*sigmaL[1],sigma1);
int y2out[N] = neg_binomial_2_rng(exp(to_vector(L[1:N,2])*sigmaL[2]),phi2);

# Generate some continuous and integer-valued correlated data 
cormat <- matrix(c(1,0.8,0.8,1),ncol=2)
y <- MASS:: mvrnorm(50, c(0,0), cormat)
y <- exp(y)
y[,2] <- round(y[,2])

datlist <- list(
  y1 = y[,1],
  y2 = as.integer(y[,2]),
  N = NROW(y),
  J = NCOL(y)

model <- cmdstan_model("scripts/latent_re.stan")
fit <- model$sample(data = datlist, parallel_chains = 4, adapt_delta = 0.99, max_treedepth = 15)

Copulas! See

I show an example where I have 4 RV, normal, gumbel, weibull, and lognormal all correlated with each other.

The R code to simulate and run everything is also in that repo.

This works for margins that are continuous. The negative binomial involves more work. But @tarheel has done the hard work of writing code for mixed continuous and discrete margins that use a gaussian copula.


This is really cool! Do you have a favourite reference/resource for copulas? Seems like they’d pretty handy in a lot of situations

Here’s an excellent review paper on the topic.

The idea for Gaussian copulas is basically there’s a latent variables Z that is multivariate normal with mean 0 and and a correlation matrix. Each margin in Z is then transformed to (0,1) by applying the standard normal CDF. We then apply the inverse CDF based on the distribution your data are (e.g. log-normal, negative binomial).

For continuous marginals, the inverse CDF is one-to-one, so given Y, the latent variable pertaining to that margin isn’t really latent.

For discrete marginals, the inverse CDF is a one-to-many function, so they really are latent (but we know the bounds of the latent variable) and must be sampled in the sampling scheme.

The discrete / mixed case is significantly more difficult to work with than the continuous case, but hopefully you can figure out what’s going on in the mixed code provided by @spinkney


Excellent thanks for that, will get to reading

Something that would be really useful is to add functors that take in distributions. Right now the mixed case is written for poisson and bernoulli discrete marginals. However, it would be cool to have a kernel like functor that takes in the distribution and handles everything knowing that the marginal is distributed as x.

There are 2 more reasons to add this functionality. The first is lambertw stuff @stevebronder (@Neel @alberto_colombo). The second is for correlation priors @Stephen_Martin @paul.buerkner.


Thanks for the responses and example code all!

I looked at copulas yesterday but wasn’t understanding how they worked. That example model makes it a lot clearer how they’re used so maybe I can dig into the Math papers now.

@spinkney what’s the link to the mixed continuous/discrete code?