I’ve attached some files (you can save them to a folder and just adjust the working directory in the file that starts with “S_”) to compare a few different ways of fitting multivariate normals in Stan: 1) Stan’s multi_normal_cholesky, 2) an individual-regressions approach I’ve discussed on the forums previously, 3) a partial-correlation approach that uses multi_normal, and 4) the C-vine copula approach (which is also copied below and borrows from your copula code).
The true correlation matrix is set up so that the partial correlation of variables 2 and 3 given variable 1 is zero.
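Concretely, zero partial correlation given variable 1 just means the (2,3) correlation is the product of the other two, rho_23 = rho_12 * rho_13. A quick R sketch of that kind of setup (the numbers here are placeholders, not the values used in the attached script):

```r
# Placeholder correlations; the attached S_ script sets its own values.
rho12 <- 0.5
rho13 <- 0.4
rho23 <- rho12 * rho13  # forces the partial correlation of 2 and 3 given 1 to zero

R_true <- matrix(c(1,     rho12, rho13,
                   rho12, 1,     rho23,
                   rho13, rho23, 1), nrow = 3, byrow = TRUE)

# Check via the precision matrix: partial cor(2,3 | 1) = -P[2,3] / sqrt(P[2,2] * P[3,3])
P <- solve(R_true)
-P[2, 3] / sqrt(P[2, 2] * P[3, 3])  # ~0
```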
I’m using slightly different priors, so they aren’t exactly comparable, but they all give more or less the same results. The individual-regressions approach is the fastest (chains run in around 0.5 s on my machine with my settings) and gives the highest effective sample sizes. There I calculated the covariance matrix directly instead of inverting a matrix, which probably works better as things get more complicated (see the sketch below). The vine approach is the second fastest (chains run in around 1.5-2 s). The multi_normal_cholesky chains take around 5-10 s and the partial-correlation version around 20 s (I included that one because its output is directly comparable to the C-vine, and the difference in run time reflects the slowdown from using the multivariate normal instead of the bivariate copula).
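To make that concrete, here’s a sketch of what I mean by calculating the covariance directly, assuming the usual sequential parameterization (regress x2 on x1, then x3 on x1 and x2); the slope/residual-SD names and values are just placeholders, not necessarily what’s in simple_multinormal_indv_regressions.stan:

```r
# Hypothetical regression parameters: slopes b21, b31, b32 and residual SDs s1, s2, s3
s1 <- 0.2; b21 <- 0.6; s2 <- 0.25
b31 <- 0.3; b32 <- 0.2; s3 <- 0.3

# Covariance entries follow directly from the slopes and residual SDs;
# no matrix inversion required.
v1  <- s1^2
c12 <- b21 * v1
v2  <- b21^2 * v1 + s2^2
c13 <- b31 * v1 + b32 * c12
c23 <- b31 * c12 + b32 * v2
v3  <- b31^2 * v1 + b32^2 * v2 + 2 * b31 * b32 * c12 + s3^2

Sigma <- matrix(c(v1,  c12, c13,
                  c12, v2,  c23,
                  c13, c23, v3), nrow = 3, byrow = TRUE)
```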
Anyway, the C-vine example below is specifically set up for normal marginals and a normal copula. Things get a little more complicated when you move away from this (I start getting a bit out over my skis here, but I found this explanation pretty good: https://tnagler.github.io/vine-arisa.pdf).
In the more general case, the copula here would take the conditional CDFs of the variables, so it looks something like c_23|1(F_2|1(x2|x1), F_3|1(x3|x1)). F_2|1(x2|x1) is the conditional CDF, and it is obtained by taking the partial derivative of the copula CDF C_12(F_1(x1), F_2(x2)) with respect to its first argument F_1(x1), then evaluating at the observed values. It simplifies nicely for Gaussians but, as you can imagine, not more generally.
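For the Gaussian copula that conditional CDF (the “h-function” in the vine literature) works out to F_2|1(x2|x1) = Phi((z2 - rho12 * z1) / sqrt(1 - rho12^2)) with z_i = inv_Phi(F_i(x_i)), which is exactly what the second-level lines in the Stan code below compute on the quantile scale. A quick base-R sanity check, using the fact that for normal marginals F_2|1 is just the conditional normal CDF (names and values here are mine, for illustration only):

```r
# Gaussian-copula h-function: F_{2|1}(x2 | x1) on the uniform scale
h_gauss <- function(u2, u1, rho) {
  pnorm((qnorm(u2) - rho * qnorm(u1)) / sqrt(1 - rho^2))
}

# Monte Carlo check: with standard normal marginals and correlation rho,
# X2 | X1 = x1 ~ N(rho * x1, 1 - rho^2)
rho <- 0.5
x1 <- qnorm(0.3)
x2 <- qnorm(0.7)
x2_draws <- rnorm(1e6, mean = rho * x1, sd = sqrt(1 - rho^2))
mean(x2_draws <= x2)                # simulated conditional CDF
h_gauss(pnorm(x2), pnorm(x1), rho)  # closed form
```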
There is an R package VineCopula that looks good, though it looks like the bleeding edge is a C++ implementation called vinecopulib (with R/Python interfaces).
```stan
functions {
  // Bivariate Gaussian copula log density, with u and v already on the
  // standard normal quantile (inv_Phi) scale. This is
  //   sum_i [ -0.5 * log(1 - rho^2)
  //           - (rho^2 * (u_i^2 + v_i^2) - 2 * rho * u_i * v_i) / (2 * (1 - rho^2)) ]
  real normal_copula_vector_lpdf(vector u, vector v, real rho) {
    int N = num_elements(u);
    real rho_sq = square(rho);
    real a1 = 0.5 * rho;
    real a2 = rho_sq - 1;
    real a3 = 0.5 * log1m(rho_sq);
    real x = -2 * u' * v + rho * (dot_self(u) + dot_self(v));
    return a1 * x / a2 - N * a3;
  }
}
data {
  int<lower=0> T;
  matrix[T, 3] X;
}
transformed data {
  int N = 3;
}
parameters {
  vector[N] mu;
  vector<lower=0>[N] sigma;
  real<lower=0, upper=1> p1_raw;
  real<lower=0, upper=1> p12_raw;
  real<lower=0, upper=1> p13_raw;
  real<lower=0, upper=1> p23_given_1_raw;
}
transformed parameters {
  real<lower=-1, upper=1> p12;
  real<lower=-1, upper=1> p13;
  real<lower=-1, upper=1> p23_given_1;
  p12 = (p12_raw - 0.5) * 2;
  p13 = (p13_raw - 0.5) * 2;
  p23_given_1 = (p23_given_1_raw - 0.5) * 2;
}
model {
  // priors
  mu ~ normal(0.15, 0.15);
  sigma ~ cauchy(0, 0.3);
  p1_raw ~ beta(19, 1);
  p12_raw ~ normal(p1_raw, 0.1);
  p13_raw ~ normal(p1_raw, 0.1);
  p23_given_1_raw ~ beta(1, 1);
  // univariate marginals
  for (i in 1:3) {
    X[, i] ~ normal(mu[i], sigma[i]);
  }
  {
    matrix[T, N] Uinv;
    matrix[T, N - 1] Uinv_2nd_level;
    for (i in 1:N) {
      // Since assuming normal marginals:
      // U = Phi((x - mu) / sigma), and
      // U_inv = inv_Phi(U) = (x - mu) / sigma
      Uinv[, i] = (X[, i] - mu[i]) / sigma[i];
    }
    // fit first level
    Uinv[, 2] ~ normal_copula_vector(Uinv[, 1], p12);
    Uinv[, 3] ~ normal_copula_vector(Uinv[, 1], p13);
    // fit second level
    // The general approach is to get each conditional CDF. This requires
    // taking the derivative of the copula CDF while holding one variable
    // constant. Things are simpler here because I'm using normal marginals
    // and copulae, but this doesn't hold generally. What I'm doing here
    // is essentially regressing variables 2 and 3 on variable 1 and
    // dividing by the residual standard deviation. Since we have
    // standardized the data above, it simplifies the calculations here
    // and I don't need to do the regression because the correlations are
    // re-used from above. Works for normals, but not more generally.
    Uinv_2nd_level[, 1] = (Uinv[, 2] - p12 * Uinv[, 1]) / sqrt(1 - square(p12));
    Uinv_2nd_level[, 2] = (Uinv[, 3] - p13 * Uinv[, 1]) / sqrt(1 - square(p13));
    Uinv_2nd_level[, 2] ~ normal_copula_vector(Uinv_2nd_level[, 1], p23_given_1);
  }
}
generated quantities {
  // implied marginal correlation of variables 2 and 3
  real p23;
  p23 = p12 * p13 + p23_given_1 * sqrt(1 - square(p12)) * sqrt(1 - square(p13));
}
```
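For completeness, here’s a minimal sketch of how the model above could be run from R with cmdstanr; the attached S_ script is what actually runs the comparison, so the simulated values and options below are just placeholders (the correlations match the zero-partial-correlation setup described earlier):

```r
library(cmdstanr)

set.seed(1)
T_obs      <- 500
mu_true    <- c(0.10, 0.15, 0.20)
sigma_true <- c(0.20, 0.25, 0.30)
R_true     <- matrix(c(1.0, 0.5, 0.4,
                       0.5, 1.0, 0.2,   # 0.2 = 0.5 * 0.4, so partial cor(2,3 | 1) = 0
                       0.4, 0.2, 1.0), nrow = 3, byrow = TRUE)
Sigma_true <- diag(sigma_true) %*% R_true %*% diag(sigma_true)
X <- MASS::mvrnorm(T_obs, mu_true, Sigma_true)

mod <- cmdstan_model("simple_cvine_copula.stan")
fit <- mod$sample(data = list(T = T_obs, X = X),
                  chains = 4, parallel_chains = 4)
fit$summary(c("mu", "sigma", "p12", "p13", "p23_given_1", "p23"))
```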
S_Simple_Cvine_Copula.r (4.6 KB)
simple_cvine_copula.stan (2.8 KB)
simple_multinormal.stan (675 Bytes)
simple_multinormal_indv_regressions.stan (1.5 KB)
simple_multinormal_partials.stan (1.3 KB)