Truncated cumulative multivariate normal?

I need normal_lcdf(...)T[l,u], but multivariate with dense covariance.

If this is not available yet, what work items would it entail to add?

Thanks.

1 Like

To my knowledge, they do not yet have any multivariate normal CDF; there was a bivariate normal CDF someone implemented at one point, but beyond that, I believe it’s an open problem.

What do you mean by “open problem”? At least in R, a package is available that offers sadmvn.

Many functions implementing the multivariate normal CDF are built around Alan Genz’s fortran routines (including the mnormt package you linked) which is a heavy amount of work to translate.

I’m not familiar enough with the algorithmic details of the various approximations of the CDF to say which is the most implementable in c++, but if @bgoodri is around he might have some helpful comments here

1 Like

On second thought, I need the density, not the cumulative distribution function (tmvnorm). Is the truncated density any easier to compute than the CDF?

That is easier. Ben Goodrich wrote up the details and a Stan implementation that can be found in this forum post: Multi_normal_lcdf? - #6 by aornugent

1 Like

I’ve been playing around with multivariate gaussian copulas and I think this also works. At least it gives reasonable answers vs the tmvtnorm R package.

The idea is that the joint pdf of some multivariate distribution is given by the product of the pdf of the copula and the pdf of the univariate marginals.

In the case of a multivariate truncated gaussian,

\begin{aligned} f(x_1, \ldots, x_D) = c(F_1(x_1), \ldots, F_D(x_d)) \prod_{d=1}^D {f_d}(x_d) \end{aligned}

If you want more of the math check out my repo that contains documentation on a bunch of stan functions including the multivariate gaussian copula GitHub - spinkney/helpful_stan_functions. Documentation for it is here.

You may noticed that each of the univariate F's can be different. But for your case each univariate will be a truncated gaussian distribution. We can write this in Stan as

functions {
  real multi_normal_copula_lpdf(matrix u, matrix L){
   int K = rows(u);
   int N = cols(u);
   real inv_sqrt_det_log = sum(log(diagonal(L)));
   matrix[K, N] x = mdivide_left_tri_low(L, u);

   return -N * inv_sqrt_det_log - 0.5 * sum(columns_dot_self(x) - columns_dot_self(u));
 }
 
 // univariate truncated normal
 // cached real lb, real ub as bound_norm
  vector truncated_normal_cdf (vector y, real mu, real sigma, real alpha, real bound_norm) {
    int N = num_elements(y);
    vector[N] zeta = Phi( (y - mu) / sigma);
  
    return (zeta - Phi( alpha ) ) / bound_norm;
  }
  
   real truncated_normal_lpdf (vector y, real mu, real sigma, real bound_norm) {
    int N = num_elements(y);

    return normal_lpdf( y | mu, sigma ) - N * log( bound_norm );
  }
  
}
data {
  int<lower=0> N; // number of data points
  int K; // number of marginals
  matrix[N, K] x;
  vector[K] lb;
  vector[K] ub;
}
parameters {
  vector<lower=lb, upper=ub>[K] mu;
  vector<lower=0>[K] sigma;
  cholesky_factor_corr[K] L;
}
transformed parameters {
  vector[K] alpha = (lb - mu) ./ sigma;
  vector[K] beta =  (ub - mu) ./ sigma;
  vector[K] bounds_normalization = Phi( beta ) - Phi( alpha );
}
model {
  sigma ~ lognormal(0, 2);
  
  // the following gets computed a bunch
  // so for optimization going to cache the result
  // but then this gets put in to the functions
  // normal_cdf(ub, mu, sigma) - normal_cdf(lb, mu, sigma)
  // cdf returns a real in Stan but it's easier to to keep
  // this cached result as a vector so will use std_normal_cdf Phi
  matrix[K, N] y;
  for (k in 1:K) {
    target += truncated_normal_lpdf(x[, k] | mu[k], sigma[k], bounds_normalization[k]);
    y[k] = inv_Phi(truncated_normal_cdf(x[, k], mu[k], sigma[k], alpha[k], bounds_normalization[k]))';
  }
    y ~ multi_normal_copula(L);
}
generated quantities {
  matrix[K, K] corr = multiply_lower_tri_self_transpose(L);
  matrix[K, K] Sigma = quad_form_diag(corr, sigma);
}

Here’s the R script to run through a simulation and compare to tmvtnorm

library(cmdstanr)
library(tmvtnorm)

fp <- file.path("multi_normal_copula_trunc_opt.stan")
mod <- cmdstan_model(fp, 
                     force_recompile = T)

N <- 2000

corr_ <- matrix(c(1.00,  0.40,  0.20, -0.80,
                0.40,  1.00, -0.30,  0.15,
                0.20, -0.30,  1.00, -0.50,
                -0.80,  0.15, -0.50,  1.00),
                byrow = T, nr = 4, ncol = 4)
tau <- diag(c(0.5, 1, 1.5, 2))

mu <- c(0,1.75,3,4)
lb <- c(-0.5, 1, 2, 3)
ub <- c(0.5, 2.5, 3.5, 4.5)
sigma <- tau %*% corr_ %*% t(tau)

X <- rtmvnorm(n = N, mu, sigma, lower = lb, upper = ub)
apply(X, 2, sd)


mle_fit <- mle.tmvnorm(x, 
                       lower = c(-0.5, 1, 2, 3), 
                       upper = c(0.5, 2.5, 3.5, 4.5),
                        start = list( 
                          mu = c(0, 1, 2, 3),
                          sigma = diag(apply(X, 2, sd))
                                       )
)

gmm_fit <- gmm.tmvnorm(x, lower = c(-0.5, 1, 2, 3), upper = c(0.5, 2.5, 3.5, 4.5),
            start = list( 
              mu = c(0, 1, 2, 3),
              sigma = diag(apply(X, 2, sd))
            )
)

sigma_coefs_gmm <- coef(gmm_fit)[5:14]
sigma_coefs_mle <- coef(mle_fit)[5:14]

make_sigma_mat <- function (sigma_coefs) { 
  sigma_est <- matrix(0, nr = 4, nc = 4)
  sigma_est[lower.tri(sigma_est, diag = T)] <- sigma_coefs     
  sigma_est[upper.tri(sigma_est)] <- t(sigma_est)[upper.tri( sigma_est )] 
  return( round(sigma_est, 3))
}

mod_out <- mod$sample(
  data = list(N = N,
              K = 4,
              x = X,
              lb =  c(-0.5, 1, 2, 3),
              ub = c(0.5, 2.5, 3.5, 4.5)),
  chains = 2,
  adapt_delta = 0.9,
  parallel_chains = 2,
  iter_warmup = 200,
  iter_sampling = 200
)

sigma
round(matrix(mod_out$summary("Sigma")$mean, 4, 4), 3)
make_sigma_mat(sigma_coefs_gmm)
make_sigma_mat(sigma_coefs_mle)
3 Likes

Gah, sorry for getting confused, but I’m actually trying to use mtmvnorm. Can this be made available in Stan?

Thanks.

I found “Multivariate probit regression” in the user guide. Reading Albert & Chib (1993) now.

Hm, I don’t think the Albert & Chib (1993) approach will work for the application I have in mind.

How about a univariate version of mtmvnorm, just to verify that I’m on the correct track? Can I easily get the mean and variance moments for a truncated multivariate Normal Distribution?

I got it, you want the mean and variance of the truncated mvn not an estimate of the mean and variance of the untruncated distribution.

Wikipedia has the following for the marginal variance

\sigma^2_{trunc} =\sigma^2\left[1+\frac{\alpha\phi(\alpha)-\beta\phi(\beta)}{Z} -\left(\frac{\phi(\alpha)-\phi(\beta)}{Z}\right)^2\right]

where Z=\Phi(\beta)-\Phi(\alpha). The copula picks up the correlation so then we just add this in the stan program as the following,

  vector var_truncated_lp (vector alpha, vector beta, vector bound_norm){
    int K = num_elements(alpha);
    vector[K] alpha_;
    vector[K] beta_;
    vector[K] term1;
    vector[K] term2; 
    
    for (k in 1:K){
      alpha_[k] = exp( std_normal_lpdf( alpha[k] ) );
      beta_[k] = exp( std_normal_lpdf( beta[k] ) );
      term1[k] = ( alpha[k] * alpha_[k] - beta[k] * beta_[k] ) / bound_norm[k];
      term2[k] = square( ( alpha_[k] - beta_[k] ) / bound_norm[k] );
    }
    
    return 1 + term1 - term2;
  }
...
transformed parameters {
...
  vector[K] trunc_sigma_sq = square(sigma) .* var_truncated_lp(alpha, beta, bounds_normalization);
...
}
...
generated quantities {
  matrix[K, K] trunc_Sigma = quad_form_diag(corr, sqrt(trunc_sigma_sq));
}

It compares favorably to mtmvnorm

> mod_out$summary("mu")$mean
[1] 0.06005775 1.71259863 3.27410267 4.19305352
> trunc_Sigma <- matrix(mod_out$summary("trunc_Sigma")$mean, 4, 4)
> trunc_Sigma
             [,1]         [,2]         [,3]         [,4]
[1,]  0.036023875  0.039911499 -0.004468582 -0.037026942
[2,]  0.039911499  0.165042702 -0.001867108  0.001938298
[3,] -0.004468582 -0.001867108  0.180443915 -0.013756003
[4,] -0.037026942  0.001938298 -0.013756003  0.180885067
> mtmvnorm(mu, sigma, lower = lb, upper = ub)
$tmean
[1] 0.05730107 1.75268291 2.78769348 3.77984515

$tvar
            [,1]         [,2]         [,3]         [,4]
[1,]  0.03633103  0.040667111 -0.006135056 -0.039144418
[2,]  0.04066233  0.170859997 -0.006098696  0.003880287
[3,] -0.00616403 -0.006099607  0.177662501 -0.007915609
[4,] -0.03914115  0.003870145 -0.008038443  0.181241198
# the mvtnorm package compares their calculation to the 
# empirical covariance estimate of the data
# which you can see corresponds similarly to the
# Stan output
> cov(X)
             [,1]         [,2]         [,3]         [,4]
[1,]  0.036653166  0.041935755 -0.004637356 -0.039094790
[2,]  0.041935755  0.168084869 -0.001970778  0.002548021
[3,] -0.004637356 -0.001970778  0.180895486 -0.014395542
[4,] -0.039094790  0.002548021 -0.014395542  0.183491556
2 Likes

A note about the above. The estimate of the mean still seems off. However, the hard part is getting the covariance.

I’m sure if you look in the mvtnorm package you can get out how to derive a better mean from the estimates above (or just use the empirical mean of the truncated obs).

future me

I looked up some references and this is a much harder problem than I thought. I wrongly assumed the marginals will be truncated normals but that is not necessarily the case. This is talked about in the mvtnorm vignette. I need to think more about if I can come up with something better. In the meantime, use what I’ve done with caution, as an approximation, and possibly (many) points where it can fail.

2 Likes

Yes. This.

1 Like

Looks like the Albert & Chib (1993) approach will work after all.

If and when you have a MWE, I think that’d be really nice to post here, if you can.

If and when you have a MWE

Happy to give something back, a2.stan (3.3 KB) test3.R (7.3 KB)

It’s an A/C/E variance decomposition of simulated twin data with an ordinal outcome. Point estimates look slightly better than those recovered by maximum likelihood.

1 Like