Efficiency improvement over inverse CDF sampling for simulated multivariate normal draws



As part of a larger model, @yjin and I are trying to use simulated random draws from a multivariate normal distribution. The reason for this is to account for individual heterogeneity across observations without trying to identify the individual errors themselves (which appears infeasible in our context).

The crux of the idea is to draw NS [independent uniform] random draws once and then transform them into multivariate normal draws inside the model, given the mean and correlation parameters (which we are trying to estimate), as with a copula (or ‘inverse CDF sampling’) approach. Sample Stan code is below. The current implementation is fairly slow, however, so we’re wondering whether there’s an efficiency gain to be made somewhere. @Bob_Carpenter or @bgoodri - perhaps you’d know?

  data {
    int I;
    int L;
  }

  transformed data {
    matrix[I, L] unif_1;
    matrix[I, L] unif_2;

    for (i in 1:I) {
      for (l in 1:L) {
        unif_1[i, l] = uniform_rng(0, 1);
        unif_2[i, l] = uniform_rng(0, 1);
      }
    }
  }

  parameters {
    cholesky_factor_corr[2] Sigma_corr_chol;
  }

  model {
    matrix[I, 2] epsilons;

    Sigma_corr_chol ~ lkj_corr_cholesky(1);

    for (l in 1:L) {
      // note the transpose: row vectors need z * L' so that cov(z * L') = L * L'
      epsilons = append_col(inv_Phi(unif_1[, l]), inv_Phi(unif_2[, l])) * Sigma_corr_chol'; // this needs L > 10K - slice sampling?

      // ...[stuff with epsilons down the line]
    }
  }

append_col does a copy that you might be able to avoid if epsilons is an array of row_vectors.
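For instance, a minimal sketch of the model block under that change (an assumption about how the original code is structured, using Stan's row-vector expression syntax, and assuming epsilons is only consumed row by row downstream):

  model {
    row_vector[2] epsilons[I];  // array of row_vectors instead of a matrix

    Sigma_corr_chol ~ lkj_corr_cholesky(1);

    for (l in 1:L) {
      for (i in 1:I) {
        // fill each row in place; no temporary matrix built by append_col
        epsilons[i] = [inv_Phi(unif_1[i, l]), inv_Phi(unif_2[i, l])] * Sigma_corr_chol';
      }
      // ...[stuff with epsilons down the line]
    }
  }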


Why not just use the non-centered generation of Gaussian variates? Taking \vec{z} \sim \mathcal{N}(\vec{0}, I) with \vec{x} = \vec{\mu} + L \cdot \vec{z}, where \Sigma = L \cdot L^{T}, implies \vec{x} \sim \mathcal{N}(\vec{\mu}, \Sigma).
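In the two-dimensional setting above, a sketch of what that could look like (shown for a single set of draws; the loop over l would carry over unchanged, e.g. with z declared as an array of matrices):

  transformed data {
    matrix[I, 2] z;

    for (i in 1:I) {
      for (k in 1:2) {
        z[i, k] = normal_rng(0, 1);  // standard normals drawn once, in place of the uniforms
      }
    }
  }

  parameters {
    cholesky_factor_corr[2] Sigma_corr_chol;
  }

  model {
    // rows of z are 1 x 2, so right-multiplying by L' gives covariance L * L'
    matrix[I, 2] epsilons = z * Sigma_corr_chol';

    Sigma_corr_chol ~ lkj_corr_cholesky(1);

    // ...[stuff with epsilons down the line]
  }

Since inv_Phi applied to a uniform draw is exactly a standard normal draw, this is distributionally identical to the copula construction but skips all the inv_Phi evaluations on every gradient evaluation.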


Yes, of course!! Thanks, @betanalpha!!