Implementing a Gaussian Copula with Gamma Margins in Stan - Numerical Stability Issues
Model Description
I’m implementing a model in Stan where the latent parameters lambda, mu, and nu have Gamma marginal priors and are correlated via a Gaussian copula.
Here’s the implementation of the multi_normal_cholesky_copula function and the model block:
functions {
real multi_normal_cholesky_copula_lpdf(matrix U, matrix L) {
int N = rows(U);
int J = cols(U);
matrix[J, J] Gammainv = chol2inv(L);
return -N * sum(log(diagonal(L))) - 0.5 * sum(add_diag(Gammainv, -1.0) .* crossprod(U));
}
}
model {
matrix[N,3] u; // To hold the three uniforms
// 1) Marginal Gamma priors
lambda ~ gamma(r, a);
mu ~ gamma(s, b);
nu ~ gamma(q, c);
// 2) Build uniforms via their marginal CDFs
for (i in 1:N) {
u[i, 1] = gamma_cdf(lambda[i] | r, a);
u[i, 2] = gamma_cdf(mu[i] | s, b);
u[i, 3] = gamma_cdf(nu[i] | q, c);
}
// 3) Copula ties the three margins together
target += multi_normal_cholesky_copula_lpdf(u | L);
Is this the correct and stable way to specify Gamma marginals with a Gaussian copula in Stan?
Also I got these warning errors:
Informational Message: The current Metropolis proposal is about to be rejected because of the following issue:
Chain 2 Exception: gamma_lpdf: Random variable[65] is inf, but must be positive finite! (in 'C:/Users/riana/AppData/Local/Temp/RtmpA3j600/model-73807ba7106.stan', line 58, column 2 to column 23)
Chain 2 If this warning occurs sporadically, such as for highly constrained variable types like covariance matrices, then the sampler is fine,
Chain 2 but if this warning occurs often then your model may be either severely ill-conditioned or misspecified.
Thank you for your help.
Hi, the quoted function doesn’t take parameter u (a unit vector). Instead, you should feed it a standard (multivariate) normally distributed vector, which is constructed from what you now call parameter u. What is missing from your code is a translation of u from the unit scale to the unbounded scale, using the normal copula. This can be achieved simply by the inv_Phi() function (approximate but fast):
should be
Instead of inv_Phi(), you could also use the inverse CDF of the normal distribution, which is available in Stan. But in my experience, it is slower (more computation involved) and can suffer from numerical instability. My advice would be to stick with the polynomial approximation via inv_Phi().
EDIT: I did not check your definition of the multi_normal_cholesky_copula_lpdf() function, but I’m assuming you adopted @spinkney’s implementation as described on their github here.
Yes, the fmin/fmax is set because of the warnings. After making adjustments - such as using cholesky_factor_corr and applying inv_Phi() - the warning still appears at the beginning, but as you mentioned, MCMC sampling runs successfully. Many thanks for your help!