Hi,
I have a model where the dependent variable is the sum of two gamma-distributed variables that share the same coefficient of variation (cv or tau) and thus have different rates, and I fit it with an approximation to the density of that sum. (I arrived at this monstrosity via the saddle point approximation, with sympy doing the algebra.)
The lpdf works incredibly well (although it could be faster). The issue is that I know it isn't a proper pdf that integrates to one, so I won't be able to compare the fit of the gamma-sum model against something else (using loo, for example). I'm therefore trying to get the normalizing constant through integration.
The problem is that the integration fails when I set the right limit to positive_infinity(); I get the error:
Chain 1 Exception: Error in function tanh_sinh<double>::integrate: The tanh_sinh quadrature evaluated your function at a singular point and got -nan. Please narrow the bounds of integration or check your function for singularities.
My question: does it make sense, then, to set the right limit of the integration to a very large number instead of positive_infinity()? E.g., 2999.9, given that I expect the actual values of x to be no larger than 10. (Larger right limits give me the same error as positive infinity, though only sporadically rather than every time, which annoyingly prevents me from getting the fit of the model at all, even though the integration happens in generated quantities. Slightly smaller right limits give me the same normalizing constant as 2999.9.)
(I've played with the density function in R: I can set x to very large values, and for x > 10000 the density underflows to 0, but I cannot set x to infinity, where I get a NaN. A sketch of that R check is included right after the Stan code below.)
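(For concreteness, the parameterization I have in mind for the two components, matching rgammasum in the R code further down, is the usual shape/rate one; the R names below are only illustrative.)

alpha  <- 1 / cv^2          # common shape, so both components have coefficient of variation cv
beta_1 <- 1 / (mu_1 * cv^2) # rate of the first component, giving mean mu_1
beta_2 <- 1 / (mu_2 * cv^2) # rate of the second component, giving mean mu_2

Here is the full Stan program: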
functions {
  real gammasum_lpdf(real x, real mu_1, real mu_2, real tau) {
    //approximated pdf
    real x0 = log(2);
    real x1 = tau^(-2);
    real x2 = 2*x1;
    real x3 = mu_2*x;
    real x4 = mu_1*x;
    real x5 = mu_1^2;
    real x6 = mu_2^2;
    real x7 = x5*x6;
    real x8 = x^2;
    real x9 = mu_1*mu_2;
    real x10 = 2*x9;
    real x11 = -x10*x8 + x5*x8 + x6*x8;
    real x12 = x11 + 4*x7;
    real x13 = sqrt(x12);
    real x14 = x10 + x13;
    real x15 = 1/mu_1;
    real x16 = x/2;
    real x17 = 1/mu_2;
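    // (note: the literal 1/2 in the first term of the return below is integer
    // division in Stan and evaluates to 0; 0.5 would give log(2)/2)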
    return x0*(x2 + 1/2) - x0 + x1*(log(mu_1) + log(mu_2)) + (x2 - 1)*log(x) - log(tau)
      //after sympy.simplify
      //log((x14 - x3 + x4)^(-x1)*(x14 + x3 - x4)^(-x1)*exp(x1*(x13*x15*x17/2 - x15*x16 - x16*x17 + 1))/sqrt((x10*x13 + x12)/(x11 + 4*x13*x9 + 8*x7)))
      // my own simplification
      + x1*(x13*x15*x17/2 - x15*x16 - x16*x17 + 1)
      + log((x14 - x3 + x4)^(-x1)*(x14 + x3 - x4)^(-x1))
      - log(x10*x13 + x12)/2
      + log_sum_exp([log(x11), log(4) + log(x13) + log(x9), log(8) + log(x7)])/2
      - log(pi())/2;
  }
  real gammasum_pdf(real x,        // Function argument
                    real xc,       // Complement of function argument
                                   // on the domain (defined later)
                    real[] theta,  // parameters
                    real[] x_r,    // data (real)
                    int[] x_i) {   // data (integer)
    real mu_1 = theta[1];
    real mu_2 = theta[2];
    real tau = theta[3];
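    // xc (the distance from the nearer endpoint of the domain) is not used here.
    // At x = positive_infinity() the intermediate terms of the lpdf give inf - inf = nan,
    // which is presumably what the tanh_sinh error above is hitting.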
    return exp(gammasum_lpdf(x | mu_1, mu_2, tau));
  }
}
data {
  int<lower = 0> N;
  real<lower = 0> y[N];
}
transformed data {
  int x_i[0];
  real x_r[0];
}
parameters {
  positive_ordered[2] mu;
  real<lower=0> cv;
}
model {
  target += normal_lpdf(mu | 0, 1);
  target += normal_lpdf(cv | 0, 1);
  for(j in 1:N){
    target += gammasum_lpdf(y[j] | mu[1], mu[2], cv);
  }
}
generated quantities {
  vector[N] approx_log_lik;
  real K;
  vector[N] log_lik;
  for(j in 1:N){
    approx_log_lik[j] = gammasum_lpdf(y[j] | mu[1], mu[2], cv);
  }
  // print("mu_1 ", mu[1],
  //       ";mu_2 ", mu[2],
  //       ";tau ", cv);
  K = log(integrate_1d(gammasum_pdf,
                       0,
                       // 10000.0,
                       2999.9,
                       // 100000000.0,
                       // positive_infinity(),
                       { mu[1], mu[2], cv }, x_r, x_i));
  log_lik = approx_log_lik + K;
}
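For reference, this is roughly what I meant above by playing with the density in R: a direct transcription of the Stan lpdf into R (the name dgammasum_l, the log_sum_exp helper, and the specific numbers are only illustrative), plus the integrate() calls I use to look at the normalizing constant.

# direct R transcription of gammasum_lpdf (x0..x17 come straight from sympy)
dgammasum_l <- function(x, mu_1, mu_2, tau) {
  log_sum_exp <- function(v) { m <- max(v); m + log(sum(exp(v - m))) }
  x0 <- log(2)
  x1 <- tau^(-2)
  x2 <- 2 * x1
  x3 <- mu_2 * x
  x4 <- mu_1 * x
  x5 <- mu_1^2
  x6 <- mu_2^2
  x7 <- x5 * x6
  x8 <- x^2
  x9 <- mu_1 * mu_2
  x10 <- 2 * x9
  x11 <- -x10 * x8 + x5 * x8 + x6 * x8
  x12 <- x11 + 4 * x7
  x13 <- sqrt(x12)
  x14 <- x10 + x13
  x15 <- 1 / mu_1
  x16 <- x / 2
  x17 <- 1 / mu_2
  # note: here 1/2 is 0.5 (floating-point division), unlike the integer 1/2 in the Stan code
  x0 * (x2 + 1/2) - x0 + x1 * (log(mu_1) + log(mu_2)) + (x2 - 1) * log(x) - log(tau) +
    x1 * (x13 * x15 * x17 / 2 - x15 * x16 - x16 * x17 + 1) +
    log((x14 - x3 + x4)^(-x1) * (x14 + x3 - x4)^(-x1)) -
    log(x10 * x13 + x12) / 2 +
    log_sum_exp(c(log(x11), log(4) + log(x13) + log(x9), log(8) + log(x7))) / 2 -
    log(pi) / 2
}

# the behaviour described above: underflow at large x, NaN at infinity
dgammasum_l(10000, 0.5, 0.15, 0.22) # hugely negative, so exp() of it underflows to 0
dgammasum_l(Inf,   0.5, 0.15, 0.22) # NaN (inf - inf in the intermediate terms)

# normalizing constant by quadrature; integrate() maps an infinite upper limit
# onto a finite interval internally, so it never evaluates the integrand at x = Inf
f <- Vectorize(function(x) exp(dgammasum_l(x, 0.5, 0.15, 0.22)))
integrate(f, lower = 0, upper = 50)  # finite right limit, comfortably above the data
integrate(f, lower = 0, upper = Inf)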
Code to try it in R
library(cmdstanr)

rgammasum <- function(N, mean1 = 1, mean2 = 1, cv = 1) {
  alpha <- 1 / cv^2
  beta1 <- 1 / (mean1 * cv^2)
  beta2 <- 1 / (mean2 * cv^2)
  rgamma(N, alpha, rate = beta1) + rgamma(N, alpha, rate = beta2)
}

N <- 1000
y <- rgammasum(N, .5, .15, .22)

m_ss <- cmdstan_model("gamma_sum_saddle.stan")
fit_ss <- m_ss$sample(
  data = list(N = length(y), y = y),
  seed = 123,
  parallel_chains = 4,
  max_treedepth = 15,
  adapt_delta = .9)
Output:
> fit_ss$summary(c("mu","cv","K", "approx_log_lik"))
# A tibble: 1,004 x 10
variable mean median sd mad q5 q95 rhat
<chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 mu[1] 0.136 0.110 8.75e-2 7.95e-2 0.0240 0.306 1.01
2 mu[2] 0.511 0.536 8.74e-2 7.95e-2 0.340 0.622 1.01
3 cv 0.210 0.205 2.12e-2 2.35e-2 0.180 0.245 1.02
4 K -0.344 -0.344 3.67e-4 3.59e-4 -0.345 -0.344 1.00
5 approx_… 0.500 0.501 3.44e-2 3.35e-2 0.441 0.557 1.01
6 approx_… 0.913 0.912 2.22e-2 2.21e-2 0.877 0.949 1.01
7 approx_… 0.545 0.545 3.28e-2 3.16e-2 0.490 0.599 1.01
8 approx_… 0.909 0.908 2.22e-2 2.19e-2 0.873 0.945 1.01
9 approx_… 0.868 0.868 2.31e-2 2.31e-2 0.832 0.906 1.01
10 approx_… 0.462 0.462 2.80e-2 2.70e-2 0.414 0.508 1.01
# … with 994 more rows, and 2 more variables: ess_bulk <dbl>,
# ess_tail <dbl>
Unrelated to the main issue (maybe @martinmodrak has some idea about this): isn't it strange that the normalizing constant is so large compared with the approximated likelihood? I understood that the saddle point approximation should be quite close to the real pdf, including the normalizing constant.
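In case it helps with diagnosing this, one check I can think of (just a sketch: dgammasum_exact is a made-up name, it reuses dgammasum_l from the R transcription above, and it assumes the two components are independent, as in rgammasum) is to compute the exact density of the sum by numerical convolution and compare it with the exponentiated approximation:

# exact pdf of the sum of two independent gammas, by numerical convolution
dgammasum_exact <- function(s, mean1, mean2, cv) {
  alpha <- 1 / cv^2
  beta1 <- 1 / (mean1 * cv^2)
  beta2 <- 1 / (mean2 * cv^2)
  integrate(function(u) dgamma(u, alpha, rate = beta1) * dgamma(s - u, alpha, rate = beta2),
            lower = 0, upper = s)$value
}

# compare at a few points in the bulk of the data, e.g.:
# dgammasum_exact(0.7, 0.5, 0.15, 0.22)
# exp(dgammasum_l(0.7, 0.5, 0.15, 0.22))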
Thanks!