Hello there!
I recently faced a problem with sampling and wanted to share it with you guys. Maybe someone can help me out! Suppose we have two independent random variables X \sim F_{\alpha} and Y \sim F_{\beta} with known \alpha and \beta. The problem is that we want to sample from X \mid X + Y; that is, what is the distribution of X given the sum X + Y, when both marginal distributions are known?
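To fix notation, writing f_X and f_Y for the two densities, the conditional density is, up to a normalizing constant,

f_{X \mid X + Y = s}(x) \;\propto\; f_X(x)\, f_Y(s - x),

and the normalizing constant is the density of X + Y evaluated at s, which is what we usually cannot write down in closed form.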
Well, since in general we do not have a closed-form expression for this conditional density (the normalizing constant involves the density of X + Y), I think it would be nice to propose a general solution, if possible. It is easy to write a Stan program that does this: we put a prior on X, and the likelihood is the distribution of X + Y \mid X, which is known. Below I show the code for the case where X and Y come from normal distributions. Happily, the Stan sampler works as expected, because we can compare it with the analytical form of the conditional distribution. In the image below, the blue histogram represents samples from the true conditional distribution, while the blue line is the Stan posterior result. They agree!
Code: normal distribution
data {
  real s;                     // observed value of the sum s = x + y
  real mu1;                   // mean of X
  real mu2;                   // mean of Y
  real<lower=0> sigma1;       // sd of X
  real<lower=0> sigma2;       // sd of Y
}
parameters {
  real x;
}
model {
  x ~ normal(mu1, sigma1);      // prior: marginal distribution of X
  s ~ normal(mu2 + x, sigma2);  // likelihood: X + Y | X = x
}
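For reference, the closed form I compare against in the normal case follows from standard bivariate-normal conditioning (with mu1, sigma1 for X and mu2, sigma2 for Y, as in the code above; the second argument below is the variance):

X \mid X + Y = s \;\sim\; \mathcal{N}\!\left(\mu_1 + \frac{\sigma_1^2}{\sigma_1^2 + \sigma_2^2}\,(s - \mu_1 - \mu_2),\;\; \frac{\sigma_1^2\,\sigma_2^2}{\sigma_1^2 + \sigma_2^2}\right)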
However, in another very simple case, problems start to appear. Suppose X and Y have gamma distributions with a shared \beta. We can again derive an analytical expression for the conditional distribution for comparison. Unfortunately, Stan does not work well when the data come from gamma distributions with small shape parameters! With \alpha_x = 0.2, \alpha_y = 0.2 and \beta = 0.2, the image below shows the result. Notice that the samples from the true conditional distribution (the histogram) concentrate near the bounds! Divergences appear mainly next to the boundary, but not exclusively. This is not so surprising, since the curvature changes a lot and a large part of the mass sits near the boundary. If we instead suppose that the random variables come from beta distributions, we have no closed-form expression to compare against, but the divergences show that the problem persists and is even worse!
Below we see the divergences.
Code: gamma distribution
data {
  real<lower=0> s;            // observed value of the sum s = x + y
  real<lower=0> alpha1;       // shape of X
  real<lower=0> alpha2;       // shape of Y
  real<lower=0> beta1;        // rate of X
  real<lower=0> beta2;        // rate of Y
}
parameters {
  real<lower=0, upper=s> x;   // X must lie in (0, s)
}
model {
  target += gamma_lpdf(x | alpha1, beta1);      // log f_X(x)
  target += gamma_lpdf(s - x | alpha2, beta2);  // log f_Y(s - x)
}
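For reference, the analytical expression in the gamma case (when beta1 = beta2 = \beta, the shared-\beta case I use above) comes from the fact that X/(X+Y) \sim \mathrm{Beta}(\alpha_1, \alpha_2) independently of X + Y, so

X \mid X + Y = s \;\overset{d}{=}\; s\,B, \qquad B \sim \mathrm{Beta}(\alpha_1, \alpha_2),

i.e. the conditional density is proportional to x^{\alpha_1 - 1}(s - x)^{\alpha_2 - 1} on (0, s). With \alpha_1 = \alpha_2 = 0.2 this density is unbounded at both endpoints, which is consistent with the histogram piling up near the bounds and with the divergences concentrating there.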
I noticed that the problem appears when the shape parameters are small; when they are not, everything works fine. Moreover, I remember that sampling from gamma or beta distributions with small shape parameters is hard in general. Sometimes R returns samples that are exactly 1 or 0, which should not happen for these continuous distributions.
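Just to illustrate that last point in R (the shape values here are extreme on purpose, only for demonstration):

## With very small shape parameters, draws can underflow or round to
## exactly 0 or 1 in double precision, even though the true distributions
## put zero probability on those points.
set.seed(1)
g <- rgamma(1e5, shape = 0.001, rate = 0.2)
mean(g == 0)   # a noticeable fraction of the draws is exactly 0

b <- rbeta(1e5, shape1 = 0.01, shape2 = 0.01)
mean(b == 1)   # some draws come out exactly 1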
I would be very thankful if someone could help me with this!