I am interested in knowing how to take samples from a conditional probability distribution.
Suppose I have the joint probability distribution for a bivariate normal distribution:
Suppose the mean vector for this bivariate normal distribution is mu = (5,10) and the covariance matrix (element_11, element_12, element_21, element_22) is (1,0.5,0.5,1). Using the R programming language, here is a visualization of this (R will generate many points from this specific bivariate normal distribution and then plot these points):
library("mvtnorm")
range = seq(-5,5,0.1)
mean = c(5,10)
Sigma = matrix(c(1, .5, .5, 1), 2)
out = matrix (rep(0,101*101),101)
for (i in 1:length(range)){
for (j in 1:length(range)){
out[i,j] = dmvnorm(c(range[i],range[j]),mean=mean,sigma=Sigma)
}
}
persp(out,theta = 30,phi = 30,col="lightblue")
Problem: I am interested in taking samples from this joint probability distribution and finding out the conditional distribution, e.g. P(X1 | X2 = 0.5).
What I have tried so far:
1) Generating the conditional distribution based on the (convenient) theoretical properties of the multivariate gaussian (i.e. normal) distribution
There is a convenient theoretical property of multivariate guaussian distributions which easily allow us to move between the joint probability distribution and the conditional probability distribution :
So if I want to find out “P(X1 | X2 = 0.5)” : P(X1 | X2 = 0.5) itself has a normal distribution with mu = 5 + 0.5(0.5-10) and sigma = 1 - (0.5^2 / 1). Thus, using the theoretical properties of the multivariate gaussian distribution, P(X1 | X2 = 0.5) ~ N( mu = 0.25, sigma = 0.75) .
In R, I can simulate this conditional distribution easily:
hist(rnorm(10000,0.25,0.75))
However, in the real world, we can not always rely on the convenient properties of the multivariate normal distribution - especially when we have high dimensional, irregular and non-normal distribution functions. This is why we use different MCMC (Markov Chain Monte Carlo) based approaches for taking samples from the conditional distribution, such as:
- Importance Sampling
- Acceptance-Rejection Sampling
- Gibb’s Sampling
- Metropolis-Hastings
My Question: Assuming we know the joint probability distribution and have an expression for the conditional distribution, does anyone know any R packages (or ways in STAN) to take samples from this conditional distribution? I spent some time looking online, but there does not appear to be a straightforward way to do this. Suppose I take the same conditional distribution from above: P(X1 | X2 = 0.5) : Is it possible to take samples (with R) from this conditional distribution using any of the above sampling algorithms? Can someone please point me in the right direction?
Thanks!
****Note: **** I found a library in R (“AR” : CRAN - Package AR) which perform Acceptance-Rejection Sampling, but only on a limited number of distributions (e.g. not custom distributions). Below, I used this library to take samples from an Exponential Distribution (lambda = 2), with the Log Normal Distribution (mu = 0, variance = 1) as the candidate distribution:
library(AR)
f_X = function(x) dlnorm(x,0,1)
Sim <- AR.Sim(n=1000, f_X, Y.dist="exp", Y.dist.par=2 )
hist (Sim)