Calculating an integral using the posterior predictive distribution as its measure with Stan

I am trying to calculate an integral that uses the posterior predictive distribution as its measure with Stan.
I have a candidate solution, but I am not sure whether it is correct. Could you please tell me whether the strategy of the following code is right or not?

Let y be data to which a model is fitted, and let
\theta be a model parameter,
f(y|\theta) be the model,
\pi(\theta|y) be the posterior.

Let T(y,\theta) be a function, and let y_{\text{obs}} be an observed dataset.
To calculate the integral \int \int T(y,\theta) f(y|\theta)\pi(\theta|y_{\text{obs}})\, dy\, d\theta, I apply the Monte Carlo approximation twice, as follows.

\int \int T(y,\theta) f(y|\theta)\pi(\theta|y_{\text{obs}})\, dy\, d\theta
\approx \frac{1}{I} \sum_{i=1}^{I} \int T(y,\theta_i) f(y|\theta_i)\, dy
\approx \frac{1}{I} \sum_{i=1}^{I} \frac{1}{J} \sum_{j=1}^{J} T(y_{ij},\theta_i),
where \theta_i denotes a posterior sample and y_{ij} is a sample drawn from f(\,\cdot\,|\theta_i), i.e.,
\theta_i \sim \pi(\,\cdot\,|y_{\text{obs}}) and y_{ij} \sim f(\,\cdot\,|\theta_i).
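
To make the two-level approximation concrete, here is a minimal sketch in plain R; theta_draws is only a stand-in for posterior samples, and the names theta_draws, T_fun, and J are my own, not from any package:

# Two-level Monte Carlo: outer average over posterior draws theta_i,
# inner average over replicates y_ij ~ f(. | theta_i).
theta_draws <- rnorm(1000, mean = 0.5, sd = 0.1)  # stand-in for posterior draws
T_fun <- function(y, theta) y + theta             # the test quantity T(y, theta)

J <- 100                                          # replicates per posterior draw
inner <- sapply(theta_draws, function(theta_i) {
  y_rep <- rnorm(J, mean = theta_i, sd = 1)       # y_ij ~ Normal(theta_i, 1)
  mean(T_fun(y_rep, theta_i))                     # (1/J) sum_j T(y_ij, theta_i)
})
mean(inner)                                       # (1/I) sum_i: estimate of the integral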

Example

In the following code, the model is

y \sim \text{Normal}(\mu,1)
\mu \sim \text{Normal}(0,10)

and \mu is a model parameter (in the above it is denoted by \theta).

T(y,\mu) = y + \mu.

In the following code,
the posterior mean of the quantity T defined in the generated quantities block seems to be the desired integral. But I suspect it merely calculates

\frac{1}{I}\sum_{i=1}^{I} T(y_{i},\theta_i),

which is not the same as the desired summation, namely

\frac{1}{I}\sum_{i=1}^{I}\frac{1}{J}\sum_{j=1}^{J} T(y_{ij},\theta_i),

because in the latter the subscripts i and j range independently, while in the former they do not. In other words, in the former summation the sample y_i is drawn only once from the model f(\,\cdot\,|\theta_i), but I suspect it should be drawn more times.

Do I need more complicated code to calculate the integral, or is the following code sufficient?

stanmodelcode <- "
data {
  int<lower=0> N;
  real y[N];
} 

parameters {
  real mu;
} 

model {
  target += normal_lpdf(mu | 0, 10);
  target += normal_lpdf(y  | mu, 1);
} 

generated quantities{
real yyy = normal_rng(mu,1);
real T = yyy + mu;
}
"

library(rstan)

y <- rnorm(20)
dat <- list(N = 20, y = y)
fit <- stan(model_code = stanmodelcode, model_name = "example",
            data = dat, iter = 2012, chains = 3, verbose = FALSE,
            sample_file = file.path(tempdir(), 'norm.csv'))
print(fit)
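
For comparison, the "more complicated" version I have in mind could be done outside Stan, by extracting the posterior draws of mu and simulating J replicates per draw in R (a sketch; J = 50 is an arbitrary choice of mine):

# Alternative: J replicates of y per posterior draw of mu, simulated in R.
mu_draws <- rstan::extract(fit)$mu      # posterior draws of mu
J <- 50                                 # replicates per draw
T_bar <- sapply(mu_draws, function(m) {
  y_rep <- rnorm(J, mean = m, sd = 1)   # y_ij ~ Normal(mu_i, 1)
  mean(y_rep + m)                       # inner average of T(y, mu) = y + mu
})
mean(T_bar)                             # outer average: estimate of the integral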

It’s not clear to me what you mean by “an integral”, or more precisely what you are trying to accomplish, but if I understand more or less what you are trying to do, here are a few comments and possible clarifications.

If you need to compute any quantity using the posterior distribution, all you need is to compute g(\theta)\pi(\theta | y) and integrate over \theta if that is the quantity of interest (i.e. \int g(\theta) \pi(\theta | y)\, d\theta). You shouldn’t need to integrate over the distribution of y, since that ranges over all possible configurations of the data, and as far as your inference is concerned everything is conditional on the data set you have (i.e. y_{\text{obs}}). f(y | \theta) is not that (a distribution of y given the parameters); it is just the likelihood of y_{\text{obs}} for any sample from the posterior, and it is there only to enable the calculation (by sampling) of the posterior distribution: \pi(\theta | y) \propto f(y | \theta) p(\theta).
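
For instance, once you have the posterior draws, a posterior expectation \int g(\theta)\pi(\theta|y)\, d\theta is just the average of g over the draws; a sketch, reusing the fit from your code and an illustrative g of my choosing:

# Posterior expectation of a function g of the parameter, from the draws.
mu_draws <- rstan::extract(fit)$mu  # posterior draws of mu from the fit above
g <- function(theta) theta^2        # example: E[mu^2 | y]
mean(g(mu_draws))                   # Monte Carlo estimate of the integral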

I hope I didn’t misunderstand your question, and that this helps.

Thank you!!

By " integral" , I mean the term \int \int T(y,\theta) f(y|\theta)\pi(\theta|y_{\text{obs}}) dy d\theta \cdots \cdots \cdots \cdots(1),

and, in the following, I refer this by the integral (1).

What I am trying to accomplish is the calculation of integral (1) in a concrete example using Stan code.

Integral (1) appears in, e.g., the section on posterior predictive p-values in the book Bayesian Data Analysis, 2nd edition, by Andrew Gelman et al.

In my previous post, I gave an example in which I tried to calculate integral (1) for T(y,\theta) = y + \theta, under the assumption that y \sim \text{Normal}(\mu, 1) and \mu \sim \text{Normal}(0,10). In this context, \theta is denoted by \mu and integral (1) becomes

\int \int T(y,\theta) f(y|\theta)\pi(\theta|y_{\text{obs}})\, dy\, d\theta = \int \int (\mu + y)\,\text{Normal}(y|\mu,1)\,\pi(\mu|y_{\text{obs}})\, dy\, d\mu, \quad (1)'

where \text{Normal}(y|\mu,1) denotes the probability density of a Gaussian with mean \mu and variance 1, and the posterior satisfies \pi(\mu|y_{\text{obs}}) \propto \text{Normal}(y_{\text{obs}}|\mu,1)\,\text{Normal}(\mu|0,10). If the previous code is correct, then integral (1)' is obtained as the posterior mean of the quantity T defined in the generated quantities block.
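
As a sanity check (my own derivation, not from the book): since \mathbb{E}[y|\mu] = \mu under y \sim \text{Normal}(\mu,1), integral (1)' simplifies analytically:

\int \int (\mu + y)\,\text{Normal}(y|\mu,1)\,\pi(\mu|y_{\text{obs}})\, dy\, d\mu
= \int 2\mu\,\pi(\mu|y_{\text{obs}})\, d\mu
= 2\,\mathbb{E}[\mu\,|\,y_{\text{obs}}],

so the posterior mean of T should come out close to twice the posterior mean of mu in the fitted model.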

My biggest concern is that in the above code I draw data y from each model f(\,\cdot\,|\theta_i), with \theta_i a posterior sample, only one time, whereas I suspect that many samples should be drawn from f(\,\cdot\,|\theta_i) for each \theta_i. But if I draw many samples from each model f(\,\cdot\,|\theta_i), the code becomes more complicated, and if that is redundant I would rather not do it. I am not sure, but if my memory is correct, in the above book (Bayesian Data Analysis, 2nd edition, by Andrew Gelman et al.) integral (1) is calculated using code like that in my previous post.
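
To be explicit, the "many samples" version I am imagining would replace the generated quantities block with something like the following (my own sketch, with J fixed arbitrarily at 10):

generated quantities {
  real T_bar = 0;                    // running mean of T over J replicates
  for (j in 1:10) {                  // J = 10 replicates per posterior draw
    real y_rep = normal_rng(mu, 1);  // y_ij ~ Normal(mu, 1)
    T_bar += (y_rep + mu) / 10;      // accumulate T(y_ij, mu) / J
  }
}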


Edit

I fixed a misprint; I did not intend to bump this thread by doing so, sorry.

OK, that is the integral to compute the Bayesian version of a p-value; maybe you should have led with that. So T(y, \theta) is a test statistic. I only have the third edition of the book, and the expression there looks slightly different, but it seems you are trying to evaluate p_B = \mathrm{Pr}(T(y^{\text{rep}}, \theta) \geq T(y,\theta)\,|\,y), so again that is conditional on y_{\text{obs}} (or simply y), which is not being integrated over.

Instead, y^{\text{rep}} represents any possible replication of the data; that is what I was missing. So as long as you simulate replicates of y from the posterior draws, you will get a distribution of y^{\text{rep}} along with your distribution \pi(\theta|y), and you can evaluate any function, like the test statistic T(y,\theta), using the sums to approximate the double integral.
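
Concretely, given the fit from your code (where y is the observed data), the p-value is just the fraction of posterior draws whose replicated statistic exceeds the observed one. Here is a sketch, using the sample maximum as an illustrative choice of T that depends on the whole data set:

# Posterior predictive p-value: Pr(T(y_rep, theta) >= T(y_obs, theta) | y).
mu_draws <- rstan::extract(fit)$mu
p_B <- mean(sapply(mu_draws, function(m) {
  y_rep <- rnorm(length(y), mean = m, sd = 1)  # one replicate data set per draw
  max(y_rep) >= max(y)                         # compare replicated vs observed T
}))
p_B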

Not sure if this is just repeating things you already know. Your code is pretty straightforward, but I’d have to look more closely to make sure it’s actually doing what you want. But I hope this helps.
