Bayesian melding

Hi all,

I’m trying to replicate a simple example of Bayesian melding from page 1250 of Poole & Raftery (2000) paper (http://amstat.tandfonline.com/doi/abs/10.1080/01621459.2000.10474324#.Wl5WxyOZO1s).

The model is given by Z = Y/X, where X~U(2,4); Y~U(6,9) and Z~U(0,5). Melding consists of taking the geometric pooling of the prior of Z induced by Y/X and the independent prior of Z ~ U(0,5).

I’m having problem generating the induced distribution of Z to be further used in the pooling.

In the example below I approximate the induced distribution by a gamma(17.5,6.6) and the results seem to match the paper. However, I was looking for a more generalizable approach (possibly using a nonparametric density estimation?), which could be used in more complex models. Is there a way to do this in Stan?

parameters {
real<lower=0> X;
real<lower=0> Y;
}

transformed parameters {
real<lower=0> Z;
Z = Y/X;
}

model{
X ~ uniform(2,4);
Y ~ uniform(6,9);
target += -log_sum_exp (uniform_lpdf(Z | 0,5), gamma_lpdf(Z | 17.5,6.6));
}

Thanks,

Gabriel

1 Like

Hi Gabriel,

I’d love if @Bob_Carpenter, @mitzimorris or @randommm could confirm, but in general I don’t think you can access posterior samples from inside the model block. To me this seems to indicate that it’d be very hard to get a density estimate for Z internally.

I made a couple of tweaks to your original code, which can be found here. Basically I:

  • Made the dependence on the pooling weight (parameter) alpha explicit;
  • Generalised the code a little bit so you can plug in different bounds for the (terrible) uniform priors in the example;
  • Wrote a little function that gets a method-of-moments estimate of the parameters of a Gamma distribution for the induced distribution on Z.

This last step could easily be done – much better – outside Stan, but I guess it’s convenient to have it all in one place. I wonder if one could devise a way to get a density by writing a function that would take a value z and a transformed data array Z_rep and output a density for z (cf. the function get_MM_ests in the code linked above). I bet it would be almost obscenely slow, but I guess that could be done.

I do realise this does not answer the original question, though.

1 Like

Thanks, this is a nice workaround.

In this simple example, the gamma distribution seems to work, though it is probably not the best choice. In more complicated models, the choice of the distribution for the induced prior might be quite tricky. I was wondering it is possible to do something with Gaussian processes, like discussed here.

One thing I noticed is that the distributions that result from your code don’t reproduce the results in the paper. I was wondering if we shouldn’t be doing log(alpha) instead of alpha. That seems to work. The last line of the model code would then be:
target += log(alpha) * uniform_lpdf(Z |0,5) + log(1-alpha)*gamma_lpdf(Z | parms[1], parms[2]);
Does that make sense?

I don’t think it makes sense, no. If f(z) = f_1(z)^a* f_2(z)^{1-a}, then log f(z) = a *log f_1(z) + {1-a}*log f_2(z). I ditched X and Y and implemented the exact (not normalised) target here.

What I haven’t figured out yet is how to obtain samples from X and Y.

Update (2018-01-26): I modified the code linked above to use the induced joint distribution of X and Y and now we get apparently correct samples for X, Y and Z. @gmendesb, please let me know if this “fixes” the problem.

1 Like

Great! That looks good to me. In this example, it is possible to implement the exact target of the induced distribution. In real world examples, we would still have to pick a distribution and estimate its parameters using, for example, the method of moments, right? Something like the gamma distribution we were doing before.
Thanks, @maxbiostat !

1 Like

Yes, absolutely. As an “exercise”, I’d encourage you to use the method of moments thing we came up with to get a Gamma distribution for Z and see how that affects the posteriors for X and Y. It would be an interesting thing to compare the “exact” posteriors with their approximate counterparts.

1 Like

I’m just catching way up on the list. Sorry for the delay.

That is correct.

The model block computes the log density which is used by the sampling algorithm to generate the posterior draws. There’s no way to make them available inside the model block. The entire sample of posterior draws is then analyzed after the fact to calculate expectations for estimators or expectations for event probabilities.

I couldn’t follow your explanation of the model. Statn only requires a continuous log density that’s proper—it doesn’t care how you define it.

You can do what you’re trying to do in Stan, but you should understand what the density is that you’e defining, as it may not be what you expect. I’d have expected a Jacobian adjustment for Z, but then I don’t see why there are these multiple priors in the first place.

I found Raftery’s version of the paper, but don’t think I’ll have time to dive into it.