I trained a Bayesian model with the rstanarm package, and I have predicted the posterior distributions for two distinct observations. I want to know how similar or different these two posterior distributions are. Do you know which technique/tool I should use?
Welcome to the forums!
The appropriate tools will depend on your reason for making the comparison. Are you trying to understand the behavior of the model, and therefore want to assess fine details of how the posterior distributions vary? Or are you trying to do inference on, for example, the strength of evidence for one of the two quantities being larger than the other? Or something else?
Thank you for your message! I’m interested in the second scenario that you propose: “strength of evidence for one of the two quantities being larger than the other”. I could do this easily by, for example, taking the MAP of each distribution and computing the difference, but I understand that I would be losing some information, especially in scenarios where a posterior distribution is bimodal.
I have tried to tackle a similar problem myself, and I have used two different approaches depending on the circumstance. They may be useful for you.
One. Assess the degree of overlap between the distributions. This approach ignores the shape of the distributions and simply measures the extent to which they overlap. The caveat is that you have to interpret for yourself whether the overlap indicates that one distribution generally gives larger values than the other: for example, overlap could be low simply because one distribution is much more spread out than the other.
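As a minimal base-R sketch of the idea (the variable names are illustrative, and dedicated packages implement the published overlapping index more carefully): estimate each density on a common grid and integrate the pointwise minimum, which gives 1 for identical distributions and 0 for disjoint ones.

```r
# Overlap between two vectors of posterior draws, via kernel density
# estimates evaluated on a shared grid.
overlap_index <- function(a, b, n = 1024) {
  rng  <- range(c(a, b))
  grid <- seq(rng[1], rng[2], length.out = n)
  da <- approx(density(a), xout = grid, yleft = 0, yright = 0)$y
  db <- approx(density(b), xout = grid, yleft = 0, yright = 0)$y
  sum(pmin(da, db)) * diff(grid[1:2])  # Riemann sum of min(f_A, f_B)
}

post_a <- rnorm(4000, 0, 1)  # stand-ins for posterior draws
post_b <- rnorm(4000, 1, 1)
overlap_index(post_a, post_b)
```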
Two. When I wanted to say, given posterior estimates of two parameters, whether one tended to be larger than the other, an option was ‘probability of superiority’. You would have to check the extent to which it applies to your case, but essentially, given two vectors of posterior estimates, it tells you the probability (i.e., the percentage of draws) in which one vector gives larger estimates than the other.
Ruscio, J. (2008). A probability-based measure of effect size: Robustness to base rates and other factors. Psychological Methods, 13(1), 19.
It is a very intuitive measure and also really easy to code in R yourself: you subtract one estimate from the other, determine whether the difference is positive or negative, and then compute the proportion of posterior draws in which the one vector is larger. Maybe this could be of use to you.
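A short R sketch of that computation (assuming `post_a` and `post_b` are equal-length vectors of paired posterior draws for the two quantities; ties are counted as half, following Ruscio, 2008):

```r
# Probability of superiority: proportion of paired posterior draws
# in which A exceeds B, with ties contributing 0.5.
prob_superiority <- function(a, b) mean((a > b) + 0.5 * (a == b))

post_a <- rnorm(4000, 1.0, 0.5)  # stand-ins for paired posterior draws
post_b <- rnorm(4000, 0.8, 0.5)
prob_superiority(post_a, post_b)

# Equivalently: take the difference draw by draw and count positives
mean((post_a - post_b) > 0)
```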
Hopefully these give you something to get started with.
Thank you very much for your detailed answer and the links to the resources! I read the paper “Measuring Distribution Similarities Between Samples: A Distribution-Free Overlapping Index”. As you mention in your message, one of the tricky points is the lack of an “A more or less than B” direction, since the score only gives the degree of overlap. A simple way to overcome this issue is to check whether the mean/median/MAP of the posterior distribution of A is higher than the respective value for the posterior distribution of B. What do you think?
Regarding the second approach, after checking the paper, I am not sure I fully understand it. For instance, if we have two vectors A and B, do you mean to randomly pick, N times, one value from vector A and another from vector B, and assess whether the value from A is higher than the value from B? After N repetitions, you would ultimately get the percentage of cases in which the value from A is higher.
I think in most cases you could look at the median or something like that to see a tendency for one distribution to be higher than the other. However, I can imagine some edge cases where weirdly shaped distributions might make this misleading. In general, though, I think you could probably do it.
For the probability of superiority, imagine for example that you have two variables and you want to know which seems to have a bigger effect than the other. To keep it simple, let’s say you have a regression model with two binary predictors (e.g., yes vs. no on some variable) and you want to know whether the impact of one group effect is greater than the other. You could draw posterior samples so that you get a dataframe with all the posterior estimates of these parameters from your model.

In each row of that dataframe, the estimates are paired because the parameters are estimated jointly, so you should not randomly shuffle them (i.e., you are not sampling from A and B independently). You then just make an extra column that codes 1 if the estimate of effect A is greater than the estimate of effect B, 0 if B is greater than A, and .5 if they are equal. Averaging that new column gives the Probability of Superiority of the A estimates over the B estimates, ranging from 0 (all A estimates are lower), through .5 (A and B are equally likely to be greater/smaller than one another), up to 1 (all estimates of A are greater than B). Does this make any sense?
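A sketch of those steps with rstanarm (the data frame `dat` and the outcome/predictor names `y`, `x1`, `x2` are hypothetical; `as.data.frame()` on a fitted rstanarm model returns one row per joint posterior draw):

```r
library(rstanarm)

# Hypothetical model with two binary predictors x1 and x2
fit <- stan_glm(y ~ x1 + x2, data = dat, refresh = 0)

# One row per posterior draw; the coefficients in a row are estimated
# jointly, so rows are paired and must not be shuffled.
draws <- as.data.frame(fit)

# Code 1 if the x1 draw exceeds the x2 draw, 0 if smaller, 0.5 if tied,
# then average to get the Probability of Superiority of x1 over x2.
draws$a_gt_b <- (draws$x1 > draws$x2) + 0.5 * (draws$x1 == draws$x2)
mean(draws$a_gt_b)
```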