Hello,
So my problem is this:
I have 40 balls of different weights (we can assume each weight is unique). On any given draw, the probability of drawing a ball is proportional to its weight among the balls still remaining. I sequentially draw 30 balls without replacement. I want to know the probability of each ball being drawn at each position (e.g. ball A has a 50% chance of being drawn first, a 30% chance of being drawn second, a 20% chance of being drawn third, and a 0% chance of being drawn anywhere else). I do not know the weights of the balls, but I do have a best-guess predicted draw order (ball A, first; ball B, third; ball C, second; etc.).
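To make the setup concrete, here is a minimal Python sketch of the draw process I have in mind. It assumes the weights are known, which they are not in my real problem; the four weights below are made up, and the function name `position_probabilities` is just something I chose for illustration. It estimates the ball-by-position probabilities by simulation, not by fitting anything.

```python
import random

def position_probabilities(weights, n_draws, n_sims=20000, seed=0):
    """Monte Carlo estimate of P(ball i is drawn at position k).

    Each draw picks one of the remaining balls with probability
    proportional to its weight, then removes it (no replacement).
    """
    rng = random.Random(seed)
    n = len(weights)
    counts = [[0] * n_draws for _ in range(n)]
    for _ in range(n_sims):
        remaining = list(range(n))
        for pos in range(n_draws):
            w = [weights[i] for i in remaining]
            # Index into `remaining`, chosen proportionally to weight.
            pick = rng.choices(range(len(remaining)), weights=w, k=1)[0]
            ball = remaining.pop(pick)
            counts[ball][pos] += 1
    return [[c / n_sims for c in row] for row in counts]

# Hypothetical weights for a 4-ball, 3-draw toy case (not my real data).
weights = [0.5, 0.3, 0.15, 0.05]
probs = position_probabilities(weights, n_draws=3)
for ball, row in enumerate(probs):
    print(ball, [round(p, 3) for p in row])
```

Each row is one ball and each column is one draw position, so every column sums to 1. What I am missing is the inverse step: going from a predicted order to weights (or directly to this table) inside a model.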
As far as I am aware, the categorical family will not work because I am sampling without replacement. Wallenius' noncentral hypergeometric distribution would probably be the most accurate, but I don’t believe it is implemented in Stan or brms, although some work has been done along those lines. Is there a way for me to solve this problem and calculate these probabilities, either by transforming the data in some way or by a novel application of one of the existing distribution families?
Here’s a toy version of the dataset: Trostle_toy_data_set - Sheet1.csv (280 Bytes). What I’m really after is the probabilities predicted by estimated_pick, and I’d be happy with any solution so long as it’s valid and extendable.