Hello,
I’ve been working on a simple meta-analysis model for aggregating subjective probability estimates from a survey. The model seems consistent with the Stan documentation and the treatment in BDA3 Section 5.6, but I haven’t seen any examples of this structure applied to this domain, or to data imagined to be a beta-distributed random variable. A sanity check would be greatly appreciated!
Data
Each respondent gave three estimates for the probability of a certain event:
- 5th percentile,
- Best guess (median),
- 95th percentile.
The goal is to aggregate these responses in a principled way that captures each respondent’s uncertainty.
Model Overview
I think it’s just a basic meta-analysis model: I treat each respondent’s “best guess” as a beta_proportion() distributed variable. I then use Python to pre-compute a beta_proportion() kappa parameter for each person based on their 5th and 95th percentiles to reflect their subjective uncertainty, and pass those kappa parameters to the model as data. The parameters of inferential interest are the mu and kappa of the ‘aggregation distribution’.
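To make the pre-computation step concrete, here is a minimal sketch of one way it could be done. It assumes scipy, and the function name fit_beta_to_interval and the example percentiles are hypothetical; it is not necessarily the exact optimization I ran. It searches for the (mu, kappa) pair whose 5th and 95th percentiles match the respondent's, using the beta_proportion mapping alpha = mu * kappa, beta = (1 - mu) * kappa.

```python
import numpy as np
from scipy.optimize import least_squares
from scipy.special import expit, logit
from scipy.stats import beta


def fit_beta_to_interval(q05, q95):
    """Find (mu, kappa) whose 5th/95th percentiles match q05/q95."""

    def residuals(theta):
        # Optimize on unconstrained scales: logit(mu) and log(kappa).
        mu, kappa = expit(theta[0]), np.exp(theta[1])
        a, b = mu * kappa, (1.0 - mu) * kappa  # beta_proportion -> shape parameters
        return beta.ppf([0.05, 0.95], a, b) - np.array([q05, q95])

    # Starting point is an assumption: interval midpoint and a moderate kappa.
    start = [logit(0.5 * (q05 + q95)), np.log(10.0)]
    fit = least_squares(residuals, start)
    return float(expit(fit.x[0])), float(np.exp(fit.x[1]))


# Hypothetical respondent: 5th percentile 0.10, 95th percentile 0.40.
mu_fit, kappa_fit = fit_beta_to_interval(0.10, 0.40)
lo, hi = beta.ppf([0.05, 0.95], mu_fit * kappa_fit, (1.0 - mu_fit) * kappa_fit)
print(f"mu={mu_fit:.3f}, kappa={kappa_fit:.1f}, interval=({lo:.3f}, {hi:.3f})")
```

Only kappa_fit would be passed to Stan as kappa_respondent; mu_fit is discarded, which is what the first question below is about.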
Current Stan Model
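data {
  int<lower=1> N;                                   // number of respondents
  array[N] real<lower=0, upper=1> median_respondent; // each respondent's best guess
  array[N] real<lower=0> kappa_respondent;          // pre-computed from 5th/95th percentiles
}
parameters {
  array[N] real<lower=0, upper=1> mu_respondent;
  real<lower=0, upper=1> mu_aggregation;            // implicit uniform(0, 1) prior
  real<lower=0> kappa_aggregation;
}
model {
  kappa_aggregation ~ gamma(2, 0.1);
  // Observation level: best guess around a latent respondent-level mean
  median_respondent ~ beta_proportion(mu_respondent, kappa_respondent);
  // Population level: respondent-level means around the aggregate
  mu_respondent ~ beta_proportion(mu_aggregation, kappa_aggregation);
}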
Questions
I’m unsure whether this is conceptually appropriate for the task at hand: to get a picture of respondents’ best guesses in a way that accurately takes into account their subjective uncertainty. Specifically, I’m wondering:
- While the kappa parameters I pass to the model do come from a distribution found by optimization to capture each respondent’s 5th and 95th percentiles, I’m unsure whether that still holds once kappa is separated from the mu parameter of that optimized distribution. Is it a face-validity issue if the distribution with mu = best_guess and kappa = kappa_respondent does not have the same median and 90% interval as what the respondent gave? Does this problem exist in meta-analysis with the classic Normal model as well?
- Might it be better to model the 5th and 95th percentiles directly, rather than using them to compute the scale parameter? I’ve never seen a model of this kind, but would be curious if there’s anything out there.
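The mismatch in the first question can be checked numerically. The sketch below assumes scipy, and the helper name implied_summary and the values 0.20 and 20.0 are made up for illustration. It computes the 5th percentile, median, and 95th percentile of the beta distribution with mu = best_guess and kappa = kappa_respondent, so they can be compared with what the respondent reported. One thing it makes visible: mu in beta_proportion() is the mean, so for a skewed distribution the implied median will not equal the best guess.

```python
from scipy.stats import beta


def implied_summary(mu, kappa):
    """5th percentile, median, and 95th percentile of beta_proportion(mu, kappa)."""
    dist = beta(mu * kappa, (1.0 - mu) * kappa)
    q05, median, q95 = dist.ppf([0.05, 0.5, 0.95])
    return float(q05), float(median), float(q95)


# Hypothetical respondent: best guess 0.20, pre-computed kappa 20.
best_guess, kappa_respondent = 0.20, 20.0
q05, median, q95 = implied_summary(best_guess, kappa_respondent)
print(f"implied 5%={q05:.3f}, median={median:.3f}, 95%={q95:.3f}")
print(f"median minus best guess: {median - best_guess:+.3f}")
```

Running this over all respondents and comparing against the reported percentiles would show how large the discrepancy is in practice.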
Thank you in advance for any advice!