Background
I recently encountered a paper analyzing longitudinal survey data where the questions are all of the form “what do you think is the probability that event X will happen in the next 10 years?”. So the responses are all probabilities between 0 and 1, which lends itself nicely to beta regression.
But there’s a twist: the authors present a summary statistic of the timepoint-specific responses that accounts for ‘respondent under-confidence’ by converting each response to its associated odds, raising the odds to a shared fixed power 0 < α < ∞ (equivalently, scaling each log odds by α), and using MLE to infer the value of α. Values of α greater than 1 make responses more extreme, which is what we would want if we thought people were systematically under-confident in their responses. As the authors put it: “in this paper we…correct the systematic bias at a collective level by shifting each probability forecast closer to its nearest boundary point. If the probability forecast is less (more) than 0.5, it is moved away from its original point and closer to 0.0 (1.0).”
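If I’m reading it right, this is the familiar “extremizing” transform: a response $p$ becomes

$$p' = \frac{p^{\alpha}}{p^{\alpha} + (1 - p)^{\alpha}}, \qquad \text{equivalently} \qquad \operatorname{logit}(p') = \alpha \, \operatorname{logit}(p),$$

so α = 1 leaves responses unchanged and α > 1 pushes them away from 0.5 toward their nearest boundary.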
This struck me as a fun idea – usually we regularize to make values less extreme!
Question
My question is: how might one implement this kind of (de?)regularization towards extreme responses in a Bayesian beta regression context?
Suppose we have the vanilla beta regression given below (borrowed from Michael Clark’s outstanding examples). How might one modify this Stan code so that the posterior predictions are pulled toward the extremes of 0 and 1? Can it be done with priors, or by altering the model structure in some way?
data {
  int<lower=1> N;                  // sample size
  int<lower=1> K;                  // K predictors
  vector<lower=0,upper=1>[N] y;    // response
  matrix[N,K] X;                   // predictor matrix
}
parameters {
  vector[K] theta;                 // reg coefficients
  real<lower=0> phi;               // dispersion parameter
}
transformed parameters {
  vector[K] beta = theta * 5;      // same as beta ~ normal(0, 5); fairly diffuse
}
model {
  // model calculations
  vector[N] LP = X * beta;         // linear predictor
  vector[N] mu = inv_logit(LP);    // transformed linear predictor (inv_logit is vectorized, so no loop needed)
  vector[N] A = mu * phi;          // parameter for beta distn
  vector[N] B = (1.0 - mu) * phi;  // parameter for beta distn

  // priors
  theta ~ normal(0, 1);
  phi ~ cauchy(0, 5);              // different options for phi
  //phi ~ inv_gamma(.001, .001);
  //phi ~ uniform(0, 500);         // put an upper bound on phi if using this

  // likelihood
  y ~ beta(A, B);
}
generated quantities {
  vector[N] y_rep;
  for (i in 1:N) {
    real mu = inv_logit(X[i] * beta);
    real A = mu * phi;
    real B = (1.0 - mu) * phi;
    y_rep[i] = beta_rng(A, B);
  }
}
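To make the question concrete, here is one naive modification I can imagine, as a sketch rather than a claim about the paper’s method: fix α, pass it in as data, and apply the extremizing transform to each posterior predictive draw in generated quantities. The alpha input and y_rep_ex quantity are my own hypothetical additions.

data {
  int<lower=1> N;                  // sample size
  int<lower=1> K;                  // K predictors
  vector<lower=0,upper=1>[N] y;    // response
  matrix[N,K] X;                   // predictor matrix
  real<lower=0> alpha;             // hypothetical fixed extremizing power; alpha > 1 extremizes
}
parameters {
  vector[K] theta;                 // reg coefficients
  real<lower=0> phi;               // dispersion parameter
}
transformed parameters {
  vector[K] beta = theta * 5;      // same as beta ~ normal(0, 5); fairly diffuse
}
model {
  vector[N] mu = inv_logit(X * beta);
  theta ~ normal(0, 1);
  phi ~ cauchy(0, 5);
  y ~ beta(mu * phi, (1.0 - mu) * phi);
}
generated quantities {
  vector[N] y_rep;                 // ordinary posterior predictive draws
  vector[N] y_rep_ex;              // extremized posterior predictive draws
  for (i in 1:N) {
    real mu = inv_logit(X[i] * beta);
    y_rep[i] = beta_rng(mu * phi, (1.0 - mu) * phi);
    // logit-scaling form of p' = p^alpha / (p^alpha + (1 - p)^alpha)
    y_rep_ex[i] = inv_logit(alpha * logit(y_rep[i]));
  }
}

The reason I extremize only in generated quantities is that folding α into the likelihood as mu = inv_logit(alpha * X * beta) looks non-identified: α just rescales the coefficients. So what I’m really asking is whether there is a principled way to learn α (via priors, auxiliary outcome data, or a different model structure) rather than fixing it by hand.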