# Partial pooling in generated quantities?

Hi,

I want to setup partial pooling in the generated quantities blog for two normal models. I can’t do it in the model itself or rather don’t want to because it would make a small computational convenience to speed up the model by a lot impossible. My question is whether this approach actually makes sense and/or whether there are better alternatives. The model below works in terms of giving the right result but there might be a better way.


a <- rnorm(1000, 2, 10)
b <- rnorm(1000, 4, 20)
c <- (1/var(a) * a+ 1/var(b) * b)/(1/var(a) + 1/var(b))
mean(c)
sd(c)
(1/10^2 * 2 + 1/20^2 * 4)/(1/10^2 + 1/20^2)
sqrt(1/(1/10^2 + 1/20^2))
pp_m <- cmdstan_model("code/R/computation/partial_pooling/partial_pooling_test.stan")
data <- list(
N = 100000,
y1 = rnorm(100000, 2, 10),
y2 = rnorm(100000, 4, 20)
)
fit <- pp_m$sample( data = data, seed = 123, chains = 4, parallel_chains = 4, refresh = 500 ) fit$summary("y_pooled")

data {
int N;
vector[N] y1;
vector[N] y2;
}
parameters {
real mu1;
real mu2;
real<lower=0> sigma1;
real<lower=0> sigma2;
}
model {
sigma1 ~ normal(0, 20);
sigma2 ~ normal(0, 20);
mu1 ~ normal(0, 10);
mu2 ~ normal(0, 10);
y1 ~ normal(mu1, sigma1);
y2 ~ normal(mu2, sigma2);
}
generated quantities {
real y_pooled;
y_pooled = normal_rng((1/square(sigma1) * mu1 + 1/square(sigma2) * mu2)/(1/square(sigma1) + 1/square(sigma2)),
sqrt(1/(1/square(sigma1) + 1/square(sigma2))));
}


What is y_pooled as a quantity intended to be? The mean if you made the model hierarchical such that mu1 and mu2 as samples from a normal?

Yes, that’s one interpretation. More specifically, mu1 and mu2 are two forecasts that I want to produce a weighted average of.

I’m not very math-saavy, so I’ll assume your formulae accurately reflect the analytic solution to the expectation for the mean and variance of the weighted mean. In that case, I’d simply mention that I think this approach implies a prior that’s uniform from -\infty to +\infty, which may not be ideal?

Presumably the model you’ve posted is a minimal toy and your concern about performance comes from knowing you’ll be doing a much more complicated model?

Well, in principle you could say the ex ante prediction is like normal(0.5, 0.15) for the vote share or something like this. So yeah, there is a uniform prior assumption. For the full model I essentially have two random walks that I only compute in the model block until the last day of polling/there is data and do any future prediction in the generated quantities block. And so if I were to partially pool in the model block for the final outcome I couldn’t do that in the generated quantities block and would have to compute all that stuff which I don’t want. The gain with the current approach is a speed increase by a factor 2-3+ with a model that originally takes 35 minutes or so.

1 Like

Current model code here if you are curious (and for anyone else who has pent-up anger over confusing model formulations)

data {
int N_state;
int N_national;
int S;
int T;
int T_plus;
int t_pos[T - 1];
int s[N_state];
int t_state[N_state];
int t_national[N_national];
cov_matrix[S] cov_mat_mu_b;
int y_state[N_state];
int n_state[N_state];
int y_national[N_national];
int n_national[N_national];
// prior data
vector[S] prev_ele;
int prev_ele_days;
// weights
row_vector[S] weights;
// past election data
int T_past;
vector[T_past * S] y_past;
matrix[T_past, S] x_past;
int s_past[T_past * S];
int t_past[T_past * S];
int t_between_ele[T_past - 1];
// Fundamentals prediction
int t_to_election;
vector[S] x_last;
}
transformed data {
cholesky_factor_cov[S] cholesky_cov_mu_b;
real sqrt_t_pos[T - 1];
vector[S] prev_ele_logit;
real sqrt_prev_ele_days;
vector[T_past - 1] sqrt_past_t_bet_ele;
matrix[T_past - 1, S] diff_x_past;
real sqrt_t_to_election;
cholesky_cov_mu_b = cholesky_decompose(cov_mat_mu_b);
sqrt_t_pos = sqrt(t_pos);
prev_ele_logit = logit(prev_ele);
sqrt_prev_ele_days = sqrt(prev_ele_days);
sqrt_past_t_bet_ele = to_vector(sqrt(t_between_ele));
diff_x_past = x_past[2:T_past] - x_past[1:(T_past - 1)];
sqrt_t_to_election = sqrt(t_to_election);
}
parameters {
// Poll model
matrix[S, T] raw_mu_b;
// Fundamentals
//matrix[2, S] raw_beta;
matrix[S, 2] raw_beta;
//cholesky_factor_corr L_omega;
vector<lower = 0> tau;
real<lower = 0> sigma_past;
vector<lower = 0>[S] sigma_x; // Variation of economic factors
// Pooled
vector[S] pred_p;
vector[S] pred_f;
}
transformed parameters {
// Poll model
matrix[S, T] mu_b;
vector[N_state] theta_state;
vector[N_national] theta_national;
vector[T] mu_weighted;
matrix[S, 2] beta;
mu_b[, 1] = sqrt_prev_ele_days * cholesky_cov_mu_b * raw_mu_b[, 1] + prev_ele_logit;
for (i in 2:T) mu_b[:, i] = sqrt_t_pos[i - 1] * cholesky_cov_mu_b * raw_mu_b[:, i] + mu_b[:, i - 1];
// National weighted
mu_weighted = logit(weights * inv_logit(mu_b))';
// Assign
for (i in 1:N_state){
theta_state[i] = mu_b[s[i], t_state[i]];
}
theta_national = mu_weighted[t_national];
// Fundamentals
//beta = (diag_pre_multiply(tau, L_omega) * raw_beta)';
for (i in 1:2) beta[,i] = raw_beta[,i] * tau[i];
}
model {
// Priors
// Poll model
to_vector(raw_mu_b) ~ std_normal();
// Fundamentals
sigma_past ~ normal(0, 0.1);
to_vector(raw_beta) ~ std_normal();
tau ~ normal(0, 1);
//L_omega ~ lkj_corr_cholesky(1);
sigma_x ~ normal(0, 0.2);
// Likelihood
// Poll model
y_state ~ binomial_logit(n_state, theta_state);
y_national ~ binomial_logit(n_national, theta_national);
// Fundamentals
for (i in 1:S){
diff_x_past[,i] ~ normal(0, sigma_x[i] * sqrt_past_t_bet_ele);
}
for (i in 1:(T_past * S)){
y_past[i] ~ normal(beta[s_past[i], 1] +
beta[s_past[i], 2] * x_past[t_past[i], s_past[i]],
sigma_past);
}
// Pooled
pred_f ~ normal(beta[,1] + beta[,2] .* x_last, sqrt(square(sigma_x * sqrt_t_to_election) + square(sigma_past)));
pred_p ~ multi_normal_cholesky(mu_b[:,T], cholesky_cov_mu_b * sqrt(T_plus));
}
generated quantities {
matrix[T + T_plus, S] predicted_score;
vector[T + T_plus] predicted_score_national;
int y_rep_national[N_national];
int y_rep_state[N_state];
for (i in 1:S){
predicted_score[1:T,i] = inv_logit(to_vector(mu_b[i]));
}
for (i in 1:T_plus){
predicted_score[T + i,] = inv_logit(multi_normal_cholesky_rng(mu_b[,T], cholesky_cov_mu_b * sqrt(i)))';
}
predicted_score_national = predicted_score * weights';
y_rep_state = binomial_rng(n_state, inv_logit(theta_state));
y_rep_national = binomial_rng(n_national, inv_logit(theta_national));
}