8schools but they are correlated

A coauthor and I are thinking about an 8schools like case in which there are multiple experiments and you only have the mean and the standard deviation of the effect. The question is how to adjust for correlation between the schools whereby you have prior information on the correlation parameters but are not able to fold that information into a multilevel structure. Would the implementation below be appropriate?

data {
  int<lower=0> J;         // number of schools
  vector[J] y;              // estimated treatment effects
  vector<lower=0>[J] sigma; // standard error of effect estimates
}
parameters {
  real mu;                // population treatment effect
  real<lower=0> tau;      // standard deviation in treatment effects
  vector[J] eta;          // unscaled deviation from mu by school
  cholesky_factor_corr[J] L_Omega;
}
transformed parameters {
  matrix[J, J] L_Sigma;
  vector[J] theta = mu + tau * eta;        // school treatment effects
  L_Sigma = diag_pre_multiply(sigma, L_Omega);
}
model {
  L_Omega ~ lkj_corr_cholesky(4);
  target += normal_lpdf(eta | 0, 1);       // prior log-density
  target += multi_normal_cholesky_lpdf(y | theta, L_Sigma); // log-likelihood
}

You want to integrate eta out of that posterior distribution analytically and perhaps draw from its full-conditional distribution in the generated quantities block. One of the things that I don’t like about the 8 schools example is that we get caught up on talking about centered vs. non-centered parameterizations and in a lot of cases with meta-analysis or something similar, neither parameterization works that well in Stan. The right parameterization is to not have eta.

data {
  int<lower=0> J;         // number of schools
  vector[J] y;              // estimated treatment effects
  vector<lower=0>[J] sigma; // standard error of effect estimates  
}
transformed data {
  vector[J] sigma2 = square(sigma);
}
parameters {
  real mu;                // population treatment effect
  real<lower=0> tau;      // standard deviation in treatment effects
  cholesky_factor_corr[J] L_Omega;
}
model {
  matrix[J,J]  Omega = multiply_lower_tri_self_transpose(L_Omega);
  real tau2 = square(tau);
  for (j in 1:J) Omega[j,j] = 1 + tau2 + sigma2[j];
  target += multi_normal_lpdf(y | rep_vector(mu, J), Omega);
  target += lkj_corr_cholesky(L_Omega | 4); // 4 is usually too big
  // prior on mu and tau
}
1 Like

This model indeed does not have eta, but I do not believe it is the correct model :P.

Edit: Courtesy your edits my joke is obsolete

Still funny though

Thanks! I will try that. The help I received so far from other sources was not as helpful…

4 Likes

Yeah, I don’t think BDA covers marginalizing eta out.

Raccoon cat. Send that photo to the Gelblogg.

So I tried the model above with the DGP below but it appears that the model pools the estimates all the way to the mean. Is there an error or am I generally approaching this in the wrong way?

# libraries

library(rstan); library(MASS)

# sampling
chains <- 4
iter <- 2000

# general
T <- 2
N <- rpois(T, 1000)

## effects live here (you just can't see them)

# hyperparameter
center_mu <- 1.5; deviation_mu <- 0.5
center_sigma <- 0; deviation_sigma <- 2
mu_mu <- rnorm(1, center_mu, deviation_mu)
mu_sigma <- abs(rnorm(1, center_sigma, deviation_sigma))

# effect in test
mu <- rnorm(T, mu_mu, mu_sigma)

# covariance matrix
p <- qr.Q(qr(matrix(rnorm(T^2, 0.5), T)))
Sigma <- crossprod(p, p*(T:1))

# observed effects
effect.dist.mean <- rep(NA, T)
effect.dist.sd <- rep(NA, T)
beta <- MASS::mvrnorm(max(N), mu, Sigma)  ## draw effect distribution

for (i in 1:T){
  x <- rbinom(N[i], 1, 0.5)
  alpha_mu <- rnorm(2, 2, 0.5)
  sigma_y <- abs(rnorm(1, 0, 1))
  alpha <- rnorm(N[i], alpha_mu, 0.2)
  y <- rnorm(N[i], alpha + beta[1:N[i], i] * x, sigma_y)
  summary <- summary(lm(y ~ 1 + x))
  effect.dist.mean[i] <- summary$coefficients[2, 1]
  effect.dist.sd[i] <- summary$coefficients[2, 2]
}


data <- list(
  J = T,
  y = effect.dist.mean,
  sigma = effect.dist.sd
)


setwd("~/Dropbox/QuantMethods/S I M O N E")
m.mvn_wpriors <- rstan::stan_model("codeStan_wMVN_v4.stan")
fit <- rstan::sampling(m.mvn_wpriors, data = data)
fit_extract <- rstan::extract(fit)

print("effect sizes")
colMeans(fit_extract$theta_rep)
mu
print("hyper mu")
mean(fit_extract$mu)
mu_mu

I feel a bit stuck (…) since I also tried inputting the covariance matrix as known but even then it pooled all the way to the mean. Even adjusting tau didn’t change that result.

1 Like