Hi STAN community! I am a total beginner to STAN trying to specify a model to generate data from. The idea is to generate data for a meta-analysis with specific effect and heterogeneity parameters that I can then reproduce using the rma function from the metafor package in R. Below is my model.
data{
int N; // number of studies
}
parameters{
real mu;
real <lower=.0001>tau;
}
model{
mu ~ normal(.5, .1); // mean of true effects(d = .5)
tau ~ gamma(.5, 1); // heterogeneity parameter(tau = .5)
}
generated quantities{
vector[N] theta;
vector[N] sigma;
vector[N] y;
for(i in 1:N){
theta[i] = normal_rng(mu, tau); // drawn true population effect
sigma[i] = normal_rng(1, .1); // population deviation
y[i] = normal_rng(theta[i], sigma[i]); // observed effects
}
}
My thinking is that this represent something like this:
\mu \sim N(0.5, 0.1)
\tau \sim \gamma(0.5, .1)
\theta_i \sim N(\mu, \tau)
\sigma_i \sim N(1, 1.1)
y_i \sim N(\theta_i, \sigma_i)
The issue is that when I simulate data from this model in R and try to estimate tau and mu the estimations are horribly off. For example, if I sample from this model for 5 studies with n = 300 like so:
n <- 300
iter <- 1000
set.seed(58)
sim_outcome <- sampling(model,
iter = iter,
warmup = iter - n,
chains = 1,
refresh = 0,
data = list(N = 5))
meta_data <- data.frame(extract(sim_outcome))
yi <- (unlist(lapply(meta_data[12:16],
FUN = mean)))
vi <- (unlist(lapply(meta_data[12:16],
FUN = var)))
rma(yi = yi, vi = vi, method = 'DL')
The rma results return a tau estimate of 0, which is confusing me. I have tried different variance estimators (e.g REML, ML) and also checked the model on big samples with many studies but it still does not work as I want (no heterogeneity). I can do this in base r without much issue, but I would really prefer to do it in STAN. Any guidance/help would be greatly appreciated!