Hi,
A question about interpretation - predictive distribution versus missing data. Assume a series of exchangeable experiments (w/ data y_1 and n_1 below) inform a parameter (the logit-probability mu_1). In another series of exchangeable experiments, another (different) parameter (the logit-probability mu_2) can be estimated from data ( y_2 and n_2).
A third(!) series of exchangeable experiments were performed where only data on a function of those two initial parameters are quantified but quantities directly informing the initial parameters themselves weren’t. I use data from 3rd series of experiments to improve estimation of the two first quantities, given finite number of experiments (in an evidence synthesis context).
My question: the synthesis of these different datasets requires predicting the unobserved quantities (ytilde_1 and ytilde_2) for the 3rd set of experiments where these quantities are not observed (but a function of their values is). The sampling of ytilde_1 and ytilde_2 in the model block here – can it be interpreted as posterior predictive simulations (even if not in generated quantities block and not sampling using _rng) or would the correct interpretation be as missing data? (noting the information flow from ytilde_1 and ytilde_2 to other parameters - which is the goal and clear from running the model). It would be great to know your thoughts (sorry for long - possibly unclear - description)
data{
int N_1; // number of experiments - informing parameter 1
int N_2; // number of experiments - informing parameter 2
int N_3; // number of experiments - informing combination of parameters 1 and 2
int y_1[N_1];
int y_2[N_2];
int n_1[N_1];
int n_2[N_2];
int n_3[N_3];
int y_3[N_3];
}
parameters{
real mu_1;
real mu_2;
real<lower=0> tau_1;
real<lower=0> tau_2;
vector[N_1] mu_study_1;
vector[N_2] mu_study_2;
vector[N_3] ytilde_1;
vector[N_3] ytilde_2;
}
transformed parameters{
vector[N_3] p_3 = inv_logit(ytilde_1).*inv_logit(ytilde_2); // Function of quantities derived from two initial parameters
}
model{
// Priors
mu_1 ~ normal(0, 5);
mu_2 ~ normal(0, 5);
tau_1 ~ normal(0, 1);
tau_2 ~ normal(0, 1);
//Hierarchical Priors
mu_study_1 ~ normal(mu_1, tau_1);
mu_study_2 ~ normal(mu_2, tau_2);
// Likelihood
y_1 ~ binomial_logit(n_1, mu_study_1);
y_2 ~ binomial_logit(n_2, mu_study_2);
// Predicting unobservable values of first two parameters in 3rd set of experiments
ytilde_1 ~ normal(mu_1, tau_1);
ytilde_2 ~ normal(mu_2, tau_2);
// Likelihood (3rd set)
y_3 ~ binomial(n_3, p_3);
}