I have a multivariate model in which two responses (`A`, `B`) are jointly modeled using a set of predictors (`x1`, `x2`). With `brms`, the model formula is:

```
bf(A ~ x1 + x2) + bf(B ~ x1 + x2) + gaussian() + set_rescor(FALSE)
```
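
For concreteness, computing the posterior expectations and the derived quantity might look roughly like this (a sketch; `fit_ab` and `newdat` are placeholder names for the fitted multivariate model and the separate data set):

```
library(brms)

# posterior_epred() on a multivariate model returns a
# draws x observations x responses array (response names assumed here)
ep <- posterior_epred(fit_ab, newdata = newdat)

# derived quantity X = A * B, computed draw-wise so the posterior
# uncertainty in A and B carries through to X
X_draws <- ep[, , "A"] * ep[, , "B"]  # draws x observations

# posterior summaries of X used in the second model
Xmean <- apply(X_draws, 2, mean)
Xsd   <- apply(X_draws, 2, sd)
```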

Using the fitted model, posterior expectations of `A` and `B` are computed for a separate data set and are then used to calculate a derived quantity `X = A*B`. This `X` is then used to predict `Y` in another model, but I want to account for the posterior distribution of `X` in doing so. In the absence of a fully joint model (which is a little tricky here because the data sets for `A`/`B` and `Y` are separate), I was thinking I might be able to propagate the posterior uncertainty of `X` into the second model by treating `X` as measured with 'measurement error'. Here's the `brms` model formula that I have right now to do this:

```
bf(Y ~ mi(Xmean)) + bf(Xmean | mi(Xsd) ~ 1) + gaussian() + set_rescor(FALSE)
```

where `Xmean` and `Xsd` are the posterior mean and standard deviation of `X`, respectively. While this model can be readily fit, I have a couple of (mostly conceptual) questions/concerns about the procedure that I was hoping to get some input on.

(1) Digging into the Stan code for the `brms` measurement error model, I see that it uses roughly the following specification for imputing the latent 'true' `X`:

$$Y_i \sim \text{Normal}(\alpha + \beta X_i, \sigma_Y) \quad [1]$$

$$X_i \sim \text{Normal}(\mu_X, \sigma_X) \quad [2]$$

$$X_{\text{mean},i} \sim \text{Normal}(X_i, \sigma_{X_{\text{mean}},i}) \quad [3]$$

Although I take it that this is a common specification for Bayesian imputation, my intuition expects something like $X_i \sim \text{Normal}(X_{\text{mean},i}, \sigma_{X_{\text{mean}},i})$ in place of [3], since I'm trying to parametrically represent the posterior distribution of each $X_i$ here. Does what I'm thinking make sense, and if so, is there a way to specify the model as such with `brms`? I'm also not sure what [2] is meant to be doing here. I don't think I'm interested in the quantities $\mu_X$ and $\sigma_X$; if anything, I would think these quantities should vary with each case in the data, as in the alternative expression I gave earlier.

(2) Another consequence of (1) seems to be that the posterior distribution of each imputed $X_i$ deviates from the actual posterior distribution of $X_i$ that I'm trying to build into the second model. Again, I realize this is probably expected behavior for Bayesian imputation, but it doesn't seem to achieve exactly what I want.

(3) I think that introducing posterior uncertainty as measurement error in this way will break any posterior correlations that might exist among the elements of $X$, since each $X_i$ is imputed independently given $\mu_X$ and $\sigma_X$. Is this thinking correct? While I don't expect this to be a major issue in this case, it might be something worth keeping in mind.

With all that said, I wanted to know what others think of this approach. The alternative approach I can think of is multiple imputation: fitting the second model to multiple realizations of `X` using `brm_multiple()`. But this has the downside of taking a very long time to iterate over even a small portion (e.g. 10% = 400 samples) of the posterior for `X`. It looks like there has been a similar discussion here, but I still had these questions/concerns after reading through that thread. Any thoughts or input would be much appreciated!

Here’s the Stan code for the `brms` measurement error model, in case it’s helpful to have a look at:

```
// generated with brms 2.20.4
functions {
}
data {
  int<lower=1> N;  // total number of observations
  int<lower=1> N_Y;  // number of observations
  vector[N_Y] Y_Y;  // response variable
  int<lower=1> Ksp_Y;  // number of special effects terms
  int<lower=1> N_Xmean;  // number of observations
  vector[N_Xmean] Y_Xmean;  // response variable
  // data for measurement-error in the response
  vector<lower=0>[N_Xmean] noise_Xmean;
  // information about non-missings
  int<lower=0> Nme_Xmean;
  array[Nme_Xmean] int<lower=1> Jme_Xmean;
  int prior_only;  // should the likelihood be ignored?
}
transformed data {
}
parameters {
  real Intercept_Y;  // temporary intercept for centered predictors
  vector[Ksp_Y] bsp_Y;  // special effects coefficients
  real<lower=0> sigma_Y;  // dispersion parameter
  vector[N_Xmean] Yl_Xmean;  // latent variable
  real Intercept_Xmean;  // temporary intercept for centered predictors
  real<lower=0> sigma_Xmean;  // dispersion parameter
}
transformed parameters {
  real lprior = 0;  // prior contributions to the log posterior
  lprior += student_t_lpdf(Intercept_Y | 3, 0.1, 2.5);
  lprior += student_t_lpdf(sigma_Y | 3, 0, 2.5)
            - 1 * student_t_lccdf(0 | 3, 0, 2.5);
  lprior += student_t_lpdf(Intercept_Xmean | 3, 0, 2.5);
  lprior += student_t_lpdf(sigma_Xmean | 3, 0, 2.5)
            - 1 * student_t_lccdf(0 | 3, 0, 2.5);
}
model {
  // likelihood including constants
  if (!prior_only) {
    // initialize linear predictor term
    vector[N_Y] mu_Y = rep_vector(0.0, N_Y);
    // initialize linear predictor term
    vector[N_Xmean] mu_Xmean = rep_vector(0.0, N_Xmean);
    mu_Y += Intercept_Y;
    mu_Xmean += Intercept_Xmean;
    for (n in 1:N_Y) {
      // add more terms to the linear predictor
      mu_Y[n] += (bsp_Y[1]) * Yl_Xmean[n];
    }
    target += normal_lpdf(Y_Y | mu_Y, sigma_Y);
    target += normal_lpdf(Yl_Xmean | mu_Xmean, sigma_Xmean);
  }
  // priors including constants
  target += lprior;
  target += normal_lpdf(Y_Xmean[Jme_Xmean] | Yl_Xmean[Jme_Xmean], noise_Xmean[Jme_Xmean]);
}
generated quantities {
  // actual population-level intercept
  real b_Y_Intercept = Intercept_Y;
  // actual population-level intercept
  real b_Xmean_Intercept = Intercept_Xmean;
}
```