Hi everybody!
In a simple linear regression, say I have a manifest outcome Y and a predictor X that is aggregated (i.e., the values of X are means of distributions of true parameters with known variances).
Would it be a sound modeling strategy to do the following:
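data {
  int<lower=0> N;               // number of observations (declaration was missing)
  vector[N] y;                  // manifest outcome
  vector[N] X_means;            // reported means of the aggregated predictor
  vector<lower=0.0>[N] X_sds;   // known standard deviations of those means
}
parameters {
  vector[N] X_i;                // unknown true predictor values
  real<lower=0.0> sd_residual;
  real beta_0;
  real beta_1;
}
model {
  X_i ~ normal(X_means, X_sds);
  y ~ normal(beta_0 + beta_1 * X_i, sd_residual);
  // Priors...
}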
I couldn't really figure out which part of the modeling world this belongs to. It is a kind of reverse latent variable modeling, since we have the means and errors and are interested in the manifest values (which we don't know). On the other hand, it's not really mixed modeling either; at least I can't manage to make it fit into that format.
I have to say I am a bit unsure whether this is even permissible, because if you reparameterize X_i and plug it into the regression equation, it reads:
Y_i = \beta_0 + \beta_1(\bar{X}_i + \sigma_{X_i}X_i) + \varepsilon_i = \beta_0 + \beta_1\bar{X}_i + \beta_1\sigma_{X_i}X_i + \varepsilon_i
with X_i\sim N(0,1). Now how can we disentangle the terms \beta_1\sigma_{X_i}X_i and \varepsilon_i?
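To make the concern concrete, here is a sketch of what I get if I integrate out the standardized X_i. I am assuming that X_i and \varepsilon_i are independent and writing \sigma_\varepsilon for sd_residual (that symbol is my own notation, not part of the model above):

```latex
\begin{aligned}
\mathrm{E}[Y_i \mid \bar{X}_i] &= \beta_0 + \beta_1\bar{X}_i \\
\mathrm{Var}[Y_i \mid \bar{X}_i] &= \beta_1^2\sigma_{X_i}^2 + \sigma_\varepsilon^2 \\
Y_i \mid \bar{X}_i &\sim N\!\left(\beta_0 + \beta_1\bar{X}_i,\ \beta_1^2\sigma_{X_i}^2 + \sigma_\varepsilon^2\right)
\end{aligned}
```

If this is right, the two terms enter the likelihood only through their combined variance, so the individual realizations of \beta_1\sigma_{X_i}X_i and \varepsilon_i cannot be told apart. The parameters might still be recoverable, though: \beta_1 is informed by the mean, \sigma_{X_i} is known, and \sigma_\varepsilon would then be whatever variance is left over. I am not sure this reasoning holds.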
If it is actually a sensible model, I would be happy if you could tell me if this has a name and how I can find further information.
Thank you!