Aggregated predictor in a simple linear regression

Hi everybody!

In a simple linear regression, say I have a manifest outcome Y and a predictor X that is aggregated (i.e., each observed value of X is the mean of a distribution of the true value, with known standard deviation).

Would it be a sound modeling strategy to do the following:

data {
  int<lower=0> N;
  vector[N] y;
  vector[N] X_means;
  vector<lower=0.0>[N] X_sds;
}

parameters {
  vector[N] X_i;  // latent "true" predictor values
  real<lower=0.0> sd_residual;
  real beta_0;
  real beta_1;
}

model {
  X_i ~ normal(X_means, X_sds);
  y ~ normal(beta_0 + beta_1 * X_i, sd_residual);
  // Priors...
}

I couldn’t really figure out which part of the modeling world this belongs to. It feels like a kind of reverse latent-variable modeling, since we have the means and errors and are interested in the manifest values (which we don’t know). On the other hand, it’s not really mixed modeling either; at least I can’t manage to make it fit into that format.

I have to say I am a bit unsure whether this is even permissible, because if you reparameterize X_i as \bar{X}_i + \sigma_{X_i}Z_i and plug it into the regression equation, it reads:

Y_i = \beta_0 + \beta_1(\bar{X}_i + \sigma_{X_i}Z_i) + \varepsilon_i = \beta_0 + \beta_1\bar{X}_i + \beta_1\sigma_{X_i}Z_i + \varepsilon_i

with Z_i\sim N(0,1). Now how can we disentangle the terms \beta_1\sigma_{X_i}Z_i and \varepsilon_i?

If it is actually a sensible model, I would be happy if you could tell me if this has a name and how I can find further information.

Thank you!


This is valid (assuming the uncertainties in the X_i are independent and approximately Gaussian), and is often referred to as a “measurement error model”.

If \sigma_{X_i} is too large, then you won’t get identification. But it should be pretty straightforward to convince yourself that there won’t be identification problems if \sigma_{X_i} is sufficiently small. In the limit that \sigma_{X_i} approaches zero, this is just an ordinary linear regression.
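
To make the disentangling question concrete: since everything here is normal, you can integrate the latent X_i out analytically, and the two noise terms simply add in quadrature in the marginal likelihood for y_i (writing \sigma_{\mathrm{res}} for the residual standard deviation):

y_i \mid X_i \sim N(\beta_0 + \beta_1 X_i,\ \sigma_{\mathrm{res}}), \quad X_i \sim N(\bar{X}_i,\ \sigma_{X_i}) \;\Rightarrow\; y_i \sim N\!\left(\beta_0 + \beta_1\bar{X}_i,\ \sqrt{\beta_1^2\sigma_{X_i}^2 + \sigma_{\mathrm{res}}^2}\right)

This is one way to see why identification degrades as \sigma_{X_i} grows relative to the residual scale, and why you recover the ordinary regression in the limit \sigma_{X_i}\to 0, exactly as above.

If you only care about beta_0, beta_1, and sd_residual (and not the latent X_i themselves), you could also fit the marginalized form directly. A minimal sketch, reusing the data names from your post and assuming the normality approximation holds:

data {
  int<lower=0> N;
  vector[N] y;
  vector[N] X_means;
  vector<lower=0>[N] X_sds;
}

parameters {
  real beta_0;
  real beta_1;
  real<lower=0> sd_residual;
}

model {
  // latent X_i integrated out analytically: measurement noise and
  // residual noise add in quadrature
  y ~ normal(beta_0 + beta_1 * X_means,
             sqrt(square(beta_1 * X_sds) + square(sd_residual)));
  // Priors...
}

This should give the same posterior for the regression coefficients as your formulation with explicit X_i; the latent vector has just been integrated out.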