I’ve been working on a linear regression problem where we don’t measure the covariate of interest (x_{true}) directly:
y \sim Normal(a+b*x_{true},\sigma_1)
Instead we measure a proxy variable (x_{obs}) and have calibration data available to fit another regression between the proxy variable and the covariate of interest:
x_{true} \sim Normal(c+d*x_{obs},\sigma_2).
For now, assume we will just use the point estimates of c, d, \sigma_2 from the calibration fit.
I’ve coded the model as follows:
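As a sanity check on what this setup implies (a sketch, assuming both distributions are normal, parameterized by standard deviation as above, and that c, d, \sigma_2 are held fixed): writing x_{true} = c+d*x_{obs}+\epsilon_2 with \epsilon_2 \sim Normal(0,\sigma_2) and substituting into the first regression gives y = a+b*(c+d*x_{obs})+b*\epsilon_2+\epsilon_1 with \epsilon_1 \sim Normal(0,\sigma_1) independent of \epsilon_2, so integrating out x_{true} should leave:
y \sim Normal(a+b*(c+d*x_{obs}), \sqrt{\sigma_1^2+b^2*\sigma_2^2})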
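data {
  int<lower=0> N;
  vector[N] y;
  vector[N] x_obs;
  real<lower=0> sigma_2;  // point estimate from the calibration fit
  real c;                 // point estimate from the calibration fit
  real d;                 // point estimate from the calibration fit
}
parameters {
  real a;
  real b;
  vector[N] x_true;       // latent covariate, one value per observation
  real<lower=0> sigma_1;
}
model {
  a ~ normal(0, 5);
  b ~ normal(0, 5);
  sigma_1 ~ cauchy(0, 5); // half-Cauchy, given the lower bound on sigma_1
  x_true ~ normal(c + d * x_obs, sigma_2);
  y ~ normal(a + b * x_true, sigma_1);
}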
Would this be the right implementation? And, in general, do models like this lead to a different likelihood formulation than traditional measurement error models, which suppose x_{obs} \sim Normal(x_{true},\tau) for a known measurement error standard deviation \tau, or is there some equivalence I’m missing?
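To make the comparison concrete, here is a minimal sketch of what I mean by the traditional formulation. The population-level prior on x_true (the parameters mu_x and sigma_x) and all the prior scales are assumptions added only for illustration; tau is passed in as known data.

```stan
data {
  int<lower=0> N;
  vector[N] y;
  vector[N] x_obs;
  real<lower=0> tau;       // known measurement error sd
}
parameters {
  real a;
  real b;
  vector[N] x_true;        // latent covariate
  real<lower=0> sigma_1;
  real mu_x;               // assumed population mean of x_true (illustrative)
  real<lower=0> sigma_x;   // assumed population sd of x_true (illustrative)
}
model {
  a ~ normal(0, 5);
  b ~ normal(0, 5);
  sigma_1 ~ cauchy(0, 5);
  mu_x ~ normal(0, 5);     // illustrative prior
  sigma_x ~ cauchy(0, 5);  // illustrative prior
  x_true ~ normal(mu_x, sigma_x);  // prior on the latent covariate
  x_obs ~ normal(x_true, tau);     // proxy modeled given the true value
  y ~ normal(a + b * x_true, sigma_1);
}
```

The visible difference is the direction of conditioning: this version models x_obs given x_true and so needs its own prior on x_true, whereas my model above places the distribution on x_true given x_obs directly.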