"Measurement error" model

I’ve been working on a linear regression problem where we don’t measure the covariate of interest (x_{true}) directly:

y \sim Normal(a+b*x_{true},\sigma_1)

Instead we measure a proxy variable (x_{obs}) and have calibration data available to fit another regression between the proxy variable and the covariate of interest:

x_{true} \sim Normal(c+d*x_{obs},\sigma_2).

For now, assume we will just use the point estimates of c, d, and \sigma_2.

I’ve coded the model as follows:

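data {
  int<lower=0> N;
  vector[N] y;
  vector[N] x_obs;
  real<lower=0> sigma_2;  // point estimate from the calibration regression
  real c;                 // calibration intercept (point estimate)
  real d;                 // calibration slope (point estimate)
}

parameters {
  real a;
  real b;
  vector[N] x_true;       // latent covariate, one value per observation
  real<lower=0> sigma_1;
}

model {
  a ~ normal(0,5);
  b ~ normal(0,5);
  sigma_1 ~ cauchy(0,5);  // half-Cauchy, since sigma_1 is constrained positive
  x_true ~ normal(c+d*x_obs,sigma_2);
  y ~ normal(a+b*x_true, sigma_1);
}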

Would this be the right implementation? And, in general, will models like this lead to a different likelihood formulation than traditional measurement error models, which suppose x_{obs} \sim Normal(x_{true},\tau) for known measurement error \tau, or is there some equivalence I’m missing?


This is exactly equivalent. To see the equivalence, note that you have written down x_true ~ normal(x_obs2, sigma2), where x_obs2 = c+d*x_obs. Then notice that normal_lpdf(x_true | x_obs2, sigma2) is strictly equal to normal_lpdf(x_obs2 | x_true, sigma2). Thus you have a model that is precisely equivalent to

x_{obs} \sim Normal(x_{true},\tau)

where \tau is sigma2 and x_{obs} is x_obs2.
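
For reference, the symmetry of normal_lpdf in its first two arguments can be checked by writing the log density out explicitly. This is only a sketch of the standard normal log density, writing \sigma_2 for sigma2 and x_{obs2} for x_obs2 = c+d*x_obs:

\log Normal(x_{true} \mid x_{obs2}, \sigma_2) = -\log(\sigma_2\sqrt{2\pi}) - \frac{(x_{true} - x_{obs2})^2}{2\sigma_2^2} = \log Normal(x_{obs2} \mid x_{true}, \sigma_2)

The outcome and the location enter only through the squared difference (x_{true} - x_{obs2})^2, so swapping them leaves the value unchanged.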