Errors-in-variables model consistently underestimates sigma

The problem is two bugs in your R code:

xobs <- runif(n, -3, 3) + rnorm(n, sigma)
yobs <- slope * xobs + intercept + rnorm(n, sigma)

The first problem is that rnorm(n, sigma) passes sigma as the second positional argument, which is the mean (location), not the standard deviation (scale); it is equivalent to rnorm(n, mean = sigma, sd = 1).

The second problem is that yobs should be generated from the true covariate x, not from the noisy measurement xobs. As written there is no latent x at all; the first line folds the true value and its measurement noise into a single variable.
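For reference, a minimal corrected version of those two lines might look like this (the values of n, slope, intercept, and sigma are my guesses; substitute your own):

```r
n <- 1000          # sample size (assumed; use your own value)
slope <- -0.4
intercept <- 1.3
sigma <- 0.5
x <- runif(n, -3, 3)                                 # true covariate
xobs <- x + rnorm(n, 0, sigma)                       # measurement = true x plus noise
yobs <- slope * x + intercept + rnorm(n, 0, sigma)   # outcome depends on the true x, not xobs
```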

Going forward, you probably don’t want to force the same error scale for the measurement error (on x_obs) and the observation noise (on y_obs).

Here’s an R program to simulate:

N <- 1000
sigma <- 0.5
alpha <- 1.3
beta <- -0.4
x <- runif(N, -3, 3)
y_obs <- rnorm(N, alpha + beta * x, sigma)
x_obs <- rnorm(N, x, sigma)

library('cmdstanr')

model <- cmdstan_model('foo.stan')
fit <- model$sample(data = list(N = N, x_obs = x_obs, y_obs = y_obs), parallel_chains=4)
print(fit$summary(variables = c("alpha", "beta", "sigma")))

and here’s the matching Stan model:

data {
  int<lower=0> N;
  array[N] real x_obs;
  array[N] real y_obs;
}
parameters {
  real alpha;
  real beta;
  real<lower=0> sigma;
  vector[N] x;
}
model {
  alpha ~ normal(0, 5);
  beta ~ normal(0, 5);
  sigma ~ normal(0, 5);
  x_obs ~ normal(x, sigma);
  y_obs ~ normal(alpha + beta * x, sigma);
}
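If you follow the earlier advice and give the measurement error and the observation noise their own scales, a variant of the model might look like this (sigma_x and sigma_y are names I’m introducing, and the simulation would need to generate the data with two different scales to match):

```stan
data {
  int<lower=0> N;
  array[N] real x_obs;
  array[N] real y_obs;
}
parameters {
  real alpha;
  real beta;
  real<lower=0> sigma_x;  // measurement-error scale for x_obs
  real<lower=0> sigma_y;  // observation-noise scale for y_obs
  vector[N] x;            // latent true covariate
}
model {
  alpha ~ normal(0, 5);
  beta ~ normal(0, 5);
  sigma_x ~ normal(0, 5);
  sigma_y ~ normal(0, 5);
  x_obs ~ normal(x, sigma_x);
  y_obs ~ normal(alpha + beta * x, sigma_y);
}
```

As in the original model, x keeps its implicit flat prior; since each x_obs[i] is directly informative about x[i], that is usually fine here.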

and here’s the fit that the simulation script prints:

All 4 chains finished successfully.
Mean chain execution time: 3.1 seconds.
Total execution time: 3.2 seconds.

# A tibble: 3 × 10
  variable   mean median      sd     mad     q5    q95  rhat ess_bulk ess_tail
  <chr>     <dbl>  <dbl>   <dbl>   <dbl>  <dbl>  <dbl> <dbl>    <dbl>    <dbl>
1 alpha     1.32   1.32  0.0173  0.0168   1.29   1.34   1.00    4515.    3011.
2 beta     -0.370 -0.370 0.00912 0.00911 -0.384 -0.355  1.00    4293.    2735.
3 sigma     0.498  0.498 0.0114  0.0114   0.480  0.518  1.00    1812.    2185.