Errors-in-variables model consistently underestimates sigma

Greetings, I am trying to put together an errors-in-variables linear regression model in Stan (as a starting point for its censored version). The model I have estimates the slope and intercept correctly, but it always underestimates the error sigma (I have checked that it is not a confusion between variance and sd…). The initial model I tried is below:

data {
  int<lower=0> n;
  array[n] real xobs;
  array[n] real yobs;
}

parameters {
  real alpha;
  real beta;
  real<lower=0> sigma;
  array[n] real x;   // latent true covariate (no explicit prior, so implicitly flat)
}

model {
  // priors
  alpha ~ normal(0, 5);
  beta ~ normal(0, 5);
  sigma ~ normal(0, 5);

  // measurement model and regression, sharing one error scale
  for (i in 1:n) {
    xobs[i] ~ normal(x[i], sigma);
  }
  for (i in 1:n) {
    yobs[i] ~ normal(alpha + beta * x[i], sigma);
  }
}

I ran this in R, generating the data as follows (I set the intercept to 0 for testing purposes, but sigma is still underestimated):

n <- 1000
slope <- 1
intercept <- 0
sigma <- 1

xobs <- runif(n, -3, 3) + rnorm(n,sigma)
yobs <- slope*xobs + intercept + rnorm(n,sigma) 

I get fitted sigma values of around 0.7 every time. When I increase the true sigma to 2, the fit comes out around 1.4.

I have also tried a more complicated model (inspired by the measurement error model in the Stan User's Guide), which is:

data {
  int<lower=0> n;
  array[n] real xobs;
  array[n] real yobs;
}
parameters {
  real alpha;
  real beta;
  real<lower=0> sigma;
  vector[n] x;                // latent true covariate
  real mu_x;
  real<lower=0> sigma_x;
}
model {
  x ~ normal(mu_x, sigma_x);  // prior on the latent covariate
  xobs ~ normal(x, sigma);    // measurement model
  yobs ~ normal(alpha + beta * x, sigma);

  alpha ~ normal(0, 10);
  beta ~ normal(0, 10);
  sigma ~ cauchy(0, 5);
  sigma_x ~ cauchy(0, 5);
  mu_x ~ normal(0, 10);
}

This one also produces similarly underestimated sigmas. I tried it with slope = 2, intercept = -1.5, and sigma = 1, and the posterior means (the histograms themselves look fine) come out around 2, -0.49, and 0.46 respectively.

For people who are familiar with PyMC3: I had previously built the second model above in PyMC3 (it should be more or less identical), and it does recover the correct sd. I am quite confused, because to me the second model above and the model below are identical (apart from using shifted HalfNormals instead of Cauchys for the sigmas). Any help would be much appreciated.

with pm.Model() as model:
    alpha = pm.Normal("intercept", 0., 5.)
    beta = pm.Normal("slope", 0., 5.)
    sigma_x = 0.01 + pm.HalfNormal("sigma_x", 5.)
    sigma = 0.01 + pm.HalfNormal("sigma", 5.)
    mu_x = pm.Normal("x_mu", 0., 10.)

    # latent true covariate
    x = pm.Normal("x", mu_x, sigma_x, shape=x_obs.shape)

    y = pm.Deterministic("y_pred", alpha + beta * x)

    # measurement model and regression likelihood
    x_likelihood = pm.Normal("x_obs", mu=x, sigma=sigma, observed=x_obs,
                             shape=x_obs.shape)
    y_likelihood = pm.Normal("y", mu=y, sigma=sigma, observed=y_obs,
                             shape=y_obs.shape)

    trace = pm.sample(400, tune=1000, chains=6, cores=6, target_accept=0.95,
                      return_inferencedata=False, init='jitter+adapt_diag')

There are two bugs in your R code:

xobs <- runif(n, -3, 3) + rnorm(n, sigma)
yobs <- slope * xobs + intercept + rnorm(n, sigma)

The first problem is that rnorm here uses sigma as the location, not the scale: the signature is rnorm(n, mean = 0, sd = 1), so rnorm(n, sigma) draws with mean sigma and sd 1.
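To see the pitfall in isolation, here is a quick R check (illustrative numbers, not from the post):

set.seed(1)
draws <- rnorm(1e5, 2)      # second positional argument is the mean: mean = 2, sd = 1
mean(draws)                 # ~2
sd(draws)                   # ~1, no matter what sigma was intended

draws <- rnorm(1e5, 0, 2)   # mean = 0, sd = 2: what was presumably intended
sd(draws)                   # ~2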

The second problem is that yobs should be based on the true covariate x, not on the noisy measurement xobs; as written, the code never generates a latent x at all.
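Concretely, the minimal fix to those two lines is to draw a latent x first, then add independent noise for each observed quantity (the full script below does the same thing, just written using rnorm's mean argument):

x <- runif(n, -3, 3)                                 # latent true covariate
xobs <- x + rnorm(n, 0, sigma)                       # noisy measurement of x
yobs <- slope * x + intercept + rnorm(n, 0, sigma)   # outcome based on the true x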

Going forward, you probably don’t want the same error scale for the measurement error (x_obs) and observations (y_obs).
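For example, a two-scale variant might look like the sketch below (sigma_x_err and sigma_y are names introduced here, not anything from the post; the script and model that follow keep a single shared sigma for simplicity):

data {
  int<lower=0> N;
  array[N] real x_obs;
  array[N] real y_obs;
}
parameters {
  real alpha;
  real beta;
  real<lower=0> sigma_x_err;   // measurement-error scale for x_obs
  real<lower=0> sigma_y;       // residual scale for y_obs
  vector[N] x;
}
model {
  alpha ~ normal(0, 5);
  beta ~ normal(0, 5);
  sigma_x_err ~ normal(0, 5);
  sigma_y ~ normal(0, 5);
  x_obs ~ normal(x, sigma_x_err);
  y_obs ~ normal(alpha + beta * x, sigma_y);
}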

Here’s an R program to simulate:

N <- 1000
sigma <- 0.5
alpha <- 1.3
beta <- -0.4
x <- runif(N, -3, 3)                         # latent true covariate
y_obs <- rnorm(N, alpha + beta * x, sigma)   # outcome depends on the true x
x_obs <- rnorm(N, x, sigma)                  # noisy measurement of x

library('cmdstanr')

model <- cmdstan_model('foo.stan')
fit <- model$sample(data = list(N = N, x_obs = x_obs, y_obs = y_obs), parallel_chains=4)
print(fit$summary(variables = c("alpha", "beta", "sigma")))

and here’s the matching Stan model:

data {
  int<lower=0> N;
  array[N] real x_obs;
  array[N] real y_obs;
}
parameters {
  real alpha;
  real beta;
  real<lower=0> sigma;
  vector[N] x;   // latent true covariate
}
model {
  alpha ~ normal(0, 5);
  beta ~ normal(0, 5);
  sigma ~ normal(0, 5);
  x_obs ~ normal(x, sigma);                 // measurement model
  y_obs ~ normal(alpha + beta * x, sigma);  // regression
}

and here’s the fit that the simulation script prints:

All 4 chains finished successfully.
Mean chain execution time: 3.1 seconds.
Total execution time: 3.2 seconds.

# A tibble: 3 × 10
  variable   mean median      sd     mad     q5    q95  rhat ess_bulk ess_tail
  <chr>     <dbl>  <dbl>   <dbl>   <dbl>  <dbl>  <dbl> <dbl>    <dbl>    <dbl>
1 alpha     1.32   1.32  0.0173  0.0168   1.29   1.34   1.00    4515.    3011.
2 beta     -0.370 -0.370 0.00912 0.00911 -0.384 -0.355  1.00    4293.    2735.
3 sigma     0.498  0.498 0.0114  0.0114   0.480  0.518  1.00    1812.    2185.