The problem is the two bugs in your R code:
xobs <- runif(n, -3, 3) + rnorm(n ,sigma)
yobs <- slope * xobs + intercept + rnorm(n ,sigma)
The first problem is that the rnorm here uses sigma as the location, not the scale.
The second problem is that yobs should be based on x, not xobs.
Going forward, you probably don’t want the same error scale for the measurement error (x_obs) and observations (y_obs).
Here’s an R program to simulate:
N <- 1000
sigma <- 0.5
alpha <- 1.3
beta <- -0.4
x <- runif(N, -3, 3)
y_obs <- rnorm(N, alpha + beta * x, sigma)
x_obs <- rnorm(N, x, sigma)
library('cmdstanr')
model <- cmdstan_model('foo.stan')
fit <- model$sample(data = list(N = N, x_obs = x_obs, y_obs = y_obs), parallel_chains=4)
print(fit$summary(variables = c("alpha", "beta", "sigma")))
and here’s the matching Stan model:
data{
int<lower=0> N;
array[N] real x_obs;
array[N] real y_obs;
}
parameters {
real alpha;
real beta;
real<lower=0> sigma;
vector[N] x;
}
model {
alpha ~ normal(0, 5);
beta ~ normal(0, 5);
sigma ~ normal(0, 5);
x_obs ~ normal(x, sigma);
y_obs ~ normal(alpha + beta * x, sigma);
}
and here’s the fit that the simulation script prints:
All 4 chains finished successfully.
Mean chain execution time: 3.1 seconds.
Total execution time: 3.2 seconds.
# A tibble: 3 × 10
variable mean median sd mad q5 q95 rhat ess_bulk ess_tail
<chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 alpha 1.32 1.32 0.0173 0.0168 1.29 1.34 1.00 4515. 3011.
2 beta -0.370 -0.370 0.00912 0.00911 -0.384 -0.355 1.00 4293. 2735.
3 sigma 0.498 0.498 0.0114 0.0114 0.480 0.518 1.00 1812. 2185.