I’m not sure what the code in the Stan manual is trying to do, but I posted a simple regression model with measurement errors in both variables along with a reproducible example a while back: https://discourse.mc-stan.org/t/estimating-slope-and-intercept-linear-regression-for-data-x-vs-y-with-uncertainties-in-both-x-and-y/5457/4
Here are the parameters, transformed parameters, and model blocks:
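parameters {
  vector[N] x_lat;        // latent (error-free) x values
  vector[N] y_lat;        // latent (error-free) y values
  real beta0;             // intercept
  real beta1;             // slope
  real<lower=0> sigma;    // residual sd of the latent regression
}
transformed parameters {
  vector[N] mu_yhat = beta0 + beta1 * x_lat;
}
model {
  beta0 ~ normal(0., 2.);
  beta1 ~ normal(0., 5.);
  sigma ~ normal(0., 2.);           // half-normal, given the lower=0 bound
  xhat ~ normal(x_lat, sd_xhat);    // measurement model for x
  y_lat ~ normal(mu_yhat, sigma);   // regression on the latent values
  yhat ~ normal(y_lat, sd_yhat);    // measurement model for y
}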
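For completeness, a data block consistent with the names used above might look like the following. This is only a sketch: the declared types and constraints are assumptions inferred from how `N`, `xhat`, `yhat`, `sd_xhat`, and `sd_yhat` are used in the model block, and the linked post has the original.

```stan
data {
  int<lower=1> N;               // number of observations
  vector[N] xhat;               // measured x values
  vector[N] yhat;               // measured y values
  vector<lower=0>[N] sd_xhat;   // known measurement sd of each xhat (assumed per-observation)
  vector<lower=0>[N] sd_yhat;   // known measurement sd of each yhat (assumed per-observation)
}
```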
In this model the latent values of y are connected to the latent values of x through a simple linear regression, while the measured values are generated from the latent ones with known standard deviations. The transformed parameter mu_yhat could have been left out. Here’s a traceplot of a few parameters:
The standard deviation parameter sigma has a slightly low n_eff and is biased on the low side (the input was 0.1), but no warnings are triggered.
I think there’s an identification issue with the model in the manual. It would help if authors were asked to produce real or fake data examples.
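As an illustration of what a fake data example could look like, here is a sketch of a standalone Stan program that simulates from the model above. It is not the code from the linked post. The standard normal distribution for `x_lat` is an assumption made only so there is something to draw from, and the true values of `beta0`, `beta1`, and `sigma` are supplied as data.

```stan
data {
  int<lower=1> N;
  real beta0;                   // true intercept
  real beta1;                   // true slope
  real<lower=0> sigma;          // true residual sd (e.g. 0.1)
  vector<lower=0>[N] sd_xhat;   // known measurement sd for x
  vector<lower=0>[N] sd_yhat;   // known measurement sd for y
}
generated quantities {
  vector[N] x_lat;
  vector[N] y_lat;
  vector[N] xhat;
  vector[N] yhat;
  for (n in 1:N) {
    x_lat[n] = normal_rng(0, 1);  // assumed distribution of the latent x
    y_lat[n] = normal_rng(beta0 + beta1 * x_lat[n], sigma);
    xhat[n] = normal_rng(x_lat[n], sd_xhat[n]);
    yhat[n] = normal_rng(y_lat[n], sd_yhat[n]);
  }
}
```

Since the program has no parameters block, it would be run with the fixed_param sampler. Each draw is then one complete simulated data set whose `xhat` and `yhat` can be passed to the regression model, and the recovered `beta0`, `beta1`, and `sigma` compared with the values used to simulate.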