Hello, I am getting extremely poor results from Stan on a very simple linear regression. Can anyone tell me where the problem is? Or, if I need to tweak settings, can anyone explain why Stan performs so poorly on such a simple model with the default settings?
R code:
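library(rstan)  # provides stan()

# Simulated data: y is an exact linear function of x (no noise term)
N <- 1000
x <- 1:N
y <- 10 + 3*x

lm_fit <- lm(y ~ x)

# Data list passed to Stan; names match the model's data block
stan_dat <- list(N = N, y = y, x = x)
stan_fit <- stan(file = "simple_regression.stan", data = stan_dat)

print(lm_fit)
print(stan_fit)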
Stan code (simple_regression.stan):
data {
  int<lower=0> N;
  real y[N];
  real x[N];
}
parameters {
  real b0;
  real b1;
  real<lower=0> sigma;
}
model {
  real mu[N];
  for (i in 1:N)
    mu[i] = b0 + b1 * x[i];
  b0 ~ normal(0, 10000);
  b1 ~ normal(0, 10000);
  sigma ~ uniform(0, 100);
  y ~ normal(mu, sigma);
}
end of output:
Call:
lm(formula = y ~ x)

Coefficients:
(Intercept)            x
         10            3
Inference for Stan model: simple_regression.
4 chains, each with iter=2000; warmup=1000; thin=1;
post-warmup draws per chain=1000, total post-warmup draws=4000.
            mean  se_mean        sd       2.5%        25%        50%       75%   97.5% n_eff      Rhat
b0          2.89     2.97      4.20      -0.28      -0.14       0.92      3.95   10.00     2 190583.39
b1          0.37     1.13      1.60      -1.15      -0.78      -0.19      0.96    3.00     2  53056.45
sigma      75.00    30.61     43.31       0.00      75.00     100.00    100.00  100.00     2 198812.79
lp__  -161189.34 80501.54 113874.84 -293878.71 -245587.99 -179357.88 -95092.74 7856.47     2   2902.19