I’m trying to fit a fairly simple linear regression model with measurement error in the response variable, but the model is not converging, and I’m wondering if there’s something obviously wrong with my model specification.
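Written out, the model I have in mind is roughly the following. Here $y_i$ is the unobserved true response, $y_i^{\text{meas}}$ is its noisy measurement, and $\tau$ is the measurement standard deviation supplied as data. The second argument of each normal is a standard deviation, following Stan's convention. The half-Cauchy on $\sigma$ assumes `sigma` is declared with a lower bound of zero.

$$
\begin{aligned}
y_i &\sim \mathcal{N}(\alpha + \beta x_i,\ \sigma), \\
y_i^{\text{meas}} &\sim \mathcal{N}(y_i,\ \tau), \qquad i = 1, \dots, N, \\
\alpha,\ \beta &\sim \mathcal{N}(0,\ 10), \qquad \sigma \sim \text{Cauchy}^{+}(0,\ 5).
\end{aligned}
$$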
data {
  int<lower=0> N;      // number of cases
  vector[N] y_meas;    // measurement of y
  real<lower=0> tau;   // measurement noise
  vector[N] x;         // predictor
}
model {
  // Priors
  alpha ~ normal(0, 10);
  beta ~ normal(0, 10);
  sigma ~ cauchy(0, 5);
  // Model
  y ~ normal(mu_y, sigma_y);
  y_meas ~ normal(y, tau);
  y ~ normal(alpha + x * beta, sigma);
}
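For comparison, here is a minimal sketch of a complete program under several assumptions, not the model as posted. The `parameters` block is not shown above, so the declarations below are guesses based on the names used in the `model` block and in the printed output. The statement involving `mu_y` and `sigma_y` is left out because those names are not defined anywhere in the post. The latent `y` is integrated out analytically, which is possible because both error terms are normal and `tau` is known.

```stan
data {
  int<lower=0> N;        // number of cases
  vector[N] y_meas;      // measured response
  real<lower=0> tau;     // known measurement-error SD
  vector[N] x;           // predictor
}
parameters {
  // Assumed declarations: the original parameters block is not shown.
  real alpha;
  real beta;
  real<lower=0> sigma;   // lower bound makes the cauchy prior a half-Cauchy
}
model {
  // Priors (same as in the posted model)
  alpha ~ normal(0, 10);
  beta ~ normal(0, 10);
  sigma ~ cauchy(0, 5);
  // Marginal likelihood with the latent y integrated out:
  // residual variance and measurement variance add.
  y_meas ~ normal(alpha + x * beta, sqrt(square(sigma) + square(tau)));
}
```

In this form `sigma` and `tau` enter the likelihood only through the sum of their squares, so the data inform `sigma` only relative to the supplied value of `tau`.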
Some reproducible data are:
dat <- list(
N=40,
y_meas = c(16.61, 28.97, 24.13, 14.74, 17.66, 13.16, 8.59, 17.47, 12.92, 10.75, 19.26, 14.01, 8.91, 18.02, 11.33, 12.83, 21.26, 16.86, 11.76, 20.91, 15.04, 20.50, 26.47, 13.21, 13.59, 22.42, 10.57, 12.61, 17.95, 10.96, 16.54, 26.33, 21.75, 14.22, 20.60, 16.71, 17.86, 15.24, 14.87),
x = c(35, 60, 100, 35, 60, 100, 35, 60, 100, 35, 60, 100, 35, 60, 100, 35, 60, 100, 35, 60, 100, 35, 60, 100, 35, 60, 100, 35, 60, 100, 35, 60, 100, 35, 60, 100, 35, 35, 35, 35)
)
And the model call I use is:
m2 <- stan(file = "sample-model.stan", data = dat,
           iter = 4000,
           control = list(adapt_delta = 0.95,
                          max_treedepth = 13),
           chains = 3)
Example model output:
Warning messages:
1: There were 191 divergent transitions after warmup. Increasing adapt_delta above 0.95 may help. See
http://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
2: There were 8 transitions after warmup that exceeded the maximum treedepth. Increase max_treedepth above 13. See
http://mc-stan.org/misc/warnings.html#maximum-treedepth-exceeded
3: There were 3 chains where the estimated Bayesian Fraction of Missing Information was low. See
http://mc-stan.org/misc/warnings.html#bfmi-low
4: Examine the pairs() plot to diagnose sampling problems
5: The largest R-hat is 3.46, indicating chains have not mixed.
Running the chains for more iterations may help. See
http://mc-stan.org/misc/warnings.html#r-hat
6: Bulk Effective Samples Size (ESS) is too low, indicating posterior means and medians may be unreliable.
Running the chains for more iterations may help. See
http://mc-stan.org/misc/warnings.html#bulk-ess
7: Tail Effective Samples Size (ESS) is too low, indicating posterior variances and tail quantiles may be unreliable.
Running the chains for more iterations may help. See
http://mc-stan.org/misc/warnings.html#tail-ess
Here are the parameter stats:
print(m2,pars=c("alpha","beta","sigma"))
Inference for Stan model: polynomial_meas-err_uni.
3 chains, each with iter=4000; warmup=2000; thin=1;
post-warmup draws per chain=2000, total post-warmup draws=6000.
      mean se_mean   sd 2.5%  25%  50%  75% 97.5% n_eff     Rhat
alpha 3.53    0.92 1.13 2.53 2.53 2.96 5.11  5.11     2 92701.63
beta  0.00    0.00 0.00 0.00 0.00 0.00 0.00  0.00   141     1.02
sigma 0.00    0.00 0.00 0.00 0.00 0.00 0.00  0.00     2   152.15
Samples were drawn using NUTS(diag_e) at Wed Apr 15 14:13:49 2020.
For each parameter, n_eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor on split chains (at
convergence, Rhat=1).
Is there something obvious that I’m doing wrong with model specification? Or is it generally not advised to have a measurement error model for a response variable?