Dear Stan community,
I am interested in measurement-error models, in particular conditional logit models in which some of the explanatory variables are measured with error. To get there, I have started from a far simpler example, planning to build up the complexity of the model step by step. This simple example is a linear model with measurement error in x, where the scale of the measurement error is known. It is included in the manual on page 203.
The model is:
y_i = \alpha + \beta x_i + \epsilon_i, i = 1, ..., N, where N is the number of observations.
I am interested in estimating \alpha and \beta when x_i is measured with error. The simplest situation is x_i^{meas} \sim N(x_i, \tau); following Stan conventions, \tau is a standard deviation.
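Written out in full, the generative model I am trying to fit (matching the Stan code below) is:

```latex
\begin{aligned}
x_i &\sim \mathrm{N}(\mu_x,\, \sigma_x) && \text{prior on the true covariate} \\
x_i^{meas} &\sim \mathrm{N}(x_i,\, \tau) && \text{measurement model, } \tau \text{ known} \\
y_i &\sim \mathrm{N}(\alpha + \beta x_i,\, \sigma) && \text{outcome model}
\end{aligned}
```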
I have simulated data and then tried to recover the true parameters using the code in the manual on page 204, after minor adjustments to variable types to resolve compile errors:
// y, x_meas, and x are declared as vector (instead of real, as in the manual) due to compile errors
data {
  int<lower=0> N;        // number of observations
  vector[N] y;           // outcome
  vector[N] x_meas;      // measurements of x
  real<lower=0> tau;     // measurement noise (known sd)
}
parameters {
  real alpha;                // intercept
  real beta;                 // slope
  real<lower=0> sigma;       // sd of the error term in y = alpha + beta * x_true + eps
  vector[N] x;               // unknown true values
  real mu_x;                 // prior location
  real<lower=0> sigma_x;     // prior scale
}
model {
  alpha ~ normal(0, 10);
  beta ~ normal(0, 10);
  sigma ~ cauchy(0, 5);
  mu_x ~ normal(0, 10);
  sigma_x ~ cauchy(0, 5);
  x ~ normal(mu_x, sigma_x);            // prior on the true values
  x_meas ~ normal(x, tau);              // measurement model
  y ~ normal(alpha + beta * x, sigma);  // outcome model
}
I seem to have problems recovering the parameters:
- there are divergent transitions after warmup
- n_eff is very low and Rhat is too high
Warning messages:
1: There were 42 divergent transitions after warmup. Increasing adapt_delta above 0.8 may help. See
http://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
2: There were 58 transitions after warmup that exceeded the maximum treedepth. Increase max_treedepth above 10. See
http://mc-stan.org/misc/warnings.html#maximum-treedepth-exceeded
3: There were 4 chains where the estimated Bayesian Fraction of Missing Information was low. See
http://mc-stan.org/misc/warnings.html#bfmi-low
4: Examine the pairs() plot to diagnose sampling problems
> print(res_LM_stan_example_V1, pars = c("alpha", "beta", "sigma", "mu_x", "sigma_x"))
Inference for Stan model: LM_stan_example_V1.
4 chains, each with iter=4000; warmup=2000; thin=1;
post-warmup draws per chain=2000, total post-warmup draws=8000.
mean se_mean sd 2.5% 25% 50% 75% 97.5% n_eff Rhat
alpha 0.87 0.02 0.17 0.54 0.77 0.87 0.98 1.21 57 1.04
beta 4.04 0.00 0.03 3.97 4.01 4.04 4.06 4.10 65 1.04
sigma 0.65 0.07 0.23 0.25 0.47 0.65 0.80 1.15 13 1.57
mu_x 4.05 0.00 0.07 3.92 4.01 4.05 4.10 4.19 1573 1.00
sigma_x 2.98 0.00 0.05 2.89 2.95 2.98 3.02 3.09 813 1.00
Samples were drawn using NUTS(diag_e) at Wed Nov 14 18:29:56 2018.
For each parameter, n_eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor on split chains (at
convergence, Rhat=1).
> true_alpha
[1] 1
> true_beta
[1] 4
> true_sigma
[1] 1
> true_mu_x
[1] 4
> true_sigma_x
[1] 3
Based on the example on page 152 in the manual, I have reparameterized the Cauchy priors and moved the construction of x to the transformed parameters block (a non-centered parameterization). These changes are included in LM_stan_example_V2.stan. The results don't improve much: I still get divergent transitions, and while n_eff improves for alpha and beta, it is now very low for sigma, mu_x, and sigma_x.
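Since the V2 file is attached rather than inlined, here is a sketch of what I mean by those two changes, following the manual's constructions (the half-Cauchy written as a scaled tangent of a uniform, and x built from a standard-normal raw vector); the attached file may differ in details:

```stan
// Sketch of the V2 reparameterization (parameters/transformed parameters/model only;
// the data block is unchanged from V1)
parameters {
  real alpha;
  real beta;
  real<lower=0, upper=pi() / 2> sigma_unif;    // for sigma ~ cauchy(0, 5)
  vector[N] x_raw;                             // standard-normal raw values
  real mu_x;
  real<lower=0, upper=pi() / 2> sigma_x_unif;  // for sigma_x ~ cauchy(0, 5)
}
transformed parameters {
  real<lower=0> sigma = 5 * tan(sigma_unif);      // implies sigma ~ cauchy(0, 5)
  real<lower=0> sigma_x = 5 * tan(sigma_x_unif);  // implies sigma_x ~ cauchy(0, 5)
  vector[N] x = mu_x + sigma_x * x_raw;           // implies x ~ normal(mu_x, sigma_x)
}
model {
  alpha ~ normal(0, 10);
  beta ~ normal(0, 10);
  mu_x ~ normal(0, 10);
  x_raw ~ normal(0, 1);                 // non-centered prior on the true values
  x_meas ~ normal(x, tau);              // measurement model
  y ~ normal(alpha + beta * x, sigma);  // outcome model
}
```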
> print(res_LM_stan_example_V2, pars = c("alpha", "beta", "sigma", "mu_x", "sigma_x"))
Inference for Stan model: LM_stan_example_V2.
4 chains, each with iter=4000; warmup=2000; thin=1;
post-warmup draws per chain=2000, total post-warmup draws=8000.
mean se_mean sd 2.5% 25% 50% 75% 97.5% n_eff Rhat
alpha 0.86 0.00 0.16 0.55 0.75 0.86 0.97 1.16 1434 1.00
beta 4.04 0.00 0.03 3.98 4.02 4.04 4.06 4.10 1112 1.00
sigma 0.62 0.04 0.18 0.25 0.49 0.62 0.75 0.97 18 1.18
mu_x 4.04 0.01 0.07 3.91 4.00 4.04 4.09 4.20 42 1.10
sigma_x 2.97 0.01 0.06 2.87 2.94 2.97 3.01 3.09 65 1.06
Samples were drawn using NUTS(diag_e) at Wed Nov 14 18:32:51 2018.
For each parameter, n_eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor on split chains (at
convergence, Rhat=1).
There’s probably a silly error in my code, but I can’t seem to find it. Or maybe I should use other priors. It could also be that I am missing something more fundamental. Either way, I would really appreciate suggestions on what to improve. I have attached the R and Stan files.
Best,
Ana
LM_stan_example.R (2.0 KB)
LM_stan_example_V1.stan (896 Bytes)
LM_stan_example_V2.stan (1.1 KB)