To double-check things, I recreated the synthetic data according to the specifications in your post, which in R is:
N <- 640
alpha <- 0.1
mu_u <- 0.20
sigma_u <- 0.15
lower_bound <- 0
sigma <- 0.13
u <- truncnorm::rtruncnorm(n = N, a = lower_bound, mean = mu_u, sd = sigma_u)
error <- extraDistr::rhnorm(n = N, sigma = sigma)
Y <- alpha - u + error
Because your generating parameters are so close to 0 while also being lower-bounded at 0, the sampler is going to have a difficult time with the posterior.
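As a quick illustration of how much that boundary matters, roughly 9% of the mass of an untruncated normal(0.20, 0.15) sits below 0, which you can check with the values defined above:

# Share of the untruncated normal(mu_u, sigma_u) that falls below the bound
pnorm(lower_bound, mean = mu_u, sd = sigma_u)
#> ~0.091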
To make things a bit easier, we can use the non-centered parameterisation for u
(an explanation of how to specify the bounds is in this forum post), and use tighter priors for the parameters:
data {
  int<lower=0> N;
  vector[N] Y;
}
parameters {
  real<lower=0> mu_u;
  real alpha;
  real<lower=0> sigma;
  real<lower=0> sigma_u;
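  // Bound of -mu_u / sigma_u ensures u = mu_u + sigma_u * u_raw >= 0,
  // i.e., the non-centered version of truncating u below at 0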
  vector<lower=-mu_u/sigma_u>[N] u_raw;
}
transformed parameters {
  // Implies u ~ normal(mu_u, sigma_u) with lower bound of 0
  vector<lower=0>[N] u = mu_u + u_raw * sigma_u;
  // Calculate density adjustment for lower truncation at 0 for
  // every observation
  real trunc_adjustment = -N * normal_lccdf(0 | mu_u, sigma_u);
}
model {
  alpha ~ normal(0, 0.5);
  mu_u ~ normal(0, 0.5);
  sigma_u ~ normal(0, 0.5);
  sigma ~ normal(0, 0.5);
  u_raw ~ std_normal();
  target += trunc_adjustment;
  Y ~ normal(alpha - u, sigma);
}
When running the model, you'll also need to increase the adapt_delta and max_treedepth settings, so that the sampler traverses the posterior with more precision:
library(cmdstanr)

mod <- cmdstan_model("trunc.stan")
samp <- mod$sample(data = list(N = N, Y = Y),
                   adapt_delta = 0.999,
                   max_treedepth = 12)
This appears to recover the parameters reasonably well:
> samp$summary(c("mu_u", "alpha", "sigma", "sigma_u"), c("mean", "rhat", "ess_bulk"))
# A tibble: 4 x 4
variable mean rhat ess_bulk
<chr> <dbl> <dbl> <dbl>
1 mu_u 0.123 1.74 6.12
2 alpha 0.135 1.61 6.69
3 sigma 0.113 1.43 8.37
4 sigma_u 0.114 1.04 127.
The chains haven't completely converged, so you may need to increase the adapt_delta even further.
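For example, something along these lines; the exact values are just a starting point, and longer chains may also help:

# Re-run with a higher target acceptance rate and more iterations,
# then re-check rhat / ess_bulk and the sampler diagnostics
samp <- mod$sample(data = list(N = N, Y = Y),
                   adapt_delta = 0.9999,
                   max_treedepth = 15,
                   iter_warmup = 2000,
                   iter_sampling = 2000)
samp$summary(c("mu_u", "alpha", "sigma", "sigma_u"), c("mean", "rhat", "ess_bulk"))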