I have been struggling with the following code for a while and decided to seek help from experts here.
I am fitting a two component mixture model, one component normal and one component laplace. When I fit data that is generated from normal distribution with a large sample size of N=1000 or N=5000, I would expect the weight on the normal component to be close to 1. However, stan result always gives a weight close to 0.65. The same is true if the data is generated from a laplace distribution.
The code and result are
liao1 = "data {
int N; // number of observations
real theta[N];
}
parameters {
real<lower=0.0, upper= 20/sqrt(2)> tau1; //sqrt(2)*tau1 is the standard deviation of laplace distribution
real<lower=0.0, upper=20.0> tau2; //tau2 is the standard deviation of normal distribution
real<lower=0.0, upper=1.0> lambda;
}
model {
real term1;
real term2;
tau1 ~ uniform(0.0, 20/sqrt(2.));
tau2 ~ uniform(0.0, 20.0);
lambda ~ uniform(0.0, 1); //lambda<0.0 and lambda>=0 represents binary choice of laplace distribution and normal distribution
target += log_mix(lambda, double_exponential_lpdf(theta |0.0, tau1), normal_lpdf(theta | 0.0, tau2));
}
"
N = 1000
theta = rnorm(N, 0, 2)
my.data = list(N=N, theta=theta)
stan(model_code = liao1, data=my.data, iter=10000, chains=1)
Inference for Stan model: ae78845bd99245d1341f7cb022b70ce0.
1 chains, each with iter=10000; warmup=5000; thin=1;
post-warmup draws per chain=5000, total post-warmup draws=5000.
mean se_mean sd 2.5% 25% 50% 75% 97.5% n_eff Rhat
tau1 7.00 0.06 4.04 0.36 3.52 6.95 10.48 13.75 5000 1
tau2 2.01 0.00 0.05 1.93 1.98 2.01 2.04 2.10 4437 1
lambda 0.33 0.00 0.24 0.01 0.12 0.28 0.50 0.85 4709 1
lp__ -2118.42 0.03 1.38 -2121.92 -2119.08 -2118.05 -2117.39 -2116.82 2027 1
Samples were drawn using NUTS(diag_e) at Sat May 04 20:25:49 2019.
For each parameter, n_eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor on split chains (at
convergence, Rhat=1).
Thanks in advance.
Jason Liao