Two component mixture model, strange result from stan

jg_liao · May 5, 2019, 12:29am

I have been struggling with the following code for a while and decided to seek help from experts here.

I am fitting a two component mixture model, one component normal and one component laplace. When I fit data that is generated from normal distribution with a large sample size of N=1000 or N=5000, I would expect the weight on the normal component to be close to 1. However, stan result always gives a weight close to 0.65. The same is true if the data is generated from a laplace distribution.

The code and result are

liao1 = "data {
int N; // number of observations
real theta[N];
}

parameters {
real<lower=0.0, upper= 20/sqrt(2)> tau1; //sqrt(2)*tau1 is the standard deviation of laplace distribution
real<lower=0.0, upper=20.0> tau2; //tau2 is the standard deviation of normal distribution
real<lower=0.0, upper=1.0> lambda;
}

model {

real term1;
real term2;

tau1 ~ uniform(0.0, 20/sqrt(2.));
tau2 ~ uniform(0.0, 20.0);

lambda ~ uniform(0.0, 1); //lambda<0.0 and lambda>=0 represents binary choice of laplace distribution and normal distribution

target += log_mix(lambda, double_exponential_lpdf(theta |0.0, tau1), normal_lpdf(theta | 0.0, tau2));
}
"

N = 1000
theta = rnorm(N, 0, 2)

my.data = list(N=N, theta=theta)

stan(model_code = liao1, data=my.data, iter=10000, chains=1)

Inference for Stan model: ae78845bd99245d1341f7cb022b70ce0.
1 chains, each with iter=10000; warmup=5000; thin=1;
post-warmup draws per chain=5000, total post-warmup draws=5000.

       mean se_mean   sd     2.5%      25%      50%      75%    97.5% n_eff Rhat

tau1 7.00 0.06 4.04 0.36 3.52 6.95 10.48 13.75 5000 1
tau2 2.01 0.00 0.05 1.93 1.98 2.01 2.04 2.10 4437 1
lambda 0.33 0.00 0.24 0.01 0.12 0.28 0.50 0.85 4709 1
lp__ -2118.42 0.03 1.38 -2121.92 -2119.08 -2118.05 -2117.39 -2116.82 2027 1

Samples were drawn using NUTS(diag_e) at Sat May 04 20:25:49 2019.
For each parameter, n_eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor on split chains (at
convergence, Rhat=1).

Thanks in advance.

Jason Liao

Stephen_Martin · May 7, 2019, 12:17am

I’m fairly certain log_mix does not do what you think it is doing. Iirc, in the stan manual in the section on mixtures/clustering models, it talks about this.

You need to loop over the thetas.
Currently, you are estimating:

p(\theta) = (1-\lambda)\prod_{i=1}^Np(\theta|\tau_1) + \lambda\prod_{i=1}^Np(\theta|\tau_2)

What you probably want is:

p(\theta) = \prod_{i=1}^N \left[(1-\lambda)p(\theta_i|\tau_1) + \lambda p(\theta_i|\tau_2)\right]

Which means you want to loop over each theta and call log_mix. E.g.,

for(n in 1:N){
  target += log_mix(lambda, double_exponential_lpdf(theta[n]|0, tau1), normal_lpdf(theta[n]|0, tau2));
}

Someone correct me if I’m wrong.

jg_liao · May 7, 2019, 3:10am

Stephen, this is amazing. I changed the code to what you suggested and got the answer I desperately wanted.

liao1 = ’

data {
int N; // number of observations
real theta[N];
}

parameters {
real<lower=0.0, upper= 20/sqrt(2)> tau1; //sqrt(2)*tau1 is the standard deviation of laplace distribution
real<lower=0.0, upper=20.0> tau2; //tau2 is the standard deviation of normal distribution
real<lower=0.0, upper=1.0> lambda;
}

model {

real term1;
real term2;

tau1 ~ uniform(0.0, 20/sqrt(2.));
tau2 ~ uniform(0.0, 20.0);

lambda ~ uniform(0.0, 1); //lambda<0.0 and lambda>=0 represents binary choice of laplace distribution and normal distribution

for(n in 1:N){
target += log_mix(lambda, double_exponential_lpdf(theta[n]|0, tau1), normal_lpdf(theta[n]|0, tau2));
}
}
’
N = 1000
theta = rnorm(N, 0, 2)

#theta = rexp(N)
#theta = ifelse(rnorm(N)>0, theta, -theta)

my.data = list(N=N, theta=theta)

stan(model_code = liao1, data=my.data, iter=10000, chains=1)

     mean se_mean   sd     2.5%      25%      50%      75%    97.5% n_eff

tau1 1.84 0.09 2.39 0.13 0.70 1.15 1.69 10.57 655
tau2 1.95 0.00 0.05 1.86 1.92 1.95 1.99 2.06 1236
lambda 0.05 0.00 0.05 0.00 0.01 0.04 0.08 0.18 767
lp__ -2082.41 0.06 1.58 -2086.27 -2083.28 -2082.10 -2081.16 -2080.35 678

One million thanks.

Jason

Topic		Replies	Views
Coping with varying curvature Modeling fitting-issues	9	750	August 24, 2019
Mixture Models for sets of observations Modeling	3	653	May 8, 2020
Mixture Bayesian Poisson Regression Model Modeling	45	3899	April 17, 2019
A mixture model with skewed mixing distributions Modeling	3	836	September 13, 2017
Beta binomial mixture model Modeling mixture	5	1309	March 4, 2021

Two component mixture model, strange result from stan

Related topics