Hi,
My model is a Gaussian mixture model, and it gives me reasonable parameter distributions, so now I want to use the posterior draws for posterior predictive checks.
The whole working model:
model='''
data {
  int D;            // number of dimensions
  int K;            // number of Gaussians
  int N;            // number of data points
  // int M;         // number of prediction points
  // int<lower=0, upper=1> prior_only;  // =1 to sample from the prior predictive
  vector[D] y[N];   // observations
  real con[N];      // concentrations
}
parameters {
  simplex[K] theta;              // mixing proportions
  ordered[D] mu[K];              // mixture component means
  cholesky_factor_corr[D] L[K];  // Cholesky factors of the correlation matrices
  vector<lower=0>[D] sigma[K];   // standard deviations
}
transformed parameters {
  cholesky_factor_cov[D, D] cov[K];
  real ro[K];             // correlation of each component
  matrix[D, D] Omega[K];  // correlation matrices
  matrix[D, D] Sigma[K];  // covariance matrices
  vector[N] lamba;        // exponential rates for con
  for (k in 1:K) {
    cov[k] = diag_pre_multiply(sigma[k], L[k]);
    Omega[k] = multiply_lower_tri_self_transpose(L[k]);
    Sigma[k] = quad_form_diag(Omega[k], sigma[k]);
    ro[k] = Omega[k, 2, 1];
  }
  // lamba[i] is the reciprocal of the two-component mixture density at y[i].
  for (i in 1:N) {
    real dens = 0;
    for (k in 1:K) {
      real z1 = (y[i, 1] - mu[k, 1]) / sqrt(Sigma[k, 1, 1]);
      real z2 = (y[i, 2] - mu[k, 2]) / sqrt(Sigma[k, 2, 2]);
      // bivariate normal density with correlation ro[k]; the quadratic
      // form enters the exponent with a minus sign
      dens += theta[k]
          / (2 * pi() * sqrt(Sigma[k, 1, 1] * Sigma[k, 2, 2] * (1 - ro[k]^2)))
          * exp(-0.5 * (z1^2 + z2^2 - 2 * ro[k] * z1 * z2) / (1 - ro[k]^2));
    }
    lamba[i] = 1 / dens;
  }
}
model {
  real ps[K];
  theta ~ dirichlet(rep_vector(2.0, K));  // one weight per component, so K, not D
  for (k in 1:K) {
    mu[1, k] ~ normal(3.67, 0.1);   // alternative: uniform(340/100, 380/100)
    mu[2, k] ~ normal(31.80, 0.1);  // alternative: uniform(3160/100, 3190/100)
    sigma[k] ~ exponential(0.5);    // alternatives: beta(2,5); uniform(0.1,0.89); normal(1,2)
    L[k] ~ lkj_corr_cholesky(2);    // prior on the correlation (rho)
  }
  for (n in 1:N) {
    for (k in 1:K) {
      // log of component k's contribution to the mixture at y[n]
      ps[k] = log(theta[k]) + multi_normal_cholesky_lpdf(y[n] | mu[k], cov[k]);
    }
    target += log_sum_exp(ps);
  }
  // Exponential likelihood for the concentrations. This is equivalent to
  // target += log(lamba[i]) - lamba[i] * con[i] for each i, and must appear
  // exactly once, outside the loop over k (otherwise it is counted K+1 times).
  con ~ exponential(lamba);
}
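As a side check (standalone Python, not part of the Stan program; all parameter values below are made up for illustration), the hand-written bivariate normal mixture density in transformed parameters and the log_sum_exp increment in the model block should both agree with a reference implementation:

```python
import numpy as np
from scipy.special import logsumexp
from scipy.stats import multivariate_normal

def bvn_pdf(y, mu, Sigma):
    # Textbook bivariate normal density in the sigma/rho parameterization
    # used by the Stan code (Sigma is the 2x2 covariance matrix).
    s1, s2 = np.sqrt(Sigma[0, 0]), np.sqrt(Sigma[1, 1])
    rho = Sigma[0, 1] / (s1 * s2)
    z1, z2 = (y[0] - mu[0]) / s1, (y[1] - mu[1]) / s2
    q = (z1**2 + z2**2 - 2 * rho * z1 * z2) / (1 - rho**2)
    return np.exp(-0.5 * q) / (2 * np.pi * s1 * s2 * np.sqrt(1 - rho**2))

# Hypothetical parameter values, chosen only for illustration.
theta = np.array([0.14, 0.86])
mu = [np.array([3.6, 3.7]), np.array([31.7, 31.75])]
Sigma = [np.array([[0.04, 0.01], [0.01, 0.09]]),
         np.array([[0.02, 0.00], [0.00, 0.05]])]
y_n = np.array([3.62, 3.68])

# Hand-coded density vs. scipy's multivariate normal:
for k in range(2):
    assert np.isclose(bvn_pdf(y_n, mu[k], Sigma[k]),
                      multivariate_normal(mu[k], Sigma[k]).pdf(y_n))

# ps[k] = log(theta[k]) + multi_normal_lpdf(y[n] | mu[k], Sigma[k]);
# target += log_sum_exp(ps) is the log mixture density, computed stably.
ps = [np.log(theta[k]) + multivariate_normal(mu[k], Sigma[k]).logpdf(y_n)
      for k in range(2)]
mix = sum(theta[k] * bvn_pdf(y_n, mu[k], Sigma[k]) for k in range(2))
assert np.isclose(logsumexp(ps), np.log(mix))
```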
Then I added the posterior predictive part:
generated quantities {
  int<lower=0, upper=K> comp[N];  // component label drawn for each replicate
  vector[D] ynew[N];              // posterior predictive draws
  for (n in 1:N) {
    comp[n] = categorical_rng(theta);
    ynew[n] = multi_normal_rng(mu[comp[n]], Sigma[comp[n]]);
  }
}
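To see what this block produces within a single iteration, here is the same logic in plain NumPy (a standalone sketch with hypothetical parameter values from one posterior draw, not the actual fit):

```python
import numpy as np

rng = np.random.default_rng(123)

# Parameter values from ONE hypothetical posterior draw.
theta = np.array([0.14, 0.86])
mu = [np.array([3.6, 3.7]), np.array([31.7, 31.75])]
Sigma = [np.eye(2) * 0.01, np.eye(2) * 0.001]
N = 9

# comp[n] = categorical_rng(theta): the component label is re-drawn from
# theta for every n, independently of y[n].
comp = rng.choice([1, 2], size=N, p=theta)

# ynew[n] = multi_normal_rng(mu[comp[n]], Sigma[comp[n]])
ynew = np.array([rng.multivariate_normal(mu[c - 1], Sigma[c - 1]) for c in comp])

# Within a single draw comp varies over n, but every comp[n] has the same
# marginal distribution Categorical(theta), so summaries averaged over all
# iterations look identical across n.
```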
The comp[n] results are essentially the same for every n, and the ynew[n] results all look alike. I thought comp[n] should reflect the mixing proportions theta. Also, ynew[n,1] and ynew[n,2] appear to sit at the same location, but in reality one patch is at mu[1,1] = 3.61, mu[2,1] = 31.68 and the other patch is at mu[1,2] = 3.73, mu[2,2] = 31.74. I am confused by the fit result. Can anyone help me interpret the meaning of comp[n] and ynew[n]? Why are they not similar to the real data? And how do I write the code correctly to predict new cases? Thanks. The fit summary:
mean se_mean sd 2.5% 25% 50% 75% 97.5% n_eff Rhat
theta[1] 0.14 1.3e-3 0.1 0.02 0.06 0.12 0.19 0.37 5298.0 1.0
theta[2] 0.86 1.3e-3 0.1 0.63 0.81 0.88 0.94 0.98 5298.0 1.0
mu[1,1] 3.61 1.4e-3 0.08 3.45 3.56 3.62 3.67 3.77 3718.0 1.0
mu[2,1] 31.68 9.2e-4 0.05 31.57 31.65 31.69 31.72 31.77 2916.0 1.0
mu[1,2] 3.73 1.0e-3 0.08 3.56 3.67 3.73 3.78 3.89 6753.0 1.0
mu[2,2] 31.74 6.6e-4 0.03 31.7 31.73 31.74 31.75 31.81 1646.0 1.0
Sigma[1,1,1] 11.94 0.53 25.1 0.03 0.9 3.7 12.34 74.27 2227.0 1.0
Sigma[2,1,1] 421.37 1.15 95.6 274.68 353.47 406.66 473.11 643.65 6886.0 1.0
Sigma[1,2,1] 0.34 0.17 12.18 -24.38 -3.51 0.22 4.05 25.46 5188.0 1.0
Sigma[2,2,1] 0.14 9.5e-3 0.38 -0.48 -0.08 0.09 0.3 1.05 1611.0 1.0
Sigma[1,1,2] 0.34 0.17 12.18 -24.38 -3.51 0.22 4.05 25.46 5188.0 1.0
Sigma[2,1,2] 0.14 9.5e-3 0.38 -0.48 -0.08 0.09 0.3 1.05 1611.0 1.0
Sigma[1,2,2] 123.15 1.36 65.23 57.68 81.37 104.6 143.89 300.82 2284.0 1.0
Sigma[2,2,2] 8.5e-4 3.1e-5 1.2e-3 3.6e-4 4.6e-4 5.6e-4 7.9e-4 3.4e-3 1416.0 1.0
comp[1] 1.86 3.7e-3 0.35 1.0 2.0 2.0 2.0 2.0 9034.0 1.0
comp[2] 1.86 3.6e-3 0.35 1.0 2.0 2.0 2.0 2.0 9346.0 1.0
comp[3] 1.86 3.6e-3 0.35 1.0 2.0 2.0 2.0 2.0 9260.0 1.0
comp[4] 1.86 3.6e-3 0.35 1.0 2.0 2.0 2.0 2.0 9097.0 1.0
comp[5] 1.87 3.5e-3 0.34 1.0 2.0 2.0 2.0 2.0 9177.0 1.0
comp[6] 1.86 3.6e-3 0.35 1.0 2.0 2.0 2.0 2.0 9257.0 1.0
comp[7] 1.87 3.7e-3 0.34 1.0 2.0 2.0 2.0 2.0 8588.0 1.0
comp[8] 1.86 3.7e-3 0.35 1.0 2.0 2.0 2.0 2.0 8774.0 1.0
comp[9] 1.86 3.8e-3 0.34 1.0 2.0 2.0 2.0 2.0 8312.0 1.0
ynew[1,1] 27.68 0.23 21.46 -7.49 9.03 27.37 42.79 70.85 8906.0 1.0
ynew[2,1] 27.81 0.22 21.33 -6.86 9.18 27.81 42.86 70.43 9834.0 1.0
ynew[3,1] 27.77 0.22 21.43 -7.5 9.06 27.39 42.94 70.31 9107.0 1.0
ynew[4,1] 27.56 0.22 21.27 -8.6 9.2 27.78 42.3 70.07 9422.0 1.0
ynew[5,1] 27.87 0.23 21.71 -8.66 9.48 27.38 43.1 71.04 8536.0 1.0
ynew[6,1] 27.65 0.22 21.28 -7.14 9.25 27.51 42.41 70.57 9018.0 1.0
ynew[7,1] 27.94 0.22 21.46 -7.97 9.8 27.79 42.84 71.66 9460.0 1.0
ynew[8,1] 27.56 0.23 21.42 -7.6 8.95 27.35 42.61 70.98 8986.0 1.0
ynew[9,1] 27.96 0.22 21.23 -6.17 9.69 27.49 43.06 70.67 9351.0 1.0
ynew[1,2] 27.81 0.11 10.62 -6.6 31.71 31.73 31.76 31.83 9166.0 1.0
ynew[2,2] 27.88 0.11 10.58 -5.56 31.71 31.73 31.76 31.83 9482.0 1.0
ynew[3,2] 27.72 0.11 10.77 -6.56 31.71 31.73 31.76 31.83 9189.0 1.0
ynew[4,2] 27.87 0.11 10.6 -6.38 31.71 31.73 31.76 31.83 8993.0 1.0
ynew[5,2] 28.09 0.11 10.33 -5.47 31.71 31.74 31.76 31.83 9255.0 1.0
ynew[6,2] 27.87 0.11 10.54 -6.01 31.71 31.73 31.76 31.83 9375.0 1.0
ynew[7,2] 28.03 0.11 10.42 -5.67 31.71 31.73 31.76 31.83 8805.0 1.0
ynew[8,2] 27.77 0.11 10.68 -6.63 31.71 31.73 31.76 31.83 9113.0 1.0
ynew[9,2] 27.85 0.11 10.66 -6.43 31.71 31.73 31.76 31.83 9211.0 1.0
lp__ -118.2 0.05 2.45 -123.8 -119.6 -117.9 -116.5 -114.4 2377.0 1.0
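One thing I did notice: the comp[n] summaries line up with theta, since the mean of a draw over {1, 2} with probabilities theta is 1*theta[1] + 2*theta[2] (a quick arithmetic check in Python, using the posterior means from the table):

```python
theta = [0.14, 0.86]  # posterior means of theta[1], theta[2] from the table
expected_comp = 1 * theta[0] + 2 * theta[1]
print(round(expected_comp, 2))  # 1.86, matching the comp[n] rows
```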