Posterior Predictive Checks

Hi,

My model is a Gaussian mixture model, and it gives me reasonable parameter distributions. Now I want to use the fitted parameters for posterior predictive checks.

Here is the whole working model:


model = '''
data {
  int D; // number of dimensions
  int K; // number of Gaussians
  int N; // number of data points
  // int M; // number of prediction points
  // int<lower=0,upper=1> prior_only; // =1 to get data from the prior predictive
  vector[D] y[N]; // observed data
  real con[N];    // concentration
}

parameters {
  simplex[K] theta;             // mixing proportions
  ordered[D] mu[K];             // mixture component means
  cholesky_factor_corr[D] L[K]; // Cholesky factors of the correlation matrices
  vector<lower=0>[D] sigma[K];  // standard deviations
}


transformed parameters {
  cholesky_factor_cov[D,D] cov[K];
  real ro[K];
  matrix[D,D] Omega[K];
  matrix[D,D] Sigma[K];
  vector[N] lamba;
  for (k in 1:K) {
    cov[k] = diag_pre_multiply(sigma[k], L[k]);
    Omega[k] = multiply_lower_tri_self_transpose(L[k]);
    Sigma[k] = quad_form_diag(Omega[k], sigma[k]);
    ro[k] = Omega[k, 2, 1];
  }
  // lamba[i] combines the theta-weighted bivariate normal terms at y[i];
  // it is used below as the exponential rate for con[i]
  for (i in 1:N) {
    lamba[i] = 1 / (theta[1]
        * (1. / 2. / 3.1415926 / sqrt(Sigma[1, 1, 1]) / sqrt(Sigma[1, 2, 2]) / sqrt(1 - ro[1] * ro[1]))
        * exp(-1. / 2. / (1 - ro[1] * ro[1]) * (-(y[i, 1] - mu[1, 1]) * (y[i, 1] - mu[1, 1]) / Sigma[1, 1, 1]
            - (y[i, 2] - mu[1, 2]) * (y[i, 2] - mu[1, 2]) / Sigma[1, 2, 2]
            + 2. * ro[1] * (y[i, 1] - mu[1, 1]) * (y[i, 2] - mu[1, 2]) / sqrt(Sigma[1, 1, 1]) / sqrt(Sigma[1, 2, 2])))
      + theta[2]
        * (1. / 2. / 3.1415926 / sqrt(Sigma[2, 1, 1]) / sqrt(Sigma[2, 2, 2]) / sqrt(1 - ro[2] * ro[2]))
        * exp(-1. / 2. / (1 - ro[2] * ro[2]) * (-(y[i, 1] - mu[2, 1]) * (y[i, 1] - mu[2, 1]) / Sigma[2, 1, 1]
            - (y[i, 2] - mu[2, 2]) * (y[i, 2] - mu[2, 2]) / Sigma[2, 2, 2]
            + 2. * ro[2] * (y[i, 1] - mu[2, 1]) * (y[i, 2] - mu[2, 2]) / sqrt(Sigma[2, 1, 1]) / sqrt(Sigma[2, 2, 2]))));
  }
}

model {
  real ps[K];
  theta ~ dirichlet(rep_vector(2.0, D));
  for (k in 1:K) {
    mu[1,k] ~ normal(3.67, 0.1);   // alternative: uniform(340/100, 380/100)
    mu[2,k] ~ normal(31.80, 0.1);  // alternative: uniform(3160/100, 3190/100)
    sigma[k] ~ exponential(0.5);   // alternatives: beta(2,5); uniform(0.1,0.89); normal(1,2)
    L[k] ~ lkj_corr_cholesky(2);   // contains rho
    con ~ exponential(lamba);
  }

  for (n in 1:N) {
    for (k in 1:K) {
      ps[k] = log(theta[k]) + multi_normal_cholesky_lpdf(y[n] | mu[k], cov[k]); // log probability under component k
    }
    target += log_sum_exp(ps);
  }
  for (i in 1:N) {
    target += -lamba[i] * con[i] + log(lamba[i]);
  }
}

Then I added the posterior predictive check part:


generated quantities {
  int<lower=0,upper=K> comp[N];
  vector[2] ynew[N];
  for (n in 1:N) {
    comp[n] = categorical_rng(theta);
    ynew[n] = multi_normal_rng(mu[comp[n]], Sigma[comp[n]]);
  }
}
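
For the check itself I plan to compare the ynew draws with the observed y, roughly along these lines (just a sketch with made-up stand-in arrays; in the real script y is the data passed to Stan and ynew is extracted from the fit, e.g. fit.extract()["ynew"] with PyStan 2):

import numpy as np
import matplotlib.pyplot as plt

# stand-ins so the sketch runs on its own; replace with the real data and the
# ynew draws extracted from the fit
rng = np.random.default_rng(1)
y = np.column_stack([rng.normal(3.7, 0.1, 50), rng.normal(31.7, 0.1, 50)])  # (N, D)
ynew = rng.normal(loc=y, scale=0.1, size=(200, 50, 2))                      # (draws, N, D)

# overlay a handful of posterior predictive draws on the observed points
for d in rng.choice(ynew.shape[0], size=20, replace=False):
    plt.scatter(ynew[d, :, 0], ynew[d, :, 1], color="grey", alpha=0.2, s=10)
plt.scatter(y[:, 0], y[:, 1], color="red", s=15, label="observed y")
plt.xlabel("y[, 1]")
plt.ylabel("y[, 2]")
plt.legend()
plt.show()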

The comp[n] results are the same for every n, and the ynew[n] results are all similar. I think the distribution of comp[n] should match the mixing proportions theta of the Gaussian components. The ynew[n,1] and ynew[n,2] values seem to sit at the same location, but actually one component is at mu[1,1] = 3.61, mu[2,1] = 31.68 and the other at mu[1,2] = 3.73, mu[2,2] = 31.74. I am confused by the fit result. Can anyone help me interpret the meaning of comp[n] and ynew[n]? Why are they not similar to the real data? How can I write the code correctly to predict a new case? Thanks.


               mean se_mean     sd   2.5%    25%    50%    75%  97.5%  n_eff   Rhat
theta[1]       0.14  1.3e-3    0.1   0.02   0.06   0.12   0.19   0.37 5298.0    1.0
theta[2]       0.86  1.3e-3    0.1   0.63   0.81   0.88   0.94   0.98 5298.0    1.0
mu[1,1]        3.61  1.4e-3   0.08   3.45   3.56   3.62   3.67   3.77 3718.0    1.0
mu[2,1]       31.68  9.2e-4   0.05  31.57  31.65  31.69  31.72  31.77 2916.0    1.0
mu[1,2]        3.73  1.0e-3   0.08   3.56   3.67   3.73   3.78   3.89 6753.0    1.0
mu[2,2]       31.74  6.6e-4   0.03   31.7  31.73  31.74  31.75  31.81 1646.0    1.0
Sigma[1,1,1]  11.94    0.53   25.1   0.03    0.9    3.7  12.34  74.27 2227.0    1.0
Sigma[2,1,1] 421.37    1.15   95.6 274.68 353.47 406.66 473.11 643.65 6886.0    1.0
Sigma[1,2,1]   0.34    0.17  12.18 -24.38  -3.51   0.22   4.05  25.46 5188.0    1.0
Sigma[2,2,1]   0.14  9.5e-3   0.38  -0.48  -0.08   0.09    0.3   1.05 1611.0    1.0
Sigma[1,1,2]   0.34    0.17  12.18 -24.38  -3.51   0.22   4.05  25.46 5188.0    1.0
Sigma[2,1,2]   0.14  9.5e-3   0.38  -0.48  -0.08   0.09    0.3   1.05 1611.0    1.0
Sigma[1,2,2] 123.15    1.36  65.23  57.68  81.37  104.6 143.89 300.82 2284.0    1.0
Sigma[2,2,2] 8.5e-4  3.1e-5 1.2e-3 3.6e-4 4.6e-4 5.6e-4 7.9e-4 3.4e-3 1416.0    1.0
comp[1]        1.86  3.7e-3   0.35    1.0    2.0    2.0    2.0    2.0 9034.0    1.0
comp[2]        1.86  3.6e-3   0.35    1.0    2.0    2.0    2.0    2.0 9346.0    1.0
comp[3]        1.86  3.6e-3   0.35    1.0    2.0    2.0    2.0    2.0 9260.0    1.0
comp[4]        1.86  3.6e-3   0.35    1.0    2.0    2.0    2.0    2.0 9097.0    1.0
comp[5]        1.87  3.5e-3   0.34    1.0    2.0    2.0    2.0    2.0 9177.0    1.0
comp[6]        1.86  3.6e-3   0.35    1.0    2.0    2.0    2.0    2.0 9257.0    1.0
comp[7]        1.87  3.7e-3   0.34    1.0    2.0    2.0    2.0    2.0 8588.0    1.0
comp[8]        1.86  3.7e-3   0.35    1.0    2.0    2.0    2.0    2.0 8774.0    1.0
comp[9]        1.86  3.8e-3   0.34    1.0    2.0    2.0    2.0    2.0 8312.0    1.0
ynew[1,1]     27.68    0.23  21.46  -7.49   9.03  27.37  42.79  70.85 8906.0    1.0
ynew[2,1]     27.81    0.22  21.33  -6.86   9.18  27.81  42.86  70.43 9834.0    1.0
ynew[3,1]     27.77    0.22  21.43   -7.5   9.06  27.39  42.94  70.31 9107.0    1.0
ynew[4,1]     27.56    0.22  21.27   -8.6    9.2  27.78   42.3  70.07 9422.0    1.0
ynew[5,1]     27.87    0.23  21.71  -8.66   9.48  27.38   43.1  71.04 8536.0    1.0
ynew[6,1]     27.65    0.22  21.28  -7.14   9.25  27.51  42.41  70.57 9018.0    1.0
ynew[7,1]     27.94    0.22  21.46  -7.97    9.8  27.79  42.84  71.66 9460.0    1.0
ynew[8,1]     27.56    0.23  21.42   -7.6   8.95  27.35  42.61  70.98 8986.0    1.0
ynew[9,1]     27.96    0.22  21.23  -6.17   9.69  27.49  43.06  70.67 9351.0    1.0
ynew[1,2]     27.81    0.11  10.62   -6.6  31.71  31.73  31.76  31.83 9166.0    1.0
ynew[2,2]     27.88    0.11  10.58  -5.56  31.71  31.73  31.76  31.83 9482.0    1.0
ynew[3,2]     27.72    0.11  10.77  -6.56  31.71  31.73  31.76  31.83 9189.0    1.0
ynew[4,2]     27.87    0.11   10.6  -6.38  31.71  31.73  31.76  31.83 8993.0    1.0
ynew[5,2]     28.09    0.11  10.33  -5.47  31.71  31.74  31.76  31.83 9255.0    1.0
ynew[6,2]     27.87    0.11  10.54  -6.01  31.71  31.73  31.76  31.83 9375.0    1.0
ynew[7,2]     28.03    0.11  10.42  -5.67  31.71  31.73  31.76  31.83 8805.0    1.0
ynew[8,2]     27.77    0.11  10.68  -6.63  31.71  31.73  31.76  31.83 9113.0    1.0
ynew[9,2]     27.85    0.11  10.66  -6.43  31.71  31.73  31.76  31.83 9211.0    1.0
lp__         -118.2    0.05   2.45 -123.8 -119.6 -117.9 -116.5 -114.4 2377.0    1.0

I think I know the reason for ynew: I assigned the wrong prior to mu. I am still confused about comp…

[Figure 2: posterior plot of comp]
Why does the comp distribution look like that? I think it should be either 1 or 2…

It is only 1 or 2. That is a problem with the KDE.

If you add arr.astype(int) to your plotting script you get a histogram (which is better), but in this case a bar plot would be the correct choice. You would need to do it manually.
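
Roughly like this; the draws here are just a stand-in array so the snippet runs on its own (in a real script you would take them from the fit, e.g. fit.extract()["comp"] with PyStan 2):

import numpy as np
import matplotlib.pyplot as plt

# stand-in for the posterior draws of comp[1]; in practice extract them from the fit
rng = np.random.default_rng(0)
draws = rng.choice([1, 2], size=4000, p=[0.14, 0.86])

# count each component label and draw a bar plot instead of a KDE
labels = np.arange(1, draws.max() + 1)
counts = np.bincount(draws, minlength=draws.max() + 1)[1:]

plt.bar(labels, counts / counts.sum(), width=0.4)
plt.xticks(labels)
plt.xlabel("comp[1]")
plt.ylabel("posterior probability")
plt.show()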

What is KDE? So you mean the result is right and the problem is only in the plotting?

Kernel Density Estimation

If you run np.unique on the draws you see only 1 and 2. You can also see that in the quantiles of the summary.
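
For example (the small array here just stands in for the comp draws you extract from the fit):

import numpy as np

# comp draws extracted from the fit; a stand-in array so this runs on its own
draws = np.array([2., 2., 1., 2., 2., 1., 2., 2.])
print(np.unique(draws))  # [1. 2.] -- only the two component labels appear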

Whether the results are right I cannot answer; I don’t even know what data you have.

The priors seem really tight, but I cannot say they are “wrong”.

It does make sense! Thanks!