What does this warning mean? Should I change the code, and if so, how?

fitting-issues

#1

There are lots of warnings on each run. I wonder what those warnings mean. Do they indicate anything I need to change? Thanks.


WARNING:theano.configdefaults:install mkl with `conda install mkl-service`: No module named mkl
WARNING:theano.tensor.blas:Using NumPy C-API based implementation for BLAS functions.
DIAGNOSTIC(S) FROM PARSER:
Warning: left-hand side variable (name=lamba) occurs on right-hand side of assignment, causing inefficient deep copy to avoid aliasing.

INFO:pystan:COMPILING THE C++ CODE FOR MODEL anon_model_859efd3f82721222b491da93cf8e482d NOW.
Cannot build msvcr library: "msvcr90d.dll" not found
WARNING:pystan:Multiprocessing is not supported on Windows with Python 2.X. Setting n_jobs=1

Gradient evaluation took 0.001 seconds
1000 transitions using 10 leapfrog steps per transition would take 10 seconds.
Adjust your expectations accordingly!

Informational Message: The current Metropolis proposal is about to be rejected because of the following issue:
Exception: validate transformed params: cov[k0__][k1__] is 0, but must be > 0!  (in 'unknown file name' at line 23)

If this warning occurs sporadically, such as for highly constrained variable types like covariance matrices, then the sampler is fine,
but if this warning occurs often then your model may be either severely ill-conditioned or misspecified.

Informational Message: The current Metropolis proposal is about to be rejected because of the following issue:
Exception: validate transformed params: cov[k0__][k1__] is not lower triangular; cov[k0__][k1__][1,2]=nan  (in 'unknown file name' at line 23)

Informational Message: The current Metropolis proposal is about to be rejected because of the following issue:
Exception: exponential_lpdf: Inverse scale parameter[1] is 0, but must be > 0!  (in 'unknown file name' at line 64)

[each of these messages, followed by the same "If this warning occurs sporadically..." note, recurs many times during warmup]

Iteration:   1 / 1000 [  0%]  (Warmup)
Iteration: 100 / 1000 [ 10%]  (Warmup)
Iteration: 200 / 1000 [ 20%]  (Warmup)
Iteration: 300 / 1000 [ 30%]  (Warmup)
Iteration: 400 / 1000 [ 40%]  (Warmup)

Iteration: 500 / 1000 [ 50%]  (Warmup)
Iteration: 501 / 1000 [ 50%]  (Sampling)
Iteration: 600 / 1000 [ 60%]  (Sampling)
Iteration: 700 / 1000 [ 70%]  (Sampling)
Iteration: 800 / 1000 [ 80%]  (Sampling)
Iteration: 900 / 1000 [ 90%]  (Sampling)
Iteration: 1000 / 1000 [100%]  (Sampling)

 Elapsed Time: 2.672 seconds (Warm-up)
               1.957 seconds (Sampling)
               4.629 seconds (Total)

WARNING:pystan:n_eff / iter below 0.001 indicates that the effective sample size has likely been overestimated
WARNING:pystan:Rhat for parameter L[1,1,1] is nan!
WARNING:pystan:Rhat for parameter cov[1,1,1,2] is nan!
WARNING:pystan:Rhat for parameter Omega[1,1,1] is nan!

WARNING:pystan:Rhat above 1.1 or below 0.9 indicates that the chains very likely have not mixed
Inference for Stan model: anon_model_859efd3f82721222b491da93cf8e482d.
1 chains, each with iter=1000; warmup=500; thin=1; 
post-warmup draws per chain=500, total post-warmup draws=500.

from __future__ import division
import pystan
import pymc3 as pm
from scipy import stats
import matplotlib.pyplot as plt
import theano.tensor as tt
import seaborn as sns
import numpy as np
import os 
import pandas as pd 

x = np.array([-18.59749343,9.29064092,  -6.74614328,  -7.58674294,   7.14575445,
   6.22886667,   3.66103665, -21.24150863,   2.936587,    -0.91788529])
y = np.array([-1.47716748,7.75337739,-4.81924601,-6.50501856, 0.83028804,10.83534704,
  5.76506376 ,-2.51054588 , 6.62877459  ,4.1892128 ])
x=np.array(zip(x,y)) 
c = np.array([1.3218437633965887e-17, 0.0006677388144037628, 0.0017339911877023338, 0.00026935063841600725, 0.004014659576775406, 0.009961389820779551, 0.004177733722366306, 3.8559281009422705e-21, 0.005608820830947575, 6.89137069068125e-05])
s = np.array([0,0])
st=[[4]]*10
model='''
data {
 int D; //number of dimensions
 int K; //number of gaussians
 int N; //number of data
 int  M; // patch number at time 
 vector[D] y[N]; // observation data
 real con[N]; //concentration
 vector[D] s;//oil spill location
 matrix[N,M] st;
//real se[M]; // spill end time
}

parameters {
 simplex[K] theta; //mixing proportions
 vector<lower=-10,upper=10>[D] v[K];
 //vector[D] v[K];
  vector<lower=0>[D] Dif[K];
 cholesky_factor_corr[D] L[K]; //cholesky factor of correlation matrix
}

transformed parameters{    
	cholesky_factor_cov[D,D] cov[K*M,N];
  vector<lower=0>[D] sigma[K*M,N]; // standard deviations  
  vector[D] mu[K*M,N];
  real<lower=-1,upper=1> ro[K];
  matrix[D,D] Omega[K];
  matrix[D,D] Sigma[K*M,N];  
  vector<lower=0>[N] lamba;  

  for (k in 1:K) {  
  Omega[k] = multiply_lower_tri_self_transpose(L[k]);
  ro[k]=Omega[k,2,1];
  for (m in 1:M){ 
     for (n in 1:N) {sigma[k+K*(m-1),n] = 0.05 + sqrt(2*(st[n][m])*Dif[k]);
      mu[k+K*(m-1),n] = s+v[k]*st[n][m];
      cov[k+K*(m-1),n] = diag_pre_multiply(sigma[k+K*(m-1),n],L[k]);
  Sigma[k+K*(m-1),n] = quad_form_diag(Omega[k], sigma[k+K*(m-1),n]);}
     }
     }
 
  for (i in 1:N) {
  lamba[i]=0;
  for (k in 1:K){
  for (m in 1:M)
  {lamba[i] = lamba[i]+(0.25/M)*(1./2./3.1415926/sqrt(Sigma[k+K*(m-1),i, 1, 1])/sqrt (Sigma[k+K*(m-1),i, 2, 2])/sqrt (1 - ro[k]*ro[k]))*exp (-1./2./(1 - ro[k]*ro[k])*(-(y[i, 1] - mu[k+K*(m-1),i, 1])*(y[i, 1] - mu[k+K*(m-1),i, 1])/Sigma[k+K*(m-1), i,1, 1] - (y[i, 2] - mu[k+K*(m-1), i,2])*(y[i, 2] - mu[k+K*(m-1), i,2])/Sigma[k+K*(m-1),i, 2, 2] 
  + 2.*ro[k]*(y[i, 1] - mu[k+K*(m-1),i, 1])*(y[i, 2] - mu[k+K*(m-1),i, 2])/sqrt (Sigma[k+K*(m-1), i,1, 1])/sqrt (Sigma[k+K*(m-1),i, 2, 2])));
 // +     theta[2]*(1./2./3.1415926/sqrt (Sigma[k+K*(m-1), i,1, 1])/sqrt (Sigma[2,i, 2, 2])/sqrt (1 - ro[2]*ro[2]))*exp (-1./2./(1 - ro[2]*ro[2])*(-(y[i, 1] - mu[2, i,1])*(y[i, 1] - mu[2, i,1])/Sigma[2, i,1, 1] - (y[i, 2] - mu[2,i, 2])*(y[i, 2] - mu[2, i,2])/Sigma[2,i, 2, 2] 
 //      + 2.*ro[2]*(y[i, 1] - mu[2, i,1])*(y[i, 2] - mu[2, i,2])/sqrt (Sigma[2, i,1, 1])/sqrt (Sigma[2, i,2, 2]))));}
}
}
lamba[i]=1/lamba[i];
}
}

model {
 real ps[K*M];
 //theta ~ dirichlet(rep_vector(2.0, K));
 for(k in 1:K){
 //v[k,1] ~ normal(3.67,5.0);//normal(0,4.1);//// uniform(340/100,380/100);//
 //v[k,2]~  normal(2,5.0);//normal(0,4.1);////uniform(3160/100,3190/100);//
 Dif[k] ~ exponential(0.5); //normal(0.5,5);//normal(5,10);////beta(2,5);//uniform(0.1,0.89); //////normal(1,2); 
 L[k] ~ lkj_corr_cholesky(2);// contain rho 
  con ~ exponential(lamba);
 }
 for (i in 1:N){
 con[i] ~ exponential(lamba[i]);
 }
 for (n in 1:N){
 for (k in 1:K){
 for (m in 1:M){
 ps[k+K*(m-1)] = log(theta[k]/M)+multi_normal_cholesky_lpdf(y[n] | mu[k+K*(m-1),n], cov[k+K*(m-1),n]); //increment log probability of the gaussian
 } 
 }
//increment_log_prob(log_sum_exp(ps));
//target += ps;
target += log_sum_exp(ps);
 }
  for(i in 1:N){
   target +=  - lamba[i]*con[i]+log(lamba[i]);
  }
 }

'''

dat={'D':2,'K':4,'N':10,'y':x,'con':c,'s':s,'st':st,'M':1}
fit = pystan.stan(model_code=model,data=dat,iter=1000,warmup=500, chains=1,init_r=0.5)
print(fit)
trace = fit.extract()
print trace.keys()


#2

See Ben’s answer here:


#3

Thanks. Does that mean the Metropolis warnings before warm-up do not matter?
Meanwhile, I also have other warnings before warm-up, like


Warning: left-hand side variable (name=lamba) occurs on right-hand side of assignment, causing inefficient deep copy to avoid aliasing.

Also, there are other warnings after the warm-up, like those below:


WARNING:pystan:n_eff / iter below 0.001 indicates that the effective sample size has likely been overestimated
WARNING:pystan:Rhat for parameter L[1,1,1] is nan!
WARNING:pystan:Rhat for parameter cov[1,1,1,2] is nan!
WARNING:pystan:Rhat for parameter Omega[1,1,1] is nan!

How could those parameters be nan? Thanks.


#4

The warning:

Warning: left-hand side variable (name=lamba) occurs on right-hand side of assignment, causing inefficient deep copy to avoid aliasing.

can be ignored - it isn’t a problem and the warning has been removed from later versions of the program.
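The copy it complains about is the usual rebinding-versus-in-place distinction. As a NumPy analogy (just an illustration of why `lamba[i] = lamba[i] + ...` forces a fresh copy, not the Stan internals):

```python
import numpy as np

x = np.zeros(4)
old = id(x)
x = x + 1.0   # "x occurs on the right-hand side": a brand-new array is allocated
assert id(x) != old

y = np.zeros(4)
old = id(y)
y += 1.0      # in-place update: same buffer, no copy
assert id(y) == old
```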


#5

Hi, thanks. How about the other warnings, like the parameters that are nan? I wonder whether that is caused by numerical underflow in the code or something similar.


#7

One possibility is that the standard deviation for those parameters is zero. If that’s the case, then the nan is benign. Otherwise, you have a serious problem.
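As a quick check (a sketch: the arrays below are made-up stand-ins for the draws you would pull out of `trace = fit.extract()`, not real output):

```python
import numpy as np

# Stand-ins for extracted posterior draws. L[1,1,1] is the top-left entry of a
# Cholesky factor of a correlation matrix, so it is identically 1 in every draw.
draws = {
    "L[1,1,1]": np.ones(500),
    "v[1,1]": np.random.normal(0.0, 1.0, size=500),
}

for name, d in draws.items():
    sd = d.std(ddof=1)
    status = "zero sd, nan Rhat is benign" if sd == 0 else "sd > 0, expect a real Rhat"
    print("%s: sd=%.3f (%s)" % (name, sd, status))
```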


#8

Hi,
I found that my output looks like the table below. It means the sd is 0, so does that matter? Also, I just want to make sure: if Rhat = 1, does that mean the parameter has converged? Thanks.


                  mean se_mean     sd   2.5%    25%    50%    75%  97.5%  n_eff   Rhat
L[1,1,1]           1.0     nan    0.0    1.0    1.0    1.0    1.0    1.0    nan    nan
L[2,1,1]           1.0     nan    0.0    1.0    1.0    1.0    1.0    1.0    nan    nan

#9

If the sd is 0, then you should be okay. That means that there is no variation in the parameter’s values at all – which is fine if the parameter is supposed to be constrained to a particular value (as is the case for the upper-triangular components of a Cholesky matrix). Here’s the definition of Rhat in PyStan’s _chains.pyx:

srhat = sqrt((var_between / var_within + n_samples / 2 - 1) / (n_samples / 2))

If the parameter values have no variation at all, then both var_between and var_within are zero, so srhat is nan.
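That behaviour can be reproduced with a simplified, single-chain version of the formula above (a sketch; `split_rhat` is a hypothetical helper, not part of PyStan's API):

```python
import numpy as np

def split_rhat(draws):
    """Split one chain in half and compare the halves (simplified split-Rhat)."""
    n = len(draws) // 2
    first, second = draws[:n], draws[n:2 * n]
    var_within = (first.var(ddof=1) + second.var(ddof=1)) / 2.0
    var_between = n * np.var([first.mean(), second.mean()], ddof=1)
    with np.errstate(divide="ignore", invalid="ignore"):
        # A constant parameter gives var_between / var_within = 0 / 0 -> nan
        return np.sqrt((var_between / var_within + n - 1) / n)

print(split_rhat(np.random.normal(size=1000)))  # close to 1 for well-behaved draws
print(split_rhat(np.ones(1000)))                # nan, exactly like L[1,1,1]
```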

It’s probably more accurate to say that if Rhat is too far from 1, then the parameter has not converged. An Rhat close to 1 is a good sign but not a perfect assurance.


#10

Hi, besides Rhat as a sign of convergence, are there other diagnostics that can indicate whether the model has converged?