Joint Modeling of Beta-binomial Outcome using Gauss Copula;inv_Phi: Probability variable is -2.14096e+06

Hi all!
I‘m working on joint modeling of beta-binomial outcomes using Gauss Copula.
But stan warns that

Chain 1 Informational Message: The current Metropolis proposal is about to be rejected because of the following issue: 
Chain 1 Exception: Exception: inv_Phi: Probability variable is -2.14096e+06, but must be in the interval [0, 1] (in 'C:/Users/AAA/AppData/Local/Temp/RtmpKYJF6n/model-36d45da92db6.stan', line 19, column 6 to column 57) (in 'C:/Users/AAA/AppData/Local/Temp/RtmpKYJF6n/model-36d45da92db6.stan', line 129, column 1 to line 130, column 90) 
Chain 1 If this warning occurs sporadically, such as for highly constrained variable types like covariance matrices, then the sampler is fine, Chain 1 but if this warning occurs often then your model may be either severely ill-conditioned or misspecified.

My model is:
y_1i~Beta-binomial(n_1i,alpha1,beta1);
y_2i~Beta-binomial(n_2i,alpha2,beta2);
(y1,y2)~Gauss_copula(L);

Here is my stan code:
The functions in the functions block except the beta_binomial_marginal could be found :here.
And the beta_binomial_marginal function is a modified version of binomial_marginal in the hyperlink.

functions{
  //
array[] matrix beta_binomial_marginal(array[,] int num, array[,] int den, matrix alpha, matrix beta, 
                                 matrix u_raw) {
  int N = rows(alpha);
  int J = cols(alpha);
  
  array[2] matrix[N, J] rtn;
  
  for (j in 1 : J) {
    for (n in 1 : N) {
      real Ubound = beta_binomial_cdf(num[n, j] | den[n, j], alpha[n,j], beta[n,j]);
      real Lbound = 0;
      if (num[n, j] > 0) {
        Lbound = beta_binomial_cdf(num[n, j]-1 | den[n, j], alpha[n,j], beta[n,j]);
      }
      
      real UmL = Ubound - Lbound;  // Ubound>0,UmL>0  
      rtn[1][n, j] = inv_Phi(Lbound + UmL * u_raw[n, j]);
      if(is_inf(rtn[1][n, j])) rtn[1][n, j] =negative_infinity();
      
      if(is_nan(log(UmL))) rtn[2][n, j] = negative_infinity();
      else rtn[2][n, j] = log(UmL);
    }
  }
  
  return rtn;
}

real multi_normal_cholesky_copula_lpdf(matrix U, matrix L) {
  int N = rows(U);
  int J = cols(U);
  matrix[J, J] Gammainv = chol2inv(L);
  return  - 0.5 * sum(add_diag(Gammainv, -1.0) .* crossprod(U))-N * sum(log(diagonal(L)));
}



real centered_gaussian_copula_cholesky_lpdf(array[,] matrix marginals, matrix L) {
  // Extract dimensions of final outcome matrix
  int N = rows(marginals[1][1]);
  int J = rows(L);
  matrix[N, J] U;
  
  // Iterate through marginal arrays, concatenating the outcome matrices by column
  // and aggregating the log-likelihoods (from continuous marginals) and jacobian
  // adjustments (from discrete marginals)
  real adj = 0;
  int pos = 1;
  for (m in 1 : size(marginals)) {
    int curr_cols = cols(marginals[m][1]);
    
    U[ : , pos : (pos + curr_cols - 1)] = marginals[m][1];
    
    adj += sum(marginals[m][2]);
    
    pos += curr_cols;
  }

  // Return the sum of the log-probability for copula outcomes and jacobian adjustments
  return multi_normal_cholesky_copula_lpdf(U | L) + adj;
}

  
}

data{
  int<lower=0> N;
  array[N,1] int<lower=0> y1;
  array[N,1] int<lower=0> n1;
  array[N,1] int<lower=0> y2;
  array[N,1] int<lower=0> n2;
}


parameters{
  real<lower=0,upper=1> a1; //a1=alpha1/(alpha1+beta1) 
  real<lower=1> b1;         //b1=alpha1+beta1
  real<lower=0,upper=1> a2;
  real<lower=1>  b2;
  
  
  
  matrix<lower=0,upper=1>[N,1] u1;  
  matrix<lower=0,upper=1>[N,1] u2;  

  
  cholesky_factor_corr[2] L;
  
}

transformed parameters{
  real<lower=0>  alpha1; 
  real<lower=0>  alpha2;
  real<lower=0>  beta1;
  real<lower=0>  beta2;
  
  alpha1=a1*b1;
  beta1=b1-alpha1;
  
  alpha2=a2*b2;
  beta2=b2-alpha2;
  

}

model{
  //prior for a1,a2,b1,b2,rho
  
  b1~pareto(1, 1.5);//
  b2~pareto(1, 1.5);
  L ~ lkj_corr_cholesky(1.0);
  
  matrix[N,1] alpha_j1;
  matrix[N,1] alpha_j2;
  matrix[N,1] beta_j1;
  matrix[N,1] beta_j2;
  
 
  for(i in 1:N){
    alpha_j1[i,1]=alpha1;
    alpha_j2[i,1]=alpha2;
    beta_j1[i,1]=beta1;
    beta_j2[i,1]=beta2;

  }
  

 {beta_binomial_marginal(y1,n1,alpha_j1,beta_j1,u1),
  beta_binomial_marginal(y2,n2,alpha_j2,beta_j2,u2)}~centered_gaussian_copula_cholesky(L);
  //print("target=",target());
}

Here is my data and R code:

data<-read.table(text = "
37     50 28  44
49     65 33  55
59     79 46  64
25     33 24  21
59     79 45  64
13     18 12  14
25     34 17  31
27     35 23  34
107    141 99 108
76    101 64  78")

colnames(data)<-c("n1","n2","y1","y2")
stan_data<-list(N=dim(data)[1],
                y1=as.matrix(data$y1),
                n1=as.matrix(data$n1),
                y2=as.matrix(data$y2),
                n2=as.matrix(data$n2))
fit_betabinomial_gauss <- mod_betabinomial_gauss$sample(data = stan_data,#init = lapply(1:4,init_gen),
                                                        chains = 4,
                                                        parallel_chains = 4,
                                                        refresh = 0, 
                                                        step_size = 0.01,adapt_delta=0.99,max_treedepth = 20)

Line 19 in the stan code is rtn[1][n, j] = inv_Phi(Lbound + UmL * u_raw[n, j]);

The Ubound and Lbound are calculated from the bulid-in function beta_binomial_cdf and u_raw(u1,u2) is bounded from 0 to 1 in the parameters block. It seems that Probability variable Lbound + UmL * u_raw[n, j] is bounded from 0 to 1 but in the warning it equals -2.14096e+06 .
Here is a screenshot in my testing:


If anyone has advice on how I can avoid getting this error, that’d be really appreciated.

It seems that there are some problems with the bulit-in function beta_binomial_cdf

In order to figure out the reason,I modified my beta_binomial_marginal function like this:
(stan will output the variables if the probability variable Lbound + UmL * u_raw[n, j] is out of boundary)

array[] matrix beta_binomial_marginal(array[,] int num, array[,] int den, matrix alpha, matrix beta, 
                                 matrix u_raw) {
  int N = rows(alpha);
  int J = cols(alpha);
  
  array[2] matrix[N, J] rtn;
  
  for (j in 1 : J) {
    for (n in 1 : N) {
      real Ubound = beta_binomial_cdf(num[n, j] | den[n, j], alpha[n,j], beta[n,j]);
      real Lbound = 0;
      if (num[n, j] > 0) {
        Lbound = beta_binomial_cdf(num[n, j]-1 | den[n, j], alpha[n,j], beta[n,j]);
      }
      
      real UmL = Ubound - Lbound;  // Ubound>0,UmL>0  
      
      if(Lbound + UmL * u_raw[n, j]<0 || Lbound + UmL * u_raw[n, j]>1) {
        vector[7] diagno;
        diagno[1]=num[n, j];
        diagno[2]=den[n, j];
        diagno[3]=alpha[n,j];
        diagno[4]=beta[n,j];
        diagno[5]=Ubound;
        diagno[6]=Lbound;
        diagno[7]=u_raw[n, j];
        print("diagno=",diagno);
        
        }
      
      rtn[1][n, j] = inv_Phi(Lbound + UmL * u_raw[n, j]);
      if(is_inf(rtn[1][n, j])) rtn[1][n, j] =negative_infinity();
      
      if(is_nan(log(UmL))) rtn[2][n, j] = negative_infinity();
      else rtn[2][n, j] = log(UmL);
    }
  }
  
  return rtn;
}

When I run the model, it prints that in the console:

Chain 2 diagno=[15,17,1,2.1072e-13,-0.00105485,-0.00105485,0.13774] 
Chain 2 Informational Message: The current Metropolis proposal is about to be rejected because of the following issue:
Chain 2 Exception: Exception: inv_Phi: Probability variable is -0.00105485, but must be in the interval [0, 1] (in 'C:/Users/AAA/AppData/Local/Temp/RtmpKYJF6n/model-36d47815358d.stan', line 33, column 6 to column 57) (in 'C:/Users/AAA/AppData/Local/Temp/RtmpKYJF6n/model-36d47815358d.stan', line 143, column 1 to line 144, column 90)
Chain 2 If this warning occurs sporadically, such as for highly constrained variable types like covariance matrices, then the sampler is fine,
Chain 2 but if this warning occurs often then your model may be either severely ill-conditioned or misspecified.
Chain 2 
Chain 3 diagno=[116,242,995.687,59.8834,-9.10383e-15,7.64944e-14,0.996693] 
Chain 3 Informational Message: The current Metropolis proposal is about to be rejected because of the following issue:
Chain 3 Exception: Exception: inv_Phi: Probability variable is -8.82072e-15, but must be in the interval [0, 1] (in 'C:/Users/AAA/AppData/Local/Temp/RtmpKYJF6n/model-36d47815358d.stan', line 33, column 6 to column 57) (in 'C:/Users/AAA/AppData/Local/Temp/RtmpKYJF6n/model-36d47815358d.stan', line 143, column 1 to line 144, column 90)
Chain 3 If this warning occurs sporadically, such as for highly constrained variable types like covariance matrices, then the sampler is fine,
Chain 3 but if this warning occurs often then your model may be either severely ill-conditioned or misspecified.
Chain 3 
Chain 1 diagno=[37,41,1,1.53979e-11,-2.88417e-05,-2.88417e-05,0.999998] 
Chain 1 Informational Message: The current Metropolis proposal is about to be rejected because of the following issue:
Chain 1 Exception: Exception: inv_Phi: Probability variable is -2.88417e-05, but must be in the interval [0, 1] (in 'C:/Users/AAA/AppData/Local/Temp/RtmpKYJF6n/model-36d47815358d.stan', line 33, column 6 to column 57) (in 'C:/Users/AAA/AppData/Local/Temp/RtmpKYJF6n/model-36d47815358d.stan', line 143, column 1 to line 144, column 90)
Chain 1 If this warning occurs sporadically, such as for highly constrained variable types like covariance matrices, then the sampler is fine,
Chain 1 but if this warning occurs often then your model may be either severely ill-conditioned or misspecified.
Chain 1 
Chain 3 finished in 68.1 seconds.
Chain 2 finished in 76.7 seconds.
Chain 1 finished in 80.3 seconds.
Chain 4 finished in 95.5 seconds.

All 4 chains finished successfully.
Mean chain execution time: 80.1 seconds.
Total execution time: 95.9 seconds.

In the warning message, Lbound and Ubound are less than 0, but they are calculated from the bulit-in function beta_binomial_cdf, so I wonder that there might be some problems with beta_binomial_cdf. It seems that this problem would occur when the parameter alpha and beta is on the boundary(For example, beta is very close to 0 in the first line of the warning ).
What’s more, can I ignore those warnings since it occurs not so frequently?

  real inv_Phi_log_lambda(real log_p) {
    vector[8] a
        = [3.3871328727963666080e+00, 1.3314166789178437745e+02,
           1.9715909503065514427e+03, 1.3731693765509461125e+04,
           4.5921953931549871457e+04, 6.7265770927008700853e+04,
           3.3430575583588128105e+04, 2.5090809287301226727e+03]';
    vector[7] b
        = [4.2313330701600911252e+01, 6.8718700749205790830e+02,
           5.3941960214247511077e+03, 2.1213794301586595867e+04,
           3.9307895800092710610e+04, 2.8729085735721942674e+04,
           5.2264952788528545610e+03]';
    vector[8] c
        = [1.42343711074968357734e+00, 4.63033784615654529590e+00,
           5.76949722146069140550e+00, 3.64784832476320460504e+00,
           1.27045825245236838258e+00, 2.41780725177450611770e-01,
           2.27238449892691845833e-02, 7.74545014278341407640e-04]';
    vector[7] d
        = [2.05319162663775882187e+00, 1.67638483018380384940e+00,
           6.89767334985100004550e-01, 1.48103976427480074590e-01,
           1.51986665636164571966e-02, 5.47593808499534494600e-04,
           1.05075007164441684324e-09]';
    vector[8] e
        = [6.65790464350110377720e+00, 5.46378491116411436990e+00,
           1.78482653991729133580e+00, 2.96560571828504891230e-01,
           2.65321895265761230930e-02, 1.24266094738807843860e-03,
           2.71155556874348757815e-05, 2.01033439929228813265e-07]';
    vector[7] f
        = [5.99832206555887937690e-01, 1.36929880922735805310e-01,
           1.48753612908506148525e-02, 7.86869131145613259100e-04,
           1.84631831751005468180e-05, 1.42151175831644588870e-07,
           2.04426310338993978564e-15]';
    real log_q = log_p <= log(0.5) ? log_diff_exp(log(1), log_sum_exp(log_p, log(0.5))) : log_diff_exp(log_p, log(0.5));
    int log_q_sign = log_p <= log(0.5) ? -1 : 1;
    real val;

    if (log_p == log(1)) {
      return positive_infinity();
    }

    if (log_q <= log(.425)) {
      real log_r;
      real log_rtn;
      real log_agg_a;
      real log_agg_b;
      vector[8] log_a = log(a);
      vector[7] log_b = log(b);
      log_r = log_diff_exp(log(.180625), 2 * log_q);
      log_agg_a = log_sum_exp(log_a[8] + log_r, log_a[7]);
      log_agg_a = log_sum_exp(log_agg_a + log_r, log_a[6]);
      log_agg_a = log_sum_exp(log_agg_a + log_r, log_a[5]);
      log_agg_a = log_sum_exp(log_agg_a + log_r, log_a[4]);
      log_agg_a = log_sum_exp(log_agg_a + log_r, log_a[3]);
      log_agg_a = log_sum_exp(log_agg_a + log_r, log_a[2]);
      log_agg_a = log_sum_exp(log_agg_a + log_r, log_a[1]);

      log_agg_b = log_sum_exp(log_b[7] + log_r, log_b[6]);
      log_agg_b = log_sum_exp(log_agg_b + log_r, log_b[5]);
      log_agg_b = log_sum_exp(log_agg_b + log_r, log_b[4]);
      log_agg_b = log_sum_exp(log_agg_b + log_r, log_b[3]);
      log_agg_b = log_sum_exp(log_agg_b + log_r, log_b[2]);
      log_agg_b = log_sum_exp(log_agg_b + log_r, log_b[1]);
      log_agg_b = log_sum_exp(log_agg_b + log_r, 0);

      log_rtn = log_q + log_agg_a - log_agg_b;
      return log_q_sign * exp(log_rtn);
    } else {
      real log_r = log_q_sign == -1 ? log_p : log_diff_exp(log(1), log_p);

      if (is_inf(log_r)) {
        return 0;
      }

      log_r = log(sqrt(-log_r));

      if (log_r <= log(5.0)) {
        vector[8] log_c = log(c);
        vector[7] log_d = log(d);
        real log_agg_c;
        real log_agg_d;
        log_r = log_diff_exp(log_r, log(1.6));

        log_agg_c = log_sum_exp(log_c[8] + log_r, log_c[7]);
        log_agg_c = log_sum_exp(log_agg_c + log_r, log_c[6]);
        log_agg_c = log_sum_exp(log_agg_c + log_r, log_c[5]);
        log_agg_c = log_sum_exp(log_agg_c + log_r, log_c[4]);
        log_agg_c = log_sum_exp(log_agg_c + log_r, log_c[3]);
        log_agg_c = log_sum_exp(log_agg_c + log_r, log_c[2]);
        log_agg_c = log_sum_exp(log_agg_c + log_r, log_c[1]);

        log_agg_d = log_sum_exp(log_d[7] + log_r, log_d[6]);
        log_agg_d = log_sum_exp(log_agg_d + log_r, log_d[5]);
        log_agg_d = log_sum_exp(log_agg_d + log_r, log_d[4]);
        log_agg_d = log_sum_exp(log_agg_d + log_r, log_d[3]);
        log_agg_d = log_sum_exp(log_agg_d + log_r, log_d[2]);
        log_agg_d = log_sum_exp(log_agg_d + log_r, log_d[1]);
        log_agg_d = log_sum_exp(log_agg_d + log_r, 0);

        val = exp(log_agg_c - log_agg_d);
      } else {
        vector[8] log_e = log(e);
        vector[7] log_f = log(f);
        real log_agg_e;
        real log_agg_f;
        log_r = log_diff_exp(log_r, log(5));

        log_agg_e = log_sum_exp(log_e[8] + log_r, log_e[7]);
        log_agg_e = log_sum_exp(log_agg_e + log_r, log_e[6]);
        log_agg_e = log_sum_exp(log_agg_e + log_r, log_e[5]);
        log_agg_e = log_sum_exp(log_agg_e + log_r, log_e[4]);
        log_agg_e = log_sum_exp(log_agg_e + log_r, log_e[3]);
        log_agg_e = log_sum_exp(log_agg_e + log_r, log_e[2]);
        log_agg_e = log_sum_exp(log_agg_e + log_r, log_e[1]);

        log_agg_f = log_sum_exp(log_f[7] + log_r, log_f[6]);
        log_agg_f = log_sum_exp(log_agg_f + log_r, log_f[5]);
        log_agg_f = log_sum_exp(log_agg_f + log_r, log_f[4]);
        log_agg_f = log_sum_exp(log_agg_f + log_r, log_f[3]);
        log_agg_f = log_sum_exp(log_agg_f + log_r, log_f[2]);
        log_agg_f = log_sum_exp(log_agg_f + log_r, log_f[1]);
        log_agg_f = log_sum_exp(log_agg_f + log_r, 0);

        val = exp(log_agg_e - log_agg_f);
      }
      if (log_q_sign == -1)
        return -val;
    }
    return val;
  }

  real inv_Phi_log_fun(real log_p) {
    real log_BIGINT = log(2000000000);
    return log_p >= log(0.9999) ? -inv_Phi_log_lambda(
               log_diff_exp(log_BIGINT, log_BIGINT + log_p) - log_BIGINT)
                       : inv_Phi_log_lambda(log_p);
  }

Try calculating the lower and upper bound and u_raw on the log scale then using inv_Phi_log I posted above. This takes in the log of the input but still outputs on the probability scale. @andrjohns found that there are floating point errors when calculating on the probability scale.

Also, the inv_Phi_log function should be in the next version of Stan as std_normal_log_qf, it’s already in stan-math and just needs to be exposed to the language ( see math/std_normal_log_qf.hpp at develop · stan-dev/math · GitHub).

1 Like

Thank you for your reply. It works well when working on the log scale, but I didn’t find inv_Phi_log so I tried inv_Phi_log_fun, does that matter? Thanks again.

1 Like

Doesn’t matter, that’s the function to use!