Numerical stability of GPs with negative binomial likelihood

Recently I have started to see a lot of

 [1] "Chain 1: Informational Message: The current Metropolis proposal is about to be rejected because of the following issue:"                                                                                              
 [2] "Chain 1: Exception: cholesky_decompose: Matrix m is not positive definite  (in 'chunks/parameters-transformed.stan' at line 20; included from 'model_lgp' at line 40)"                                                
 [3] "Chain 1: If this warning occurs sporadically, such as for highly constrained variable types like covariance matrices, then the sampler is fine,"                                                                      
 [4] "Chain 1: but if this warning occurs often then your model may be either severely ill-conditioned or misspecified."                                                                                                    
 [5] "Chain 1: "                                                                                                                                                                                                            
 [6] "Chain 1: Informational Message: The current Metropolis proposal is about to be rejected because of the following issue:"                                                                                              
 [7] "Chain 1: Exception: cholesky_decompose: Matrix m is not positive definite  (in 'chunks/parameters-transformed.stan' at line 20; included from 'model_lgp' at line 40)"                                                
 [8] "Chain 1: If this warning occurs sporadically, such as for highly constrained variable types like covariance matrices, then the sampler is fine,"                                                                      
 [9] "Chain 1: but if this warning occurs often then your model may be either severely ill-conditioned or misspecified."                                                                                                    
[10] "Chain 1: "                                                                                                                                                                                                            
[11] "Chain 1: Informational Message: The current Metropolis proposal is about to be rejected because of the following issue:"                                                                                              
[12] "Chain 1: Exception: Exception: gp_exp_quad_cov: magnitude is 0, but must be > 0!  (in 'chunks/functions-kernels.stan' at line 101; included from 'model_lgp' at line 24)"                                             
[13] "  (in 'chunks/parameters-transformed.stan' at line 14; included from 'model_lgp' at line 40)"                                                                                                                         
[14] "Chain 1: If this warning occurs sporadically, such as for highly constrained variable types like covariance matrices, then the sampler is fine,"                                                                      
[15] "Chain 1: but if this warning occurs often then your model may be either severely ill-conditioned or misspecified."                                                                                                    
[16] "Chain 1: "                                                                                                                                                                                                            
[17] "Chain 1: Informational Message: The current Metropolis proposal is about to be rejected because of the following issue:"                                                                                              
[18] "Chain 1: Exception: cholesky_decompose: Matrix m is not positive definite  (in 'chunks/parameters-transformed.stan' at line 20; included from 'model_lgp' at line 40)"                                                
[19] "Chain 1: If this warning occurs sporadically, such as for highly constrained variable types like covariance matrices, then the sampler is fine,"                                                                      
[20] "Chain 1: but if this warning occurs often then your model may be either severely ill-conditioned or misspecified."                                                                                                    
[21] "Chain 1: "                                                                                                                                                                                                            
[22] "Error in sampler$call_sampler(args_list[[i]]) : "                                                                                                                                                                     
[23] "  Exception: neg_binomial_2_log_rng: Random number that came from gamma distribution is 8.79585e+09, but must be less than 1.07374e+09  (in 'chunks/generated.stan' at line 34; included from 'model_lgp' at line 52)"
error occurred during calling the sampler; sampling not done

with rstan 2.21. Is it common that this happens during warmup, and that earlier the informational messages were simply not printed? I have seen it during sampling too, though. Why can the magnitude parameter for gp_exp_quad_cov go to exactly zero? It has a lower=0 constraint and a half-Student-t prior.
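One mechanism I can think of (assuming the usual Stan transform for a <lower=0> parameter): the sampler works on the unconstrained scale and maps back via exp(), so a sufficiently negative unconstrained proposal during warmup underflows to exactly 0.0 in double precision. A quick Python illustration of just the underflow:

```python
import math

# Stan maps an unconstrained value u to a <lower=0> parameter via exp(u).
# A double underflows to exactly 0.0 once u drops below roughly -745,
# which can happen with extreme warmup proposals.
print(math.exp(-10.0))   # small but positive
print(math.exp(-800.0))  # exactly 0.0
```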

Actually, it did result in an error that stopped the chain, which is why the informational messages were printed. I didn’t realize this at first because the output was sent to me by someone else using my code. The actual error that stops execution is drawing the random number in the generated quantities block. So I guess the fix could be to ensure that the parameters given to neg_binomial_2_log_rng are not too small or too large.

Care to post the full model? If you’re getting errors in the generated quantities block, it might still mean that the model has specification issues.

It is here: https://github.com/jtimonen/lgpr/tree/master/inst/stan. I can try to create example data and inputs that cause this to happen.

Yeah, I’ve had lots of problems (not just with GPs) with neg_binomial_2_log_rng during warmup, because the (log) mean parameter can get way too large. You could try something like this:

  int neg_binomial_2_log_safe_rng(real eta, real phi) {
    // NB2 with log mean as a gamma-Poisson mixture:
    // rate ~ Gamma(phi, phi / exp(eta)), y ~ Poisson(rate)
    real gamma_rate = gamma_rng(phi, phi / exp(eta));
    // poisson_rng rejects rates of 2^30 and above (the 1.07374e+09 in
    // the error above); exp(20.7) is about 9.8e8, just under that cap
    if (gamma_rate > exp(20.7)) gamma_rate = exp(20.7);
    return poisson_rng(gamma_rate);
  }
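As a sanity check on that constant: the limit quoted in the error message, 1.07374e+09, is 2^30, and exp(20.7) sits just below it while exp(21) is already over. A quick check in Python:

```python
import math

limit = 2 ** 30           # 1073741824, i.e. the 1.07374e+09 in the error
print(math.exp(20.7))     # about 9.8e8, just below the limit
print(math.exp(21.0))     # about 1.3e9, already over it
```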

Hi! I’ve been having the same problem with neg_binomial_2_log_rng. I’m modeling cases / exposures in a negative binomial GLM with log link, offset, and informative priors (they came from an elicitation process, and the idea is to use them as the main inputs of the modeling strategy). The model is simple:

data {
  /* Dimensions */
  int<lower=0> N;
  int<lower=0> M;
  /* Hyperparameters */
  cov_matrix[M] s;
  vector[M] m;
  /* Design matrix */
  matrix[N, M] X;
  /* Outcome (an integer array of length N) */
  int<lower=0> y[N];
  /* Offset */
  vector[N] off;
}

parameters {
  vector[M] beta;
  real<lower=0> phi; // neg. binomial dispersion parameter
}

transformed parameters {
  vector[N] eta;
  eta = X * beta + off;
}

model {
  /* Prior */
  beta ~ multi_normal(m, s);
  /* Likelihood: explicit contribution to target */
  for (i in 1:N) {
    target += neg_binomial_2_log_lpmf(y[i] | eta[i], phi);
  }
}

generated quantities {
  vector[N] log_lik;
  int<lower=0> y_new[N];
  for (i in 1:N) {
    log_lik[i] = neg_binomial_2_log_lpmf(y[i] | eta[i], phi);
    y_new[i] = neg_binomial_2_log_rng(eta[i], phi);
  }
}

I saw that because of the size of the log mean parameter, sampling from the negative binomial becomes infeasible. The model with flat U(0,1) priors and the offset works fine, and so does the model with the informative elicited priors but without the offset (both ways keep eta small). The common solutions of clamping y_new[i] to -1 or to the maximum allowed would mean that my predicted outcomes y_new are incorrect. The purpose of my model is to predict the outcome Y according to the model, the elicited priors, and the offset, so I need y_new to be estimated correctly. Is there any other solution besides restricting y_new or the gamma rate in the sampling process? And I’m wondering why such limits exist for neg_binomial_2_log_rng.
Thanks!
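neg_binomial_2_log_rng is a gamma–Poisson mixture, which is why the error message earlier in the thread mentions a "Random number that came from gamma distribution": the Poisson rate scales with exp(eta), and rates of 2^30 and above are rejected. A small NumPy sketch of that construction (my own illustration of the mixture, not Stan's internals) shows where a large eta breaks things:

```python
import numpy as np

rng = np.random.default_rng(1)

def nb2_log_rng(eta, phi):
    """NB2 with log mean, drawn as a gamma-Poisson mixture:
    rate ~ Gamma(shape=phi, mean=exp(eta)), y ~ Poisson(rate)."""
    mean = np.exp(eta)                       # blows up for large eta
    rate = rng.gamma(shape=phi, scale=mean / phi)
    return rng.poisson(rate)

# Moderate eta: fine.
y = nb2_log_rng(3.0, 5.0)

# But eta around 21 already pushes the rate past 2^30, the limit
# quoted in the Stan error message (exp(21) > 1.07374e9).
```

One thing worth double-checking in your model: is off already on the log scale, i.e. log(exposure)? In a log-link model the offset enters additively in eta, so passing raw exposures as the offset would make exp(eta) enormous even with reasonable betas (this is just a guess about a possible cause, not something visible in the code you posted).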