Error message 'RuntimeError: Exception during call to services function: `ValueError("Initialization failed. Rejecting initial value: Log probability evaluates to log(0), i.e. negative infinity ...")`

Hi, I am working on a stan model with the default initialization in StanModel, but it keeps returning me the error messages


RuntimeError: Exception during call to services function: ValueError("Initialization failed. Rejecting initial value: Log probability evaluates to log(0), i.e. negative infinity. Stan can't start sampling from t his initial value. Rejecting initial value: ...")


I am really confused and not sure where I should look into. I really appreciate it if you could provide any insights into it, here is the stan codes I am using and I have attached the data that needs to be fed
dataToStan.csv (184.5 KB)
into the model,


functions {
	int mapping(int D, int S, int U){
		int I = (S - 1)*16 + (D - 1)*4 + U;
		return I; 
	}
}
data {
        int<lower = 1> K;   //number of sigs
        int<lower = 2> C;	//number of pairs of flanking bases excluding single-based substitutions
        int<lower = 1> N;	//Total number of non-zero entries
        int<lower = 1> doc[N];	//sample ID
        int<lower = 1> X[N];    // mutational counts for each substition in each sample
	int<lower = 1, upper = 4> D_array[N, C];	//downstream index array starting from d_0
	int<lower = 1, upper = 6> S_array[N]; 	// single-base substitution array
	int<lower = 1, upper = 4> U_array[N, C];	//upstream index array starting from u_0
        
        real<lower = 0> alpha;   // hyperparam for signature rates
        real<lower = 0> beta;    //hyperparam for up/downstream bases rates
        
        real<lower = 0> gamma0;  // hyperparam for shape parameters
        real<lower = 0> gamma1;  // hyperparam for shape parameters
        real<lower = 0> delta0;  // hyperparam for mean loadings
        real<lower = 0> delta1;  // hyperparam for mean loadings
    }
    
transformed data{
        vector<lower = 0>[96] alpha_array = rep_vector(alpha, 96); //Dirichlet params for signatures
        vector<lower = 0>[4] beta_array = rep_vector(beta, 4);
        int<lower = 1> J = max(doc);	// Total number of samples
	int<lower = 1> T = C - 1; //number of pairs of flanking bases excluding d_0 and u_0
	int<lower = 1, upper = 96> I_array[N];	//tri-nucleotide index array
	for (n in 1:N) {
		I_array[n] = mapping(D_array[n, 1], S_array[n], U_array[n, 1]);
	}
}

parameters {
        vector<lower=0>[K] nu; //inferred shaped parameters for loadings
        vector<lower=0>[K] mu; // inferred mean loadings
        matrix<lower=0>[K, J] theta;	//inferred loadings 
        simplex[4] u[T, K, 4, 6];	//inferred upstream bases
        simplex[4] d[T, K, 4, 6];	//inferred downstream bases
        simplex[96] r[K];	//inferred sigs K by I
}

model {
	real mutation_rate;
	real base_rate;

	for (k in 1:K) {
		nu[k] ~ inv_gamma(gamma0, gamma1);
		mu[k] ~ inv_gamma(delta0, delta1);
		r[k] ~ dirichlet(alpha_array);
	}
	
	for (j in 1:J){
		for (k in 1:K){
			theta[k, j] ~ gamma(nu[k], nu[k]/mu[k]);
		}
	}
	
	for (t in 1:T)
		for (k in 1:K)
			for (u0 in 1:4)
				for (s in 1:6){
					d[t, k, u0, s] ~ dirichlet(beta_array);
					u[t, k, u0, s] ~ dirichlet(beta_array);
				}
				
				
	for (n in 1:N){
		for (k in 1:K){
			base_rate = 1;
			for (t in 1:T){
				base_rate *= d[t, k, D_array[n, t], S_array[n], D_array[n, t+1]]*u[t, k, U_array[n, t], S_array[n], U_array[n, t+1]];
			}
			mutation_rate += theta[k, doc[n]] * r[k, I_array[n]] * base_rate;
		}	
		target +=  X[n] * log(mutation_rate);
	}
	target +=  sum(theta);
}

What may be happening is the randomly-chosen parameters from the chain have very low probability which underflows to zero (and it’s associated negative infinity log); with that the chain cannot go anywhere – instead you need a finite probability, however small. The easiest way to test if that’s really what is happening is choosing some fixed parameter values (and later relaxing that to make sure chains are converging to the correct region).

It may be a bug in the code that generates a log(0) values somewhere, but it may also be that the model is fine and it’s just starting out somewhere weird.