Error: Chain 1: Rejecting initial value: mu[1] is nan, but must be greater than or equal to 0

Hi guys,
I am trying to fit a regression model to my data. But I keep receiving this error message:

Chain 1: Rejecting initial value:
Chain 1:   Error evaluating the log probability at the initial value.
Chain 1: Exception: validate transformed params: mu[1] is nan, but must be greater than or equal to 0  (in 'model32b08805100_tweedie' at line 63)

Chain 1: 
Chain 1: Initialization between (-2, 2) failed after 100 attempts. 
Chain 1:  Try specifying initial values, reducing ranges of constrained values, or reparameterizing the model.
[1] "Error in sampler$call_sampler(args_list[[i]]) : Initialization failed."
[1] "error occurred during calling the sampler; sampling not done"

How can I solve this problem?
I really need your help.

Many thanks

Could you post the code for your model please?

Because it’s part of the dissertation code, unfortunately I can’t fully share it. But the error is at line 63 which is about “mu” in transformed parameters part:


convolved_re =  sqrt(1 - rho) * theta + sqrt(rho / scaling_factor) * sai;
	
	
	for (i in 1:S) {
		for (j in 1:TJ){
		                               
                                                
      mu[i*TJ-TJ+j]= exp(X[i*TJ-TJ+j, ] * beta + convolved_re[i] * sigma + Rw1[j] + R2[j] );
			lambda[i*TJ-TJ+j] = 1/phi*mu[i*TJ-TJ+j]^(2-p)/(2-p);
			alpha[i*TJ-TJ+j] = (2-p)/(p-1);
			gama[i*TJ-TJ+j] = 1/phi*mu[i*TJ-TJ+j]^(1-p)/(p-1);
			
			}
   	}

} 

Does the problem is in selecting the priors?

The model is spatio-temporal. spatial with BYM2 and temporal with Random Walk2.

Did you declare bounds on rho to be between 0 and 1, so that the computations for convolved_re are well defined?

Many thanks mcol, yes I did. Even I changed the prior for rho from beta(3, 3) to uniform(0,1) But I keep receiving the same error message.

Have you checked that you have correct bounds on your other parameters (such as phi should never be 0, p should never be 1 or 2, etc)? Is it possible that the value in the exponentiation may be very large and cause an overflow? Other than that, without seeing how your parameters are defined, it’s hard to give more precise suggestions.

Perhaps @mitzimorris has more experience with those kind of models and may be of help.

how have you set up the nodes list for the neighborhood graph?
is it fully connected?
the list of pairwise neighbors are symmetric, this means that for neighbors i,j
you only list edge (i,j) and don’t list edge (j,i) as well.

Many thanks for your guidance @mcol. I checked all the constraint terms and I didn’t find any problem. With the goal of solving the problem, I sent more code here.

data {
 
 int<lower=0> N;     
 int <lower=0> M;  
 int <lower=0> K;  
 
 int<lower=1> N_edges;
 int<lower=1, upper=N> node1[N_edges];  
 int<lower=1, upper=N> node2[N_edges];  
 real<lower=0.00001> scaling_factor; 

 vector [N] y;    
 matrix [N, K] X; 


int <lower=1> S;     // Spaces
 int <lower=1> TJ;    // Times
}


parameters {
 
 //distribution
 real<lower=0.00001> phi;
 real<lower=1.0001, upper=1.9999> p;

 
 //BYM2 model
 vector[N] theta;       
 vector[N] sai;         
 real<lower=0> sigma;        
 real<lower=0, upper=1> rho; 
   //Reg model
   vector[K] beta;

 // Random walk
 real<lower=0> sigmaRw;
 real<lower=0> sigmaR;
}


transformed parameters {
 
 vector[N] convolved_re;
 vector<lower=0>[N] mu;
 vector<lower=0.1>[N] lambda;
   vector<lower=0.000001>[N] alpha;
   vector<lower=0.000001>[N] gama;
   vector [TJ] Rw1;
   vector [TJ] R2; 
   

 convolved_re =  sqrt(1 - rho) * theta + sqrt(rho / scaling_factor) * sai;
   
   
   for (i in 1:S) {
   	for (j in 1:TJ){
   	                               
                                              
     mu[i*TJ-TJ+j]= exp(X[i*TJ-TJ+j, ] * beta + convolved_re[i] * sigma + Rw1[j] + R2[j] );
   		lambda[i*TJ-TJ+j] = 1/phi*mu[i*TJ-TJ+j]^(2-p)/(2-p);
   		alpha[i*TJ-TJ+j] = (2-p)/(p-1);
   		gama[i*TJ-TJ+j] = 1/phi*mu[i*TJ-TJ+j]^(1-p)/(p-1);
   		
   		}
  	}

}  

model {

Thank you in advance for spending time

Many thanks for your guidance @mitzimorris. I sent codes here for more information. But I didn’t understand your mean. May you explain more?

you’re passing in two arrays, node1 and node2.
at every position i, is it the case that node1[i] < node2[i] ?
you could check this by adding a transformed data block:

transformed data {
  for (i in 1:N) {
    if (node1[i] >= node2[i])
       reject("edge pair i=", i, ", node1[i] must be less than node2[i]");
  }
}

also, the graph described by edge arrays node1 and node2 must be fully connected.
that means that there has to be a path from every node to every other node in the graph.
no islands, no separate subcomponents.

i.e., if you had some district-level map over the continental US, Alaska, and Hawaii, maybe Alaska has 7 districts, each island of Hawaii is its own district, and the rest of the US has one or more district per state, you’ve got a map which contains both disconnected components and islands.

2 Likes

Many thanks for your reply @mitsimorris. There are some islands in our country, but all the islands belong to a province that is neighboring with other provinces and has been continuously connected. The geographical units in my study are the provinces. There are 31 provinces and the data are gathered for 8 years. I added these lines to the codes and the error changed to this:

CHECKING DATA AND PREPROCESSING FOR MODEL 'BESAG-temporal_RW' NOW.

COMPILING MODEL 'BESAG-temporal_RW' NOW.

STARTING SAMPLER FOR MODEL 'BESAG-temporal_RW' NOW.
Error in new_CppObject_xp(fields$.module, fields$.pointer, ...) : 
  Exception: []: accessing element out of range. index 75 out of range; expecting index to be between 1 and 74; index position = 1node2  (in 'model2f3013de104a_BESAG_temporal_RW' at line 35)

failed to create the sampler; sampling not done

“line 35” is about the transformed data which I added. Now, does it means the problem is related to the edge arrays?

Thank you in advance

sorry, my mistake - the transformed data block should be:

Thank you so much. I modified the code, but the result is still the same.

Chain 1: Rejecting initial value:
Chain 1:   Error evaluating the log probability at the initial value.
Chain 1: Exception: validate transformed params: mu[1] is nan, but must be greater than or equal to 0  (in 'model96012e4e00_BESAG_temporal_RW' at line 68)

Chain 1: 
Chain 1: Initialization between (-2, 2) failed after 100 attempts. 
Chain 1:  Try specifying initial values, reducing ranges of constrained values, or reparameterizing the model.
[1] "Error in sampler$call_sampler(args_list[[i]]) : Initialization failed."
[1] "error occurred during calling the sampler; sampling not done"

I really appreciate your help.

These

   vector [TJ] Rw1;
   vector [TJ] R2; 

are declared in transformed parameters but I don’t see assignment that defines their values. They must be assigned to before calculating anything else that depends on them (such as mu). Or if they are meant to be sampled from some distribution defined in the model block then they belong in the parameters block.

Many thanks @nhuurre, yes you are right. I modified the code. But the error did not change.

Chain 1: Rejecting initial value:
Chain 1:   Error evaluating the log probability at the initial value.
Chain 1: Exception: validate transformed params: mu[156] is nan, but must be greater than or equal to 0  (in 'model9603ce696d_BESAG_temporal_RW' at line 72)

Chain 1: 
Chain 1: Initialization between (-2, 2) failed after 100 attempts. 
Chain 1:  Try specifying initial values, reducing ranges of constrained values, or reparameterizing the model.
[1] "Error in sampler$call_sampler(args_list[[i]]) : Initialization failed."
[1] "error occurred during calling the sampler; sampling not done" 

“line 72” is:

vector<lower=0>[N] mu;

What’s noticeable is that in the previous case,

mu[1] is nan, but must be greater...

but now:

mu[156] is nan, but must be greater...

Why?!!

how are you computing theta - that’s the spatial component in the BYM2 model?

Through a prior

theta ~ normal(0, 1);

and also for “sai” the prior is:

target += -0.5 * dot_self(sai[node1] - sai[node2]);

Yes, it is a random effect (heterogeneous effects) in BYM2 model and “sai” is spatial effect