Error: Initialization failed

kevin-kvp · April 21, 2020, 1:23am

Hi all,
I would like to build a hierarchical logistic regression model. I use flat priors for \mu and \sigma (the global parameters in this case).

data {
  int<lower = 1> k; // Number of beta (10)
  int<lower = 0> N_train; // Number of observations (train)
  int<lower = 0> N_all; // Number of observations (all data)
  int<lower = 1> cluster; // Number of clusters
  int<lower = 0, upper = 1> y_train[N_train]; // response
  int<lower = 1, upper = cluster> cluster_train[N_train]; // indicating cluster (train)
  int<lower = 1, upper = cluster> cluster_all[N_all]; // indicating cluster (all)
  matrix[N_train,k] X_train; // train_data
  matrix[N_all,k] X_all; // all data
}

parameters {
  real mu[k];
  real<lower = 0> sigma[k];
  real alpha[cluster];
  vector[k] beta[cluster];
}

model{
  real eta[N_train];
  for(i in 1:k){
    // mu[i] ~ normal(0,2);
    // sigma[i] ~ normal(0,2);
    for(l in 1:cluster){
      beta[l,i] ~ normal(mu[i],sigma[i]);
    }
  }
  
  for(n in 1:N_train){
    eta[n] = alpha[cluster_train[n]] + (X_train[n] * beta[cluster_train[n]]);
    y_train[n] ~ bernoulli(inv_logit(eta[n]));
  }
  
}

generated quantities {
  vector[N_all] y_return;
  for(i in 1:N_all) {
    y_return[i] = bernoulli_rng(inv_logit(alpha[cluster_all[i]] + (X_all[i] * beta[cluster_all[i]])));
  }
}

However, when I ran this code in R, it gave an error.

SAMPLING FOR MODEL ‘xxxxx’ NOW (CHAIN 1).
Chain 1: Rejecting initial value:
Chain 1: Log probability evaluates to log(0), i.e. negative infinity.
Chain 1: Stan can’t start sampling from this initial value.
.
.
.
Chain 1:
Chain 1: Initialization between (-2, 2) failed after 100 attempts.
Chain 1: Try specifying initial values, reducing ranges of constrained values, or reparameterizing the model.
[1] “Error in sampler$call_sampler(args_list[[i]]) : Initialization failed.”
error occurred during calling the sampler; sampling not done

I have tried debugging the code, for example by changing the priors, but it does not help. Please help me.

A million thanks!

Jean_Billie · April 21, 2020, 10:27am

If the Stan file is correct, changing the seed argument of stan() or sampling() will sometimes let sampling start. If that does not work, try changing the data passed to stan() or sampling().

nhuurre · April 21, 2020, 11:13am

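You might want to use bernoulli_logit here. It is the same distribution, but it works directly on the log-odds scale and so avoids inv_logit rounding the probability to exactly 0 or 1 (so it works even if the initial guess predicts the wrong outcome with very high probability).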

y_train[n] ~ bernoulli_logit(eta[n]);
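
In the context of the original program, that change might look roughly like the sketch below. It assumes the data and parameters from the first post (only the parts needed for the training likelihood are repeated) and keeps the flat priors as posted. The point is that once eta is more than a few dozen in magnitude, inv_logit(eta) can round to exactly 0 or 1 in double precision, and the Bernoulli log density of the "wrong" outcome then becomes log(0). bernoulli_logit never forms that rounded probability.

```stan
// Sketch only: same structure as the original post, with the likelihood
// written on the log-odds scale. Array declarations use the older syntax
// from the original post; recent Stan versions write e.g.
// `array[N_train] int y_train` instead.
data {
  int<lower = 1> k;
  int<lower = 0> N_train;
  int<lower = 1> cluster;
  int<lower = 0, upper = 1> y_train[N_train];
  int<lower = 1, upper = cluster> cluster_train[N_train];
  matrix[N_train, k] X_train;
}

parameters {
  real mu[k];
  real<lower = 0> sigma[k];
  real alpha[cluster];
  vector[k] beta[cluster];
}

model {
  for (i in 1:k) {
    for (l in 1:cluster) {
      beta[l, i] ~ normal(mu[i], sigma[i]);
    }
  }

  for (n in 1:N_train) {
    // Log-odds for observation n (row_vector * vector yields a real).
    real eta = alpha[cluster_train[n]] + X_train[n] * beta[cluster_train[n]];
    // The log-odds are passed directly; inv_logit is not applied first.
    y_train[n] ~ bernoulli_logit(eta);
  }
}
```

If initialization still fails after this change, the cause is probably not the rounding alone; very large predictor values or missing values in the data would be worth checking.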

kevin-kvp · April 22, 2020, 5:23am

Hi all,
Thank you so much for your help. I just tried the code that @nhuurre suggested.

However, the problem still occurred, so I normalized the data to the [0, 1] range and changed the seed, as @Jean_Billie suggested.
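
The post does not say how the normalization was done (it may well have been done in R before calling Stan). As one possible illustration, min-max scaling can also be written inside the Stan program in a transformed data block. The sketch below is an assumption, not the poster's code: the names X_train_scaled and X_all_scaled are hypothetical, and it scales both matrices with the training-set minimum and maximum so that the two stay on the same scale.

```stan
// Sketch only: min-max scaling of each predictor column to [0, 1].
data {
  int<lower = 1> k;
  int<lower = 1> N_train;   // at least one row, so min/max are well defined
  int<lower = 0> N_all;
  matrix[N_train, k] X_train;
  matrix[N_all, k] X_all;
}

transformed data {
  matrix[N_train, k] X_train_scaled;   // hypothetical name
  matrix[N_all, k] X_all_scaled;       // hypothetical name

  for (j in 1:k) {
    real lo = min(col(X_train, j));
    real hi = max(col(X_train, j));
    if (hi > lo) {
      X_train_scaled[, j] = (col(X_train, j) - lo) / (hi - lo);
      // Same lo/hi as the training data; values outside the training
      // range will fall outside [0, 1].
      X_all_scaled[, j] = (col(X_all, j) - lo) / (hi - lo);
    } else {
      // Column is constant in the training data: avoid dividing by zero.
      X_train_scaled[, j] = rep_vector(0, N_train);
      X_all_scaled[, j] = rep_vector(0, N_all);
    }
  }
}
```

The model and generated quantities blocks would then use the scaled matrices in place of X_train and X_all.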

The problem is gone, but the run time is extremely long. Is it plausible for the edited code to take this long, say around 20 hours to finish?

PS

The dimension of X_train is 11,096 rows with 12 columns.
The dimension of X_all is 14,795 rows with 12 columns.
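
On the run-time question, two changes that often help in Stan are giving the global parameters proper priors and vectorizing the likelihood instead of looping over observations. The sketch below is one way to write that, under these assumptions: it re-enables the normal(0, 2) priors that are commented out in the original model, stores beta as a cluster-by-k matrix rather than an array of vectors, and leaves alpha with a flat prior as in the original. Whether this brings a 20-hour run down to something practical is not guaranteed and would need to be checked on the actual data.

```stan
// Sketch only: vectorized training likelihood with proper priors on the
// global parameters. Integer arrays use the older declaration syntax
// from the original post.
data {
  int<lower = 1> k;
  int<lower = 0> N_train;
  int<lower = 1> cluster;
  int<lower = 0, upper = 1> y_train[N_train];
  int<lower = 1, upper = cluster> cluster_train[N_train];
  matrix[N_train, k] X_train;
}

parameters {
  vector[k] mu;
  vector<lower = 0>[k] sigma;
  vector[cluster] alpha;       // flat prior, as in the original post
  matrix[cluster, k] beta;     // row l holds the coefficients of cluster l
}

model {
  // Priors taken from the commented-out lines of the original model.
  mu ~ normal(0, 2);
  sigma ~ normal(0, 2);        // half-normal, since sigma is constrained >= 0

  // Each coefficient shares mu[i] and sigma[i] across clusters.
  for (i in 1:k) {
    col(beta, i) ~ normal(mu[i], sigma[i]);
  }

  // beta[cluster_train] picks the coefficient row for each observation,
  // and rows_dot_product computes one linear predictor per row.
  y_train ~ bernoulli_logit(alpha[cluster_train]
                            + rows_dot_product(X_train, beta[cluster_train]));
}
```

The generated quantities block from the original post would need beta[cluster_all[i]] transposed (for example X_all[i] * beta[cluster_all[i]]') to match the matrix layout used here.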
