Initialization between (-2, 2) failed after 100 attempts in Bass model fitting

Hi, I am really new to Stan and trying to fit Bass Model with hierarchical structure.
I succeeded in estimating all the coefficients in the model by MH algorithm using R before. However, I have some trouble now. In inference, I faced error codes like

Chain 1: Rejecting initial value:
Chain 1: Gradient evaluated at the initial value is not finite.
Chain 1: Stan can’t start sampling from this initial value.
Chain 1:
Chain 1: Initialization between (-2, 2) failed after 100 attempts.
Chain 1: Try specifying initial values, reducing ranges of constrained values, or reparameterizing the model.
[1] “Error in sampler$call_sampler(args_list[[i]]) : Initialization failed.”
[1] “error occurred during calling the sampler; sampling not done”

Because I use multiplicative error model, I evaluate likelihood in log-normal form, which could cause numerical issue such as log(0), I conjecture. However, I could not find any 0 in my simulated data. Also I added very very small number to the expected DV to avoid log(0) . Still I face the same problem. Do you have any nice idea to solve this problem? Thanks.


stancode <- 
' 
  data {
    int<lower=0> N;      ##Num Total observation
    int<lower=1> H;      ##Num Firms
    int<lower=1> ncoef;  ##Num of Coefficients (p,q,m)
    int<lower=1> nhier;  ##Num of Firm level coefficients 
    int<lower=1> ntime;  ##Time 
    
    vector[ntime] time;
    row_vector[nhier] Z[H];    ## Firm level Covariates including intercept (H*nhier)
    vector<lower=0>[N] sales;  ## Sales of all firms (deprendent variable)
    vector[ncoef] mu;          ## Just 0,0,0 
  }

parameters {
  corr_matrix[ncoef] C;        ## prior correlation by Stan reference
  vector<lower=0> [ncoef] tau; ## prior scale       by Stan reference
  
  real delta1;                 ## Coefficient1 for firm level covariate
  real delta2;                 ## Coefficient2 for firm level covariate
  real delta3;                 ## Coefficient3 for firm level covariate
  real delta4;                 ## Coefficient4 for firm level covariate
  real delta5;                 ## Coefficient5 for firm level covariate
  real delta6;                 ## Coefficient6 for firm level covariate
  real delta7;                 ## Coefficient7 for firm level covariate
  real delta8;                 ## Coefficient8 for firm level covariate
  real delta9;                 ## Coefficient9 for firm level covariate
  
  real<lower=0> sigmasq;       ## scale parameter
  matrix [H,ncoef] eta;        ## Firm level error term for p,q,m
  
  
}


transformed parameters{
  
  real <lower=0> p[H];         ## each firm\'s p
  real <lower=0> q[H];         ## each firm\'s q     
  real <lower=0> m[H];         ## each firm\'s r  
  
  vector<lower=0,upper=1> [ntime] F;  ## Cumulative Dist of sales
  vector<lower=0,upper=1> [ntime] lf; ## lag Cumulative Dist
  vector<lower=0,upper=1> [ntime] f;  ## Prob Dist
  vector<lower=0> [N] mf;             ## Expected Sales calculated by martket potential*Prob Dist
  
  
  for(h in 1:H){
    p[h]=exp(Z[h,1]*delta1+Z[h,2]*delta2+Z[h,3]*delta3)*exp(eta[h,1]);  ## Multiplicative Error
    q[h]=exp(Z[h,1]*delta4+Z[h,2]*delta5+Z[h,3]*delta6)*exp(eta[h,2]);  ## Multiplicative Error
    m[h]=exp(Z[h,1]*delta7+Z[h,2]*delta8+Z[h,3]*delta9)*exp(eta[h,3]);  ## Multiplicative Error
  
    for(t in 1:ntime){
      F[t]=(1-exp(-(p[h]+q[h])*time[t]))/(1+(q[h]/p[h])*exp(-(p[h]+q[h])*time[t])) ;  ## Formula for calculating Cum
    }
    lf[1]=0;
    for(i in 2:ntime){
      lf[i]=F[i-1];
      f=F-lf;
    }
   
    mf[(ntime*(h-1)+1):(ntime*h)]=m[h]*f+0.00000000000000000000001; ### To avoid exact 0 
 
  }
}


model{
  
  matrix[ncoef, ncoef] Sigma_beta;
  
  tau ~ cauchy(0, 2.5);    ## By Stan Manual
  C ~ lkj_corr(2);         ## By Stan Manual
  
 delta1~normal(-1.7,5);    ## Priod of deltas
 delta2~normal(0,5);
 delta3~normal(0,5);
 delta4~normal(-1.73,5);
 delta5~normal(0,5);
 delta6~normal(0,5);
 delta7~normal(13,5);
 delta8~normal(0,5);
 delta9~normal(0,5);
 
  Sigma_beta = quad_form_diag(C, tau); ## By Stan Manual
  
  for (h in 1:H){
    eta[h]~multi_normal(mu, Sigma_beta);  ### errors in p,q,m
  
  }
  
  log(sales)~normal(log(mf),sigmasq);  ####Likelihood function
 
}  '

initf <- function() list(delta1=delta1,delta2=delta2,delta3=delta3,delta4=delta4,delta5=delta5,delta6=delta6,delta7=delta7,delta8=delta8,delta9=delta9)
bassstan <- stan_model(model_code = stancode, verbose = TRUE)
 

Here my data generation code



library(dplyr)
options(scipen=999)
library(rstan)
library(tidybayes)

H<-2     #Number of Firms
ntime<-20  #Number of time
N=H*ntime  #Number of total overvation
ncoef=3    #Number of coefficients (p,q,m)
nhier=3    #Number of coefficients for firm level covariates 

mu=as.vector(rep(0,ncoef))  ###Just 0,0,0
sigsq=0.01
#prec=1/sigsq

z1=rnorm(H,0,0.1) ### Firm level covariates
z2=rnorm(H,0,0.1) ### Firm level covariates  

Z=cbind(1,z1,z2)  ### Firm level covariates matrix

delta1=-1.702585
delta2=0.0000005
delta3=-0.0000007
delta4=-1.731472
delta5=-0.0000001
delta6=0.00000009
delta7=13.81551
delta8=0.0000001
delta9=-0.0000001



deltamat=matrix(c(delta1,delta2,delta3,delta4,delta5,delta6,delta7,delta8,delta9),nrow=nhier,ncol=ncoef)


betabar=Z%*%deltamat
beta=matrix(0,H,ncoef)

###### Making beta matrix
sigeta=0.001
for(i in 1:H){
   beta[i,]=exp(betabar[i,])%*%matrix(c(exp(rnorm(1,0,sigeta)),0,0,0,exp(rnorm(1,0,sigeta)),0,0,0,exp(rnorm(1,0,sigeta))),ncoef,ncoef )
  }

time=1:(ntime)       ####Time
sale=matrix(0,N,1)   ####Dependent variable

for(i in 1:H){
  
  u=rnorm(ntime,0,sd=sqrt(sigsq))
  eps=exp(u)
  
  F=((1-exp(-(beta[i,1]+beta[i,2])*time))/(1+(beta[i,2]/beta[i,1])*exp(-(beta[i,1]+beta[i,2])*time)))
  lf=dplyr::lag(F,1)
  lf[1]=0
  f=F-lf
  
  sale00=beta[i,3]*f
  sale0=sale00*eps
  sale[(ntime*(i-1)+1):(ntime*i)]=sale0
  
}


data=list(sales=as.vector(sale),time=as.vector(1:ntime),mu=mu,Z=Z,N=N,H=H,ntime=ntime,ncoef=ncoef,nhier=nhier)


Hi and welcome!

I don’t understand your model, but the error seems to occur because f[1] is nan. Did you forget a f[1] = 0 or something similar? There also seem to be some indices missing in that expression: f=F-lf;.

Changing these two things makes the model run, but with some divergent transitions and poor sampling, which can probably be solved by reparameterizing tau, see https://betanalpha.github.io/assets/case_studies/fitting_the_cauchy.html.

Please report back if that doesn’t help.

Best regards,

Daniel

1 Like

Hi Daniel,

I sincerely appreciate your suggestion. That was my mistake. In debugging, I made all the vector forms to scalar forms in the code not to be confused. However, I forgot doing it in the f part. Following your recommendation, I made my code run.

Regarding poor mixing, I faced the same trouble in linear regression with hierarchical structure (random effect). I am not sure the reason because I replicated all the priors in Stan manual to recover covariance matrix. I tried inverse Wishart prior also like I had done in Gibbs but it seems not the solution in Stan. I will refer the github site you linked and study more about it.

Thank you so much again!!

I correct an error in my stan code. ‘+very small number (0.0000000000001)’ should be removed even though some initial parameters generate mf=0. It is better to select appropriate initial parameters to avoid log(0) in evaluating log likelihood.