LP3 model with custom likelihood function | RuntimeError: Initialization failed

Hello community,

for a project at the university, we are in the need of some advice. Our goal is to develop an LP3 model for flood frequency analysis based on the paper “Bayesian MCMC flood frequency analysis with historical information” by Reis and Stedinger (2005) - Link Paper.

It is the first time we got in contact with Stan and not everything is working as planned. The model compiles as expected without any error until we are trying to fit the model with our data.

We are using Python to generate the data as follows:

#Generate Data
data=np.random.lognormal(mean=6.9, sigma=0.83, size=120)

X_0=np.quantile(data,0.99) #99th Quantile, censoring threshold
k_data=data[data>X_0] #k (historical floods)
syst_data=data[100:] #systematic Data
#h_data=data*(data>X_0)

#setting threshold for historical data
for i in range(0, 100, 1):
  if data[i] < X_0:
    data[i] = 0

hist_data = data[:100]

For the specific case of this paper, we already figured out to use a custom likelihood function with the help of another post. Our model looks currently like this in pystan:

model_LP3_string = 
  functions{

    //LP3 probability density function

    vector LP3_pdf(vector x, real mu, real sigma, real gamma){
      vector[num_elements(x)] results;
      vector[num_elements(x)] z;
      vector[num_elements(x)] nom;
      vector[num_elements(x)] denom;

      for(i in 1:num_elements(x)){ ////loop over data and store each density
          z[i] = gamma/6. + (6./gamma) * ((gamma/2.) * ((x[i]-mu)/sigma) + 1)^(inv(3.)) - 6./gamma;
          nom[i] = 1/(sqrt(2.*pi())) * exp(x[i]^2 * 1./2.); //normal distribution
          denom[i] = sigma * (gamma * 0.5 * (x[i]-mu)/sigma + 1)^(2. / 3.); 

          results[i] = nom[i]/denom[i];
        }
      return results;
      }

      //LP3 Cumulative density function estimated for one real value (later x0) 

      real LP3_cdf_real(real x, real mu, real sigma, real gamma){
        real z;

        z = gamma/6. + (6./gamma) * ((gamma/2.) * ((x-mu)/sigma) + 1)^(inv(3.)) - 6./gamma; 
  
      return normal_cdf(z, 0., 1.);
      }

    //Fakultät
    real fak(int n){
      real erg;
      for(i in 1:n){
            erg = erg * i;
        }
      return erg;
    }

    //Likelihoodfunction (Paper equation 2) 

    real LP3_lik_cens(vector x, vector y, real mu, real sigma, real gamma, int h, int k, int s, real x0){
        real hpdfs;
        real syst;
        real hist;
        vector[k] LP3_pdf_hist;
        vector[s] LP3_pdf_syst;

        hpdfs = 0;
        syst = 0;
        
        LP3_pdf_hist = LP3_pdf(y, mu, sigma, gamma);
        LP3_pdf_syst = LP3_pdf(x, mu, sigma, gamma);

        for (i in 1:k){
          hpdfs = hpdfs * LP3_pdf_hist[i];
        }
        for (i in 1:s){
          syst = syst * LP3_pdf_syst[i];
        }
  
        hist = fak(h)/(fak(k)*fak(h-k)) * LP3_cdf_real(x0, mu, sigma, gamma)^(h-k) * hpdfs;

        return hist*syst;
    }
  }

  data{
    int<lower = 0> k;
    int<lower = 0> s;
    int<lower = 0> h;
    real x0;
    vector[k] hist_floods;
    vector[s] syst_floods; 
  }
  parameters{
    real mu;
    real<lower = 0> sigma;
    real gamma;
  }
  model{
    //non-informative Prior Distribution
    mu ~ normal(7, 10);
    sigma ~ normal(10, 100); //FIXME
    gamma ~ normal(0, 0.3);

    //sampling distribtion
    target += LP3_lik_cens(syst_floods, hist_floods, mu, sigma, gamma, h, k, s, x0);
  }

EDIT: We are modifying our generated data from above a bit, so that it fits to our model as follows:

# defining other pre-requisites for the model
y_hist=hist_data[hist_data>X_0] #k (historical floods)

k = y_hist.shape[0]
s = syst_data.shape[0]
h = hist_data.shape[0]
X0 = X_0
hist_floods = y_hist
syst_floods = syst_data

#passing Data
data_LP3 = dict(k = k, s = s, h = h, x0 = X0, hist_floods = hist_floods, syst_floods = syst_floods)

# fitting the Stan model to the data
fit_LP3 = model_test.sampling(data = data_LP3, iter = 50000, chains = 1, warmup = 10000, seed = 42, algorithm = "HMC")

While executing the last line, we are getting the following error code:

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-5-6068024545a3> in <module>()
     23 
     24 # fitting the Stan model to the data
---> 25 fit_LP3 = model_test.sampling(data = data_LP3, iter = 50000, chains = 1, warmup = 10000, seed = 42, algorithm = "HMC")

1 frames
/usr/local/lib/python3.7/dist-packages/pystan/model.py in sampling(self, data, pars, chains, iter, warmup, thin, seed, init, sample_file, diagnostic_file, verbose, algorithm, control, n_jobs, **kwargs)
    811         call_sampler_args = izip(itertools.repeat(data), args_list, itertools.repeat(pars))
    812         call_sampler_star = self.module._call_sampler_star
--> 813         ret_and_samples = _map_parallel(call_sampler_star, call_sampler_args, n_jobs)
    814         samples = [smpl for _, smpl in ret_and_samples]
    815 

/usr/local/lib/python3.7/dist-packages/pystan/model.py in _map_parallel(function, args, n_jobs)
     88             pool.join()
     89     else:
---> 90         map_result = list(map(function, args))
     91     return map_result
     92 

stanfit4anon_model_5cd2ed03b9e63fab404287dfb826ebc5_8232970312130431370.pyx in stanfit4anon_model_5cd2ed03b9e63fab404287dfb826ebc5_8232970312130431370._call_sampler_star()

stanfit4anon_model_5cd2ed03b9e63fab404287dfb826ebc5_8232970312130431370.pyx in stanfit4anon_model_5cd2ed03b9e63fab404287dfb826ebc5_8232970312130431370._call_sampler()

RuntimeError: Initialization failed.

We are a bit lost with this RuntimeError and several tries. Maybe we need to loop in the definition of the model overall observations, but as described, we are relatively new to Stan.

Your knowledge and help is very much appreciated and a big thank you to all who are sticking till the end of this post.

Cheers,
Felix