RuntimeError : Initialization failed

Hi,
I am getting error while sampling.
The following is the code I have used

# 2.2 Marketing Mix Model
data_mmm, sc_mmm = mean_log1p_trandform(data, ['Overall_Revenue'])
mu_mdip = data[mdip_cols].apply(np.mean, axis=0).values
max_lag = 8
num_media = len(mdip_cols)
# padding zero * (max_lag-1) rows
X_media = np.concatenate((np.zeros((max_lag-1, num_media)), data[mdip_cols].values), axis=0)
#X_ctrl = data_mmm['base_sales'].values.reshape(len(data),1)
model_data2 = {
    'N': len(data),
    'max_lag': max_lag, 
    'num_media': num_media,
    'X_media': X_media, 
    'mu_mdip': mu_mdip,
    #'num_ctrl': X_ctrl.shape[1],
    #'X_ctrl': [], 
    'y': data_mmm['Overall_Revenue'].values
}

model_code2 = '''
functions {
  // the adstock transformation with a vector of weights
  real Adstock(vector t, row_vector weights) {
    return dot_product(t, weights) / sum(weights);
  }
}
data {
  // the total number of observations
  int<lower=1> N;
  // the vector of sales
  real y[N];
  // the maximum duration of lag effect, in weeks
  int<lower=1> max_lag;
  // the number of media channels
  int<lower=1> num_media;
  // matrix of media variables
  matrix[N+max_lag-1, num_media] X_media;
  // vector of media variables' mean
  real mu_mdip[num_media];
  
  
}
parameters {
  // residual variance
  real<lower=0> noise_var;
  // the intercept
  real tau;
  // the coefficients for media variables and base sales
  vector<lower=0>[num_media] beta;
  // the decay and peak parameter for the adstock transformation of
  // each media
  vector<lower=0,upper=1>[num_media] decay;
  vector<lower=0,upper=ceil(max_lag/2)>[num_media] peak;
}
transformed parameters{
  // the cumulative media effect after adstock
  real cum_effect;
  // matrix of media variables after adstock
  matrix[N, num_media] X_media_adstocked;
  // matrix of all predictors
  matrix[N, num_media] X;
  
  // adstock, mean-center, log1p transformation
  row_vector[max_lag] lag_weights;
  for (nn in 1:N){
    for (media in 1 : num_media){
      for (lag in 1 : max_lag){
        lag_weights[max_lag-lag+1] <- pow(decay[media], (lag - 1 - peak[media]) ^ 2);
      }
     cum_effect <- Adstock(sub_col(X_media, nn, media, max_lag), lag_weights);
     X_media_adstocked[nn, media] <- log1p(cum_effect/mu_mdip[media]);
    }
  X <- X_media_adstocked;
  } 
}
model{
  decay ~ beta(3,3);
  peak ~ uniform(0, ceil(max_lag/2));
  tau ~ normal(0, 5);
  for (i in 1 : num_media){
    beta[i] ~ normal(0, 1);
  }
  noise_var ~ inv_gamma(0.05, 0.05 * 0.01);
  y ~ normal(tau + X * beta, sqrt(noise_var));
}
'''

stan_model1 = pystan.StanModel(model_code=model_code2, verbose=True)
fit1 = stan_model1.sampling(data=model_data2, iter=1000, chains=3)
fit1_result = fit1.extract()

Here I have commented and deleted control model related codes because in my dataset there are no control variables.
And I have checked there are no zero values and also negative values in data.

I am getting error as follows

RuntimeError                              Traceback (most recent call last)
C:\Users\703310~1\AppData\Local\Temp/ipykernel_10744/980771661.py in <module>
     87 
     88 stan_model1 = pystan.StanModel(model_code=model_code2, verbose=True)
---> 89 fit1 = stan_model1.sampling(data=model_data2, iter=1000, chains=3)
     90 fit1_result = fit1.extract()

~\anaconda\lib\site-packages\pystan\model.py in sampling(self, data, pars, chains, iter, warmup, thin, seed, init, sample_file, diagnostic_file, verbose, algorithm, control, n_jobs, **kwargs)
    811         call_sampler_args = izip(itertools.repeat(data), args_list, itertools.repeat(pars))
    812         call_sampler_star = self.module._call_sampler_star
--> 813         ret_and_samples = _map_parallel(call_sampler_star, call_sampler_args, n_jobs)
    814         samples = [smpl for _, smpl in ret_and_samples]
    815 

~\anaconda\lib\site-packages\pystan\model.py in _map_parallel(function, args, n_jobs)
     83         try:
     84             pool = multiprocessing.Pool(processes=n_jobs)
---> 85             map_result = pool.map(function, args)
     86         finally:
     87             pool.close()

~\anaconda\lib\multiprocessing\pool.py in map(self, func, iterable, chunksize)
    362         in a list that is returned.
    363         '''
--> 364         return self._map_async(func, iterable, mapstar, chunksize).get()
    365 
    366     def starmap(self, func, iterable, chunksize=None):

~\anaconda\lib\multiprocessing\pool.py in get(self, timeout)
    769             return self._value
    770         else:
--> 771             raise self._value
    772 
    773     def _set(self, i, obj):

RuntimeError: Initialization failed.

I am new to this pystan, any suggestions will be appreciated.
Thanks,

I am not sure, but I would guess that you need to supply sensible initial values for your parameters. See

:: Pystan :: Init values

Hi @edm, Thanks for the reply
The same code which I shared has ran successfully earlier, I am getting this error when I re-run this code. I am not getting why is this.

Thanks.

I think it is because, if you don’t specify initial values, Stan will randomly generate the initial values. Sometimes those values will work, and other times they will lead to an evaluation of log(0) or infinity, and then it fails.

For your model, I don’t think that the random initial values will always respect the bounds of your parameters. If an initial value is outside those bounds, then you will have problems.

I would only recommend this approach as a last resort.

This is usually the problem. Stan expects any parameter values meeting the constraints declared on the parameters to be in support (have non-zero density, aka finite log density).

What you can do is put print("target = ", target()) statements into the code to see where the log density goes bad. I don’t see the problem from looking at the model. I’d probably look at regular culprits like log1p(cum_effect / mu_mdip[medial]). If cum_effect / mu_mdip[medial] < 1, this will throw an error and I don’t see anything constraining that not to happen because mu_mdip is unconstrained. The best thing to do in these cases is to figure out how to constrain the parameters to do the right thing.

The definition of X <- X_media_adstocked is also suspect. That’s in a loop, so you keep setting the X to different things, so it’ll wind up taking the value of the last thing it was set to. Is there a reason you don’t just define X directly?

P.S. I would strongly recommend putting Stan code in standalone files where you can read line numbers from errors and don’t have to worry about escaping quote signs.

P.P.S. We deprecated <- and will be removing it in a few months, so those should change to = signs.

Thanks for the explanation @Bob_Carpenter and @edm
I am new to this pystan, can you give me any example that how to work with constraints in stan language.
And If we don’t want to use the power function transformation can we replace this with the direct transformed data? in the following code

// adstock, mean-center, log1p transformation
  row_vector[max_lag] lag_weights;
  for (nn in 1:N){
    for (media in 1 : num_media){
      for (lag in 1 : max_lag){
        lag_weights[max_lag-lag+1] <- pow(decay[media], (lag - 1 - peak[media]) ^ 2);
      }
     cum_effect <- Adstock(sub_col(X_media, nn, media, max_lag), lag_weights);
     X_media_adstocked[nn, media] <- log1p(cum_effect/mu_mdip[media]);
    }
  X <- X_media_adstocked;
  } 
}

Please guide me on this.
Thank you.