Initial value rejected. More diagnostics?

I am getting the message

Chain 3: Rejecting initial value:
Chain 3:   Log probability evaluates to log(0), i.e. negative infinity.
Chain 3:   Stan can't start sampling from this initial value.

when running my model. Stan then seems to proceed normally, as sampling continues afterwards.

However, I thought I had all my variables constrained, such that this case is not possible, i.e. I should never get values that cause this problem.

Is it somehow possible to get more diagnostics from Stan so it can tell me where this problem occurs?

Here is my current model:

data {
  int<lower=1>                 N;          // Number of trials
  int<lower=1>                 M;          // Number of subjects
  vector<lower=0>[N]           RT;         // Reaction times
  int<lower=1>                 subj[N];    // Subject number for RT i
  vector<lower=0,upper=1>[N]   resp_l;     // 1 if the response was on the left, 0 otherwise
  vector<lower=0,upper=1>[N]   incomp;     // 1 if the trial was incompatible, 0 otherwise
  vector<lower=0,upper=1>[N]   acc;        // Accuracy: correct (1) or incorrect (0) response
  int<lower=1,upper=2>         hp[N];      // Hand position: Down (1) or up (2)
  real<lower=0>                NDTMin;     // Minimal Non-Decision time
  real<lower=0>                minRT[M];
}

parameters {
  // Group level parameters
  real                                   alpha_mu[2];       // Boundary separation mean     (parameter to log normal)
  real<lower=0>                          alpha_sigma[2];    // Boundary separation scale    (parameter to log normal)

  real<lower=1>                          beta_alpha[2];        // alpha parameter to beta distribution for starting value
  real<lower=1>                          beta_beta[2];         // beta  parameter to beta distribution for starting value

  real                                   delta_mu[2];       // mean drift rate (group level)
  real<lower=0>                          delta_sigma[2];    // standard deviation (group level)

  real                                   eta_mu[2];            // Drift rate difference between compatible / incompatible trials
  real<lower=0>                          eta_sigma[2];         // Drift rate difference (standard deviation component)

  // Individual parameters
  vector<lower=0>[M]                     alpha[2];   // Individual boundary separation
  vector<lower=0,upper=1>[M]             beta[2];    // Individual starting value
  vector[M]                              delta[2];   // Individual drift rate
  vector<lower=NDTMin>[M]                tau;        // non-decision time (no hierarchical model)

  vector[M]                              eta_z[2];   // Congruency effect of this participant (z-score)
}

transformed parameters {
  vector[N] alpha_trl;
  vector[N] beta_trl;   // Beta for each trial
  vector[N] delta_trl;  // Drift rate in each trial

  vector[M] eta[2];     // Individual compatibility effects

  for(i in 1:2) {
    eta[i] = eta_mu[i] + eta_z[i]*eta_sigma[i];
  }

  for(i in 1:N) {
    alpha_trl[i] = alpha[hp[i],subj[i]];
    // initial offset should mostly depend on handedness etc.
    // i.e. a single offset towards left/right responses
    // therefore, we reverse the beta, if the response was on
    // the left
    beta_trl[i] = beta[hp[i],subj[i]] + resp_l[i]-2*beta[hp[i],subj[i]] .* resp_l[i];
    delta_trl[i] = (delta[hp[i],subj[i]] + incomp[i] .* eta[hp[i],subj[i]]) .* (2*acc[i]-1);
  }
}

model {
  alpha_mu    ~ std_normal();
  alpha_sigma ~ exponential(10);

  tau         ~ uniform(NDTMin, minRT);

  delta_mu    ~ normal(0,10);
  delta_sigma ~ cauchy(0,10);

  beta_alpha ~ exponential(1);
  beta_beta  ~ exponential(1);

  eta_mu      ~ normal(0,10);
  eta_sigma   ~ cauchy(0,100);



  for(i in 1:2) {
    alpha[i]   ~ lognormal(alpha_mu[i],alpha_sigma[i]);
    beta[i]    ~ beta(beta_alpha[i], beta_beta[i]);
    delta[i]   ~ normal(delta_mu[i],delta_sigma[i]);
    eta_z[i]   ~ std_normal();
  }

  RT ~ wiener(alpha_trl, tau[subj], beta_trl, delta_trl);

}

The problem does not appear if I just take a subset of the data, but I have not figured out which part of the data is responsible, yet.

It’s a bit clunky, but you can debug this by putting print statements in the model block to find the point at which the log probability evaluates to -inf. The statement print(target()); prints the log probability accumulated so far, so by moving it around in the model block you can find the line where the problem occurs. Instead of moving it around, you could also put a print statement after each line in the model block, e.g.,

a ~ std_normal();
print("target_1 = ", target());
b ~ std_normal();
print("target_2 = ", target());

If you do that then I recommend only running for a few iterations or you’ll get a ton of print output.
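Since you mention that the problem depends on which part of the data you use, a more targeted variant might be to evaluate the likelihood one trial at a time and print only the trials where it is not finite. The lines below are a rough, untested sketch that would go in the model block just before the RT ~ wiener(...) statement. The local name lp_i is made up for illustration; everything else comes from your program.

```stan
// Sketch: per-trial check of the wiener likelihood.
// lp_i is a hypothetical local name; all other names are from the model above.
for (i in 1:N) {
  // Log density of a single trial under the same likelihood as RT ~ wiener(...)
  real lp_i = wiener_lpdf(RT[i] | alpha_trl[i], tau[subj[i]], beta_trl[i], delta_trl[i]);
  if (is_inf(lp_i) || is_nan(lp_i)) {
    print("trial ", i, ": subj = ", subj[i], ", RT = ", RT[i],
          ", tau = ", tau[subj[i]], ", lp = ", lp_i);
  }
}
```

This only prints when a trial contributes a non-finite value, so the output stays short. If a density function rejects its inputs outright instead of returning -inf, Stan should report that as a separate error message, and the print will not be reached for that trial.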

Glancing at your Stan program, the first thing I noticed is that you have

tau ~ uniform(NDTMin, minRT);

but tau is only declared with a lower bound of NDTMin and no upper bound:

vector<lower=NDTMin>[M] tau;

So Stan may randomly initialize tau to a value above minRT, which isn’t valid under a uniform distribution bounded from above at minRT. That could result in the messages you’re seeing about initialization. Stan retries initialization many times, so if it happens to land on a valid point it will run, which could be why you get these messages but then it runs ok.
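One way to give tau a different upper bound for each subject might be to sample a parameter on the unit interval and rescale it per subject. The program below is an untested sketch covering only the tau part of the model. The name tau_raw is made up for illustration, and it assumes minRT[m] > NDTMin for every subject.

```stan
data {
  int<lower=1>    M;          // Number of subjects
  real<lower=0>   NDTMin;     // Minimal non-decision time
  real<lower=0>   minRT[M];   // Fastest RT per subject (assumed > NDTMin)
}

parameters {
  // Hypothetical name: relative position of tau within its allowed interval
  vector<lower=0,upper=1>[M] tau_raw;
}

transformed parameters {
  // Maps tau_raw in (0, 1) to tau[m] in (NDTMin, minRT[m])
  vector[M] tau = NDTMin + tau_raw .* (to_vector(minRT) - NDTMin);
}

model {
  // No sampling statement for tau_raw: its implicit uniform(0, 1) prior
  // corresponds to tau[m] being uniform on (NDTMin, minRT[m]).
}
```

With this setup the tau ~ uniform(NDTMin, minRT); statement would no longer be needed, because tau can never leave its interval. Explicit initial values would then have to be supplied for tau_raw rather than for tau.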

There could be other issues but that’s the first one to check.


Thank you, I will try the print tomorrow.

I should have been clearer about tau: it is initialized explicitly (by drawing from the same uniform distribution in R) instead of having Stan initialize it. I can’t put an upper bound on it in the declaration, as minRT is different for each participant and I can’t use an array as an upper limit.

I guess I should still double-check that the initial values I supply are actually being used.