Scale parameter negative in normal_lpdf despite positive initialization

Dear Stan users,

I’m using a Stan model adapted from The Virtual Brain to fit real SEEG data.
The model computes the joint log probability of slp (SEEG log power) and snsr_pwr (total power in each sensor).

During fitting (using optimize with LBFGS), I encountered the following error messages in the early phase of optimization:

Exception: normal_lpdf: Scale parameter is -85.0778, but must be > 0!
Error evaluating model log probability: Non-finite function evaluation.

However, the optimization eventually terminates normally.

My questions:

  1. I confirmed that the initial values for the scale parameters eps_slp and eps_snsr_pwr are strictly positive (both set to 1.0), and they are also constrained in the model using T[0,].
    → Is it still possible for normal_lpdf(..., ..., scale) to evaluate with a negative scale parameter during optimization?

  2. Since this error occurred early and the optimization finished successfully,
    can these errors be safely ignored, or are they indicative of deeper instability in the model?

I attach the Stan code for your reference.

transformed parameters {
  // Euler integration of the epileptor without noise 
  row_vector[nn] x[nt];
  row_vector[nn] z[nt];
  row_vector[ns] mu_slp[nt];
  row_vector[ns] mu_snsr_pwr = rep_row_vector(0, ns);
  for (t in 1:nt) {
    if(t == 1){
      x[t] = x_step(x_init, z_init, I1, time_step);
      z[t] = z_step(x_init, z_init, x0, K*SC, time_step, tau0);
    }
    else{
      x[t] = x_step(x[t-1], z[t-1], I1, time_step);
      z[t] = z_step(x[t-1], z[t-1], x0, K*SC, time_step, tau0);
    }
    // Map regional source activity to sensor log power through the gain matrix
    mu_slp[t] = amplitude * (log(gain * exp(x[t])')' + offset);
    mu_snsr_pwr += mu_slp[t] .* mu_slp[t]; // accumulate per-sensor power
  }
  mu_snsr_pwr = mu_snsr_pwr / nt;
}

model {
  x0 ~ normal(x0_mu, 1.0);
  amplitude ~ normal(1.0, 10.0)T[0,];
  offset ~ normal(0, 10.0);
  tau0 ~ normal(20, 10.0)T[5,];
  K ~ normal(1.0, 10.0)T[0,];
  for (i in 1:nn){
    x_init[i] ~ normal(-2.0, 10.0);
    z_init[i] ~ normal(3.5, 10.0);
  }
  eps_slp ~ normal(1, 10)T[0,];
  eps_snsr_pwr ~ normal(1, 10)T[0,];
  for (t in 1:nt) {
    target += normal_lpdf(slp[t] | mu_slp[t], eps_slp);
  }
  target += normal_lpdf(snsr_pwr | mu_snsr_pwr, eps_snsr_pwr);
}

All initial parameter values (including x_init, z_init, eps_slp, etc.) are confirmed to be in valid ranges.

Any help in understanding whether this issue is critical or safe to ignore would be greatly appreciated!

Thanks in advance!

No. But… eps is “sampled” with a constraint, but is it also declared with one? You are not sharing the parameters block, so there is no way to tell.

No, as the optimization is not done yet, and we cannot tell from the shared blocks what could cause this. I also wonder why you use optimization instead of sampling.

Another note: depending on the scale, the priors could be too wide. normal(1, 10) can give values far away from 1, assuming that 1 is the “typical” value.
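For instance, if values near 1 really are typical, a narrower prior such as the following keeps most of the prior mass in a plausible range (a sketch; the scale of 1 is an illustrative choice, and it assumes the eps parameters are declared with lower=0 so the prior acts as a half-normal):

eps_slp ~ normal(1, 1);      // most prior mass roughly in [0, 4]
eps_snsr_pwr ~ normal(1, 1);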

Thanks for your reply!

Here are the data and parameters blocks from my model:

data {
  int nn;
  int ns;
  int nt;
  matrix[ns,nn] gain;
  matrix<lower=0.0>[nn, nn] SC;

  // Modelled data
  row_vector[ns] slp[nt]; //seeg log power
  row_vector[ns] snsr_pwr; //Total power in each sensor

  // mean of x0 priors, EZ hypothesis can be embedded using this
  row_vector[nn] x0_mu;
}

parameters {
  row_vector[nn] x0;
  real amplitude;
  real offset;
  real K;
  real tau0;
  row_vector[nn] x_init;
  row_vector[nn] z_init;
  real eps_slp;
  real eps_snsr_pwr;
}

As you can see, eps_slp and eps_snsr_pwr are declared as real,
and they are constrained using T[0,] in the model block.

I have a few follow-up questions:

  1. Do you recommend narrowing the prior range?
    For now, both scale parameters are defined as:
    eps_slp ~ normal(1, 10)T[0,];
    I wonder if something like normal(1, 1) would be more stable.

  2. I’m using both optimization and sampling (optimize and sample),
    and I plan to use the output from both methods.
    But even when sampling, a similar warning occurs during the early warmup phase:

Informational Message: The current Metropolis proposal is about to be rejected because of the following issue:
Exception: multiply: B[1] is -nan, but must not be nan! (in ‘…/rpne_040.stan’ at line 114)
If this warning occurs sporadically, such as for highly constrained variable types like covariance matrices, then the sampler is fine,
but if this warning occurs often then your model may be either severely ill-conditioned or misspecified.

This warning appears several times at the beginning.
Is this something I should worry about, or can I safely ignore it if the chains stabilize afterward?

Any further suggestions on improving model robustness would be much appreciated.
Thank you again for your support!

The definition of eps needs to be constrained; cf. Truncated or Censored Data. Specifically, you may want to use lower=0 to constrain eps.

The modeler will need to assess the prior. Posterior and Prior Predictive Checks may help.
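For example, a posterior predictive replicate of slp can be generated alongside the fit (a sketch reusing mu_slp and eps_slp from the model above; slp_pred is a hypothetical name):

generated quantities {
  // Replicate the SEEG log power from the fitted model for predictive checks
  row_vector[ns] slp_pred[nt];
  for (t in 1:nt)
    slp_pred[t] = to_row_vector(normal_rng(mu_slp[t], eps_slp));
}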

See the truncation & constraint reference above.

It may sound like a lot to ask, but following the guidance in [2011.01808] Bayesian Workflow would answer many general questions about modeling in Stan.


This is the right answer. But if you have a variable sampled with T[0,] in the model block, it is required to be declared with lower=0. Stan requires the model to have support for all variable values satisfying the declared constraints. If you only use the T[0,], it turns into rejection sampling and you’ll have all sorts of problems.

Also, the way you’re coding this is very inefficient.

eps_slp ~ normal(1, 10)T[0,];
eps_snsr_pwr ~ normal(1, 10)T[0,];

What you want to do is declare as

real<lower=0> eps_slp;
real<lower=0> eps_snsr_pwr;

and then you can drop the truncation in the distribution statements. When the parameters of a truncated distribution are constants and the variable is declared to satisfy the constraint, the truncation just adds a very expensive constant to the log density and hence doesn’t affect sampling.
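Putting the two changes together, the relevant parts of the model would look like this (a sketch using the names from the model above):

parameters {
  real<lower=0> eps_slp;       // lower=0 replaces the T[0,] truncation
  real<lower=0> eps_snsr_pwr;
}

model {
  eps_slp ~ normal(1, 10);      // half-normal prior; no truncation statement needed
  eps_snsr_pwr ~ normal(1, 10);
}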