Error evaluating the log probability at the initial value

I am getting

Rejecting initial value:
  Error evaluating the log probability at the initial value.
Exception: lognormal_lpdf: Random variable[3] is -0.150929, but must be >= 0!  (in 'modela6b869eb4ad3_LorenzStan' at line 51)

and eventually

Initialization between (-2, 2) failed after 100 attempts. 
 Try specifying initial values, reducing ranges of constrained values, or reparameterizing the model.
[1] "Error in sampler$call_sampler(args_list[[i]]) : Initialization failed."

The offending line (51) is

  x[1,] ~ lognormal(log(1.0), 0.2);

The full model is below. Aren’t I just asking the prior for x[1,] to be sampled from a log normal distribution? But it looks like negative values are being sampled.

functions {
  real[] ds_dt(real t,       // time
               real[] s,     // system state
               real[] theta, // parameters
               real[] x_r,   // data
               int[] x_i)    // unused integer data
  {
    real x = s[1];
    real y = s[2];
    real z = s[3];

    real alpha = theta[1];

    real rho = x_r[1];
    real beta = x_r[2];

    real dx_dt = alpha * (y - x);
    real dy_dt = x * (rho - z) - y;
    real dz_dt = x * y - beta * z;

    return { dx_dt, dy_dt, dz_dt };
  }
}

data {
  int<lower = 1> N; // Number of measurements
  real ts[N];       // Times of the measurements
  real y[N];        // The measurements themselves
}

transformed data {
  real rho  = 45.92;
  real beta =  4.00;
  real rdummy[2];
  rdummy[1] = rho;
  rdummy[2] = beta;
}

parameters {
  real x[N,3];
  real ln_alpha[N];
  real mu_alpha;
  real<lower = 0> sigma_alpha;
}

model {
  real parms[1];
  real deltaT;
  mu_alpha ~ uniform(12.0, 20.0);
  sigma_alpha ~ uniform(0.0, 0.5);
  x[1,] ~ lognormal(log(1.0), 0.2);
  ln_alpha[1] ~ normal(log(mu_alpha), sigma_alpha);
  for (i in 2:N) {
    parms[1] = exp(ln_alpha[i-1]);
    target += normal_lpdf(x[i,] | (integrate_ode_rk45(ds_dt, x[i-1,], ts[i-1], ts[i:i], parms, rdummy, rep_array(0, 0)))[1], 1.0E-3);
    deltaT = ts[i] - ts[i-1];
    target += normal_lpdf(ln_alpha[i] | ln_alpha[i - 1] - deltaT * sigma_alpha * sigma_alpha / 2, sqrt(deltaT) * sigma_alpha);
  }
  y ~ normal(x[,1], 0.2);
}

You need to put a lower bound on the thing that has a lognormal prior, i.e.

real<lower=0> x[N,3];
2 Likes

Thanks very much but why do I need to do this? A log normal RV is a priori positive: X = e^{(\mu + \sigma Z)} where Z is normal with mean zero and unit variance. So how can a sampler do anything other than draw positive samples from such a distribution?

Not that an experiment is proof but also

min(rlnorm(1000000, meanlog = 0, sdlog = 1))
0.00678572495336411

The way the sampler proposes possible values from the posterior distribution is a separate process, and can have absolutely nothing to do with what prior distributions you put on parameters.

When you declare a parameter:

parameters {
  real <upper = 5> a_parameter;
}

You are declaring that a_parameter has support (-\infty, 5], and Stan will the propose values from within that support. If you then put a lognormal prior on a_parameter, you need to match the declared bounds on the parameter with the support of the prior.

Hi @idontgetoutmuch,

(Love your screen name!) Wow, I had this exact same problem about four days ago. I solved it by looking through the source code – as you know, the problem you’re encountering occurs during initialization. What happens is that if you don’t specify any constraints (or initial values) for a parameter, then initial values for that parameter will be drawn randomly from a uniform distribution with range [-2, 2] by default. That’s even if you’ve specified a lognormal prior for that parameter. If you specify <lower=0>, then the parameter values will be drawn from roughly the range [0.14, 7]. I checked this and I think the code actually uses a lognormal distribution with a location parameter of 0 (i.e. log of 1) and a scale parameter of 1 (I just checked some initial values against some randomly generated values, I didn’t look for the C++ code). So that’s why you need to specify <lower=0> even though you’re correct that your lognormal prior isn’t actually getting used in generating initial values for x[1,].

One thing I don’t understand, though, is that it looks like you only have three variables in x[1,], which means that one out of eight times you should still get positive values for all three. I’m not sure why the initialization failed after 100 attempts. Maybe someone else here could clear that up.

Just a heads-up: here’s a post with the problem I ran into two days after this one that you’re posting about, maybe it can help you avoid the next pothole.

Good luck!

That’s very useful - thanks very much. I probably don’t need a lognormal here anyway but I was trying to keep my model the same as the one I wrote in libbi.

There’s a full explanation of what’s going on in the JSS paper and in the reference manual.

Stan transforms any constrained variables to be unconstrained. For example, if there is a positive constrained variable, it is log transformed. Stan then assumes support for the entire unconstrained space. So when you don’t constrain a variable that has to be positive to have a lower bound of zero, Stan will assume negative values are legal. They then get rejected because they have no support and cause initialization to fail.

A positive constrained variable will be initialized with a (non-uniform) draw from (exp(-2), exp(2)). What happens is that if you have

real<lower = 0> sigma;

then sigma is log transformed so sigma_unconstrained = log(sigma) ranges over (-\infty, \infty). Stan samples using the transformed variable sigma_unconstrained. When Stan evaluates the model density, it uses the inverse transform, mapping the unconstrained value sigma_unconstrained back to (0, \infty) using exp(sigma_unconstrained) = sigma. (Rounding can result in a value of 0 with floating point arithmtic rather than real numbers.)

The key thing to note is that Stan automatically applies the log Jacobian of the inverse transform evaluated at the unconstrained parameter value in order to make sure that the distribution for sigma (the positively constrained one) is uniform from 0 to infinity.

1 Like