Scale parameter is 0, but must be > 0! - can I do anything to deal with this?

I am building a more complicated model, but have reproduced the warning in a very simple case. The warning does indicate that there is nothing really going wrong, but I don’t like getting that warning - is there anything to be done?

Here is the simple Stan model:

data {
    int<lower=1> N;
    real y[N];
}

parameters {
    real mu;
    real<lower=0> sigma;
}

model {
    mu ~ normal(0, 10);
    sigma ~ exponential(1);
    y ~ normal(mu, sigma);
}

And here is the R code:

library(cmdstanr)

set.seed(334)

y <- rnorm(30, 10, 3)
N <- length(y)

mod <- cmdstan_model("KSG/simple_normal.stan")

fit <- mod$sample(
  data = list(N=N,y=y),
  seed = 123,
  chains = 4,
  parallel_chains = 4,
  refresh = 500,
  iter_warmup = 500,
  iter_sampling = 1000
)

And here are the warnings:

Running MCMC with 4 parallel chains...

Chain 1 Iteration:    1 / 1500 [  0%]  (Warmup) 
Chain 1 Iteration:  500 / 1500 [ 33%]  (Warmup) 
Chain 1 Iteration:  501 / 1500 [ 33%]  (Sampling) 
Chain 1 Iteration: 1000 / 1500 [ 66%]  (Sampling) 
Chain 1 Iteration: 1500 / 1500 [100%]  (Sampling) 
Chain 2 Iteration:    1 / 1500 [  0%]  (Warmup) 
Chain 2 Iteration:  500 / 1500 [ 33%]  (Warmup) 
Chain 2 Iteration:  501 / 1500 [ 33%]  (Sampling) 
Chain 2 Iteration: 1000 / 1500 [ 66%]  (Sampling) 
Chain 2 Iteration: 1500 / 1500 [100%]  (Sampling) 
Chain 2 Informational Message: The current Metropolis proposal is about to be rejected because of the following issue:
Chain 2 Exception: normal_lpdf: Scale parameter is 0, but must be > 0! (in '/var/folders/wt/rrrkt68n08b0jrstl_87kpkc0000gn/T/RtmpAna9YA/model-1fc7e1360c5.stan', line 14, column 4 to column 26)
Chain 2 If this warning occurs sporadically, such as for highly constrained variable types like covariance matrices, then the sampler is fine,
Chain 2 but if this warning occurs often then your model may be either severely ill-conditioned or misspecified.
Chain 2 
Chain 3 Iteration:    1 / 1500 [  0%]  (Warmup) 
Chain 3 Iteration:  500 / 1500 [ 33%]  (Warmup) 
Chain 3 Iteration:  501 / 1500 [ 33%]  (Sampling) 
Chain 3 Iteration: 1000 / 1500 [ 66%]  (Sampling) 
Chain 3 Iteration: 1500 / 1500 [100%]  (Sampling) 
Chain 4 Iteration:    1 / 1500 [  0%]  (Warmup) 
Chain 4 Iteration:  500 / 1500 [ 33%]  (Warmup) 
Chain 4 Iteration:  501 / 1500 [ 33%]  (Sampling) 
Chain 4 Iteration: 1000 / 1500 [ 66%]  (Sampling) 
Chain 4 Iteration: 1500 / 1500 [100%]  (Sampling) 
Chain 1 finished in 0.0 seconds.
Chain 2 finished in 0.0 seconds.
Chain 3 finished in 0.0 seconds.
Chain 4 finished in 0.0 seconds.

All 4 chains finished successfully.
Mean chain execution time: 0.0 seconds.
Total execution time: 0.2 seconds.

Theoretically that warning shouldn’t be occurring at all with the lower=0 constraint, but it can happen if the unconstrained value is sampled as -infinity.

The way the constraints work is that instead of sampling sigma (which is constrained), Stan samples log(sigma) (which is unconstrained) and then exponentiates the result to put it back on the constrained scale (i.e., greater than 0). Accordingly, if for some reason log(sigma) is sampled as -infinity, then exponentiating the result will return 0 and give that error.
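A minimal R sketch of that mapping, for illustration only. It mimics the log/exp transform described above rather than Stan's actual internals, and the helper name `to_constrained` is hypothetical.

```r
# Hypothetical helper: map an unconstrained value (log_sigma) back to the
# constrained, positive scale used for a real<lower=0> parameter.
to_constrained <- function(log_sigma) exp(log_sigma)

to_constrained(0)     # 1
to_constrained(-5)    # small, but still strictly positive
to_constrained(-Inf)  # exactly 0, which normal_lpdf rejects as a scale
```

Any finite input that is not too extreme maps to a strictly positive sigma, so the lower=0 constraint holds in exact arithmetic; a zero can only arise at the floating-point limits.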

As you say, it’s not concerning given that the warning only appears once, but it is interesting. @nhuurre have I got that explanation right?

"if for some reason log(sigma) is sampled as -infinity, then exponentiating the result will return 0 and give that error."

I’ve wondered about this too. Given the typical initialisation range, I don’t think it’s likely that negative infinity is getting sampled. But I have no alternative hypotheses. I guess we could check by inspecting the unconstrained values associated with the proposals that trigger those warnings, but those aren’t saved, right?

Even if they were I don’t think we could access the problematic values, since those proposals were rejected.

I wonder if it could be something to do with the exponential distribution, since the exponential(1) prior has its highest density at 0. But I’m not sure, to be honest.

It doesn’t need to go all the way to infinity: any value below about -745 exponentiates to zero, because the result is smaller than the smallest positive 64-bit floating-point number.
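This can be checked directly in R, assuming the usual IEEE 754 double-precision arithmetic. On such a platform the cutoff for exp() sits near -745. The variable names below are only for illustration.

```r
# Where does exp() underflow to exactly zero in double precision?
log_sigma <- c(-300, -700, -745, -746, -1000)
sigma <- exp(log_sigma)

# TRUE marks unconstrained values that would produce a scale of exactly 0
data.frame(log_sigma = log_sigma, is_zero = sigma == 0)

# Smallest positive normalized double, for reference
.Machine$double.xmin
```

Values between roughly -708 and -745 still give a positive result because they land in the subnormal range; only below that does the result round to zero.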

That’s still very far from anything reasonable. Although the warning is displayed at the end of chain 2, I believe the problem happened at the very beginning of the chain and the output was delayed for some reason. The chain initializes at some random position quite far from the typical set. The gradient points towards the typical set, but if the step size is too large the next proposal can easily overshoot. The sampler adjusts the step size during warmup, so the problem should disappear after a few samples.

You could try setting the initial step size to a smaller value, e.g.

fit <- mod$sample(
  step_size = 0.1,
  data = list(N=N,y=y),
  ...

Thanks - the step_size specification you suggested works, but it seems odd that it is necessary, particularly given how simple this example is. As a comparison, I tried this using rstan, and there were no issues at all.

Shrinking the stepsize should definitely help, but in this case even just changing the seed is enough for me to get rid of the warning.

The interface used shouldn’t matter here (although RStan and CmdStanR are using different versions of Stan so it could matter for other things). Changing the seed a few times was enough to get rid of the warning for me using CmdStanR too.