# Defining a custom lpdf for the Normal-Laplace convoluted density in Stan

I am trying to define the Normal-Laplace density as stated at the bottom of page 5 of https://arxiv.org/pdf/1712.07216.pdf, but also supporting a location parameter in the Normal component. I will attach my code here; I find the examples in the documentation hard to follow in this context, so there may be some obvious issues with it:
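For reference, my reading of the density (with a location $\mu$ added to the Normal component; $M$ is the Mills ratio, $\phi$ and $\Phi$ the standard normal PDF and CDF, $s$ the Laplace scale) is:

$$
f(y) = \frac{1}{\sqrt{2}\,s}\,\phi(r)\left[M\!\left(\frac{\sqrt{2}\,\sigma}{s} - r\right) + M\!\left(\frac{\sqrt{2}\,\sigma}{s} + r\right)\right],
\qquad r = \frac{y-\mu}{\sigma},
\qquad M(z) = \frac{1-\Phi(z)}{\phi(z)}
$$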

``````functions {
  real mills_ratio(real z) {
    // return (1 - normal_cdf(z, 0, 1)) / exp(normal_lpdf(z | 0, 1));
    return (1 - Phi_approx(z)) / exp(std_normal_lpdf(z));
  }

  real normal_laplace_lpdf(vector y, real mu, real sigma, real scale) {
    real total = 0.0;
    for (i in 1:num_elements(y)) {
      total += log((1 / sqrt(2) * scale) * std_normal_lpdf((y[i] - mu) / sigma)
                   * (mills_ratio((sqrt(2) * sigma) / scale - (y[i] - mu) / sigma)
                      + mills_ratio((sqrt(2) * sigma) / scale + (y[i] - mu) / sigma)));
    }
    return total;
  }
}
``````

I am looking to call it like this:

`y2 ~ normal_laplace(mu, sqrt(sigma2), scale);`

Currently this results in the following when running with pystan:

Rejecting initial value:
Log probability evaluates to log(0), i.e. negative infinity.
Stan can't start sampling from this initial value.
(the message repeats for each attempt)

Initialization between (-2, 2) failed after 100 attempts.
Try specifying initial values, reducing ranges of constrained values, or reparameterizing the model.

y2 is a vector of samples drawn from a Normal(5, 2) distribution with Laplace(0, 1.5) noise added to them; scale is passed to the Stan model as a known quantity in the data block.
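For reproducibility, the data can be generated with something like the following sketch (note that numpy's `laplace` takes the classical scale parameter b, which may or may not match the paper's parameterization of `scale`; the seed and sample size here are arbitrary):

```python
import numpy as np

# Simulated data: Normal(5, 2) samples with additive Laplace(0, 1.5) noise.
# NB: numpy's laplace scale is the classical b, not a standard deviation;
# whether this matches the model's `scale` parameterization is an assumption.
rng = np.random.default_rng(seed=1)
n = 500
y2 = rng.normal(loc=5.0, scale=2.0, size=n) + rng.laplace(loc=0.0, scale=1.5, size=n)
```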

Any help would be much appreciated.


Part of the problem is that you are taking the logarithm of the standard normal log-PDF rather than the logarithm of the standard normal PDF.


I think the problem is that you have `std_normal_lpdf((y[i] - mu) / sigma)` where it should be `exp(std_normal_lpdf((y[i] - mu) / sigma))`.

I recommend calculating everything on the log scale, like this:

``````functions {
  real log_mills_ratio(real z) {
    return log1m(Phi(z)) - std_normal_lpdf(z);
  }

  real normal_laplace_lpdf(vector y, real mu, real sigma, real scale) {
    real total = 0.0;
    real k = (sqrt(2) * sigma) / scale;
    for (i in 1:num_elements(y)) {
      real r = (y[i] - mu) / sigma;
      total += -log(sqrt(2) * scale)
               + std_normal_lpdf(r)
               + log_sum_exp(log_mills_ratio(k - r),
                             log_mills_ratio(k + r));
    }
    return total;
  }
}
``````
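As a sanity check outside Stan, the same log-scale computation can be mirrored in Python with scipy (`scipy.special.log_ndtr` is a numerically stable log Phi); if the density is right, it should integrate to 1. This is only an illustrative cross-check, not part of the model:

```python
import numpy as np
from scipy.special import log_ndtr
from scipy.stats import norm
from scipy.integrate import quad

def log_mills(z):
    # log Mills ratio: log Phi(-z) - log phi(z), stable even for large z
    return log_ndtr(-z) - norm.logpdf(z)

def normal_laplace_logpdf(y, mu, sigma, scale):
    # Mirrors the Stan lpdf: log(1/(sqrt(2)*scale)) + log phi(r)
    # + log(Mills(k - r) + Mills(k + r)), computed entirely on the log scale.
    r = (y - mu) / sigma
    k = np.sqrt(2.0) * sigma / scale
    return (-np.log(np.sqrt(2.0) * scale)
            + norm.logpdf(r)
            + np.logaddexp(log_mills(k - r), log_mills(k + r)))

# A proper density should integrate to 1 for any parameter values.
total, _ = quad(lambda y: np.exp(normal_laplace_logpdf(y, 5.0, 2.0, 1.5)),
                -np.inf, np.inf)
```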

Thank you for the great help; I totally missed that `exp`. This does now work, at least in part. My experiment involves varying the proportion of contaminated data and the Laplace scale, and I can now get through a few parameter configurations before seeing this:

…
Rejecting initial value:
Gradient evaluated at the initial value is not finite.
Stan can't start sampling from this initial value.
(the message repeats for each attempt)

Initialization between (-2, 2) failed after 100 attempts.
Try specifying initial values, reducing ranges of constrained values, or reparameterizing the model.
multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
File "/usr/local/Caskroom/miniconda/base/envs/bayes/lib/python3.7/multiprocessing/pool.py", line 121, in worker
result = (True, func(*args, **kwds))
File "/usr/local/Caskroom/miniconda/base/envs/bayes/lib/python3.7/multiprocessing/pool.py", line 44, in mapstar
return list(map(*args))
File "stanfit4anon_model_69327d427e3eac45965d5233a6417087_4342000952850253565.pyx", line 373, in stanfit4anon_model_69327d427e3eac45965d5233a6417087_4342000952850253565._call_sampler_star
File "stanfit4anon_model_69327d427e3eac45965d5233a6417087_4342000952850253565.pyx", line 406, in stanfit4anon_model_69327d427e3eac45965d5233a6417087_4342000952850253565._call_sampler
RuntimeError: Initialization failed.
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File "run.py", line 90, in
unseen_data, ytildes, klmodel, betamodel, knownmodel, args.cpu_count, noise_config[0], *prior_config)
File "/Users/Library/Mobile Documents/com~apple~CloudDocs/PhD/Synthetic Data/experiment.py", line 41, in run_experiment
knownfit = knownmodel.sampling(data=data, warmup=warmup, iter=num_iter, chains=chains, n_jobs=cpu_count)
File "/usr/local/Caskroom/miniconda/base/envs/bayes/lib/python3.7/site-packages/pystan/model.py", line 813, in sampling
ret_and_samples = _map_parallel(call_sampler_star, call_sampler_args, n_jobs)
File "/usr/local/Caskroom/miniconda/base/envs/bayes/lib/python3.7/site-packages/pystan/model.py", line 85, in _map_parallel
map_result = pool.map(function, args)
File "/usr/local/Caskroom/miniconda/base/envs/bayes/lib/python3.7/multiprocessing/pool.py", line 268, in map
return self._map_async(func, iterable, mapstar, chunksize).get()
File "/usr/local/Caskroom/miniconda/base/envs/bayes/lib/python3.7/multiprocessing/pool.py", line 657, in get
raise self._value
RuntimeError: Initialization failed.

Do you have any further advice for troubleshooting this? I am currently using the code you suggested for my `normal_laplace_lpdf`.

So the log probability is finite but its gradient is undefined? Ouch, that sounds rather tricky. My best guess is that the Mills ratio calculation underflows. If so, this check will print a warning:

``````real log_mills_ratio(real z) {
  real lr = log1m(Phi(z)) - std_normal_lpdf(z);
  if (lr < -1e30) {
    print("Mills ratio failed! ", z, " -> ", lr);
  }
  return lr;
}
``````

Yep, I am getting a lot of `-inf`s here. Do you have any ideas for how to fix this? Some sample output:

Phi(z): 1
Std Norm lpdf: -177.215
Mills ratio failed! 18.7774 -> -inf
Phi(z): 1
Std Norm lpdf: -198.119
Mills ratio failed! 19.8595 -> -inf
Phi(z): 1
Std Norm lpdf: -598.526
Mills ratio failed! 34.5719 -> -inf
Phi(z): 1
Std Norm lpdf: -343.762
Mills ratio failed! 26.1856 -> -inf

These values don't fill me with confidence.

`log(Phi(-z))` should be more stable than (but otherwise equal to) `log1m(Phi(z))`.

The issue probably only hurts initialization. Once the sampler finds the correct posterior, the errors should disappear.
Standardizing your data and parameters may help initialization. The idea is that Stan first sets all parameters to values between -2 and 2, and you need to write the model so that this guess is not too far from the true values. "Too far" in this case means maybe 20 standard deviations.


The `-inf`s are going to be difficult to avoid with the `Phi` function, since `Phi` will underflow to 0 below -37.5 and overflow to 1 above 8.

So for `log1m(Phi(z))`, if z > 8, this evaluates to `log(1 - 1) = log(0) = -inf`.

For `log(Phi(-z))`, if z >= 37.5, this evaluates to `log(Phi(-37.5)) = log(0) = -inf`.
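Both failure modes, and the fix via a dedicated log-CDF, can be demonstrated outside Stan in Python with scipy, whose `scipy.special.log_ndtr` computes log Phi directly (the value z = 40 is just an arbitrary far-tail point):

```python
import numpy as np
from scipy.special import log_ndtr
from scipy.stats import norm

z = 40.0
with np.errstate(divide="ignore"):
    # log1m(Phi(z)): Phi(40) has already rounded up to 1.0, so this is log(0)
    naive_upper = np.log1p(-norm.cdf(z))
    # log(Phi(-z)): Phi(-40) has underflowed to 0, so this is also log(0)
    naive_lower = np.log(norm.cdf(-z))
# A dedicated log-CDF stays finite far into the tail
stable = log_ndtr(-z)  # roughly -804.6
```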


That's right. According to the manual, `Phi_approx` should work better in the tails.

If at all useful, tensorflow_probability has special functions for `log(Phi(x))` which have better numerical stability and might help here:

A very basic example of usage:

``````python
>>> import tensorflow_probability as tfp
>>> from tensorflow_probability.python.internal import special_math
>>> dist = tfp.distributions.Normal(0.0, 1.0)
>>> dist.log_cdf(-100.0)
<tf.Tensor: shape=(), dtype=float32, numpy=-5005.524>
>>> special_math.log_ndtr(-100.0)
<tf.Tensor: shape=(), dtype=float32, numpy=-5005.524>
>>> dist.cdf(-100.0)
<tf.Tensor: shape=(), dtype=float32, numpy=0.0>
``````

I think the manual is a bit misleading here, because `Phi_approx(z)` will underflow to 0 at roughly z < -21.5 and overflow to 1 at roughly z > 5.
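`Phi_approx(z)` is documented as the logistic approximation `inv_logit(0.07056 * z^3 + 1.5976 * z)`; a quick sketch (my own re-implementation for illustration, not Stan's actual code) shows the lower-tail underflow:

```python
import numpy as np

def phi_approx(z):
    # Logistic approximation to the standard normal CDF, as documented
    # for Stan's Phi_approx: inv_logit(0.07056 * z^3 + 1.5976 * z).
    x = 0.07056 * z**3 + 1.5976 * z
    # Numerically stable inverse logit
    if x >= 0:
        return 1.0 / (1.0 + np.exp(-x))
    return np.exp(x) / (1.0 + np.exp(x))

val_ok = phi_approx(-20.0)   # tiny but still positive
val_uf = phi_approx(-25.0)   # the cubic term drives exp() to underflow: exactly 0.0
```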


If you have any suggestions on how to improve it, put them here and I'll make a PR about it (unless you want to do it yourself directly, of course).

I usually just stick to `Phi()` if I can't avoid it. ;) The issue with `Phi_approx()` has come up here a few times, and I guess the consensus was that Stan would profit from better (log) CDF implementations in general. I'm useless at anything C++, so I would probably only mess everything up… Haha. But I guess the tfp implementation above would maybe be a good start, no?

I meant to ask you for suggestions on how to improve the manual, but rereading what I wrote, it's clear that I didn't convey that… :)

Thanks for the link. That's some terse doc! I looked at the code and it's a lot more complex than the doc indicates. But it's definitely something we can do if someone has interest. It's an important feature.

R can also handle logs of normal cdfs to much more precision than we can.

I thought someone was working on this already in Stan, but maybe I'm confusing it with something else.

Haha, okay, I misread that completely! Hm, I'm afraid I don't have any specific suggestions for the manual.