I’m trying to write a stable probit-based model, and I’ve noticed when using optimizing for rough exploration that a lot of starting points are rejected. The manual notes that normal_lcdf underflows to -Inf when its argument is less than -37.5. That is the same point at which R’s pnorm underflows to 0, yet pnorm with log.p=TRUE gives an answer (presumably accurate) until the argument is below about -1 \times 10^{154}. Phi_approx is suggested as a workaround, but in fact it appears to underflow even sooner, and it is unacceptably inaccurate on the log scale:
data {
  int N;
  real x[N];
}
parameters {
}
model {
}
generated quantities {
  real logphi_exact[N];
  real logphi_approx[N];
  for (i in 1:N) {
    logphi_exact[i] = normal_lcdf(x[i] | 0.0, 1.0);
    logphi_approx[i] = log(Phi_approx(x[i]));
  }
}
sampling(phi_test, data=list(N=5, x=c(-1,-10,-20,-30,-40)), iter=1, chains=1, algorithm="Fixed_param")
SAMPLING FOR MODEL 'stan-35456da26e9' NOW (CHAIN 1).
Iteration: 1 / 1 [100%] (Sampling)
Elapsed Time: 0 seconds (Warm-up)
0.000139 seconds (Sampling)
0.000139 seconds (Total)
Inference for Stan model: stan-35456da26e9.
1 chains, each with iter=1; warmup=0; thin=1;
post-warmup draws per chain=1, total post-warmup draws=1.
mean se_mean sd 2.5% 25% 50% 75% 97.5% n_eff Rhat
logphi_exact[1] -1.84 NA NA -1.84 -1.84 -1.84 -1.84 -1.84 1 NaN
logphi_exact[2] -53.23 NA NA -53.23 -53.23 -53.23 -53.23 -53.23 1 NaN
logphi_exact[3] -203.92 NA NA -203.92 -203.92 -203.92 -203.92 -203.92 1 NaN
logphi_exact[4] -454.32 NA NA -454.32 -454.32 -454.32 -454.32 -454.32 1 NaN
logphi_exact[5] -Inf NA NA -Inf -Inf -Inf -Inf -Inf 1 NaN
logphi_approx[1] -1.84 NA NA -1.84 -1.84 -1.84 -1.84 -1.84 1 NaN
logphi_approx[2] -86.54 NA NA -86.54 -86.54 -86.54 -86.54 -86.54 1 NaN
logphi_approx[3] -596.43 NA NA -596.43 -596.43 -596.43 -596.43 -596.43 1 NaN
logphi_approx[4] -Inf NA NA -Inf -Inf -Inf -Inf -Inf 1 NaN
logphi_approx[5] -Inf NA NA -Inf -Inf -Inf -Inf -Inf 1 NaN
lp__ 0.00 NA NA 0.00 0.00 0.00 0.00 0.00 1 NaN
Here’s the output from R’s pnorm for comparison:
pnorm(c(-1,-10,-20,-30,-40), log.p=TRUE)
[1] -1.841022 -53.231285 -203.917155 -454.321244 -804.608442
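For reference, those pnorm(log.p=TRUE) tail values can be reproduced in ordinary double precision with the standard Mills-ratio asymptotic expansion of \log\Phi(x) for large negative x, so an accurate implementation clearly doesn't require extended precision. A minimal Python sketch of the expansion (not Stan code, and only valid deep in the lower tail):

```python
import math

def log_phi_tail(x):
    """Asymptotic expansion of log(Phi(x)) for x << 0, via the Mills ratio:
    Phi(x) ~ phi(x)/(-x) * (1 - 1/x^2 + 3/x^4 - 15/x^6 + ...)."""
    assert x < -10.0, "expansion only valid deep in the lower tail"
    x2 = x * x
    # First few terms of the correction series; enough for ~6 digits here.
    series = 1.0 - 1.0 / x2 + 3.0 / x2**2 - 15.0 / x2**3
    return (-0.5 * x2 - 0.5 * math.log(2.0 * math.pi)
            - math.log(-x) + math.log(series))

print(log_phi_tail(-40.0))  # agrees with R's -804.608442 to ~6 digits
```

The dominant term is just -x^2/2, which only overflows around |x| \approx 10^{154}, matching where pnorm(log.p=TRUE) finally gives up.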
I’m just wondering whether there’s any way to improve this in Stan? Perhaps it doesn’t really matter for sampling, but I can’t help thinking it would introduce bias somewhere, because if even one observation gives -Inf, that propagates to the entire log-posterior evaluation. For a single outlier the true contribution to the log posterior might be only \log\Phi(-38) \approx -727, in which case the true log posterior could be very far from -Inf.
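As an aside, the -Inf from log(Phi_approx(x)) itself seems avoidable: since Phi_approx(x) = inv_logit(0.07056 x^3 + 1.5976 x), taking the log of the inverse logit directly via log1p never underflows, though the logistic approximation remains badly inaccurate in the tail (e.g. -86.54 vs. the true -53.23 at x = -10 above). A rough Python sketch of the idea, using Stan's published polynomial constants:

```python
import math

def log_phi_approx(x):
    # Stan's Phi_approx(x) = inv_logit(t) with t = 0.07056*x^3 + 1.5976*x.
    # Compute log(inv_logit(t)) without ever forming inv_logit(t) itself,
    # so the result stays finite even when inv_logit(t) underflows to 0.
    t = 0.07056 * x**3 + 1.5976 * x
    if t < 0.0:
        return t - math.log1p(math.exp(t))   # stable branch for t << 0
    return -math.log1p(math.exp(-t))

print(log_phi_approx(-40.0))  # finite (about -4580), where log(Phi_approx) gives -Inf
```

This is the same log_inv_logit trick Stan exposes for logit models; it fixes the -Inf but not the accuracy, so a proper tail expansion inside normal_lcdf would still be the real solution.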