Strange behavior with lognormal() sampling statement

Hello Stan cognoscenti. This one’s a real head-scratcher.

I recently started getting pathological behavior (divergences, stuck chains, etc.) from some code that I had run without incident many, many times in the past (but not since the release of rstan 2.19). After weeks of trial-and-error sleuthing, I finally boiled the problem down to the following reprex:

parameters {
  real<lower=0> theta;
}

model {
  theta ~ lognormal(2,2);
}
library(here)
test <- rstan::stan(here("test.stan"), chains = 3, cores = 3)
Warning messages:
1: There were 2969 divergent transitions after warmup. 
Increasing adapt_delta above 0.8 may help. See
http://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup 
2: There were 31 transitions after warmup that exceeded the maximum treedepth. 
Increase max_treedepth above 10. See
http://mc-stan.org/misc/warnings.html#maximum-treedepth-exceeded 
3: Examine the pairs() plot to diagnose sampling problems
 
4: The largest R-hat is NA, indicating chains have not mixed.
Running the chains for more iterations may help. See
http://mc-stan.org/misc/warnings.html#r-hat 
5: Bulk Effective Samples Size (ESS) is too low, indicating posterior 
means and medians may be unreliable.
Running the chains for more iterations may help. See
http://mc-stan.org/misc/warnings.html#bulk-ess 
6: Tail Effective Samples Size (ESS) is too low, indicating posterior 
variances and tail quantiles may be unreliable.
Running the chains for more iterations may help. See
http://mc-stan.org/misc/warnings.html#tail-ess 
Inference for Stan model: test.
3 chains, each with iter=2000; warmup=1000; thin=1; 
post-warmup draws per chain=1000, total post-warmup draws=3000.

              mean se_mean  sd 2.5%          50%         97.5% n_eff Rhat
theta 3.95748e+305     NaN Inf    0 1.604126e+23 5.253103e+296   NaN  NaN
lp__   0.00000e+00     NaN   0    0 0.000000e+00  0.000000e+00   NaN  NaN

Samples were drawn using NUTS(diag_e) at Mon May 18 14:41:20 2020.
For each parameter, n_eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor on split chains (at 
convergence, Rhat=1).

It’s as if the program is totally ignoring the sampling statement and just drawing theta from its declared nonnegative support. If I change the model block to

target += lognormal_lpdf(theta | 2, 2);

then the only difference is that lp__ is a nonzero but invariant value, still evidently unrelated to the value of theta.

This behavior depends on the arguments to the log-probability function: lognormal(2,3) is no bueno, but lognormal(1,1) is totally fine and works just as you’d expect. Also, if I pass reals instead of integers (e.g., lognormal(2.0,2.0)) then the problem disappears. Otherwise, I can’t discern any pattern regarding which mu and sigma values are “bad”. I haven’t (yet) reproduced this with distributions other than the lognormal, or without the lower bound on theta.

Since I discovered the integer-to-real “fix” and tweaked my old code accordingly, it appears to run as expected, but that’s a pretty unsatisfying “solution” that doesn’t inspire confidence or peace of mind.

When this all started, I was using R 3.6.3 and RTools 3.5, but I’ve since upgraded to R 4.0.0 and RTools 4.0 and it’s still happening. I tried to roll back to an earlier version of rstan, but I wasn’t able to get the installation to work. I did, however, test it with CmdStan 2.23.0 via cmdstanr, and the behavior was the same.

I have a sneaking suspicion that no one else is going to be able to reproduce this (though if I’m wrong, I would love to hear about it), and that something is terribly borked with my machine and/or installation of R / rstan / something else that multiple uninstalls, reinstalls, upgrades, reboots, etc. have not managed to resolve. I guess my question is, does anyone have any conjectures as to what the heck the problem(s) might be?

Session info:

R version 4.0.0 (2020-04-24)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252   
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C                          
[5] LC_TIME=English_United States.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] here_0.1

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.4.6       compiler_4.0.0     pillar_1.4.4       prettyunits_1.1.1 
 [5] tools_4.0.0        pkgbuild_1.0.8     lifecycle_0.2.0    tibble_3.0.1      
 [9] gtable_0.3.0       pkgconfig_2.0.3    rlang_0.4.6        cli_2.0.2         
[13] rstudioapi_0.11    parallel_4.0.0     loo_2.2.0          gridExtra_2.3     
[17] withr_2.2.0        dplyr_0.8.5        vctrs_0.3.0        stats4_4.0.0      
[21] rprojroot_1.3-2    grid_4.0.0         tidyselect_1.1.0   glue_1.4.1        
[25] inline_0.3.15      R6_2.4.1           processx_3.4.2     fansi_0.4.1       
[29] rstan_2.19.3       sessioninfo_1.1.1  ggplot2_3.3.0      callr_3.4.3       
[33] purrr_0.3.4        magrittr_1.5       codetools_0.2-16   matrixStats_0.56.0
[37] backports_1.1.7    scales_1.1.1       ps_1.3.3           StanHeaders_2.19.2
[41] ellipsis_0.3.1     assertthat_0.2.1   colorspace_1.4-1   munsell_0.5.0     
[45] crayon_1.3.4      

I just want to confirm that I can replicate the problem on Catalina with R 4.0.0 and rstan 2.19.3. Useless sampling with lognormal(2,2) but it works fine with lognormal(2.0, 2.0). I also find the problem with cmdstanr and cmdstan 2.22.1.

Hooray! I mean, bizarre and troubling. But at least now I don’t feel like the gaslights are flickering.


Yeah, I wish I had better news but it looks like you found a bug. It’s probably useful for the developers to know that it replicates across platforms.

In my opinion, you don’t need to do any statistical inference here; you could simply draw from the prior in the generated quantities block.

It’s just simulating from a prior. The point isn’t that this toy model is useful for any actual inference problem, but that it’s a minimal example demonstrating bizarre and pathological behavior that is reproducible across multiple Stan interfaces in (at least) two completely different OSes. As @stijn said, sure looks like a bug to me.

I don’t think this is pathological behaviour. I had something similar when sampling from a negative binomial, but after some thought I concluded that it makes sense that this isn’t possible in Stan’s current paradigm. For example, how would you interpret Rhat for this kind of problem? That’s the simplest objection, but you can probably come up with others.

@stijn just confirmed they are able to replicate the problem.

Would you mind checking whether it’s a particular argument that’s causing the issue, or whether it only occurs when both are ints?

i.e., can you check whether the same issues occur across:

theta ~ lognormal(2,2.0);
theta ~ lognormal(2.0,2);

It’ll be easier to debug on the back-end if I know which combinations of arguments cause the issue.


A Stan model defines a log-probability as a function of parameter(s). It doesn’t matter in principle how that log-probability is constructed – in particular, say, whether it includes likelihood statements as well as priors. In practice, one might write a priors-only model to do prior predictive checking.

The crux of the issue here is that the compiled program does not do what the model specifies. For certain arguments, the sampling statement is just ignored. Also, as I laid out, this pathology manifests itself in much more complex models in the wild.

Thanks, good question. I looked into this before, but just confirmed that

theta ~ lognormal(2,2.0);

is fine and

theta ~ lognormal(2.0,2);

is bad.


Good to know (I always write numbers with decimals, so I would never have imagined such a problem).

Well, we haven’t established that it’s only the int / real typing that’s causing the problem (although it sounds like @andrjohns has a hunch). Again, lognormal(1,1) is fine.

I can reproduce the issue with just the math library:

test/unit/math/prim/prob/lognormal_log_test.cpp:25: Failure
Expected equality of these values:
  (stan::math::lognormal_lpdf(0.8, 2.0, 2.0))
    Which is: -2.0067382
  (stan::math::lognormal_log(0.8, 2.0, 2))
    Which is: -1.3889421

The problem is occurring because when we take the inverse of the sigma argument, we’re accidentally invoking an integer division when sigma is an int:

inv_sigma[n] = 1 / value_of(sigma_vec[n]);

I’ll put in a PR in the math library and post an update here when it’s in. Thanks for finding this!


Aha, that makes so much more sense than I had imagined! Would this also affect normal_lpdf(), and maybe others, or is it isolated to lognormal_lpdf()?

Thanks for tracking this down so quickly!

No worries! It appears to be isolated to the lognormal distribution; the other lpdfs specify the division with 1.0 / , so no integer division was invoked.


The fix has been merged and will be in the next Stan release, thanks again for finding this one!


Thank you, @andrjohns! I’m relieved this turned out to have such a rational (LOL) explanation, and I look forward to reverting all my lognormal(real, real) statements to lognormal(int, int) after the next Stan release.
