Segfault in simple exponential model


#1

Hello everyone,

I want to first thank the developers for an awesome software package that has helped me extensively in several projects.

This bug doesn’t really affect me, but I felt I should report it anyway. With a specific model/data combination and a specific random seed, I observe a segfault when invoking the optimizing function. The model is rather simple.

Steps to reproduce:

  1. Please find the model attached: exp_model.stan (558 Bytes)
  2. Please find the R script attached: breaks_stan.R (917 Bytes)
  3. Modify line 36 of the R file to point to where exp_model.stan lives on disk.
  4. Run the R file in its entirety; running everything is important to advance the PRNG to the point of failure. This involves sampling from the posterior, but it finishes in under half a second on my machine (see the workflow sketch just below this list).
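
For reference, a minimal sketch of the workflow the attached script follows (the data list below is a placeholder with made-up names N and y; the real data values, seed, and model path are the ones in breaks_stan.R and exp_model.stan):

library(rstan)

# Placeholder data list; the exact data values and RNG state in breaks_stan.R
# are what matter for triggering the crash.
stan_dat <- list(N = 20, y = rexp(20, rate = 2))

# Step 3: point this at wherever exp_model.stan lives on disk.
mod <- stan_model("exp_model.stan")

# Sampling first is what advances the PRNG to the failing state...
fit <- sampling(mod, data = stan_dat)

# ...and the subsequent call to optimizing() is where the segfault occurs.
opt <- optimizing(mod, data = stan_dat)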

Bug Description:
Posterior sampling from the model works fine. Afterwards, when trying to find the optimum, optimizing prints the initial log joint probability, then R freezes for about half a minute before this message appears (the shrug emoji is just my replacement for the standard prompt):

¯\_(ツ)_/¯ optimizing(mod, data = stan_dat)
Initial log joint probability = -8.18984

 *** caught segfault ***
address 0x256b38fff, cause 'memory not mapped'

Traceback:
 1: .External(list(name = "CppMethod__invoke_notvoid", address = <pointer: 0x7fdbf945fd70>,     dll = list(name = "Rcpp", path = "/Library/Frameworks/R.framework/Versions/3.4/Resources/library/Rcpp/libs/Rcpp.so",         dynamicLookup = TRUE, handle = <pointer: 0x7fdbf9708d90>,         info = <pointer: 0x7fdbf986aa20>), numParameters = -1L),     <pointer: 0x7fdbf97fb9b0>, <pointer: 0x7fdbf9739640>, .pointer,     ...)
 2: sampler$call_sampler(c(args, dotlist))
 3: .local(object, ...)
 4: optimizing(mod, data = stan_dat)
 5: optimizing(mod, data = stan_dat)

Possible actions:
1: abort (with core dump, if enabled)
2: normal R exit
3: exit R without saving workspace
4: exit R saving workspace

This only seems to occur for a specific PRNG state.

Facts about the platform this was discovered on:

OS: Mac OS High Sierra 10.13.2

R: R version 3.4.3 (2017-11-30) – “Kite-Eating Tree”
Copyright © 2017 The R Foundation for Statistical Computing
Platform: x86_64-apple-darwin15.6.0 (64-bit)

rstan (Version 2.17.3, GitRev: 2e1f913d3ca3)

R was executed directly from a bash terminal.


#2

I wasn’t able to reproduce this on Ubuntu 16.04. But if it’s RNG-dependent, it may be hard to trigger.

Maybe someone with a Mac can try it.

I get tons of “Error evaluating model log probability” messages, then the optimization fails and quits:

Error evaluating model log probability: Non-finite gradient.
Error evaluating model log probability: Non-finite gradient.
Error evaluating model log probability: Non-finite gradient.

Optimization terminated with error: 
  Line search failed to achieve a sufficient decrease, no more progress can be made
$par
     lambda_1      lambda_2            nu 
 7.322885e-02 8.411680e-309  3.661443e-02 

edit: Not saying the bug doesn’t exist – just adding that I’d tried it.


#3

Actually, maybe open an issue at https://github.com/stan-dev/rstan/issues? It might get lost in the shuffle on Discourse.


#4

On a Mac, I can reproduce the situation as described by @nwycoff. If I don’t run sampling first, then I get the output that @bbbales2 got.

R version 3.5.0 (2018-04-23)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS High Sierra 10.13.5

This seems similar to https://github.com/stan-dev/stan/issues/1928.

The crashing part seems like a bug, as I imagine Stan should just complain about the boundary issue instead of segfaulting. In fact, if you don’t run sampling first, Stan does just complain, as @bbbales2 saw. R’s optim issues a warning about NaNs.

The lack of robustness to this boundary issue matches optim exactly.

I’m at least convinced that this is a boundary issue because optimizing, even after following steps 1–4 above, produces results that agree with optim to within about 1e-4 for all of the following cases.

  1. Flat priors
  2. Normal priors – N(0, 1)
  3. Gamma priors with a boundary-avoiding shape – \Gamma(\alpha = 2, 1)

The prior in the reproducible example, \Gamma(0.2, 1), seems to put too much weight near 0.
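
To make the boundary behavior concrete, here is a quick sketch comparing the Gamma log densities near zero (shape 0.2 versus the boundary-avoiding shape 2); the evaluation points are arbitrary illustration values:

# Gamma(0.2, 1): log density grows without bound as x -> 0,
# pulling the optimizer into the boundary.
# Gamma(2, 1): log density falls toward -Inf, pushing it away.
x <- c(1e-1, 1e-3, 1e-6, 1e-9)
cbind(x,
      gamma_0.2 = dgamma(x, shape = 0.2, rate = 1, log = TRUE),
      gamma_2   = dgamma(x, shape = 2,   rate = 1, log = TRUE))

With shape 0.2 the log prior keeps increasing as the parameter heads to 0, so the MAP objective rewards driving the rate toward the boundary, which is consistent with the non-finite gradient errors seen above.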

Should I open a new issue or follow up on issue 1928 with this simpler reproducible example?


#5

Thanks for the replies! Indeed, one could argue that the optimization problem is poorly conditioned, and I’m not actually using this prior/likelihood combination for any analyses. Still, my understanding of segfaults is that they basically shouldn’t happen, no matter how hard the user asks for one :)

I’d be happy to open an issue on GitHub, let Edward open one, or just add a link to this thread to the existing issue.


#6

Yeah, if it reproduces on a Mac (thanks for checking, @roualdes!), then go ahead and make an issue. I’d say a new one, since this one is actually segfaulting.


#7

Gotcha, I’ve made an issue here.