Sampling error kills Jupyter kernel without error messages


#1

I’ve been using RStan in Jupyter notebooks with IRKernel, and when everything works well, it works well.

However, when there is a sampling error (as opposed to a warning) and I am using parallelized sampling (mc.cores > 1), the kernel dies and all error messages are swallowed: they appear neither in my notebook nor in the console where I am running jupyter notebook.

If I run the model from a normal R session (e.g. by converting the notebook to a script and running it with Rscript), I see the errors. Below is an example of the kind of error that causes this behavior:

Rejecting initial value:
  Error evaluating the log probability at the initial value.
validate transformed params: nTheta[1] is 1.80545, but must be less than or equal to 1

Initialization between (-2, 2) failed after 100 attempts.
 Try specifying initial values, reducing ranges of constrained values, or reparameterizing the model.

A few more notes:

  • This is on Linux 64-bit.
  • I do not see any progress output while a successful simulation is running. If it runs successfully, warnings appear in the notebook, but progress output does not appear anywhere.
  • On Windows, I see progress output (and possibly error messages - I do not remember for sure) in the console where I run jupyter notebook.
  • The Stan error itself isn’t my problem - I know how to fix that. The problem is that when I have an error like this in my Stan code, the only way I find out is a dead kernel, and then I have to play games with script exports to see the real error message(s).
  • When I run in single-process mode, I see the progress output in the notebook after the sampling has completed.

Is there any good workaround to get progress output and error messages to show up somewhere when using the Jupyter notebook? This issue makes the Stan debugging process significantly more tedious.

Operating System: Linux
Interface Version: RStan 2.16.2


#2

Thanks for posting here.

Do you have an example notebook? Anything you can provide that simplifies reproducing your environment would help. Which version of Jupyter are you running?

Finally, I’ll loop in @bgoodri. Hopefully he has some insight.


#3

Nope, but I could try a reproducible example.


#4

The ModularTumors notebook in my rat tumors rebuild is sufficient to exhibit the ‘eats all progress output’ behavior; if parallelism is turned off, then the progress shows up after the sampling run is finished.

The environment-lx64.yml file defines a Conda environment with the necessary dependencies on 64-bit Linux.

I have tried modifying ratxmodel.stan to trigger kernel death, but haven’t yet found the right corruption.
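
For anyone who wants to compare the two modes without the full notebook, here is a minimal sketch. It assumes a hypothetical toy model (toy_code, toy_data, and toy_model are illustrative names, not part of the rat tumors rebuild), and it only shows the serial versus parallel settings. It does not reproduce the kernel death.

```r
library(rstan)

# Hypothetical toy model, not ratxmodel.stan: estimate a normal mean.
toy_code <- "
data {
  int<lower=1> N;
  vector[N] y;
}
parameters {
  real mu;
}
model {
  y ~ normal(mu, 1);
}
"
toy_data <- list(N = 20, y = rnorm(20))
toy_model <- stan_model(model_code = toy_code)

# Single-process mode: chains run one after another in the kernel's own
# R process. Here the progress output shows up in the notebook once
# sampling has finished.
fit_serial <- sampling(toy_model, data = toy_data,
                       chains = 4, iter = 500, cores = 1)

# Parallel mode: chains run in separate worker processes. This is the
# setting in which the progress output never reaches the notebook.
fit_parallel <- sampling(toy_model, data = toy_data,
                         chains = 4, iter = 500, cores = 4)
```

Setting options(mc.cores = 1) before the call should have the same effect as passing cores = 1, since sampling() takes its default for cores from that option.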