In short, I’m wondering whether it is possible to get machine-readable structured error output (e.g., JSON, XML, etc.) from the current stanc or a future version.
I’m currently trying to update the emacs stan-mode to the 2019 standards. One of the new features is on-the-fly syntax checking using flycheck-stan. The example picture shows code with a single syntax error in cauchy_lpdf, which the on-the-fly syntax checker underlined.
But I’m having difficulty parsing the stanc error output for this. When there is only one error or info message, it is fine. But when there are multiple info messages, such as deprecated syntax, together with an error, parsing them separately becomes a little trickier.
- Info messages do not have associated line numbers, making it difficult to indicate them live
- An error typically starts with " error" with an associated line and column (sometimes just the line). But in this case, the explanation “Probability function…” comes before “error…”, making it difficult to capture its beginning unless all possible messages are encoded in the parser.
- Machine-readable format (e.g., --json option gives the errors in json)
- Data elements include: line, column, severity (error, warning, info, etc), one-liner error/info message, long error/info body for full description
- If a machine-readable format is too far-fetched, a consistent beginning (Error: in addition to Info:), consistent line numbers, and a one-line summary before the multi-line message body would make parsing much easier.
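To make the wish list above concrete, here is a minimal sketch of what one such record could look like when serialized to JSON. The field names and the `Diagnostic` class are purely illustrative assumptions, not an existing stanc format, and the message texts are placeholders.

```python
import json
from dataclasses import dataclass, asdict
from typing import List, Optional


@dataclass
class Diagnostic:
    """One compiler message with the fields proposed above (hypothetical schema)."""
    severity: str                 # "error", "warning", or "info"
    summary: str                  # one-line message
    body: str                     # full multi-line description
    line: Optional[int] = None    # None when no line is reported
    column: Optional[int] = None  # None when only the line is known


def to_json(diagnostics: List[Diagnostic]) -> str:
    """Serialize diagnostics as a JSON array, one object per message."""
    return json.dumps([asdict(d) for d in diagnostics], indent=2)


# Placeholder messages, not real stanc output.
example = [
    Diagnostic("info", "Placeholder info summary.", "Longer explanation goes here."),
    Diagnostic("error", "Placeholder error summary.", "Longer explanation goes here.",
               line=12, column=4),
]
print(to_json(example))
```

With a structure like this, an editor plugin would only need a JSON parser instead of a list of message patterns.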
I had the same problem when I had a go at adding flycheck support and an Atom lintr.
@jrnold, thanks for your comment. Is the logic for the Atom lintr available somewhere?
stan/src/stan/lang/grammars/semantic_actions_def.cpp seems to be where these messages are generated. If all messages with pass = false had the same prefix, such as “Error:”, parsing would be much easier.
Oh man, I didn’t see this thread! @enetsee and @Matthijs, @kaz-yos here is looking for structured output for the purposes of presentation compilation :)
We were thinking about eventually building something like language server protocol into stanc3: https://github.com/stan-dev/stanc3/issues/65
We have a super vague sketch of an idea for how that could happen using Menhir but haven’t had the time to work on it yet, and didn’t realize there were folks in the community who could make use of it already.
I am documenting here what I wrote elsewhere. For now, I have solved the issue as follows:
flycheck-stan picks up all “Info:…” patterns and error patterns listed here.
These were extracted from semantic_actions_def.cpp by some basic pattern matching.
flycheck is smart enough to warn you even if the patterns are missed because it picks up the non-zero exit code from stanc.
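The exit-code safety net described above can be sketched roughly as follows. This is not flycheck's actual implementation, only an illustration of the idea; the function name and the dictionary keys are hypothetical.

```python
from typing import Dict, List


def with_exit_code_fallback(parsed: List[Dict], exit_code: int, raw_output: str) -> List[Dict]:
    """Return the parsed diagnostics, adding one generic error when the
    checker exited with a non-zero status but no known error pattern matched."""
    has_error = any(d.get("severity") == "error" for d in parsed)
    if exit_code != 0 and not has_error:
        generic = {
            "severity": "error",
            "line": None,  # location unknown because nothing could be parsed
            "message": raw_output.strip() or "checker exited with a non-zero status",
        }
        return parsed + [generic]
    return parsed
```

The point of the design is that an unrecognized message degrades to a location-less error instead of disappearing silently.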
I haven’t implemented support for stanc3, but the corresponding OCaml code may be easier for message extraction.
Thanks for the information on the potential future directions!
Of course! Would you be interested at all in working on a Stan Language Server that can integrate with something like https://github.com/emacs-lsp/lsp-mode ? We hear it is the new hotness w.r.t. IDE language support - should allow for error checking, completions, and definitions at least. One time we analyzed what we thought a minimal supporting API would look like and wrote about it here: https://github.com/stan-dev/stanc3/issues/94
We’d be happy to help you get started with OCaml - it’s a lot like a LISP, actually!
I am revisiting this now after a couple of weeks of interruption due to grant deadlines. In the latest develop branch, stanc gives more structured output. Is this already stanc3? Here I can just capture “Warning:” and “Syntax error” to split up the output into individual messages, which is easier.
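The splitting approach described above might look something like this minimal sketch. It assumes each message opens at the start of a line with one of the two strings mentioned; other openings may exist, and the sample text is a placeholder rather than real compiler output.

```python
import re
from typing import List

# Openings named in the post above; assumed to appear at the start of a line.
OPENING = re.compile(r"^(Warning:|Syntax error)", re.MULTILINE)


def split_messages(output: str) -> List[str]:
    """Split checker output into individual messages at each known opening.
    Any text before the first opening is dropped."""
    starts = [m.start() for m in OPENING.finditer(output)]
    ends = starts[1:] + [len(output)]
    return [output[s:e].strip() for s, e in zip(starts, ends)]


# Placeholder text, not real stanc3 output.
sample = (
    "Warning: first placeholder message\n"
    "  continued detail\n"
    "Warning: second placeholder message\n"
    "Syntax error in placeholder location\n"
    "  detail line\n"
)
for message in split_messages(sample):
    print(repr(message))
```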
Yes, develop cmdstan has stanc3 as of end of October.
Thanks for your information!
Hi @seantalts and @rok_cesnovar,
I am currently examining the error patterns in stanc3.
e.g., https://github.com/kaz-yos/stan-mode/blob/develop/flycheck-stan/examples/example_info_composite_with_error.stanc3out.txt
They are very beautiful in that each individual message has a file name, line, column, and code extract associated with it!
As far as I can see from my example bad Stan files and the expected files in stanc3, there are three types of openings for individual messages.
- Warning:
- Syntax error in
- Semantic error in
Are these exhaustive? Also, is it correct that the first, “Warning”, is non-fatal and the latter two “errors” are fatal (not executable)?
Also, are these the correct locations in the stanc3 code base where these message templates are defined?
Full disclaimer: I don't have exhaustive knowledge of the compiler; hopefully one of the people running the compiler development (@seantalts, @Matthijs, @enetsee, et al.) will be able to confirm or correct me and add to what I say.
I think you are more or less correct. There are three types of syntax errors (parsing error, lexing error, and include error). I think the link you posted for syntax errors points to a different location. Then, as you note, we have warnings and semantic errors. There is also a fatal error, which is something that should not happen and is probably a compiler bug if it does.
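Based on the exchange above, a checker could map each message opening to a category and a fatal flag along these lines. This is only a sketch: the category labels are made up for illustration, and since the exact opening of the "fatal error" case is not given in this thread, anything unrecognized is conservatively treated as fatal.

```python
from typing import NamedTuple


class Kind(NamedTuple):
    category: str
    fatal: bool


# Openings taken from the discussion above; matched by prefix.
KNOWN_OPENINGS = {
    "Warning": Kind("warning", False),
    "Syntax error": Kind("syntax-error", True),
    "Semantic error": Kind("semantic-error", True),
}


def classify(message: str) -> Kind:
    """Return the category of a message based on how it opens."""
    for opening, kind in KNOWN_OPENINGS.items():
        if message.startswith(opening):
            return kind
    # Unknown openings (including internal compiler errors) are treated as
    # fatal so that nothing is silently ignored.
    return Kind("unknown", True)
```

For example, `classify("Warning: ...")` would be reported as non-fatal, while a message with an unexpected opening would still surface as an error.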
Thanks! I’ll work on flycheck-stan based on these.
The warnings can now be correctly highlighted (underlined) in the development version of flycheck-stan!
Fantastic stuff @kaz-yos!