In short, I’m wondering whether it is possible to get machine-readable, structured error output (e.g., JSON, XML, etc.) from the current stanc or from a future version (stanc3).
Long version:
I’m currently trying to update the emacs stan-mode to the 2019 standards. One of the new features is on-the-fly syntax checking using flycheck (flycheck-stan). The example picture shows code with a single syntax error in cauchy_lpdf, which the on-the-fly syntax checker underlined.
But I’m having difficulty parsing the stanc error output for this. When there is only one error or info message, it is fine. But when there are multiple info messages, such as deprecated-syntax notices, plus an error, parsing them separately becomes a little trickier.
Info messages do not have associated line numbers, making it difficult to indicate them live.
An error typically starts with " error" with an associated line and column (sometimes just the line). But in this case, the explanation “Probability function…” comes before “error…”, making it difficult to capture where the message begins unless all possible messages are encoded in the parser.
Ideal output
Machine-readable format (e.g., --json option gives the errors in json)
Data elements include: line, column, severity (error, warning, info, etc), one-liner error/info message, long error/info body for full description
If a machine-readable format is too far-fetched, a consistent beginning (Error: in addition to Info:), consistent line numbers, and a one-line summary before the multi-line message body would make parsing much easier.
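To make the wish list above concrete, here is a minimal sketch of what one such record could look like, written in Python only for illustration. The field names, the `Diagnostic` class, and the `to_json` helper are all hypothetical; nothing here is an existing stanc or stanc3 option or output format, and the message texts are placeholders.

```python
import json
from dataclasses import asdict, dataclass
from typing import List, Optional


@dataclass
class Diagnostic:
    """One message from the compiler (hypothetical schema, not stanc's)."""
    line: Optional[int]    # None when no line number is available
    column: Optional[int]  # None when only the line is known
    severity: str          # e.g. "error", "warning", "info"
    summary: str           # one-line message
    body: str              # full multi-line description


def to_json(diagnostics: List[Diagnostic]) -> str:
    """Serialize a list of diagnostics as a JSON array."""
    return json.dumps([asdict(d) for d in diagnostics], indent=2)


# Placeholder messages; these are NOT real stanc output.
example = [
    Diagnostic(None, None, "info", "placeholder info summary", "placeholder body"),
    Diagnostic(12, 4, "error", "placeholder error summary", "placeholder body"),
]

if __name__ == "__main__":
    print(to_json(example))
```

With a format along these lines, an editor plugin could read `line`, `column`, and `severity` directly instead of pattern-matching free text.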
Oh man, I didn’t see this thread! @enetsee and @Matthijs, @kaz-yos here is looking for structured output for the purposes of presentation compilation :)
We have a super vague sketch of an idea for how that could happen using Menhir but haven’t had the time to work on it yet, and didn’t realize there were folks in the community who could make use of it already.
Of course! Would you be interested at all in working on a Stan Language Server that can integrate with something like https://github.com/emacs-lsp/lsp-mode ? We hear it is the new hotness w.r.t. IDE language support - should allow for error checking, completions, and definitions at least. One time we analyzed what we thought a minimal supporting API would look like and wrote about it here: https://github.com/stan-dev/stanc3/issues/94
We’d be happy to help you get started with OCaml - it’s a lot like a LISP, actually!
I am revisiting this now after a couple of weeks of interruption due to grant deadlines. In the latest develop branch of cmdstan (commit 991360f), stanc gives more structured output. Is this already stanc3? Here I can just capture “Warning:” and “Syntax error” to split the output into individual messages, which is easier.
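As a rough sketch of the splitting approach described above, the following Python function cuts the output at every line that begins with “Warning:” or “Syntax error”. Only those two markers come from the post; the function name `split_messages`, the assumption that markers always start a line, and the sample text are made up for illustration and are not real stanc output.

```python
import re
from typing import List

# Markers taken from the post; that they always appear at the start of a
# line is an assumption about the output format.
MARKER = re.compile(r"^(?:Warning:|Syntax error)", re.MULTILINE)


def split_messages(output: str) -> List[str]:
    """Split compiler output into one string per message."""
    starts = [m.start() for m in MARKER.finditer(output)]
    if not starts:
        stripped = output.strip()
        return [stripped] if stripped else []
    messages = []
    # Keep any text before the first marker as its own chunk.
    preamble = output[:starts[0]].strip()
    if preamble:
        messages.append(preamble)
    # Each message runs from its marker to the next marker (or the end).
    for begin, end in zip(starts, starts[1:] + [len(output)]):
        messages.append(output[begin:end].strip())
    return messages


# Made-up sample text, NOT real stanc output.
sample = (
    "Warning: first made-up warning\n"
    "  continuation line\n"
    "Warning: second made-up warning\n"
    "Syntax error in made-up location\n"
    "  details\n"
)

if __name__ == "__main__":
    for message in split_messages(sample):
        print(repr(message))
```

Indented continuation lines stay attached to the message they follow, because only lines that begin with a marker start a new chunk.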
Full disclaimer: I don’t have exhaustive knowledge of the compiler; hopefully one of the people running the compiler development (@seantalts, @Matthijs, @enetsee, et al.) will be able to confirm or correct me and add to my words.
I think you are more or less correct. There are three types of syntax errors (parsing error, lexing error, and include error). I think the link you posted for syntax errors is pointing to a different location. Then, as you note, we have warnings and semantic errors. There is also a fatal error, which is something that should not happen and is probably a compiler bug if it does.
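For a parser on the editor side, the categories listed above could be modelled roughly as follows. This is only a sketch in Python: the names `Kind`, `SYNTAX_KINDS`, and `severity` are hypothetical and do not correspond to identifiers in the stanc3 source, and mapping every non-warning category to the severity "error" is an assumption.

```python
from enum import Enum


class Kind(Enum):
    """Message categories as described in the post (names are hypothetical)."""
    PARSING_ERROR = "parsing error"
    LEXING_ERROR = "lexing error"
    INCLUDE_ERROR = "include error"
    SEMANTIC_ERROR = "semantic error"
    WARNING = "warning"
    FATAL_ERROR = "fatal error"


# The three categories the post groups together as syntax errors.
SYNTAX_KINDS = frozenset(
    {Kind.PARSING_ERROR, Kind.LEXING_ERROR, Kind.INCLUDE_ERROR}
)


def severity(kind: Kind) -> str:
    """Map a category to a display severity (assumed mapping)."""
    return "warning" if kind is Kind.WARNING else "error"


if __name__ == "__main__":
    for kind in Kind:
        print(kind.value, "->", severity(kind), "| syntax:", kind in SYNTAX_KINDS)
```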