Testing Stan lang compiler - how can we automate functional tests?


I’ve just submitted PR https://github.com/stan-dev/stan/pull/2357 for issue https://github.com/stan-dev/stan/issues/2280. This feature captures the line numbers in the Stan program for variable declarations; previously this was only done for statements.

As noted in the PR, the tests that I’ve added aren’t automated.

For the generated code, the test target make test/integration/compile-models checks that the models in src/test/test-models/good and its subdirectories compile, so I’ve added a directory src/test/test-models/good/runtime_errors for the models I used to test this feature. But I’d like to have automated tests that:

  1. check that the line numbers in the generated .hpp code do line up with the .stan source
  2. check that at runtime the error message is as expected.
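
A rough sketch of what check 1 could look like, in Python. It assumes the generated C++ records source positions as assignments of the form current_statement_begin__ = N; (an assumption about the generator output; adjust the pattern to whatever the .hpp actually contains). The helper name and the sample strings are made up for illustration.

```python
import re

# Assumed marker format in the generated .hpp; adjust if the generator differs.
MARKER = re.compile(r"current_statement_begin__\s*=\s*(\d+)\s*;")

def bad_line_markers(stan_src, hpp_src):
    """Return marker line numbers that do not point at a code line in the .stan source."""
    stan_lines = stan_src.splitlines()
    bad = []
    for match in MARKER.finditer(hpp_src):
        n = int(match.group(1))
        # Line numbers are 1-based and must fall inside the file ...
        in_range = 1 <= n <= len(stan_lines)
        # ... and land on a line that holds something other than whitespace.
        if not in_range or not stan_lines[n - 1].strip():
            bad.append(n)
    return bad

# Hypothetical inputs: a tiny Stan program and a fragment of generated code.
stan_src = "data {\n  int N;\n}\n\nparameters {\n  real y;\n}\n"
hpp_src = (
    "current_statement_begin__ = 2;\n"
    "current_statement_begin__ = 6;\n"
    "current_statement_begin__ = 4;\n"  # points at the blank line 4
)
print(bad_line_markers(stan_src, hpp_src))
```

This only catches markers that point outside the file or at blank lines. A stricter version would also compare the text on the referenced line against the declaration or statement being generated.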

There are other language features where it would be nice to be able to verify that the generated code actually does what we say it will do, especially with respect to manipulating data, e.g., the compound assignment operators.

To me this seems to be one level up from unit testing: the functionality is spread over the parser, generator, and stan/math library functions, not to mention the interfaces. Suggestions, anyone?


Yeah, coming up with something that works generally and automatically is tough.

It’d be nice if we could actually unit test the generator. In my mind, that would mean instantiating a part of the AST to pass to the particular generator function and verifying the output at that level. It should be really cheap to instantiate just a portion of the AST, and we’d only be checking that the generation is working. We’d also need a separate test to make sure the parser is constructing the right AST. Testing at this level wouldn’t require file access. The difficulty is that we’re not set up to do any of this and I don’t think it’s easy to instantiate a single piece of the AST. (I looked into it years ago and managed to do it, but determined it was easier to run the tests we have now; maybe someone else can figure out an easier way to do it.)
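
To make the idea concrete, here is a toy illustration in Python of testing a generator function against a hand-built AST node. Nothing here is the Stan compiler's real API: VarDecl and generate_var_decl are hypothetical stand-ins, and the emitted text is only an assumed shape for the generated code.

```python
import io
from dataclasses import dataclass

# Toy stand-in for an AST node; NOT the Stan compiler's real type.
@dataclass
class VarDecl:
    type_name: str
    name: str
    line: int  # line of the declaration in the .stan source

def generate_var_decl(decl, out):
    """Toy generator: record the source line, then emit the declaration."""
    out.write(f"current_statement_begin__ = {decl.line};\n")
    out.write(f"{decl.type_name} {decl.name};\n")

def test_var_decl_records_line():
    out = io.StringIO()  # in-memory stream, so no file access is needed
    generate_var_decl(VarDecl("double", "sigma", 7), out)
    assert out.getvalue() == "current_statement_begin__ = 7;\ndouble sigma;\n"

test_var_decl_records_line()
```

The point of the shape is that the test builds one node, calls one generator function, and compares one string, with no parsing and no files involved.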

The tests we have now are really not unit tests. They’re more like integration tests. And the unfortunate thing is that they’re manual. Maybe we can list the things we want to test and try to come up with something that will cover enough of those things?

It also might be easier to do the two things separately (verifying the code generation is as expected and verifying that the generated code runs as expected). Sorry I don’t have any great suggestions yet.
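
For the second half (the generated code runs as expected), one minimal sketch in Python is to pull the reported location out of the runtime error text and compare it with the line where the error was planted. It assumes messages end with a suffix like (in 'file' at line N); both that format and the sample message below are assumptions for illustration, not captured output.

```python
import re

# Assumed message suffix, e.g. "... (in 'example.stan' at line 5)".
LOCATION = re.compile(r"\(in '(?P<file>[^']+)' at line (?P<line>\d+)\)")

def error_location(message):
    """Return (file, line) from a runtime error message, or None if absent."""
    m = LOCATION.search(message)
    if m is None:
        return None
    return m.group("file"), int(m.group("line"))

# Made-up message, standing in for whatever the compiled model reports.
msg = "Exception: something went wrong  (in 'example.stan' at line 5)"
print(error_location(msg))
```

A test would then run the compiled model, capture its error output, and assert that error_location returns the expected file and line for each program in the runtime_errors directory.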


I just write some simple Stan programs and then translate them to C++ and make sure I see what I’m expecting to see in the C++ output. It’s very crude and doesn’t even check that the result compiles, much less that it does the right thing.


In that case, the PR could go through as is, since I did add test programs, compile them, check the output by eye, and run them as well.