Hi Bob,
Thanks for your interest and comments. I will improve the paper :)
I’m curious what the motivation was for designing another input format—the paper didn’t really say, at least up front.
I didn’t create the HistFactory input format - that was created around 2012 (originally in XML and later implemented as a JSON schema). The motivation then was to simplify sharing & preserving statistical models in high-energy physics, and to allow them to be changed (or ‘recast’, in the language used in high-energy physics).
In high-energy physics, we use a data analysis tool called ROOT. ROOT is a data analysis and plotting framework in object-oriented C++, with a binary data format that is similar in some ways to HDF5. ROOT can be used by compiling a program or through an interactive C++ interpreter (yes, you read that right; the original one was called CINT, the new one is called Cling). Explicit memory management by the user is often required, sometimes even just to plot a histogram. Have you ever come across it?
ROOT began development in the mid-90s. It’s powerful, but it’s fair to say that it can be quite awkward and hard to use, and it’s not very portable. Thus, HistFactory was created as a simple way to fully declare a model that didn’t depend on ROOT or on writing any C++ (even if the HistFactory models were read by ROOT). So it was portable and easily shareable, and should have a long lifetime.
Is there an existing stockpile of these models somewhere you want to support?
The format has been adopted by a few experimental collaborations, so yes, there is now a stockpile of these models out there. And I don’t think I’d ever persuade them to switch to Stan. And I don’t think they should - the nice thing about HistFactory is that it’s implementation-agnostic. But a converter from HistFactory to Stan is nice.
Is the plan to have some automated tool to do this or are you expecting people to edit JSON files?
These JSON files are already out there in the wild & I expect experimental collaborations to keep making them. If you are using stanhf, and your model follows the HistFactory specs, you have the choice to edit the original JSON and convert it again to Stan, or to edit directly in Stan. However, if you want to depart from the HistFactory spec, you’ll need to edit the Stan file.
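To make the "edit the original JSON" route concrete, here is a minimal sketch in Python using only the standard library. The field names (`channels`, `samples`, `modifiers`, `observations`, `measurements`) follow my understanding of the JSON layout for HistFactory models and should be treated as an assumption; the model itself, its numbers, and the helper `scale_sample` are made up for illustration. Converting the edited file to Stan is a separate step that is not shown.

```python
import json

# A toy two-bin model. Layout and names are illustrative assumptions,
# not taken from a real analysis.
workspace = {
    "channels": [
        {
            "name": "singlechannel",
            "samples": [
                {
                    "name": "signal",
                    "data": [5.0, 10.0],
                    "modifiers": [
                        {"name": "mu", "type": "normfactor", "data": None}
                    ],
                },
                {
                    "name": "background",
                    "data": [50.0, 60.0],
                    "modifiers": [
                        {
                            "name": "uncorr_bkguncrt",
                            "type": "shapesys",
                            "data": [5.0, 12.0],
                        }
                    ],
                },
            ],
        }
    ],
    "observations": [{"name": "singlechannel", "data": [50.0, 60.0]}],
    "measurements": [
        {"name": "Measurement", "config": {"poi": "mu", "parameters": []}}
    ],
    "version": "1.0.0",
}


def scale_sample(ws, channel, sample, factor):
    """Return a copy of the model with one sample's yields scaled by `factor`."""
    edited = json.loads(json.dumps(ws))  # deep copy via a JSON round trip
    for ch in edited["channels"]:
        if ch["name"] != channel:
            continue
        for s in ch["samples"]:
            if s["name"] == sample:
                s["data"] = [factor * x for x in s["data"]]
    return edited


# Raise the background expectation by 10% and serialise the result.
edited = scale_sample(workspace, "singlechannel", "background", 1.1)
edited_text = json.dumps(edited, indent=2)
```

The edit stays inside the declarative format, so the result is still a model that any implementation of the spec could read; anything the format cannot express would have to be done in the Stan file instead.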
The last part of the Stan User’s Guide shows how to do things like use a bootstrap for frequentist confidence interval estimation (though I hear it can be very unstable).
Thanks! I will make it clearer that there are ways of doing these things already. In stanhf, this can be done without writing more Stan code or doing simulations by using asymptotic assumptions (e.g., Wilks’ theorem and similar theorems). This functionality is there so that users can keep doing things the standard way they are done in the field.
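As a rough illustration of what an asymptotic calculation of this kind looks like, here is a minimal sketch for a single-bin Poisson counting experiment. It is not how stanhf implements it; the function names and the numbers are hypothetical. The expected count is `mu * s + b`, the test statistic is the likelihood ratio `q = -2 * (log L(mu) - log L(mu_hat))`, and Wilks’ theorem is used to treat `q` as chi-squared distributed with one degree of freedom, so no simulations are needed.

```python
import math


def log_likelihood(mu, n, s, b):
    """Poisson log-likelihood for n observed events, dropping the constant log(n!)."""
    lam = mu * s + b
    return n * math.log(lam) - lam


def wilks_p_value(mu, n, s, b):
    """Likelihood-ratio statistic and asymptotic p-value for signal strength mu.

    Assumes n > 0 and that mu is not restricted to be non-negative,
    so the maximum-likelihood estimate is simply (n - b) / s.
    """
    mu_hat = (n - b) / s
    q = -2.0 * (log_likelihood(mu, n, s, b) - log_likelihood(mu_hat, n, s, b))
    q = max(q, 0.0)  # guard against tiny negative values from rounding
    # Survival function of a chi-squared distribution with 1 degree of freedom.
    p = math.erfc(math.sqrt(q / 2.0))
    return q, p


# Hypothetical numbers: 20 events observed, 10 expected from background,
# 5 expected from signal at mu = 1. Test the background-only hypothesis mu = 0.
q0, p0 = wilks_p_value(0.0, n=20, s=5.0, b=10.0)
```

With these numbers `q0` is about 7.7 and `p0` is roughly 0.005. The asymptotic approximation is only expected to hold when the counts are large enough.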
The constraint terms go into the Jacobian accumulator, which can be turned off for doing maximum likelihood. With the latest release, we made the Jacobian accumulator available to users.
Thanks. Let me think about this & check whether I need to edit my text.