Shot in the dark, but a reviewer asked for a diagram of my model, and while I could do one by hand I thought I’d check if anyone knows any tools that attempt to automate some portion of that task. Obviously a hard problem, but figured it might be common enough that someone might have tried to tackle it.
Greta does this. It’s about midway down here under Plotting: https://greta-stats.org/articles/get_started.html
Ah, my model was done in Stan; presumably there are no automatic Stan-to-Greta converters either?
No but they do offer help for translating from Stan to Greta: https://greta-stats.org/articles/example_models.html
I think it’d be about the same amount of work to convert (and check) the code as it would be to do the diagram by hand.
Yeah I think the (most?) interesting aspect to Greta is using TensorFlow to fit probability models in R.
The diagrams to which you refer are for specifying probabilistic graphical models. The Stan Modeling Language, on the other hand, is richer than just probabilistic graphical models, which means that we cannot define diagrams for a general Stan program. Moreover, it turns out to be theoretically impossible to translate those limited Stan programs that are equivalent to graphical models into a graphical model, and hence into graphical model representations like diagrams. Heuristic translations can go a long way, but given the tricky edge cases no one has attempted anything along this line.
Came across this nice new R package for visualizing DAGs yesterday thanks to @bgoodri, which may have been helpful to you: ggdag. The video content is here: https://youtu.be/3p5zCXoggtA?t=2483. It’s not auto-generating but it could be the best alternative in R at the moment. What did you end up using?
Exactly what I was searching for, to see the structure and causal relations in my Stan function. The examples are so easy to code, it’s easier than sketching it on a piece of paper.
Also worth checking out dagitty for causal DAGs; the online version has a nice GUI to play with: http://www.dagitty.net/development/dags.html
Wait, why is it theoretically impossible to generate a diagram from a Stan model that is equivalent to a graphical model? If anyone reading this wants to learn OCaml and help build such a tool based on the AST in the new compiler, I’m creating an issue here: https://github.com/stan-dev/stanc3/issues/177
See: ICAR. The real question isn’t about what is theoretically possible but whether, in practice, you can take a model I specify with `target +=` and make a useful DAG-like visualization out of it.
Agreed - we don’t need it to work in 100% of cases including halting-problem-style pathological examples. It’d be great if it worked for 50% of models people write already and helped influence people to write more generative models.
What are commonly classed as “probabilistic graphical models” are a very particular subset of probabilistic models, namely those faithfully specified as directed acyclic graphs. Another common class of graphical models are those faithfully specified as undirected graphs. The overlap between these two classes is only partial: some models are faithfully specified by one, some by both, and some by neither. Additionally, there are generalizations of graphical models that expand the scope of faithful representations. See Bishop 8.3.4.
The Stan language goes beyond all of this by requiring only a density representation. Converting a Stan program into a graphical specification will be well-posed only if the model falls into the domain of the corresponding graph type, and then implementable only if one can identify that correspondence in finite time.
The problem with heuristic translations beyond their heuristic nature is that they confuse the intent of the Stan language. If we wanted a pure graphical language then we would have written an entirely different language. Indeed this is the approach that BUGS, PyMC, and others have taken which is why they have such tools.
In my opinion, throwing down incomplete heuristics without a hell of a good UX that is able to communicate what the graphical model representation means and what it doesn’t mean is only going to confuse users and hence will be more danger than benefit. And I haven’t seen any discussion of such a UX at all.
I think it’d be more interesting if it was fast. When I tried Greta, it was super slow. Here’s the thread where I evaluated it. It’d be interesting to know if it’s gotten faster or if there was just something I was doing wrong.
I agree. I don’t want something that only works on a subset of the Stan language (one with only sampling statements, single assignment to variables, no conditionals, and no local variables).
I think a more viable approach for us would be to define a directed graphical modeling language like BUGS/JAGS/PyMC3 and a translation of that to Stan. That’d let us do all the cool stuff you can do with a directed graphical model like do automatic simulation, allow missing data, etc.
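To make the idea concrete, here is a toy Python sketch of what "a directed spec plus a translation to Stan" could look like. Everything in it is an assumption for illustration: the spec format, the `to_stan_model_block` and `simulate` names, and the choice of distributions are all made up, and it only emits a model block (a complete Stan program would also need data and parameters declarations).

```python
import random
from typing import Callable, Dict, List, Tuple

# Hypothetical spec format: nodes are listed parents-first. Each node has a
# name, a distribution name, argument expressions written the way Stan would
# accept them, and a Python sampler used only for forward simulation.
Sampler = Callable[[Dict[str, float], random.Random], float]
Node = Tuple[str, str, List[str], Sampler]

spec: List[Node] = [
    ("mu", "normal", ["0", "10"], lambda v, rng: rng.gauss(0.0, 10.0)),
    ("sigma", "exponential", ["1"], lambda v, rng: rng.expovariate(1.0)),
    ("y", "normal", ["mu", "sigma"], lambda v, rng: rng.gauss(v["mu"], v["sigma"])),
]


def to_stan_model_block(nodes: List[Node]) -> str:
    """Emit one sampling statement per node, wrapped in a model block."""
    body = [f"  {name} ~ {dist}({', '.join(args)});" for name, dist, args, _ in nodes]
    return "\n".join(["model {"] + body + ["}"])


def simulate(nodes: List[Node], seed: int) -> Dict[str, float]:
    """Ancestral sampling: draw each node given its already-drawn parents."""
    rng = random.Random(seed)
    values: Dict[str, float] = {}
    for name, _, _, sampler in nodes:
        values[name] = sampler(values, rng)
    return values


print(to_stan_model_block(spec))
print(simulate(spec, seed=1))
```

The point of the sketch is that the same spec drives both the Stan output and the forward simulation, which is the kind of thing a pure density specification does not give you for free.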
Why not just extract a factor graph (these encompass both directed and undirected graphical models) from a Stan program? Ryan already has implemented the code to do this. Then you can visualize that. This is something that is always possible.
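As a rough illustration of what a factor graph view looks like (this is not Ryan's code and not anything from stanc3), here is a minimal Python sketch. It assumes a hypothetical, hand-written list of factors, one per term added to the log density, each listing the variables it mentions; a real tool would pull that from the compiler's AST.

```python
from typing import List, Tuple

# Hypothetical input: (factor name, variables mentioned by that factor).
Factor = Tuple[str, List[str]]


def factor_graph_edges(factors: List[Factor]) -> List[Tuple[str, str]]:
    """Return bipartite (factor, variable) edges, one per distinct mention."""
    edges = []
    for name, variables in factors:
        for v in dict.fromkeys(variables):  # de-duplicate, keep order
            edges.append((name, v))
    return edges


def to_dot(factors: List[Factor]) -> str:
    """Render as Graphviz DOT: boxes are factors, circles are variables."""
    lines = ["graph factor_graph {"]
    for v in dict.fromkeys(v for _, vs in factors for v in vs):
        lines.append(f'  "{v}" [shape=circle];')
    for name, _ in factors:
        lines.append(f'  "{name}" [shape=box];')
    for f, v in factor_graph_edges(factors):
        lines.append(f'  "{f}" -- "{v}";')
    lines.append("}")
    return "\n".join(lines)


# Toy regression: y ~ normal(alpha + beta * x, sigma) plus three priors.
factors = [
    ("prior_alpha", ["alpha"]),
    ("prior_beta", ["beta"]),
    ("prior_sigma", ["sigma"]),
    ("likelihood", ["y", "alpha", "beta", "x", "sigma"]),
]
dot = to_dot(factors)
print(dot)
```

The DOT text can be pasted into any Graphviz renderer. Note that the graph is undirected on purpose: a factor only records which variables appear together, not which one is "generated" from which.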
Moreover, it’s easy to check whether a Stan program is a generative model using static analysis. If so, we can give users the option to also visualize it as a DAG if they want. I don’t see the problem.
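A minimal sketch of that kind of check, in Python and under strong assumptions: the program has already been reduced to a hypothetical list of sampling statements `(lhs, rhs_variables)`, with unmodeled inputs such as covariates given separately. The function name and input format are made up for illustration. It treats the program as generative if every variable is sampled exactly once and the dependencies are acyclic.

```python
from typing import Dict, List, Optional, Set, Tuple

# Hypothetical input: (sampled variable, variables on the right-hand side).
Statement = Tuple[str, List[str]]


def generative_order(statements: List[Statement], inputs: Set[str]) -> Optional[List[str]]:
    """Return a parents-first ordering if the statements form a DAG, else None."""
    parents: Dict[str, List[str]] = {}
    for lhs, rhs in statements:
        if lhs in parents or lhs in inputs:
            return None  # sampled twice, or a fixed input given a distribution
        parents[lhs] = rhs
    for rhs in parents.values():
        if any(v not in parents and v not in inputs for v in rhs):
            return None  # a variable is used but never given a distribution

    order: List[str] = []
    state: Dict[str, int] = {}  # 1 = visit in progress, 2 = finished

    def visit(v: str) -> bool:
        if v in inputs or state.get(v) == 2:
            return True
        if state.get(v) == 1:
            return False  # back edge, so there is a cycle
        state[v] = 1
        if not all(visit(p) for p in parents[v]):
            return False
        state[v] = 2
        order.append(v)
        return True

    for v in parents:
        if not visit(v):
            return None
    return order


regression = [
    ("alpha", []),
    ("beta", []),
    ("sigma", []),
    ("y", ["alpha", "beta", "x", "sigma"]),
]
print(generative_order(regression, inputs={"x"}))
print(generative_order([("a", ["b"]), ("b", ["a"])], inputs=set()))
```

When an ordering comes back, it is also the order in which the DAG could be drawn or simulated from. Anything written directly against `target +=` would not reduce to this input format in the first place, which is where the hard cases live.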
Not sure if it’s easy to check whether the program corresponds to an undirected graphical model. If so, then we can also allow people to visualize the models that way if they so desire.
After the 0.3 release greta is now much faster. We don’t have specific benchmarks but small models which used to take minutes now sample in a few seconds.
I do like the idea of a graphical modelling language which compiles to Stan - that sort of gives you the best of both worlds: a high level* UI for those that don’t want to get into the nitty gritty and the more low level Stan interface for those who need to.
* I would naively imagine that such a graphical language would include fewer types etc. than Stan.