I’m wondering if it is possible to get the Stan compiler to output a factor graph for a Stan model.
From some previous posts in 2019 and 2021, it seems like the computation of the factor graph was implemented in stanc to support diagnostics like checking for multiple priors for a single parameter. I can see in the stanc documentation that there is a utility function factor_graph_to_dot to export this factor graph to a file, but as far as I can tell, there is no user-facing way to invoke this function for a given model.
So ultimately my questions are:
Is this considered mature functionality in the compiler?
Would extracting this factor graph be as simple as finding a way to make the compiler invoke factor_graph_to_dot during compilation?
If the answers to these two questions are both ‘yes’, then my follow-up question would be whether it would be reasonable to make an issue/PR for this?
Hi, @CollinCademartori, and welcome to the Stan forums. @WardBrian should be able to give you a definitive answer as to what stanc is doing. Given that you’re in town, you might want to just visit us at Flatiron and then we can talk about what you’re trying to do.
Extracting an exact factor graph in Stan is an undecidable problem, so it’s really just a matter of what approximation is being used. I’m pretty sure from previous messages on our forums that stanc is not trying to compute a factor graph per se, but is just counting the number of times a variable shows up on the left-hand side of a distribution statement with ~.
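To make that kind of approximation concrete, here is a rough Python sketch of counting how often each name appears textually on the left of ~. This is not stanc’s implementation (stanc is written in OCaml and works on a parsed program, not on raw text); the regex, the `count_tilde_lhs` helper, and the toy model are all made up for illustration.

```python
import re
from collections import Counter

# Matches "<name> ~" or "<name>[...] ~" at the start of a statement.
# Hypothetical pattern for illustration; a real tool would use the parser.
TILDE_LHS = re.compile(r"^\s*([A-Za-z_]\w*)\s*(?:\[[^\]]*\])?\s*~", re.MULTILINE)

def count_tilde_lhs(stan_source: str) -> Counter:
    """Count textual occurrences of each name on the left of `~`."""
    # Drop // line comments so commented-out statements are not counted.
    no_comments = re.sub(r"//[^\n]*", "", stan_source)
    return Counter(TILDE_LHS.findall(no_comments))

# Toy model, assumed for this example only.
MODEL = """
parameters {
  real mu;
  real<lower=0> sigma;
}
model {
  mu ~ normal(0, 1);
  mu ~ normal(0, 10);  // second prior on mu
  sigma ~ exponential(1);
  y ~ normal(mu, sigma);
}
"""

counts = count_tilde_lhs(MODEL)
repeated = sorted(name for name, n in counts.items() if n > 1)
print(counts["mu"], counts["sigma"], repeated)  # prints: 2 1 ['mu']
```

A purely textual count like this ignores control flow, so a statement inside a loop or a branch that never runs is still counted once.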
P.S. I haven’t heard anyone talk about dot for 20 years! I used to use it in the 90s.
The functionality is indeed used in the compiler for diagnostics. I believe the specifics are that it computes a strict over-estimate, meaning some edges may be spurious. The code there pre-dates my time working on the compiler.
That said, I think it would be very reasonable to add it behind a debug flag and see what kind of output you get. I’ve never myself called that function!
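For a sense of what such output could look like, here is a minimal Python sketch that builds an over-approximate factor graph and renders it as Graphviz DOT. It is an assumption-laden illustration, not stanc’s code: I have not checked what factor_graph_to_dot actually emits, and the `factor_graph` and `to_dot` helpers and the statement list are hypothetical.

```python
import re

def factor_graph(statements, variables):
    """Map each factor id to the set of known variables named in its statement.

    Every variable mentioned anywhere in the statement gets an edge, so the
    result can only over-estimate the true dependencies.
    """
    known = set(variables)
    graph = {}
    for i, stmt in enumerate(statements):
        names = set(re.findall(r"[A-Za-z_]\w*", stmt))
        graph[f"f{i}"] = names & known  # drops function names like "normal"
    return graph

def to_dot(graph):
    """Render the bipartite factor/variable graph in Graphviz DOT syntax."""
    lines = ["graph factor_graph {"]
    for factor in sorted(graph):
        lines.append(f"  {factor} [shape=box];")  # factors drawn as boxes
        for var in sorted(graph[factor]):
            lines.append(f"  {factor} -- {var};")
    lines.append("}")
    return "\n".join(lines)

# Toy statements, assumed for this example only.
statements = [
    "mu ~ normal(0, 1)",
    "sigma ~ exponential(1)",
    "y ~ normal(mu, sigma)",
]
graph = factor_graph(statements, variables=["mu", "sigma", "y"])
print(to_dot(graph))
```

The printed text can be saved to a file and passed to Graphviz (for example `dot -Tpng`) to draw the graph.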
Thanks for the invitation, @Bob_Carpenter! Unfortunately, I’m no longer in town. I graduated last May and am now at Wake Forest. I’m interested in extracting a factor graph because I’ve been thinking about ways of summarizing/explaining posterior uncertainty in complex models that involve finding paths in a factor graph for the model.
Is the undecidability of the factor graph problem due to the fact that the posterior factorization may not be determined in Stan until runtime (e.g. if I have if statements in the model block)? Or is the issue deeper than that?
And thanks @WardBrian for the clarification on what the compiler is doing! I might try my hand at this if I can figure out enough OCaml to do it.
Oh, right. Either Mitzi or Charles had told me you moved to Wake Forest.
It’s due to the fact that Stan allows loops, assignment, and conditionals, which is enough to make it Turing-equivalent. Thus we’re subject to the halting problem.
As a consequence, we can’t even decide which lines of code will get executed by just looking at a Stan program.
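Here is a small Python illustration of the point, written as a stand-in for a Stan model block rather than as real Stan. The `executed_factors` function and the statements it records are invented for the example. Which factor ends up in the density depends on a data-dependent while loop, and whether that particular loop terminates for every positive input is the open Collatz problem, so no static analysis can simply read the answer off the source.

```python
def executed_factors(n: int) -> list:
    """Return the distribution statements that would run for data value n.

    Hypothetical stand-in for a model block with a while loop and an if/else.
    """
    factors = ["mu ~ normal(0, 1)"]
    steps = 0
    # Data-dependent loop: nothing below runs unless this loop exits.
    while n != 1:
        n = n // 2 if n % 2 == 0 else 3 * n + 1
        steps += 1
    # Which statement touches sigma depends on what the loop computed.
    if steps % 2 == 0:
        factors.append("sigma ~ exponential(1)")
    else:
        factors.append("sigma ~ lognormal(0, 1)")
    return factors

print(executed_factors(6))  # 8 steps, so the exponential statement runs
print(executed_factors(3))  # 7 steps, so the lognormal statement runs
```

A static over-approximation would include both statements for sigma, which is the sense in which some edges in a compiler-produced graph may be spurious.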