Good morning.
I guess my question goes beyond the specifics of Stan and relates more to the core principle of using causal diagrams to model a phenomenon. How can software such as JAGS or Stan estimate the posterior density by MCMC from a DAG?
By this I mean that for the Metropolis-Hastings algorithm to work, the posterior density (the quantity of interest in the Bayesian framework) must be known analytically up to a constant. Indeed, if I want to sample from a target distribution p that is known only up to a constant cst, i.e. p = cst*p_tilde, Metropolis-Hastings lets me do so with a proposal density q: each draw from q is accepted with a probability that depends on p_tilde and q. But I need to know p_tilde to be able to sample from p.
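To make this concrete, here is a minimal sketch of what I have in mind. It assumes a symmetric Gaussian random-walk proposal, so the q terms cancel in the acceptance ratio, and an arbitrary illustrative target, a normal density with mean 3 written without its normalizing constant. The names `log_p_tilde` and `metropolis` are mine and do not come from any library.

```python
import math
import random


def log_p_tilde(x):
    # Illustrative target: N(3, 1) known only up to a constant (log scale).
    return -0.5 * (x - 3.0) ** 2


def metropolis(log_target, n_samples, x0=0.0, step=1.0, seed=0):
    """Random-walk Metropolis using only the unnormalized log density."""
    rng = random.Random(seed)
    x = x0
    log_px = log_target(x)
    samples = []
    for _ in range(n_samples):
        proposal = x + rng.gauss(0.0, step)  # draw from the symmetric proposal q
        log_pp = log_target(proposal)
        # Accept with probability min(1, p_tilde(proposal) / p_tilde(x)).
        # 1 - random() lies in (0, 1], so the logarithm is always defined.
        if math.log(1.0 - rng.random()) < log_pp - log_px:
            x, log_px = proposal, log_pp
        samples.append(x)
    return samples


samples = metropolis(log_p_tilde, 20000)
kept = samples[2000:]  # discard the first draws as warm-up
mean = sum(kept) / len(kept)
var = sum((s - mean) ** 2 for s in kept) / len(kept)
```

The constant cst never appears: only the difference of log p_tilde values enters the acceptance step, which is why knowing p_tilde is enough.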
Now, when I use JAGS or Stan, I specify a DAG with distributions linking my parameters, I press enter, and MAGIC, I get a posterior distribution. This means that the software has computed p_tilde at some point. But how?
For software that fits regression coefficients using MCMC, such as brms, we actually have an analytical expression for the posterior density of the GLM parameters, so in that case I think I understand how it works. The densities of the standard distributions are hard-coded (depending on whether you're doing Poisson, logistic or normal regression); once they have been multiplied together (GLM independence assumption) and with the prior densities, which are also standard, you get the joint density and hence p_tilde. As demonstrated in the Molenberghs and Verbeke book, we have a known expression:
Then I guess brms automatically translates the model into Stan code in the back end, and Metropolis-Hastings produces chains whose distribution is the posterior we wanted to estimate.
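For the GLM case, my understanding of "multiply the likelihood terms together, then multiply by the priors" can be sketched as follows. This is only an illustration under my own assumptions: a Poisson regression with a log link, one covariate, and independent normal priors with standard deviation `prior_sd` on both coefficients. It is not the code that brms or Stan actually generates, and the function name and the toy data are made up.

```python
import math


def log_p_tilde(beta, xs, ys, prior_sd=10.0):
    """Unnormalized log posterior of a Poisson regression with a log link."""
    b0, b1 = beta
    # Independent N(0, prior_sd^2) priors; additive constants are dropped.
    total = -0.5 * (b0 ** 2 + b1 ** 2) / prior_sd ** 2
    for x, y in zip(xs, ys):
        eta = b0 + b1 * x  # linear predictor, with mu = exp(eta)
        # Poisson log-pmf: y*log(mu) - mu - log(y!)
        total += y * eta - math.exp(eta) - math.lgamma(y + 1)
    return total


# Toy data, invented for illustration only.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1, 2, 4, 9]
near = log_p_tilde((0.0, 0.7), xs, ys)   # coefficients roughly matching the data
far = log_p_tilde((2.0, -1.0), xs, ys)   # coefficients that fit poorly
```

Products of densities become sums on the log scale, so a function like this is all a sampler would need to evaluate at each proposed value of the coefficients.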
But if we draw a diagram whose vertices are variables, connected by arrows to which we assign deterministic or stochastic relationships, and we put priors on the hyperparameters, how can that work, since the software has no analytical form of the posterior density from which to obtain p_tilde?
Thank you.