Contradicting solutions to pipes in DAGs in Statistical Rethinking 2

Hi @richard_mcelreath,
I was reading through the 2nd edition preprint and took a look at your lecture and have a question regarding the treatment of pipes (in DAGs).
You use the fungal treatment example that has a DAG looking a little like this:
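[Image: DAG for the fungal treatment example, containing the path T \rightarrow F \rightarrow H_1]
In the book you say that conditioning on F removes the effect of T in our model (I am not sure that wording is good). Essentially, you want to omit F from your model to get the effect of T on H_1.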

However, I would read T \rightarrow F \rightarrow H_1 as the pipe here, and pipes are “Open unless you condition on Z [F in this DAG]”, which I would read as “add F to your model to close the pipe”.
So now I have “omit F from your model to get the effect of T on H_1” and “add F to your model to close the pipe”, which contradict each other.

Another example you give is the grandparents/parents education one:
[Image: DAG for the grandparents/parents education example, with nodes G, P, C and U]
In your lecture you list the following paths from G to C:
(1) G \rightarrow C
(2) G \rightarrow P \rightarrow C
(3) G \rightarrow P \leftarrow U \rightarrow C
And the comment: “Conditioning on P closes (2) but opens (3)”
Again, G \rightarrow P \rightarrow C looks like a pipe to me, so conditioning on P closes it, based on “Open unless you condition on Z”. However, if we ignore U, what is the difference from the first example?

I feel like it could be something of a direct vs indirect effect thing but am not sure how to work with that:

  • For the plant example, the treatment itself has no direct effect on plant growth. Only through the hindering of fungal growth does it impact H_1.
  • In the Grandparents example, there might be a direct effect of GP education on children.

The problem I have is that any kind of soil treatment could (I think) also directly influence plant growth.
So one could rewrite the first DAG like this:
[Image: the fungal DAG with an added direct edge T \rightarrow H_1]
Which makes it the same as the GP graph if we ignore U.
Conditioning on P in the GP graph closes (2), so by the same logic we should condition on F in the fungal graph to close the corresponding path T \rightarrow F \rightarrow H_1.

Could you maybe give me a helping hand here (or anyone else who got DAGs)?

I am a bit unsure what the exact question is, but here are some thoughts.

I think your mentioning of direct and indirect effects is important.
It seems to me that DAGs are frequently used to see if it is possible/how to estimate the total effect, where

total_effect = direct_effect + indirect_effect

About the first DAG you show: Because this is missing a direct edge T \rightarrow H_1, this DAG implies the assumption that there is no direct effect.

If you think there is a direct effect, your third DAG represents this well. If you now want to estimate the total effect of T on H_1, you must not adjust for F. If you adjust for F, you get the direct effect T \rightarrow H_1.
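
A quick simulation can make the total vs. direct distinction concrete. This is only a sketch: the coefficients (T lowers F by 1, F lowers H_1 by 2, T raises H_1 directly by 0.5) are made up for illustration and are not taken from the book, and `ols` is a small hypothetical helper written here to avoid dependencies. Under these assumptions the total effect is 0.5 + (-1)(-2) = 2.5 and the direct effect is 0.5.

```python
import random

def ols(predictors, y):
    """Least-squares coefficients for y ~ 1 + predictors (intercept first)."""
    X = [[1.0] + [col[i] for col in predictors] for i in range(len(y))]
    k = len(X[0])
    # Normal equations (X'X) b = X'y, stored as an augmented matrix.
    A = [[sum(r[a] * r[b] for r in X) for b in range(k)]
         + [sum(r[a] * yi for r, yi in zip(X, y))] for a in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [x - f * z for x, z in zip(A[r], A[c])]
    return [A[i][k] / A[i][i] for i in range(k)]

rng = random.Random(1)
n = 20_000

# Assumed data-generating process for the DAG with a direct edge T -> H_1.
T = [float(rng.random() < 0.5) for _ in range(n)]           # treatment (0/1)
F = [-1.0 * t + rng.gauss(0, 1) for t in T]                 # T -> F
H1 = [0.5 * t - 2.0 * f + rng.gauss(0, 1)                   # T -> H_1, F -> H_1
      for t, f in zip(T, F)]

total_est = ols([T], H1)[1]      # F omitted: estimates the total effect
direct_est = ols([T, F], H1)[1]  # F included: estimates the direct effect

print("H1 ~ T     coefficient on T:", round(total_est, 2))
print("H1 ~ T + F coefficient on T:", round(direct_est, 2))
```

The first coefficient should land near the assumed total effect and the second near the assumed direct effect, so both models are "right"; they just answer different questions.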

About the second DAG: It is correct that conditioning on P closes (2) but opens (3), because the path G \rightarrow P \leftarrow U \rightarrow C is initially closed at the collider/inverted fork point \rightarrow P \leftarrow . If you condition on P, however (in an attempt to close G \rightarrow P \rightarrow C), you will open the path G \rightarrow P \leftarrow U \rightarrow C.
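
The collider behaviour at P can also be checked with a small simulation. Again this is a sketch under assumptions of my own: all coefficients are invented, the direct effect G \rightarrow C is set to zero on purpose, and `ols` is a hypothetical dependency-free helper. U is generated here but would be unobserved in real data.

```python
import random

def ols(predictors, y):
    """Least-squares coefficients for y ~ 1 + predictors (intercept first)."""
    X = [[1.0] + [col[i] for col in predictors] for i in range(len(y))]
    k = len(X[0])
    # Normal equations (X'X) b = X'y, stored as an augmented matrix.
    A = [[sum(r[a] * r[b] for r in X) for b in range(k)]
         + [sum(r[a] * yi for r, yi in zip(X, y))] for a in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [x - f * z for x, z in zip(A[r], A[c])]
    return [A[i][k] / A[i][i] for i in range(k)]

rng = random.Random(2)
n = 20_000

# Assumed data-generating process: G -> P <- U -> C, P -> C, no direct G -> C.
G = [rng.gauss(0, 1) for _ in range(n)]
U = [rng.gauss(0, 1) for _ in range(n)]
P = [g + 2.0 * u + rng.gauss(0, 1) for g, u in zip(G, U)]
C = [0.0 * g + p + 2.0 * u + rng.gauss(0, 1) for g, p, u in zip(G, P, U)]

g_total = ols([G], C)[1]           # no adjustment: total effect of G (via P)
g_given_p = ols([G, P], C)[1]      # adjusting for P opens G -> P <- U -> C
g_given_pu = ols([G, P, U], C)[1]  # also adjusting for U closes it again

print("C ~ G         coefficient on G:", round(g_total, 2))
print("C ~ G + P     coefficient on G:", round(g_given_p, 2))
print("C ~ G + P + U coefficient on G:", round(g_given_pu, 2))
```

With these made-up numbers, the unadjusted model should recover the total effect of about 1, the model with P should give a clearly negative coefficient on G (around -0.8) even though the true direct effect is zero, and only the model that also includes U should return something near zero.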

Hope this helps.

Also check out http://www.dagitty.net/, which will show you the implied conditional independencies for your DAG. You can use these to check whether the DAG is a good representation of your data.

It sounds like my confusion comes from the direct vs indirect effect thing.
Thanks for taking the time to answer :)