Unexpected omission from instrumentalVariables() output #99

Dagitty has been very helpful to students in my class, but this behavior (bug?) tripped many of them up on an assignment. In a more complex version of the example below, it made it look as though there were no usable valid instruments.

In the first DAG below, `U` is an unobserved confounder, and the only valid instrument is `Z`, conditional on `X`. Dagitty correctly identifies this.

    library(dagitty)
    
    dag_latent <- dagitty('dag {
    D [outcome,pos="0.887,1.229"]
    E [exposure,pos="-0.025,1.218"]
    U [latent,pos="-1.929,0.187"]
    X [pos="-1.462,1.224"]
    Z [pos="-0.794,1.229"]
    E -> D
    U -> D
    U -> X
    X -> Z
    Z -> E
    }')
    
    instrumentalVariables(dag_latent)

> Z |  X

However, if we forget to mark `U` as latent, Dagitty reports that `X` and `Z` are each valid instruments conditional on `U`.

    dag_observed <- dagitty('dag {
    D [outcome,pos="0.887,1.229"]
    E [exposure,pos="-0.025,1.218"]
    U [pos="-1.929,0.187"]
    X [pos="-1.462,1.224"]
    Z [pos="-0.794,1.229"]
    E -> D
    U -> D
    U -> X
    X -> Z
    Z -> E
    }')
    
    instrumentalVariables(dag_observed)

> X |  U
> Z |  U

The listed conditional instruments are not wrong, and an alert user would just ignore them as unusable because we can't actually condition on the unobserved `U`. But `Z | X` is now missing, even though it is still a (conditionally) valid instrument. This could easily lead a user to incorrectly conclude that there is no way to perform a valid instrumental variable analysis because all of the valid instruments condition on an unobserved variable.
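For reference, here is a minimal sketch of how the claim that `Z | X` is still valid can be checked by hand with d-separation queries, assuming the usual graphical criteria for a conditional instrument (the instrument is d-connected to the exposure given the conditioning set, d-separated from the outcome given that set once the `E -> D` edge is removed, and the conditioning set contains no descendant of the outcome). The object names `dag_full`, `dag_no_effect`, and `checks` are mine and purely illustrative; this is not how `instrumentalVariables()` works internally.

```r
library(dagitty)

# Same structure as dag_observed above (U is not marked as latent)
dag_full <- dagitty('dag {
D [outcome]
E [exposure]
E -> D
U -> D
U -> X
X -> Z
Z -> E
}')

# Same graph with the exposure -> outcome edge (E -> D) removed
dag_no_effect <- dagitty('dag {
D [outcome]
E [exposure]
U -> D
U -> X
X -> Z
Z -> E
}')

checks <- c(
  # (1) relevance: Z is d-connected to E given X
  relevant = dconnected(dag_full, "Z", "E", "X"),
  # (2) exclusion: Z is d-separated from D given X once E -> D is removed
  excluded = dseparated(dag_no_effect, "Z", "D", "X"),
  # (3) the conditioning variable X is not a descendant of the outcome D
  no_descendant = !("X" %in% descendants(dag_full, "D"))
)

checks
```

If all three entries are `TRUE`, then `Z | X` satisfies these criteria regardless of whether `U` is marked as latent, since d-separation does not depend on that attribute.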

This differs from `adjustmentSets()` in two ways:

1. `adjustmentSets()` fairly explicitly indicates that, by default, the returned list contains *minimal* adjustment sets. There is no such indication for `instrumentalVariables()`. If the behavior is intentional, it's not clear why, or how minimality is defined for instruments.
2. As far as I can tell, there's no scenario where marking a variable as latent returns a minimal adjustment set that isn't already returned when that variable is not marked as latent. Marking a variable as latent only removes minimal adjustment sets (the ones that include it). I don't know of a proof for this, though, so it may just be a failure of imagination on my part.
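
The second point can be spot-checked on the example DAG with a short sketch like the one below. It covers this one graph only and proves nothing in general. The helper names (`sets_observed`, `sets_latent`, `latent_in_observed`) are mine and illustrative.

```r
library(dagitty)

# Example DAG with U treated as observed
dag_observed <- dagitty('dag {
D [outcome]
E [exposure]
E -> D
U -> D
U -> X
X -> Z
Z -> E
}')

# Copy of the same DAG with U marked as latent
dag_latent <- dag_observed
latents(dag_latent) <- "U"

# Minimal adjustment sets (the default) for each version
sets_observed <- adjustmentSets(dag_observed)
sets_latent <- adjustmentSets(dag_latent)

# For each set found with U latent, is the same set also found with U observed?
latent_in_observed <- vapply(
  sets_latent,
  function(s) any(vapply(sets_observed, function(o) setequal(o, s), logical(1))),
  logical(1)
)

all(latent_in_observed)
```

Running the analogous comparison on the output of `instrumentalVariables()` is where the two functions appear to differ, since `Z | X` shows up only when `U` is marked as latent.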
