In many (quasi-)experimental settings with an encouragement design, the presence of defiers is common and plausible. For example, Angrist and Evans (1998) use an instrumental variable to study the effect of family size on labor supply. Among families with at least two children, they use the sex composition of the first two children as the instrument for family size, assuming that parents are more inclined to have another child when their first two children are of the same sex.

The potential defiers in this setting are the sub-population of parents who are sex-biased. For instance, some parents may prefer to have two daughters or two sons for cultural reasons (see Dahl and Moretti 2008, cited in de Chaisemartin 2017).

When this is the case, the treatment status of this sub-population moves in the direction opposite to the theoretical expectation. Substantively, this implies that these sex-biased parents tend to keep their family size unchanged despite having two children of the same sex.

In such situations, the ratio of the reduced-form (RF) to the first-stage (FS) coefficient no longer yields a local average treatment effect (LATE), even when the other assumptions hold.

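Assuming a binary treatment and instrument for simplicity, the failure of monotonicity biases the reduced form scaled by the complier share, which no longer equals LATE (the bias is downward whenever the defiers' effect, DATE, is positive), as shown below: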

\[\begin{align} \frac{\mathbb{E}[Y| Z = 1] - \mathbb{E}[Y|Z = 0]}{P[D(1) > D(0)]} = \text{LATE} - \frac{P[D(1) < D(0)]}{P[D(1) > D(0)]} \times \text{DATE} \end{align}\]

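where $\text{DATE} = \mathbb{E}[Y(1) - Y(0)|D(1) < D(0)]$ is the average treatment effect among defiers, i.e. the defiers' counterpart of LATE. Note that the denominator on the left-hand side is the complier share $P[D(1) > D(0)]$, whereas the first stage equals $P[D(1) > D(0)] - P[D(1) < D(0)]$; the two coincide only when there are no defiers.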
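As a numerical sanity check of the identity above, the following minimal sketch computes both sides exactly for a made-up population. The type shares and mean potential outcomes are illustrative assumptions (they are not taken from Angrist and Evans or de Chaisemartin), and the instrument is assumed to be independent of the compliance type.

```python
from fractions import Fraction as F

# Hypothetical population, for illustration only.
# name: (share, D(0), D(1), E[Y(0) | type], E[Y(1) | type])
TYPES = {
    "complier":     (F(5, 10), 0, 1, F(1), F(3)),  # average effect 2 (LATE)
    "defier":       (F(1, 10), 1, 0, F(2), F(3)),  # average effect 1 (DATE)
    "always_taker": (F(2, 10), 1, 1, F(0), F(3)),
    "never_taker":  (F(2, 10), 0, 0, F(4), F(4)),
}


def mean_outcome(z):
    """E[Y | Z = z] when Z is independent of type: each type realises Y(D(z))."""
    total = F(0)
    for share, d0, d1, y0, y1 in TYPES.values():
        d = d1 if z == 1 else d0
        total += share * (y1 if d == 1 else y0)
    return total


p_c, _, _, c_y0, c_y1 = TYPES["complier"]
p_f, _, _, f_y0, f_y1 = TYPES["defier"]
late = c_y1 - c_y0  # E[Y(1) - Y(0) | D(1) > D(0)]
date = f_y1 - f_y0  # E[Y(1) - Y(0) | D(1) < D(0)]

# Left-hand side: reduced form divided by the complier share.
lhs = (mean_outcome(1) - mean_outcome(0)) / p_c
# Right-hand side: LATE minus the defier bias term.
rhs = late - (p_f / p_c) * date

print(lhs, rhs)
```

With these assumed numbers both sides equal 9/5, which is below the complier effect of 2 because the defiers' effect is positive.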

In the labor-market setting above, if family size has no effect on the mother’s decision to enter the job market among defiers (that is, $\text{DATE} = 0$), then the bias term vanishes and the presence of this sub-population does not prevent the identification of LATE. But a cultural preference over the sex composition of the family may correlate with the mother’s decision to work, which makes a zero effect of family size on labor outcomes among defiers a hard assumption to defend. In general, therefore, when the treatment effect among defiers is non-zero, LATE is not identified.

de Chaisemartin (2017) suggests a complementary assumption to recover a LATE under monotonicity failure by decomposing the compliers into two sub-groups, called complier-survival (CS) and complier-defier (CD). In this strategy, CD must resemble the defiers in two respects:

  1. Same proportion: $P(CD = 1) = P[D(1) < D(0)]$;
  2. Same effect: $\mathbb{E}[Y(1) - Y(0)|CD = 1] = \mathbb{E}[Y(1) - Y(0)| D(1) < D(0)]$.

Under this decomposition, the Wald estimand no longer has the interpretation of a LATE for all compliers; instead, it gives the LATE of the complier-survival group.

Proof

The proof covers the special case in which the instrument and the treatment are both binary.

The equations below show that the reduced form, once divided by the complier probability, equals a LATE-related quantity minus a bias term. As shown previously, the strategy is to expand the RF using the potential-outcomes framework and the law of total probability (LTP):

\[\begin{align} \mathbb{E}[Y|Z = 1] - \mathbb{E}[Y|Z = 0] &= \mathbb{E}\bigg[\bigg(D(1) - D(0)\bigg)\bigg(Y(1) - Y(0)\bigg)\bigg] \\ &= \mathbb{E}[Y(1) - Y(0)|D(1) > D(0)]\times P(D(1) > D(0)) \\ &- \mathbb{E}[Y(1) - Y(0)|D(1) < D(0)]\times P(D(1) < D(0)) \end{align}\]

Dividing both sides by the complier probability, $P(D(1) > D(0))$, gives the result.
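The first equality in the expansion above compresses a few steps. A sketch of the missing algebra, assuming the instrument $Z$ is independent of the potential outcomes and potential treatments (one of the other assumptions taken to hold here), and writing the observed outcome as $Y = Y(0) + D\,(Y(1) - Y(0))$ with $D = D(Z)$:

\[\begin{align*} \mathbb{E}[Y|Z = z] &= \mathbb{E}[Y(0) + D(z)(Y(1) - Y(0))|Z = z] \\ &= \mathbb{E}[Y(0)] + \mathbb{E}[D(z)(Y(1) - Y(0))] \\ \mathbb{E}[Y|Z = 1] - \mathbb{E}[Y|Z = 0] &= \mathbb{E}\big[(D(1) - D(0))(Y(1) - Y(0))\big] \end{align*}\]

Since $D(1) - D(0)$ equals $1$ for compliers, $-1$ for defiers, and $0$ for everyone else, the law of total probability splits the last expectation into the complier term and the negatively signed defier term.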

The second part of the proof shows that, under the decomposition, the Wald estimand yields the LATE of the complier-survival group. First, we expand the first term of the last equality from the first part of the proof:

\[\begin{align*} \mathbb{E}[Y(1) - Y(0)|D(1) > D(0)] &= \mathbb{E}[Y(1) - Y(0)|CS = 1] \times \frac{P(CS = 1)}{P(D(1) > D(0))} \\ &+ \mathbb{E}[Y(1) - Y(0)|CD = 1] \times \frac{P(CD = 1)}{P(D(1) > D(0))} \end{align*}\]

Plugging the second term above back in and combining it with the defier-related term, the same-proportion and same-effect assumptions make the two cancel:

\[\begin{align*} &= \mathbb{E}[Y(1) - Y(0)|CD = 1] \times \frac{P(CD = 1)}{P(D(1) > D(0))} - \mathbb{E}[Y(1) - Y(0)|D(1) < D(0)]\times \frac{P(D(1) < D(0))}{P(D(1) > D(0))} \\ &= \frac{P(CD = 1)}{P(D(1) > D(0))} \bigg[\mathbb{E}[Y(1) - Y(0)|CD = 1] - \mathbb{E}[Y(1) - Y(0)|D(1) < D(0)]\bigg] \\ &= 0 \end{align*}\]

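Finally, combining these results gives:

\[\begin{align*} \frac{\mathbb{E}[Y|Z = 1] - \mathbb{E}[Y|Z = 0]}{P(D(1) > D(0))} &= \mathbb{E}[Y(1) - Y(0)|CS = 1] \times P(CS = 1| D(1) > D(0)) \\ \frac{\mathbb{E}[Y|Z = 1] - \mathbb{E}[Y|Z = 0]}{\mathbb{E}[D|Z = 1] - \mathbb{E}[D|Z = 0]} &= \mathbb{E}[Y(1) - Y(0)|CS = 1] \times \frac{P(CS = 1)}{P(D(1) > D(0)) - P(D(1) < D(0))} \\ \text{Wald Estimand} &= \text{LATE of CS} \end{align*}\]

The last step uses the fact that the first stage equals $P(D(1) > D(0)) - P(D(1) < D(0))$, which is $P(CS = 1)$ because $P(D(1) > D(0)) = P(CS = 1) + P(CD = 1)$ and, by the same-proportion assumption, $P(CD = 1) = P(D(1) < D(0))$.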
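The conclusion can also be checked numerically. The sketch below builds a hypothetical population in which the compliers split into CS and CD, with CD matching the defiers in both share and average effect, and then forms the Wald ratio as the reduced form divided by the first stage $\mathbb{E}[D|Z = 1] - \mathbb{E}[D|Z = 0]$. All shares and potential-outcome means are illustrative assumptions, and the instrument is assumed independent of type.

```python
from fractions import Fraction as F

# Hypothetical population, for illustration only.
# name: (share, D(0), D(1), E[Y(0) | type], E[Y(1) | type])
TYPES = {
    "CS":           (F(4, 10), 0, 1, F(1), F(3)),  # complier-survival, effect 2
    "CD":           (F(1, 10), 0, 1, F(0), F(1)),  # complier-defier, effect 1
    "defier":       (F(1, 10), 1, 0, F(2), F(3)),  # same share and effect as CD
    "always_taker": (F(2, 10), 1, 1, F(0), F(3)),
    "never_taker":  (F(2, 10), 0, 0, F(4), F(4)),
}


def means_given_z(z):
    """Return (E[D | Z = z], E[Y | Z = z]) for an instrument independent of type."""
    mean_d, mean_y = F(0), F(0)
    for share, d0, d1, y0, y1 in TYPES.values():
        d = d1 if z == 1 else d0
        mean_d += share * d
        mean_y += share * (y1 if d == 1 else y0)
    return mean_d, mean_y


d_z1, y_z1 = means_given_z(1)
d_z0, y_z0 = means_given_z(0)

reduced_form = y_z1 - y_z0
first_stage = d_z1 - d_z0
wald = reduced_form / first_stage

# Average effects for CS alone and for all compliers (CS and CD pooled).
late_cs = TYPES["CS"][4] - TYPES["CS"][3]
late_all = sum(
    TYPES[k][0] * (TYPES[k][4] - TYPES[k][3]) for k in ("CS", "CD")
) / (TYPES["CS"][0] + TYPES["CD"][0])

print(wald, late_cs, late_all)
```

Under these assumed numbers the Wald ratio equals the CS effect of 2, while the effect pooled over all compliers is 9/5, so the ratio recovers the CS quantity rather than the full complier LATE.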