LATE Theorem

Previously, we learned that with observational data, where $Y$ is the outcome of interest and $D$ is a treatment indicator, the average effect of the treatment on the units that receive it, also called the ATT, cannot be recovered from a simple comparison of treated and control units unless the treatment assignment $D$ is statistically independent of the untreated potential outcome $Y(0)$.

Alternative approaches to recovering this estimand incorporate covariates $\boldsymbol{X}$ through the conditional ignorability and overlap assumptions.

In this note, we introduce a related causal estimand that is identified through a covariate $Z$, commonly called an instrument, with the following properties:

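Exclusion Restriction and Random Assignment: $Z$ is statistically independent of the potential outcomes $Y(d)$ for all $d \in \{0, 1\}$ and of the potential treatment statuses $D(z)$ for all $z$ in the domain of $Z$, where $Y(d)$ denotes the outcome a unit would realize under treatment status $d$ and $D(z)$ denotes the treatment status it would take under $Z = z$;
Relevance: $\mathbb{E}[D| Z = z] = P(D = 1| Z = z)$ is a non-constant function of $z$, i.e. $P(D = 1| Z = z) \neq P(D = 1| Z = w)$ for some $z \neq w$ in the domain of $Z$;
Monotonicity: for any $z$ and $w$ in the domain of $Z$, either $D(z) \geq D(w)$ for every unit or $D(z) \leq D(w)$ for every unit.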

Under these three properties, the observed conditional expectations of $Y$ and $D$ given $Z$ identify the Local Average Treatment Effect (LATE), defined for two instrument values $z$ and $w$ as:

\[\begin{align} \text{LATE} = \mathbb{E}[Y(1) - Y(0) | D(z) > D(w)] \end{align}\]

Moreover, when $Z$ is an indicator variable, LATE can be represented as the Wald estimator familiar from instrumental variable regression, as follows:

\[\begin{align} \text{LATE} = \frac{\mathbb{E}[Y| Z = 1] - \mathbb{E}[Y| Z =0]}{\mathbb{E}[D| Z = 1] - \mathbb{E}[D| Z = 0]} \end{align}\]
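The Wald ratio can be checked numerically. Below is a minimal simulation sketch, not a definitive implementation: the type shares (50% compliers, 30% never-takers, 20% always-takers), the treatment effects, and the function name `simulate_wald` are all hypothetical choices made for illustration. Defiers are excluded so that monotonicity holds, and $Z$ is drawn independently of everything else.

```python
import numpy as np

def simulate_wald(n=200_000, seed=0):
    """Compare the Wald ratio with the true complier effect (illustrative setup)."""
    rng = np.random.default_rng(seed)

    # Hypothetical latent types; no defiers, so D(1) >= D(0) for every unit.
    types = rng.choice(["complier", "never", "always"], size=n, p=[0.5, 0.3, 0.2])
    d0 = (types == "always").astype(int)   # D(0): treated even when Z = 0
    d1 = (types != "never").astype(int)    # D(1): treated when Z = 1

    # Potential outcomes with heterogeneous effects (assumed values).
    y0 = rng.normal(size=n) + 1.0 * d0     # always-takers have a higher baseline
    effect = np.where(types == "complier", 2.0,
                      np.where(types == "always", 5.0, -1.0))
    y1 = y0 + effect

    # Randomly assigned binary instrument, independent of the potential values.
    z = rng.integers(0, 2, size=n)

    # Observed treatment and outcome via the switching equations.
    d = z * d1 + (1 - z) * d0
    y = d * y1 + (1 - d) * y0

    reduced_form = y[z == 1].mean() - y[z == 0].mean()
    first_stage = d[z == 1].mean() - d[z == 0].mean()
    wald = reduced_form / first_stage

    # True LATE: average effect among units with D(1) > D(0).
    late = (y1 - y0)[d1 > d0].mean()
    return wald, late

wald, late = simulate_wald()
print(f"Wald ratio: {wald:.3f}, complier effect: {late:.3f}")
```

In this setup the Wald ratio should land close to the complier effect of 2.0 rather than the always-taker or never-taker effects, up to sampling noise.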

LATE is different from, but related to, ATT in the sense that both recover the average causal effect for a sub-population. For ATT, this sub-population is the units who receive the treatment, whereas for LATE, it is the units whose treatment status is affected by the covariate $Z$.

Proof:

To prove that LATE is identified under the three properties above, we follow the approach in Imbens and Angrist (1994), starting from the numerator in equality (2). To simplify, we further assume that $Z$ is a binary variable.

We also use the potential outcome framework, in which we observe only one realization of the outcome, determined by the treatment status, as follows:

\[Y = DY(1) + Y(0)(1 - D)\]

Similarly, because the treatment status depends on $Z$, only one of the potential treatment statuses is observed:

\[D= ZD(1) + D(0)(1 - Z)\]
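To make the two switching equations concrete, here is a small sketch with four hypothetical units (a complier, an always-taker, a never-taker, and another complier). All numbers are made up for illustration only.

```python
import numpy as np

# Potential treatment statuses for four hypothetical units.
d0 = np.array([0, 1, 0, 0])           # D(0): status if Z = 0
d1 = np.array([1, 1, 0, 1])           # D(1): status if Z = 1

# Potential outcomes (made-up values).
y0 = np.array([1.0, 2.0, 3.0, 4.0])   # Y(0)
y1 = np.array([3.0, 2.5, 3.0, 7.0])   # Y(1)

# Realized instrument values.
z = np.array([1, 0, 1, 0])

# Observed treatment: D = Z D(1) + (1 - Z) D(0)
d = z * d1 + (1 - z) * d0

# Observed outcome: Y = D Y(1) + (1 - D) Y(0)
y = d * y1 + (1 - d) * y0

print(d)  # [1 1 0 0]
print(y)  # [3.  2.5 3.  4. ]
```

Only one entry of each pair $(D(0), D(1))$ and $(Y(0), Y(1))$ ever shows up in the observed arrays `d` and `y`.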

Applying these two definitions together with the independence of $Z$ from the potential outcomes and potential treatment statuses, the numerator in equality (2) becomes, after some rearrangement:

\[\begin{align*} \mathbb{E}[Y| Z = 1] - \mathbb{E}[Y| Z =0] &= \mathbb{E}\bigg[\bigg(D(1) - D(0)\bigg)\bigg(Y(1) - Y(0)\bigg)\bigg] \end{align*}\]
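The rearrangement behind this equality can be sketched as follows, assuming $Z$ is jointly independent of $(Y(0), Y(1), D(0), D(1))$. Writing the observed outcome as $Y = Y(0) + D(Y(1) - Y(0))$ and substituting $D = D(1)$ when $Z = 1$ and $D = D(0)$ when $Z = 0$ gives:

\[\begin{align*} \mathbb{E}[Y| Z = 1] &= \mathbb{E}[Y(0) + D(1)(Y(1) - Y(0)) | Z = 1] = \mathbb{E}[Y(0) + D(1)(Y(1) - Y(0))] \\ \mathbb{E}[Y| Z = 0] &= \mathbb{E}[Y(0) + D(0)(Y(1) - Y(0)) | Z = 0] = \mathbb{E}[Y(0) + D(0)(Y(1) - Y(0))] \end{align*}\]

In each line, the conditioning on $Z$ is dropped by independence. Subtracting the second line from the first cancels $\mathbb{E}[Y(0)]$ and leaves $\mathbb{E}[(D(1) - D(0))(Y(1) - Y(0))]$.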

As usual, we can decompose the expectation above through the Law of Total Expectation. In this case, there are three events to condition on: $D(1) > D(0)$, $D(1) = D(0)$, and $D(1) < D(0)$.

However, the monotonicity condition (taking $D(1) \geq D(0)$ without loss of generality) implies that $P(D(1) < D(0)) = 0$, which means that no unit takes the treatment when $Z = 0$ but forgoes it when $Z = 1$. Further, $D(1) = D(0)$ implies that the quantity inside the expectation is zero.

Taking all of this into account, and noting that $D(1) - D(0) = 1$ whenever $D(1) > D(0)$, we have the following equality:

\[\begin{align*} \mathbb{E}\bigg[\bigg(D(1) - D(0)\bigg)\bigg(Y(1) - Y(0)\bigg)\bigg] &= \mathbb{E}\bigg[Y(1) - Y(0) | D(1) > D(0) \bigg] P(D(1) > D(0)) \end{align*}\]

Because $D(1) - D(0)$ only takes values in $\{-1, 0, 1\}$, the probability in the second factor of the equality above can be related to an expectation as follows:

\[\begin{align*} \mathbb{E}[D(1) - D(0)] &= P(D(1) - D(0) = 1) -P(D(1) - D(0) = -1) \end{align*}\]

Under monotonicity, the second term vanishes. Applying the linearity of expectation, we then have,

\[\begin{align*} P(D(1) - D(0) = 1) &= P(D(1) > D(0)) \\ &= \mathbb{E}[D(1) - D(0)] \\ &= \mathbb{E}[D(1)] - \mathbb{E}[D(0)] \\ &= \mathbb{E}[D|Z = 1] - \mathbb{E}[D| Z = 0] \end{align*}\]

The last equality follows from the definition of the potential treatment status above, together with the independence of $Z$ from $D(0)$ and $D(1)$.
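The chain of equalities for $P(D(1) > D(0))$ can also be checked by simulation. The sketch below is illustrative only: the shares (20% always-takers, 40% compliers, 40% never-takers) and the function name `first_stage_vs_complier_share` are assumptions, and defiers are ruled out by construction.

```python
import numpy as np

def first_stage_vs_complier_share(n=200_000, seed=1):
    """Compare E[D|Z=1] - E[D|Z=0] with P(D(1) > D(0)) in simulated data."""
    rng = np.random.default_rng(seed)

    # Hypothetical shares: 20% always-takers; half of the rest are compliers.
    d0 = (rng.random(n) < 0.2).astype(int)
    complier = ((d0 == 0) & (rng.random(n) < 0.5)).astype(int)
    d1 = d0 + complier                     # D(1) >= D(0): no defiers

    # Instrument assigned independently of (D(0), D(1)).
    z = rng.integers(0, 2, size=n)
    d = z * d1 + (1 - z) * d0              # observed treatment status

    first_stage = d[z == 1].mean() - d[z == 0].mean()
    complier_share = (d1 > d0).mean()      # not observable in real data
    return first_stage, complier_share

first_stage, complier_share = first_stage_vs_complier_share()
print(f"first stage: {first_stage:.3f}, complier share: {complier_share:.3f}")
```

The complier share uses both potential treatment statuses, which are never jointly observed in practice. The first stage uses only $D$ and $Z$, which is why the identity is useful.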

Hence, dividing the numerator derived above by this probability, we show that the ratio in equality (2) equals the LATE in equality (1) with $z = 1$ and $w = 0$:

\[\begin{align*} \frac{\mathbb{E}[Y| Z = 1] - \mathbb{E}[Y| Z =0]}{\mathbb{E}[D| Z = 1] - \mathbb{E}[D| Z = 0]} &=\mathbb{E}[Y(1) - Y(0) | D(1) > D(0)] \\ &= \text{LATE} \end{align*}\]