Consider a random sample $(y_i, d_i)$, $i = 1, 2, \dots, n$, where $y_i \in \mathbb{R}$ is an outcome of interest and $d_i \in \{0, 1\}$ is a binary treatment indicator that is not randomly assigned.

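Each unit has two potential outcomes: $y(1)$ if treated and $y(0)$ if untreated. Only one of them is observed, $y = d\,y(1) + (1 - d)\,y(0)$. Let $\bar{y}(1)$ denote the sample mean of the observed outcome among units receiving the treatment, $d = 1$, and $\bar{y}(0)$ the sample mean among those who don’t, $d = 0$. The counterfactual outcomes, $y(0)$ for treated units and $y(1)$ for untreated units, are unobserved.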

We are interested in the average effect of the treatment on the units that actually receive it, $d = 1$, known as the average treatment effect on the treated (ATT):

\[\begin{align*} \tau_{ATT} = \mathbb{E}[y(1) - y(0)| d = 1] \end{align*}\]

where $\mathbb{E}$ is the expectation operator.

The ATT is thus the difference between the average observed outcome of the units that receive the treatment and the average outcome those same units would have had without it.

In this setting, the difference of sample conditional means, $\bar{d} = \bar{y}(1)-\bar{y}(0)$, estimates $\tau_{ATT}$ plus a bias term.

To show this, take the expectation of $\bar{d}$, then expand and rearrange as follows:

\[\begin{align*} \mathbb{E}[\bar{y}(1) - \bar{y}(0)] &= \mathbb{E}[y| d = 1] - \mathbb{E}[y | d = 0] \ \ \text{by unbiasedness of the sample means} \\ &= \mathbb{E}[y(1)| d = 1] - \mathbb{E}[y(0) | d = 0] \ \ \text{by definition of the observed outcome} \\ &= \mathbb{E}[y(1) - y(0)|d = 1] + \{\mathbb{E}[y(0)| d = 1] - \mathbb{E}[y(0) | d = 0]\} \ \ \text{adding and subtracting } \mathbb{E}[y(0)| d = 1] \\ &= \tau_{ATT} + \text{Bias} \end{align*}\]

This means that $\bar{d}$ is unbiased for $\tau_{ATT}$ if and only if the bias term is zero, that is, $\mathbb{E}[y(0)| d = 1] = \mathbb{E}[y(0) | d = 0]$. Random assignment of $d$ is sufficient for this condition to hold.
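A small simulation can make the decomposition concrete. The sketch below is illustrative only and does not come from Wooldridge: the data-generating process, the constant treatment effect of 2, the selection rule, and the helper name `decompose` are all assumptions chosen for the example. Because the data are simulated, both potential outcomes are available, so the ATT and the bias term can be computed directly and compared with the naive difference in means.

```python
import numpy as np


def decompose(y0, y1, d):
    """Split the difference in observed group means into ATT plus bias.

    Hypothetical helper: it needs both potential outcomes, which are
    only available in a simulation, never in real data.
    """
    treated = d == 1
    y = np.where(treated, y1, y0)  # observed outcome y = d*y(1) + (1-d)*y(0)
    diff_means = y[treated].mean() - y[~treated].mean()
    att = (y1[treated] - y0[treated]).mean()
    bias = y0[treated].mean() - y0[~treated].mean()
    return diff_means, att, bias


rng = np.random.default_rng(0)
n = 100_000
y0 = rng.normal(size=n)  # untreated potential outcome
y1 = y0 + 2.0            # treated potential outcome (assumed constant effect of 2)

# Non-random assignment: units with a higher y(0) are more likely to be treated.
d_selected = (y0 + rng.normal(size=n) > 0).astype(int)
# Random assignment: treatment is independent of the potential outcomes.
d_random = rng.integers(0, 2, size=n)

for label, d in [("selected", d_selected), ("random", d_random)]:
    diff_means, att, bias = decompose(y0, y1, d)
    print(f"{label:>8}: diff = {diff_means:.3f}, ATT = {att:.3f}, bias = {bias:.3f}")
```

Under the selection rule assumed here, the difference in means should overstate the ATT by the bias term, while under random assignment the bias should be close to zero up to sampling noise. In both cases the identity $\bar{d} = \text{ATT} + \text{Bias}$ holds exactly in the sample.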


Adapted from Problem 21.1 in Wooldridge, J. M. (2010). Econometric analysis of cross section and panel data (2nd ed.). Cambridge, MA: MIT Press, p. 975