Consider a population set $X \equiv \{x_1, \dots, x_n\}$, where $n \geq 2$ and $x_i \in \mathbb{R}$ for all $i \in \{1, \dots, n\}$. Define $\mu = \frac{1}{n} \sum_{i=1}^n x_i$ as the arithmetic mean of the elements in the set.

If we randomly draw one element from $X$ with replacement $n$ times and construct a sample set $\hat{X}$ from these repeated draws, then the following properties emerge:

  1. (1) $\mu$ is the most probable outcome of the arithmetic mean, $\hat{\mu}$, of the sample set $\hat{X}$;
  2. (2) The probability of this most probable outcome of $\hat{\mu}$ tends to zero as $n$ grows;
  3. (3) The probability that any given element of $X$ is drawn at least once is approximately $0.632$ as $n$ grows;
  4. (4) If this sampling procedure is repeated independently $B \in \{2, 3, \dots\}$ times, then the expected number of samples in which the sample mean ($\hat{\mu}$) equals the population mean ($\mu$) is $B \frac{n!}{n^n}$.
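Before the proofs, a quick empirical sanity check may help. The sketch below is a minimal Monte Carlo illustration (the function name `prob_all_distinct` and all parameters are mine, not from the source): it estimates the chance that $n$ draws with replacement recover every element of $X$ exactly once, the event underlying properties (1) and (4), and compares it to $n!/n^n$.

```python
import math
import random

def prob_all_distinct(n, trials=100_000, seed=0):
    """Estimate the probability that n draws with replacement from a
    population of n distinct values recover every value exactly once."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        draws = [rng.randrange(n) for _ in range(n)]
        if len(set(draws)) == n:  # every element appeared exactly once
            hits += 1
    return hits / trials

n = 5
print(math.factorial(n) / n**n)   # exact probability n!/n^n
print(prob_all_distinct(n))       # Monte Carlo estimate
```

For $n = 5$ the exact value is $5!/5^5 = 0.0384$, and the simulated frequency should land close to it.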

Proof of Property (1)

To show this property, note that by the commutativity and associativity of addition in the real numbers, the sample arithmetic mean is the same regardless of the order in which the elements of the sample set $\hat{X}$ are drawn. Hence we may treat this as unordered sampling with replacement.

To construct a probability function for any outcome $\hat{\mu}$ of the sample, we follow the counting principle of combinatorics: since we have $n$ elements sampled $n$ times with replacement, there are $n^n$ ordered ways of constructing the sample set from $X$. This quantity gives the size of the sample space, $|\Omega| = n^n$.

Now consider the number of ordered draws that realize a given unordered sample. Let $k_i$ denote the number of times element $x_i$ appears in the sample, so that $k_1 + \dots + k_n = n$. The number of orderings of such a sample is the multinomial coefficient:

\[\frac{n!}{k_1! \dots k_n!}\]

with $n! = \prod_{i = 1}^{n} i$, because there are $n!$ ways of ordering the $n$ draws, while dividing by $k_1! \dots k_n!$ cancels the reorderings of repeated elements, which do not matter in this sampling.

Finally, write the event that the elements of $X$ appear in the sample set $\hat{X}$ with a given combination of counts $(k_1, \dots, k_n)$ as $\mathcal{A}$, where $\mathcal{A} \subseteq \Omega$; then we have the following mass function.

\[\begin{align*} \mathbb{P}(\hat{X} \in \mathcal{A}) &= \frac{n!}{k_1! \dots k_n!} \frac{1}{n^n} \\ &\leq \frac{n!}{n^n} \end{align*}\]

The last inequality holds since each $k_i! \geq 1$, with equality exactly when $k_1 = \dots = k_n = 1$.

This upper bound implies that selecting each element $x_i$ exactly once in the $n$ draws with replacement is the most likely outcome. In that case $\hat{X}$ contains every element of $X$, so $\hat{\mu} = \mu$, which proves the property.
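For small $n$ the claim can be verified exhaustively. The sketch below (an illustration of my own; the function name `count_vector_probabilities` is not from the source) enumerates all $n^n$ ordered samples, groups them by their count vector, and confirms that the all-ones count vector is the most probable with probability $n!/n^n$:

```python
import math
from collections import Counter
from itertools import product

def count_vector_probabilities(n):
    """Enumerate all n^n ordered samples of {0, ..., n-1} and accumulate
    the probability of each count vector (k_1, ..., k_n)."""
    probs = Counter()
    for draw in product(range(n), repeat=n):
        tally = Counter(draw)
        counts = tuple(tally.get(i, 0) for i in range(n))
        probs[counts] += 1 / n**n
    return probs

n = 4
probs = count_vector_probabilities(n)
best = max(probs, key=probs.get)
print(best)                              # (1, 1, 1, 1): each element once
print(probs[best], math.factorial(n) / n**n)
```

For $n = 4$ the maximizing count vector is $(1, 1, 1, 1)$ with probability $4!/4^4 = 24/256$.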

Proof of Property (2)

By Stirling's formula, the factorial can be approximated as follows:

\[\begin{align*} n! \approx \frac{\sqrt{2\pi}n^{n + 1/2}}{\exp(n)} \end{align*}\]

Consider the event $\mathcal{A}$ in which the sample set $\hat{X}$ recovers every unique value of $X$ exactly once, regardless of order. From the proof of property (1), this event has the following probability:

\[\begin{align*} \mathbb{P}(\hat{X} \in \mathcal{A}) &= \frac{n!}{n^n} \\ &\approx \frac{\sqrt{2\pi}n^{n+1/2}}{n^n \exp(n)} \end{align*}\]

Simplifying and applying L'Hôpital's rule, this probability vanishes as $n$ grows infinitely:

\[\begin{align*} \lim_{n \to \infty} \frac{\sqrt{2\pi n}}{\exp(n)} &= \lim_{n \to \infty} \frac{\sqrt{\pi}}{\sqrt{2n}\exp(n)} \\ &= 0 \end{align*}\]
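This vanishing rate is easy to see numerically. The short sketch below (my own illustration, not from the source) compares the exact probability $n!/n^n$ with its Stirling-based form $\sqrt{2\pi n}/\exp(n)$ for growing $n$:

```python
import math

# n!/n^n against the Stirling-based form sqrt(2*pi*n)/exp(n):
# both shrink toward zero as n grows
for n in (5, 10, 20, 40):
    exact = math.factorial(n) / n**n
    stirling = math.sqrt(2 * math.pi * n) / math.exp(n)
    print(n, exact, stirling)
```

Already at $n = 10$ the probability is below $4 \times 10^{-4}$, and the two columns agree to within about $1\%$.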

Proof of Property (3)

Given random sampling, the probability that $x_i$, for any $i \in \{1, \dots, n\}$, is not selected from the population set in a single draw is:

\[\begin{align*} \mathbb{P}(x_i \text{ not being selected}) &= 1 - \frac{1}{n} \end{align*}\]

Hence, in $n$ draws, the probability of $x_i$ not being selected is:

\[\begin{align*} \mathbb{P}(\text{$x_i$ not being selected $n$ times}) &= \bigg(1 - \frac{1}{n}\bigg)^n \end{align*}\]

Taking the log of the probability above and computing the limit, we have:

\[\begin{align*} \lim_{n \to \infty} \log\bigg\{\mathbb{P}(\text{$x_i$ not being selected $n$ times})\bigg\} &= \lim_{n \to \infty} \frac{\log(1 - \frac{1}{n})}{1/n} \\ &= \lim_{n \to \infty} -\frac{n}{n - 1} \ \ \text{by L'Hôpital's rule} \\ &= -1 \\ \lim_{n \to \infty} \mathbb{P}(\text{$x_i$ not being selected $n$ times}) &= \exp(-1) \end{align*}\]

which is about $0.368$.

Notice that $(1 - 1/n)^n$ increases toward $\exp(-1)$, so this limit is an upper bound on the probability of missing a given observation. The chance of observing $x_i$ at least once in the $n$ draws is therefore at least $1 - 0.368 = 0.632$, which is clearly larger than the chance of missing it.
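The convergence of the missing probability toward $\exp(-1) \approx 0.368$ (and hence of the inclusion probability toward $0.632$) can be traced numerically with a few lines (my own illustration, not from the source):

```python
import math

# (1 - 1/n)^n increases toward exp(-1) ~ 0.368, so the chance that a
# given element is seen at least once decreases toward 1 - exp(-1) ~ 0.632
for n in (10, 100, 1000, 10_000):
    miss = (1 - 1 / n) ** n
    print(n, round(miss, 4), round(1 - miss, 4))
print(round(math.exp(-1), 4), round(1 - math.exp(-1), 4))
```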


Proof of Property (4)

The proof of this property builds on property (2). Let $\hat{X}_1, \dots, \hat{X}_B$ be the sample sets from $B$ independently drawn samples, and let $\hat{X}_i = \mathcal{A}$, for $i \in \{1, \dots, B\}$, denote the event that every element of the population $X$ is successfully recovered, regardless of order, in the sample set $\hat{X}_i$.

Using the fact that the resamples are independent, the number of times the event $\mathcal{A}$ occurs in $B$ samples follows a binomial distribution:

\[\begin{align*} \mathbb{P}(\text{$\hat{X_i} = \mathcal{A}$ appears $k$ times}) &= \frac{B!}{k! (B - k)!} p^k q^{B - k} \end{align*}\]

where $\frac{B!}{k! (B - k)!}$ is the number of ways to choose which $k$ of the $B$ repeated samples yield the event $\mathcal{A}$, $p = \frac{n!}{n^n}$ is the probability that $\hat{X}_i = \mathcal{A}$, so that $p^k$ is the joint mass of $\mathcal{A}$ appearing $k \leq B$ times and $q^{B - k} = (1 - \frac{n!}{n^n})^{B - k}$ is the joint mass of $\mathcal{A}$ not appearing in the remaining $B - k$ samples. Hence the expected number of times $\hat{\mu}$ equals $\mu$ in $B$ resamples is $Bp = B\frac{n!}{n^n}$.

Immediately, we notice that the way to increase the expected number of events $\hat{\mu} = \mu$ when $n$ is large is to increase $B$.
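The binomial expectation $Bp$ can be checked by simulation. The sketch below (a minimal illustration of my own; the function name `mean_recoveries` and its parameters are not from the source) draws $B$ resamples repeatedly, counts how many recover the population exactly, and compares the average count to $B \frac{n!}{n^n}$:

```python
import math
import random

def mean_recoveries(n, B, trials=2000, seed=1):
    """Monte Carlo: average number of the B resamples (each n draws with
    replacement from n distinct values) that recover every value once."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        total += sum(
            len({rng.randrange(n) for _ in range(n)}) == n
            for _ in range(B)
        )
    return total / trials

n, B = 4, 50
print(mean_recoveries(n, B))            # simulated expectation
print(B * math.factorial(n) / n**n)     # theoretical B * n!/n^n = 4.6875
```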

This property also implies that as $B$ grows large, the distribution of the number of times $\hat{X}_i = \mathcal{A}$ occurs is approximately Gaussian, by the De Moivre–Laplace theorem, with the following density function:

\[\begin{align*} \mathbb{P}(\text{$\hat{X}_i = \mathcal{A}$ appears $k$ times}) &\approx \frac{1}{\sqrt{2 \pi B p q}} \exp\bigg(-\frac{(k - Bp)^2}{2Bpq}\bigg) \end{align*}\]

that is, the count of occurrences is approximately distributed as $\mathcal{N}(Bp, Bpq)$.
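The quality of this normal approximation can be inspected directly. The sketch below (my own illustration; the helper names `binom_pmf` and `normal_density` are not from the source) evaluates the exact binomial mass against the matching Gaussian density around the mean $Bp$:

```python
import math

def binom_pmf(B, k, p):
    """Exact binomial mass function."""
    return math.comb(B, k) * p**k * (1 - p) ** (B - k)

def normal_density(x, mean, var):
    """Gaussian density with the matching mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

n = 4
p = math.factorial(n) / n**n     # chance one resample recovers X exactly
B = 400
mean, var = B * p, B * p * (1 - p)
for k in range(28, 49, 4):
    print(k, round(binom_pmf(B, k, p), 5), round(normal_density(k, mean, var), 5))
```

With $n = 4$ and $B = 400$, the mean is $Bp = 37.5$ and the two columns agree closely across the plotted range, as the De Moivre–Laplace theorem predicts.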



Adapted from Problem 1.31 in Casella G, Berger RL. Statistical Inference. 2nd ed. Boca Raton (FL): Chapman & Hall/CRC; 2024. p. 34.