The Sample Mean and Sample Variance
The Sample Mean and Sample Variance come up everywhere in real life. Let's examine some of their properties.
Definitions
A sample (or random sample) is a collection of iid RVs $X_1, ..., X_n$ from a common distribution $F$
A statistic is a RV whose value is determined by the sample data. For example, the Sample Mean $\overline{X}_n$, the Sample Variance $S^2$, the jth order statistic $X_{(j)}$, etc.
$$ \text{The Sample Mean is:} \quad \overline{X}_n = \frac{X_1 + ... + X_n}{n} = \frac{1}{n} \sum_{i=1}^n X_i $$ $$ \text{The Sample Variance is:} \quad S^2 = \frac{\sum_{i=1}^n (X_i - \overline{X}_n)^2}{n-1} = \frac{1}{n-1} \sum_{i=1}^n (X_i - \overline{X}_n)^2 = \frac{\sum_{i=1}^n \left(X_i - \frac{X_1 + ... + X_n}{n} \right)^2}{n-1} $$
Note that $\overline{X}_n$ and $S^2$ do NOT include the true population parameters $\mu$ and $\sigma^2$, only the observations $X_1, ..., X_n$
My notation: if I write $S^2$ then the denominator is $n-1$, otherwise I will use $S_n^2$ for a denominator of $n$
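To make the definitions concrete, here is a minimal NumPy sketch (my own illustration, with made-up numbers) that computes $\overline{X}_n$, $S^2$, and $S_n^2$ for one sample:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(loc=5.0, scale=2.0, size=10)    # a made-up sample X_1, ..., X_n

n = len(x)
xbar = x.mean()                                 # sample mean
s2 = ((x - xbar) ** 2).sum() / (n - 1)          # S^2, denominator n-1
s2_n = ((x - xbar) ** 2).sum() / n              # S_n^2, denominator n

# NumPy's ddof argument sets the denominator: ddof=1 -> n-1, ddof=0 -> n
assert np.isclose(s2, x.var(ddof=1))
assert np.isclose(s2_n, x.var(ddof=0))
print(xbar, s2, s2_n)
```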
Properties if $X_1, ..., X_n$ iid from ANY distribution
$E[ \overline{X}_n ] = E \left[ \frac{X_1 + ... + X_n}{n} \right] = \frac{1}{n} (E[X_1] + ... + E[X_n]) = \frac{1}{n} (nE[X_1]) = E[X_1] = \mu$
$Var( \overline{X}_n ) = Var \left[ \frac{X_1 + ... + X_n}{n} \right] = \frac{1}{n^2} (Var[X_1] + ... + Var[X_n]) = \frac{1}{n^2} (n Var[X_1]) = \frac{Var(X)}{n} = \frac{\sigma^2}{n}$
$E[S_{n}^2] = \left( 1 - \frac{1}{n} \right) \sigma^2$, so the "uncorrected" sample variance $S_n^2$ (denominator $n$) is a biased estimator of $\sigma^2$
$E[S^2] = \sigma^2$, so $S^2$ (denominator $n-1$) is an unbiased estimator of $\sigma^2$
$Cov(\overline{X}_n, X_i - \overline{X}_n) = Cov(\overline{X}_n, X_i) - Cov(\overline{X}_n, \overline{X}_n) = Cov \left(\frac{X_1 + ... + X_n}{n}, X_i \right) - Var(\overline{X}_n) = \frac{\sigma^2}{n} - \frac{\sigma^2}{n} = 0 \quad \forall i \in [1,n]$
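These properties hold regardless of the underlying distribution, so a quick Monte Carlo sketch (my own, using a skewed Exponential(1) sample where $\mu = \sigma^2 = 1$) should roughly reproduce them:

```python
import numpy as np

rng = np.random.default_rng(1)
n, reps = 10, 200_000
# each row is one sample X_1, ..., X_n from Exponential(1): mu = 1, sigma^2 = 1
samples = rng.exponential(scale=1.0, size=(reps, n))

xbar = samples.mean(axis=1)
s2 = samples.var(axis=1, ddof=1)     # denominator n-1
s2_n = samples.var(axis=1, ddof=0)   # denominator n

print(xbar.mean())   # ~ E[Xbar]   = mu              = 1
print(xbar.var())    # ~ Var(Xbar) = sigma^2 / n     = 0.1
print(s2.mean())     # ~ E[S^2]    = sigma^2         = 1    (unbiased)
print(s2_n.mean())   # ~ E[S_n^2]  = (1-1/n) sigma^2 = 0.9  (biased)
```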
Properties if $X_1, ..., X_n$ iid $Normal(\mu, \sigma^2)$ only
Special case: if the $X_i$'s are iid $N(\mu, \sigma^2)$ then $\overline{X}_n$ and $S^2$ are independent! This is because:
a) The vector $\mathbf{X} = (\overline{X}_n, X_1 - \overline{X}_n, ... , X_n - \overline{X}_n)$ is MVN since any linear combination of its components is a linear combination of the independent Normal $X_i$'s and is therefore Normal
b) For iid RVs from any distribution, $Cov(\overline{X}_n, X_i - \overline{X}_n) = 0 $ for all $i$ and if a random vector has a MVN distribution then uncorrelated components are independent
c) Since $\overline{X}_n$ is independent of the entire vector of deviations $(X_1 - \overline{X}_n, ... , X_n - \overline{X}_n)$, it is independent of any function of that vector, including $S^2 = \frac{1}{n-1} \sum_{i=1}^n (X_i - \overline{X}_n)^2 $
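A rough simulation sketch of this claim (my own illustration): for Normal samples the empirical correlation between $\overline{X}_n$ and $S^2$ should be near zero, while for a skewed distribution such as the Exponential it is typically clearly nonzero, since independence generally fails outside the Normal case. (Zero correlation alone would not prove independence, but it is consistent with it.)

```python
import numpy as np

rng = np.random.default_rng(2)
n, reps = 5, 200_000

def corr_xbar_s2(samples):
    """Empirical correlation between the sample mean and S^2 across many samples."""
    xbar = samples.mean(axis=1)
    s2 = samples.var(axis=1, ddof=1)
    return np.corrcoef(xbar, s2)[0, 1]

normal_samples = rng.normal(loc=0.0, scale=1.0, size=(reps, n))
expo_samples = rng.exponential(scale=1.0, size=(reps, n))

print(corr_xbar_s2(normal_samples))  # ~ 0, consistent with independence
print(corr_xbar_s2(expo_samples))    # clearly positive: independence fails for skewed data
```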
If the $X_i$'s are iid $N(\mu, \sigma^2)$ then $\overline{X}_n \sim N \left( \mu, \frac{\sigma^2}{n} \right) $
If the $X_i$'s are iid $N(\mu, \sigma^2)$ then $\frac{\overline{X}_n - \mu_{\overline{X}_n}}{\sigma_{\overline{X}_n}} = \sqrt{n} \left( \frac{\overline{X}_n - \mu}{\sigma} \right) \sim N(0,1)$
If the $X_i$'s are iid $N(\mu, \sigma^2)$ then $(n-1) \frac{S^2}{\sigma^2} \sim \chi_{n-1}^2$
If the $X_i$'s are iid $N(\mu, \sigma^2)$ then $\sqrt{n} \left( \frac{\overline{X}_n - \mu}{S} \right) \sim t_{n-1}$
Off topic: by the CLT if the $X_i$'s are iid from ANY distribution then $\sqrt{n} \left( \frac{\overline{X}_n - \mu}{\sigma} \right) \rightarrow N(0,1)$ in distribution as $n \rightarrow \infty$
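To illustrate the CLT remark, here is a small sketch (my own, with Exponential(1) data, so $\mu = \sigma = 1$) comparing a few empirical quantiles of $\sqrt{n} \left( \frac{\overline{X}_n - \mu}{\sigma} \right)$ against standard Normal quantiles as $n$ grows:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
reps = 100_000
probs = [0.05, 0.25, 0.5, 0.75, 0.95]

# Exponential(1) data: mu = 1, sigma = 1, noticeably skewed for small n
for n in (2, 10, 100):
    samples = rng.exponential(scale=1.0, size=(reps, n))
    z = np.sqrt(n) * (samples.mean(axis=1) - 1.0) / 1.0
    print(n, np.round(np.quantile(z, probs), 2))

# standard Normal quantiles for comparison
print("N(0,1)", np.round(stats.norm.ppf(probs), 2))
```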
Proof that if $X_i$'s are iid $N(\mu, \sigma^2)$ then $ (n-1) \frac{S^2}{\sigma^2} \sim \chi_{n-1}^2 $
\begin{align*} S^2 & = \frac{ \sum_{i=1}^n (X_i - \overline{X}_n)^2 }{n-1} \\ (n-1) S^2 & = \sum_{i=1}^n (X_i - \overline{X}_n)^2 \\ & = \sum_{i=1}^n (X_i - \mu + \mu - \overline{X}_n)^2 \\ & = \left[ \sum_{i=1}^n (X_i - \mu)^2 \right] - n(\overline{X}_n - \mu)^2 \quad \quad \text{details in Appendix 1} \end{align*}
Next divide both sides by $\sigma^2$
$$ (n-1) \frac{S^2}{\sigma^2} = \frac{\sum_{i=1}^n (X_i - \mu)^2}{\sigma^2} - \frac{n(\overline{X}_n - \mu)^2}{\sigma^2} $$ $$ (n-1) \frac{S^2}{\sigma^2} = \sum_{i=1}^n \left( \frac{X_i - \mu}{\sigma} \right)^2 - n \left( \frac{\overline{X}_n - \mu}{\sigma} \right)^2$$ $$ (n-1) \frac{S^2}{\sigma^2} + \left( \sqrt{n} \left( \frac{\overline{X}_n - \mu}{\sigma} \right) \right)^2 = \sum_{i=1}^n \left( \frac{X_i - \mu}{\sigma} \right)^2$$ $$ (n-1) \frac{S^2}{\sigma^2} + Z^2 = Z_1^2 + ... + Z_{n-1}^2 + Z^2$$
Note that the two terms on the LHS are independent! The RHS is the sum of $n$ independent squared standard Normals, i.e. a $\chi_n^2$, and the second term on the LHS is the square of a standard Normal, i.e. a $\chi_1^2$. At this point it seems the first term on the LHS should be a $\chi_{n-1}^2$, and this turns out to be the case. It is proven by showing the MGF of the LHS is equal to the MGF of the RHS.
Recall that:
a) the MGF of the sum of two independent RVs is the product of their individual MGFs
b) the MGF of a $\chi_{n}^2$ RV is $(1-2t)^{-n/2}$
Taking MGFs of both sides of the identity above and using (a), (b), and the independence of the two LHS terms:
$$E[e^{t(n-1)S^2/\sigma^2}] (1-2t)^{-1/2} = (1-2t)^{-n/2}$$ $$E[e^{t(n-1)S^2/\sigma^2}] = (1-2t)^{-(n-1)/2}$$
and since an MGF uniquely determines the distribution it follows that $(n-1)S^2/\sigma^2 \sim \chi_{n-1}^2$
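A quick numerical sanity check of the result just proved (my own sketch, with arbitrary choices of $n$, $\mu$, $\sigma$): the empirical quantiles of $(n-1)S^2/\sigma^2$ from simulated Normal samples should line up with the $\chi_{n-1}^2$ quantiles.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
n, reps, sigma = 8, 100_000, 3.0
samples = rng.normal(loc=10.0, scale=sigma, size=(reps, n))

stat = (n - 1) * samples.var(axis=1, ddof=1) / sigma**2
probs = [0.1, 0.5, 0.9]

print(np.round(np.quantile(stat, probs), 2))          # empirical quantiles
print(np.round(stats.chi2.ppf(probs, df=n - 1), 2))   # chi-square(n-1) quantiles, should be close
```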
Claim
If $X_i$'s are iid $N(\mu, \sigma^2)$ then $ \sqrt{n} \left( \frac{\overline{X}_n - \mu}{S} \right) \sim t_{n-1} $
Proof
First recall that if $Z, Z_1, Z_2, ... , Z_n$ are iid $N(0,1)$ then
$$T = \frac{Z}{\sqrt{\frac{Z_1^2 + ... + Z_n^2}{n}}} \sim t_n $$
For our proof, since the $X_i$'s are iid $N(\mu, \sigma^2)$ we have
a) $\sqrt{n} \left( \frac{\overline{X}_n - \mu}{\sigma} \right) \sim N(0,1)$
b) $(n-1) \frac{S^2}{\sigma^2} \sim \chi_{n-1}^2$
c) $\overline{X}_n$ is independent of $S^2$, which means any function of $\overline{X}_n$ is independent of any function of $S^2$, and therefore (a) is independent of (b), so
$$ \frac{Z}{\sqrt{\frac{\chi_{n-1}^2}{n-1}}} = \frac{ \sqrt{n} \left( \frac{\overline{X}_n - \mu}{\sigma} \right) }{ \sqrt{ \frac{(n-1) \frac{S^2}{\sigma^2}}{n-1}}} = \frac{ \frac{ \sqrt{n}( \overline{X}_n - \mu )}{\sigma} }{ \frac{S}{\sigma}} = \sqrt{n} \left( \frac{\overline{X}_n - \mu}{S} \right) \sim t_{n-1}$$
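A similar sanity-check sketch (again my own, with arbitrary parameter choices): the empirical quantiles of $\sqrt{n} \left( \frac{\overline{X}_n - \mu}{S} \right)$ from simulated Normal samples should line up with the $t_{n-1}$ quantiles.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
n, reps, mu, sigma = 6, 100_000, 2.0, 4.0
samples = rng.normal(loc=mu, scale=sigma, size=(reps, n))

# the t statistic: sqrt(n) (Xbar - mu) / S, with S the n-1 denominator version
t_stat = np.sqrt(n) * (samples.mean(axis=1) - mu) / samples.std(axis=1, ddof=1)
probs = [0.05, 0.5, 0.95]

print(np.round(np.quantile(t_stat, probs), 2))     # empirical quantiles
print(np.round(stats.t.ppf(probs, df=n - 1), 2))   # t(n-1) quantiles, should be close
```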