Wed 23 December 2020

When does the Binomial become approximately Normal

Written by Hongjinn Park in Articles

Why does a Binomial RV with $np(1-p) \ge 10$ become approximately Normal?


By the CLT, if $Y_1, ..., Y_n$ are iid Bernoulli with parameter $p$ and $q = 1 - p$, then

$$ \sqrt{n} \left( \frac{\bar{Y}_n - p}{\sqrt{pq}} \right) \xrightarrow[]{\text{in dist}} N(0,1) \qquad \text{as $n \rightarrow \infty$}$$

where $\bar{Y}_n$ is the sample mean, with $E[\bar{Y}_n] = \mu_{Y} = p$ and $Var(\bar{Y}_n) = \frac{\sigma_Y^2}{n} = \frac{pq}{n}$.

Focusing on the left-hand side of the CLT statement,

$$ \sqrt{n} \left( \frac{\bar{Y}_n - p}{\sqrt{pq}} \right) = \sqrt{n} \left( \frac{\frac{Y_1 + ... + Y_n}{n} - p}{\sqrt{pq}} \right) = \sqrt{n} \left( \frac{Y_1 + ... + Y_n - np}{n\sqrt{pq}} \right) = \frac{Y_1 + ... + Y_n - np}{\sqrt{npq}}$$

But $Y_1 + ... + Y_n \sim Bin(n,p)$ and so if we let $X = Y_1 + ... + Y_n$ then

$$\frac{X - np}{\sqrt{npq}} \xrightarrow[]{\text{in dist}} N(0,1) \qquad \text{as $n \rightarrow \infty$}$$
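The convergence above can be checked empirically. Here is a minimal simulation sketch using only the Python standard library; the function name `standardized_binomial_samples` and the choices $n = 200$, $p = 0.3$, and 5000 trials are illustrative assumptions, not part of the derivation. If the standardized values behave like $N(0,1)$, their sample mean should be near 0, their standard deviation near 1, and roughly 68% of them should fall within one unit of zero.

```python
import random
import statistics


def standardized_binomial_samples(n, p, trials, seed=0):
    """Draw `trials` values of (X - np) / sqrt(npq), where X is a sum of n Bernoulli(p) draws.

    The name and signature are hypothetical, chosen for this illustration only.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    q = 1 - p
    scale = (n * p * q) ** 0.5
    samples = []
    for _ in range(trials):
        # X = Y_1 + ... + Y_n with each Y_i ~ Bernoulli(p)
        x = sum(1 for _ in range(n) if rng.random() < p)
        samples.append((x - n * p) / scale)
    return samples


samples = standardized_binomial_samples(n=200, p=0.3, trials=5000)

# Should be close to 0 and 1 if the standardized sum is roughly N(0, 1).
print(statistics.mean(samples), statistics.pstdev(samples))

# Fraction within one standard deviation; about 0.68 for a standard Normal.
frac_within_one = sum(abs(z) <= 1 for z in samples) / len(samples)
print(frac_within_one)
```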

And so for large $n$, with $Z \sim N(0,1)$,

$$\frac{X - np}{\sqrt{npq}} \approx Z$$

$$X \approx Z\sqrt{npq}+np \sim N(np, npq)$$

So as $n$ gets big, the Binomial RV,

$$X \dot{\sim} N(np, npq)$$

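which shows the Normal approximation to the Binomial. The CLT is only a limit statement, so it does not say how large $n$ must be; the condition $np(1-p) \ge 10$ is a common rule of thumb for when the approximation is reasonable.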
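To see how the quality of the approximation depends on $np(1-p)$, one option is to compare the exact Binomial CDF with the $N(np, npq)$ CDF directly. The sketch below uses only the standard library. The helper names `normal_cdf` and `max_cdf_error` are made up for this example, and the $+0.5$ continuity correction is an addition not used in the derivation above; it is a common adjustment when a discrete distribution is approximated by a continuous one.

```python
from math import comb, erf, sqrt


def normal_cdf(x, mean, var):
    """CDF of N(mean, var) at x, written in terms of the error function."""
    return 0.5 * (1 + erf((x - mean) / sqrt(2 * var)))


def max_cdf_error(n, p):
    """Largest gap over k = 0..n between P(X <= k) and the Normal approximation.

    X ~ Bin(n, p) is approximated by N(np, npq), evaluated at k + 0.5
    (continuity correction: an assumption of this sketch, not from the text).
    """
    mean, var = n * p, n * p * (1 - p)
    running, worst = 0.0, 0.0
    for k in range(n + 1):
        # Accumulate the exact Binomial pmf to get the exact CDF at k.
        running += comb(n, k) * p**k * (1 - p) ** (n - k)
        worst = max(worst, abs(running - normal_cdf(k + 0.5, mean, var)))
    return worst


# Same p, increasing n: np(1-p) goes from 1.8 to about 10 to 45.
for n in (20, 112, 500):
    print(n, n * 0.1 * 0.9, max_cdf_error(n, 0.1))
```

In this setup the maximum error should shrink as $np(1-p)$ grows, which is the behavior the rule of thumb is meant to capture.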


