Fri 27 September 2019

Derivation of the Normal Distribution

Written by Hongjinn Park in Articles

Where does the bell curve (pdf) of the Normal distribution come from? When I first saw this function I was like, how in the world did they come up with that?

$$f(x) = \frac{1}{\sqrt{2 \pi}\sigma} e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2}$$

First, completely forget about the function above. Let's just think about throwing darts at the origin (Cartesian coordinates). In this thought experiment we're not professional dart throwers. So we don't always get a bullseye and there's some randomness to where the darts land. Based on two assumptions about the randomness, the distribution of darts must be Normal.


Assumptions

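1. The closer a point is to the origin, the higher the probability density there

2. Left-right accuracy or inaccuracy doesn't impact up-down accuracy or inaccuracy at all, and misses are equally likely in every direction. In other words, $x$ and $y$ are independent and the density is "rotationally invariant"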

Derivation

Therefore we're looking for a joint pdf $\varphi : \mathbb{R}^2 \rightarrow [0,\infty)$

Here we start with $\varphi$ in polar coordinates. Note that $\varphi(r)$ is a probability density, not a probability, so it must be non-negative but doesn't have to stay below 1.

Note that $\iint_{\mathbb{R}^2} \varphi (r)\,dA = 1$ and also note that it's $\varphi(r)$ and not $\varphi(r,\theta)$ due to the rotational invariance in assumption 2.

Now from the independence in assumption 2 we know that the joint pdf factors into the product of the two marginal pdfs:

$$\varphi(r) = f_X(x) f_Y(y)$$ $$= f(x)f(y)$$

where the last equality comes from the fact that $f_X = f_Y$ (by rotational symmetry the $x$ and $y$ directions are interchangeable), so we can just call it $f$.

Transform from polar coordinates to Cartesian using $r = \sqrt{x^2+y^2}$, which means that for all $x$ and all $y$

$$\varphi(\sqrt{x^2+y^2}) = f(x)f(y)$$
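To get a feel for how restrictive this equation is, here is a small sketch that compares two candidate marginals at two points the same distance from the origin. The Laplace-style density and the function names are my own illustrative assumptions, not part of the derivation. If the product $f(x)f(y)$ only depends on $\sqrt{x^2+y^2}$, the two points should give the same value.

```python
import math

def joint(f, x, y):
    """Joint density of independent coordinates that share the marginal f."""
    return f(x) * f(y)

def laplace(x):
    """Illustrative non-Gaussian density exp(-|x|) / 2 (an assumption for this demo)."""
    return 0.5 * math.exp(-abs(x))

def gaussian_shape(x):
    """Unnormalized exp(-x^2); the missing constant does not affect the comparison."""
    return math.exp(-x * x)

# Two points at the same distance r = 1 from the origin.
on_axis = (1.0, 0.0)
on_diagonal = (math.sqrt(0.5), math.sqrt(0.5))

laplace_gap = abs(joint(laplace, *on_axis) - joint(laplace, *on_diagonal))
gaussian_gap = abs(joint(gaussian_shape, *on_axis) - joint(gaussian_shape, *on_diagonal))

print("Laplace gap: ", laplace_gap)   # clearly non-zero: not rotationally invariant
print("Gaussian gap:", gaussian_gap)  # zero up to floating point rounding
```

So independence alone is not enough: most marginals give a joint density that changes as you rotate around the origin.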

Set $y=0$ and get (taking $x \geq 0$ so that $\sqrt{x^2} = x$):

$$\varphi(x) = f(x)f(0)$$

Note that $f(0)$ is just a constant, so let $f(0) = \lambda$.

$$\varphi(x)=\lambda f(x)$$

Therefore, substituting $\varphi = \lambda f$ into $\varphi(\sqrt{x^2+y^2}) = f(x)f(y)$:

$$\lambda f(\sqrt{x^2+y^2}) = f(x)f(y)$$

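Multiply both sides by $\frac{1}{\lambda^2}$, which requires $\lambda \neq 0$. That's fine, because $\lambda = f(0)$ is the density at the bullseye, and by assumption 1 that's where the density is highest.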

$$\frac{f(\sqrt{x^2+y^2})}{\lambda} = \frac{f(x)}{\lambda} \frac{f(y)}{\lambda}$$

Let $g(x) = \frac{f(x)}{\lambda}$ which means that:

$$g(\sqrt{x^2+y^2}) = g(x)g(y)$$

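Now from examining $g$ we decide that $g$ must be an exponential in $x^2$. To see why, let $G(t) = g(\sqrt{t})$ for $t \geq 0$. Then the equation above says $G(x^2+y^2) = G(x^2)G(y^2)$, so $G$ turns sums into products. The continuous functions that do this are $G(t) = C^t$, and since $C^t = e^{t\ln(C)}$ we get $g(x) = G(x^2) = e^{Ax^2}$ with $A = \ln(C)$.

As a check, if $g$ is in the form $g(x) = e^{Ax^2}$ then plugging into $g(\sqrt{x^2+y^2}) = g(x)g(y)$ works, because: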

$$e^{Ax^2}e^{Ay^2} = e^{A(x^2+y^2)}$$
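If you'd rather see this with numbers, here is a minimal sketch that plugs a few points into $g(\sqrt{x^2+y^2}) = g(x)g(y)$. The value $A = -0.7$ and the test points are arbitrary choices of mine, only there to have something concrete to evaluate.

```python
import math

def g(x, A=-0.7):
    """Candidate solution g(x) = exp(A * x^2); A = -0.7 is an arbitrary test value."""
    return math.exp(A * x * x)

def functional_equation_gap(x, y, A=-0.7):
    """Return |g(sqrt(x^2 + y^2)) - g(x) * g(y)|, which should be about zero."""
    lhs = g(math.hypot(x, y), A)
    rhs = g(x, A) * g(y, A)
    return abs(lhs - rhs)

# A few arbitrary test points, including negative coordinates.
points = [(0.0, 0.0), (1.0, 2.0), (-0.5, 3.0), (2.5, -1.5)]
max_gap = max(functional_equation_gap(x, y) for x, y in points)

print("largest gap:", max_gap)  # only floating point noise is expected
```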

And since $g(x) = \frac{f(x)}{\lambda}$ we have that

$$ f(x) = \lambda e^{Ax^2} $$

Note that $A$ has to be negative. Otherwise your pdf grows without bound as $|x|$ grows, which would mean that big misses are more likely than small ones (contradicting assumption 1) and that the area under the curve is infinite. By making $A$ negative you get the classic bell shaped curve. Therefore just define $A=-h^2$ with $h > 0$ to guarantee a negative value

$$ f(x) = \lambda e^{-h^2x^2} $$

By the axioms of probability, the total area under the pdf must be one:

$$\int_{-\infty}^{\infty} \lambda e^{-h^2x^2} dx = 1 $$

Now there is a famous integral (the Gaussian integral) which says that $\int_{-\infty}^{\infty} e^{-x^2} dx = \sqrt{\pi} $ and so we need to get into the right form.

Let $u=hx$ and $du = h dx$

$$\frac{\lambda}{h} \int_{-\infty}^{\infty} e^{-u^2} du = 1$$

Since the integral equals $\sqrt{\pi}$, this means that $ \lambda = \frac{h}{\sqrt{\pi}}$, so $h^2 = \lambda^2 \pi$ and therefore

$$f(x) = \lambda e^{-\lambda^2 \pi x^2}$$

Interesting place to be. Let's mess with various values of $\lambda$. If $\lambda = 1 \Rightarrow f(x) = e^{-\pi x^2}$ and I did check that the area under this case is one. If you graph the $\lambda = 1$ case versus $\lambda = 5$ you get the same bell shaped curve but pulled upward (taller and skinnier). So $\lambda$ controls the spread of our data, moving in the opposite direction to the standard deviation $\sigma$: as $\lambda \nearrow$, $\sigma \searrow$, and as $\lambda \searrow$, $\sigma \nearrow$. The next objective is to get $\lambda$ in terms of $\sigma$.
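Here is a rough sketch of that area check, using a plain midpoint rule from the standard library. The integration window and step count are assumptions I picked so that the tails are negligible for these particular values of $\lambda$.

```python
import math

def f(x, lam):
    """Density from the derivation: f(x) = lam * exp(-lam^2 * pi * x^2)."""
    return lam * math.exp(-(lam ** 2) * math.pi * x * x)

def area(lam, half_width=10.0, steps=20_000):
    """Midpoint-rule approximation of the integral of f over [-half_width, half_width]."""
    dx = 2.0 * half_width / steps
    return sum(f(-half_width + (i + 0.5) * dx, lam) for i in range(steps)) * dx

# Peak height f(0) equals lam, while the area should stay at one.
for lam in (0.5, 1.0, 5.0):
    print(f"lambda={lam}: peak={f(0.0, lam):.3f}, area={area(lam):.6f}")
```

The peak height grows with $\lambda$ while the area stays put, which is the "taller and skinnier" behavior described above.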

By definition $\sigma_X^2 = \operatorname{Var}(X) = E[(X-\mu)^2] = \int_{-\infty}^{\infty} (x-\mu)^2f(x) dx$. Here $\mu = 0$ because $f$ is symmetric about the origin, so

$$\sigma^2 = \int_{-\infty}^{\infty} (x-0)^2f(x) dx = \int_{-\infty}^{\infty} x^2 \lambda e^{-\lambda^2 \pi x^2} dx$$

Integration by parts, where $v$ is found by substitution:

$$u = x \qquad du = dx \qquad dv = x e^{-\lambda^2 \pi x^2} dx \qquad v = \frac{-1}{2\pi \lambda^2} e^{-\lambda^2 \pi x^2}$$

$$\sigma^2 = \lambda \left( \frac{-x}{2\pi\lambda^2} e^{-\pi \lambda^2 x^2} \Biggr|_{-\infty}^{\infty} + \int_{-\infty}^{\infty} \frac{1}{2 \pi \lambda^2} e^{-\pi \lambda^2 x^2} dx \right) = \lambda \int_{-\infty}^{\infty} \frac{1}{2 \pi \lambda^2} e^{-\pi \lambda^2 x^2} dx$$

$$ = \frac{1}{2 \pi \lambda^2} \int_{-\infty}^{\infty} \lambda e^{-\pi \lambda^2 x^2} dx = \frac{1}{2 \pi \lambda^2}$$

$$\Rightarrow \lambda^2 = \frac{1}{2\pi \sigma^2} \Rightarrow \lambda = \frac{1}{\sqrt{2 \pi} \sigma}$$

The boundary term vanishes because the exponential decays much faster than $x$ grows, and the last integral equals one because it is just the area under $f$.
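As a sanity check on the integration by parts, here is a small sketch that approximates $\int x^2 f(x)\,dx$ numerically and compares it with $\frac{1}{2\pi\lambda^2}$. The midpoint rule, the window, and the sample values of $\lambda$ are my own assumptions for illustration.

```python
import math

def f(x, lam):
    """Density from the derivation: f(x) = lam * exp(-lam^2 * pi * x^2)."""
    return lam * math.exp(-(lam ** 2) * math.pi * x * x)

def second_moment(lam, half_width=10.0, steps=20_000):
    """Midpoint-rule approximation of the integral of x^2 * f(x)."""
    dx = 2.0 * half_width / steps
    total = 0.0
    for i in range(steps):
        x = -half_width + (i + 0.5) * dx
        total += x * x * f(x, lam)
    return total * dx

for lam in (0.5, 1.0, 2.0):
    numeric = second_moment(lam)
    closed_form = 1.0 / (2.0 * math.pi * lam ** 2)
    print(f"lambda={lam}: numeric={numeric:.8f}, closed form={closed_form:.8f}")
```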

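So $\lambda$ and $\sigma$ really are inversely proportional. Plugging this $\lambda$ into $f(x) = \lambda e^{-\lambda^2 \pi x^2}$ and using $\lambda^2 \pi = \frac{1}{2\sigma^2}$ gives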

$$f(x) = \frac{1}{\sqrt{2 \pi} \sigma} e^{\frac{-1}{2}(\frac{x}{\sigma})^2}$$

And if you want to center this at some other value $\mu$, then just replace $x$ with $x - \mu$:

$$f(x) = \frac{1}{\sqrt{2 \pi} \sigma} e^{-\frac{1}{2}(\frac{x-\mu}{\sigma})^2}$$
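To double check the final formula, here is a short sketch that evaluates it directly and compares it against `statistics.NormalDist` from the Python standard library. The helper name `normal_pdf` and the values of $\mu$ and $\sigma$ are arbitrary choices of mine for the comparison.

```python
import math
from statistics import NormalDist

def normal_pdf(x, mu=0.0, sigma=1.0):
    """The derived density: exp(-((x - mu) / sigma)^2 / 2) / (sqrt(2 pi) * sigma)."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (math.sqrt(2.0 * math.pi) * sigma)

mu, sigma = 1.5, 2.0  # arbitrary illustrative parameters
reference = NormalDist(mu, sigma)

# Largest disagreement between the derived formula and the library pdf.
max_diff = max(
    abs(normal_pdf(x, mu, sigma) - reference.pdf(x))
    for x in (-3.0, 0.0, 1.5, 4.0)
)
print("largest difference:", max_diff)  # only floating point noise is expected
```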

