Tue 19 June 2018

Conditional Expectation and Prediction

Written by Hongjinn Park in Articles

The question of how to best predict one random variable from another comes up naturally in simple linear regression. Suppose you have the model,

$$ Y = \alpha + \beta x + e, \qquad e \sim N(0,\sigma^2)$$

Then the "best" predictor of $Y$ is $E[Y \mid X = x]$ unless if $X$ and $Y$ are independent. Or you could say unless there is no regression on the input.


Let's say you observe a random variable $X$ and you want to predict a second random variable $Y$, where your predictor of $Y$ is some function $g(X)$.

Then you can prove that the "best" predictor of $Y$ is $g(X) = E[Y \mid X]$, where "best" means the choice of $g$ that minimizes the mean squared error $E[(Y-g(X))^2]$.

To prove that $E[Y \mid X]$ is the "best" possible predictor, we just need to show that for any function $g$,

$$E[(Y-g(X))^2] \ge E[(Y-E[Y \mid X])^2]$$

I wrote the proof for this here. Note that if $X$ and $Y$ are independent, then $E[Y \mid X] = E[Y]$, so the plain mean is as good a predictor as the conditional mean.
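The standard argument, sketched briefly: add and subtract $E[Y \mid X]$ inside the square.

$$\begin{aligned} E[(Y-g(X))^2] &= E\big[(Y - E[Y \mid X] + E[Y \mid X] - g(X))^2\big] \\ &= E\big[(Y - E[Y \mid X])^2\big] + E\big[(E[Y \mid X] - g(X))^2\big] \\ &\quad + 2\,E\big[(Y - E[Y \mid X])\,(E[Y \mid X] - g(X))\big] \end{aligned}$$

Conditioning on $X$, the factor $E[Y \mid X] - g(X)$ is fixed and $E[Y - E[Y \mid X] \mid X] = 0$, so by the tower property the cross term is zero. The middle term is nonnegative, which gives the inequality, with equality when $g(X) = E[Y \mid X]$.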

An example

Say $X$ is the distance we will drive and we want to predict $Y$, the amount of gas we will use. Now $E[Y]$, the mean gas used over all trips, wouldn't be a bad predictor of $Y$ if we always travel more or less the same distance. But if we take trips of varying distance, then a better predictor of $Y$ is $E[Y \mid X]$: compare $E[Y \mid X = \text{1 mile}]$ with $E[Y \mid X = \text{100 miles}]$.
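To make the comparison concrete, here is a minimal simulation sketch in Python. The distance range, the gallons-per-mile rate, and the noise level are made-up illustrative values, not anything taken from the example above.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical numbers for illustration only: trips range from 1 to 100 miles,
# and gas used is roughly 0.04 gallons per mile plus noise.
n = 100_000
x = rng.uniform(1, 100, size=n)               # distance driven (miles)
y = 0.04 * x + rng.normal(0, 0.2, size=n)     # gas used (gallons)

# Predictor 1: ignore the distance and always predict the overall mean E[Y].
pred_marginal = np.full(n, y.mean())

# Predictor 2: condition on the distance; here E[Y | X = x] = 0.04 * x.
pred_conditional = 0.04 * x

print("MSE using E[Y]    :", np.mean((y - pred_marginal) ** 2))
print("MSE using E[Y | X]:", np.mean((y - pred_conditional) ** 2))
```

With distances spread from 1 to 100 miles the conditional mean has a much smaller mean squared error; if every trip were roughly the same length, the two predictors would perform about the same.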

Another example

Say $X_1$ and $X_2$ are iid. Then

$$E[X_1 \mid X_1 + X_2] + E[X_2 \mid X_1 + X_2] = E[X_1 + X_2 \mid X_1 + X_2] = X_1 + X_2$$

since $E[Z \mid Z] = Z$ for any random variable $Z$. Also, since $X_1$ and $X_2$ are iid, by symmetry $E[X_1 \mid X_1 + X_2] = E[X_2 \mid X_1 + X_2]$, and therefore $E[X_1 \mid X_1 + X_2] = \frac{X_1 + X_2}{2}$.
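A quick numerical sanity check of that identity, as a sketch: the Exponential(1) distribution is just an arbitrary choice of iid pair, and we estimate $E[X_1 \mid X_1 + X_2 = s]$ by averaging $X_1$ over narrow bins of the sum, then compare to $s/2$.

```python
import numpy as np

rng = np.random.default_rng(0)

# Any iid pair works; Exponential(1) is an arbitrary illustrative choice.
n = 1_000_000
x1 = rng.exponential(1.0, size=n)
x2 = rng.exponential(1.0, size=n)
s = x1 + x2

# Estimate E[X1 | S = s] by averaging X1 within narrow bins of the sum S,
# then compare against the claimed value s / 2.
edges = np.linspace(0.0, 6.0, 25)
which = np.digitize(s, edges)
for k in range(1, len(edges)):
    in_bin = which == k
    if in_bin.sum() > 1_000:   # skip sparsely populated bins
        mid = 0.5 * (edges[k - 1] + edges[k])
        print(f"s ~ {mid:4.2f}   E[X1 | S = s] ~ {x1[in_bin].mean():.3f}   s/2 = {mid / 2:.3f}")
```

Each bin's average of $X_1$ lands close to half of the bin's sum, which is what the symmetry argument predicts.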


