Conditional Expectation and Prediction
This question can be related to Simple Linear Regression. Suppose you have the model,
$$ Y = \alpha + \beta x + e, \quad e \sim N(0,\sigma^2)$$

Then the "best" predictor of $Y$ is $E[Y \mid X = x] = \alpha + \beta x$. The only case where conditioning on the input adds nothing is when $X$ and $Y$ are independent, or, put another way, when there is no regression on the input ($\beta = 0$).
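As a quick check of why conditioning helps in this model (a sketch, treating the input as a random variable $X$ independent of the error $e$), compare the mean squared errors of the two predictors:

$$E\big[(Y - E[Y \mid X])^2\big] = E[e^2] = \sigma^2$$

$$E\big[(Y - E[Y])^2\big] = E\big[(\beta(X - E[X]) + e)^2\big] = \beta^2\operatorname{Var}(X) + \sigma^2$$

So the conditional mean is strictly better whenever $\beta \neq 0$ and $X$ actually varies, and the two predictors coincide when $\beta = 0$.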
More generally, let's say you observe an RV $X$ and you want to predict a second RV $Y$, where your predictor of $Y$ is some function $g(X)$.
Then you can prove that the "best" predictor of $Y$ is $g(X) = E[Y \mid X]$, where "best" means the function $g$ that minimizes the mean squared error $E[(Y-g(X))^2]$.
To prove that $E[Y \mid X]$ is the "best" predictor possible we just need to show that, for any function $g$,

$$E[(Y-g(X))^2] \ge E[(Y-E[Y \mid X])^2],$$

and I wrote the proof for this here. Note that $E[Y]$ will be as good a predictor as $E[Y \mid X]$ if $X$ and $Y$ are independent, since in that case $E[Y \mid X] = E[Y]$.
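For completeness, here is a sketch of the standard argument (not necessarily the same presentation as the linked proof): expand the squared error around $E[Y \mid X]$,

$$E[(Y-g(X))^2] = E[(Y-E[Y \mid X])^2] + E[(E[Y \mid X]-g(X))^2] + 2\,E\big[(Y-E[Y \mid X])(E[Y \mid X]-g(X))\big],$$

and note that the cross term is zero: conditioning on $X$ first gives $E\big[(Y-E[Y \mid X])(E[Y \mid X]-g(X)) \mid X\big] = (E[Y \mid X]-g(X))\,E[Y-E[Y \mid X] \mid X] = 0$. Since the middle term is nonnegative, dropping it gives the inequality above.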
An example
Say $X$ is the distance we will drive and we want to predict $Y$, the amount of gas we will use. Now $E[Y]$, the mean gas used over all trips, wouldn't be a bad predictor of $Y$ if our trips are always more or less the same distance. But if we take trips of varying distance, then a better predictor is $E[Y \mid X]$: compare $E[Y \mid X=\text{1 mile}]$ with $E[Y \mid X=\text{100 miles}]$. A simulation of this comparison is sketched below.
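Here is a minimal simulation sketch of that comparison. The numbers are made up for illustration: trip lengths uniform between 1 and 100 miles, gas use of $x/25$ gallons (25 mpg) plus Normal noise with standard deviation 0.5.

```python
import numpy as np

rng = np.random.default_rng(0)

n = 100_000
miles = rng.uniform(1, 100, size=n)            # X: trip distance in miles, varies a lot
gas = miles / 25 + rng.normal(0, 0.5, size=n)  # Y: gallons used (assumed 25 mpg plus noise)

# Predictor 1: the overall mean E[Y], ignoring the distance of the trip
pred_overall = gas.mean()
mse_overall = np.mean((gas - pred_overall) ** 2)

# Predictor 2: the conditional mean E[Y | X], here known exactly to be X / 25
pred_conditional = miles / 25
mse_conditional = np.mean((gas - pred_conditional) ** 2)

print(f"MSE using E[Y]:    {mse_overall:.2f}")      # roughly 1.6 with these numbers
print(f"MSE using E[Y|X]:  {mse_conditional:.2f}")  # roughly 0.25, the noise variance
```

If instead every trip were about the same distance, the first number would shrink toward the second, which is the "more or less the same distance" case described above.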
Another example
Say $X_1$ and $X_2$ are iid. Then

$$E[X_1 \mid X_1 + X_2] + E[X_2 \mid X_1 + X_2] = E[X_1 + X_2 \mid X_1 + X_2] = X_1 + X_2,$$

since $E[X \mid X] = X$. Also, by symmetry (the two variables are iid), $E[X_1 \mid X_1 + X_2] = E[X_2 \mid X_1 + X_2]$, so $E[X_1 \mid X_1 + X_2] = \frac{X_1 + X_2}{2}$.
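A quick Monte Carlo check of this identity (a sketch; the Poisson(3) choice is just an assumption to make the sum take repeatable discrete values): group simulated pairs by their sum $s$ and compare the average of $X_1$ within each group to $s/2$.

```python
import numpy as np
from collections import defaultdict

rng = np.random.default_rng(0)

n = 200_000
x1 = rng.poisson(3, size=n)   # X1 and X2 iid
x2 = rng.poisson(3, size=n)
totals = x1 + x2

# Estimate E[X1 | X1 + X2 = s] by averaging X1 over all pairs whose sum equals s
groups = defaultdict(list)
for a, s in zip(x1, totals):
    groups[s].append(a)

for s in sorted(groups)[:8]:
    estimate = np.mean(groups[s])
    print(f"s = {s}: estimated E[X1 | X1 + X2 = s] = {estimate:.3f}, s/2 = {s / 2}")
```

The estimated conditional means track $s/2$ closely, as the symmetry argument predicts.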