Conditional Expectation and Prediction
This question can be related to Simple Linear Regression. Suppose you have the model,
$$ Y = \alpha + \beta x + e, \quad e \sim N(0,\sigma^2)$$

Then the "best" predictor of $Y$ is $E[Y \mid X = x] = \alpha + \beta x$. The only case where conditioning on the input adds nothing is when $X$ and $Y$ are independent, or, put another way, when there is no regression on the input ($\beta = 0$).
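As a quick check of why conditioning helps in this model (a sketch, treating the input as a random variable $X$ independent of the error $e$), compare the mean squared errors of the two predictors:

$$E\big[(Y - E[Y \mid X])^2\big] = E[e^2] = \sigma^2$$

$$E\big[(Y - E[Y])^2\big] = E\big[(\beta(X - E[X]) + e)^2\big] = \beta^2\operatorname{Var}(X) + \sigma^2$$

So the conditional mean is strictly better whenever $\beta \neq 0$ and $X$ actually varies, and the two predictors coincide when $\beta = 0$.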
More generally, let's say you observe an RV $X$ and you want to predict a second RV $Y$, where your predictor of $Y$ is some function $g(X)$.
Then you can prove that the "best" predictor of $Y$ is $g(X) = E[Y \mid X]$, where "best" means the function $g$ that minimizes the mean squared error $E[(Y-g(X))^2]$.
To prove that $E[Y \mid X]$ is the "best" predictor possible we just need to show that, for any function $g$,

$$E[(Y-g(X))^2] \ge E[(Y-E[Y \mid X])^2],$$

and I wrote the proof for this here. Note that $E[Y]$ will be as good a predictor as $E[Y \mid X]$ if $X$ and $Y$ are independent, since in that case $E[Y \mid X] = E[Y]$.
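For completeness, here is a sketch of the standard argument (not necessarily the same presentation as the linked proof): expand the squared error around $E[Y \mid X]$,

$$E[(Y-g(X))^2] = E[(Y-E[Y \mid X])^2] + E[(E[Y \mid X]-g(X))^2] + 2\,E\big[(Y-E[Y \mid X])(E[Y \mid X]-g(X))\big],$$

and note that the cross term is zero: conditioning on $X$ first gives $E\big[(Y-E[Y \mid X])(E[Y \mid X]-g(X)) \mid X\big] = (E[Y \mid X]-g(X))\,E[Y-E[Y \mid X] \mid X] = 0$. Since the middle term is nonnegative, dropping it gives the inequality above.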
An example
Say $X$ is the distance we will drive and we want to predict $Y$, the amount of gas we will use. Now $E[Y]$, the mean gas used over all trips, wouldn't be a bad predictor of $Y$ if our trips are always more or less the same distance. But if we take trips of varying distance, then a better predictor is $E[Y \mid X]$: compare $E[Y \mid X=\text{1 mile}]$ with $E[Y \mid X=\text{100 miles}]$. A simulation of this comparison is sketched below.
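Here is a minimal simulation sketch of that comparison. The numbers are made up for illustration: trip lengths uniform between 1 and 100 miles, gas use of $x/25$ gallons (25 mpg) plus Normal noise with standard deviation 0.5.

```python
import numpy as np

rng = np.random.default_rng(0)

n = 100_000
miles = rng.uniform(1, 100, size=n)            # X: trip distance in miles, varies a lot
gas = miles / 25 + rng.normal(0, 0.5, size=n)  # Y: gallons used (assumed 25 mpg plus noise)

# Predictor 1: the overall mean E[Y], ignoring the distance of the trip
pred_overall = gas.mean()
mse_overall = np.mean((gas - pred_overall) ** 2)

# Predictor 2: the conditional mean E[Y | X], here known exactly to be X / 25
pred_conditional = miles / 25
mse_conditional = np.mean((gas - pred_conditional) ** 2)

print(f"MSE using E[Y]:    {mse_overall:.2f}")      # roughly 1.6 with these numbers
print(f"MSE using E[Y|X]:  {mse_conditional:.2f}")  # roughly 0.25, the noise variance
```

If instead every trip were about the same distance, the first number would shrink toward the second, which is the "more or less the same distance" case described above.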
Another example
Say $X_1$ and $X_2$ are iid. Then

$$E[X_1 \mid X_1 + X_2] + E[X_2 \mid X_1 + X_2] = E[X_1 + X_2 \mid X_1 + X_2] = X_1 + X_2,$$

since $E[X \mid X] = X$. Also, by symmetry (the two variables are iid), $E[X_1 \mid X_1 + X_2] = E[X_2 \mid X_1 + X_2]$, so $E[X_1 \mid X_1 + X_2] = \frac{X_1 + X_2}{2}$.
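A quick Monte Carlo check of this identity (a sketch; the Poisson(3) choice is just an assumption to make the sum take repeatable discrete values): group simulated pairs by their sum $s$ and compare the average of $X_1$ within each group to $s/2$.

```python
import numpy as np
from collections import defaultdict

rng = np.random.default_rng(0)

n = 200_000
x1 = rng.poisson(3, size=n)   # X1 and X2 iid
x2 = rng.poisson(3, size=n)
totals = x1 + x2

# Estimate E[X1 | X1 + X2 = s] by averaging X1 over all pairs whose sum equals s
groups = defaultdict(list)
for a, s in zip(x1, totals):
    groups[s].append(a)

for s in sorted(groups)[:8]:
    estimate = np.mean(groups[s])
    print(f"s = {s}: estimated E[X1 | X1 + X2 = s] = {estimate:.3f}, s/2 = {s / 2}")
```

The estimated conditional means track $s/2$ closely, as the symmetry argument predicts.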