Safety stock formula derivation
There is a safety stock formula on Wikipedia, but I've never found a step-by-step derivation or even a formal statement of the problem.
$$ SS = z_{\alpha} \sqrt{E[L]\,\sigma_{D}^{2} + (E[D])^{2}\sigma_{L}^{2}} $$
https://en.wikipedia.org/wiki/Safety_stock
Below I provide a formal problem statement and show two ways to derive the SS formula. I'd love to get your feedback. Thank you!
Background
First let's talk about the problem we're trying to solve. Imagine you're the Supply Chain Manager of a candy factory, in charge of inventory levels for raw materials (i.e. sugar, honey, etc.). If you ever run out of a raw material, say sugar, the plant will shut down and everyone will yell at you.
You may be tempted to maintain extremely high inventory levels so there's never a chance you run out. A billion tons of sugar in the warehouse at all times. Unfortunately this takes up space and ties up money in inventory. Therefore, it is your job to come up with the minimal amount of safety stock needed to maintain a certain service level.
As the Supply Chain Manager you're faced with two uncertainties. First is the amount of material needed in a given time frame. This quantity is uncertain because it is impossible to perfectly forecast things like customer demand. For example, next week do we need three or four tons of sugar to support production for Halloween sales? The second uncertainty is not knowing exactly when new deliveries will arrive. Suppliers are not always on time.
Formalizing the problem
Let $D_1, D_2, \dots$ be a sequence of independent and identically distributed random variables with mean $\mu_D$ and variance $\sigma_D^2$. Each $D_i$ represents your demand quantity in one time period. For example, say we choose one time period to be a week. Then $D_1$ and $D_2$ are quantities that represent your total raw material needs in week one and week two respectively.
Let $T \in \{1,2,\dots\}$ be a random variable that represents the time between deliveries, with mean $\mu_L = E[T]$ and variance $\sigma_L^2 = Var(T)$ (the subscript $L$ matches the notation of the Wikipedia formula above). For example, let's say the supplier delivers every two weeks and is never late. Then $T=2$ always, and in this situation $T$ is a constant random variable. In the real world, however, $T$ can vary depending on how unreliable the supplier is.
Therefore the quantity we are interested in is:
$$D(T) = \sum_{i=1}^{T} D_i = D_1 + D_2 + ... + D_T$$
As you can see, the two unknowns are the demand quantity in each time period (the $D_i$'s) and the number of time periods until the next delivery. $D(T)$ is therefore a random variable that represents your total demand between deliveries, and the amount of safety stock we keep is some multiple $C$ of its standard deviation. That is,
$$\text{Safety Stock} = C \sqrt{Var(D(T))} = C \sqrt{\mu_L \sigma_D^2 + \mu_D^2 \sigma_L^2}$$
For example, if $D(T) \sim Normal(\mu, \sigma^2)$, then keeping $C=3$ standard deviations on top of the mean gives you a $\Phi(3) \approx 99.87$ percent chance of not running out. Note that $D(T)$ will not always be normally distributed. But by Chebyshev's Theorem, regardless of how $D(T)$ is distributed (as long as its variance is finite), setting $C=3$ gives you at least a $1 - 1/9 \approx 88.9$ percent chance of not running out.
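To make the two service levels for a given $C$ concrete, here is a minimal Python sketch. The function names `normal_cdf` and `chebyshev_lower_bound` are my own, chosen for illustration. It evaluates the Normal CDF through the standard library's error function and compares it with the distribution-free Chebyshev bound $1 - 1/C^2$.

```python
import math


def normal_cdf(x: float) -> float:
    """Standard Normal CDF, Phi(x), written in terms of the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def chebyshev_lower_bound(c: float) -> float:
    """Chebyshev: P(|X - mu| < c * sigma) >= 1 - 1/c^2 for any finite-variance X."""
    return 1.0 - 1.0 / c**2


for c in (1.0, 2.0, 3.0):
    print(
        f"C={c:.0f}: Normal service level {normal_cdf(c):.4f}, "
        f"Chebyshev guarantee {chebyshev_lower_bound(c):.4f}"
    )
```

The gap between the two columns is the price of not assuming a distribution: the Chebyshev figure holds for any $D(T)$ with finite variance, while the Normal figure is only valid if $D(T)$ really is Normal.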
Super Fast Derivation
It turns out this is already a well-known problem, just not necessarily in the context of supply chains or safety stock. Note that $D(T)$ is the sum of a random number of random variables that are independent and identically distributed. Here $T$ is a positive integer, and for this problem we are assuming that $T$ is independent of the sequence $D_i$. Using the law of total variance
$$Var(X) = E[Var(X|Y)] + Var(E[X|Y])$$
we can condition the sum of the $D_i$'s on $T$ and after using the linearity of expectations and the fact that the variance of a sum of independent RVs is the sum of each variance, get
$$Var\bigg(\sum_{i=1}^{T} D_i \bigg) = E[T] Var(D) + (E[D])^2 Var(T)$$
Therefore one standard deviation of $D(T)$ is
$$\sqrt{Var(D(T))} = \sqrt{\mu_L \sigma_D^2 + \mu_D^2 \sigma_L^2}$$
and the final safety stock formula is a multiple $C$ of one standard deviation of $D(T)$.
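As a sanity check on the variance identity, here is a small Python sketch that computes $Var(D(T))$ exactly by enumerating every outcome and compares it with $E[T]Var(D) + (E[D])^2 Var(T)$. The two toy distributions are made up for illustration (they are not from any real data), and exact fractions are used so there is no rounding.

```python
from fractions import Fraction
from itertools import product

# Hypothetical toy distributions, chosen only for illustration.
demand_pmf = {3: Fraction(1, 2), 4: Fraction(1, 2)}  # weekly demand D_i (tons)
time_pmf = {1: Fraction(1, 4), 2: Fraction(1, 2), 3: Fraction(1, 4)}  # weeks T


def mean(pmf):
    """Expected value of a discrete distribution given as {value: probability}."""
    return sum(x * p for x, p in pmf.items())


def variance(pmf):
    """Variance of a discrete distribution given as {value: probability}."""
    m = mean(pmf)
    return sum((x - m) ** 2 * p for x, p in pmf.items())


def total_demand_pmf(demand_pmf, time_pmf):
    """Exact distribution of D(T) = D_1 + ... + D_T, with T independent of the D_i."""
    out = {}
    for t, p_t in time_pmf.items():
        # Enumerate every possible sequence of t weekly demands.
        for demands in product(demand_pmf, repeat=t):
            p = p_t
            for d in demands:
                p *= demand_pmf[d]
            total = sum(demands)
            out[total] = out.get(total, Fraction(0)) + p
    return out


exact = variance(total_demand_pmf(demand_pmf, time_pmf))
formula = (
    mean(time_pmf) * variance(demand_pmf)
    + mean(demand_pmf) ** 2 * variance(time_pmf)
)
print(exact, formula)  # both print 53/8
```

With these numbers $E[T]=2$, $Var(T)=1/2$, $E[D]=7/2$ and $Var(D)=1/4$, so the formula gives $2 \cdot 1/4 + (49/4)(1/2) = 53/8$, matching the brute-force enumeration.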
Another Derivation
Here's another way of deriving the formula. Again let $T$ be a random variable that gives the duration of time between deliveries. Let $D(T)$ be a random variable that represents the total demand during an interval of length $T$. In the end what we want to derive is $\sqrt{Var(D(T))}$ which is the standard deviation of $D(T)$.
Let's first compute the expected value of $D(T)$ conditional on $T = t$
$$E[D(T) | T = t] = E[D_1 + D_2 + ... + D_t] = \sum_{i=1}^{t} E[D_i] = tE[D_i] = t\mu_D $$
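The conditioning on $T = t$ can be dropped in the first step because $T$ is independent of the $D_i$. Since the $D_i$ are also independent of each other, their variances add, so similarly,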
$$Var(D(T)|T=t) = Var(D_1 + D_2 + ... + D_t) = \sum_{i=1}^t Var(D_i) = t\sigma_D^2$$
Therefore
$$E[D(T)|T] = T\mu_D$$
and
$$Var(D(T)|T) = T\sigma_D^2$$
Now we use the property
$$Var(X|Y) = E[X^2|Y] - (E[X|Y])^2$$
which can be rearranged into
$$E[X^2|Y] = Var(X|Y) + (E[X|Y])^2$$
For our problem $X$ is $D(T)$ and $Y$ is $T$. Plugging in we get
$$ E[D(T)^2|T] = Var(D(T)|T) + (E[D(T)|T])^2$$
$$= T\sigma_D^2 + T^2\mu_D^2$$
Now we use the following property to get the unconditional expectations.
$$E[X] = E[E[X|Y]]$$
Therefore,
$$E[D(T)] = E[E[D(T)|T]] = E[T\mu_D] = \mu_D E[T] = \mu_D \mu_L$$
and
$$E[D(T)^2] = E[E[D(T)^2|T]] = E[T\sigma_D^2 + T^2\mu_D^2] = E[T\sigma_D^2] + E[T^2\mu_D^2] $$
$$= \sigma_D^2 E[T] + \mu_D^2 E[T^2] = \sigma_D^2 \mu_L + \mu_D^2 E[T^2] = \sigma_D^2 \mu_L + \mu_D^2 (\sigma_L^2 + \mu_L^2)$$
The last equality follows because $Var(T) = E[T^2] - (E[T])^2$, which gives us $E[T^2] = \sigma_L^2 + \mu_L^2$.
Finally,
$$Var(D(T)) = E[D(T)^2] - (E[D(T)])^2 = \sigma_D^2 \mu_L + \mu_D^2 (\sigma_L^2 + \mu_L^2) - (\mu_D \mu_L)^2$$
$$ = \mu_L \sigma_D^2 + \mu_D^2 \sigma_L^2$$
And one standard deviation of $D(T)$ is
$$\sigma_{D(T)}=\sqrt{Var(D(T))} = \sqrt{\mu_L \sigma_D^2 + \mu_D^2 \sigma_L^2}$$
So our safety stock should be some multiple of the above based on our risk tolerance.
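To see the formula in action, here is a short Python sketch that computes the safety stock and checks the standard deviation against a simulation. All of the numbers are assumptions chosen for illustration: weekly demand is drawn from a Normal with mean 3.5 and standard deviation 0.5 tons, and the time between deliveries is 1, 2 or 3 weeks with probabilities 0.25, 0.5 and 0.25. The helper names `safety_stock` and `simulate_total_demand` are my own.

```python
import math
import random
import statistics


def safety_stock(c, mu_L, sigma_L, mu_D, sigma_D):
    """C * sqrt(mu_L * sigma_D^2 + mu_D^2 * sigma_L^2)."""
    return c * math.sqrt(mu_L * sigma_D**2 + mu_D**2 * sigma_L**2)


def simulate_total_demand(rng, n_cycles, mu_D, sigma_D, times, weights):
    """Draw n_cycles samples of D(T): pick T, then sum T independent weekly demands."""
    totals = []
    for t in rng.choices(times, weights=weights, k=n_cycles):
        totals.append(sum(rng.gauss(mu_D, sigma_D) for _ in range(t)))
    return totals


# Assumed inputs (illustrative only).
mu_D, sigma_D = 3.5, 0.5                    # tons per week
times, weights = [1, 2, 3], [0.25, 0.5, 0.25]  # weeks between deliveries
mu_L = sum(t * w for t, w in zip(times, weights))
sigma_L = math.sqrt(sum(w * (t - mu_L) ** 2 for t, w in zip(times, weights)))

formula_sd = safety_stock(1.0, mu_L, sigma_L, mu_D, sigma_D)

rng = random.Random(0)  # fixed seed so the run is repeatable
samples = simulate_total_demand(rng, 100_000, mu_D, sigma_D, times, weights)
simulated_sd = statistics.pstdev(samples)

print(f"formula sd:   {formula_sd:.3f} tons")
print(f"simulated sd: {simulated_sd:.3f} tons")
print(f"safety stock at C=3: {safety_stock(3.0, mu_L, sigma_L, mu_D, sigma_D):.2f} tons")
```

Here $\mu_L = 2$ and $\sigma_L^2 = 0.5$, so the formula gives $\sqrt{2 \cdot 0.25 + 12.25 \cdot 0.5} = \sqrt{6.625} \approx 2.57$ tons for one standard deviation and about $7.72$ tons of safety stock at $C=3$. The simulated standard deviation should land close to the formula value, with the small difference due to sampling noise.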