Sat 5 January 2019

Coupon collecting with different probabilities

Written by Hongjinn Park in Articles

Praying to RNGesus for a good pull. Definitely a relevant problem. You can't apply the technique in the equal probability case. Go a different route.


Cleverly realize that if $X$ is the number of coupons to collect a full set, and $X_i$ is the number of picks to get coupon $i$ from a sequence of pulls, then $X = \max\limits_{i=1,...,n} X_i$

A few of my own observations. $X_i \neq X_j \quad \forall i,j$ and also these $n$ RVs are not independent of each other. For if you know $X_1=2$ then you know all the other RVs cannot equal 2.

Next, recall the Max-Mins Identity, which says for a set of $n$ numbers (can be negative or decimals) there is a formula to calculate the max of the numbers: $$\max\limits_{i=1,...,n} x_i = \sum_{i=1}^{n} x_i - \sum_{i < j} \min (x_i, x_j) + ... + (-1)^{n+1} \min(x_1, ... ,x_n)$$

and therefore for RVs and by the LOE,

$$E\bigg[\max\limits_{i} X_i\bigg]= \sum_{i=1}^{n} E[X_i] - \sum_{i < j} E[\min (X_i, X_j)] + ... + (-1)^{n+1} E[\min(X_1, ... ,X_n)]$$

Now first look at the easiest first term, $X_i$ and realize that this is a geometric RV with probability $p_i$. Summing all $n$ yields $\frac{1}{p_1}+\frac{1}{p_2}+...+\frac{1}{p_n}$

Now look at $E[\min(X_i,X_j)]$ and have a very clever idea that the min of $X_i$ and $X_j$ is also a geometric RV with parameter $p_i+p_j$ and therefore has expected value $\frac{1}{p_i+p_j}$.

Now you can come up with a really crazy formula for $E[X]$ using an identity involving an integral and an identity for sums involving a product, both identities using $e$.

Let's say $p_1 = .99$ and $p_1 = .01$ then

$$E[X] = \frac{1}{.99} +\frac{1}{.01} -\frac{1}{1} = 1.010101 + 100 - 1 \approx 100.01 $$


Articles

Personal notes I've written over the years.