Coupon collecting with n pulls and different probabilities
Let $X$ be the number of unique coupons in $n$ pulls, what is the expected value and variance when the probabilities are different?
So the max formula doesn't apply as well here since your total pulls is restricted to a set number.
Let's be very clever and this time think about the number of uncollected types. Let $X$ be the number of uncollected coupons and $X_i = 1$ if coupon $i$ is uncollected in n pulls. Therefore, $Y$ which is the number of unique coupons collected in $n$ pulls is
$$Y = N-X = N-\sum_{i=1}^{N} X_i$$ $$E[Y] = N-\sum_{i=1}^{N} E[X_i] = N-\sum_{i=1}^{N} P(X_i=1) = N - \sum_{i=1}^{N} (1-p_i)^n$$Let's say $N=20$ and $n=10$ and $p_i = \frac{1}{20}$ then $E[Y] = 20 - 20(\frac{19}{20})^{10} = 8.02$ so you're expected to have two duplicates.
Also, since $X = N-Y$,
$$E[X] = E[N-Y] = N - E[Y] = \sum_{i=1}^{N} (1-p_i)^n$$ Also note that the $Var(Y) =Var(N-X) = 0 + (-1)^2Var(X) = Var(X)$. The $Var(X) = E[X^2] - E[X]^2$ and I already have the expected value. To get $E[X^2]$ I use the this useful identity that you get from looking at the $E[{X \choose 2}]$ which eventually gives you $$E[X^2] = 2 \sum_{i< j} P(A_iA_j) + E[X]$$For my problem, the event that both $X_i$ and $X_j$ both happen occurs with probability $(1-p_i-p_j)^n$ and therefore,
$$E[X^2] = 2 \sum_{i < j} (1-p_i-p_j)^n + E[X]$$And from all this crap you can get $Var(X) = Var(Y)$.
Articles
Personal notes I've written over the years.
- When does the Binomial become approximately Normal
- Gambler's ruin problem
- The t-distribution becomes Normal as n increases
- Marcus Aurelius on death
- Proof of the Central Limit Theorem
- Proof of the Strong Law of Large Numbers
- Deriving Multiple Linear Regression
- Safety stock formula derivation
- Derivation of the Normal Distribution
- Comparing means of Normal populations
- Concentrate like a Roman
- How to read a Regression summary in R
- Notes on Expected Value
- How to read an ANOVA summary in R
- The time I lost faith in Expected Value
- Notes on Weighted Linear Regression
- How information can update Conditional Probability
- Coupon collecting singeltons with equal probability
- Coupon collecting with n pulls and different probabilities
- Coupon collecting with different probabilities
- Coupon collecting with equal probability
- Adding Independent Normals Is Normal
- The value of fame during and after life
- Notes on the Beta Distribution
- Notes on the Gamma distribution
- Notes on Conditioning
- Notes on Independence
- A part of society
- Conditional Expectation and Prediction
- Notes on Covariance
- Deriving Simple Linear Regression
- Nature of the body
- Set Theory Basics
- Polynomial Regression
- The Negative Hyper Geometric RV
- Notes on the MVN
- Deriving the Cauchy density function
- Exponential and Geometric relationship
- Joint Distribution of Functions of RVs
- Order Statistics
- The Sample Mean and Sample Variance
- Probability that one RV is greater than another
- St Petersburg Paradox
- Drunk guy by a cliff
- The things that happen to us