Tuesday, 21 October 2014

combinatorics - Probability distribution in the coupon collector's problem



I'm trying to solve the well known Coupon Collector's Problem by explicitly finding the probability distribution (so far all the methods I read involve using some sort of trick). However, I'm not having much luck getting anywhere as combinatorics is not something I'm particularly good at.



The Coupon Collector's Problem is stated as:




There are $m$ different kinds of coupons to be collected from boxes. Assuming each type of coupon is equally likely to be found per box, what's the expected amount of boxes one has to buy to collect all types of coupons?



What I'm attempting:



Let $N$ be the random variable associated with the number of boxes one has to buy to find all coupons. Then $P(N=n)=\frac{|A_n|}{|\Omega _n|}$, where $A_n$ is the set of all outcomes such that all types of coupons are observed in $n$ buys, and $\Omega _n$ is the set of all the possible outcomes in $n$ buys. I think $|\Omega _n| = m^n$, but I'm not even sure about that anymore, as all my attempts so far led to garbage probabilities that either diverged or didn't sum up to 1.


Answer



As Henry pointed out in a comment, the probability has been determined elsewhere as



$$
\def\stir#1#2{\left\{#1\atop#2\right\}}

P(N=n)=\frac{m!}{m^n}\stir{n-1}{m-1}\;,
$$



where



$$\stir nk=\frac1{k!}\sum_{j=0}^k(-1)^{k-j}\binom kjj^n$$



is a Stirling number of the second kind and counts the number of partitions of a set of $n$ labeled objects into $k$ non-empty unlabeled subsets.



To get the expected value, it's slightly more convenient to work with the probability




$$
P(N\gt n)=1-\frac{m!}{m^n}\stir nm\;,
$$



which can be derived in much the same manner: There are $m^n$ sequences of length $n$; choose one of $\stir nm$ partitions into $m$ non-empty subsets and one of $m!$ assignments of the coupons types to the subsets.



Then



\begin{align}

E[N]={}&\sum_{n=0}^\infty P(N\gt n)\\
={}&\sum_{n=0}^\infty\left(1-\frac{m!}{m^n}\stir nm\right)\\
={}&\sum_{n=0}^\infty\left(1-\frac{m!}{m^n}\frac1{m!}\sum_{j=0}^m(-1)^{m-j}\binom mjj^n\right)\\
={}&\sum_{n=0}^\infty\frac1{m^n}\sum_{j=0}^{m-1}(-1)^{m-j+1}\binom mjj^n\\
={}&\sum_{j=0}^{m-1}(-1)^{m-j+1}\binom mj\sum_{n=0}^\infty\frac{j^n}{m^n}\\
={}&\sum_{j=1}^m(-1)^{j+1}\binom mj\sum_{n=0}^\infty\frac{(m-j)^n}{m^n}\\
={}&\sum_{j=1}^m(-1)^{j+1}\binom mj\frac mj\\
={}&-m\sum_{j=1}^m\int_0^{-1}\mathrm dq'\binom mjq'^{j-1}\\
={}&-m\int_0^{-1}\mathrm dq'\sum_{j=1}^m\binom mjq'^{j-1}\\
={}&-m\int_0^{-1}\mathrm dq'\frac{(1+q')^m-1}{q'}\\

={}&-m\int_0^{-1}\mathrm dq'\frac{(1+q')^m-1}{(1+q')-1}\\
={}&-m\int_0^{-1}\mathrm dq'\sum_{j=0}^{m-1}(-q')^j\\
={}&-m\sum_{j=0}^{m-1}\int_0^{-1}\mathrm dq'(-q')^j\\
={}&m\sum_{j=1}^m\frac1j\;.
\end{align}



I'll leave it to you to decide whether this counts as "using some sort of trick". :-)


No comments:

Post a Comment

real analysis - How to find $lim_{hrightarrow 0}frac{sin(ha)}{h}$

How to find $\lim_{h\rightarrow 0}\frac{\sin(ha)}{h}$ without lhopital rule? I know when I use lhopital I easy get $$ \lim_{h\rightarrow 0}...