Saturday 20 December 2014

What is the probability distribution for the number of N-sided die rolls needed to get M unique results?

Suppose you have a fair $N$-sided die. You decide to roll it until $M$ unique values have been produced (i.e. you re-roll all previously rolled values). How many times will you roll the die? (Given $2 <= M <= N$.)



I know that for the special case of $M=2$ it's simply a matter of how many times you have to re-roll your attempt for the second value, so the distribution is: $$P^N_2(X=u) = \left(\frac{1}{N}\right)^{u-2}\left(1-\frac{1}{N}\right) = \frac{N-1}{N^{u-1}}$$




And that for any $M$ the probability of the lowest possible outcome $X=M$ (i.e. no re-rolls): $$P^N_M(X=M) = \prod_{i=0}^{M-1}\left(1-\frac{i}{N}\right) = \frac{1}{N^M}\prod_{i=0}^{M-1}\left(N-i\right) = \frac{N!}{N^M(N-M)!}$$



The final clue I've got is that the probability distributions for subsequent values of $M$ can be defined using the probability distribution of the previous, like so:



$$P^N_{M}(X=u) = \sum_{i=1}^{u-M+1}\left(P^N_{M-1}(X=u-i)\left(\frac{M-1}{N}\right)^{i-1}\left(1-\frac{M-1}{N}\right)\right)$$



With that I can determine the probability distribution for any value of $M$ I want, for instance $M=3$:



$$P^N_3(X=u) = \sum_{i=1}^{u-3+1}\left(P^N_2(X=u-i)\left(\frac{3-1}{N}\right)^{i-1}\left(1-\frac{3-1}{N}\right)\right)$$




$$= \sum_{i=1}^{u-2}\left(\left(\frac{N-1}{N^{u-i-1}}\right)\left(\frac{2}{N}\right)^{i-1}\left(1-\frac{2}{N}\right)\right)$$



$$= \sum_{i=1}^{u-2}\left(\left(\frac{N-1}{N^{u-1}}\right)N^i\left(\frac{N}{2}\right)\left(\frac{2}{N}\right)^i\left(\frac{N-2}{N}\right)\right)$$



$$= \frac{(N-1)(N-2)}{2 \cdot N^{u-1}}\sum_{i=1}^{u-2}\left(2^i\right)$$



$$= \frac{(N-1)(N-2)}{N^{u-1}}\sum_{i=0}^{u-1}\left(2^i\right)$$



$$= \left(\frac{(N-1)(N-2)}{N^{u-1}}\right)\left(\frac{1-2^u}{1-2}\right)$$
$$= \frac{(N-1)(N-2)(2^u-1)}{N^{u-1}}$$




However, I have no idea how to turn this into a generic formula that will allow me to calculate the probability for any $N$, $M$, and $u$ without going through the process of figuring out the PMF of every value of $M$ leading up to the one I want.

No comments:

Post a Comment

real analysis - How to find $lim_{hrightarrow 0}frac{sin(ha)}{h}$

How to find $\lim_{h\rightarrow 0}\frac{\sin(ha)}{h}$ without lhopital rule? I know when I use lhopital I easy get $$ \lim_{h\rightarrow 0}...