Wednesday, 1 April 2015

probability - Finite sample bounds for generalized coupon collector problem




Consider the usual generalization of the coupon collector problem where $m$ copies of each coupon need to be collected. Let $T_m$ be the first time m copies of each coupon are collected. Donald J. Newman and Lawrence Shepp showed that
$$\mathbf{E} (T_{m})=n\log n+(m-1)n\log \log n+O(n),\ {\text{as}}\ n\to \infty $$



Then Erdős and Rényi showed that:



$$\displaystyle \operatorname {P} {\bigl (}T_{m}

But these statement are asymptotic. Are there known finite sample (upper and lower) bounds on $T_m$ ?


Answer




What follows is a minor contribution where we compute a formula for
the expectation for the case where $j$ instances of each of $n$ types
of coupons must be seen. Using the notation from the following MSE
link
we have from
first principles that



$$P[T = m] = \frac{1}{n^m}\times {n\choose 1}\times
(m-1)! [z^{m-1}]
\left(\exp(z) - \sum_{q=0}^{j-1} \frac{z^q}{q!}\right)^{n-1}
\frac{z^{j-1}}{(j-1)!}

\\ = \frac{n}{n^m}\times
{m-1\choose j-1} (m-j)! [z^{m-j}]
\left(\exp(z) - \sum_{q=0}^{j-1} \frac{z^q}{q!}\right)^{n-1}.$$



We verify that this is a probability distribution. The goal here
is to find a closed form for the infinite series in $m$ so that its
value may be calculated rather than approximated. Expanding the power
we find



$$\sum_{m\ge j} P[T=m] =

\frac{n}{n^j} \sum_{m\ge 0} \frac{1}{n^m} \times
{m+j-1\choose j-1} m! [z^m]
\sum_{k=0}^{n-1}
{n-1\choose k} \exp(kz) \\ \times (-1)^{n-1-k}
\left(\sum_{q=0}^{j-1} \frac{z^q}{q!}\right)^{n-1-k}
\\ = \frac{n}{n^j} \sum_{m\ge 0} \frac{1}{n^m} \times
{m+j-1\choose j-1} m!
\sum_{k=0}^{n-1}
{n-1\choose k}
\sum_{p=0}^m \frac{k^{m-p}}{(m-p)!} \\ \times (-1)^{n-1-k}

[z^p] \left(\sum_{q=0}^{j-1} \frac{z^q}{q!}\right)^{n-1-k}
\\ = \frac{n}{n^j} \sum_{k=0}^{n-1}
{n-1\choose k} (-1)^{n-1-k}
\sum_{p\ge 0}
[z^p] \left(\sum_{q=0}^{j-1} \frac{z^q}{q!}\right)^{n-1-k}
\\ \times \sum_{m\ge p}\frac{1}{n^m} \times
{m+j-1\choose j-1} m! \times \frac{k^{m-p}}{(m-p)!}
\\ = \frac{n}{n^j} \sum_{k=0}^{n-1}
{n-1\choose k} (-1)^{n-1-k}
\sum_{p\ge 0}

n^{-p} [z^p] \left(\sum_{q=0}^{j-1} \frac{z^q}{q!}\right)^{n-1-k}
\\ \times \sum_{m\ge 0}\frac{1}{n^m} \times
{m+p+j-1\choose j-1} (m+p)! \times \frac{k^m}{m!}$$



The inner sum is



$$\frac{1}{(j-1)!} \sum_{m\ge 0} (k/n)^m \frac{(m+p+j-1)!}{m!}
= \frac{(p+j-1)!}{(j-1)!} \frac{1}{(1-k/n)^{p+j}}$$



and with $P=(n-1-k)(j-1)$ we obtain




$$\bbox[5px,border:2px solid #00A000]{
n \sum_{k=0}^{n-1}
{n-1\choose k} \frac{(-1)^{n-1-k}}{(n-k)^j}
\times \sum_{p=0}^P
\frac{1}{(n-k)^p} \frac{(p+j-1)!}{(j-1)!}
[z^p] \left(\sum_{q=0}^{j-1} \frac{z^q}{q!}\right)^{n-1-k}.}$$



We claim that the inner sum is $(n-k)^{j-1},$ proof for $j=2$ at end
of document. With this the sum reduces to




$$n \sum_{k=0}^{n-1}
{n-1\choose k} \frac{(-1)^{n-1-k}}{n-k}
\\ = - \sum_{k=0}^{n-1} {n\choose k} (-1)^{n-k}
= 1 - \sum_{k=0}^{n} {n\choose k} (-1)^{n-k} = 1$$



and we see that we indeed have a probability distribution.



Continuing with the expectation and re-capitulating the earlier
computation we find




$$E[T] = \sum_{m\ge j} m P[T = m] =
\frac{n}{n^j} \sum_{k=0}^{n-1}
{n-1\choose k} (-1)^{n-1-k}
\sum_{p\ge 0}
n^{-p} [z^p] \left(\sum_{q=0}^{j-1} \frac{z^q}{q!}\right)^{n-1-k}
\\ \times \sum_{m\ge 0}\frac{1}{n^m} \times
{m+p+j-1\choose j-1} (m+p)! (m+p+j) \times \frac{k^m}{m!}$$



The inner sum has two pieces, the first is




$$\frac{1}{(j-1)!} \sum_{m\ge 1} (k/n)^m \frac{(m+p+j-1)!}{(m-1)!}
= \frac{1}{(j-1)!} \frac{k}{n}
\sum_{m\ge 0} (k/n)^m \frac{(m+p+j)!}{m!}
\\ = \frac{k}{n} \frac{(p+j)!}{(j-1)!} \frac{1}{(1-k/n)^{p+j+1}}
= \frac{k}{n-k} \frac{(p+j)!}{(j-1)!} \frac{1}{(1-k/n)^{p+j}}$$



and the second has been evaluated when we summed the probabilities to
give




$$(p+j) \frac{(p+j-1)!}{(j-1)!} \frac{1}{(1-k/n)^{p+j}}
= \frac{(p+j)!}{(j-1)!} \frac{1}{(1-k/n)^{p+j}}.$$



Substituting these into the outer sum we thus obtain



$$\bbox[5px,border:2px solid #00A000]{E[T] = n^2 \sum_{k=0}^{n-1}
{n-1\choose k} \frac{(-1)^{n-1-k}}{(n-k)^{j+1}}
\times \sum_{p=0}^P
\frac{1}{(n-k)^p} \frac{(p+j)!}{(j-1)!}
[z^p] \left(\sum_{q=0}^{j-1} \frac{z^q}{q!}\right)^{n-1-k}.}$$




There is a very basic program which confirmed this formula for several
digits of precision by simulation which is written in C and goes as
follows.




#include
#include
#include
#include


int main(int argc, char **argv)
{
int n = 6 , j = 3, trials = 1000;

if(argc >= 2){
n = atoi(argv[1]);
}

if(argc >= 3){

j = atoi(argv[2]);
}

if(argc >= 4){
trials = atoi(argv[3]);
}

assert(1 <= n);
assert(1 <= j);
assert(1 <= trials);


srand48(time(NULL));
long long data = 0;

for(int tind = 0; tind < trials; tind++){
int seen = 0; int steps = 0;
int dist[n];

for(int cind = 0; cind < n; cind++){
dist[cind] = 0;

}

while(seen < n){
int coupon = drand48() * (double)n;

steps++;

if(dist[coupon] == j-1)
seen++;
dist[coupon]++;

}

data += steps;
}

long double expt = (long double)data/(long double)trials;
printf("[n = %d, j = %d, trials = %d]: %Le\n",
n, j, trials, expt);

exit(0);

}


Proof of inner sum for $j=2.$
Setting $j=2$ we have to show that



$$n-k = \sum_{p=0}^{n-1-k} {n-1-k\choose p}
\frac{1}{(n-k)^p} (p+1)!.$$



This is




$$\frac{1}{(n-1-k)!} =
\sum_{p=0}^{n-1-k} \frac{1}{(n-1-k-p)!} \frac{p+1}{(n-k)^{p+1}}.$$



Re-writing as follows



$$\frac{1}{m!} =
\sum_{p=0}^{m} \frac{1}{(m-p)!} \frac{p+1}{(m+1)^{p+1}}.$$



and introducing




$$\frac{1}{(m-p)!} =
\frac{1}{2\pi i}
\int_{|w|= \gamma}
\frac{1}{w^{m-p+1}} \exp(w) \; dw$$



we obtain for the sum (the integral vanishes nicely when $p\gt m$ so
we may extend $p$ to infinity)



$$\frac{1}{2\pi i}

\int_{|w|= \gamma}
\frac{1}{w^{m+1}} \exp(w) \frac{1}{m+1}
\sum_{p\ge 0} \frac{p+1}{(m+1)^p} w^p
\; dw
\\ = \frac{1}{m+1} \frac{1}{2\pi i}
\int_{|w|= \gamma}
\frac{1}{w^{m+1}} \exp(w)
\frac{1}{(1-w/(m+1))^2}
\; dw.$$




We now use the fact that residues sum to zero and the poles are at
$w=0$, $w=m+1$ and $w=\infty.$ We get for the residue at infinity



$$-\frac{1}{m+1}
\mathrm{Res}_{w=0} \frac{1}{w^2} w^{m+1} \exp(1/w)
\frac{1}{(1-1/w/(m+1))^2}
\\ = - \frac{1}{m+1} \mathrm{Res}_{w=0} w^{m+1} \exp(1/w)
\frac{1}{(w-1/(m+1))^2}
\\ = - (m+1) \mathrm{Res}_{w=0} w^{m+1} \exp(1/w)
\frac{1}{(1-w(m+1))^2}

\\ = -(m+1) [w^{-(m+2)}] \exp(1/w)
\frac{1}{(1-w(m+1))^2}.$$



Extracting coefficients we find



$$-(m+1)\sum_{q\ge 0} \frac{1}{(q+m+2)!} (q+1) (m+1)^q.$$



This is



$$-(m+1) \left(\sum_{q\ge 0} \frac{1}{(q+m+1)!} (m+1)^q

- \sum_{q\ge 0} \frac{1}{(q+m+2)!} (m+1)^{q+1}\right)
\\ = -(m+1) \left(\sum_{q\ge 0} \frac{1}{(q+m+1)!} (m+1)^q
- \sum_{q\ge 1} \frac{1}{(q+m+1)!} (m+1)^{q}\right)
\\ = -(m+1) \frac{(m+1)^0}{(m+1)!} = - \frac{1}{m!}.$$



We thus have the claim if we can show the residue at $w=m+1$ is zero.
We use



$$(m+1)\frac{1}{2\pi i}
\int_{|w|= \gamma}

\frac{1}{w^{m+1}} \exp(w)
\frac{1}{(w-(m+1))^2}
\; dw.$$



and observe that



$$\left.\left(\frac{1}{w^{m+1}} \exp(w) \right)'\right|_{w=m+1}
\\= \left.\left( -(m+1)\frac{1}{w^{m+2}} \exp(w)
+ \frac{1}{w^{m+1}} \exp(w) \right)\right|_{w=m+1}
\\ = \exp(m+1)

\left(-(m+1)\frac{1}{(m+1)^{m+2}}+\frac{1}{(m+1)^{m+1}}\right)
= 0$$



as required. This concludes the computation.


No comments:

Post a Comment

real analysis - How to find $lim_{hrightarrow 0}frac{sin(ha)}{h}$

How to find $\lim_{h\rightarrow 0}\frac{\sin(ha)}{h}$ without lhopital rule? I know when I use lhopital I easy get $$ \lim_{h\rightarrow 0}...