Consider the usual generalization of the coupon collector problem where $m$ copies of each coupon need to be collected. Let $T_m$ be the first time $m$ copies of each coupon are collected. Donald J. Newman and Lawrence Shepp showed that
$$\mathbf{E} (T_{m})=n\log n+(m-1)n\log \log n+O(n),\ {\text{as}}\ n\to \infty $$
Then Erdős and Rényi showed that
$$\displaystyle \operatorname {P} {\bigl (}T_{m}<n\log n+(m-1)n\log \log n+cn{\bigr )}\to e^{-e^{-c}/(m-1)!},\ {\text{as}}\ n\to \infty $$
But these statements are asymptotic. Are there known finite sample (upper and lower) bounds on $T_m$?
Answer
What follows is a minor contribution where we compute a formula for
the expectation for the case where $j$ instances of each of $n$ types
of coupons must be seen. Using the notation from the following MSE
link we have from
first principles that
$$P[T = m] = \frac{1}{n^m}\times {n\choose 1}\times
(m-1)! [z^{m-1}]
\left(\exp(z) - \sum_{q=0}^{j-1} \frac{z^q}{q!}\right)^{n-1}
\frac{z^{j-1}}{(j-1)!}
\\ = \frac{n}{n^m}\times
{m-1\choose j-1} (m-j)! [z^{m-j}]
\left(\exp(z) - \sum_{q=0}^{j-1} \frac{z^q}{q!}\right)^{n-1}.$$
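Before verifying this analytically, here is a minimal numerical sanity check in C: a sketch only, assuming the small test values $n=4$, $j=2$ and truncating the series at $m\le 200$, which evaluates $P[T=m]$ directly from the formula above by multiplying out the truncated power series, and accumulates the total probability mass together with the partial expectation (the mass should come out very close to one).
#include <stdio.h>

#define MMAX 200                 /* assumed truncation point for the sum over m */

int main(void)
{
    int n = 4, j = 2;            /* assumed small test values, not from the post */
    static long double f[MMAX + 1], g[MMAX + 1], h[MMAX + 1];

    /* coefficients of exp(z) - sum_{q=0}^{j-1} z^q/q! */
    long double qfact = 1.0L;
    for (int q = 0; q <= MMAX; q++) {
        if (q > 0) qfact *= q;
        f[q] = (q < j) ? 0.0L : 1.0L / qfact;
    }

    long double mass = 0.0L, expt = 0.0L;
    for (int m = j; m <= MMAX; m++) {
        int deg = m - j;

        /* g(z) = (exp(z) - sum_{q<j} z^q/q!)^(n-1), truncated at degree deg */
        for (int i = 0; i <= deg; i++) g[i] = (i == 0) ? 1.0L : 0.0L;
        for (int rep = 0; rep < n - 1; rep++) {
            for (int i = 0; i <= deg; i++) h[i] = 0.0L;
            for (int a = 0; a <= deg; a++)
                for (int b = 0; a + b <= deg; b++)
                    h[a + b] += g[a] * f[b];
            for (int i = 0; i <= deg; i++) g[i] = h[i];
        }

        /* P[T=m] = n * (m-1)!/(j-1)! * [z^(m-j)] g(z) / n^m */
        long double p = (long double) n * g[deg];
        for (int i = 1; i <= m - 1; i++) p *= i;   /* times (m-1)!      */
        for (int i = 1; i <= j - 1; i++) p /= i;   /* divided by (j-1)! */
        for (int i = 0; i < m; i++) p /= n;        /* divided by n^m    */

        mass += p;
        expt += m * p;
    }

    printf("n = %d, j = %d, m <= %d: total mass %.12Lf, partial E[T] %.10Lf\n",
           n, j, MMAX, mass, expt);
    return 0;
}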
We verify that this is a probability distribution. The goal here
is to find a closed form for the infinite series in $m$ so that its
value may be calculated rather than approximated. Expanding the power
we find
$$\sum_{m\ge j} P[T=m] =
\frac{n}{n^j} \sum_{m\ge 0} \frac{1}{n^m} \times
{m+j-1\choose j-1} m! [z^m]
\sum_{k=0}^{n-1}
{n-1\choose k} \exp(kz) \\ \times (-1)^{n-1-k}
\left(\sum_{q=0}^{j-1} \frac{z^q}{q!}\right)^{n-1-k}
\\ = \frac{n}{n^j} \sum_{m\ge 0} \frac{1}{n^m} \times
{m+j-1\choose j-1} m!
\sum_{k=0}^{n-1}
{n-1\choose k}
\sum_{p=0}^m \frac{k^{m-p}}{(m-p)!} \\ \times (-1)^{n-1-k}
[z^p] \left(\sum_{q=0}^{j-1} \frac{z^q}{q!}\right)^{n-1-k}
\\ = \frac{n}{n^j} \sum_{k=0}^{n-1}
{n-1\choose k} (-1)^{n-1-k}
\sum_{p\ge 0}
[z^p] \left(\sum_{q=0}^{j-1} \frac{z^q}{q!}\right)^{n-1-k}
\\ \times \sum_{m\ge p}\frac{1}{n^m} \times
{m+j-1\choose j-1} m! \times \frac{k^{m-p}}{(m-p)!}
\\ = \frac{n}{n^j} \sum_{k=0}^{n-1}
{n-1\choose k} (-1)^{n-1-k}
\sum_{p\ge 0}
n^{-p} [z^p] \left(\sum_{q=0}^{j-1} \frac{z^q}{q!}\right)^{n-1-k}
\\ \times \sum_{m\ge 0}\frac{1}{n^m} \times
{m+p+j-1\choose j-1} (m+p)! \times \frac{k^m}{m!}$$
The inner sum is, by the binomial series
$\sum_{m\ge 0} {m+p+j-1\choose m} x^m = (1-x)^{-(p+j)}$ evaluated at $x = k/n \lt 1,$
$$\frac{1}{(j-1)!} \sum_{m\ge 0} (k/n)^m \frac{(m+p+j-1)!}{m!}
= \frac{(p+j-1)!}{(j-1)!} \frac{1}{(1-k/n)^{p+j}}$$
and with $P=(n-1-k)(j-1)$ we obtain
$$\bbox[5px,border:2px solid #00A000]{
n \sum_{k=0}^{n-1}
{n-1\choose k} \frac{(-1)^{n-1-k}}{(n-k)^j}
\times \sum_{p=0}^P
\frac{1}{(n-k)^p} \frac{(p+j-1)!}{(j-1)!}
[z^p] \left(\sum_{q=0}^{j-1} \frac{z^q}{q!}\right)^{n-1-k}.}$$
We claim that the inner sum is $(n-k)^{j-1}$; a proof for $j=2$ appears at the
end of this document. With this the sum reduces to
$$n \sum_{k=0}^{n-1}
{n-1\choose k} \frac{(-1)^{n-1-k}}{n-k}
\\ = - \sum_{k=0}^{n-1} {n\choose k} (-1)^{n-k}
= 1 - \sum_{k=0}^{n} {n\choose k} (-1)^{n-k} = 1$$
and we see that we indeed have a probability distribution.
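The claim is proved below only for $j=2$; here is a minimal C sketch, assuming the test values $n=7$ and $j=4$ (not taken from the text), which checks the inner-sum identity $\sum_{p=0}^{P} (n-k)^{-p} \frac{(p+j-1)!}{(j-1)!} [z^p] (\sum_{q=0}^{j-1} z^q/q!)^{n-1-k} = (n-k)^{j-1}$ numerically for every $k$ in range.
#include <stdio.h>

#define NMAX 12
#define JMAX 8
#define DMAX ((NMAX - 1) * (JMAX - 1))

int main(void)
{
    int n = 7, j = 4;                        /* assumed small test values */
    long double t[JMAX], g[DMAX + 1], h[DMAX + 1];

    /* truncated exponential t(z) = sum_{q=0}^{j-1} z^q/q! */
    long double qfact = 1.0L;
    for (int q = 0; q < j; q++) {
        if (q > 0) qfact *= q;
        t[q] = 1.0L / qfact;
    }

    for (int k = 0; k <= n - 1; k++) {
        int P = (n - 1 - k) * (j - 1);

        /* g(z) = t(z)^(n-1-k), a polynomial of degree P */
        for (int i = 0; i <= P; i++) g[i] = (i == 0) ? 1.0L : 0.0L;
        for (int rep = 0; rep < n - 1 - k; rep++) {
            for (int i = 0; i <= P; i++) h[i] = 0.0L;
            for (int a = 0; a <= P; a++)
                for (int b = 0; b < j && a + b <= P; b++)
                    h[a + b] += g[a] * t[b];
            for (int i = 0; i <= P; i++) g[i] = h[i];
        }

        /* inner sum of the boxed formula */
        long double s = 0.0L;
        for (int p = 0; p <= P; p++) {
            long double term = g[p];
            for (int i = j; i <= p + j - 1; i++) term *= i;  /* (p+j-1)!/(j-1)! */
            for (int i = 0; i < p; i++) term /= (n - k);     /* 1/(n-k)^p       */
            s += term;
        }

        long double rhs = 1.0L;
        for (int i = 0; i < j - 1; i++) rhs *= (n - k);      /* (n-k)^(j-1)     */

        printf("k = %2d: inner sum = %.9Lf vs (n-k)^(j-1) = %.9Lf\n", k, s, rhs);
    }
    return 0;
}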
Continuing with the expectation and recapitulating the earlier
computation, we find
$$E[T] = \sum_{m\ge j} m P[T = m] =
\frac{n}{n^j} \sum_{k=0}^{n-1}
{n-1\choose k} (-1)^{n-1-k}
\sum_{p\ge 0}
n^{-p} [z^p] \left(\sum_{q=0}^{j-1} \frac{z^q}{q!}\right)^{n-1-k}
\\ \times \sum_{m\ge 0}\frac{1}{n^m} \times
{m+p+j-1\choose j-1} (m+p)! (m+p+j) \times \frac{k^m}{m!}$$
The inner sum has two pieces; the first is
$$\frac{1}{(j-1)!} \sum_{m\ge 1} (k/n)^m \frac{(m+p+j-1)!}{(m-1)!}
= \frac{1}{(j-1)!} \frac{k}{n}
\sum_{m\ge 0} (k/n)^m \frac{(m+p+j)!}{m!}
\\ = \frac{k}{n} \frac{(p+j)!}{(j-1)!} \frac{1}{(1-k/n)^{p+j+1}}
= \frac{k}{n-k} \frac{(p+j)!}{(j-1)!} \frac{1}{(1-k/n)^{p+j}}$$
and the second was evaluated when we summed the probabilities; it gives
$$(p+j) \frac{(p+j-1)!}{(j-1)!} \frac{1}{(1-k/n)^{p+j}}
= \frac{(p+j)!}{(j-1)!} \frac{1}{(1-k/n)^{p+j}}.$$
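Adding the two pieces gives
$$\left(\frac{k}{n-k} + 1\right) \frac{(p+j)!}{(j-1)!} \frac{1}{(1-k/n)^{p+j}}
= \frac{n}{n-k} \frac{(p+j)!}{(j-1)!} \frac{n^{p+j}}{(n-k)^{p+j}},$$
which accounts for the factors $n^2/(n-k)^{j+1}$ and $1/(n-k)^p$ below.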
Substituting these into the outer sum we thus obtain
$$\bbox[5px,border:2px solid #00A000]{E[T] = n^2 \sum_{k=0}^{n-1}
{n-1\choose k} \frac{(-1)^{n-1-k}}{(n-k)^{j+1}}
\times \sum_{p=0}^P
\frac{1}{(n-k)^p} \frac{(p+j)!}{(j-1)!}
[z^p] \left(\sum_{q=0}^{j-1} \frac{z^q}{q!}\right)^{n-1-k}.}$$
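As a cross-check, for $j=1$ the inner polynomial is identically $1$, so only
$p=0$ survives and the inner sum equals $1!/0! = 1$; writing $i = n-k$ and
using $n{n-1\choose i-1} = i{n\choose i}$ we recover the classical coupon
collector expectation
$$E[T] = n^2 \sum_{k=0}^{n-1} {n-1\choose k} \frac{(-1)^{n-1-k}}{(n-k)^2}
= n \sum_{i=1}^{n} {n\choose i} \frac{(-1)^{i-1}}{i} = n H_n.$$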
There is a very basic program, written in C, which confirmed this formula to
several digits of precision by simulation; it goes as follows.
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <assert.h>

int main(int argc, char **argv)
{
    int n = 6, j = 3, trials = 1000;

    if(argc >= 2){
        n = atoi(argv[1]);
    }
    if(argc >= 3){
        j = atoi(argv[2]);
    }
    if(argc >= 4){
        trials = atoi(argv[3]);
    }

    assert(1 <= n);
    assert(1 <= j);
    assert(1 <= trials);

    srand48(time(NULL));

    long long data = 0;
    for(int tind = 0; tind < trials; tind++){
        int seen = 0; int steps = 0;
        int dist[n];

        for(int cind = 0; cind < n; cind++){
            dist[cind] = 0;
        }

        /* draw coupons uniformly until each of the n types has been seen j times */
        while(seen < n){
            int coupon = drand48() * (double)n;

            steps++;
            if(dist[coupon] == j-1)
                seen++;
            dist[coupon]++;
        }

        data += steps;
    }

    long double expt = (long double)data/(long double)trials;
    printf("[n = %d, j = %d, trials = %d]: %Le\n",
           n, j, trials, expt);

    exit(0);
}
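A companion sketch, again only a sketch and assuming the same defaults $n=6$ and $j=3$ as the simulation, evaluates the boxed closed form for $E[T]$ in long double arithmetic; its output should agree with the simulated value to within sampling error.
#include <stdio.h>

#define NMAX 16
#define JMAX 8
#define DMAX ((NMAX - 1) * (JMAX - 1))

int main(void)
{
    int n = 6, j = 3;                        /* assumed defaults, as in the simulation */
    long double t[JMAX], g[DMAX + 1], h[DMAX + 1];

    /* truncated exponential t(z) = sum_{q=0}^{j-1} z^q/q! */
    long double qfact = 1.0L;
    for (int q = 0; q < j; q++) {
        if (q > 0) qfact *= q;
        t[q] = 1.0L / qfact;
    }

    long double expt = 0.0L;
    long double binom = 1.0L;                /* C(n-1, k), updated per k */
    for (int k = 0; k <= n - 1; k++) {
        int P = (n - 1 - k) * (j - 1);

        /* g(z) = t(z)^(n-1-k), a polynomial of degree P */
        for (int i = 0; i <= P; i++) g[i] = (i == 0) ? 1.0L : 0.0L;
        for (int rep = 0; rep < n - 1 - k; rep++) {
            for (int i = 0; i <= P; i++) h[i] = 0.0L;
            for (int a = 0; a <= P; a++)
                for (int b = 0; b < j && a + b <= P; b++)
                    h[a + b] += g[a] * t[b];
            for (int i = 0; i <= P; i++) g[i] = h[i];
        }

        /* inner sum: sum_p (p+j)!/((j-1)! (n-k)^p) [z^p] g(z) */
        long double inner = 0.0L;
        for (int p = 0; p <= P; p++) {
            long double term = g[p];
            for (int i = j; i <= p + j; i++) term *= i;      /* (p+j)!/(j-1)! */
            for (int i = 0; i < p; i++) term /= (n - k);     /* 1/(n-k)^p     */
            inner += term;
        }

        /* outer factor: n^2 C(n-1,k) (-1)^(n-1-k) / (n-k)^(j+1) */
        long double outer = (long double) n * n * binom;
        for (int i = 0; i <= j; i++) outer /= (n - k);
        if ((n - 1 - k) % 2 != 0) outer = -outer;

        expt += outer * inner;
        binom = binom * (n - 1 - k) / (k + 1);               /* advance to C(n-1,k+1) */
    }

    printf("[n = %d, j = %d]: E[T] = %.10Lf\n", n, j, expt);
    return 0;
}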
Proof of the inner sum identity for $j=2.$
Setting $j=2$ the truncated exponential becomes $1+z$, so that
$[z^p] (1+z)^{n-1-k} = {n-1-k\choose p}$ and we have to show that
$$n-k = \sum_{p=0}^{n-1-k} {n-1-k\choose p}
\frac{1}{(n-k)^p} (p+1)!.$$
This is
$$\frac{1}{(n-1-k)!} =
\sum_{p=0}^{n-1-k} \frac{1}{(n-1-k-p)!} \frac{p+1}{(n-k)^{p+1}}.$$
Re-writing as follows
$$\frac{1}{m!} =
\sum_{p=0}^{m} \frac{1}{(m-p)!} \frac{p+1}{(m+1)^{p+1}}.$$
and introducing
$$\frac{1}{(m-p)!} =
\frac{1}{2\pi i}
\int_{|w|= \gamma}
\frac{1}{w^{m-p+1}} \exp(w) \; dw$$
we obtain for the sum (the integral vanishes nicely when $p\gt m$ so
we may extend $p$ to infinity)
$$\frac{1}{2\pi i}
\int_{|w|= \gamma}
\frac{1}{w^{m+1}} \exp(w) \frac{1}{m+1}
\sum_{p\ge 0} \frac{p+1}{(m+1)^p} w^p
\; dw
\\ = \frac{1}{m+1} \frac{1}{2\pi i}
\int_{|w|= \gamma}
\frac{1}{w^{m+1}} \exp(w)
\frac{1}{(1-w/(m+1))^2}
\; dw.$$
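Recall the standard formula for the residue at infinity,
$$\mathrm{Res}_{w=\infty} f(w) =
- \mathrm{Res}_{w=0} \left(\frac{1}{w^2} f\left(\frac{1}{w}\right)\right),$$
which is the source of the factor $\frac{1}{w^2}$ in the next computation.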
We now use the fact that residues sum to zero and the poles are at
$w=0$, $w=m+1$ and $w=\infty.$ We get for the residue at infinity
$$-\frac{1}{m+1}
\mathrm{Res}_{w=0} \frac{1}{w^2} w^{m+1} \exp(1/w)
\frac{1}{(1-1/w/(m+1))^2}
\\ = - \frac{1}{m+1} \mathrm{Res}_{w=0} w^{m+1} \exp(1/w)
\frac{1}{(w-1/(m+1))^2}
\\ = - (m+1) \mathrm{Res}_{w=0} w^{m+1} \exp(1/w)
\frac{1}{(1-w(m+1))^2}
\\ = -(m+1) [w^{-(m+2)}] \exp(1/w)
\frac{1}{(1-w(m+1))^2}.$$
Extracting coefficients we find
$$-(m+1)\sum_{q\ge 0} \frac{1}{(q+m+2)!} (q+1) (m+1)^q.$$
This is
$$-(m+1) \left(\sum_{q\ge 0} \frac{1}{(q+m+1)!} (m+1)^q
- \sum_{q\ge 0} \frac{1}{(q+m+2)!} (m+1)^{q+1}\right)
\\ = -(m+1) \left(\sum_{q\ge 0} \frac{1}{(q+m+1)!} (m+1)^q
- \sum_{q\ge 1} \frac{1}{(q+m+1)!} (m+1)^{q}\right)
\\ = -(m+1) \frac{(m+1)^0}{(m+1)!} = - \frac{1}{m!}.$$
We thus have the claim if we can show the residue at $w=m+1$ is zero.
We use
$$(m+1)\frac{1}{2\pi i}
\int_{|w|= \gamma}
\frac{1}{w^{m+1}} \exp(w)
\frac{1}{(w-(m+1))^2}
\; dw.$$
and observe that
$$\left.\left(\frac{1}{w^{m+1}} \exp(w) \right)'\right|_{w=m+1}
\\= \left.\left( -(m+1)\frac{1}{w^{m+2}} \exp(w)
+ \frac{1}{w^{m+1}} \exp(w) \right)\right|_{w=m+1}
\\ = \exp(m+1)
\left(-(m+1)\frac{1}{(m+1)^{m+2}}+\frac{1}{(m+1)^{m+1}}\right)
= 0$$
as required. This concludes the computation.
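As a quick sanity check of the identity
$\frac{1}{m!} = \sum_{p=0}^{m} \frac{1}{(m-p)!} \frac{p+1}{(m+1)^{p+1}}$,
take $m=2$:
$$\frac{1}{2!\cdot 3} + \frac{2}{1!\cdot 3^2} + \frac{3}{0!\cdot 3^3}
= \frac{1}{6} + \frac{2}{9} + \frac{1}{9} = \frac{1}{2} = \frac{1}{2!}.$$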