Monday, 18 May 2015

The exact probability of observing x unique elements after sampling with replacement from a set with N elements

Say I sample with replacement from a set of N unique elements s.t. elements are selected with uniform probability. If I sample with replacement M times from this set, what is the exact probability P(x) that I have observed at least x unique elements?



I believe different variants of this question have been asked on this site, however, I haven't seen a form that asks for an explicit probability P(x)?




For example, Ross Rogers asks a variant of this question here: probability distribution of coverage of a set after `X` independently, randomly selected members of the set, and Henry calculates the mean number of unique elements, x, and variance for the coverage of a set of N elements after sampling with replacement M times (we switch M and x here to fit with our variable specification).



Reproducing Henry's derivation here:



Mean[x] = N(1(11N)M)



Var[x] = N(11N)M+N2(11N)(12N)MN2(11N)2M



(I'll note that I don't quite understand the derivation for Var[x]...)




How can we translate this variance into our P(x)?

No comments:

Post a Comment

real analysis - How to find limhrightarrow0fracsin(ha)h

How to find lim without lhopital rule? I know when I use lhopital I easy get $$ \lim_{h\rightarrow 0}...