Thursday, 9 April 2015

Different solution of probability problem from textbook




It is the problem 1.2.3 of Karlin's book An introduction to stochastic modeling:




A population having N distinct elements is sampled with replacement. Because of repetitions a sample of size r can have fewer than r different elements. Let Sr be the sample size necessary to have r different elements. Show that



E[Sr]=N(1N+1N1+...+1Nr+1)




But I have a different solution. I did




Pr



\Pr[S_r=r+1]=\frac{(N)_r}{N^r}\frac rN



And in general



\Pr[S_r=r+k]=\frac{(N)_r}{N^r}\left(\frac rN\right)^k



Now




\Bbb E[S_r]=\sum_{k\ge0}(r+k)\Pr[S_r=r+k]



Doing some algebra I get the solution



\Bbb E[S_r]=\sum_{k\ge0}(r+k)\Pr[S_r=r+k]=\frac{(N)_r}{N^r}\frac{(N-r+1)Nr}{(N-r)^2}



This expression dont seem equivalent to the textbook solution. Can someone show me my mistake or the way to get the solution of the book? Thank you in advance.







EDITION: I see a weird mistake. I assumed that for S_r>r the others values may repeat... but this is not true, they can be different too. Anyway this dont seem going in the good direction cause it complicate a lot the expression. Anyway I will see what I get fixing that.


Answer



Re-thinking the problem and seeing the "population" and "sample" as a dice of N sides and S_r throws respectively I get the correct answer.



Then the reformulated problem is: what is the expected number of throws to get r different results for a dice of N sides?



This is some kind of Markov chain that I solved some time ago. We thrown a die a undetermined number of times (the quantity doesnt matter) and we have seen d different faces of the die, so for the next throw the probability to see a new face (the probability to change from state d to state d+1) is p=(N-d)/N, and the probability that this doesnt happen is q=1-p.



Hence the expected value of the number of throws to see a new face of the die (i.e. to change from state d to state d+1) is




\Bbb E[d\to d+1]=\sum_{k\ge 1} kq^{k-1}p=p\sum_{k\ge 1}kq^{k-1}=\\=p\frac{\rm d}{{\rm d}k}\sum_{k\ge 0}q^k=p\frac{\rm d}{{\rm d}k}\frac{1}{1-q}=\frac{p}{(1-q)^2}=\frac1p=\frac{N}{N-d}



Then the expected number of throws, i.e. \Bbb E[S_r], will be the sum of the expected number of throws for each consecutive change of state



\Bbb E[S_r]=\sum_{d=0}^{r-1}\frac{N}{N-d}=N\left(\frac1N+...+\frac{1}{N-r+1}\right)=N(H_N-H_{N-r})



being H_k the k-th harmonic number.


No comments:

Post a Comment

real analysis - How to find lim_{hrightarrow 0}frac{sin(ha)}{h}

How to find \lim_{h\rightarrow 0}\frac{\sin(ha)}{h} without lhopital rule? I know when I use lhopital I easy get $$ \lim_{h\rightarrow 0}...