Sunday, 28 February 2016

probability - A multinomial problem (balls, bins, etc.)

Consider the well known multinomial setting: there are L balls, thrown at random at n bins so that the probability that a ball falls in bin i is pi, independent of the other balls (the pi’s are all positive and sum to 1).



Let Xi be the number of balls in the ith bin. It is straightforward to show that for any k distinct indices i1,i2,,ik, the events {Xi1>0},{Xi2>0},,{Xik>0} are negatively dependent in the sense that



P(Xi1>0,Xi2>0,,Xik>0)<P(Xi1>0)P(Xi2>0)P(Xik>0)




The proof is by induction over k: for k = 2, the probability on the LHS is 1 – [(1 - p_{i_1})^L + (1 - p_{i_2})^L - (1 - p_{i_1} - p_{i_2})^L], and on the RHS it’s [1 - (1 – p_{i_1})^L] [1 - (1 – p_{i_2})^L]. Elementary manipulations show that the inequality holds, and the induction step is easily proven using the multiplication rule P(A \cap B) = P(A)P(B | A).



This negative dependence is also intuitively clear: if we know that there is at least one ball in bin i, there is at least one ball that will surely not fall into bin j, so the probability of bin j being non-empty decreases.



So that was easy. However, for proving a certain bound in my research, I need to do something similar, but for random k indices: suppose that after the balls were thrown into the bins, somebody comes and chooses randomly k distinct bins (so that each set of k bins has the same probability to be chosen, namely, 1/(n choose k)). Let i_1, i_2, \ldots, i_k be the chosen indices, and again, I need to show that the inequality (below the second paragraph above) holds.



I thought to repeat the outline of the previous proof, but had a problem with proving the k = 2 case (the induction step is no problem). To compute each of the sides of the inequality I conditioned on the choice of the two indices (law of total probability), but even though many terms got cancelled, I couldn’t prove the inequality.



I am quite confident that the inequality holds, both by intuition (basically the same argument as in the non-random indices case) and because of numerical calculations.




Any ideas what to do here? I’d be grateful for any help.

No comments:

Post a Comment

real analysis - How to find lim_{hrightarrow 0}frac{sin(ha)}{h}

How to find \lim_{h\rightarrow 0}\frac{\sin(ha)}{h} without lhopital rule? I know when I use lhopital I easy get $$ \lim_{h\rightarrow 0}...