
probability - Summation Rules



I'm reading Judea Pearl's Causality textbook, and on page 4 he has the following illustration.




If we wish to calculate the probability that the outcome $X$ of the first die will be greater than the outcome $Y$ of the second, we can condition the event $A: X > Y$ on all possible values of $X$ and obtain



$$P(A) = \sum_{i=1}^6 P(Y < X | X = i) P(X = i)$$

$$ = \sum_{i=1}^6 P(Y < i)\frac{1}{6} $$
$$ = \sum_{i=1}^6 \sum_{j=1}^{i-1} P(Y = j)\frac{1}{6}$$
$$ = \frac{1}{6} \sum_{i=2}^6 \frac{i - 1}{6} = \frac{5}{12}$$




My question comes from the fact that I'm just learning how to work with summation rules. I understand this problem intuitively by imagining the dice, and I can easily verify the $\frac{5}{12}$ result by counting, but what I'm not sure about is whether each formulation above is just a restatement based on intuition, or whether the equations are actually being transformed stepwise using rules, like in algebra.
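For example, the counting check is easy to do in Python (my own scratch work, not anything from the book), using exact fractions so nothing is lost to rounding:

```python
from fractions import Fraction
from itertools import product

# Enumerate the 36 equally likely (X, Y) outcomes and count those with X > Y.
favorable = sum(1 for x, y in product(range(1, 7), repeat=2) if x > y)
print(Fraction(favorable, 36))  # 5/12
```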



Going from step 1 to step 2 seems pretty clear: $P(Y < X \mid X = i)$ is the same as $P(Y < i)$ once $X$ is fixed at $i$, and $P(X = i) = \frac{1}{6}$.

Beyond that I'm kind of lost - I'm not sure why or how we went from step 2 to step 3. I understand that it's summing the probabilities of every case where the second die equals a value less than the first die (there are $i - 1$ such values), but I'm not sure why it was necessary to restate it this way. Isn't step 4 just as derivable from step 2 as from step 3?




I recognize pulling the constant out in front of the summation in step 4, but I'm not sure I understand the thought process behind the rest. Apparently it should start at index 2 since the index-1 term would be 0 - is that common practice? As for what led to the $\frac{i-1}{6}$ - is that just intuition (I understand it intuitively: for each possible value of the first die, it is the sum of the probabilities of each smaller value of the second die), or is it also some transformation rule written down somewhere?



Then, using the usual pairing trick, I can sort of imagine adding the terms for $i = \{2, 6\}$, $i = \{3, 5\}$, $i = \{4, 4\}$, $i = \{5, 3\}$, $i = \{6, 2\}$ and dividing by 2... using the arithmetic series formula, that would be:



$$\sum_{i=m}^n i = \frac{(n + 1 - m)(n + m)}{2}$$



$$\sum_{i=2}^6 \frac{i - 1}{6} = \frac{5\left(\frac{2 - 1}{6} + \frac{6 - 1}{6}\right)}{2} = \frac{5}{2}$$



Is that a proper thought process, to recognize it as a series and then just figure out how to apply it to a term? I don't know whether it's always proper to sub in $n$ and $m$ like that - it seems to fit awkwardly with the arithmetic series formula I found on Wikipedia.
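As a quick numeric check (my own Python again, with the substitutions done the way I guessed at above), the closed form does match the direct sum:

```python
from fractions import Fraction

m, n = 2, 6

# Direct sum of (i - 1)/6 for i = m..n
direct = sum(Fraction(i - 1, 6) for i in range(m, n + 1))

# Arithmetic series: number of terms times (first term + last term), halved
closed = (n + 1 - m) * (Fraction(m - 1, 6) + Fraction(n - 1, 6)) / 2

print(direct, closed)  # 5/2 5/2
```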




Update:



After stumbling onto "index shifting", it seems like this is an easier way to reduce $\sum_{i=2}^6 \frac{i-1}{6}$ :



$$ \sum_{i=2}^6 \frac{i-1}{6} = \frac{1}{6} \sum_{i=2}^6 (i - 1) $$
$$ = \frac{1}{6} \sum_{i=1}^5 i $$
$$ = \frac{1}{6} \left( \frac{5(5+1)}{2} \right) $$
$$ = \frac{15}{6} = \frac{5}{2} $$
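Checking the shift numerically (still just my own scratch work) gives the same value at every stage:

```python
from fractions import Fraction

original = sum(Fraction(i - 1, 6) for i in range(2, 7))  # sum_{i=2}^{6} (i-1)/6
shifted  = Fraction(1, 6) * sum(range(1, 6))             # (1/6) * sum_{i=1}^{5} i
closed   = Fraction(1, 6) * Fraction(5 * (5 + 1), 2)     # (1/6) * 5(5+1)/2

print(original, shifted, closed)  # 5/2 5/2 5/2
```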




So I'm mostly still stuck on how to get in and out of Step 3.


Answer



After spending another few hours learning this, here's my answer. It requires some index shifting and a handful of summation rules:



$$\sum_{n=s}^t C \cdot f(n) = C \cdot \sum_{n=s}^t f(n) $$



$$\sum_{i=m}^n 1 = n + 1 - m $$



$$\sum_{i=1}^n i = \frac{n(n + 1)}{2}$$
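None of these needs to be taken on faith; a small spot check (values chosen arbitrarily by me) confirms them:

```python
# Spot-check the three summation rules on small, arbitrary values.
s, t, C = 3, 8, 7
assert sum(C * n for n in range(s, t + 1)) == C * sum(range(s, t + 1))  # pull a constant out

m, n = 4, 10
assert sum(1 for i in range(m, n + 1)) == n + 1 - m                     # sum of ones

n = 25
assert sum(range(1, n + 1)) == n * (n + 1) // 2                         # triangular number
```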




It also requires the Law of Total Probability, which relates a marginal probability to conditional probabilities. Marginalizing over the partition $B_1, \dots, B_n$ (to get the marginal probability of $A$):



$$P(A) = \sum_{i=1}^n P(A | B_i) P(B_i)$$



So, going in order. We're trying to find the total probability that one die comes up less than the other, not just the probability when the first die's value is known, so the Law of Total Probability applies. But we're not asking for the probability of a particular second value; we're asking for the probability that the second value is less than the first. So we can restate the problem as:



$$P(A) = \sum_{i=1}^6 P(Y < X | X = B_i) P (X = B_i) $$



Note that in the case of a die, $B_i = i$, so this can be restated as





$$P(A) = \sum_{i=1}^6 P(Y < X | X = i) P ( X = i) $$




for the first statement. For a fair die, the probability of rolling any particular number is $\frac{1}{6}$, and because the two dice are independent, $P(Y < X \mid X = i)$ is just $P(Y < i)$. So by substitution:




$$P(A) = \sum_{i=1}^6 P(Y < i) \frac{1}{6} $$





for the second statement.
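That last substitution is easy to gloss over: replacing $P(Y < X \mid X = i)$ with $P(Y < i)$ relies on the two dice being independent. A brute-force check (my own) confirms the conditional and unconditional probabilities agree for every $i$:

```python
from fractions import Fraction
from itertools import product

pairs = list(product(range(1, 7), repeat=2))  # all 36 equally likely (x, y) outcomes

for i in range(1, 7):
    given = [(x, y) for x, y in pairs if x == i]                      # outcomes with X = i
    cond = Fraction(sum(1 for x, y in given if y < x), len(given))    # P(Y < X | X = i)
    marg = Fraction(sum(1 for y in range(1, 7) if y < i), 6)          # P(Y < i)
    assert cond == marg
```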



$P(Y < i)$ still covers several possible outcomes of $Y$, so we decompose it one more time. One way to say that a value is less than $i$ is to say that it equals some $j$, for one of the values $j$ below $i$. The event $Y < i$ is therefore the disjoint union of the events $Y = j$ for $j = 1, \dots, i - 1$, and its probability is the sum of their probabilities. Since the outer summation fixes $i$ at a known (constant) value, we can illustrate with the constant set to 4:

$$P(Y < 4) = P(Y = 1) + P(Y = 2) + P(Y = 3) = \sum_{j=1}^{4-1} P(Y = j)$$

Swapping $i$ back in for the constant, we get




$$P(Y < i) = \sum_{j=1}^{i-1} P(Y = j)$$



Substituting this into the larger equation:




$$P(A) = \sum_{i=1}^6 \sum_{j=1}^{i-1} P(Y = j) \frac{1}{6}$$




for the third statement.
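To confirm that nothing changed in value between the second and third statements (again, my own check), both expressions evaluate to the same number:

```python
from fractions import Fraction

def p_y_lt(i):
    """P(Y < i) for one fair die, by direct counting."""
    return Fraction(sum(1 for y in range(1, 7) if y < i), 6)

statement2 = sum(p_y_lt(i) * Fraction(1, 6) for i in range(1, 7))
statement3 = sum(
    sum(Fraction(1, 6) for j in range(1, i)) * Fraction(1, 6)  # inner sum of P(Y = j)
    for i in range(1, 7)
)
print(statement2, statement3)  # 5/12 5/12
```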




Then, taking a simplified version of one of the previously listed summation rules, we know that



$$\sum_{i=1}^n 1 = n$$



And we know that for a fair die, $P(Y = j) = \frac{1}{6}$ for every $j$. So, substituting, pulling the constant out, and applying the rule above:



$$\sum_{j=1}^{i-1} P(Y = j) = \sum_{j=1}^{i-1} \frac{1}{6} = \frac{1}{6} \sum_{j=1}^{i-1} 1 = \frac{i - 1}{6}$$



Substituting into the larger equation:




$$P(A) = \sum_{i=1}^6 \frac{i-1}{6} \left(\frac{1}{6}\right)$$



We can pull the $\frac{1}{6}$ constant out front. Also note that the $i = 1$ term is $0$, so for convenience we can start the index at $2$:




$$P(A) = \frac{1}{6} \sum_{i=2}^6 \frac{i-1}{6}$$




for the fourth statement.




Finally, we can then pull another constant out:



$$P(A) = \frac{1}{6} \left(\frac{1}{6}\right) \sum_{i=2}^6 (i-1) $$



Shift the index:



$$P(A) = \frac{1}{36} \sum_{i=1}^5 i$$



Substitute in the arithmetic series:




$$P(A) = \left(\frac{1}{36}\right) \frac{5(5+1)}{2}$$



And solve:




$$P(A) = \frac{5}{12}$$
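Finally, evaluating the fourth statement and each rewriting after it (one last sanity check of my own) confirms that every transformation preserved the value:

```python
from fractions import Fraction

statement4 = Fraction(1, 6) * sum(Fraction(i - 1, 6) for i in range(2, 7))
pulled_out = Fraction(1, 6) * Fraction(1, 6) * sum(i - 1 for i in range(2, 7))
shifted    = Fraction(1, 36) * sum(range(1, 6))
series     = Fraction(1, 36) * Fraction(5 * (5 + 1), 2)

print(statement4, pulled_out, shifted, series)  # all print 5/12
```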


