Sunday 24 July 2016

probability - Fundamental theorem of calculus - range instead of point by point



I read the following in a math article about continuous sample spaces:





We need to have P(Ω) = 1, i.e., P([0, T]) = 1. On the other hand, in the first experiment, all points in the interval [0, T] seem to be equiprobable. And, since the sum of the probabilities P(t) must be 1, it looks like we have arrived at an impossible situation. If P(t) is non-zero, the sum of all probabilities will be infinite; if P(t) is 0, the sum will vanish as well. The apparent paradox is resolved by pointing out that the notion of the sum of a continuum of values is commonly replaced by an integral - the concept taught at the beginning of Calculus courses. The probabilities on a continuous sample space should be defined somehow differently, and not point-by-point.
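For concreteness, here is what the passage's integral formulation looks like written out, assuming the uniform distribution on $[0, T]$ that the first experiment describes. Probabilities are assigned to ranges via the density $f(t) = 1/T$:
$$
P([a,b]) = \int_a^b \frac{1}{T}\,dt = \frac{b-a}{T}, \qquad P(\{t\}) = \frac{t-t}{T} = 0, \qquad P([0,T]) = \frac{T}{T} = 1.
$$
Every single point gets probability zero, yet the whole interval gets probability one.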




I am interested in the last few lines of the paragraph. It says that instead of assigning probabilities point by point, we shift to adding over small ranges.



Although I'm quite familiar with calculus, I never thought of this. Why can't we add point by point?



Instead of $\int f(x) \, dx$, where we add over a small range, why can't we have $\sum f(x)$, where we add point by point?



I am missing something fundamental here, and I am not able to understand this intuitively.



Answer



This is similar to what I answered in another question.



The sum and the integral are closely related; closer than the definition of an integral might make you think.
Let me start a little further back by discussing ways to measure size.



There are two very natural ways to measure the size of a set on the real line: "length" and "number of points".
For a single point the first measure gives zero and the second one gives one.
For an open interval the first measure gives the distance between the endpoints and the second one gives infinity.
A single point is negligible in the sense of length but not in the sense of amount.
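In symbols (with $\lambda$ for length and $\#$ for number of points, notation introduced here for convenience):
$$
\lambda(\{x\}) = 0, \quad \#\{x\} = 1; \qquad \lambda\big((a,b)\big) = b - a, \quad \#\big((a,b)\big) = \infty \quad (a < b).
$$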




These different ways of associating sets with sizes are called measures.
The measure corresponding to length is called the Lebesgue measure, and the one corresponding to number of points is called the counting measure.
These may or may not mean anything to you at this point, but you will encounter them later on if you continue working with real analysis.



The natural way to measure the size of an event in the theory of probability is the probability of that event.
The measure of a set corresponds to a probability, and probabilities are in fact measures if you dig a little deeper.
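For reference, these are the defining properties (Kolmogorov's axioms): a probability measure $P$ assigns each event $A \subseteq \Omega$ a number $P(A) \ge 0$ such that
$$
P(\Omega) = 1 \qquad \text{and} \qquad P\Big(\bigcup_{n=1}^\infty A_n\Big) = \sum_{n=1}^\infty P(A_n) \quad \text{for pairwise disjoint } A_1, A_2, \dots
$$
Drop the normalization $P(\Omega) = 1$, allowing an arbitrary (possibly infinite) total, and you have a general measure.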



Once you have a measure, you can integrate with respect to that measure.
The integral with respect to the Lebesgue measure is the usual integral you know (or rather an extension of it).

The integral with respect to the counting measure is the sum.
That is, for $A\subset\mathbb R$ and $f\colon A\to\mathbb R$ both
$$
\int_A f(x)\,dx
$$
and
$$
\sum_{x\in A}f(x)
$$
are integrals of the function $f$ over the set $A$.

But these integrals are taken with respect to different measures and they are very different things.
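A concrete illustration (my own example): take $A = \{1, 2, 3\}$ and $f(x) = x$. With respect to the counting measure,
$$
\sum_{x\in A} f(x) = 1 + 2 + 3 = 6,
$$
but with respect to the Lebesgue measure
$$
\int_A f(x)\,dx = 0,
$$
because a three-point set has total length zero. Same function, same set, different measures, very different answers.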



Here is a tempting argument: the interval $[0,1]$ consists of points, so its total length should be the sum of the lengths of its points.
(We can make sense of uncountable sums. Especially if all numbers are zero, this is easy: then the sum is indeed zero.)
But all points have length zero, so the length of $[0,1]$ is zero, too!
This is not a bad argument, but it turns out that lengths (and measures in general) do not have such an unrestricted additivity property.
They are only countably additive: if you take a finite or countable union of disjoint sets, the length of the union is the sum of the lengths.
(I'm ignoring technicalities related to measurability as they are beside the point.)
The set $[0,1]$ is uncountable, so this "naive geometric reasoning" fails, and it can be enlightening to figure out why.
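To see both sides of this concretely (my example): the countable decomposition $[0,1] = \{0\} \cup \bigcup_{n=1}^\infty (2^{-n}, 2^{-n+1}]$ into disjoint pieces behaves exactly as it should,
$$
\lambda([0,1]) = \lambda(\{0\}) + \sum_{n=1}^\infty \lambda\big((2^{-n}, 2^{-n+1}]\big) = 0 + \sum_{n=1}^\infty 2^{-n} = 1,
$$
whereas pushing additivity to the uncountable decomposition into single points would force the contradiction
$$
1 = \lambda([0,1]) = \sum_{x\in[0,1]} \lambda(\{x\}) = \sum_{x\in[0,1]} 0 = 0.
$$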




The remaining question is then why the Lebesgue measure was chosen and not the counting measure for probability on $[0,1]$.
Counting measures (with suitable normalization) can be used for probability; in fact, all of introductory probability on finite sample spaces is an example of this.
But there are too many points in $[0,1]$ to count: the counting measure of the interval is infinite, while the total probability should be one, and there is no way to normalize an infinite measure to one.
On the other hand, the Lebesgue measure of $[0,1]$ is one, and it matches the geometric intuition of length.
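On a finite set this normalized counting measure is exactly classical probability. For a fair die, say, $\Omega = \{1,\dots,6\}$ and
$$
P(A) = \frac{\#A}{\#\Omega} = \frac{\#A}{6}, \qquad \text{e.g.} \quad P(\{2,4,6\}) = \frac{3}{6} = \frac12.
$$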



Summing over the index set $[0,1]$ (as opposed to the usual $\mathbb N$ or $\mathbb Z$) can be defined, and it corresponds to integrating a function on $[0,1]$ with respect to the counting measure.
However, uncountable sums are somewhat ill-behaved, and things have usually gone awry somewhere when you end up needing one.
The sum $\sum_{x\in[0,1]}f(x)$ of a function $f\colon[0,1]\to[0,\infty)$ can only be finite if only countably many values of $f(x)$ are non-zero.
Put differently, a sum with uncountably many non-zero terms is necessarily infinite.
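One standard way to make the uncountable sum precise (not spelled out above) is as a supremum of finite partial sums:
$$
\sum_{x\in A} f(x) = \sup\Big\{ \sum_{x\in F} f(x) : F \subseteq A \text{ finite} \Big\}.
$$
If this supremum is finite, say equal to $S$, then for each $n$ the set $\{x : f(x) > 1/n\}$ can contain at most $nS$ points, so $\{x : f(x) > 0\}$ is a countable union of finite sets and hence countable.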

If you want to use a counting-type measure on $[0,1]$ and have total measure (probability) one, you need to choose a countable set of points in $[0,1]$ with non-zero measure and let all other points (most of them!) have measure zero.
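For instance (the points and weights here are my choice), put mass $2^{-n}$ at $x_n = 1/n$ for $n = 1, 2, \dots$ and mass zero everywhere else:
$$
P(\{x_n\}) = 2^{-n}, \qquad P([0,1]) = \sum_{n=1}^\infty 2^{-n} = 1.
$$
This is a perfectly good probability measure on $[0,1]$, but of a discrete flavor: all of the probability sits on the countable set $\{x_n\}$.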

