Tuesday, 29 January 2019

integration - Why is continuity of $X$ needed for $\int_{g^{-1}(y)}^\infty f_X(x) \, dx = 1-F_X(g^{-1}(y))$?

Let $X$ be a random variable and let $Y=g(X)$.

Define
$\tag{1} \chi = \{x: f_X(x)>0\}\quad \text{and}\quad \mathcal{Y} = \{y:y=g(x) \text{ for some } x \in \chi\}$

Define $g^{-1}(y) = \{x\in \chi:g(x) = y\}$

Define: A random variable $X$ is continuous if $F_X(x)$ is a continuous function of $x$ .

My question is: why, in the theorem below, does statement (b) require $X$ to be a continuous random variable, while statement (a) does not?

The relevant theorem is (Theorem 2.1.3 in Casella and Berger 2nd Edition)

Let $X$ have cdf $F_X(x)$ , let $Y=g(X)$ , and let $\chi$ and $\mathcal{Y}$ be defined as in (1)

(a) If $g$ is an increasing function on $\chi$ , $F_Y(y) = F_X(g^{-1}(y))$ for $y\in \mathcal{Y}$

(b) If $g$ is a decreasing function on $\chi$ and $X$ is a continuous random variable, $F_Y(y) = 1-F_X(g^{-1}(y))$ for $y\in\mathcal{Y}$

Another way of stating what I am asking is that, prior to stating this theorem, Casella and Berger state

if $g(x)$ is an increasing function, then using the fact that $F_Y(y) = \int_{x\in\chi : g(x)\leq y} f_X(x)dx$ , we can write

$F_Y(y) = \int_{x\in\chi : g(x)\leq y} f_X(x) \, dx = \int_{-\infty}^{g^{-1}(y)} f_X(x) \, dx = F_X(g^{-1}(y))$

If $g(x)$ is decreasing, then we have

$F_Y(y) = \int_{g^{-1}(y)}^\infty f_X(x) \, dx = 1-F_X(g^{-1}(y))$
"The continuity of $X$ is used to obtain the second equality."

My question (restated) is: How come, when $g(x)$ is an increasing function we do not need to use continuity of $X$ , but we do for the case when $g(x)$ is decreasing?

(A side question; I will accept an answer so long as it answers the question above): this is continuity of the random variable, but the integral uses the pdf. What is the relation between continuity of $X$ and its pdf? (Specifically, I think there may be some strangeness if $F_X$, the cdf of $X$, is continuous but not differentiable.)

What came to mind was the fundamental theorem of calculus, but I think there is a version of it that does not require continuity of $f$? Also, here it is $X$ that is continuous, if that matters; I'm not sure.

Answer

$\int_{g^{-1}(y)}^\infty f_X(x) \, dx = 1-F_X\left(g^{-1}(y)\right) \text{ ?}$
We have:

$1-F_X(g^{-1}(y)) = 1 - \Pr(X\le g^{-1}(y)) = \Pr\left(X>g^{-1}(y)\right)$
We may consider continuity of $F$ at $g^{-1}(y)$ or continuity of $F$ at points greater than $g^{-1}(y).$ Nothing about continuity at points less than $g^{-1}(y)$ can matter here.

In the first place $\Pr(a< X < b) = \int_a^b f_X(x)\,dx\tag 1$ only if $X$ has a density function $f_X,$ and that in itself requires continuity of $F_X$ (and in fact requires something more than just continuity). If $\Pr(X = c)>0,$ where $c$ is some number between $a$ and $b,$ then line $(1)$ above is not true of any function in the role of $f.$

However, statement $(b)$ of the theorem does not mention integration of any density function. The statement is in effect $\Pr(Y\le y) = 1- \Pr(X\le g^{-1}(y)) = \Pr(X>g^{-1}(y))$ if $F_X$ is continuous.
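As a sanity check of statement $(b)$ in a case where $F_X$ is continuous, here is a minimal Python sketch. The choice $X\sim\text{Exponential}(1)$ with $g(x)=e^{-x}$ is my own assumption for illustration, not an example from Casella and Berger; under it, $Y=e^{-X}$ should come out uniform on $(0,1).$

```python
import math
import random

def F_X(x):
    """CDF of Exponential(1); this choice of X is an illustrative assumption."""
    return 1.0 - math.exp(-x) if x > 0 else 0.0

def g_inv(y):
    """Inverse of the decreasing map g(x) = exp(-x), valid for 0 < y < 1."""
    return -math.log(y)

def F_Y_formula(y):
    """Statement (b): F_Y(y) = 1 - F_X(g^{-1}(y))."""
    return 1.0 - F_X(g_inv(y))

def F_Y_empirical(y, n=200_000, seed=0):
    """Monte Carlo estimate of Pr(g(X) <= y) from n simulated draws of X."""
    rng = random.Random(seed)
    hits = sum(math.exp(-rng.expovariate(1.0)) <= y for _ in range(n))
    return hits / n

if __name__ == "__main__":
    for y in (0.1, 0.3, 0.9):
        print(y, F_Y_formula(y), F_Y_empirical(y))
```

The formula and the simulation should agree up to Monte Carlo error, because here $\Pr(X = g^{-1}(y)) = 0$ for every $y.$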

Cumulative distribution functions are non-decreasing. The only kind of discontinuity that a non-decreasing function can have is a jump. A jump in $F_X$ at $g^{-1}(y)$ would mean $\Pr(X = g^{-1}(y))>0.$ If that happens then

\begin{align}
& \Pr(Y\le y) = \Pr(Y=y) + \Pr(Y<y) \\[10pt]
= {} & \Pr(X=g^{-1}(y)) + \Pr(X>g^{-1}(y)) \\[10pt]
= {} & \Pr(X=g^{-1}(y)) + \int_{g^{-1}(y)}^\infty f_X(x)\,dx.
\end{align}
If the first term in the last line is positive rather than zero, then equality between the second term in the last line and $\Pr(Y\le y)$ is not true.
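To see the failure concretely, here is a small exact computation with a discrete $X.$ The distribution (uniform on $\{1,2,3\}$) and the map $g(x)=-x$ are hypothetical choices of mine, picked only because they put a point mass at $g^{-1}(y).$

```python
from fractions import Fraction

# Hypothetical discrete X: uniform on {1, 2, 3}. g(x) = -x is decreasing.
pmf_X = {1: Fraction(1, 3), 2: Fraction(1, 3), 3: Fraction(1, 3)}

def g(x):
    return -x

def F_X(t):
    """Pr(X <= t)."""
    return sum(p for x, p in pmf_X.items() if x <= t)

def F_Y(y):
    """Pr(Y <= y) with Y = g(X), computed directly from the pmf of X."""
    return sum(p for x, p in pmf_X.items() if g(x) <= y)

y = -2               # then g^{-1}(y) = 2
x0 = 2
lhs = F_Y(y)         # Pr(Y <= -2) = Pr(X >= 2)
rhs = 1 - F_X(x0)    # Pr(X > 2)
gap = lhs - rhs      # the point mass Pr(X = 2)

if __name__ == "__main__":
    print(lhs, rhs, gap)
```

The two sides differ by exactly $\Pr(X = g^{-1}(y)),$ which is the extra term in the display above.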

But now suppose it had said $\Pr(Y\ge y).$ Then we would have
$\Pr(Y\ge y) = \Pr(X\le g^{-1}(y)) = F_X(g^{-1}(y)).$
The difference results from the difference between "$<$" and "$\le$" in the definition of the c.d.f., which says $F_X(x) = \Pr(X\le x)$ and not $F_X(x) = \Pr(X<x).$
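A short check of this point, again with a hypothetical discrete $X$ (uniform on $\{1,2,3\}$) and the illustrative maps $g(x)=-x$ and $g(x)=2x,$ neither of which comes from the book: both identities that use "$\le$" on the $X$ side hold exactly, point masses and all.

```python
from fractions import Fraction

# Hypothetical discrete X with a point mass at every support point.
pmf_X = {1: Fraction(1, 3), 2: Fraction(1, 3), 3: Fraction(1, 3)}

def prob(event):
    """Pr(event(X)) for the discrete X above."""
    return sum(p for x, p in pmf_X.items() if event(x))

def F_X(t):
    return prob(lambda x: x <= t)

def dec(x):
    return -x        # a decreasing g (illustrative)

def inc(x):
    return 2 * x     # an increasing g (illustrative)

checks = []
for x0 in pmf_X:
    # Decreasing g: Pr(Y >= g(x0)) equals F_X(x0).
    checks.append(prob(lambda x: dec(x) >= dec(x0)) == F_X(x0))
    # Increasing g, statement (a): Pr(Y <= g(x0)) equals F_X(x0).
    checks.append(prob(lambda x: inc(x) <= inc(x0)) == F_X(x0))

if __name__ == "__main__":
    print(all(checks))
```

In both cases the event on the $X$ side is $\{X\le g^{-1}(y)\},$ which is exactly what $F_X$ measures, so no continuity assumption is needed.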

As for the relationship between continuity and density functions, that is more involved. The Cantor distribution is a standard example, defined like this: A random variable $X$ will be in the interval $[0,1/3]$ or $[2/3,1]$ according to the result of a coin toss; then it will be in the upper or lower third of the chosen interval according to a second coin toss; then in the upper or lower third of that according to a third coin toss, and so on.
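The coin-toss description can be sketched in Python as follows. This is only an approximation under my own assumptions: the tossing stops after `n_tosses` tosses and the function returns the left endpoint of the interval reached, whereas the actual Cantor-distributed variable needs infinitely many tosses.

```python
import random
from fractions import Fraction

def cantor_sample(n_tosses, rng):
    """Left endpoint of the stage-n interval selected by n fair coin tosses."""
    x = Fraction(0)
    for k in range(1, n_tosses + 1):
        if rng.random() < 0.5:       # "heads": move to the upper third
            x += Fraction(2, 3**k)
        # "tails": stay in the lower third, nothing to add
    return x

rng = random.Random(0)
draws = [cantor_sample(20, rng) for _ in range(1000)]

if __name__ == "__main__":
    print([float(d) for d in draws[:5]])
```

Each particular sequence of $n$ tosses has probability $2^{-n},$ so no single interval of the $n$th stage carries more than that much probability.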

The c.d.f. of this distribution is continuous because there is no individual point between $0$ and $1$ that gets assigned positive probability.

But notice that there is probability $1$ assigned to a union of two intervals of total length $2/3,$ then probability $1$ assigned to a union of intervals that take up $2/3$ of that union of intervals, thus $4/9$ of $[0,1],$ then there is probability $1$ assigned to a set taking up $2/3$ of that space, thus $(2/3)^3 = 8/27,$ and so on. Thus there is probability $1$ that the random variable lies within a certain set whose measure is $\le (2/3)^n,$ no matter how big an integer $n$ is. The measure of that set must therefore be $0.$ If you integrate any function over a set whose measure is $0,$ you get $0.$ Hence there can be no function $f$ such that for every measurable set $A\subseteq[0,1]$ we have
$\Pr(X\in A) = \int_A f(x)\,dx,$
i.e. there can be no density function.
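The shrinking total length is easy to confirm with exact arithmetic. The following is a minimal sketch; the helper names `cantor_stage` and `total_length` are mine.

```python
from fractions import Fraction

def cantor_stage(n):
    """Closed intervals that make up stage n of the Cantor construction."""
    intervals = [(Fraction(0), Fraction(1))]
    for _ in range(n):
        nxt = []
        for a, b in intervals:
            third = (b - a) / 3
            nxt.append((a, a + third))   # keep the lower third
            nxt.append((b - third, b))   # keep the upper third
        intervals = nxt
    return intervals

def total_length(intervals):
    return sum(b - a for a, b in intervals)

lengths = [total_length(cantor_stage(n)) for n in range(6)]

if __name__ == "__main__":
    for n, length in enumerate(lengths):
        print(n, length, float(length))
```

Stage $n$ consists of $2^n$ intervals of length $3^{-n}$ each, so its total length is $(2/3)^n,$ yet it carries probability $1$ at every stage.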

Thus the Cantor distribution has no point masses and also no probabilities that can be found by integrating a density function.

Thus existence of a density function is a stronger condition than mere continuity of the c.d.f.
