Let X be a random variable and Y=g(X)
Define
$\chi=\{x: f_X(x)>0\}$ and $\mathcal{Y}=\{y: y=g(x)\text{ for some }x\in\chi\}$ (1)
Define $g^{-1}(y)=\{x\in\chi: g(x)=y\}$
Define: A random variable $X$ is continuous if $F_X(x)$ is a continuous function of $x$.
My question is: why, in the theorem below, does statement (b) require $X$ to be a continuous random variable while statement (a) does not?
The relevant theorem is (Theorem 2.1.3 in Casella and Berger 2nd Edition)
Let $X$ have cdf $F_X(x)$, let $Y=g(X)$, and let $\chi$ and $\mathcal{Y}$ be defined as in (1).
(a) If $g$ is an increasing function on $\chi$, $F_Y(y)=F_X(g^{-1}(y))$ for $y\in\mathcal{Y}$.
(b) If $g$ is a decreasing function on $\chi$ and $X$ is a continuous random variable, $F_Y(y)=1-F_X(g^{-1}(y))$ for $y\in\mathcal{Y}$.
Another way of stating what I am asking is that, prior to stating this theorem, Casella and Berger state
if $g(x)$ is an increasing function, then using the fact that $F_Y(y)=\int_{\{x\in\chi:\,g(x)\le y\}} f_X(x)\,dx$, we can write
$F_Y(y)=\int_{\{x\in\chi:\,g(x)\le y\}} f_X(x)\,dx=\int_{-\infty}^{g^{-1}(y)} f_X(x)\,dx=F_X(g^{-1}(y))$
If $g(x)$ is decreasing, then we have
$F_Y(y)=\int_{g^{-1}(y)}^{\infty} f_X(x)\,dx=1-F_X(g^{-1}(y))$
"The continuity of $X$ is used to obtain the second equality."
My question (restated) is: How come, when g(x) is an increasing function we do not need to use continuity of X, but we do for the case when g(x) is decreasing?
- (A side question; I will accept an answer so long as it answers the above question): this is continuity of the random variable, but the integral uses the pdf. What is the relation between continuity of $X$ and its pdf? (Specifically, I think there may be some strangeness if $F_X$, the cdf of $X$, is continuous but not differentiable.)
What came to my mind was Fundamental theorem of calculus maybe, but there is a version of it that doesn't require continuity of f I think? Plus, here we have X is continuous, if that matters -- I'm not sure.
Answer
$\int_{g^{-1}(y)}^{\infty} f_X(x)\,dx=1-F_X(g^{-1}(y))$ ?
We have:
$1-F_X(g^{-1}(y))=1-\Pr(X\le g^{-1}(y))=\Pr(X>g^{-1}(y))$
We may consider continuity of $F$ at $g^{-1}(y)$ or continuity of $F$ at points greater than $g^{-1}(y)$. Nothing about continuity at points less than $g^{-1}(y)$ can matter here.
In the first place $\Pr(a<X<b)=\int_a^b f_X(x)\,dx$
However, statement (b) of the theorem does not mention integration of any density function. The statement is in effect $\Pr(Y\le y)=1-\Pr(X\le g^{-1}(y))=\Pr(X>g^{-1}(y))$ if $F_X$ is continuous.
Cumulative distribution functions are non-decreasing. The only kind of discontinuity that a non-decreasing function can have is a jump. A jump in $F_X$ at $g^{-1}(y)$ would mean $\Pr(X=g^{-1}(y))>0$. If that happens then
\begin{align}
& \Pr(Y\le y) = \Pr(Y=y) + \Pr(Y<y) \\
= {} & \Pr(X=g^{-1}(y)) + \int_{g^{-1}(y)}^\infty f_X(x)\,dx.
\end{align}
If the first term in the last line is positive rather than zero, then the second term in the last line does not equal $\Pr(Y\le y)$.
But now suppose it had said $\Pr(Y\ge y)$. Then we would have
$\Pr(Y\ge y)=\Pr(X\le g^{-1}(y))=F_X(g^{-1}(y)).$
The difference results from the difference between "$<$" and "$\le$" in the definition of the c.d.f., which says $F_X(x)=\Pr(X\le x)$ and not $F_X(x)=\Pr(X<x)$.
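As a rough numerical check of the argument above, here is a minimal sketch. The fair die for $X$ and the maps $2x$ and $-x$ are hypothetical choices of mine, not from Casella and Berger. Because $F_X$ has a jump at every support point, (a) holds, (b) is off by exactly $\Pr(X=g^{-1}(y))$, and the "$\ge$" version holds.

```python
from fractions import Fraction

# Assumed example (not from the source): X is a fair six-sided die,
# so F_X has a jump of size 1/6 at each of 1, ..., 6.
pmf = {x: Fraction(1, 6) for x in range(1, 7)}

def cdf_X(t):
    """F_X(t) = Pr(X <= t)."""
    return sum(p for x, p in pmf.items() if x <= t)

def prob_Y_le(g, y):
    """Pr(g(X) <= y), computed directly from the pmf of X."""
    return sum(p for x, p in pmf.items() if g(x) <= y)

def prob_Y_ge(g, y):
    """Pr(g(X) >= y), computed directly from the pmf of X."""
    return sum(p for x, p in pmf.items() if g(x) >= y)

def increasing(x):
    return 2 * x   # hypothetical increasing g

def decreasing(x):
    return -x      # hypothetical decreasing g

x0 = 2  # plays the role of g^{-1}(y)

# (a) increasing g: F_Y(y) = F_X(g^{-1}(y)) holds despite the jumps.
lhs_a = prob_Y_le(increasing, increasing(x0))
rhs_a = cdf_X(x0)

# (b) decreasing g: F_Y(y) and 1 - F_X(g^{-1}(y)) differ by the point mass.
lhs_b = prob_Y_le(decreasing, decreasing(x0))
rhs_b = 1 - cdf_X(x0)
gap = lhs_b - rhs_b

# The ">=" version for decreasing g: Pr(Y >= y) = F_X(g^{-1}(y)).
lhs_ge = prob_Y_ge(decreasing, decreasing(x0))

print("increasing:", lhs_a, "vs", rhs_a)
print("decreasing:", lhs_b, "vs", rhs_b, "gap:", gap)
print("Pr(Y >= y):", lhs_ge, "vs", cdf_X(x0))
```

The gap in the decreasing case equals `pmf[x0]`, which is the jump of $F_X$ at $g^{-1}(y)$; it vanishes exactly when $F_X$ is continuous there.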
As for the relationship between continuity and density functions, that is more involved. The Cantor distribution is a standard example, defined like this: A random variable X will be in the interval [0,1/3] or [2/3,1] according to the result of a coin toss; then it will be in the upper or lower third of the chosen interval according to a second coin toss; then in the upper or lower third of that according to a third coin toss, and so on.
The c.d.f. of this distribution is continuous because there is no individual point between 0 and 1 that gets assigned positive probability.
But notice that there is probability 1 assigned to a union of two intervals of total length $2/3$, then probability 1 assigned to a union of intervals that take up $2/3$ of that union of intervals, thus $4/9$ of $[0,1]$, then there is probability 1 assigned to a set taking up $2/3$ of that space, thus $(2/3)^3=8/27$, and so on. Thus there is probability 1 that the random variable lies within a certain set whose measure is $\le(2/3)^n$, no matter how big an integer $n$ is. The measure of that set must therefore be 0. If you integrate any function over a set whose measure is 0, you get 0. Hence there can be no function $f$ such that for every measurable set $A\subseteq[0,1]$ we have
$\Pr(X\in A)=\int_A f(x)\,dx,$
i.e. there can be no density function.
Thus the Cantor distribution has no point masses and also no probabilities that can be found by integrating a density function.
Thus existence of a density function is a stronger condition than mere continuity of the c.d.f.
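The coin-toss construction can be sketched in code. This is only an illustration under my own assumptions: the function names are made up, and each draw is truncated to 30 tosses, so it approximates a Cantor-distributed value rather than sampling one exactly. It checks that the stage-$n$ covering set has total length $(2/3)^n$ and that every simulated draw lands inside it.

```python
import random
from fractions import Fraction

def cantor_sample(n_tosses, rng):
    """Approximate draw: each fair coin toss picks the lower third
    (ternary digit 0) or the upper third (ternary digit 2)."""
    x = Fraction(0)
    for k in range(1, n_tosses + 1):
        digit = 2 * rng.randint(0, 1)
        x += Fraction(digit, 3 ** k)
    return x

def stage_intervals(n):
    """The 2**n closed intervals that remain after n rounds of
    keeping only the lower and upper thirds."""
    intervals = [(Fraction(0), Fraction(1))]
    for _ in range(n):
        refined = []
        for a, b in intervals:
            third = (b - a) / 3
            refined.append((a, a + third))
            refined.append((b - third, b))
        intervals = refined
    return intervals

rng = random.Random(0)  # fixed seed so the run is repeatable
n = 8
intervals = stage_intervals(n)
total_length = sum(b - a for a, b in intervals)

draws = [cantor_sample(30, rng) for _ in range(1000)]
covered = all(any(a <= x <= b for a, b in intervals) for x in draws)

print("number of intervals:", len(intervals))
print("total length:", total_length, "=", float(total_length))
print("all draws inside the stage-n set:", covered)
```

Raising `n` shrinks `total_length` geometrically while `covered` stays true, which is the numerical counterpart of "probability 1 on a set of measure at most $(2/3)^n$".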