Wednesday, 13 June 2018

probability theory - Weak Law of Large Numbers and Truncated Mean




I am working on this question:




Let X_{1},\ldots,X_{n} be i.i.d. with P(X_{i}=(-1)^{k}k)=\dfrac{1}{c_{0}k^{2}\log k} for k\geq 2, where c_{0} is a normalizing constant. Prove that E|X_{1}|=\infty and yet there exists a constant \mu<\infty such that \dfrac{1}{n}\sum_{i=1}^{n}X_{i}\longrightarrow\mu in distribution.




I got a solution but I cannot understand it. The solution argued as follows:





Firstly we have P(|X_{i}|>n)=\sum_{k=n+1}^{\infty}\dfrac{1}{c_{0}k^{2}\log k}\leq\dfrac{1}{c_{0}n\log n}, so we have nP(|X_{i}|>n)\longrightarrow 0, and thus the weak law can be applied.



E|X_{i}|=\sum_{k=2}^{\infty}\dfrac{1}{c_{0}k\log k}=\infty, but the truncated mean \mu_{n}=E[X_{i}\mathbb{1}_{(|X_{i}|\leq n)}]=\sum_{k=2}^{n}(-1)^{k}\dfrac{1}{c_{0}k\log k}\longrightarrow\sum_{k=2}^{\infty}(-1)^{k}\dfrac{1}{c_{0}k\log k}, since the latter is an alternating series with decreasing terms for k\geq 3.




I have following confusions:



(1) I only have P(X_{i}=(-1)^{k}k); how does the solution compute P(|X_{i}|>n) and E|X_{i}|? And why is \sum_{k=n+1}^{\infty}\dfrac{1}{c_{0}k^{2}\log k}\leq\dfrac{1}{c_{0}n\log n}?



(2) Does the weak law have anything to do with E|X_{i}|=\infty? Or can E|X_{i}|=\infty be concluded directly from that infinite sum? If so, why?




(3) I understand that the weak law implies that \dfrac{1}{n}\sum_{i=1}^{n}X_{i}-\mu_{n}\longrightarrow 0 in distribution, and I understand that the solution implies the desired \mu is the infinite sum that \mu_{n} converges to, but how could I connect \mu with \mu_{n} in the weak law? Can I do the following: \dfrac{1}{n}\sum_{i=1}^{n}X_{i}-\mu_{n}\longrightarrow_{p}0, but \mu_{n}\longrightarrow\mu<\infty, so \dfrac{1}{n}\sum_{i=1}^{n}X_{i}\longrightarrow_{p}\mu?



By the way, here is the weak law related to the truncated mean:




Show that if X_{1},\ldots,X_{n} are i.i.d. and \lim_{n\rightarrow\infty}nP(|X_{1}|>n)=0, then, letting S_{n}=X_{1}+\cdots+X_{n} and \mu_{n}=E[X_{1}\mathbb{1}_{(|X_{1}|\leq n)}], we have S_{n}/n-\mu_{n}\longrightarrow_{p} 0.




Thank you!




Edit 1:



Okay, I think I have figured it all out. The proof I am going to present basically follows kasa's answer; I just add some more technical but non-trivial details:



Proof:



For brevity, set S_{n}:=X_{1}+\cdots+X_{n}. To make the LaTeX easier to type, let us also set C:=\dfrac{1}{c_{0}}.



Let us first deal with E|X_{1}|. We have the following calculation: E|X_{1}|=\sum_{k=2}^{\infty}kP(|X_{1}|=k)=\sum_{k=2}^{\infty}\dfrac{Ck}{k^{2}\log k}=C\sum_{k=2}^{\infty}\dfrac{1}{k\log k}, and I claim that \sum_{k=2}^{\infty}\dfrac{1}{k\log k}=\infty.




Indeed, if we make the change of variable u=\log x in the following integral, we have \int_{2}^{\infty}\dfrac{1}{x\log x}dx=[\log(\log x)]_{2}^{\infty}=\lim_{b\rightarrow\infty}\log(\log b)-\log(\log 2)=\infty.
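To spell out the substitution: with u=\log x we have du=\dfrac{dx}{x}, so
\begin{align*} \int_{2}^{b}\dfrac{dx}{x\log x}=\int_{\log 2}^{\log b}\dfrac{du}{u}=\log(\log b)-\log(\log 2), \end{align*}
which tends to \infty as b\rightarrow\infty.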



Therefore, it follows from the integral test that \sum_{k=2}^{\infty}\dfrac{1}{k\log k}=\infty.



Thus, E|X_{1}|=\infty.



Now, for the second part of the question, denote the truncated mean by \mu_{n}:=E(X_{i}\mathbb{1}_{(|X_{i}|\leq n)})=\sum_{k=2}^{n}(-1)^{k}\dfrac{C}{k\log k}.
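In detail, this is the same kind of computation as for E|X_{1}| above: the factor k coming from the value of X_{i} cancels one factor of k in the probability, namely
\begin{align*} \mu_{n}=\sum_{k=2}^{n}(-1)^{k}k\cdot P\big(X_{i}=(-1)^{k}k\big)=\sum_{k=2}^{n}(-1)^{k}k\cdot\dfrac{C}{k^{2}\log k}=\sum_{k=2}^{n}(-1)^{k}\dfrac{C}{k\log k}. \end{align*}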



Then, notice that \mu_{n}\longrightarrow \sum_{k=2}^{\infty}(-1)^{k}\dfrac{C}{k\log k}<\infty, since the latter is an alternating series whose terms decrease in absolute value for k\geq 3.
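To spell out the alternating series (Leibniz) test here: the terms a_{k}:=\dfrac{C}{k\log k} satisfy
\begin{align*} a_{3}\geq a_{4}\geq a_{5}\geq\cdots\quad\text{and}\quad a_{k}\longrightarrow 0\ \text{as}\ k\rightarrow\infty, \end{align*}
since k\log k is increasing and tends to \infty; hence \sum_{k=2}^{\infty}(-1)^{k}a_{k} converges (the single k=2 term cannot affect convergence).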




Denote the limit of \mu_{n} by \mu; we will show that \mu is the desired constant. Firstly, we know that \mu<\infty by the above argument.



Then, since \mu_{n}\longrightarrow\mu, I claim that \mu_{n}\longrightarrow_{p}\mu. Indeed, since \mu_{n}\longrightarrow\mu, the set \mathcal{O}:=\{\omega:\lim_{n\rightarrow\infty}\mu_{n}\neq \mu\} has P(\mathcal{O})=0.



Now, fix \epsilon>0 and define the sets A_{n}:=\bigcup_{m\geq n}\{|\mu_{m}-\mu|>\epsilon\}\ \text{and}\ A:=\bigcap_{n\geq 1}A_{n}. It is clear that A_{n}\searrow A, and thus P(A)=\lim_{n\rightarrow\infty}P(A_{n}) by continuity from above.



On the other hand, for \omega_{0}\in\mathcal{O}^{c}, by definition there exists N such that |\mu_{n}(\omega_{0})-\mu|<\epsilon for all n\geq N. Therefore, for all n\geq N we have \omega_{0}\notin A_{n}, and thus \omega_{0}\notin A. Therefore, A\cap \mathcal{O}^{c}=\varnothing, which implies A\subset\mathcal{O}. Thus, by monotonicity, we have P(A)=0, and therefore \lim_{n\rightarrow\infty}P(A_{n})=0.



Finally, we can see that P(|\mu_{n}-\mu|>\epsilon)\leq P(A_{n})\longrightarrow 0, and thus \mu_{n}\longrightarrow_{p}\mu.




The point here is that pointwise convergence certainly implies almost sure convergence; the rest of the argument is then exactly the proof that almost sure convergence implies convergence in probability.



On the other hand, we have the following computation:
\begin{align*} P(|X_{i}|>n)&=\sum_{k=n+1}^{\infty}\dfrac{C}{k^{2}\log k}\leq C\sum_{k=n+1}^{\infty}\dfrac{1}{k^{2}\log n}\\ &=\dfrac{C}{\log n}\sum_{k=n+1}^{\infty}\dfrac{1}{k^{2}}\leq\dfrac{C}{\log n}\sum_{k=n+1}^{\infty}\dfrac{1}{k(k-1)}\\ &\leq\dfrac{C}{n\log n}, \end{align*}
which implies that nP(|X_{i}|>n)\longrightarrow 0,\ \text{as}\ n\longrightarrow\infty.




Therefore, by the weak law of large numbers, we have S_{n}/n-\mu_{n}\longrightarrow_{p} 0.



To conclude the proof, we need the following lemma; I will put the proof of the lemma at the end.




Lemma: If X_{n} and Y_{n} are sequences of random variables such that X_{n}\longrightarrow_{p} a and Y_{n}\longrightarrow_{p} b for some constants a and b, then X_{n}+Y_{n}\longrightarrow_{p} a+b.




Now, by this lemma, we can set X_{n}:=S_{n}/n-\mu_{n} and Y_{n}:=\mu_{n}, then we have S_{n}/n=X_{n}+Y_{n}\longrightarrow_{p}0+\mu=\mu,\ \text{as desired.}
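As a purely numerical sanity check (my own addition, not part of the argument; it assumes numpy is available, truncates the support at a hypothetical cutoff K = 10000 so that c_{0} is replaced by the truncated normalizer, and the heavy tail means the empirical mean only roughly matches), one can simulate S_{n}/n and compare it with the alternating-series value of \mu:

import numpy as np

# Truncated version of the distribution P(X = (-1)^k k) = 1/(c0 k^2 log k), k >= 2.
# The cutoff K and the resulting re-normalization are approximations for simulation only.
K = 10000
k = np.arange(2, K + 1)
weights = 1.0 / (k ** 2 * np.log(k))       # proportional to P(X = (-1)^k k)
p = weights / weights.sum()                # normalized probabilities (truncated 1/c0)
values = (-1.0) ** k * k                   # the atoms (-1)^k * k

mu_trunc = np.sum(values * p)              # truncated value of mu = sum_k (-1)^k / (c0 k log k)

rng = np.random.default_rng(0)
n = 1_000_000
sample = rng.choice(values, size=n, p=p)   # i.i.d. draws X_1, ..., X_n

print("alternating-series mu (truncated):", mu_trunc)
print("empirical mean S_n / n           :", sample.mean())

Because the truncated distribution still has very large variance, S_{n}/n only hovers around \mu rather than matching it to many digits, which is consistent with the slow convergence one expects when E|X_{1}|=\infty.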




Proof of lemma:



Let \epsilon>0 and consider P(|X_{n}+Y_{n}-a-b|\geq\epsilon). Using the hypotheses that X_{n}\longrightarrow_{p}a and Y_{n}\longrightarrow_{p}b, we have the following computation:
\begin{align*} P(|X_{n}+Y_{n}-a-b|\geq\epsilon)&=P(|(X_{n}-a)+(Y_{n}-b)|\geq\epsilon)\\ &\leq P(|X_{n}-a|\geq\epsilon/2\ \text{or}\ |Y_{n}-b|\geq\epsilon/2)\\ &\leq P(|X_{n}-a|\geq\epsilon/2)+P(|Y_{n}-b|\geq \epsilon/2)\longrightarrow 0+0=0\ \text{as}\ n\rightarrow\infty. \end{align*}
The first inequality holds because if both |X_{n}-a|<\epsilon/2 and |Y_{n}-b|<\epsilon/2, then |(X_{n}-a)+(Y_{n}-b)|<\epsilon by the triangle inequality; the second inequality is the union bound.




Therefore, X_{n}+Y_{n}\longrightarrow_{p} a+b.



The only part I am still unsure about is that \mu_{n} is not a random variable; it is just a sequence of numbers, so I don't know whether convergence in probability even makes sense for \mu_{n}. It would be really appreciated if anyone could clarify my confusion here :)



Let me express my appreciation to kasa for his/her patience and brilliant answer. Please let me know if you think anything in my proof can be improved or refined :)


Answer



HINT to part 1:



P(|X_i|>n) is straightforward: it is the sum of all the probabilities that |X_i| is greater than n, which in this case is equal to \sum_{k=n+1}^{\infty}\dfrac{1}{c_{0}k^{2}\log k}.




Now, we can use the following bound:
\begin{align} \sum_{k=n+1}^{\infty}\dfrac{1}{k^{2}} & \le \sum_{k=n+1}^{\infty}\dfrac{1}{k(k-1)} = \left(\dfrac{1}{n} - \dfrac{1}{n+1}\right) + \left(\dfrac{1}{n+1} - \dfrac{1}{n+2}\right) + \cdots \\ & = \dfrac{1}{n}. \end{align}



This implies \sum_{k=n+1}^{\infty}\dfrac{1}{k^{2}\log k} \leq \dfrac{1}{\log n}\sum_{k=n+1}^{\infty}\dfrac{1}{k^{2}} \leq \dfrac{1}{n \log n}, since \log k\geq\log(n+1)>\log n for every k\geq n+1.
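Putting the two bounds together, the condition needed for the weak law follows immediately:
\begin{align*} nP(|X_{i}|>n)=n\sum_{k=n+1}^{\infty}\dfrac{1}{c_{0}k^{2}\log k}\leq n\cdot\dfrac{1}{c_{0}n\log n}=\dfrac{1}{c_{0}\log n}\longrightarrow 0\ \text{as}\ n\rightarrow\infty. \end{align*}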



HINT to part 3:
Use the property:





If X_n \longrightarrow_{p} x and Y_n \longrightarrow_{p} y, then
X_n + Y_n \longrightarrow_{p} x+y




Edit: The sum \sum \dfrac{1}{k\log k} diverges to infinity, so the expectation is infinite. Please check this proof.

