Wednesday 25 February 2015

statistics - Mean concentration implies median concentration



Exercise 2.14 in Wainwright, "High-Dimensional Statistics", states that if $X$ is such that $$P[|X-\mathbb{E}[X]|\geq t] \leq c_1 e^{-c_2t^2},$$ for $c_1, c_2$ positive constants, $t\geq 0$, then for any median $m_X$ it holds that $$P[|X-m_X|\geq t] \leq c_3 e^{-c_4t^2},$$ with $c_3=4c_1$ and $c_4=c_2/8$.



I can get some loose concentration around the median using $|\mathbb{E}[X]-m_X|\leq \sqrt{\mathbb{V}[X]}$, but this does not achieve the constants proposed. Any ideas for how to get the suggested bound, or any other bound resembling it?


Answer



Let $\Delta = |EX - m_X|$. The idea of the proof is that if $X$ is too far from mean, then $X$ is far from median as well. If it is close, then make the upper bound a triviality.




Now consider the two cases for $t$:




  1. $t \ge 2\Delta$:



    which implies that $\frac{t}{2} \geq \Delta$.
    By the reverse triangle inequality, $|X - EX| \geq |X-m_x| - \Delta$. Thus
    \begin{align}
    P(|X - m_X|\geq t) \leq P(|X - m_X|\geq \frac{t}{2} + \Delta ) \leq P( |X - EX| \geq \frac{t}{2}) \leq c_1e^{-\frac{c_2t^2}{4}}.
    \end{align}



  2. $t < 2\Delta$:



    By the definition of median: $\frac{1}{2}\leq P(|X - EX| \geq \Delta ) \leq c_1e^{-c_2\Delta^2}$. Therefore, $2c_1e^{-c_2\Delta^2} \geq 1$.
    \begin{align}
    2c_1e^{-\frac{c_2}{4}t^2} \geq 2c_1e^{-\frac{c_2}{4}(2 \Delta)^2} = 2c_1e^{-c_2\Delta^2} \geq 1
    \end{align}

    Therefore, the required condition holds trivially.




The final constants are $c_3 = 2c_1$ and $c_4 = \frac{c_2}{4}$. I don't know why I am getting better constants. Let me know if you spot an error.



No comments:

Post a Comment

real analysis - How to find $lim_{hrightarrow 0}frac{sin(ha)}{h}$

How to find $\lim_{h\rightarrow 0}\frac{\sin(ha)}{h}$ without lhopital rule? I know when I use lhopital I easy get $$ \lim_{h\rightarrow 0}...