Friday, 10 April 2015

probability - Reciprocal of a normal variable with non-zero mean and small variance




$X$ is a normal random variable:



$$X \sim N(\mu, \sigma^2)$$



Then $Y = 1/X$ has the following probability density function (see wiki):



$$f(y) = \frac{1}{y^2 \sqrt{2\sigma^2\pi}} \exp\!\left(-\frac{\left(\frac{1}{y} - \mu\right)^2}{2\sigma^2}\right)$$



This distribution of $Y$ does not have moments, since (StackExchange):




$$\int_{-\infty}^{+\infty} |y|\, f(y)\, \mathrm{d}y = \infty$$



An intuitive explanation of this is that the distribution's tails are too heavy, and consequently the law of large numbers fails: the more samples that are drawn and averaged, the less stable this average is. The non-zero probability density of $X$ at $X = 0$ means that $Y$ will not have finite moments, since draws of $X$ arbitrarily close to $0$ make $Y = 1/X$ arbitrarily large.



However, in simulation, for a non-zero mean and small variance, $X \sim N(1, 0.1)$, the distribution of $Y$ is seemingly well-behaved, with $E[Y] = 0.1010$ and $\mathrm{Var}(Y) = 0.0109$, even though these moments are theoretically not finite. As the variance of $X$ increases, the mean and variance of $Y$ both become larger and increasingly unstable, and as the variance of $X$ continues to increase, the moments of $Y$ become more unstable as the number of samples of $Y$ increases. This is contrary to the typical expectation that the variance of the mean estimate should decrease as the number of samples increases.
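A minimal sketch of this kind of experiment (my own illustration, not part of the original question; the larger $\sigma = 0.5$ case is an arbitrary choice to make the instability visible):

```python
import numpy as np

rng = np.random.default_rng(0)

# For sigma = 0.1, X essentially never comes near 0, so the sample moments
# of Y = 1/X look stable. For a larger sigma, occasional draws of X close
# to 0 produce enormous values of Y that destabilise the running averages.
for sigma in (0.1, 0.5):
    x = rng.normal(1.0, sigma, size=10**6)
    y = 1.0 / x
    for n in (10**3, 10**4, 10**5, 10**6):
        print(f"sigma={sigma}  n={n:>7}  mean={y[:n].mean():10.4f}  var={y[:n].var():14.4f}")
```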



Could you offer any insights into this strange behaviour? Why did the moments of $Y$ appear stable for $X \sim N(1, 0.1)$?
[Figure: Histogram]


Answer




This behaviour is an example of conditional convergence. A perhaps simpler example to start with is $\int_{-1}^{1} \frac{\mathrm{d}x}{x}$. This integrand goes to infinity as $x$ approaches $0$ from the right and to negative infinity as $x$ approaches zero from the left. If we successively approximate this integral as $\int_{-1}^{-M} \frac{\mathrm{d}x}{x} + \int_{M}^{1} \frac{\mathrm{d}x}{x}$ with $M \to 0^+$, we have arranged to carefully cancel the positive and negative contributions: no matter what positive value we pick for $M$, these two integrals are finite and their values cancel exactly. But this is not how we evaluate improper integrals. (An integral that only gives useful answers if you sneak up on its infinities in just the right way is too treacherous to trust. Taking this very special way of sneaking up on the infinities gives the Cauchy principal value.) Instead, we require that the two endpoints can squeeze in towards zero independently, and that regardless of how they do so we get the same answer. Contrast the above with $\int_{-1}^{-M} \frac{\mathrm{d}x}{x} + \int_{M/2}^{1} \frac{\mathrm{d}x}{x}$, which always equals $\ln 2 > 0$ (because we include more of the positive part than of the negative part) rather than $0$. We can arrange to get $-\ln 2$ by moving the "/2" to the other integral, and by letting the two endpoints shrink at sufficiently different rates (say $M$ and $M^2$) we can make the approximations diverge to $+\infty$ or $-\infty$.
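Here is a minimal numeric sketch of these truncation schemes (my own illustration, using the closed form $\int_a^b \frac{\mathrm{d}x}{x} = \ln|b| - \ln|a|$, valid when $0 \notin [a, b]$):

```python
import math

# Integral of 1/x over [a, b], in closed form; valid when 0 is not in [a, b].
def I(a, b):
    return math.log(abs(b)) - math.log(abs(a))

for M in (1e-2, 1e-4, 1e-8):
    balanced = I(-1, -M) + I(M, 1)       # endpoints shrink together   -> 0
    lopsided = I(-1, -M) + I(M / 2, 1)   # extra positive sliver       -> ln 2
    runaway  = I(-1, -M) + I(M**2, 1)    # mismatched shrinking rates  -> -ln M
    print(f"M={M:.0e}: balanced={balanced:+.4f}  lopsided={lopsided:+.4f}  runaway={runaway:+.4f}")
```

No matter how small $M$ gets, the balanced scheme returns $0$, the lopsided one returns $\ln 2$, and the mismatched one grows without bound.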



The same thing is happening in your integral for the mean, $\int_{-\infty}^{\infty} y\, f(y)\, \mathrm{d}y$, because for very large positive $y$, $1/y - 1 \approx -1$, so this integrand is approximately proportional to $1/y$, and similarly for very large negative $y$, where the integrand is again approximately proportional to $1/y$. If you restrict your integral to be symmetric around $1$, $\int_{1-M}^{1+M}$, these two tails will cancel (not exactly, but the cancellation improves the farther away from $1$ you go). However, if you use $\int_{1-M}^{1+2M}$, they won't cancel as well, and the uncancelled part will grow asymptotically like $\log M$.
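To spell out the tail estimate (a short asymptotic sketch using the density quoted in the question, with $\mu = 1$):

$$y\, f(y) \;=\; \frac{1}{y\, \sigma \sqrt{2\pi}} \exp\!\left(-\frac{(1/y - 1)^2}{2\sigma^2}\right) \;\longrightarrow\; \frac{e^{-1/(2\sigma^2)}}{\sigma \sqrt{2\pi}} \cdot \frac{1}{y} \qquad \text{as } |y| \to \infty,$$

so each tail of the mean's integrand decays only like $1/y$, and each tail integral on its own diverges logarithmically, just as in the $\int \frac{\mathrm{d}x}{x}$ example.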



This slow growth of the discrepancy has a consequence for numerical experiments. You will have to take many samples at very large values of $y$ and sum them very carefully, so that the contributions from large $y$ aren't swamped by rounding in the values from small $y$. Your plot shows that the integrand for the mean, $y\, f(y)$, is about $3$ for $y = 1$; it is about $10^{-23}$ for $y = 100$. This means that to retain the necessary precision just over the range $y \in [-100, 100]$, you will have to integrate numerically with at least 23 digits of precision (plus several guard digits). Even more numerically difficult, because of the logarithmic rate of divergence of the right-hand and left-hand integrals, you will have to integrate over stupendous ranges of $y$ for any not-carefully-balanced cancellation to accumulate measurably. For instance, $$\int_{-10}^{-9} y\, f(y)\, \mathrm{d}y + \int_{9}^{10} y\, f(y)\, \mathrm{d}y \approx 1.84 \times 10^{-18},$$ $$\int_{-100}^{-9} y\, f(y)\, \mathrm{d}y + \int_{9}^{100} y\, f(y)\, \mathrm{d}y \approx 3.13 \times 10^{-18}.$$

(I've ignored the contribution from $-9$ to $9$ in both integrals so that the result isn't swamped by the 18-orders-of-magnitude-larger contribution from that region.) These sums will always be positive because the contribution from the positive $y$s is closer to the peak at $y = 1$ than is the contribution from the negative $y$s.
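If you want to reproduce numbers like these without a full computer algebra system, arbitrary-precision quadrature is enough. Here is a minimal sketch using Python's mpmath, assuming (my reading of the question's parameters) $\mu = 1$ and $\sigma = 0.1$:

```python
from mpmath import mp, mpf, exp, sqrt, pi, quad

mp.dps = 40  # comfortably more than the ~23 digits the cancellation demands
mu, sigma = mpf(1), mpf("0.1")

def mean_integrand(y):
    # y * f(y), with f the reciprocal-normal density quoted in the question
    return y * exp(-(1 / y - mu)**2 / (2 * sigma**2)) / (y**2 * sigma * sqrt(2 * pi))

print(quad(mean_integrand, [-10, -9]) + quad(mean_integrand, [9, 10]))    # ~ 1.84e-18
print(quad(mean_integrand, [-100, -9]) + quad(mean_integrand, [9, 100]))  # ~ 3.13e-18
```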



So, in your simulations, were you sampling over truly stupendous ranges of ys so that you might observe the divergence? And, while doing so, were you keeping truly stupendous precision so that the very slow divergence wouldn't be swamped by rounding errors in the contribution from ys near 1? Doing both of these things is computationally very challenging. Once you have that ironed out, you'll need to take the limits to infinity in different ways to avoid calculating some sort of principal value and when you do this, you'll discover that the integrals do not converge and your estimates are not stable. For instance 191210yf(y)dy+1+1101+9yf(y)dy6.644×10191912100yf(y)dy+1+11001+9yf(y)dy1.292×1018 and each time we increment the power of 10 used for the outermost endpoints, the discrepancy will approximately double. Once we reach 1053-ish, the discrepancy will finally be as large as the contribution from y[19,1+9].
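The same mpmath setup can check these asymmetric truncations (again a sketch under the $\mu = 1$, $\sigma = 0.1$ assumption, with the same integrand as in the previous snippet):

```python
from mpmath import mp, mpf, exp, sqrt, pi, quad

mp.dps = 40
mu, sigma = mpf(1), mpf("0.1")

def mean_integrand(y):
    return y * exp(-(1 / y - mu)**2 / (2 * sigma**2)) / (y**2 * sigma * sqrt(2 * pi))

# The negative endpoint runs out twice as fast as the positive one,
# so the two tails are never allowed to cancel in a balanced way.
for k in (1, 2):
    total = (quad(mean_integrand, [1 - 2 * 10**k, 1 - 9])
             + quad(mean_integrand, [1 + 9, 1 + 10**k]))
    print(k, total)  # ~ 6.6e-19 for k = 1, ~ 1.3e-18 for k = 2
```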



To sum up, numerically detecting the divergence of this integral will require some very careful programming and numerical analysis. (Alternatively, you can do what I did for the estimates above and use a big symbolic and numerical application like Maple or Mathematica.)


