probability - Expected value of applying the sigmoid function to a normal distribution

Wednesday, 3 April 2013

Short version:

I would like to calculate the expected value of the sigmoid function $\frac{1}{1+e^{-x}}$ applied to a normally distributed variable with mean $\mu$ and standard deviation $\sigma$.

If I'm correct, this corresponds to the following integral:

$\int_{-\infty}^\infty \frac{1}{1+e^{-x}} \frac{1}{\sigma\sqrt{2\pi}}\ e^{ -\frac{(x-\mu)^2}{2\sigma^2} } dx$

However, I can't solve this integral. I've tried manually, with Maple and with Wolfram|Alpha, but didn't get anywhere.
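Even without a closed form, the integral is easy to evaluate numerically. Here is a minimal sketch in Python, assuming plain trapezoidal integration over the standardized variable $t = (x-\mu)/\sigma$, truncated to $[-10, 10]$. The function names, the truncation width and the step count are illustrative choices, not part of the question:

```python
import math


def sigmoid(x):
    """Numerically stable logistic function 1 / (1 + exp(-x))."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def expected_sigmoid(mu, sigma, half_width=10.0, steps=2000):
    """Approximate E[sigmoid(X)] for X ~ N(mu, sigma^2).

    Integrates sigmoid(mu + sigma*t) against the standard normal density
    over t in [-half_width, half_width] with the trapezoidal rule.
    half_width and steps are assumed defaults, chosen for illustration.
    """
    h = 2.0 * half_width / steps
    total = 0.0
    for i in range(steps + 1):
        t = -half_width + i * h
        weight = 0.5 if i in (0, steps) else 1.0  # trapezoid end points
        total += weight * sigmoid(mu + sigma * t) * math.exp(-0.5 * t * t)
    return total * h / math.sqrt(2.0 * math.pi)


if __name__ == "__main__":
    print(expected_sigmoid(2.0, 1.5), sigmoid(2.0))
```

A quick sanity check is the symmetric case: for $\mu = 0$ the result should be $\frac{1}{2}$ for any $\sigma$, since $S(x) + S(-x) = 1$. For very large $\sigma$ the step count would probably need to be increased.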

Some background info (why I want to do this):

Sigmoid functions are used in artificial neural networks as activation functions, mapping a value in $(-\infty,\infty)$ to $(0,1)$. Often this value is used directly in further calculations, but sometimes (e.g. in RBMs) it is first stochastically rounded to 0 or 1, with the probability of a 1 being that value. The stochasticity helps the learning, but is sometimes not desired when you finally use the network. Just using the normal non-stochastic methods on a network that you trained stochastically doesn't work, though. It changes the expected result, because (in short):

$\operatorname{E}[S(X)] \neq S(\operatorname{E}[X])$

for most $X$. However, if you approximate $X$ as normally distributed and could somehow calculate this expected value, you could eliminate most of the bias. That's what I'm trying to do.
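The bias can be seen in a small simulation. This is only a sketch under assumed values: the mean, standard deviation, sample count and seed below are made up for illustration, and `simulate` is a hypothetical helper, not something from an actual network:

```python
import math
import random


def sigmoid(x):
    """Numerically stable logistic function 1 / (1 + exp(-x))."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def simulate(mu, sigma, n=200_000, seed=0):
    """Compare E[S(X)], the mean of stochastically rounded units, and S(E[X]).

    X is drawn from N(mu, sigma^2); each sample is rounded to 1 with
    probability S(X) and to 0 otherwise.
    """
    rng = random.Random(seed)
    total_sigmoid = 0.0
    ones = 0
    for _ in range(n):
        p = sigmoid(rng.gauss(mu, sigma))
        total_sigmoid += p
        if rng.random() < p:  # stochastic rounding to 0 or 1
            ones += 1
    return total_sigmoid / n, ones / n, sigmoid(mu)


if __name__ == "__main__":
    mean_sigmoid, mean_bits, naive = simulate(2.0, 1.5)
    print("E[S(X)]       ~", mean_sigmoid)
    print("mean of bits  ~", mean_bits)
    print("S(E[X])       =", naive)
```

With these assumed parameters the first two numbers should agree up to sampling noise, while $S(\operatorname{E}[X])$ comes out noticeably larger.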

Answer

I doubt that there's a closed-form solution. However, here's a series in powers of $\sigma$:

$\frac{1}{1+e^{-\mu}} + \frac{e^{-\mu}\left(e^{-\mu}-1\right)}{2\left(1+e^{-\mu}\right)^{3}}\,\sigma^{2} + \frac{e^{-\mu}\left(e^{-3\mu}-11e^{-2\mu}+11e^{-\mu}-1\right)}{8\left(1+e^{-\mu}\right)^{5}}\,\sigma^{4} + \frac{e^{-\mu}\left(e^{-5\mu}-57e^{-4\mu}+302e^{-3\mu}-302e^{-2\mu}+57e^{-\mu}-1\right)}{48\left(1+e^{-\mu}\right)^{7}}\,\sigma^{6} + \frac{e^{-\mu}\left(e^{-7\mu}-247e^{-6\mu}+4293e^{-5\mu}-15619e^{-4\mu}+15619e^{-3\mu}-4293e^{-2\mu}+247e^{-\mu}-1\right)}{384\left(1+e^{-\mu}\right)^{9}}\,\sigma^{8} + O\left(\sigma^{10}\right)$
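To get a feel for how well the truncated series does, it can be compared against direct numerical integration. The following Python sketch transcribes the coefficients above, with $u = e^{-\mu}$ as a shorthand introduced here. The reference value uses a simple trapezoidal rule, and the function names and quadrature settings are assumptions made for illustration:

```python
import math


def series_expected_sigmoid(mu, sigma):
    """Series for E[sigmoid(X)], X ~ N(mu, sigma^2), truncated after sigma^8."""
    u = math.exp(-mu)
    d = 1.0 + u
    s2 = sigma * sigma
    c0 = 1.0 / d
    c2 = u * (u - 1.0) / (2.0 * d**3)
    c4 = u * (u**3 - 11 * u**2 + 11 * u - 1) / (8.0 * d**5)
    c6 = u * (u**5 - 57 * u**4 + 302 * u**3 - 302 * u**2 + 57 * u - 1) / (48.0 * d**7)
    c8 = u * (u**7 - 247 * u**6 + 4293 * u**5 - 15619 * u**4
              + 15619 * u**3 - 4293 * u**2 + 247 * u - 1) / (384.0 * d**9)
    # Horner evaluation in sigma^2
    return c0 + s2 * (c2 + s2 * (c4 + s2 * (c6 + s2 * c8)))


def quadrature_expected_sigmoid(mu, sigma, half_width=10.0, steps=2000):
    """Reference value: trapezoidal rule in t = (x - mu) / sigma (assumed settings)."""
    h = 2.0 * half_width / steps
    total = 0.0
    for i in range(steps + 1):
        t = -half_width + i * h
        weight = 0.5 if i in (0, steps) else 1.0
        x = mu + sigma * t
        s = 1.0 / (1.0 + math.exp(-x)) if x >= 0 else math.exp(x) / (1.0 + math.exp(x))
        total += weight * s * math.exp(-0.5 * t * t)
    return total * h / math.sqrt(2.0 * math.pi)


if __name__ == "__main__":
    for sigma in (0.1, 0.3, 1.0, 2.0):
        print(sigma,
              series_expected_sigmoid(1.0, sigma),
              quadrature_expected_sigmoid(1.0, sigma))
```

The two columns should agree closely for small $\sigma$ and drift apart as $\sigma$ grows, which is the expected behaviour of a truncated expansion in powers of $\sigma$.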

EDIT: To obtain this, first do the change of variables $x = \mu + \sigma t$. The integral becomes

$\frac{1}{\sqrt{2\pi}} \int_{-\infty}^\infty \dfrac{e^{-t^2/2}}{1 + e^{-\mu - \sigma t}}\ dt$

Now take the Maclaurin series (in powers of $\sigma$)

$\frac{1}{1+e^{-\mu - \sigma t}} = \frac{1}{1+e^{-\mu}} + \frac{e^{-\mu} \sigma t}{(1+e^{-\mu})^2} + \frac{e^{-\mu} ( e^{-\mu} - 1) \sigma^2 t^2}{2 (1+e^{-\mu})^3} + \ldots$

and integrate term by term.
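
The term-by-term integration only needs the moments of the standard normal distribution. As a sketch, write $S$ for the sigmoid and $Z$ for a standard normal variable (notation introduced here for illustration). Odd moments of $Z$ vanish and $\operatorname{E}[Z^{2k}] = (2k-1)!! = \frac{(2k)!}{2^k\, k!}$, so formally

$\operatorname{E}[S(\mu + \sigma Z)] \sim \sum_{k \ge 0} \frac{S^{(2k)}(\mu)}{(2k)!}\, \sigma^{2k}\, \operatorname{E}[Z^{2k}] = \sum_{k \ge 0} \frac{S^{(2k)}(\mu)}{2^k\, k!}\, \sigma^{2k}$

This is why only even powers of $\sigma$ appear, and it accounts for the denominators $2, 8, 48, 384$ for $k = 1, 2, 3, 4$. Because the Taylor series of the sigmoid has only a finite radius of convergence, the result is best read as an asymptotic expansion for small $\sigma$ rather than a series that can be assumed to converge.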
