Friday 30 November 2012

elementary set theory - An explicit bijection between the power set $\mathcal P\left(\mathbb{N}\right)$ and $2^{\mathbb{N}}$.

I know how to show that these two have the same cardinality, and hence that there must be a bijection between them.



Can anyone help with an explicit bijection between these sets?
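The usual explicit construction, viewing $2^{\mathbb{N}}$ as the set of functions $\mathbb{N}\to\{0,1\}$, is the indicator-function map
$$\chi \colon \mathcal P(\mathbb{N}) \to 2^{\mathbb{N}}, \qquad \chi(A)(n) = \begin{cases} 1, & n \in A\\ 0, & n \notin A, \end{cases}$$
whose inverse sends $f$ to $\{n \in \mathbb{N} : f(n) = 1\}$.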

Thursday 29 November 2012

Use any dice to calculate the outcome of any probability




While looking at this question, I had a gut feeling that you can use any fair, single die with any number of sides to calculate the outcome of any probability.



Assuming we express the probability as a range of numbers, this is easy for coins and d10s: coins can be flipped to generate a binary number, while d10s can be rolled to produce each digit of the outcome. If the result falls outside the range, ignore it and reroll.



This is really just probability with respect to base. The coin generates a result in base 2, while the d10 generates results in base 10. Therefore, a die with $n$ sides can be used to produce a result in base $n$.



Now consider that we have an arbitrary number of dice, each with an arbitrary number of sides. We could generate an outcome by expressing the result of each die roll in base 2 and tacking them together* (see example). This would however result in a lot of rerolling, and is otherwise time-consuming when you factor in converting to base 2 and so forth.



So here's what amounts to a somewhat silly puzzle question:





  • For an arbitrary set of dice, each with an arbitrary number of sides, is there a general method for determining the outcome of any probability while minimizing the number of rerolls?

  • Is there a method which is easy to remember and could reasonably be used during a gaming session (i.e. takes less than, say, 30 seconds to determine which dice to roll and calculate the result)?



Example of presented method:
Outcome between 0 and 993, with (hypothetical) 1d7 and 1d21.





  • 993 in base 2 is 1111100001, meaning we need 10 binary digits to express the full range of possible outcomes.

  • 1d21 can provide 4 binary digits (0 through 15 in base 2), and 1d7 provides 2 digits (0 through 3).



Solution:
Roll 1d21 twice and 1d7 once. If the d21 lands higher than 16 or the d7 higher than 4, reroll. Subtract 1 from each roll so the range starts at 0. Convert to base 2. Append results to create one 10-digit binary value. If result > 993, toss it out and reroll.



There is a ~24% chance ($\frac{21-16}{21}$) of needing to reroll the d21 each time, a ~43% chance ($\frac{7-4}{7}$) for the d7, and a ~3% chance ($\frac{1024-994}{1024}$) of needing to reroll the final value.



*Ignoring rolls that are higher than the maximum rollable power of 2. I.e. if you had a d12, you would ignore rolls higher than 8 ($2^3$). This ensures an equal probability for each digit in the result.
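For concreteness, here is a minimal Python sketch of the presented method (the function names and loop structure are my own choices; it implements the reroll-and-append-binary-digits scheme described above):

```python
import random

def roll_binary_digits(sides, ndigits):
    # Roll one die, rerolling anything above the largest usable
    # power of two, and return a value in 0 .. 2**ndigits - 1.
    limit = 2 ** ndigits
    while True:
        r = random.randint(1, sides)
        if r <= limit:
            return r - 1  # subtract 1 so the range starts at 0

def outcome(max_value, dice):
    # dice: list of (sides, binary_digits) pairs; reroll everything
    # whenever the assembled value exceeds max_value.
    while True:
        value, shift = 0, 0
        for sides, ndigits in dice:
            value |= roll_binary_digits(sides, ndigits) << shift
            shift += ndigits
        if value <= max_value:
            return value

# The example from the text: 0..993 with 1d21 rolled twice and 1d7 once.
print(outcome(993, [(21, 4), (21, 4), (7, 2)]))
```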




Edit:



In light of Thomas Andrews' answer, multiple dice can be used to generate a higher $X$ value than one die alone. For a set of dice with numbers of sides $\{k_1,k_2,\dots,k_n\}$ and rolls $\{r_1,r_2,\dots,r_n\}$, the maximum $X$ value will be $k_1 k_2 \cdots k_n$ and a given roll value will be: $$r_1 + (r_2 - 1)k_1 + (r_3 - 1)k_1k_2 + \cdots + (r_n - 1)k_1k_2\cdots k_{n-1}$$
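A short sketch of this combination rule in Python (a hypothetical helper of my own, not from the original post):

```python
def combined_roll(rolls, sides):
    # Combine rolls r_i on dice with k_i sides into a single value in
    # 1 .. k_1*k_2*...*k_n using the mixed-radix formula above.
    value, place = 0, 1
    for r, k in zip(rolls, sides):
        value += (r - 1) * place  # digit (r_i - 1) at place k_1*...*k_{i-1}
        place *= k
    return value + 1  # shift back so the minimum roll maps to 1

# e.g. a d7 showing 3 and a d21 showing 5 give a value in 1..147
print(combined_roll([3, 5], [7, 21]))  # = 3 + (5-1)*7 = 31
```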


Answer



Yes: given a single random number $X$ which generates elements $\{1,\dots,k\}$, with the probability $p$ of rolling a $1$ satisfying $0 < p < 1$, we can simulate an event of any probability $q \in [0,1]$.

Basically, we're going to pick a real number in $[0,1]$ by successively reducing our range. To pick the real exactly, we'd have to roll the die infinitely many times, but luckily, with probability $1$, after a finite number of rolls the current interval will no longer contain $q$, and we can stop, because then we know whether $q$ is less than or greater than the real.



We start with the entire interval $[a_0,b_0]=[0,1]$.




At step $n$, we have interval $[a_n,b_n]$. If $b_n < q$ then we halt with success, and if $a_n > q$ then we halt with failure.



Otherwise, we roll the die.



If it comes up $1$, we take the next interval as $[a_n,(1-p)a_n+pb_n]$.
If it comes up something other than $1$, we take the next interval to be $[(1-p)a_n+pb_n,b_n]$.



The interval at step $n$ has length at most $\left(\max(p,1-p)\right)^n$. There is no guarantee that the process will stop within a known number of die rolls, but it will stop with probability $1$ in a finite number of rolls.
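Here is a minimal Python sketch of this procedure (my own rendering; `random.random() < p` stands in for "the die shows $1$", and the return value says whether the event of probability $q$ occurred):

```python
import random

def bernoulli(q, p):
    # Decide an event of probability q using only a die whose side "1"
    # shows with probability p (the interval-shrinking method above).
    a, b = 0.0, 1.0
    while True:
        if b < q:
            return True   # interval entirely below q: success
        if a > q:
            return False  # interval entirely above q: failure
        mid = (1 - p) * a + p * b
        if random.random() < p:  # die came up 1
            b = mid              # keep the left subinterval
        else:                    # die came up something else
            a = mid              # keep the right subinterval

# e.g. simulate q = 0.3 with a d6 (p = 1/6 of rolling a 1)
trials = 100_000
hits = sum(bernoulli(0.3, 1 / 6) for _ in range(trials))
print(hits / trials)  # should be close to 0.3
```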




Edit: Ian asked for the expected number of rolls to know where you are.



This is rather complex, and depends on $q$ and $p$ as follows. Given an infinite sequence $\{a_i\}_1^{\infty}$ each in $\{0,1\}$, we can define $R_n=\sum_{i=1}^n a_i$ and $L_n=n-R_n$. We treat the $a_i$ as "left-right" choices in a binary tree.



Then for almost all[*] $q\in[0,1]$ there exists exactly one sequence $\{a_i\}$ such that:



$$q=\sum_i a_ip^{L_i}(1-p)^{R_i}$$



This has the advantage that if $\{a_i\}$ corresponds to $q_1$ and $\{b_i\}$ corresponds to $q_2$, then if $q_1 < q_2$, the sequence $\{a_i\}$ precedes $\{b_i\}$ in the lexicographic order.


The expected number of rolls is going to depend on the $a_i$ corresponding to $q$.



That said, let $e_p(q)$ be the expected number of rolls. We can define the expected number recursively as follows:



$$e_p(q)=\begin{cases}
1 + (1-p)\,e_p\left(\frac{q-p}{1-p}\right) & p<q\\
1 + p\,e_p\left(\frac{q}{p}\right) & p>q
\end{cases}$$



But whether $p<q$ or $q<p$ holds is determined by $a_1$: almost certainly $a_1=0$ when $q<p$ and $a_1=1$ when $p<q$.




Finally, if $a_1=0$, then $\frac{q}{p}$ corresponds to the sequence $\{a_2,a_3,\dots\}$ and if $a_1=1$ then $\frac{q-p}{1-p}$ corresponds to $\{a_2,a_3,\dots\}$.



So we really see this expected value is related to the sequence $\{a_i\}$, but it is a mess to compute it.



The value is:



$$\sum_{i=0}^\infty p^{L_i}(1-p)^{R_i}$$



which we can also see because $p^{L_i}(1-p)^{R_i}$ is the probability that $q$ is still in our interval after trial $i$.




This is no more than $$\frac{1}{1-\max(p,1-p)},$$ for any $q$.



If you want more efficient use of the die (I'm using it as a coin toss here) and it has $N$ sides with equal probabilities, then the expected number of rolls is:



$$\sum_{k=0}^\infty \frac{1}{N^k}=\frac{N}{N-1}$$
and is independent of $q$. (That's true if $p=\frac{1}{2}$ in the original approach.)


statistics - CDF from joint PDF

My joint PDF of $(x,y)$ is given by
$$f(x,y)=\begin{cases} 2e^{-x-y}, & 0\leq x\leq y<\infty\\0, & \text{otherwise}
\end{cases}$$
How do I find the joint CDF from this PDF? I know that my CDF is zero when both $x$ and $y$ are less than zero and that I'm supposed to integrate the PDF over a given region, but there is a solution in my book that says I'm supposed to calculate the CDF on this area too:



$$ 0\leq y\leq x, y>0 $$




and it equals $$e^{-2y} - 1$$



Why do I do this? Shouldn't the CDF be zero in this case? Also, isn't it redundant to say that $y>0$ in that area?

linear algebra - Determinant of block matrices

I have the following block matrix



$$\left[\begin{array}{cccc}
\mathbf{A+B} & \mathbf{B} & \cdots & \mathbf{B} \\
\mathbf{B} & \mathbf{A+B} & \cdots & \mathbf{B} \\
\vdots & \vdots & \ddots & \vdots\\
\mathbf{B} & \mathbf{B} & \cdots & \mathbf{A+B}
\end{array}\right]$$



where $\mathbf{A}$ and $\mathbf{B}$ are full rank, symmetric square matrices. There are $n$ blocks in each direction. I want to obtain the determinant of the block matrix.




I played with some examples and the determinant seems to be



$$\det(\mathbf{A})^{n-1}\det(\mathbf{A}+n\mathbf{B})$$



May I ask whether this is correct or not, and is there any proof?
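A quick numerical sanity check of the conjectured formula (a sketch; the block size, number of blocks, and random symmetric matrices are arbitrary choices of mine):

```python
import numpy as np

rng = np.random.default_rng(0)
d, n = 3, 4  # block size and number of blocks

# Random symmetric, full-rank A and B
A = rng.standard_normal((d, d)); A = A + A.T + 5 * np.eye(d)
B = rng.standard_normal((d, d)); B = B + B.T

# M = I_n (x) A + J_n (x) B has A+B on the diagonal and B elsewhere
M = np.kron(np.eye(n), A) + np.kron(np.ones((n, n)), B)

lhs = np.linalg.det(M)
rhs = np.linalg.det(A) ** (n - 1) * np.linalg.det(A + n * B)
print(lhs, rhs)  # agree up to floating-point error
```

The check exploits that the block matrix equals $I_n \otimes \mathbf{A} + J_n \otimes \mathbf{B}$, where $J_n$ is the all-ones matrix with eigenvalue $n$ (once) and $0$ ($n-1$ times); diagonalizing $J_n$ block-diagonalizes the matrix into $\mathbf{A}+n\mathbf{B}$ and $n-1$ copies of $\mathbf{A}$.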

real analysis - Proof that $\prod_{n=1}^N(1 - K/n)$ is $O(N^{-K})$



I am trying to prove Raabe's test, and one of the steps is to show that $$\prod_{n=1}^N(1 - K/n)$$ is $O(N^{-K})$. The book I'm using doesn't give me the tools to do this, so I'm wondering how it's done.




There's another proof I've found online that involves telescoping sums. However, I am interested in seeing how this proof is done.



Thanks.


Answer



If $0 < \frac{K}{n} < 1$ then we have $1 - \frac{K}{n} = e^{\log\left(1 -\frac{K}{n}\right)} = e^{-\frac{K}{n} - \frac{K^2}{2n^2} - \mathcal{O}\left(\frac{K}{n}\right)^3}$, where we used $\log(1-x) = -x - \frac{x^2}{2} + \mathcal{O}(x^3)$. Fix an $M > \lceil K \rceil$; then if $N > M$ we have



$$\prod_{n=1}^N\left(1 - \frac{K}{n}\right) = \prod_{n=1}^{M-1}\left(1 - \frac{K}{n}\right)\times \prod_{n=M}^N\left(1 - \frac{K}{n}\right) \\= \prod_{n=1}^{M-1}\left(1 - \frac{K}{n}\right)\times \prod_{n=M}^{N}e^{-\frac{K}{n} - \frac{K^2}{2n^2} - \mathcal{O}\left(\frac{K}{n}\right)^3} = \prod_{n=1}^{M-1}\left(1 - \frac{K}{n}\right) \times e^{-K\sum_{n=M}^N \frac{1}{n} + \mathcal{O}(1)}$$



The first product is independent of $N$ so it's just a constant $C$, and using the asymptotics for the harmonic number, $\sum_{n=1}^N \frac{1}{n} = \log(N) + \gamma + \mathcal{O}(N^{-1})$, we have $-K\sum_{n=M}^N \frac{1}{n} = -K\log(N) + \mathcal{O}(1)$, and it follows that




$$\prod_{n=1}^N\left(1 - \frac{K}{n}\right) = C e^{-K\log(N) + \mathcal{O}(1)} = \mathcal{O}(N^{-K})~~~\text{as}~~N\to\infty$$
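To see the claimed rate numerically, one can tabulate $N^{K}\prod_{n=1}^N(1-K/n)$ and watch it settle to a constant (a minimal sketch; $K=1.5$ is an arbitrary test value):

```python
import numpy as np

K = 1.5  # arbitrary; the n = 1 factor is negative, so the constant C is negative
for N in (10**2, 10**3, 10**4, 10**5):
    n = np.arange(1, N + 1)
    prod = np.prod(1 - K / n)
    print(N, prod * N**K)  # should settle toward the constant C
```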


Tuesday 27 November 2012

algebraic geometry - How to show that axiomatic intersection multiplicity is invariant under affine transformations?





Question: (Exercise 3.3.16, p.139, of Algebraic Geometry: A Problem-Solving Approach by Garrity et al) Show that for any polynomials $f$ and $g$ in $\mathbb{C}[x,y]$ and a point $p$ in $\mathbb{C}^2$, for any affine change of coordinates $T$ we have $$I(p, V(f) \cap V(g)) = I(T(p), V(T^{-1} f) \cap V(T^{-1} g))$$ Hint: this problem is actually not that hard. Its solution involves little or no calculations.




I fail to see how one can prove or disprove this using the axiomatic definition of intersection multiplicity, since the definition only refers to a single point, but this problem asks to compare calculations at different points.



Is the idea simply supposed to be that for all of the axioms, an affine change of coordinates leaves their corresponding conditions invariant, i.e. $p$ lies on a common component of $V(f)$ and $V(g)$ if and only if $T(p)$ lies on a common component of $V(T^{-1}f)$ and $V(T^{-1}g)$?



How can we show this invariance for each of the axioms using the definition of affine transformation and change of coordinates? (i.e. using formulas?)



It says briefly (on p. 60) of Kirwan's Complex Algebraic Curves, that





Since the conditions are independent of the choice of coordinates we may assume that $p = [0\ 0\ 1]$.




It also says that we can assume that $f$ and $g$ are irreducible by 3. and 4. (using the numbering of the axioms I give below), that $I(p, V(f) \cap V(g))$ is finite by 1., that $I(p, V(f) \cap V(g))=k>0$ by 2., and that by induction on $k$ we may assume that any intersection multiplicity strictly less than $k$ can be calculated using only the conditions 1.-6. (The idea being to show that the intersection multiplicity can always be calculated using only these axioms, and thus that these axioms must determine the intersection multiplicity completely.)



Just like in Garrity et al, no careful proof is given of any of these claims. I get the "idea" behind all of them, none of them seem shocking and they all seem plausible, but I don't think that I know how to prove them rigorously if I needed to do so.



Context: This might be easier using the definition of intersection multiplicity in terms of resultants -- this is not what I am asking about here.




I am talking about the axiomatic definition (which turns out to be equivalent to that given in terms of resultants). I am copy-pasting the definition given in Garrity et al, Algebraic Geometry: A Problem Solving Approach p. 138, theorem 3.3.12, although similar definitions can be found on Wikipedia and in Kirwan's Complex Algebraic Curves, Theorem 3.18, p. 59:




Given polynomials $f$ and $g$ in $\mathbb{C}[x,y]$ and a point $p$ in $\mathbb{C}^2$, the intersection multiplicity is the uniquely defined number $I(p, V(f) \cap V(g))$ such that the following axioms are satisfied:



1. $I(p, V(f) \cap V(g)) \in \mathbb{Z}_{\ge 0}$, unless $p$ lies on a common component of $V(f)$ and $V(g)$, in which case $I(p, V(f) \cap V(g)) = \infty$.



2. $I(p, V(f) \cap V(g)) = 0$ if and only if $p \not\in V(f) \cap V(g)$.




3. Two distinct lines meet with intersection multiplicity one at their common point of intersection.



4. $I(p, V(f) \cap V(g)) = I(p, V(g) \cap V(f))$.



5. $I(p, V(f) \cap V(g)) = \sum r_i s_j I(p, V(f_i) \cap V(g_j))$ when $f= \prod f_i^{r_i}$ and $g = \prod g_j^{s_j}$.



6. $I(p, V(f) \cap V(g)) = I(p, V(f) \cap V(g+af))$ for all $a \in \mathbb{C}[x,y]$.



Answer



I struggled with this too. It seems like an elaborate induction could be done. First note that Axioms 1, 2, 3 and 4 do not depend on a choice of coordinates. The base case of the argument would be to invoke Axiom 3 to show that $I(p,C,D) = I(Tp,TC,TD)$ when $C$ and $D$ are lines. If the curves $C$ and $D$ have degree larger than one, then use Axioms 5 and 6 to reduce the computation of $I(Tp,TC,TD)$ to that of lower-degree curves and invoke the induction hypothesis.




Also, I believe that Kirwan's proof of uniqueness could avoid choosing a favored coordinate system, although it would be notationally more burdensome. With that, one could show that for a given transformation the function $I(T\cdot,T\cdot,T\cdot)$ satisfies the same axioms as $I$, and so they have to be the same function.



Either way, I don't think either of these things is what the authors had in mind.



Finally, Fulton includes invariance under change of coordinates as an axiom. This seems like a reasonable thing to demand: given that the point of writing these axioms is not to get away with the fewest number of axioms, but to sweep the details of a good definition under the rug, it is a natural condition to throw into the mix. Indeed, given a good definition, $I(p,V(f),V(g)) = \dim_k k[x,y]_p/(f,g)$, it is clear that the ring on the right will be isomorphic after applying an affine change of variables.


algebra precalculus - Review of solved Pre-Calculus/Calculus questions (Range of subjects)



I really need some help. This has been driving me crazy for the past week! I've been working on this calculus sudoku and I cannot solve it, so I have come to the conclusion that I must have incorrectly answered at least one question. Are there any glaring mistakes here?
[Image: Calculus Sudoku]



Thank you very much!


Answer



Since $\sin \pi = 0$, $7x^{\sin \pi} = 7x^0$ should be $7$, rather than $1$.



calculus - Evaluate $\sum_{r=1}^{\infty} \frac{\sin(r\pi x)}{r \cdot y^r}$




Find a closed form expression for



$$\sum_{r=1}^{\infty} \dfrac{\sin(r\pi x)}{r \cdot y^r}$$




I know that $\displaystyle\sum_{r=1}^{\infty} \dfrac{\sin(r \pi x)}{r} = \dfrac{\pi}{2} - \left\{\dfrac{x}{2}\right \}$ but I don't know how to obtain a closed form for the required summation. I thought about using Euler's Formula but it became messy.




Any help will be appreciated.
Thanks.


Answer



$$
\sum_{r=1}^{\infty} \dfrac{\sin(r\pi x)}{r \cdot y^r}=\mathrm{Im}\sum_{r=1}^{\infty} \dfrac{\exp(\mathrm{i}r\pi x)}{r \cdot y^r}=\mathrm{Im}\int_0^\infty ds \sum_{r=1}^{\infty} \left(\frac{e^{\mathrm{i}\pi x-s}}{y}\right)^r=\mathrm{Im}\int_0^\infty ds\frac{e^{i \pi x}}{e^s y-e^{i \pi x}}=
$$
$$
=-\mathrm{Im}\log \left(1-\frac{e^{i \pi x}}{y}\right)=- \mathrm{arctan}\left(1-\frac{\cos (\pi x)}{y},-\frac{\sin (\pi x)}{y}\right)\ ,
$$
where $\log$ is the principal branch of the complex logarithm, and we used $1/z=\int_0^\infty ds\ e^{-s z}$ for $z>0$. The two-argument arctan function is described here: https://reference.wolfram.com/language/ref/ArcTan.html. I checked a few cases with Mathematica and it seems to work.
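The closed form is also easy to cross-check numerically (a sketch; note that `np.arctan2(y, x)` takes its arguments in the opposite order from Mathematica's `ArcTan[x, y]`, and the test point is an arbitrary choice with $|y|>1$ so the series converges):

```python
import numpy as np

x, y = 0.3, 1.7  # arbitrary test point
r = np.arange(1, 501)
series = np.sum(np.sin(r * np.pi * x) / (r * y**r))

# -Im log(1 - e^{i pi x}/y) via the two-argument arctan
closed = -np.arctan2(-np.sin(np.pi * x) / y, 1 - np.cos(np.pi * x) / y)
print(series, closed)  # should agree to many digits
```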



summation - Deriving the formula for the $n^{th}$ tetrahedral number



$T(n)$ is the $n^{th}$ triangular number, where $T\left(n\right)=\frac{n^2+n}{2}$



And from other sources I know the $n^{th}$ tetrahedral number is



$G\left(x\right)=\sum _{n=1}^xT\left(n\right)=\frac{\left(x^2+x\right)\left(x+2\right)}{6}$



I also happen to know that the formula for the volume of a tetrahedron is:




$V=\frac{1}{3}Ah$,



where $A$ is the area of the triangular base and $h$ is the height.



If I sat down one day not knowing the formula for $G(x)$ and wanted to create a function to find the $n^{th}$ tetrahedral number, how do I derive it?



I've seen proofs. I want to know how the proof authors arrived at that formula in the first place.


Answer



In general, if $f(n)$ is a polynomial with degree $k$, and if $$\sum_{x=1}^n f(x) = g(n)$$




then $g(n)$ must be a polynomial with degree $k+1$.






This means that since triangular numbers are given by a polynomial of degree $2$, tetrahedral numbers must be given by a polynomial of degree $3$.



Let $a,b,c,d \in \mathbb{R}$ such that the $n^\text{th}$ tetrahedral number $G(n)$ is given by



$$an^3 + bn^2 + cn + d$$




We know immediately that $d=0$ because $G(0)=0$ (the empty sum).



Now we can simply list out any three tetrahedral numbers to find the general formula. Let's use the first three.



\begin{align*}
G(1) \,&=\, T(1) = 1 \\\\
G(2) \,&=\,T(1) + T(2) = 1 + 3 \\\\
G(3) \,&=\,T(1) + T(2) + T(3) = 1 + 3 + 6 \\
\end{align*}




Rewriting, we get these equations involving the coefficients:



\begin{align*}
1 &= a + b + c\\\\
4 &= 8a + 4b + 2c\\\\
10 &= 27a + 9b + 3c\\
\end{align*}



Three linear equations, three variables. While it's a bit tedious to do the row reduction, it's definitely one way to derive the formula.
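The row reduction can also be delegated to a solver; a small sketch:

```python
import numpy as np

# Coefficient matrix for G(n) = a n^3 + b n^2 + c n at n = 1, 2, 3
M = np.array([[1, 1, 1],
              [8, 4, 2],
              [27, 9, 3]], dtype=float)
rhs = np.array([1, 4, 10], dtype=float)

a, b, c = np.linalg.solve(M, rhs)
print(a, b, c)  # 1/6, 1/2, 1/3, i.e. G(n) = n(n+1)(n+2)/6
```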


probability - Limit of Poisson Distribution



Just for fun, I'm looking at the concentration of the Poisson distribution near its mean. For $\lambda=10$, there is a 36% probability of being within 10% of the mean. For $\lambda=100$, that number jumps to 70%, and at $\lambda = 1000$, it's nearly 100%.
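These percentages are easy to reproduce (a sketch using `scipy.stats`; I take the 10% window to be the integers from $\lceil 0.9\lambda \rceil$ to $\lfloor 1.1\lambda \rfloor$):

```python
import math
from scipy import stats

for lam in (10, 100, 1000):
    lo = math.ceil(0.9 * lam)
    hi = math.floor(1.1 * lam)
    p = stats.poisson.cdf(hi, lam) - stats.poisson.cdf(lo - 1, lam)
    print(lam, round(p, 3))  # ~0.36, ~0.71, ~0.999
```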




I've just taken a course in Probability...and I'm trying to make it more concrete in my mind. I'm just looking for a way to formalize this observation, so I can relate it to what I've learned.



It seems I have for $X_n$ Poisson w/ Parameter $n$, $P\big(\big|\frac{X_n}{n}-1\big| > \epsilon\big) \rightarrow 0$ as $n \rightarrow \infty$.



Is that the strongest thing I can say? How would you say this in the language of probability? It's almost like convergence in probability, except for the division by $n$. Thanks!


Answer



This is a case of the Weak Law of Large Numbers, since $X_n$ is the sum of $n$ independent Poisson(1) random variables. A stronger statement is given by the
Central Limit Theorem. See also the theory of Large Deviations, which tells you
that for $a > 1$,
$$\lim_{n \to \infty} \dfrac{1}{n} \log P(X_n/n \ge a ) = - a \ln(a) + a - 1 $$

(if I've done my calculations right).
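The large-deviations rate can be checked numerically as well (a sketch; $a = 1.5$ is an arbitrary test value, and `poisson.logsf(a*n - 1, n)` gives $\log P(X_n \ge an)$):

```python
import numpy as np
from scipy import stats

a = 1.5
rate = -a * np.log(a) + a - 1  # about -0.108
for n in (10, 100, 1000, 10000):
    logp = stats.poisson.logsf(a * n - 1, n)  # log P(X_n >= a*n)
    print(n, logp / n, rate)  # ratio approaches the rate as n grows
```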


Monday 26 November 2012

Trying to correctly write the proof, using *strong* induction, of the formula for the sum of the first $n$ positive integers

I'm learning about proofs using induction, and our professor wants us to always use strong induction when giving proofs. In my understanding, strong induction means assuming the statement for the whole range of values you have already shown to be true. I want to be able to write a complete and clear proof. I'm not really understanding how to correctly write the inductive hypothesis for this problem: for all natural numbers $n$, prove $1+2+3+\cdots+n = \frac{n(n+1)}{2}$.




I already showed the base case is true when $n = 1$. But for the inductive step, I want to know if I did it right or not. Even if it's correct, is there anything I can add or remove to make it clearer? This is what I have so far:



Inductive hypothesis:
If $n \le k$ and $n \ge 1$, assume $1+2+3+\cdots+k = \frac{k(k+1)}{2}$ is true for $1 \le n \le k$.
Prove $1+2+3+\cdots+k+(k+1) = \frac{(k+1)(k+1+1)}{2}$.



I know how to show the proof after this, I just want to understand how to write the inductive hypothesis. Thanks for any help.
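For comparison, one conventional phrasing of a strong inductive hypothesis for this statement (my wording, offered as a sketch) is:

Assume that for every integer $m$ with $1 \le m \le k$ we have $$1+2+\cdots+m = \frac{m(m+1)}{2};$$ then show that $1+2+\cdots+k+(k+1) = \frac{(k+1)(k+2)}{2}$.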

real analysis - How to find $\lim_{h\rightarrow 0}\frac{\sin(ha)}{h}$

How can I find $\lim_{h\rightarrow 0}\frac{\sin(ha)}{h}$ without L'Hôpital's rule? I know when I use L'Hôpital I easily get $$ \lim_{h\rightarrow 0}...