Theoretically, why does this work? How do we know the resulting matrix performs the operation on ALL other matrices?
I guess what I'm asking is whether you can prove without "checking" the resulting matrix that it performs the same operation on all other matrices.
So something like: Let $E_o$ be an elementary operation. Then applying $E_o$ to $I$, we get
$I \to E_m$, where $E_m$ is the matrix associated with the elementary operation.
Three questions:
1) How do we know $I$ functions as the "identity" when we're applying these operations? All we know is that it works that way when we multiply.
2) How do we know $E_m$ performs the operation we applied?
3) How do we know $E_m$ works for all matrices?
It's really crucial that the proof doesn't explicitly find the resulting matrix and then just check that it does what it's supposed to for all matrices.
Answer
$$
\newcommand{\mm} {\mathbf M}
\newcommand{\mk} {\mathbf K}
\newcommand{\mi} {\mathbf I}
\newcommand{\me} {\mathbf E}
\newcommand{\ml} {\mathbf L}
$$
If you extend your notion of "elementary operation" to include the "do nothing operation" --- call it $N$ for "nothing" --- then the corresponding matrix is $\mi$. That is to say, applying $N$ to a matrix $\mm$ gives the same result as multiplying the matrix by $\mi$:
$$
N(\mm) = \mi \cdot \mm
$$
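If it helps to see that concretely, here's a tiny sketch in Python/NumPy (the matrix is just an arbitrary example; this is an illustration, not part of the argument):

```python
import numpy as np

M = np.array([[1., 2.],
              [3., 4.]])       # an arbitrary example matrix
I = np.eye(2)

# The "do nothing" operation leaves M alone, and so does multiplying by I.
print(np.array_equal(I @ M, M))   # True
```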
Now let's look at some other elementary operation $S$; we can compute
$$
\me = S(\mi), \tag{1}
$$
right? That is to say, the operation $S$, applied to $\mi$, must produce some matrix, and I'm going to call that matrix $\me$.
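For instance, if $S$ is "swap rows $0$ and $1$" (a hypothetical concrete choice, using zero-based indices as in code), computing $\me = S(\mi)$ looks like the sketch below; the function name `swap_rows` is mine, not anything standard:

```python
import numpy as np

def swap_rows(M, p, q):
    """The elementary operation S: swap rows p and q (returns a new matrix)."""
    out = M.copy()
    out[[p, q]] = out[[q, p]]
    return out

I = np.eye(3)
E = swap_rows(I, 0, 1)   # equation (1): E = S(I)
print(E)
# [[0. 1. 0.]
#  [1. 0. 0.]
#  [0. 0. 1.]]
```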
Now I'm going to define a new operation $T$ on matrices:
$$
T(\mm) = \me \cdot \mm.
$$
And your question, now that I've got the definitions out of the way, is
"How do I know that
$$S(\mm) = T(\mm)$$
for every matrix $\mm$?"
The first thing to realize is that what I've written above, without the word 'elementary' in "some other elementary operation $S$", is not enough to guarantee this result. It's possible that $S$ might be defined so that $S(\mi) = \me$, where $\me$ is some non-identity matrix, but $S(\mm) = \mi$ for every $\mm \ne \mi$. In that case, the claim that $S(\mm) = T(\mm)$ would be false.
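Here's a sketch of such a badly behaved (and decidedly non-elementary) operation, just to make the counterexample concrete; `weird_S` is a name I made up:

```python
import numpy as np

I = np.eye(2)
E = np.array([[0., 1.],
              [1., 0.]])        # some non-identity matrix

def weird_S(M):
    """Not elementary: sends I to E and every other matrix to I."""
    return E if np.array_equal(M, I) else I

M = np.array([[1., 2.],
              [3., 4.]])
T_of_M = E @ M                              # T(M), built from weird_S(I) = E
print(np.array_equal(weird_S(M), T_of_M))   # False: S(M) = I, but T(M) = E @ M
```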
So we're going to need to use the notion of "elementary" somehow. It turns out that the key property that elementary operations share --- the one that makes this proof go through --- is that
$$
S(\mm \cdot \mk) = S(\mm) \cdot \mk \tag{2}
$$
where the $\cdot$ here represents matrix multiplication. Supposing, for the moment, that our elementary operation has this property, it's easy to prove that $S(\mm) = T(\mm)$ as follows. We know that
\begin{align}
\mi \cdot \mm &= \mm & \text{property of identity} \\
S(\mi \cdot \mm) &= S(\mm) & \text{apply $S$ to both sides} \\
S(\mi) \cdot \mm &= S(\mm) & \text{use equation 2 above} \\
\me \cdot \mm &= S(\mm) & \text{use equation 1 above} \\
\end{align}
and that's the result we wanted, because the left hand side is exactly the definition of $T(\mm)$.
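If you'd like to see equation (2) and this conclusion in action on a single example (purely as an illustration; it is not, and cannot be, a substitute for the proof), here's a sketch for the row-swap case:

```python
import numpy as np

def swap_rows(M, p, q):
    """The elementary row-swap operation S."""
    out = M.copy()
    out[[p, q]] = out[[q, p]]
    return out

rng = np.random.default_rng(0)
M = rng.normal(size=(3, 3))
K = rng.normal(size=(3, 4))
E = swap_rows(np.eye(3), 0, 2)                     # equation (1): E = S(I)

print(np.allclose(swap_rows(M @ K, 0, 2),
                  swap_rows(M, 0, 2) @ K))         # equation (2): S(M.K) = S(M).K
print(np.allclose(swap_rows(M, 0, 2), E @ M))      # the conclusion: S(M) = E.M = T(M)
```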
Now how do we prove that fundamental claim, equation 2? We need to use elementary-ness to do so. It's going to come down to cases, alas. For instance, if $S$ is an operation that swaps rows $p$ and $q$, I need to write out $S$ explicitly:
$$
S(\mm)_{ab} = \begin{cases}
m_{ab} & a \ne p \text{ and } a \ne q \\
m_{pb} & a = q \\
m_{qb} & a = p \\
\end{cases}
$$
That is to say, the $ab$-entry of $S(\mm)$ is the $ab$-entry of $\mm$, unless the row index $a$ is either $p$ or $q$, in which case you have to swap things around.
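Written as code, that entrywise definition is the following sketch (the name `S_entrywise` is mine):

```python
import numpy as np

def S_entrywise(M, p, q):
    """Row swap written entry by entry, following the case formula above."""
    out = np.empty_like(M)
    n_rows, n_cols = M.shape
    for a in range(n_rows):
        for b in range(n_cols):
            if a == p:
                out[a, b] = M[q, b]   # row p of the result is row q of M
            elif a == q:
                out[a, b] = M[p, b]   # row q of the result is row p of M
            else:
                out[a, b] = M[a, b]   # every other row is untouched
    return out

M = np.arange(9.0).reshape(3, 3)
print(S_entrywise(M, 0, 2))   # rows 0 and 2 exchanged
```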
Now looking at a pair of matrices $\mm$ and $\mk$, we need to know a formula for the $ab$ entry of the product $\mm \cdot \mk$; that's
$$
(mk)_{ab} = \sum_i m_{ai} k_{ib}.
$$
I'm following the usual convention that the entries of $\mm$ are $m_{ij}$ and those of $\mk$ are $k_{ij}$, and using the two-letter name "$mk$" (in parens) to indicate entries of the product, so that the $i$th row, $j$th column entry of $\mm \cdot \mk$ is denoted $(mk)_{ij}$.
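Spelled out as explicit loops, that product formula is the following sketch (again, the function name is mine):

```python
import numpy as np

def matmul_entrywise(M, K):
    """(MK)_{ab} = sum_i M_{ai} K_{ib}, written out as explicit loops."""
    rows, inner = M.shape
    inner_k, cols = K.shape
    assert inner == inner_k
    MK = np.zeros((rows, cols))
    for a in range(rows):
        for b in range(cols):
            for i in range(inner):
                MK[a, b] += M[a, i] * K[i, b]
    return MK

M = np.arange(6.0).reshape(2, 3)
K = np.arange(12.0).reshape(3, 4)
print(np.allclose(matmul_entrywise(M, K), M @ K))   # True
```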
And now we can look at $S(\mm \cdot \mk)_{ab}$ and $ (S(\mm) \cdot \mk)_{ab}$ and compare to see whether they're the same.
Well
$$
\begin{align}
S(\mm \cdot \mk)_{ab}
&= \begin{cases}
(mk)_{ab} & a \ne p \text{ and } a \ne q \\
(mk)_{pb} & a = q \\
(mk)_{qb} & a = p
\end{cases} & \text{definition of $S$} \\
&= \begin{cases}
\sum_i m_{ai} k_{ib} & a \ne p \text{ and } a \ne q \\
\sum_i m_{pi} k_{ib} & a = q \\
\sum_i m_{qi} k_{ib} & a = p
\end{cases} & \text{definition of matrix multiply} \\
&=\sum_i \begin{cases}
m_{ai} k_{ib} & a \ne p \text{ and } a \ne q \\
m_{pi} k_{ib} & a = q \\
m_{qi} k_{ib} & a = p
\end{cases} & \text{the cases don't depend on $i$, so pull $\sum_i$ outside} \\
&=\sum_i \left( \begin{cases}
m_{ai} & a \ne p \text{ and } a \ne q \\
m_{pi} & a = q \\
m_{qi} & a = p
\end{cases} \right) k_{ib} & \text{extract the common factor $k_{ib}$ from each branch} \\
&= \sum_i S(\mm)_{ai} k_{ib} & \text{replace large parens with $S(\mm)_{ai}$, because they match} \\
&= (S(\mm) \cdot \mk)_{ab}.
\end{align}
$$
So they are, in fact, equal, and we're done (with this case).
That still leaves the "multiply a row by a constant" case, and the "add $r$ times one row to another" case to work out, but the proofs for those cases are nearly identical to this one, so I leave that to you.
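If you want to experiment with those two remaining cases before writing out the proofs, here's a sketch of both operations and a spot-check of property (2) for each (a check on one random example, not a proof):

```python
import numpy as np

def scale_row(M, p, c):
    """Multiply row p by the constant c (c must be nonzero for an elementary operation)."""
    out = M.copy()
    out[p] *= c
    return out

def add_row_multiple(M, p, q, r):
    """Add r times row q to row p."""
    out = M.copy()
    out[p] += r * out[q]
    return out

rng = np.random.default_rng(1)
M = rng.normal(size=(3, 3))
K = rng.normal(size=(3, 3))

# Property (2) again: apply the operation to M.K, or to M first and then multiply by K.
print(np.allclose(scale_row(M @ K, 1, 5.0), scale_row(M, 1, 5.0) @ K))
print(np.allclose(add_row_multiple(M @ K, 0, 2, -3.0), add_row_multiple(M, 0, 2, -3.0) @ K))
```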