Thursday 26 December 2019

Linear independence in construction of Jordan canonical form basis for nilpotent endomorphisms



I am proving by construction that for a nilpotent endomorphism there is some basis in which it has a Jordan canonical form with ones only on the supradiagonal. I'll write down what I have so far and stop where my problem is, so that you can follow the same line of reasoning I did.



What I want to prove is:




Theorem



Let $V$ be a finite-dimensional vector space over $\mathbb{C}$ and let $T\in\mathcal{L}(V)$ be an $r$-nilpotent endomorphism. There is some basis of $V$ in which the matrix representation of $T$ is a block diagonal matrix whose blocks have the form
\begin{align*}
\left( \begin{array}{cccccc}
0 &1 &0 &0 &\dots &0\\
0 &0 &1 &0 &\dots &0\\
0 &0 &0 &1 &\dots &0\\
\vdots &\vdots &\vdots &\vdots &\ddots &\vdots\\
0 &0 &0 &0 &\dots &1\\
0 &0 &0 &0 &\dots &0
\end{array}
\right)
\end{align*}
that is, blocks whose entries are all zero except for the supradiagonal, which is filled with ones.



Proof



First, since $T$ is an $r$-nilpotent endomorphism we have $T^{r}=0_{\mathcal{L}(V)}$. Write $U_{k}=T^{k}(V)$. Then $U_{1}=T(V)\subseteq V=\operatorname{id}(V)=T^{0}(V)=U_{0}$, hence $U_{2}=T^{2}(V)=T(T(V))\subseteq T(V)=U_{1}$, and if we suppose that $U_{k}=T^{k}(V)\subseteq T^{k-1}(V)=U_{k-1}$, we conclude that $U_{k+1}=T^{k+1}(V)=T(T^{k}(V))\subseteq T(T^{k-1}(V))=T^{k}(V)=U_{k}$. So we have proven by induction on $k$ that $U_{k}\subseteq U_{k-1}$. Since $T^{r}=0_{\mathcal{L}(V)}$ and $U_{k}=T(U_{k-1})$, we get the chain $\{0_{V}\}=U_{r}\subseteq U_{r-1}\subseteq\dots\subseteq U_{1}\subseteq U_{0}=V$; along the way we have also shown that the $U_{k}$ are $T$-invariant subspaces and that $U_{r-1}\subseteq\ker T$.
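
For concreteness, here is a quick check of this chain on a small example (a sketch using SymPy; the particular $5\times 5$ nilpotent matrix below is only an illustration I picked, not part of the argument):

```python
from sympy import Matrix

# An illustrative nilpotent matrix with nilpotency degree r = 3.
T = Matrix([
    [0, 1, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 1],
    [0, 0, 0, 0, 0],
])

# dim U_k = dim T^k(V) = rank(T^k): the chain shrinks down to {0_V} at k = r.
print([(T**k).rank() for k in range(5)])   # [5, 3, 1, 0, 0]
```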




In the same manner, let $W_{0}=\ker T^{0}=\ker\operatorname{id}=\{0_{V}\}$ and $W_{k}=\ker T^{k}$. It is easy to see that $T(W_{0})=T(\{0_{V}\})=\{0_{V}\}$, therefore $W_{0}\subseteq W_{1}$; moreover $T^{2}(W_{1})=T(T(W_{1}))=T(\{0_{V}\})=\{0_{V}\}$, therefore $W_{1}\subseteq W_{2}$. Now suppose $W_{k-1}\subseteq W_{k}$; then $T^{k+1}(W_{k})=T(T^{k}(W_{k}))=T(\{0_{V}\})=\{0_{V}\}$, so $W_{k}\subseteq W_{k+1}$. We conclude that we have the chain of nested subspaces $\{0_{V}\}=W_{0}\subseteq W_{1}\subseteq\dots\subseteq W_{r-1}\subseteq W_{r}=V$, since $W_{r}=\ker T^{r}=\ker 0_{\mathcal{L}(V)}=V$.
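
The kernel chain can be checked in the same way on the same illustrative matrix; the last line also confirms the rank-nullity relation $\dim W_{k}+\dim U_{k}=\dim V$, which ties the two chains together:

```python
from sympy import Matrix

# Same illustrative nilpotent matrix as before (r = 3).
T = Matrix([
    [0, 1, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 1],
    [0, 0, 0, 0, 0],
])

# dim W_k = dim ker(T^k): the chain grows from {0_V} up to V at k = r.
print([len((T**k).nullspace()) for k in range(4)])                  # [0, 2, 4, 5]

# Rank-nullity for T^k: dim W_k + dim U_k = dim V for every k.
print([len((T**k).nullspace()) + (T**k).rank() for k in range(4)])  # [5, 5, 5, 5]
```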



Since we have a chain of nested subspaces whose largest member is $V$ itself, we can choose a basis for the smallest non-trivial one of them (Supposing $U_{r}\neq U_{r-1}$), that is $U_{r-1}$, and then climb the chain, constructing a basis for each larger space by completing the basis we already have, which is always possible.



Now, since $U_{r-1}\subseteq\ker T$, every non-zero vector in $U_{r-1}$ is an eigenvector for the eigenvalue $0$. Hence every basis we choose for $U_{r-1}$ is a basis of eigenvectors. To complete this basis $\{u_{i}^{(r-1)}\}$ to a basis of $U_{r-2}$ (supposing $U_{r-1}\neq U_{r-2}$), recall that $T(U_{r-2})=U_{r-1}$, so every vector in $U_{r-1}$ has a preimage in $U_{r-2}$. Then there are some $u_{i}^{(r-2)}\in U_{r-2}$ (maybe many for each $i$ since we don't know $T$ is injective) such that $T(u_{i}^{(r-2)})=u_{i}^{(r-1)}$. Note that for fixed $i$ it is not possible that $u_{i}^{(r-2)}=u_{i}^{(r-1)}$: like every vector of $U_{r-1}$, the vector $u_{i}^{(r-1)}$ is annihilated by $T$, whereas $T(u_{i}^{(r-2)})=u_{i}^{(r-1)}\neq 0_{V}$. Since the preimages are not unique, we choose one and only one for each $i$. It only remains to see that they are linearly independent: take a null linear combination $\sum_{i}\alpha_{i}u_{i}^{(r-1)}+\sum_{i}\beta_{i}u_{i}^{(r-2)}=0_{V}$ and apply $T$ to both sides: $\sum_{i}\alpha_{i}T(u_{i}^{(r-1)})+\sum_{i}\beta_{i}T(u_{i}^{(r-2)})=\sum_{i}\alpha_{i}0_{V}+\sum_{i}\beta_{i}u_{i}^{(r-1)}=\sum_{i}\beta_{i}u_{i}^{(r-1)}=0_{V}$. Since the last sum is a null linear combination of linearly independent vectors (they form a basis of $U_{r-1}$), we get $\beta_{i}=0$ for every $i$. The initial expression then reduces to $\sum_{i}\alpha_{i}u_{i}^{(r-1)}=0_{V}$, and $\alpha_{i}=0$ for every $i$ by the same argument. We conclude that they are linearly independent.
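
Here is how this choice of a preimage looks numerically (again a sketch with SymPy and the illustrative matrix above; the preimage is deliberately sought inside $U_{r-2}=T(V)$, and the particular solution picked is just one of the many possible choices):

```python
from sympy import Matrix

T = Matrix([
    [0, 1, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 1],
    [0, 0, 0, 0, 0],
])

# U_{r-1} = T^2(V) is one-dimensional here, spanned by u.
u = (T**2).columnspace()[0]

# Look for a preimage v of u inside U_{r-2} = T(V): write v = B*c with the
# columns of B a basis of T(V), and solve T*(B*c) = u for c.
B = Matrix.hstack(*T.columnspace())
c, params = (T * B).gauss_jordan_solve(u)
c = c.subs({p: 0 for p in params})       # the solution is not unique; fix one
v = B * c

print(T * v == u)                        # True: v is a preimage of u
print(Matrix.hstack(u, v).rank() == 2)   # True: {u, v} is linearly independent
```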



At this point we have a linearly independent set $\{u_{i}^{(r-1)},u_{i}^{(r-2)}\}$ of vectors in $U_{r-2}$. If $\dim U_{r-2}=2\dim U_{r-1}$, then we have finished the construction at this level; if not (that is, if $\dim U_{r-2}\geq 2\dim U_{r-1}+1$), then we have to choose additional vectors $u_{j}^{(r-2)}$, with $j=2\dim U_{r-1}+1,\dots,\dim U_{r-2}$, that complete the set to a basis of $U_{r-2}$. Again, as in the construction of the $u_{i}^{(r-2)}$, recall that $T(U_{r-2})=U_{r-1}$. Therefore every vector we might choose satisfies $T(v_{j}^{(r-2)})=\sum_{i}\mu_{ji}u_{i}^{(r-1)}$. But since we want the new vectors to be linearly independent from the $u_{i}^{(r-1)}$ and the $u_{i}^{(r-2)}$, we can take them from $\ker T$: set $u_{j}^{(r-2)}=v_{j}^{(r-2)}-\sum_{i}\mu_{ji}u_{i}^{(r-2)}$, so that applying $T$ gives $T(u_{j}^{(r-2)})=T(v_{j}^{(r-2)})-\sum_{i}\mu_{ji}T(u_{i}^{(r-2)})=\sum_{i}\mu_{ji}u_{i}^{(r-1)}-\sum_{i}\mu_{ji}u_{i}^{(r-1)}=0_{V}$. It only remains to check that they are linearly independent from the others. Take again a null linear combination $\sum_{i}\alpha_{i}u_{i}^{(r-1)}+\sum_{i}\beta_{i}u_{i}^{(r-2)}+\sum_{j}\gamma_{j}u_{j}^{(r-2)}=0_{V}$. Applying $T$ to both sides gives $\sum_{i}\alpha_{i}T(u_{i}^{(r-1)})+\sum_{i}\beta_{i}T(u_{i}^{(r-2)})+\sum_{j}\gamma_{j}T(u_{j}^{(r-2)})=\sum_{i}\alpha_{i}0_{V}+\sum_{i}\beta_{i}u_{i}^{(r-1)}+\sum_{j}\gamma_{j}0_{V}=\sum_{i}\beta_{i}u_{i}^{(r-1)}=0_{V}$, and therefore $\beta_{i}=0$ for every $i$, since $\{u_{i}^{(r-1)}\}$ is a basis of $U_{r-1}$. The initial expression then takes the form $\sum_{i}\alpha_{i}u_{i}^{(r-1)}+\sum_{j}\gamma_{j}u_{j}^{(r-2)}=0_{V}$. Note that we now have two sets of vectors that lie in $\ker T$...



This is the point where I don't see a way to show that $\alpha_{i}=0$ and $\gamma_{j}=0$ for every $i$ and $j$, in order to conclude that they are linearly independent. Any kind of help (hints more than anything else) would be welcome.



Answer



Mostly, you are both on the right track and everything you say is correct, though there are a few spots where a bit more thought could let you be sharper. Let me discuss them first.



You note along the way that "(Supposing $U_r\neq U_{r-1}$)". In fact, we know that for each $i$, $0\leq i\lt r$, $U_{i+1}\neq U_i$. The reason is that if we have $U_{i+1}=U_i$, then that means that $U_{i+2}=T(U_{i+1}) = T(U_i) = U_{i+1}$, and so we have reached a stabilizing point; since we know that the sequence must end with the trivial subspace, that would necessarily imply that $U_i=\{\mathbf{0}\}$. But we are assuming that the degree of nilpotence of $T$ is $r$, so that $U_i\neq\{\mathbf{0}\}$ for any $i\lt r$; hence $U_{i+1}\neq U_i$ is a certainty, not an assumption.



You also comment parenthetically: "(maybe many for each $i$ since we don't know $T$ is injective)". Actually, we know that $T$ is definitely not injective, because $T$ is nilpotent. The only way $T$ could be both nilpotent and injective is if $\mathbf{V}$ is zero dimensional. And since every vector of $U_{r-1}$ is mapped to $0$, it is certainly the case that the restriction of $T$ to $U_i$ is not injective for any $i$, $0\leq i\lt r$.



As to what you are doing: suppose $u_1,\ldots,u_t$ form a basis for $U_{r-1}$, and $v_1,\ldots,v_t$ are vectors in $U_{r-2}$ such that $T(v_i) = u_i$. We want to show that $\{u_1,\ldots,u_t,v_1,\ldots,v_t\}$ is linearly independent; you can do that the way you did before: take a linear combination equal to $\mathbf{0}$,
$$\alpha_1u_1+\cdots+\alpha_tu_t + \beta_1v_1+\cdots+\beta_t v_t = \mathbf{0}.$$
Apply $T$ to get $\beta_1u_1+\cdots + \beta_tu_t=\mathbf{0}$ and conclude the $\beta_j$ are zero; and then use the fact that $u_1,\ldots,u_t$ is linearly independent to conclude that $\alpha_1=\cdots=\alpha_t=0$.




Now, this may not be a basis for $U_{r-2}$, since there may be elements of $\mathrm{ker}(T)\cap U_{r-2}$ that are not in $U_{r-1}$.



The key is to choose what is missing so that they are linearly independent from $u_1,\ldots,u_t$. How can we do that? Note that $U_{r-1}\subseteq \mathrm{ker}(T)$, so in fact $U_{r-1}\subseteq \mathrm{ker}(T)\cap U_{r-2}$.
So we can complete $\{u_1,\ldots,u_t\}$ to a basis for $\mathrm{ker}(T)\cap U_{r-2}$ with some vectors $z_1,\ldots,z_s$.
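
If it helps to see this step concretely, here is a sketch with SymPy on a toy nilpotent matrix (the matrix is only an illustration; the intersection $\mathrm{ker}(T)\cap U_{r-2}$ is computed with the usual nullspace trick, and the completion of $\{u_1,\ldots,u_t\}$ is done greedily by rank):

```python
from sympy import Matrix

# Toy nilpotent matrix with r = 3 (Jordan blocks of sizes 3 and 2).
T = Matrix([
    [0, 1, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 1],
    [0, 0, 0, 0, 0],
])

U1 = Matrix.hstack(*T.columnspace())   # columns span U_{r-2} = T(V)
K  = Matrix.hstack(*T.nullspace())     # columns span ker(T)

# A vector lies in both subspaces iff K*x = U1*y for some x, y, i.e. the
# stacked vector (x, -y) is in the nullspace of [K | U1].
inter = [K * n[:K.cols, 0] for n in Matrix.hstack(K, U1).nullspace()]

# Complete the basis {u} of U_{r-1} to a basis of the intersection by
# greedily adding intersection vectors that increase the rank (the z's).
basis = [(T**2).columnspace()[0]]
for w in inter:
    if Matrix.hstack(*(basis + [w])).rank() > len(basis):
        basis.append(w)

print(len(basis))   # t + s = 2 here: one u and one z
```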



The question now is how to show that $\{u_1,\ldots,u_t,v_1,\ldots,v_t,z_1,\ldots,z_s\}$ is linearly independent. The answer is: the same way. Take a linear combination equal to $0$:
$$\alpha_1u_1+\cdots +\alpha_tu_t + \beta_1v_1+\cdots +\beta_tv_t + \gamma_1z_1+\cdots+\gamma_s z_s = \mathbf{0}.$$
Apply $T$ to conclude that the $\beta_i$ are zero; then use the fact that $\{u_1,\ldots,u_t,z_1,\ldots,z_s\}$ is a basis for $\mathrm{ker}(T)\cap U_{r-2}$ to conclude that the $\alpha_i$ and the $\gamma_j$ are all zero as well.
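
As a quick sanity check on the same toy matrix (where $t=s=1$, and the vectors $u$, $v$, $z$ below were read off by hand for this particular example):

```python
from sympy import Matrix

T = Matrix([
    [0, 1, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 1],
    [0, 0, 0, 0, 0],
])

u = Matrix([1, 0, 0, 0, 0])   # basis of U_{r-1}
v = Matrix([1, 1, 0, 0, 0])   # preimage of u:  T*v = u
z = Matrix([0, 0, 0, 1, 0])   # completes {u} to a basis of ker(T) inside U_{r-2}

print(T * v == u, T * z == Matrix.zeros(5, 1))   # True True
print(Matrix.hstack(u, v, z).rank() == 3)        # True: linearly independent
```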




And now you have a basis for $U_{r-2}$. Why? Because by the Rank-Nullity
Theorem applied to the restriction of $T$ to $U_{r-2}$, we know that
$$\dim(U_{r-2}) = \dim(T(U_{r-2})) + \dim(\mathrm{ker}(T)\cap U_{r-2}).$$
But $T(U_{r-2}) = U_{r-1}$, so $\dim(T(U_{r-2})) = \dim(U_{r-1}) = t$; and $\dim(\mathrm{ker}(T)\cap U_{r-2}) = t+s$, since $\{u_1,\ldots,u_t,z_1,\ldots,z_s\}$ is a basis for this subspace. Hence, $\dim(U_{r-2}) = t+t+s=2t+s$, which is exactly the number of linearly independent vectors you have.
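
The count can be confirmed on the toy example as well; the intersection dimension is computed independently here (via $\dim(A\cap B)=\dim A+\dim B-\dim(A+B)$), so the equality below is a genuine check of the Rank-Nullity argument:

```python
from sympy import Matrix

T = Matrix([
    [0, 1, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 1],
    [0, 0, 0, 0, 0],
])

U1 = Matrix.hstack(*T.columnspace())   # basis of U_{r-2} = T(V) as columns
K  = Matrix.hstack(*T.nullspace())     # basis of ker(T) as columns

dim_U1  = U1.cols                      # dim U_{r-2}                            (= 3)
dim_TU1 = (T * U1).rank()              # dim T(U_{r-2}) = dim U_{r-1} = t       (= 1)
# dim(ker(T) ∩ U_{r-2}) via dim(A ∩ B) = dim A + dim B - dim(A + B):
dim_cap = K.cols + U1.cols - Matrix.hstack(K, U1).rank()   # = t + s            (= 2)

print(dim_U1 == dim_TU1 + dim_cap)     # True: 3 = 1 + 2, i.e. 2t + s with t = s = 1
```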



You want to use the same idea "one step up": you will have that $u_1,\ldots,u_t,z_1,\ldots,z_s$ is a linearly independent subset of $U_{r-3}\cap\mathrm{ker}(T)$, so you will complete it to a basis of that intersection; after adding preimages of $v_1,\ldots,v_t$ and $z_1,\ldots,z_s$ (and keeping the vectors you already have), you will get a "nice" basis for $U_{r-3}$. And so on.
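
Carrying the recipe all the way up for the same toy matrix produces the promised basis; here is a sketch verifying it (the specific vectors below were obtained by following the construction by hand for this particular example, so this is a check on one instance rather than a general implementation):

```python
from sympy import Matrix

T = Matrix([
    [0, 1, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 1],
    [0, 0, 0, 0, 0],
])

# Vectors produced by the construction for this particular matrix:
u = Matrix([1, 0, 0, 0, 0])   # basis of U_2 = T^2(V)
v = Matrix([1, 1, 0, 0, 0])   # preimage of u in U_1:  T*v = u
w = Matrix([0, 0, 1, 0, 0])   # preimage of v in V:    T*w = v
z = Matrix([0, 0, 0, 1, 0])   # completes {u} to a basis of ker(T)
y = Matrix([0, 0, 0, 0, 1])   # preimage of z in V:    T*y = z

# Order the basis chain by chain: (u, v, w), then (z, y).
P = Matrix.hstack(u, v, w, z, y)
print(P.rank() == 5)     # True: the five vectors form a basis of V

# In this basis, T is block diagonal with ones just above the diagonal:
print(P.inv() * T * P)   # Jordan blocks of sizes 3 and 2, eigenvalue 0
```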

