Sunday, 11 September 2016

matrices - Inverting the sum of a Dense and Diagonal matrix



So, lets assume that we have two matrices given to us:



$A \in \mathbb{\Re}^{M \times K}$: just some dense, real, arbitrary valued matrix.



$L \in \Re^{K \times K}$: A positive diagonal matrix



Now, lets say we have the following situation:




$\left( A^T A + L \right)^{-1}$



Is it possible to rephrase this statement in such a way as to make use of the knowledge of
$\left( A^T A \right)^{-1}$ so as to avoid this addition inside the inverse?



I'm sorry if this is a little vague :P Just running into an analytical hiccup and trying to figure out how to break this statement apart.



Edit: Would it be possible to analyze this expression using an eigen decomposition? That is, if we say



$A^T A = Q \Lambda Q^{-1}$




Then, if $L$ is diagonal:



$\left( A^T A + L \right) = Q \left( \Lambda + L \right) Q^{-1}$



Which would make



$\left( A^T A + L \right)^{-1} = Q \left( \Lambda + L \right)^{-1} Q^{-1}$



Which might be usable somehow because $\left( \Lambda + L \right)$ is a diagonal matrix and allowing you to calculate the inverse directly by taking $1$ over the diagonal entries, right? Or is this completely off?




Edit 2: @Rcollyer So, I changed around my formulation entirely by only looking at rank 1 updates of $A^T A$ and was able to use the Sherman-Morrison (a form of the Woodbury) formula to break it apart and get at what I wanted :P


Answer



The only method I am aware of is the Sherman–Morrison formula for performing a rank-1 update of an inverse, or its generalization, the Woodbury matrix identity for performing rank-k updates. In both cases, their advantage lies in the fact that $L$ does not have full rank. If $L$ has full rank, these methods won't provide you with enough speed up to make them worth it over just performing an LU decomposition on $A^{\mathrm{T}}A + L$ directly.



I ran into a similar problem where I potentially had several hundred to several thousand matrices to invert where the portion of the matrix sum that was changing had full rank, and LU decomposition was the best I was able to come up with.



Edit: Hunting through my stack of papers, I came across a good review of these techniques.



Edit (2nd question): Your second approach doesn't work, in general. The error is in the second equation, it should be




$$
\begin{align}
A^\mathrm{T}A + L &= Q \Lambda Q^{-1} + L \\
&= Q \Lambda Q^{-1} + Q Q^{-1} L Q Q^{-1} \\
&= Q( \Lambda + Q^{-1} L Q )Q^{-1}
\end{align}
$$



In general, $Q^{-1} L Q$ is not diagonal, unless $L$ commutes with $Q$ which usually only occurs when it is proportional to $I$.




Edit (idyl speculation): Upon reviewing the second question, I though of a way that may allow you to use that method. As stated previously, the only way for the second method to work is for $L$ to commute with $Q$. Since $L$ is diagonal, we can rewrite $L$ as $\alpha I + N$, where $\alpha$ is the most common diagonal element of $L$. Then, the rank of $N$ will be less than the rank of $A^\mathrm{T}A$. Thus, we can calculate $(A^\mathrm{T}A + L)^{-1}$ in two steps: calculate



$$B^{-1} = Q(\Lambda + \alpha I)^{-1}Q^{-1}$$



and then perform a $\text{rank}(N)$ update of $B^{-1}$ using the Woodbury matrix identity. If all the diagonal elements of $L$ are equally common, $\text{rank}(N) = \text{rank}(A^\mathrm{T}A) - 1$. So, this method is likely only worth it if you have many different $L$ matrices, and the cost of diagonalizing $A^\mathrm{T}A$ can be amortized. Obviously, the larger $A^\mathrm{T}A$ is, the smaller the difference between $\text{rank}(N)$ and $\text{rank}(A^\mathrm{T}A)$ for you to see a net benefit, but it will require some thought as to whether it is worth it, overall.


No comments:

Post a Comment

real analysis - How to find $lim_{hrightarrow 0}frac{sin(ha)}{h}$

How to find $\lim_{h\rightarrow 0}\frac{\sin(ha)}{h}$ without lhopital rule? I know when I use lhopital I easy get $$ \lim_{h\rightarrow 0}...