So, lets assume that we have two matrices given to us:
A∈ℜM×K: just some dense, real, arbitrary valued matrix.
L∈ℜK×K: A positive diagonal matrix
Now, lets say we have the following situation:
(ATA+L)−1
Is it possible to rephrase this statement in such a way as to make use of the knowledge of
(ATA)−1 so as to avoid this addition inside the inverse?
I'm sorry if this is a little vague :P Just running into an analytical hiccup and trying to figure out how to break this statement apart.
Edit: Would it be possible to analyze this expression using an eigen decomposition? That is, if we say
ATA=QΛQ−1
Then, if L is diagonal:
(ATA+L)=Q(Λ+L)Q−1
Which would make
(ATA+L)−1=Q(Λ+L)−1Q−1
Which might be usable somehow because (Λ+L) is a diagonal matrix and allowing you to calculate the inverse directly by taking 1 over the diagonal entries, right? Or is this completely off?
Edit 2: @Rcollyer So, I changed around my formulation entirely by only looking at rank 1 updates of ATA and was able to use the Sherman-Morrison (a form of the Woodbury) formula to break it apart and get at what I wanted :P
Answer
The only method I am aware of is the Sherman–Morrison formula for performing a rank-1 update of an inverse, or its generalization, the Woodbury matrix identity for performing rank-k updates. In both cases, their advantage lies in the fact that L does not have full rank. If L has full rank, these methods won't provide you with enough speed up to make them worth it over just performing an LU decomposition on ATA+L directly.
I ran into a similar problem where I potentially had several hundred to several thousand matrices to invert where the portion of the matrix sum that was changing had full rank, and LU decomposition was the best I was able to come up with.
Edit: Hunting through my stack of papers, I came across a good review of these techniques.
Edit (2nd question): Your second approach doesn't work, in general. The error is in the second equation, it should be
ATA+L=QΛQ−1+L=QΛQ−1+QQ−1LQQ−1=Q(Λ+Q−1LQ)Q−1
In general, Q−1LQ is not diagonal, unless L commutes with Q which usually only occurs when it is proportional to I.
Edit (idyl speculation): Upon reviewing the second question, I though of a way that may allow you to use that method. As stated previously, the only way for the second method to work is for L to commute with Q. Since L is diagonal, we can rewrite L as αI+N, where α is the most common diagonal element of L. Then, the rank of N will be less than the rank of ATA. Thus, we can calculate (ATA+L)−1 in two steps: calculate
B−1=Q(Λ+αI)−1Q−1
and then perform a rank(N) update of B−1 using the Woodbury matrix identity. If all the diagonal elements of L are equally common, rank(N)=rank(ATA)−1. So, this method is likely only worth it if you have many different L matrices, and the cost of diagonalizing ATA can be amortized. Obviously, the larger ATA is, the smaller the difference between rank(N) and rank(ATA) for you to see a net benefit, but it will require some thought as to whether it is worth it, overall.
No comments:
Post a Comment