38.3: Pseudoinverse
( \newcommand{\kernel}{\mathrm{null}\,}\)
In this class we often talk about solving problems of the form:
Ax=b
Currently we have determined that this problem becomes very nice when the n×n matrix A has an inverse. We can easily multiply each side by the inverse:
A−1Ax=A−1b
Since A−1A=I the solution for x is simply:
x=A−1b
Now, let us consider a a more general problem where the m×n matrix A is not square, i.e. m≠n and its rank r maybe less than m and/or n. In this case we want to find a Pseudoinverse (which we denote as A+) which acts like an inverse for a non-square matrix. In other words we want to find an A+ for A such that:
A+A≈I
Assuming we can find the n×m matrix A+, we should then be able to solve for x as follows:
Ax=b
A+Ax=A+b
x≈A+b
How do we know there is a Pseudoinverse
Assuming the general case of a m×n matrix A where its rank r maybe less than m and/or n (r≤m and r≤n). We can conclude the following about the fundamental spaces of A:
- The rowspace of A is in Rn with dimension r
- The columnspace of A is in Rm also with dimension r.
- The nullspace of A is in Rn with dimension n−r
- The nullspace of A⊤ is in Rm with dimension m−r.
Because the rowspace of A and the column space A have the same dimension then A is a the one-to-one mapping from the row space to the columnspace. In other words:
- For any x in the rowspace, we have that Ax is one point in the columnspace. If x′ is another point in the row space different from x, we have Ax≠Ax′ (The mapping is one-to-one).
- For any y in the columnspace, we can find x in the rowspace such that Ax=y (The mapping is onto).
The above is not really a proof but hopefully there is sufficient information to convince yourself that this is true.
How to compute pseudoinverse
We want to find the n×m matrix that maps from columnspace to the rowspace of A, and x=A+Ax, if x is in the rowspace.
- Let's apply SVD on A: A=UΣV⊤, where U is a m×m matrix, V⊤ is a n×n matrix, and Σ is a diagonal m×n matrix. We can decompose the matrices as A=[⋮⋮U1U2⋮⋮][Σ1000][⋯V⊤1⋯⋯V⊤2⋯]. Here U1 is of m×r, U2 is of m×(m−r), Σ1 is of r×r, V⊤1 is of r×n, and V⊤2 is of (n−r)×n.
- The columnspace of U1 is the columnspace of A, and columnspace of U2 is the nullspace of A⊤.
- The rowspace of V1 is the rowspace of A, and rowspace of V2 is the nullspace of A.
- If x is in the rowspace of A, we have that V⊤2x=0. We have Ax=U1Σ1V⊤1x.
- If we define a matrix B=V1Σ−11U⊤1, we have that BAx=V1Σ−11U⊤1U1Σ1V⊤1x=V1V⊤1x. That is BAx=x is x is in the rowspace of A.
- The matrix B is the pseudoinverse of matrix A. A+=V1Σ−11U⊤1 A+=[⋮⋮V1V2⋮⋮][Σ−11000][⋯U⊤1⋯⋯U⊤2⋯].
Let A=[1,2], we know that r=m=1 and n=2.
Calculate the pseudoinverse A+ of A using the numpy.linalg
function pinv
:
Compute AA+ and A+A
If x is in the nullspace of A what is the effect of A+Ax?
If x is in the rowspace of A what is the effect of A+Ax?
Left inverse is pseudoinverse
We can compute the left inverse of A if r=n≤m. In this case, we may have more rows than columns, and the matrix A has full column rank.
In this case, the SVD of A is A=UΣV⊤. Here U is of m×n, Σ is of n×n and nonsingular, V⊤ is of n×n. The pseudoinverse of A is A+=VΣ−1U⊤.
The left inverse of A is (A⊤A)−1A⊤=(VΣU⊤UΣV⊤)−1VΣU⊤=V(ΣΣ)−1V⊤VΣU⊤=VΣ−1U⊤=A+.
Let A=[12], we know that r=n=1 and m=2. Then we have the left inverse.
Calculate the pseudoinverse A⊤ of A.
Calculate the left inverse of A, and verify that it is the same as A⊤.