38.3: Pseudoinverse
- Page ID
- 70398
In this class we often talk about solving problems of the form:
\[Ax = b \nonumber \]
Currently we have determined that this problem becomes very nice when the \(n \times n\) matrix \(A\) has an inverse. We can easily multiply each side by the inverse:
\[A^{-1}Ax = A^{-1}b \nonumber \]
Since \(A^{-1}A = I\) the solution for \(x\) is simply:
\[x = A^{-1}b \nonumber \]
Now, let us consider a a more general problem where the \(m \times n\) matrix \(A\) is not square, i.e. \(m \neq n\) and its rank \(r\) maybe less than \(m\) and/or \(n\). In this case we want to find a Pseudoinverse (which we denote as \(A^+\)) which acts like an inverse for a non-square matrix. In other words we want to find an \(A^+\) for \(A\) such that:
\[A^+A \approx I \nonumber \]
Assuming we can find the \(n \times m\) matrix \(A^+\), we should then be able to solve for \(x\) as follows:
\[Ax = b \nonumber \]
\[A^+Ax = A^+b \nonumber \]
\[x \approx A^+b \nonumber \]
How do we know there is a Pseudoinverse
Assuming the general case of a \(m \times n\) matrix \(A\) where its rank \(r\) maybe less than \(m\) and/or \(n\) (\(r\leq m\) and \(r\leq n\)). We can conclude the following about the fundamental spaces of \(A\):
- The rowspace of \(A\) is in \(R^n\) with dimension \(r\)
- The columnspace of \(A\) is in \(R^m\) also with dimension \(r\).
- The nullspace of \(A\) is in \(R^n\) with dimension \(n−r\)
- The nullspace of \(A^{\top}\) is in \(R^m\) with dimension \(m−r\).
Because the rowspace of \(A\) and the column space \(A\) have the same dimension then \(A\) is a the one-to-one mapping from the row space to the columnspace. In other words:
- For any \(x\) in the rowspace, we have that \(Ax\) is one point in the columnspace. If \(x'\) is another point in the row space different from \(x\), we have \(Ax\neq Ax'\) (The mapping is one-to-one).
- For any \(y\) in the columnspace, we can find \(x\) in the rowspace such that \(Ax=y\) (The mapping is onto).
The above is not really a proof but hopefully there is sufficient information to convince yourself that this is true.
How to compute pseudoinverse
We want to find the \(n \times m\) matrix that maps from columnspace to the rowspace of \(A\), and \(x=A^+Ax\), if \(x\) is in the rowspace.
- Let's apply SVD on \(A\): \(A= U\Sigma V^\top\), where \(U\) is a \(m \times m\) matrix, \(V^{\top}\) is a \(n \times n\) matrix, and \(\Sigma\) is a diagonal \(m \times n\) matrix. We can decompose the matrices as \(A = \begin{bmatrix}\vdots & \vdots \\ U_1 & U_2 \\ \vdots &\vdots\end{bmatrix} \begin{bmatrix}\Sigma_1 & 0 \\ 0 & 0\end{bmatrix} \begin{bmatrix}\cdots & V_1^\top & \cdots \\ \cdots & V_2^\top &\cdots \end{bmatrix}\). Here \(U_1\) is of \(m \times r\), \(U_2\) is of \(m \times (m-r)\), \(\Sigma_1\) is of \(r \times r\), \(V_1^{\top}\) is of \(r \times n\), and \(V_2^{\top}\) is of \((n-r) \times n\).
- The columnspace of \(U_1\) is the columnspace of \(A\), and columnspace of \(U_2\) is the nullspace of \(A^{\top}\).
- The rowspace of \(V_1\) is the rowspace of \(A\), and rowspace of \(V_2\) is the nullspace of \(A\).
- If \(x\) is in the rowspace of \(A\), we have that \(V_2^{\top}x = 0\). We have \(Ax = U_1 \Sigma_1 V_1^{\top}x\).
- If we define a matrix \(B=V_1\Sigma_1^{-1}U_1^\top\), we have that \(BAx=V_1\Sigma_1^{-1}U_1^\top U_1\Sigma_1 V_1^\top x=V_1V_1^\top x\). That is \(BAx=x\) is \(x\) is in the rowspace of \(A\).
- The matrix \(B\) is the pseudoinverse of matrix \(A\). \(A^+ = V_1\Sigma_1^{-1}U_1^\top\) \(A^+ = \begin{bmatrix}\vdots & \vdots \\ V_1 & V_2 \\ \vdots &\vdots\end{bmatrix} \begin{bmatrix}\Sigma_1^{-1} & 0 \\ 0 & 0\end{bmatrix} \begin{bmatrix}\cdots & U_1^\top & \cdots \\ \cdots & U_2^\top &\cdots \end{bmatrix}\).
Let \(A=[1,2]\), we know that \(r=m=1\) and \(n=2\).
Calculate the pseudoinverse \(A^+\) of \(A\) using the numpy.linalg
function pinv
:
Compute \(AA^{+}\) and \(A^{+}A\)
If \(x\) is in the nullspace of \(A\) what is the effect of \(A^{+}Ax\)?
If \(x\) is in the rowspace of \(A\) what is the effect of \(A^{+}Ax\)?
Left inverse is pseudoinverse
We can compute the left inverse of \(A\) if \(r = n \leq m\). In this case, we may have more rows than columns, and the matrix \(A\) has full column rank.
In this case, the SVD of \(A\) is \(A = U \Sigma V^{\top}\). Here \(U\) is of \(m \times n\), \(\Sigma\) is of \(n \times n\) and nonsingular, \(V^{\top}\) is of \(n \times n\). The pseudoinverse of \(A\) is \(A^+ = V\Sigma^{-1}U^\top\).
The left inverse of \(A\) is \((A^\top A)^{-1}A^\top= (V\Sigma U^\top U\Sigma V^\top )^{-1} V\Sigma U^\top = V(\Sigma \Sigma )^{-1} V^\top V\Sigma U^\top = V\Sigma ^{-1} U^\top =A^+\).
Let \(A=\begin{bmatrix}1\\2\end{bmatrix}\), we know that \(r=n=1\) and \(m=2\). Then we have the left inverse.
Calculate the pseudoinverse \(A^{\top}\) of \(A\).
Calculate the left inverse of \(A\), and verify that it is the same as \(A^{\top}\).