7.4: Singular Value Decompositions
The Spectral Theorem has animated the past few sections. In particular, we applied the fact that symmetric matrices can be orthogonally diagonalized to simplify quadratic forms, which enabled us to use principal component analysis to reduce the dimension of a dataset.
But what can we do with matrices that are not symmetric or even square? For instance, the following matrices are not diagonalizable, much less orthogonally so:
In this section, we will develop a description of matrices called the singular value decomposition that is, in many ways, analogous to an orthogonal diagonalization. For example, we have seen that any symmetric matrix can be written in the form \(QDQ^T\) where \(Q\) is an orthogonal matrix and \(D\) is diagonal. A singular value decomposition will have the form \(U\Sigma V^T\) where \(U\) and \(V\) are orthogonal and \(\Sigma\) is diagonal. Most notably, we will see that every matrix has a singular value decomposition whether it's symmetric or not.
Preview Activity 7.4.1.
Let's review orthogonal diagonalizations and quadratic forms as our understanding of singular value decompositions will rely on them.
- Suppose that \(A\) is any matrix. Explain why the matrix \(G = A^TA\) is symmetric.
- Suppose that \(A = \begin{bmatrix} 1 & 2 \\ -2 & -1 \\ \end{bmatrix}\text{.}\) Find the matrix \(G=A^TA\) and write out the quadratic form \(q_G\left(\twovec{x_1}{x_2}\right)\) as a function of \(x_1\) and \(x_2\text{.}\)
- What is the maximum value of \(q_G(\mathbf x)\) and in which direction does it occur?
- What is the minimum value of \(q_G(\mathbf x)\) and in which direction does it occur?
- What is the geometric relationship between the directions in which the maximum and minimum values occur?
Finding singular value decompositions
We will begin by explaining what a singular value decomposition is and how we can find one for a given matrix \(A\text{.}\)
Recall how the orthogonal diagonalization of a symmetric matrix is formed: if \(A\) is symmetric, we write \(A = QDQ^T\) where the diagonal entries of \(D\) are the eigenvalues of \(A\) and the columns of \(Q\) are the associated eigenvectors. Moreover, the eigenvalues are related to the maximum and minimum values of the associated quadratic form \(q_A(\mathbf u)\) among all unit vectors.
A general matrix, particularly a matrix that is not square, may not have eigenvalues and eigenvectors, but we can discover analogous features, called singular values and singular vectors, by studying a function somewhat similar to a quadratic form. More specifically, any matrix \(A\) defines a function
\begin{equation*} l_A(\mathbf x) = |A\mathbf x|\text{,} \end{equation*}
which measures the length of \(A\mathbf x\text{.}\) For example, the diagonal matrix \(D=\begin{bmatrix} 3 & 0 \\ 0 & -2 \\ \end{bmatrix}\) gives the function \(l_D(\mathbf x) = \sqrt{9x_1^2 + 4x_2^2}\text{.}\) The presence of the square root means that this function is not a quadratic form. We can, however, define the singular values and vectors by looking for the maximum and minimum of this function \(l_A(\mathbf u)\) among all unit vectors \(\mathbf u\text{.}\)
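While \(l_A(\mathbf x)\) is not itself a quadratic form, it becomes one if we square it:
\begin{equation*} \left(l_A(\mathbf x)\right)^2 = |A\mathbf x|^2 = (A\mathbf x)\cdot(A\mathbf x) = \mathbf x\cdot(A^TA\mathbf x) = q_{A^TA}(\mathbf x)\text{.} \end{equation*}
We call \(G=A^TA\) the Gram matrix associated to \(A\) and note that
\begin{equation*} l_A(\mathbf x) = \sqrt{q_G(\mathbf x)}\text{.} \end{equation*}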
This is important in the next activity, which introduces singular values and singular vectors.
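The relationship between \(l_A\) and \(q_G\) can be checked numerically. The following is a minimal sketch, assuming NumPy is available and using the matrix from the preview activity; the helper names `l_A` and `q_G` and the choice of 200 sample directions are ours.

```python
import numpy as np

# The matrix from the preview activity; any real matrix could be used instead.
A = np.array([[1.0, 2.0],
              [-2.0, -1.0]])
G = A.T @ A  # Gram matrix


def l_A(x):
    """Length of the vector A x."""
    return np.linalg.norm(A @ x)


def q_G(x):
    """Quadratic form x . (G x) defined by the Gram matrix."""
    return x @ (G @ x)


# Sample 200 unit vectors evenly spaced around the unit circle.
angles = np.linspace(0.0, 2.0 * np.pi, 200, endpoint=False)
unit_vectors = np.column_stack([np.cos(angles), np.sin(angles)])

lengths = np.array([l_A(u) for u in unit_vectors])
forms = np.array([q_G(u) for u in unit_vectors])

# The squared length agrees with the quadratic form at every sample.
identity_holds = np.allclose(lengths**2, forms)
print(identity_holds)
print(lengths.max(), lengths.min())
```

The largest and smallest sampled lengths estimate the maximum and minimum of \(l_A(\mathbf u)\) among unit vectors, which is the quantity the next activity explores.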
Activity 7.4.2.
The following interactive figure will help us explore singular values and vectors geometrically before we begin a more algebraic approach. This figure is also available at gvsu.edu/s/0YE.
Select the matrix \(A=\begin{bmatrix} 1 & 2 \\ -2 & - 1 \\ \end{bmatrix}\text{.}\) As we vary the vector \(\mathbf x\text{,}\) we see the vector \(A\mathbf x\) in gray while the height of the blue bar to the right tells us \(l_A(\mathbf x) = |A\mathbf x|\text{.}\)
-
The first singular value \(\sigma_1\) is the maximum value of \(l_A(\mathbf x)\) and an associated right singular vector \(\mathbf v_1\) is a unit vector describing a direction in which this maximum occurs.
Use the diagram to find the first singular value \(\sigma_1\) and an associated right singular vector \(\mathbf v_1\text{.}\)
-
The second singular value \(\sigma_2\) is the minimum value of \(l_A(\mathbf x)\) and an associated right singular vector \(\mathbf v_2\) is a unit vector describing a direction in which this minimum occurs.
Use the diagram to find the second singular value \(\sigma_2\) and an associated right singular vector \(\mathbf v_2\text{.}\)
-
Here's how we can find the singular values and right singular vectors without using the diagram. Remember that \(l_A(\mathbf x) = \sqrt{q_G(\mathbf x)}\) where \(G=A^TA\) is the Gram matrix associated to \(A\text{.}\) Since \(G\) is symmetric, it is orthogonally diagonalizable. Find \(G\) and an orthogonal diagonalization of it.
What is the maximum value of the quadratic form \(q_G(\mathbf x)\) among all unit vectors and in which direction does it occur? What is the minimum value of \(q_G(\mathbf x)\) and in which direction does it occur?
- Because \(l_A(\mathbf x) = \sqrt{q_G(\mathbf x)}\text{,}\) the first singular value \(\sigma_1\) will be the square root of the maximum value of \(q_G(\mathbf x)\) and \(\sigma_2\) the square root of the minimum. Verify that the singular values that you found from the diagram are the square roots of the maximum and minimum values of \(q_G(\mathbf x)\text{.}\)
- Verify that the right singular vectors \(\mathbf v_1\) and \(\mathbf v_2\) that you found from the diagram are the directions in which the maximum and minimum values occur.
-
Finally, we introduce the left singular vectors \(\mathbf u_1\) and \(\mathbf u_2\) by requiring that \(A\mathbf v_1 = \sigma_1\mathbf u_1\) and \(A\mathbf v_2=\sigma_2\mathbf u_2\text{.}\) Find the two left singular vectors.
-
Form the matrices
\begin{equation*} U = \begin{bmatrix}\mathbf u_1 & \mathbf u_2 \end{bmatrix}, \hspace{24pt} \Sigma = \begin{bmatrix} \sigma_1 & 0 \\ 0 & \sigma_2 \\ \end{bmatrix}, \hspace{24pt} V = \begin{bmatrix}\mathbf v_1 & \mathbf v_2 \end{bmatrix} \end{equation*}
and explain why \(AV = U\Sigma\text{.}\)
- Finally, explain why \(A=U\Sigma V^T\) and verify that this relationship holds for this specific example.
As this activity shows, the singular values of \(A\) are the maximum and minimum values of \(l_A(\mathbf x)=|A\mathbf x|\) among all unit vectors and the right singular vectors \(\mathbf v_1\) and \(\mathbf v_2\) are the directions in which they occur. The key to finding the singular values and vectors is to utilize the Gram matrix \(G\) and its associated quadratic form \(q_G(\mathbf x)\text{.}\) We will illustrate with some more examples.
We will find a singular value decomposition of the matrix \(A=\begin{bmatrix} 1 & 2 \\ -1 & 2 \end{bmatrix} \text{.}\) Notice that this matrix is not symmetric so it cannot be orthogonally diagonalized.
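We begin by constructing the Gram matrix \(G = A^TA = \begin{bmatrix} 2 & 0 \\ 0 & 8 \\ \end{bmatrix}\text{.}\) Since \(G\) is symmetric, it can be orthogonally diagonalized with
\begin{equation*} D = \begin{bmatrix} 8 & 0 \\ 0 & 2 \\ \end{bmatrix}, \hspace{24pt} Q = \begin{bmatrix} 0 & 1 \\ 1 & 0 \\ \end{bmatrix}\text{.} \end{equation*}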
We now know that the maximum value of the quadratic form \(q_G(\mathbf x)\) is 8, which occurs in the direction \(\twovec01\text{.}\) Since \(l_A(\mathbf x) = \sqrt{q_G(\mathbf x)}\text{,}\) this tells us that the maximum value of \(l_A(\mathbf x)\text{,}\) the first singular value, is \(\sigma_1=\sqrt{8}\) and that this occurs in the direction of the first right singular vector \(\mathbf v_1=\twovec01\text{.}\)
In the same way, we also know that the second singular value \(\sigma_2=\sqrt{2}\) with associated right singular vector \(\mathbf v_2=\twovec10\text{.}\)
The first left singular vector \(\mathbf u_1\) is defined by \(A\mathbf v_1 = \twovec22 = \sigma_1\mathbf u_1\text{.}\) Because \(\sigma_1 = \sqrt{8}\text{,}\) we have \(\mathbf u_1 = \twovec{1/\sqrt{2}}{1/\sqrt{2}}\text{.}\) Notice that \(\mathbf u_1\) is a unit vector because \(\sigma_1 = |A\mathbf v_1|\text{.}\)
In the same way, the second left singular vector is defined by \(A\mathbf v_2 = \twovec1{-1} = \sigma_2\mathbf u_2\text{,}\) which gives us \(\mathbf u_2 = \twovec{1/\sqrt{2}}{-1/\sqrt{2}}\text{.}\)
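We then construct
\begin{equation*} U = \begin{bmatrix}\mathbf u_1 & \mathbf u_2 \end{bmatrix} = \begin{bmatrix} 1/\sqrt{2} & 1/\sqrt{2} \\ 1/\sqrt{2} & -1/\sqrt{2} \\ \end{bmatrix}, \hspace{24pt} \Sigma = \begin{bmatrix} \sqrt{8} & 0 \\ 0 & \sqrt{2} \\ \end{bmatrix}, \hspace{24pt} V = \begin{bmatrix}\mathbf v_1 & \mathbf v_2 \end{bmatrix} = \begin{bmatrix} 0 & 1 \\ 1 & 0 \\ \end{bmatrix}\text{.} \end{equation*}
We now have \(AV=U\Sigma\) because
\begin{equation*} AV = \begin{bmatrix} A\mathbf v_1 & A\mathbf v_2 \end{bmatrix} = \begin{bmatrix} \sigma_1\mathbf u_1 & \sigma_2\mathbf u_2 \end{bmatrix} = \begin{bmatrix}\mathbf u_1 & \mathbf u_2 \end{bmatrix} \begin{bmatrix} \sigma_1 & 0 \\ 0 & \sigma_2 \\ \end{bmatrix} = U\Sigma\text{.} \end{equation*}
Because the right singular vectors, the columns of \(V\text{,}\) are eigenvectors of the symmetric matrix \(G\text{,}\) they form an orthonormal basis, which means that \(V\) is orthogonal. Therefore, we have \((AV)V^T = A = U\Sigma V^T\text{.}\) This gives the singular value decomposition
\begin{equation*} A = \begin{bmatrix} 1 & 2 \\ -1 & 2 \\ \end{bmatrix} = \begin{bmatrix} 1/\sqrt{2} & 1/\sqrt{2} \\ 1/\sqrt{2} & -1/\sqrt{2} \\ \end{bmatrix} \begin{bmatrix} \sqrt{8} & 0 \\ 0 & \sqrt{2} \\ \end{bmatrix} \begin{bmatrix} 0 & 1 \\ 1 & 0 \\ \end{bmatrix}^T = U\Sigma V^T\text{.} \end{equation*}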
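As a quick numerical check of this example, the sketch below assembles \(U\text{,}\) \(\Sigma\text{,}\) and \(V\) from the singular values and vectors found above and confirms that their product recovers \(A\text{.}\) It assumes NumPy; the variable names are ours, and `numpy.linalg.svd` is used only as an independent check on the singular values.

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [-1.0, 2.0]])

# Factors assembled from the singular values and vectors found above.
s = 1.0 / np.sqrt(2.0)
U = np.array([[s, s],
              [s, -s]])                      # columns u_1, u_2
Sigma = np.diag([np.sqrt(8.0), np.sqrt(2.0)])
V = np.array([[0.0, 1.0],
              [1.0, 0.0]])                   # columns v_1, v_2

product_matches = np.allclose(U @ Sigma @ V.T, A)
U_is_orthogonal = np.allclose(U.T @ U, np.eye(2))
V_is_orthogonal = np.allclose(V.T @ V, np.eye(2))

# Independent check: NumPy returns singular values in decreasing order.
library_values = np.linalg.svd(A, compute_uv=False)
values_match = np.allclose(library_values, [np.sqrt(8.0), np.sqrt(2.0)])

print(product_matches, U_is_orthogonal, V_is_orthogonal, values_match)
```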
To summarize, we find a singular value decomposition of a matrix \(A\) in the following way:
- Construct the Gram matrix \(G=A^TA\) and find an orthogonal diagonalization to obtain eigenvalues \(\lambda_i\) and an orthonormal basis of eigenvectors.
- The singular values of \(A\) are the square roots of the eigenvalues \(\lambda_i\) of \(G\text{;}\) that is, \(\sigma_i = \sqrt{\lambda_i}\text{.}\) For reasons we'll see in the next section, the singular values are listed in decreasing order: \(\sigma_1 \geq \sigma_2 \geq \ldots \text{.}\) The right singular vectors \(\mathbf v_i\) are the associated eigenvectors of \(G\text{.}\)
-
The left singular vectors \(\mathbf u_i\) are found by \(A\mathbf v_i = \sigma_i\mathbf u_i\text{.}\) Because \(\sigma_i=|A\mathbf v_i|\text{,}\) we know that \(\mathbf u_i\) will be a unit vector.
In fact, the left singular vectors will also form an orthonormal basis. To see this, suppose that the associated singular values are nonzero. If \(i\neq j\text{,}\) we then have
\begin{align*} \sigma_i\sigma_j(\mathbf u_i\cdot\mathbf u_j) & {}={} (\sigma_i\mathbf u_i)\cdot(\sigma_j\mathbf u_j) = (A\mathbf v_i)\cdot(A\mathbf v_j)\\ & {}={} \mathbf v_i\cdot(A^TA\mathbf v_j) \\ & {}={} \mathbf v_i\cdot(G\mathbf v_j) = \lambda_j\mathbf v_i\cdot\mathbf v_j = 0 \end{align*}
since the right singular vectors are orthogonal. Because \(\sigma_i\sigma_j\neq 0\text{,}\) it follows that \(\mathbf u_i\cdot\mathbf u_j = 0\text{.}\)
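The three steps above can be sketched in code. This is an illustration under stated assumptions rather than a recommended implementation: it assumes NumPy, the function name `svd_via_gram` and the relative tolerance `tol` are ours, and only the left singular vectors associated with nonzero singular values are produced. Forming \(A^TA\) in floating point squares the condition number, so small singular values lose accuracy; library routines avoid this.

```python
import numpy as np


def svd_via_gram(A, tol=1e-6):
    """Singular values, left singular vectors, and right singular vectors of A.

    Follows the Gram matrix procedure: diagonalize G = A^T A, take square
    roots of its eigenvalues, and set u_i = A v_i / sigma_i when sigma_i != 0.
    Returns (sigma, U_r, V), where U_r holds only the u_i with sigma_i != 0.
    """
    A = np.asarray(A, dtype=float)
    G = A.T @ A
    # eigh handles symmetric matrices and returns eigenvalues in increasing order.
    eigenvalues, eigenvectors = np.linalg.eigh(G)
    order = np.argsort(eigenvalues)[::-1]             # reorder to decreasing
    eigenvalues = np.clip(eigenvalues[order], 0.0, None)  # remove tiny negatives
    V = eigenvectors[:, order]
    sigma = np.sqrt(eigenvalues)
    # Treat singular values below tol * sigma_1 as zero (an assumed cutoff).
    r = int(np.sum(sigma > tol * sigma.max())) if sigma.max() > 0 else 0
    U_r = (A @ V[:, :r]) / sigma[:r]                  # divide column i by sigma_i
    return sigma, U_r, V


A = np.array([[1.0, 2.0],
              [-1.0, 2.0]])
sigma, U_r, V = svd_via_gram(A)
r = U_r.shape[1]
reconstruction = U_r @ np.diag(sigma[:r]) @ V[:, :r].T
print(np.allclose(reconstruction, A))
```

The eigenvectors returned by the library may differ in sign from ones found by hand; the product \(U\Sigma V^T\) is unaffected because each \(\mathbf u_i\) is computed from the corresponding \(\mathbf v_i\text{.}\)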
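Let's find a singular value decomposition for the symmetric matrix \(A=\begin{bmatrix} 1 & 2 \\ 2 & 1 \end{bmatrix}\text{.}\) The associated Gram matrix is
\begin{equation*} G = A^TA = \begin{bmatrix} 5 & 4 \\ 4 & 5 \\ \end{bmatrix}\text{,} \end{equation*}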
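which has an orthogonal diagonalization with
\begin{equation*} D = \begin{bmatrix} 9 & 0 \\ 0 & 1 \\ \end{bmatrix}, \hspace{24pt} Q = \begin{bmatrix} 1/\sqrt{2} & 1/\sqrt{2} \\ 1/\sqrt{2} & -1/\sqrt{2} \\ \end{bmatrix}\text{.} \end{equation*}
This gives singular values and vectors
\begin{align*} \sigma_1 & {}={} 3, & \mathbf v_1 & {}={} \twovec{1/\sqrt{2}}{1/\sqrt{2}}, & \mathbf u_1 & {}={} \twovec{1/\sqrt{2}}{1/\sqrt{2}} \\ \sigma_2 & {}={} 1, & \mathbf v_2 & {}={} \twovec{1/\sqrt{2}}{-1/\sqrt{2}}, & \mathbf u_2 & {}={} \twovec{-1/\sqrt{2}}{1/\sqrt{2}} \end{align*}
and the singular value decomposition \(A=U\Sigma V^T\) where
\begin{equation*} U = \begin{bmatrix} 1/\sqrt{2} & -1/\sqrt{2} \\ 1/\sqrt{2} & 1/\sqrt{2} \\ \end{bmatrix}, \hspace{24pt} \Sigma = \begin{bmatrix} 3 & 0 \\ 0 & 1 \\ \end{bmatrix}, \hspace{24pt} V = \begin{bmatrix} 1/\sqrt{2} & 1/\sqrt{2} \\ 1/\sqrt{2} & -1/\sqrt{2} \\ \end{bmatrix}\text{.} \end{equation*}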
This example is special because \(A\) is symmetric. With a little thought, it's possible to relate this singular value decomposition to an orthogonal diagonalization of \(A\) using the fact that \(G=A^TA = A^2\text{.}\)
Activity 7.4.3.
In this activity, we will construct the singular value decomposition of \(A=\begin{bmatrix} 1 & 0 & -1 \\ 1 & 1 & 1 \end{bmatrix}\text{.}\) Notice that this matrix is not square so there are no eigenvalues and eigenvectors associated to it.
- Construct the Gram matrix \(G=A^TA\) and find an orthogonal diagonalization of it.
- Identify the singular values of \(A\) and the right singular vectors \(\mathbf v_1\text{,}\) \(\mathbf v_2\text{,}\) and \(\mathbf v_3\text{.}\) What is the dimension of these vectors? How many nonzero singular values are there?
- Find the left singular vectors \(\mathbf u_1\) and \(\mathbf u_2\) using the fact that \(A\mathbf v_i = \sigma_i\mathbf u_i\text{.}\) What is the dimension of these vectors? What happens if you try to find a third left singular vector \(\mathbf u_3\) in this way?
- As before, form the orthogonal matrices \(U\) and \(V\) from the left and right singular vectors. What are the dimensions of \(U\) and \(V\text{?}\) How do these dimensions relate to the number of rows and columns of \(A\text{?}\)
-
Now form \(\Sigma\) so that it has the same shape as \(A\text{:}\)
\begin{equation*} \Sigma = \begin{bmatrix} \sigma_1 & 0 & 0 \\ 0 & \sigma_2 & 0 \end{bmatrix} \end{equation*}
and verify that \(A = U\Sigma V^T\text{.}\)
- How can you use this singular value decomposition of \(A=U\Sigma V^T\) to easily find a singular value decomposition of \(A^T=\begin{bmatrix} 1 & 1 \\ 0 & 1 \\ -1 & 1 \\ \end{bmatrix}\text{?}\)
We will find a singular value decomposition of the matrix \(A=\begin{bmatrix} 2 & -2 & 1 \\ -4 & -8 & -8 \\ \end{bmatrix}\text{.}\)
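Finding an orthogonal diagonalization of \(G=A^TA\) gives
\begin{equation*} D = \begin{bmatrix} 144 & 0 & 0 \\ 0 & 9 & 0 \\ 0 & 0 & 0 \\ \end{bmatrix}, \hspace{24pt} Q = \begin{bmatrix} 1/3 & 2/3 & 2/3 \\ 2/3 & -2/3 & 1/3 \\ 2/3 & 1/3 & -2/3 \\ \end{bmatrix}\text{,} \end{equation*}
which gives singular values \(\sigma_1=\sqrt{144}=12\text{,}\) \(\sigma_2 = \sqrt{9}= 3\text{,}\) and \(\sigma_3 = 0\text{.}\) The right singular vectors \(\mathbf v_i\) appear as the columns of \(Q\) so that \(V = Q\text{.}\)
We now find
\begin{align*} A\mathbf v_1 & {}={} \twovec{0}{-12} = 12\mathbf u_1, & \mathbf u_1 & {}={} \twovec{0}{-1} \\ A\mathbf v_2 & {}={} \twovec{3}{0} = 3\mathbf u_2, & \mathbf u_2 & {}={} \twovec{1}{0}\text{.} \end{align*}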
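Notice that it's not possible to find a third left singular vector since \(A\mathbf v_3=\zerovec\text{.}\) We therefore form the matrices
\begin{equation*} U = \begin{bmatrix} 0 & 1 \\ -1 & 0 \\ \end{bmatrix}, \hspace{24pt} \Sigma = \begin{bmatrix} 12 & 0 & 0 \\ 0 & 3 & 0 \\ \end{bmatrix}, \hspace{24pt} V = \begin{bmatrix} 1/3 & 2/3 & 2/3 \\ 2/3 & -2/3 & 1/3 \\ 2/3 & 1/3 & -2/3 \\ \end{bmatrix}\text{,} \end{equation*}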
which gives the singular value decomposition \(A=U\Sigma V^T\text{.}\)
Notice that \(U\) is a \(2\times2\) orthogonal matrix because \(A\) has two rows, and \(V\) is a \(3\times3\) orthogonal matrix because \(A\) has three columns.
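The shapes of the three factors can be confirmed with software. The sketch below assumes NumPy and uses `numpy.linalg.svd`, which returns \(U\text{,}\) the singular values as a one-dimensional array, and \(V^T\) rather than \(V\text{.}\) The variable names are ours.

```python
import numpy as np

A = np.array([[2.0, -2.0, 1.0],
              [-4.0, -8.0, -8.0]])

# full_matrices=True (the default) returns a square U and a square V^T.
U, singular_values, Vt = np.linalg.svd(A, full_matrices=True)

# Only the diagonal entries are returned, so rebuild Sigma with the shape of A.
Sigma = np.zeros(A.shape)
k = len(singular_values)
Sigma[:k, :k] = np.diag(singular_values)

print(U.shape, Sigma.shape, Vt.shape)
print(singular_values)
print(np.allclose(U @ Sigma @ Vt, A))
```

The singular vectors returned by the library may differ from hand-computed ones by a sign, since replacing both \(\mathbf u_i\) and \(\mathbf v_i\) by their negatives leaves \(U\Sigma V^T\) unchanged.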
As we'll see in the next section, some additional work may be needed to construct the left singular vectors \(\mathbf u_j\) if more of the singular values are zero, but we won't worry about that now. For the time being, let's record our work in the following theorem.
Theorem 7.4.5. The singular value decomposition.
An \(m\times n\) matrix \(A\) may be written as \(A=U\Sigma V^T\) where \(U\) is an orthogonal \(m\times m\) matrix, \(V\) is an orthogonal \(n\times n\) matrix, and \(\Sigma\) is an \(m\times n\) matrix whose entries are zero except for the singular values of \(A\) which appear in decreasing order on the diagonal.
Notice that a singular value decomposition of \(A\) gives us a singular value decomposition of \(A^T\text{.}\) More specifically, if \(A=U\Sigma V^T\text{,}\) then
\begin{equation*} A^T = (U\Sigma V^T)^T = V\Sigma^T U^T\text{.} \end{equation*}
Proposition 7.4.6.
If \(A=U\Sigma V^T\text{,}\) then \(A^T = V\Sigma^T U^T\text{.}\) In other words, \(A\) and \(A^T\) share the same singular values, and the left singular vectors of \(A\) are the right singular vectors of \(A^T\) and vice versa.
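The relationship between the decompositions of \(A\) and \(A^T\) can also be checked numerically. This is a small sketch assuming NumPy, reusing the \(2\times3\) matrix from the example above; the variable names are ours.

```python
import numpy as np

A = np.array([[2.0, -2.0, 1.0],
              [-4.0, -8.0, -8.0]])
U, singular_values, Vt = np.linalg.svd(A)

# Rebuild Sigma with the same shape as A.
Sigma = np.zeros(A.shape)
k = len(singular_values)
Sigma[:k, :k] = np.diag(singular_values)

# Transposing A = U Sigma V^T gives A^T = V Sigma^T U^T.
transpose_from_factors = Vt.T @ Sigma.T @ U.T
transpose_matches = np.allclose(transpose_from_factors, A.T)

# The singular values of A^T, computed directly, agree with those of A.
values_of_transpose = np.linalg.svd(A.T, compute_uv=False)
same_values = np.allclose(values_of_transpose, singular_values)

print(transpose_matches, same_values)
```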
As we said earlier, the singular value decomposition should be thought of as a generalization of an orthogonal diagonalization. For instance, the Spectral Theorem tells us that a symmetric matrix can be written as \(QDQ^T\text{.}\) Many matrices, however, are not symmetric and so they are not orthogonally diagonalizable. However, every matrix has a singular value decomposition \(U\Sigma V^T\text{.}\) The price of this generalization is that we usually have two sets of singular vectors that form the orthogonal matrices \(U\) and \(V\) whereas a symmetric matrix has a single set of eigenvectors that form the orthogonal matrix \(Q\text{.}\)
The structure of singular value decompositions
Now that we have an understanding of what a singular value decomposition is and how to construct it, let's explore the ways in which a singular value decomposition reveals the underlying structure of the matrix. As we'll see, the matrices \(U\) and \(V\) in a singular value decomposition provide convenient bases for some important subspaces, such as the column and null spaces of the matrix. This observation will provide the key to some of our uses of these decompositions in the next section.
Activity 7.4.4.
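Let's suppose that a matrix \(A\) has a singular value decomposition \(A=U\Sigma V^T\) where
\begin{equation*} U = \begin{bmatrix} \mathbf u_1 & \mathbf u_2 & \mathbf u_3 & \mathbf u_4 \end{bmatrix}, \hspace{24pt} \Sigma = \begin{bmatrix} 20 & 0 & 0 \\ 0 & 5 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}, \hspace{24pt} V = \begin{bmatrix} \mathbf v_1 & \mathbf v_2 & \mathbf v_3 \end{bmatrix}\text{.} \end{equation*}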
- What are the dimensions of \(A\text{;}\) that is, how many rows and columns does \(A\) have?
-
Suppose we write a three-dimensional vector \(\mathbf x\) as a linear combination of right singular vectors:
\begin{equation*} \mathbf x = c_1\mathbf v_1 + c_2\mathbf v_2 + c_3\mathbf v_3\text{.} \end{equation*}
We would like to find an expression for \(A\mathbf x\text{.}\)
To begin, \(V^T\mathbf x = \threevec{\mathbf v_1\cdot\mathbf x} {\mathbf v_2\cdot\mathbf x} {\mathbf v_3\cdot\mathbf x} = \threevec{c_1}{c_2}{c_3} \text{.}\)
Now \(\Sigma V^T \mathbf x = \begin{bmatrix} 20 & 0 & 0 \\ 0 & 5 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}\threevec{c_1}{c_2}{c_3} = \fourvec{20c_1}{5c_2}00\text{.}\)
And finally, \(A\mathbf x = U\Sigma V^T\mathbf x = \begin{bmatrix} \mathbf u_1 & \mathbf u_2 & \mathbf u_3 & \mathbf u_4 \end{bmatrix} \fourvec{20c_1}{5c_2}00 = 20c_1\mathbf u_1 + 5c_2\mathbf u_2\text{.}\)
To summarize, we have \(A\mathbf x = 20c_1\mathbf u_1 + 5c_2\mathbf u_2\text{.}\)
What condition on \(c_1\text{,}\) \(c_2\text{,}\) and \(c_3\) must be satisfied if \(\mathbf x\) is a solution to the equation \(A\mathbf x=40\mathbf u_1 + 20\mathbf u_2\text{?}\) Is there a unique solution or infinitely many?
- Remembering that \(\mathbf u_1\) and \(\mathbf u_2\) are linearly independent, what condition on \(c_1\text{,}\) \(c_2\text{,}\) and \(c_3\) must be satisfied if \(A\mathbf x = \zerovec\text{?}\)
- How do the right singular vectors \(\mathbf v_i\) provide a basis for \(Nul(A)\text{,}\) the subspace of solutions to the equation \(A\mathbf x = \zerovec\text{?}\)
-
Remember that \(\mathbf b\) is in \(Col(A)\) if the equation \(A\mathbf x = \mathbf b\) is consistent, which means that
\begin{equation*} A\mathbf x = 20c_1\mathbf u_1 + 5c_2\mathbf u_2 = \mathbf b \end{equation*}
for some coefficients \(c_1\) and \(c_2\text{.}\) How do the left singular vectors \(\mathbf u_i\) provide an orthonormal basis for \(Col(A)\text{?}\)
- Remember that \(Rank(A)\) is the dimension of the column space. What is \(Rank(A)\) and how does the number of nonzero singular values determine \(Rank(A)\text{?}\)
This activity shows how a singular value decomposition of a matrix encodes important information about its null and column spaces. This is, in fact, the key observation that makes singular value decompositions so useful: the right and left singular vectors provide orthonormal bases for \(Nul(A)\) and \(Col(A)\text{,}\) respectively.
Suppose we have a singular value decomposition \(A=U\Sigma V^T\) where \(\Sigma = \begin{bmatrix} \sigma_1 & 0 & 0 & 0 & 0 \\ 0 & \sigma_2 & 0 & 0 & 0 \\ 0 & 0 & \sigma_3 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ \end{bmatrix} \text{.}\) This means that \(A\) has four rows and five columns just as \(\Sigma\) does.
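As in the activity, if \(\mathbf x = c_1 \mathbf v_1 + c_2\mathbf v_2 + \ldots + c_5\mathbf v_5\text{,}\) we have
\begin{equation*} A\mathbf x = \sigma_1c_1\mathbf u_1 + \sigma_2c_2\mathbf u_2 + \sigma_3c_3\mathbf u_3\text{.} \end{equation*}
If \(\mathbf b\) is in \(Col(A)\text{,}\) then \(\mathbf b\) must have the form
\begin{equation*} \mathbf b = A\mathbf x = \sigma_1c_1\mathbf u_1 + \sigma_2c_2\mathbf u_2 + \sigma_3c_3\mathbf u_3\text{,} \end{equation*}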
which says that \(\mathbf b\) is a linear combination of \(\mathbf u_1\text{,}\) \(\mathbf u_2\text{,}\) and \(\mathbf u_3\text{.}\) These three vectors therefore form a basis for \(Col(A)\text{.}\) In fact, since they are columns in the orthogonal matrix \(U\text{,}\) they form an orthonormal basis for \(Col(A)\text{.}\)
Remembering that \(Rank(A)=\dim Col(A)\text{,}\) we see that \(Rank(A) = 3\text{,}\) which results from the three nonzero singular values. In general, the rank \(r\) of a matrix \(A\) equals the number of nonzero singular values, and \(\mathbf u_1, \mathbf u_2, \ldots,\mathbf u_r\) form an orthonormal basis for \(Col(A)\text{.}\)
Moreover, if \(\mathbf x = c_1 \mathbf v_1 + c_2\mathbf v_2 + \ldots + c_5\mathbf v_5\) satisfies \(A\mathbf x = \zerovec\text{,}\) then
\begin{equation*} A\mathbf x = \sigma_1c_1\mathbf u_1 + \sigma_2c_2\mathbf u_2 + \sigma_3c_3\mathbf u_3 = \zerovec\text{,} \end{equation*}
which implies that \(c_1=0\text{,}\) \(c_2=0\text{,}\) and \(c_3=0\text{.}\) Therefore, \(\mathbf x = c_4\mathbf v_4+c_5\mathbf v_5\) so \(\mathbf v_4\) and \(\mathbf v_5\) form an orthonormal basis for \(Nul(A)\text{.}\)
More generally, if \(A\) is an \(m\times n\) matrix and if \(Rank(A) = r\text{,}\) the last \(n-r\) right singular vectors form an orthonormal basis for \(Nul(A)\text{.}\)
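To make this example concrete, we can build a \(4\times5\) matrix with three nonzero singular values and check the claims about its rank, null space, and column space. The sketch below assumes NumPy. The singular values 6, 3, and 2 are hypothetical, and the orthogonal matrices are taken from QR factorizations of random matrices, which is simply a convenient way to produce orthogonal matrices.

```python
import numpy as np

rng = np.random.default_rng(0)

# Orthogonal U (4x4) and V (5x5) from QR factorizations of random matrices.
U, _ = np.linalg.qr(rng.standard_normal((4, 4)))
V, _ = np.linalg.qr(rng.standard_normal((5, 5)))

# Sigma has the 4x5 shape of the example with three nonzero singular values.
# The particular values 6, 3, 2 are hypothetical.
Sigma = np.zeros((4, 5))
Sigma[0, 0], Sigma[1, 1], Sigma[2, 2] = 6.0, 3.0, 2.0
A = U @ Sigma @ V.T

# Rank: count the singular values that are nonzero up to round-off.
rank = int(np.sum(np.linalg.svd(A, compute_uv=False) > 1e-10))

# v_4 and v_5 are sent to the zero vector, so they lie in Nul(A).
null_residual = np.linalg.norm(A @ V[:, 3:])

# For a random x, A x has no component along u_4,
# so it lies in the span of u_1, u_2, u_3.
x = rng.standard_normal(5)
u4_component = U[:, 3] @ (A @ x)

print(rank, null_residual < 1e-10, abs(u4_component) < 1e-10)
```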
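Generally speaking, if the rank of an \(m\times n\) matrix \(A\) is \(r\text{,}\) then there are \(r\) nonzero singular values and \(\Sigma\) has the form
\begin{equation*} \Sigma = \begin{bmatrix} \sigma_1 & 0 & \ldots & 0 & \ldots & 0 \\ 0 & \sigma_2 & \ldots & 0 & \ldots & 0 \\ \vdots & \vdots & \ddots & \vdots & & \vdots \\ 0 & 0 & \ldots & \sigma_r & \ldots & 0 \\ \vdots & \vdots & & \vdots & \ddots & \vdots \\ 0 & 0 & \ldots & 0 & \ldots & 0 \\ \end{bmatrix}\text{.} \end{equation*}
The first \(r\) columns of \(U\) form an orthonormal basis for \(Col(A)\text{:}\)
\begin{equation*} Col(A) = Span\{\mathbf u_1, \mathbf u_2, \ldots, \mathbf u_r\}\text{,} \end{equation*}
and the last \(n-r\) columns of \(V\) form an orthonormal basis for \(Nul(A)\text{:}\)
\begin{equation*} Nul(A) = Span\{\mathbf v_{r+1}, \mathbf v_{r+2}, \ldots, \mathbf v_n\}\text{.} \end{equation*}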
In fact, we can say more. Remember that Proposition 7.4.6 says that \(A\) and its transpose \(A^T\) share the same singular values. Since the rank of a matrix equals its number of nonzero singular values, this means that \(Rank(A)=Rank(A^T)\text{,}\) a fact that we cited back in Section 6.2.
For any matrix \(A\text{,}\)
\begin{equation*} Rank(A) = Rank(A^T)\text{.} \end{equation*}
If we have a singular value decomposition of an \(m\times n\) matrix \(A=U\Sigma V^T\text{,}\) Proposition 7.4.6 also tells us that the left singular vectors of \(A\) are the right singular vectors of \(A^T\text{.}\) Therefore, \(U\) is the \(m\times m\) matrix whose columns are the right singular vectors of \(A^T\text{.}\) This means that the last \(m-r\) vectors form an orthonormal basis for \(Nul(A^T)\text{.}\) Therefore, the columns of \(U\) provide orthonormal bases for \(Col(A)\) and \(Nul(A^T)\text{:}\)
\begin{equation*} U = \Big[ \underbrace{\mathbf u_1 ~ \ldots ~ \mathbf u_r}_{Col(A)} \hspace{6pt} \underbrace{\mathbf u_{r+1} ~ \ldots ~ \mathbf u_m}_{Nul(A^T)} \Big]\text{.} \end{equation*}
This reflects the familiar fact that \(Nul(A^T)\) is the orthogonal complement of \(Col(A)\text{.}\)
In the same way, \(V\) is the \(n\times n\) matrix whose columns are the left singular vectors of \(A^T\text{,}\) which means that the first \(r\) vectors form an orthonormal basis for \(Col(A^T)\text{.}\) Because the columns of \(A^T\) are the rows of \(A\text{,}\) this subspace is sometimes called the row space of \(A\) and denoted \(Row(A)\text{.}\) While we have yet to have an occasion to use \(Row(A)\text{,}\) there are times when it is important to have an orthonormal basis for it. Fortunately, a singular value decomposition provides just that. To summarize, the columns of \(V\) provide orthonormal bases for \(Col(A^T)\) and \(Nul(A)\text{:}\)
\begin{equation*} V = \Big[ \underbrace{\mathbf v_1 ~ \ldots ~ \mathbf v_r}_{Col(A^T)} \hspace{6pt} \underbrace{\mathbf v_{r+1} ~ \ldots ~ \mathbf v_n}_{Nul(A)} \Big]\text{.} \end{equation*}
Considered altogether, the subspaces \(Col(A)\text{,}\) \(Nul(A)\text{,}\) \(Col(A^T)\text{,}\) and \(Nul(A^T)\) are called the four fundamental subspaces associated to \(A\text{.}\) In addition to telling us the rank of a matrix, a singular value decomposition gives us orthonormal bases for all four fundamental subspaces.
Theorem 7.4.9.
Suppose \(A\) is an \(m\times n\) matrix having a singular value decomposition \(A=U\Sigma V^T\text{.}\) Then
- \(r=Rank(A)\) is the number of nonzero singular values.
- The columns \(\mathbf u_1,\mathbf u_2,\ldots,\mathbf u_r\) form an orthonormal basis for \(Col(A)\text{.}\)
- The columns \(\mathbf u_{r+1},\ldots,\mathbf u_m\) form an orthonormal basis for \(Nul(A^T)\text{.}\)
- The columns \(\mathbf v_1,\mathbf v_2,\ldots,\mathbf v_r\) form an orthonormal basis for \(Col(A^T)\text{.}\)
- The columns \(\mathbf v_{r+1},\ldots,\mathbf v_n\) form an orthonormal basis for \(Nul(A)\text{.}\)
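The theorem translates directly into a short routine that reads off all four bases from a computed decomposition. The following is a sketch assuming NumPy; the function name `fundamental_subspaces`, the dictionary keys, and the tolerance used to decide which singular values count as zero are ours, and the \(3\times2\) test matrix is a hypothetical example.

```python
import numpy as np


def fundamental_subspaces(A, tol=1e-10):
    """Orthonormal bases for the four fundamental subspaces of A.

    Singular values at or below `tol` are treated as zero; this cutoff is
    an assumption and may need adjusting for badly scaled matrices.
    """
    A = np.asarray(A, dtype=float)
    U, singular_values, Vt = np.linalg.svd(A)   # full U (m x m) and V^T (n x n)
    V = Vt.T
    r = int(np.sum(singular_values > tol))      # rank = number of nonzero sigma_i
    return {
        "rank": r,
        "Col(A)": U[:, :r],      # u_1, ..., u_r
        "Nul(A^T)": U[:, r:],    # u_{r+1}, ..., u_m
        "Col(A^T)": V[:, :r],    # v_1, ..., v_r
        "Nul(A)": V[:, r:],      # v_{r+1}, ..., v_n
    }


# A hypothetical 3x2 matrix of rank 1: the second column is twice the first.
A = np.array([[1.0, 2.0],
              [2.0, 4.0],
              [3.0, 6.0]])
bases = fundamental_subspaces(A)
print(bases["rank"])
print(np.allclose(A @ bases["Nul(A)"], 0.0))
print(np.allclose(A.T @ bases["Nul(A^T)"], 0.0))
```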
When we previously outlined a procedure for finding a singular value decomposition of an \(m\times n\) matrix \(A\text{,}\) we found the left singular vectors \(\mathbf u_j\) using the expression \(A\mathbf v_j = \sigma_j\mathbf u_j\text{.}\) This produces left singular vectors \(\mathbf u_1, \mathbf u_2,\ldots,\mathbf u_r\text{,}\) where \(r=Rank(A)\text{.}\) If \(r\lt m\text{,}\) however, we still need to find the left singular vectors \(\mathbf u_{r+1},\ldots,\mathbf u_m\text{.}\) Theorem 7.4.9 tells us how to do that: because those vectors form an orthonormal basis for \(Nul(A^T)\text{,}\) we can find them by solving \(A^T\mathbf x = \zerovec\) to find a basis for \(Nul(A^T)\) and applying the Gram-Schmidt algorithm.
We won't worry about this issue too much, however, as we will frequently use software to find singular value decompositions for us.
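For readers curious about how the remaining left singular vectors might be produced, here is one possible sketch, assuming NumPy. It is a variant of the approach described above: rather than first solving \(A^T\mathbf x = \zerovec\text{,}\) it applies Gram-Schmidt to the standard basis vectors, discarding any that already lie in the span of the vectors found so far. Every new vector is orthogonal to \(\mathbf u_1,\ldots,\mathbf u_r\) and therefore lies in \(Nul(A^T)\text{.}\) The function name `complete_orthonormal_basis`, the tolerance, and the test matrix are ours.

```python
import numpy as np


def complete_orthonormal_basis(U_r, tol=1e-10):
    """Extend the orthonormal columns of U_r (m x r) to an orthogonal m x m matrix.

    Gram-Schmidt is applied to the standard basis vectors e_1, ..., e_m;
    a vector is kept only if a nonzero part remains after projecting away
    the columns already collected.
    """
    U_r = np.asarray(U_r, dtype=float)
    m = U_r.shape[0]
    columns = [U_r[:, j] for j in range(U_r.shape[1])]
    for i in range(m):
        if len(columns) == m:
            break
        w = np.zeros(m)
        w[i] = 1.0
        # Subtract projections onto the current columns; the second pass
        # reduces round-off error.
        for _ in range(2):
            for q in columns:
                w = w - (q @ w) * q
        norm = np.linalg.norm(w)
        if norm > tol:
            columns.append(w / norm)
    return np.column_stack(columns)


# Hypothetical rank-1 example: u_1 spans Col(A).
A = np.array([[1.0, 2.0],
              [2.0, 4.0],
              [3.0, 6.0]])
u1 = np.array([[1.0], [2.0], [3.0]]) / np.sqrt(14.0)
U = complete_orthonormal_basis(u1)

print(np.allclose(U.T @ U, np.eye(3)))      # U is orthogonal
print(np.allclose(A.T @ U[:, 1:], 0.0))     # new columns lie in Nul(A^T)
```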
Reduced singular value decompositions
As we'll see in the next section, there are times when it is helpful to express a singular value decomposition in a slightly different form.
Activity 7.4.5.
Suppose we have a singular value decomposition \(A = U\Sigma V^T\) where
- What are the dimensions of \(A\text{?}\) What is \(Rank(A)\text{?}\)
- Identify bases for \(Col(A)\) and \(Col(A^T)\text{.}\)
-
Explain why
\begin{equation*} U\Sigma = \begin{bmatrix} \mathbf u_1 & \mathbf u_2 \end{bmatrix} \begin{bmatrix} 18 & 0 & 0 \\ 0 & 4 & 0 \\ \end{bmatrix}\text{.} \end{equation*}
-
Explain why
\begin{equation*} \begin{bmatrix} 18 & 0 & 0 \\ 0 & 4 & 0 \\ \end{bmatrix}V^T = \begin{bmatrix} 18 & 0 \\ 0 & 4 \\ \end{bmatrix} \begin{bmatrix} \mathbf v_1 & \mathbf v_2 \end{bmatrix}^T\text{.} \end{equation*}
- If \(A = U\Sigma V^T\text{,}\) explain why \(A=U_r\Sigma_rV_r^T\) where the columns of \(U_r\) are an orthonormal basis for \(Col(A)\text{,}\) \(\Sigma_r\) is a diagonal, invertible matrix, and the columns of \(V_r\) form an orthonormal basis for \(Col(A^T)\text{.}\)
We call this a reduced singular value decomposition.
If \(A\) is an \(m\times n\) matrix having rank \(r\text{,}\) then \(A=U_r \Sigma_r V_r^T\) where
- \(U_r\) is an \(m\times r\) matrix whose columns form an orthonormal basis for \(Col(A)\text{,}\)
- \(\Sigma_r=\begin{bmatrix} \sigma_1 & 0 & \ldots & 0 \\ 0 & \sigma_2 & \ldots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \sigma_r \\ \end{bmatrix}\) is an \(r\times r\) diagonal, invertible matrix, and
- \(V_r\) is an \(n\times r\) matrix whose columns form an orthonormal basis for \(Col(A^T)\text{.}\)
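In Example 7.4.4, we found the singular value decomposition
\begin{equation*} A = \begin{bmatrix} 2 & -2 & 1 \\ -4 & -8 & -8 \\ \end{bmatrix} = \begin{bmatrix} 0 & 1 \\ -1 & 0 \\ \end{bmatrix} \begin{bmatrix} 12 & 0 & 0 \\ 0 & 3 & 0 \\ \end{bmatrix} \begin{bmatrix} 1/3 & 2/3 & 2/3 \\ 2/3 & -2/3 & 1/3 \\ 2/3 & 1/3 & -2/3 \\ \end{bmatrix}^T\text{.} \end{equation*}
Since there are two nonzero singular values, \(Rank(A) =2\) so that the reduced singular value decomposition is
\begin{equation*} A = \begin{bmatrix} 2 & -2 & 1 \\ -4 & -8 & -8 \\ \end{bmatrix} = \begin{bmatrix} 0 & 1 \\ -1 & 0 \\ \end{bmatrix} \begin{bmatrix} 12 & 0 \\ 0 & 3 \\ \end{bmatrix} \begin{bmatrix} 1/3 & 2/3 \\ 2/3 & -2/3 \\ 2/3 & 1/3 \\ \end{bmatrix}^T\text{.} \end{equation*}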
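A reduced singular value decomposition can be obtained from software by discarding the singular vectors associated with zero singular values. The sketch below assumes NumPy; the function name `reduced_svd` and the tolerance are ours. Note that `full_matrices=False` by itself only trims the factors to \(\min(m,n)\) columns, so the routine truncates further to the rank \(r\text{.}\)

```python
import numpy as np


def reduced_svd(A, tol=1e-10):
    """Return U_r (m x r), Sigma_r (r x r), V_r (n x r) with A = U_r Sigma_r V_r^T.

    Singular values at or below `tol` are treated as zero (an assumed cutoff).
    """
    A = np.asarray(A, dtype=float)
    # full_matrices=False already drops the columns beyond min(m, n).
    U, singular_values, Vt = np.linalg.svd(A, full_matrices=False)
    r = int(np.sum(singular_values > tol))
    return U[:, :r], np.diag(singular_values[:r]), Vt[:r, :].T


A = np.array([[2.0, -2.0, 1.0],
              [-4.0, -8.0, -8.0]])
U_r, Sigma_r, V_r = reduced_svd(A)

print(U_r.shape, Sigma_r.shape, V_r.shape)
print(np.allclose(U_r @ Sigma_r @ V_r.T, A))
```

Because \(\Sigma_r\) is square with nonzero diagonal entries, it is invertible, which is the property the reduced form is designed to provide.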
Summary
This section has explored singular value decompositions, how to find them, and how they organize important information about a matrix.
- A singular value decomposition of a matrix \(A\) is a factorization where \(A=U\Sigma V^T\text{.}\) The matrix \(\Sigma\) has the same shape as \(A\text{,}\) and its only nonzero entries are the singular values of \(A\text{,}\) which appear in decreasing order on the diagonal. The matrices \(U\) and \(V\) are orthogonal and contain the left and right singular vectors.
- To find a singular value decomposition of a matrix, we construct the Gram matrix \(G=A^TA\text{,}\) which is symmetric. The singular values of \(A\) are the square roots of the eigenvalues of \(G\text{,}\) and the right singular vectors \(\mathbf v_j\) are the associated eigenvectors of \(G\text{.}\) The left singular vectors \(\mathbf u_j\) are determined from the relationship \(A\mathbf v_j=\sigma_j\mathbf u_j\text{.}\)
- A singular value decomposition organizes fundamental information about a matrix. For instance, the number of nonzero singular values is the rank \(r\) of the matrix. The first \(r\) left singular vectors form an orthonormal basis for \(Col(A)\) with the remaining left singular vectors forming an orthonormal basis of \(Nul(A^T)\text{.}\) The first \(r\) right singular vectors form an orthonormal basis for \(Col(A^T)\) while the remaining right singular vectors form an orthonormal basis of \(Nul(A)\text{.}\)
- If \(A\) is a rank \(r\) matrix, we can write a reduced singular value decomposition as \(A=U_r\Sigma_rV_r^T\) where the columns of \(U_r\) form an orthonormal basis for \(Col(A)\text{,}\) the columns of \(V_r\) form an orthonormal basis for \(Col(A^T)\text{,}\) and \(\Sigma_r\) is an \(r\times r\) diagonal, invertible matrix.
7.4.5 Exercises
Consider the matrix \(A = \begin{bmatrix} 1 & 2 & 1 \\ 0 & -1 & 2 \\ \end{bmatrix} \text{.}\)
- Find the Gram matrix \(G=A^TA\) and use it to find the singular values and right singular vectors of \(A\text{.}\)
- Find the left singular vectors.
- Form the matrices \(U\text{,}\) \(\Sigma\text{,}\) and \(V\) and verify that \(A=U\Sigma V^T\text{.}\)
- What is \(Rank(A)\) and what does this say about \(Col(A)\text{?}\)
- Determine an orthonormal basis for \(Nul(A)\text{.}\)
Find singular value decompositions for the following matrices:
- \(\begin{bmatrix} 0 & 0 \\ 0 & -8 \end{bmatrix}\text{.}\)
- \(\begin{bmatrix} 2 & 3 \\ 0 & 2 \end{bmatrix}\text{.}\)
- \(\displaystyle \begin{bmatrix} 4 & 0 & 0 \\ 0 & 0 & 2 \end{bmatrix}\)
- \(\displaystyle \begin{bmatrix} 4 & 0 \\ 0 & 0 \\ 0 & 2 \end{bmatrix}\)
Consider the matrix \(A = \begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix} \text{.}\)
- Find a singular value decomposition of \(A\) and verify that it is also an orthogonal diagonalization of \(A\text{.}\)
- If \(A\) is a symmetric, positive semidefinite matrix, explain why a singular value decomposition of \(A\) is an orthogonal diagonalization of \(A\text{.}\)
Suppose that the matrix \(A\) has the singular value decomposition
- What are the dimensions of \(A\text{?}\)
- What is \(Rank(A)\text{?}\)
- Find orthonormal bases for \(Col(A)\text{,}\) \(Nul(A)\text{,}\) \(Col(A^T)\text{,}\) and \(Nul(A^T)\text{.}\)
- Find the orthogonal projection of \(\mathbf b=\fourvec102{-1}\) onto \(Col(A)\text{.}\)
Consider the matrix \(A = \begin{bmatrix} 1 & 0 & -1 \\ 2 & 2 & 0 \\ -1 & 1 & 2\\ \end{bmatrix} \text{.}\)
- Construct the Gram matrix \(G\) and use it to find the singular values and right singular vectors \(\mathbf v_1\text{,}\) \(\mathbf v_2\text{,}\) and \(\mathbf v_3\) of \(A\text{.}\) What are the matrices \(\Sigma\) and \(V\) in a singular value decomposition?
- What is \(Rank(A)\text{?}\)
- Find as many left singular vectors \(\mathbf u_j\) as you can using the relationship \(A\mathbf v_j=\sigma_j\mathbf u_j\text{.}\)
- Find an orthonormal basis for \(Nul(A^T)\) and use it to construct the matrix \(U\) so that \(A=U\Sigma V^T\text{.}\)
- State an orthonormal basis for \(Nul(A)\) and an orthonormal basis for \(Col(A)\text{.}\)
Consider the matrix \(B=\begin{bmatrix} 1 & 0 \\ 2 & -1 \\ 1 & 2 \end{bmatrix}\) and notice that \(B=A^T\) where \(A\) is the matrix in Exercise 7.4.5.1.
- Use your result from Exercise 7.4.5.1 to find a singular value decomposition of \(B=U\Sigma V^T\text{.}\)
- What is \(Rank(B)\text{?}\) Determine a basis for \(Col(B)\) and \(Col(B)^\perp\text{.}\)
- Suppose that \(\mathbf b=\threevec{-3}47\text{.}\) Use the bases you found in the previous part of this exercise to write \(\mathbf b=\widehat{\mathbf{b}}+\mathbf b^\perp\text{,}\) where \(\widehat{\mathbf{b}}\) is in \(Col(B)\) and \(\mathbf b^\perp\) is in \(Col(B)^\perp\text{.}\)
- Find the least squares approximate solution to the equation \(B\mathbf x=\mathbf b\text{.}\)
Suppose that \(A\) is a square \(m\times m\) matrix with singular value decomposition \(A=U\Sigma V^T\text{.}\)
- If \(A\) is invertible, find a singular value decomposition of \(A^{-1}\text{.}\)
- What condition on the singular values must hold for \(A\) to be invertible?
- How are the singular values of \(A\) and the singular values of \(A^{-1}\) related to one another?
- How are the right and left singular vectors of \(A\) related to the right and left singular vectors of \(A^{-1}\text{?}\)
- If \(Q\) is an orthogonal matrix, remember that \(Q^TQ=I\text{.}\) Explain why \(\det Q = \pm 1\text{.}\)
- If \(A=U\Sigma V^T\) is a singular value decomposition of a square matrix \(A\text{,}\) explain why \(|\det A|\) is the product of the singular values of \(A\text{.}\)
- What does this say about the singular values of \(A\) if \(A\) is invertible?
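If \(A\) is a matrix and \(G=A^TA\) its Gram matrix, remember that
\begin{equation*} q_G(\mathbf x) = \mathbf x\cdot(G\mathbf x) = \mathbf x\cdot(A^TA\mathbf x) = (A\mathbf x)\cdot(A\mathbf x) = |A\mathbf x|^2\text{.} \end{equation*}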
- For a general matrix \(A\text{,}\) explain why the eigenvalues of \(G\) are nonnegative.
- Given a symmetric matrix \(A\) having an eigenvalue \(\lambda\text{,}\) explain why \(\lambda^2\) is an eigenvalue of \(G\text{.}\)
- If \(A\) is symmetric, explain why the singular values of \(A\) equal the absolute values of its eigenvalues: \(\sigma_j = |\lambda_j|\text{.}\)
Determine whether the following statements are true or false and explain your reasoning.
- If \(A=U\Sigma V^T\) is a singular value decomposition of \(A\text{,}\) then \(G=V(\Sigma^T\Sigma)V^T\) is an orthogonal diagonalization of its Gram matrix.
- If \(A=U\Sigma V^T\) is a singular value decomposition of a rank 2 matrix \(A\text{,}\) then \(\mathbf v_1\) and \(\mathbf v_2\) form an orthonormal basis for the column space \(Col(A)\text{.}\)
- If \(A\) is a diagonalizable matrix, then its set of singular values is the same as its set of eigenvalues.
- If \(A\) is a \(10\times7\) matrix and \(\sigma_7 = 4\text{,}\) then the columns of \(A\) are linearly independent.
- The Gram matrix is always orthogonally diagonalizable.
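Suppose that \(A=U\Sigma V^T\) is a singular value decomposition of the \(m\times n\) matrix \(A\text{.}\) If \(\sigma_1,\ldots,\sigma_r\) are the nonzero singular values, the general form of the matrix \(\Sigma\) is
\begin{equation*} \Sigma = \begin{bmatrix} \sigma_1 & 0 & \ldots & 0 & \ldots & 0 \\ 0 & \sigma_2 & \ldots & 0 & \ldots & 0 \\ \vdots & \vdots & \ddots & \vdots & & \vdots \\ 0 & 0 & \ldots & \sigma_r & \ldots & 0 \\ \vdots & \vdots & & \vdots & \ddots & \vdots \\ 0 & 0 & \ldots & 0 & \ldots & 0 \\ \end{bmatrix}\text{.} \end{equation*}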
- If you know that the columns of \(A\) are linearly independent, what more can you say about the form of \(\Sigma\text{?}\)
- If you know that the columns of \(A\) span \(\mathbb R^m\text{,}\) what more can you say about the form of \(\Sigma\text{?}\)
- If you know that the columns of \(A\) are linearly independent and span \(\mathbb R^m\text{,}\) what more can you say about the form of \(\Sigma\text{?}\)