# 4.10: Spanning, Linear Independence and Basis in Rⁿ


- Determine the span of a set of vectors, and determine if a vector is contained in a specified span.
- Determine if a set of vectors is linearly independent.
- Understand the concepts of subspace, basis, and dimension.
- Find the row space, column space, and null space of a matrix.

By generating all linear combinations of a set of vectors one can obtain various subsets of \(\mathbb{R}^{n}\) which we call subspaces. For example, what set of vectors in \(\mathbb{R}^{3}\) generates the \(XY\)-plane? What is the smallest such set of vectors you can find? The tools of spanning, linear independence, and basis are exactly what is needed to answer these and similar questions, and they are the focus of this section. The following definition is essential.

Let \(U\) and \(W\) be sets of vectors in \(\mathbb{R}^n\). If all vectors in \(U\) are also in \(W\), we say that \(U\) is a **subset** of \(W\), denoted \[U \subseteq W\nonumber \]

## Spanning Set of Vectors

We begin this section with a definition.

The collection of all linear combinations of a set of vectors \(\{ \vec{u}_1, \cdots ,\vec{u}_k\}\) in \(\mathbb{R}^{n}\) is known as the span of these vectors and is written as \(\mathrm{span} \{\vec{u}_1, \cdots , \vec{u}_k\}\).

Consider the following example.

Describe the span of the vectors \(\vec{u}=\left[ \begin{array}{rrr} 1 & 1 & 0 \end{array} \right]^T\) and \(\vec{v}=\left[ \begin{array}{rrr} 3 & 2 & 0 \end{array} \right]^T \in \mathbb{R}^{3}\).

###### Solution

You can see that any linear combination of the vectors \(\vec{u}\) and \(\vec{v}\) yields a vector of the form \(\left[ \begin{array}{rrr} x & y & 0 \end{array} \right]^T\) in the \(XY\)-plane.

Moreover every vector in the \(XY\)-plane is in fact such a linear combination of the vectors \(\vec{u}\) and \(\vec{v}\). That’s because \[\left[ \begin{array}{r} x \\ y \\ 0 \end{array} \right] = (-2x+3y) \left[ \begin{array}{r} 1 \\ 1 \\ 0 \end{array} \right] + (x-y)\left[ \begin{array}{r} 3 \\ 2 \\ 0 \end{array} \right]\nonumber \]

Thus \(\mathrm{span}\{\vec{u},\vec{v}\}\) is precisely the \(XY\)-plane.
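The coefficient formula above can be spot-checked numerically. The following sketch (using NumPy, which is not part of the text and is purely illustrative) confirms that the coefficients \(-2x+3y\) and \(x-y\) reproduce a few sample points of the \(XY\)-plane as combinations of \(\vec{u}\) and \(\vec{v}\):

```python
import numpy as np

u = np.array([1.0, 1.0, 0.0])
v = np.array([3.0, 2.0, 0.0])

# For several points (x, y, 0) in the XY-plane, check that the coefficients
# (-2x + 3y) and (x - y) derived in the solution recover the point exactly.
for x, y in [(1.0, 0.0), (0.0, 1.0), (2.5, -4.0)]:
    target = np.array([x, y, 0.0])
    combo = (-2 * x + 3 * y) * u + (x - y) * v
    assert np.allclose(combo, target)
```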

You can convince yourself that no single vector can span the \(XY\)-plane. In fact, take a moment to consider what is meant by the span of a single vector.

However you can make the set larger if you wish. For example consider the larger set of vectors \(\{ \vec{u}, \vec{v}, \vec{w}\}\) where \(\vec{w}=\left[ \begin{array}{rrr} 4 & 5 & 0 \end{array} \right]^T\). Since the first two vectors already span the entire \(XY\)-plane, the span is once again precisely the \(XY\)-plane and nothing has been gained. Of course if you add a new vector such as \(\vec{w}=\left[ \begin{array}{rrr} 0 & 0 & 1 \end{array} \right]^T\) then it does span a different space. What is the span of \(\vec{u}, \vec{v}, \vec{w}\) in this case?

The distinction between the sets \(\{ \vec{u}, \vec{v}\}\) and \(\{ \vec{u}, \vec{v}, \vec{w}\}\) will be made using the concept of linear independence.

Consider the vectors \(\vec{u}, \vec{v}\), and \(\vec{w}\) discussed above. In the next example, we will show how to formally demonstrate that \(\vec{w}\) is in the span of \(\vec{u}\) and \(\vec{v}\).

Let \(\vec{u}=\left[ \begin{array}{rrr} 1 & 1 & 0 \end{array} \right]^T\) and \(\vec{v}=\left[ \begin{array}{rrr} 3 & 2 & 0 \end{array} \right]^T \in \mathbb{R}^{3}\). Show that \(\vec{w} = \left[ \begin{array}{rrr} 4 & 5 & 0 \end{array} \right]^{T}\) is in \(\mathrm{span} \left\{ \vec{u}, \vec{v} \right\}\).

###### Solution

For a vector to be in \(\mathrm{span} \left\{ \vec{u}, \vec{v} \right\}\), it must be a linear combination of these vectors. If \(\vec{w} \in \mathrm{span} \left\{ \vec{u}, \vec{v} \right\}\), we must be able to find scalars \(a,b\) such that\[\vec{w} = a \vec{u} +b \vec{v}\nonumber \]

We proceed as follows. \[\left[ \begin{array}{r} 4 \\ 5 \\ 0 \end{array} \right] = a \left[ \begin{array}{r} 1 \\ 1 \\ 0 \end{array} \right] + b \left[ \begin{array}{r} 3 \\ 2 \\ 0 \end{array} \right]\nonumber \] This is equivalent to the following system of equations \[\begin{aligned} a + 3b &= 4 \\ a + 2b &= 5\end{aligned}\]

We solve this system in the usual way, constructing the augmented matrix and row reducing to find the reduced row-echelon form. \[\left[ \begin{array}{rr|r} 1 & 3 & 4 \\ 1 & 2 & 5 \end{array} \right] \rightarrow \cdots \rightarrow \left[ \begin{array}{rr|r} 1 & 0 & 7 \\ 0 & 1 & -1 \end{array} \right]\nonumber \] The solution is \(a=7, b=-1\). This means that \[\vec{w} = 7 \vec{u} - \vec{v}\nonumber \] Therefore we can say that \(\vec{w}\) is in \(\mathrm{span} \left\{ \vec{u}, \vec{v} \right\}\).
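The same scalars can be found by solving the \(2 \times 2\) system directly (the third coordinate gives only \(0 = 0\)). A brief NumPy sketch, offered as an illustration rather than part of the text:

```python
import numpy as np

# Columns of M are the first two coordinates of u and v; solve M [a, b]^T = w.
M = np.array([[1.0, 3.0],
              [1.0, 2.0]])
w = np.array([4.0, 5.0])
a, b = np.linalg.solve(M, w)   # unique solution since M is invertible
```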

## Linearly Independent Set of Vectors

We now turn our attention to the following question: what linear combinations of a given set of vectors \(\{ \vec{u}_1, \cdots ,\vec{u}_k\}\) in \(\mathbb{R}^{n}\) yield the zero vector? Clearly \(0\vec{u}_1 + 0\vec{u}_2+ \cdots + 0 \vec{u}_k = \vec{0}\), but is it possible to have \(\sum_{i=1}^{k}a_{i}\vec{u}_{i}=\vec{0}\) without all coefficients being zero?

You can create examples where this easily happens. For example if \(\vec{u}_1=\vec{u}_2\), then \(1\vec{u}_1 - \vec{u}_2+ 0 \vec{u}_3 + \cdots + 0 \vec{u}_k = \vec{0}\), regardless of the vectors \(\{ \vec{u}_3, \cdots ,\vec{u}_k\}\). But sometimes it can be more subtle.

Consider the vectors \[\vec{u}_1=\left[ \begin{array}{rrr} 0 & 1 & -2 \end{array} \right]^T, \vec{u}_2=\left[ \begin{array}{rrr} 1 & 1 & 0 \end{array} \right]^T, \vec{u}_3=\left[ \begin{array}{rrr} -2 & 3 & 2 \end{array} \right]^T, \mbox{ and } \vec{u}_4=\left[ \begin{array}{rrr} 1 & -2 & 0 \end{array} \right]^T\nonumber \] in \(\mathbb{R}^{3}\).

Then verify that \[1\vec{u}_1 + 0 \vec{u}_2 + \vec{u}_3 + 2 \vec{u}_4 = \vec{0}\nonumber \]

You can see that the linear combination does yield the zero vector but has some non-zero coefficients. Thus we define a set of vectors to be *linearly dependent* if this happens.

A set of non-zero vectors \(\{ \vec{u}_1, \cdots ,\vec{u}_k\}\) in \(\mathbb{R}^{n}\) is said to be **linearly dependent** if some linear combination of these vectors, in which not all coefficients are zero, yields the zero vector.

Note that if \(\sum_{i=1}^{k}a_{i}\vec{u}_{i}=\vec{0}\) and some coefficient is non-zero, say \(a_1 \neq 0\), then \[\vec{u}_1 = \frac{-1}{a_1} \sum_{i=2}^{k}a_{i}\vec{u}_{i}\nonumber \] and thus \(\vec{u}_1\) is in the span of the other vectors. The converse clearly works as well, so a set of vectors is linearly dependent precisely when one of its vectors is in the span of the others.

In particular, you can show that the vector \(\vec{u}_1\) in the above example is in the span of the vectors \(\{ \vec{u}_2, \vec{u}_3, \vec{u}_4 \}\).
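That check can also be done numerically. The sketch below (NumPy, illustrative only) solves for coefficients \(c\) with \(\left[ \vec{u}_2 \; \vec{u}_3 \; \vec{u}_4 \right] c = \vec{u}_1\) and confirms the residual vanishes:

```python
import numpy as np

u1 = np.array([0.0, 1.0, -2.0])
# Columns of B are u2, u3, u4 from the example above.
B = np.array([[1.0, -2.0, 1.0],
              [1.0, 3.0, -2.0],
              [0.0, 2.0, 0.0]])
c, residual, rank, _ = np.linalg.lstsq(B, u1, rcond=None)
assert np.allclose(B @ c, u1)   # u1 is a combination of u2, u3, u4
```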

If a set of vectors is NOT linearly dependent, then it must be that any linear combination of these vectors which yields the zero vector must use all zero coefficients. This is a very important notion, and we give it its own name of *linear independence*.

A set of non-zero vectors \(\{ \vec{u}_1, \cdots ,\vec{u}_k\}\) in \(\mathbb{R}^{n}\) is said to be **linearly independent** if whenever \[\sum_{i=1}^{k}a_{i}\vec{u}_{i}=\vec{0}\nonumber \] it follows that each \(a_{i}=0\).

Note also that we require all vectors to be non-zero to form a linearly independent set.

To view this in a more familiar setting, form the \(n \times k\) matrix \(A\) having these vectors as columns. Then all we are saying is that the set \(\{ \vec{u}_1, \cdots ,\vec{u}_k\}\) is linearly independent precisely when \(AX=0\) has only the trivial solution.

Here is an example.

Consider the vectors \(\vec{u}=\left[ \begin{array}{rrr} 1 & 1 & 0 \end{array} \right]^T\), \(\vec{v}=\left[ \begin{array}{rrr} 1 & 0 & 1 \end{array} \right]^T\), and \(\vec{w}=\left[ \begin{array}{rrr} 0 & 1 & 1 \end{array} \right]^T\) in \(\mathbb{R}^{3}\). Verify whether the set \(\{\vec{u}, \vec{v}, \vec{w}\}\) is linearly independent.

###### Solution

Suppose that we have a linear combination \(a\vec{u} + b \vec{v} + c\vec{w} = \vec{0}\). Comparing entries gives the system \[\begin{aligned} a + b &= 0 \\ a + c &= 0 \\ b + c &= 0\end{aligned}\] which forces \(a=b=c=0\).

As mentioned above, you can equivalently form the \(3 \times 3\) matrix \(A = \left[ \begin{array}{ccc} 1 & 1 & 0 \\ 1 & 0 & 1 \\ 0 & 1 & 1 \\ \end{array} \right]\), and show that \(AX=0\) has only the trivial solution.

Thus this means the set \(\left\{ \vec{u}, \vec{v}, \vec{w} \right\}\) is linearly independent.
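The matrix criterion lends itself to a quick computation: the columns of \(A\) are linearly independent exactly when the rank of \(A\) equals the number of columns. A NumPy sketch (not part of the text, offered as an illustration):

```python
import numpy as np

# Columns are u, v, w from the example.
A = np.array([[1.0, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0]])
# Rank equal to the column count means AX = 0 has only the trivial solution.
independent = np.linalg.matrix_rank(A) == A.shape[1]
```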

In terms of spanning, a set of vectors is linearly independent if it contains no unnecessary vectors, that is, no vector is in the span of the others.

We put all this together in the following important theorem.

Let \(\left\{\vec{u}_{1},\cdots ,\vec{u}_{k}\right\}\) be a collection of vectors in \(\mathbb{R}^{n}\). Then the following are equivalent:

- It is linearly independent, that is whenever \[\sum_{i=1}^{k}a_{i}\vec{u}_{i}=\vec{0}\nonumber \] it follows that each coefficient \(a_{i}=0\).
- No vector is in the span of the others.
- The system of linear equations \(AX=0\) has only the trivial solution, where \(A\) is the \(n \times k\) matrix having these vectors as columns.

The last condition of this theorem is useful, as it allows us to use the reduced row-echelon form of a matrix to determine if a set of vectors is linearly independent. Let the vectors be columns of a matrix \(A\). Find the reduced row-echelon form of \(A\). If each column has a leading one, then it follows that the vectors are linearly independent.

Sometimes we refer to the condition regarding sums as follows: The set of vectors, \(\left\{ \vec{u}_{1},\cdots ,\vec{u}_{k}\right\}\) is linearly independent if and only if there is no nontrivial linear combination which equals the zero vector. A nontrivial linear combination is one in which not all the scalars equal zero. Similarly, a trivial linear combination is one in which all scalars equal zero.

Here is a detailed example in \(\mathbb{R}^{4}\).

Determine whether the set of vectors given by \[\left\{ \left[ \begin{array}{r} 1 \\ 2 \\ 3 \\ 0 \end{array} \right], \; \left[ \begin{array}{r} 2 \\ 1 \\ 0 \\ 1 \end{array} \right] , \; \left[ \begin{array}{r} 0 \\ 1 \\ 1 \\ 2 \end{array} \right] , \; \left[ \begin{array}{r} 3 \\ 2 \\ 2 \\ 0 \end{array} \right] \right\}\nonumber \] is linearly independent. If it is linearly dependent, express one of the vectors as a linear combination of the others.

###### Solution

In this case the matrix of the corresponding homogeneous system of linear equations is \[\left[ \begin{array}{rrrr|r} 1 & 2 & 0 & 3 & 0\\ 2 & 1 & 1 & 2 & 0 \\ 3 & 0 & 1 & 2 & 0 \\ 0 & 1 & 2 & 0 & 0 \end{array} \right]\nonumber \]

The reduced row-echelon form is \[\left[ \begin{array}{rrrr|r} 1 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 \end{array} \right]\nonumber \]

and so every column is a pivot column and the corresponding system \(AX=0\) only has the trivial solution. Therefore, these vectors are linearly independent and there is no way to obtain one of the vectors as a linear combination of the others.

Consider another example.

Determine whether the set of vectors given by \[\left\{ \left[ \begin{array}{r} 1 \\ 2 \\ 3 \\ 0 \end{array} \right], \; \left[ \begin{array}{r} 2 \\ 1 \\ 0 \\ 1 \end{array} \right], \; \left[ \begin{array}{r} 0 \\ 1 \\ 1 \\ 2 \end{array} \right], \; \left[ \begin{array}{r} 3 \\ 2 \\ 2 \\ -1 \end{array} \right] \right\}\nonumber \] is linearly independent. If it is linearly dependent, express one of the vectors as a linear combination of the others.

###### Solution

Form the \(4 \times 4\) matrix \(A\) having these vectors as columns: \[A= \left[ \begin{array}{rrrr} 1 & 2 & 0 & 3 \\ 2 & 1 & 1 & 2 \\ 3 & 0 & 1 & 2 \\ 0 & 1 & 2 & -1 \end{array} \right]\nonumber \] Then by Theorem \(\PageIndex{1}\), the given set of vectors is linearly independent exactly if the system \(AX=0\) has only the trivial solution.

The augmented matrix for this system and corresponding reduced row-echelon form are given by \[\left[ \begin{array}{rrrr|r} 1 & 2 & 0 & 3 & 0 \\ 2 & 1 & 1 & 2 & 0 \\ 3 & 0 & 1 & 2 & 0 \\ 0 & 1 & 2 & -1 & 0 \end{array} \right] \rightarrow \cdots \rightarrow \left[ \begin{array}{rrrr|r} 1 & 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 1 & 0 \\ 0 & 0 & 1 & -1 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{array} \right]\nonumber \] Not all the columns of the coefficient matrix are pivot columns and so the vectors are not linearly independent. In this case, we say the vectors are linearly dependent.

It follows that there are infinitely many solutions to \(AX=0\), one of which is \[\left[ \begin{array}{r} 1 \\ 1 \\ -1 \\ -1 \end{array} \right]\nonumber \] Therefore we can write \[1\left[ \begin{array}{r} 1 \\ 2 \\ 3 \\ 0 \end{array} \right] +1\left[ \begin{array}{r} 2 \\ 1 \\ 0 \\ 1 \end{array} \right] -1 \left[ \begin{array}{r} 0 \\ 1 \\ 1 \\ 2 \end{array} \right] -1 \left[ \begin{array}{r} 3 \\ 2 \\ 2 \\ -1 \end{array} \right] = \left[ \begin{array}{r} 0 \\ 0 \\ 0 \\ 0 \end{array} \right]\nonumber \]

This can be rearranged as follows \[1\left[ \begin{array}{r} 1 \\ 2 \\ 3 \\ 0 \end{array} \right] +1\left[ \begin{array}{r} 2 \\ 1 \\ 0 \\ 1 \end{array} \right] -1 \left[ \begin{array}{r} 0 \\ 1 \\ 1 \\ 2 \end{array} \right] =\left[ \begin{array}{r} 3 \\ 2 \\ 2 \\ -1 \end{array} \right]\nonumber \] This gives the last vector as a linear combination of the first three vectors.
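The dependence relation found above is easy to confirm by a direct computation; the following NumPy sketch (illustrative, not part of the text) also checks that \(A\) has rank \(3\), i.e. one redundant column:

```python
import numpy as np

# Columns of A are the four given vectors.
A = np.array([[1.0, 2.0, 0.0, 3.0],
              [2.0, 1.0, 1.0, 2.0],
              [3.0, 0.0, 1.0, 2.0],
              [0.0, 1.0, 2.0, -1.0]])
x = np.array([1.0, 1.0, -1.0, -1.0])   # the nontrivial solution found above
assert np.allclose(A @ x, 0)           # the combination vanishes
assert np.linalg.matrix_rank(A) == 3   # only three pivot columns
```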

Notice that we could rearrange this equation to write any of the four vectors as a linear combination of the other three.

When given a linearly independent set of vectors, we can determine if related sets are linearly independent.

Let \(\{ \vec{u},\vec{v},\vec{w}\}\) be an independent set of \(\mathbb{R}^n\). Is \(\{\vec{u}+\vec{v}, 2\vec{u}+\vec{w}, \vec{v}-5\vec{w}\}\) linearly independent?

###### Solution

Suppose \(a(\vec{u}+\vec{v}) + b(2\vec{u}+\vec{w}) + c(\vec{v}-5\vec{w})=\vec{0}_n\) for some \(a,b,c\in\mathbb{R}\). Then \[(a+2b)\vec{u} + (a+c)\vec{v} + (b-5c)\vec{w}=\vec{0}_n.\nonumber \]

Since \(\{\vec{u},\vec{v},\vec{w}\}\) is independent, \[\begin{aligned} a + 2b & = 0 \\ a + c & = 0 \\ b - 5c & = 0 \end{aligned}\]

This system of three equations in three variables has the unique solution \(a=b=c=0\). Therefore, \(\{\vec{u}+\vec{v}, 2\vec{u}+\vec{w}, \vec{v}-5\vec{w}\}\) is independent.
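The uniqueness of the solution can be seen from the rank of the coefficient matrix of this system; a small NumPy check (illustrative only):

```python
import numpy as np

# Coefficient matrix of the system in (a, b, c) above.
C = np.array([[1.0, 2.0, 0.0],
              [1.0, 0.0, 1.0],
              [0.0, 1.0, -5.0]])
# Full rank: the homogeneous system has only the trivial solution a = b = c = 0.
assert np.linalg.matrix_rank(C) == 3
```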

The following corollary follows from the fact that if the augmented matrix of a homogeneous system of linear equations has more columns than rows, the system has infinitely many solutions.

Let \(\left\{ \vec{u}_{1},\cdots ,\vec{u}_{k}\right\}\) be a set of vectors in \(\mathbb{R}^{n}\). If \(k>n\), then the set is linearly dependent (i.e. NOT linearly independent).

**Proof**-
Form the \(n \times k\) matrix \(A\) having the vectors \(\left\{ \vec{u}_{1},\cdots ,\vec{u}_{k}\right\}\) as its columns and suppose \(k > n\). Then \(A\) has rank \(r \leq n < k\), so the system \(AX=0\) has a nontrivial solution, and thus the set is not linearly independent by Theorem \(\PageIndex{1}\).

Consider the vectors \[\left\{ \left[ \begin{array}{r} 1 \\ 4 \end{array} \right], \left[ \begin{array}{r} 2 \\ 3 \end{array} \right], \left[ \begin{array}{r} 3 \\ 2 \end{array} \right] \right\}\nonumber \] Are these vectors linearly independent?

###### Solution

This set contains three vectors in \(\mathbb{R}^2\). By Corollary \(\PageIndex{1}\) these vectors are linearly dependent. In fact, we can write \[(-1) \left[ \begin{array}{r} 1 \\ 4 \end{array} \right] + (2) \left[ \begin{array}{r} 2 \\ 3 \end{array} \right] = \left[ \begin{array}{r} 3 \\ 2 \end{array} \right]\nonumber \] showing that this set is linearly dependent.
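The displayed relation is easy to verify numerically; an illustrative NumPy check:

```python
import numpy as np

v1 = np.array([1.0, 4.0])
v2 = np.array([2.0, 3.0])
v3 = np.array([3.0, 2.0])
# (-1) v1 + 2 v2 = v3, so the set is linearly dependent.
assert np.allclose(-v1 + 2 * v2, v3)
```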

The third vector in the previous example is in the span of the first two vectors, and we found one way to write it as a linear combination of them. It turns out that such a linear combination is the **only** one, provided that the set of vectors being combined is linearly independent.

Let \(U \subseteq\mathbb{R}^n\) be an independent set. Then any vector \(\vec{x}\in\mathrm{span}(U)\) can be written uniquely as a linear combination of vectors of \(U\).

**Proof**-
To prove this theorem, we will show that two linear combinations of vectors in \(U\) that equal \(\vec{x}\) must be the same. Let \(U =\{ \vec{u}_1, \vec{u}_2, \ldots, \vec{u}_k\}\). Suppose that there is a vector \(\vec{x}\in \mathrm{span}(U)\) such that \[\begin{aligned} \vec{x} & = s_1\vec{u}_1 + s_2\vec{u}_2 + \cdots + s_k\vec{u}_k, \mbox{ for some } s_1, s_2, \ldots, s_k\in\mathbb{R}, \mbox{ and} \\ \vec{x} & = t_1\vec{u}_1 + t_2\vec{u}_2 + \cdots + t_k\vec{u}_k, \mbox{ for some } t_1, t_2, \ldots, t_k\in\mathbb{R}.\end{aligned}\] Then \(\vec{0}_n=\vec{x}-\vec{x} = (s_1-t_1)\vec{u}_1 + (s_2-t_2)\vec{u}_2 + \cdots + (s_k-t_k)\vec{u}_k\).

Since \(U\) is independent, the only linear combination that vanishes is the trivial one, so \(s_i-t_i=0\) for all \(i\), \(1\leq i\leq k\).

Therefore, \(s_i=t_i\) for all \(i\), \(1\leq i\leq k\), and the representation is unique.

Suppose that \(\vec{u},\vec{v}\) and \(\vec{w}\) are nonzero vectors in \(\mathbb{R}^3\), and that \(\{ \vec{v},\vec{w}\}\) is independent. Consider the set \(\{ \vec{u},\vec{v},\vec{w}\}\). When can we know that this set is independent? It turns out that this follows exactly when \(\vec{u}\not\in\mathrm{span}\{\vec{v},\vec{w}\}\).

Suppose that \(\vec{u},\vec{v}\) and \(\vec{w}\) are nonzero vectors in \(\mathbb{R}^3\), and that \(\{ \vec{v},\vec{w}\}\) is independent. Prove that \(\{ \vec{u},\vec{v},\vec{w}\}\) is independent if and only if \(\vec{u}\not\in\mathrm{span}\{\vec{v},\vec{w}\}\).

###### Solution

If \(\vec{u}\in\mathrm{span}\{\vec{v},\vec{w}\}\), then there exist \(a,b\in\mathbb{R}\) so that \(\vec{u}=a\vec{v} + b\vec{w}\). This implies that \(\vec{u}-a\vec{v} - b\vec{w}=\vec{0}_3\), so \(\vec{u}-a\vec{v} - b\vec{w}\) is a nontrivial linear combination of \(\{ \vec{u},\vec{v},\vec{w}\}\) that vanishes, and thus \(\{ \vec{u},\vec{v},\vec{w}\}\) is dependent.

Now suppose that \(\vec{u}\not\in\mathrm{span}\{\vec{v},\vec{w}\}\), and suppose that there exist \(a,b,c\in\mathbb{R}\) such that \(a\vec{u}+b\vec{v}+c\vec{w}=\vec{0}_3\). If \(a\neq 0\), then \(\vec{u}=-\frac{b}{a}\vec{v}-\frac{c}{a}\vec{w}\), and \(\vec{u}\in\mathrm{span}\{\vec{v},\vec{w}\}\), a contradiction. Therefore, \(a=0\), implying that \(b\vec{v}+c\vec{w}=\vec{0}_3\). Since \(\{ \vec{v},\vec{w}\}\) is independent, \(b=c=0\), and thus \(a=b=c=0\), i.e., the only linear combination of \(\vec{u},\vec{v}\) and \(\vec{w}\) that vanishes is the trivial one.

Therefore, \(\{ \vec{u},\vec{v},\vec{w}\}\) is independent.

Consider the following useful theorem.

Let \(A\) be an invertible \(n \times n\) matrix. Then the columns of \(A\) are independent and span \(\mathbb{R}^n\). Similarly, the rows of \(A\) are independent and span the set of all \(1 \times n\) vectors.

This theorem also allows us to determine if a matrix is invertible. If an \(n \times n\) matrix \(A\) has columns which are independent, or span \(\mathbb{R}^n\), then it follows that \(A\) is invertible. If it has rows that are independent, or span the set of all \(1 \times n\) vectors, then \(A\) is invertible.

## A Short Application to Chemistry

The following section applies the concepts of spanning and linear independence to the subject of chemistry.

When working with chemical reactions, there are sometimes a large number of reactions and some are in a sense redundant. Suppose you have the following chemical reactions. \[\begin{array}{c} CO+\frac{1}{2}O_{2}\rightarrow CO_{2} \\ H_{2}+\frac{1}{2}O_{2}\rightarrow H_{2}O \\ CH_{4}+\frac{3}{2}O_{2}\rightarrow CO+2H_{2}O \\ CH_{4}+2O_{2}\rightarrow CO_{2}+2H_{2}O \end{array}\nonumber \] There are four chemical reactions here but they are not independent reactions. There is some redundancy. What are the independent reactions? Is there a way to consider a shorter list of reactions? To analyze this situation, we can write the reactions in a matrix as follows \[\left[ \begin{array}{cccccc} CO & O_{2} & CO_{2} & H_{2} & H_{2}O & CH_{4} \\ 1 & 1/2 & -1 & 0 & 0 & 0 \\ 0 & 1/2 & 0 & 1 & -1 & 0 \\ -1 & 3/2 & 0 & 0 & -2 & 1 \\ 0 & 2 & -1 & 0 & -2 & 1 \end{array} \right]\nonumber \]

Each row contains the coefficients of the respective elements in each reaction. For example, the top row of numbers comes from \(CO+\frac{1}{2}O_{2}-CO_{2}=0\) which represents the first of the chemical reactions.

We can write these coefficients in the following matrix \[\left[ \begin{array}{rrrrrr} 1 & 1/2 & -1 & 0 & 0 & 0 \\ 0 & 1/2 & 0 & 1 & -1 & 0 \\ -1 & 3/2 & 0 & 0 & -2 & 1 \\ 0 & 2 & -1 & 0 & -2 & 1 \end{array} \right]\nonumber \] Rather than listing all of the reactions as above, it would be more efficient to list only those which are independent, discarding the redundant ones. We can use the concepts of the previous section to accomplish this.

First, take the reduced row-echelon form of the above matrix. \[\left[ \begin{array}{rrrrrr} 1 & 0 & 0 & 3 & -1 & -1 \\ 0 & 1 & 0 & 2 & -2 & 0 \\ 0 & 0 & 1 & 4 & -2 & -1 \\ 0 & 0 & 0 & 0 & 0 & 0 \end{array} \right]\nonumber \] The top three rows represent “independent" reactions which come from the original four reactions. One can obtain each of the original four rows of the matrix given above by taking a suitable linear combination of rows of this reduced row-echelon matrix.

With the redundant reaction removed, we can consider the simplified reactions as the following equations \[\begin{array}{c} CO+3H_{2}-1H_{2}O-1CH_{4}=0 \\ O_{2}+2H_{2}-2H_{2}O=0 \\ CO_{2}+4H_{2}-2H_{2}O-1CH_{4}=0 \end{array}\nonumber \] In terms of the original notation, these are the reactions \[\begin{array}{c} CO+3H_{2}\rightarrow H_{2}O+CH_{4} \\ O_{2}+2H_{2}\rightarrow 2H_{2}O \\ CO_{2}+4H_{2}\rightarrow 2H_{2}O+CH_{4} \end{array}\nonumber \]

These three reactions provide an equivalent system to the original four equations. The idea is that, in terms of what happens chemically, you obtain the same information with the shorter list of reactions. Such a simplification is especially useful when dealing with very large lists of reactions which may result from experimental evidence.
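The redundancy can also be detected by computing the rank of the reaction-coefficient matrix; a brief NumPy sketch (illustrative, not part of the text):

```python
import numpy as np

# Rows are the four reactions; columns are CO, O2, CO2, H2, H2O, CH4.
R = np.array([[1.0, 0.5, -1.0, 0.0, 0.0, 0.0],
              [0.0, 0.5, 0.0, 1.0, -1.0, 0.0],
              [-1.0, 1.5, 0.0, 0.0, -2.0, 1.0],
              [0.0, 2.0, -1.0, 0.0, -2.0, 1.0]])
# Rank 3 confirms only three of the four reactions are independent.
assert np.linalg.matrix_rank(R) == 3
```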

## Subspaces and Basis

The goal of this section is to develop an understanding of a subspace of \(\mathbb{R}^n\). Before a precise definition is considered, we first examine the subspace test given below.

A subset \(V\) of \(\mathbb{R}^n\) is a subspace of \(\mathbb{R}^n\) if

- the zero vector of \(\mathbb{R}^n\), \(\vec{0}_n\), is in \(V\);
- \(V\) is closed under addition, i.e., for all \(\vec{u},\vec{w}\in V\), \(\vec{u}+\vec{w}\in V\);
- \(V\) is closed under scalar multiplication, i.e., for all \(\vec{u}\in V\) and \(k\in\mathbb{R}\), \(k\vec{u}\in V\).

This test allows us to determine if a given set is a subspace of \(\mathbb{R}^n\). Notice that the subset \(V = \left\{ \vec{0} \right\}\) is a subspace of \(\mathbb{R}^n\) (called the zero subspace), as is \(\mathbb{R}^n\) itself. A subspace which is not the zero subspace of \(\mathbb{R}^n\) is referred to as a proper subspace.

A subspace is simply a set of vectors with the property that linear combinations of these vectors remain in the set. Geometrically in \(\mathbb{R}^{3}\), it turns out that a subspace can be represented by either the origin as a single point, lines and planes which contain the origin, or the entire space \(\mathbb{R}^{3}\).

Consider the following example of a line in \(\mathbb{R}^3\).

In \(\mathbb{R}^3\), the line \(L\) through the origin that is parallel to the vector \({\vec{d}}= \left[ \begin{array}{r} -5 \\ 1 \\ -4 \end{array}\right]\) has (vector) equation \(\left[ \begin{array}{r} x \\ y \\ z \end{array}\right] =t\left[ \begin{array}{r} -5 \\ 1 \\ -4 \end{array}\right], t\in\mathbb{R}\), so \[L=\left\{ t{\vec{d}} ~|~ t\in\mathbb{R}\right\}.\nonumber \] Then \(L\) is a subspace of \(\mathbb{R}^3\).

###### Solution

Using the subspace test given above we can verify that \(L\) is a subspace of \(\mathbb{R}^3\).

- First: \(\vec{0}_3\in L\) since \(0\vec{d}=\vec{0}_3\).
- Suppose \(\vec{u},\vec{v}\in L\). Then by definition, \(\vec{u}=s\vec{d}\) and \(\vec{v}=t\vec{d}\), for some \(s,t\in\mathbb{R}\). Thus \[\vec{u}+\vec{v} = s\vec{d}+t\vec{d} = (s+t)\vec{d}.\nonumber \] Since \(s+t\in\mathbb{R}\), \(\vec{u}+\vec{v}\in L\); i.e., \(L\) is closed under addition.
- Suppose \(\vec{u}\in L\) and \(k\in\mathbb{R}\) (\(k\) is a scalar). Then \(\vec{u}=t\vec{d}\), for some \(t\in\mathbb{R}\), so \[k\vec{u}=k(t\vec{d})=(kt)\vec{d}.\nonumber \] Since \(kt\in\mathbb{R}\), \(k\vec{u}\in L\); i.e., \(L\) is closed under scalar multiplication.

Since \(L\) satisfies all conditions of the subspace test, it follows that \(L\) is a subspace.

Note that there is nothing special about the vector \(\vec{d}\) used in this example; the same proof works for any nonzero vector \(\vec{d}\in\mathbb{R}^3\), so any line through the origin is a subspace of \(\mathbb{R}^3\).

We are now prepared to examine the precise definition of a subspace as follows.

Let \(V\) be a nonempty collection of vectors in \(\mathbb{R}^{n}.\) Then \(V\) is called a subspace if whenever \(a\) and \(b\) are scalars and \(\vec{u}\) and \(\vec{v}\) are vectors in \(V,\) the linear combination \(a \vec{u}+ b \vec{v}\) is also in \(V\).

More generally, this means that a subspace contains the span of any finite collection of vectors in that subspace. It turns out that in \(\mathbb{R}^{n}\), a subspace is exactly the span of finitely many of its vectors.

Let \(V\) be a nonempty collection of vectors in \(\mathbb{R}^{n}.\) Then \(V\) is a subspace of \(\mathbb{R}^{n}\) if and only if there exist vectors \(\left\{ \vec{u}_{1},\cdots ,\vec{u}_{k}\right\}\) in \(V\) such that \[V= \mathrm{span}\left\{ \vec{u}_{1},\cdots ,\vec{u}_{k}\right\}\nonumber \] Furthermore, let \(W\) be another subspace of \(\mathbb{R}^n\) and suppose \(\left\{ \vec{u}_{1},\cdots ,\vec{u}_{k}\right\} \subseteq W\). Then it follows that \(V\) is a subset of \(W\).

Note that since \(W\) is arbitrary, the statement that \(V \subseteq W\) means that any other subspace of \(\mathbb{R}^n\) that contains these vectors will also contain \(V\).

**Proof**-
We first show that if \(V\) is a subspace, then it can be written as \(V= \mathrm{span}\left\{ \vec{u}_{1},\cdots ,\vec{u}_{k}\right\}\). Pick a vector \(\vec{u}_{1}\) in \(V\). If \(V= \mathrm{span}\left\{ \vec{u}_{1}\right\} ,\) then you have found your list of vectors and are done. If \(V\neq \mathrm{span}\left\{ \vec{u}_{1}\right\} ,\) then there exists \(\vec{u}_{2}\) a vector of \(V\) which is not in \(\mathrm{span}\left\{ \vec{u}_{1}\right\} .\) Consider \(\mathrm{span}\left\{ \vec{u}_{1},\vec{u}_{2}\right\}.\) If \(V=\mathrm{span}\left\{ \vec{u}_{1},\vec{u}_{2}\right\}\), we are done. Otherwise, pick \(\vec{u}_{3}\) not in \(\mathrm{span}\left\{ \vec{u}_{1},\vec{u}_{2}\right\} .\) Continue this way. Note that since \(V\) is a subspace, these spans are each contained in \(V\). The process must stop with \(\vec{u}_{k}\) for some \(k\leq n\) by Corollary \(\PageIndex{1}\), and thus \(V=\mathrm{span}\left\{ \vec{u}_{1},\cdots , \vec{u}_{k}\right\}\).

Now suppose \(V=\mathrm{span}\left\{ \vec{u}_{1},\cdots , \vec{u}_{k}\right\}\), we must show this is a subspace. So let \(\sum_{i=1}^{k}c_{i}\vec{u}_{i}\) and \(\sum_{i=1}^{k}d_{i}\vec{u}_{i}\) be two vectors in \(V\), and let \(a\) and \(b\) be two scalars. Then \[a \sum_{i=1}^{k}c_{i}\vec{u}_{i}+ b \sum_{i=1}^{k}d_{i}\vec{u}_{i}= \sum_{i=1}^{k}\left( a c_{i}+b d_{i}\right) \vec{u}_{i}\nonumber \] which is one of the vectors in \(\mathrm{span}\left\{ \vec{u}_{1},\cdots , \vec{u}_{k}\right\}\) and is therefore contained in \(V\). This shows that \(\mathrm{span}\left\{ \vec{u}_{1},\cdots ,\vec{u}_{k}\right\}\) has the properties of a subspace.

To prove that \(V \subseteq W\), we show that if \(\vec{u}\in V\), then \(\vec{u} \in W\).

Suppose \(\vec{u}\in V\). Then \(\vec{u}=a_1\vec{u}_1 + a_2\vec{u}_2 + \cdots + a_k\vec{u}_k\) for some \(a_i\in\mathbb{R}\), \(1\leq i\leq k\). Since \(W\) contains each \(\vec{u}_i\) and \(W\) is a subspace, it follows that \(a_1\vec{u}_1 + a_2\vec{u}_2 + \cdots + a_k\vec{u}_k \in W\).

Since the vectors \(\vec{u}_i\) we constructed in the proof above are not in the span of the previous vectors (by definition), they must be linearly independent and thus we obtain the following corollary.

If \(V\) is a subspace of \(\mathbb{R}^{n},\) then there exist linearly independent vectors \(\left\{ \vec{u}_{1},\cdots ,\vec{u}_{k}\right\}\) in \(V\) such that \(V=\mathrm{span}\left\{ \vec{u}_{1},\cdots ,\vec{u}_{k}\right\}\).

In summary, subspaces of \(\mathbb{R}^{n}\) consist of spans of finite, linearly independent collections of vectors of \(\mathbb{R}^{n}\). Such a collection of vectors is called a basis.

Let \(V\) be a subspace of \(\mathbb{R}^{n}\). Then \(\left\{ \vec{u}_{1},\cdots ,\vec{u}_{k}\right\}\) is a **basis** for \(V\) if the following two conditions hold.

- \(\mathrm{span}\left\{ \vec{u}_{1},\cdots ,\vec{u}_{k}\right\} =V\)
- \(\left\{ \vec{u}_{1},\cdots ,\vec{u}_{k}\right\}\) is linearly independent

Note the plural of basis is **bases**.

The following is a simple but very useful example of a basis, called the standard basis.

Let \(\vec{e}_i\) be the vector in \(\mathbb{R}^n\) which has a \(1\) in the \(i^{th}\) entry and zeros elsewhere, that is the \(i^{th}\) column of the identity matrix. Then the collection \(\left\{\vec{e}_1, \vec{e}_2, \cdots, \vec{e}_n \right\}\) is a basis for \(\mathbb{R}^n\) and is called the standard basis of \(\mathbb{R}^n\).

The main theorem about bases is not only that they exist, but that any two bases for the same subspace must have the same size. To show this, we will need the following fundamental result, called the Exchange Theorem.

Suppose \(\left\{ \vec{u}_{1},\cdots ,\vec{u}_{r}\right\}\) is a linearly independent set of vectors in \(\mathbb{R}^n\), and each \(\vec{u}_{k}\) is contained in \(\mathrm{span}\left\{ \vec{v}_{1},\cdots ,\vec{v}_{s}\right\}\). Then \(s\geq r.\)

In words, spanning sets have at least as many vectors as linearly independent sets.

**Proof**-
Since each \(\vec{u}_j\) is in \(\mathrm{span}\left\{ \vec{v}_{1},\cdots ,\vec{v}_{s}\right\}\), there exist scalars \(a_{ij}\) such that \[\vec{u}_{j}=\sum_{i=1}^{s}a_{ij}\vec{v}_{i}\nonumber \] Suppose for a contradiction that \(s<r\). Then the matrix \(A = \left[ a_{ij} \right]\) has fewer rows, \(s\), than columns, \(r\), so the system \(AX=0\) has a nontrivial solution \(\vec{d}\); that is, there is a \(\vec{d}\neq \vec{0}\) such that \(A\vec{d}=\vec{0}\). In other words, \[\sum_{j=1}^{r}a_{ij}d_{j}=0,\;i=1,2,\cdots ,s\nonumber \] Therefore, \[\begin{aligned} \sum_{j=1}^{r}d_{j}\vec{u}_{j} &=\sum_{j=1}^{r}d_{j}\sum_{i=1}^{s}a_{ij} \vec{v}_{i} \\ &=\sum_{i=1}^{s}\left( \sum_{j=1}^{r}a_{ij}d_{j}\right) \vec{v} _{i}=\sum_{i=1}^{s}0\vec{v}_{i}=\vec{0}\end{aligned}\] which contradicts the assumption that \(\left\{ \vec{u}_{1},\cdots ,\vec{u}_{r}\right\}\) is linearly independent, because not all the \(d_{j}\) are zero. This contradiction shows that \(s\geq r\).

We are now ready to show that any two bases are of the same size.

Let \(V\) be a subspace of \(\mathbb{R}^{n}\) with two bases \(B_1\) and \(B_2\). Suppose \(B_1\) contains \(s\) vectors and \(B_2\) contains \(r\) vectors. Then \(s=r.\)

**Proof**-
This follows right away from the Exchange Theorem above. Indeed observe that \(B_1 = \left\{ \vec{u}_{1},\cdots ,\vec{u}_{s}\right\}\) is a spanning set for \(V\) while \(B_2 = \left\{ \vec{v}_{1},\cdots ,\vec{v}_{r}\right\}\) is linearly independent, so \(s \geq r.\) Similarly \(B_2 = \left\{ \vec{v}_{1},\cdots ,\vec{v} _{r}\right\}\) is a spanning set for \(V\) while \(B_1 = \left\{ \vec{u}_{1},\cdots , \vec{u}_{s}\right\}\) is linearly independent, so \(r\geq s\).

The following definition can now be stated.

Let \(V\) be a subspace of \(\mathbb{R}^{n}\). Then the **dimension** of \(V\), written \(\mathrm{dim}(V)\), is defined to be the number of vectors in a basis of \(V\). This number is well defined, since any two bases of \(V\) contain the same number of vectors.

The next result follows.

The dimension of \(\mathbb{R}^{n}\) is \(n.\)

**Proof**-
You only need to exhibit a basis for \(\mathbb{R}^{n}\) which has \(n\) vectors. Such a basis is the standard basis \(\left\{ \vec{e}_{1},\cdots , \vec{e}_{n}\right\}\).

Consider the following example.

Let \[V=\left\{ \left[\begin{array}{c} a\\ b\\ c\\ d\end{array}\right]\in\mathbb{R}^4 ~:~ a-b=d-c \right\}.\nonumber \] Show that \(V\) is a subspace of \(\mathbb{R}^4\), find a basis of \(V\), and find \(\dim(V)\).

###### Solution

The condition \(a-b=d-c\) is equivalent to the condition \(a=b-c+d\), so we may write

\[V =\left\{ \left[\begin{array}{c} b-c+d\\ b\\ c\\ d\end{array}\right] ~:~b,c,d \in\mathbb{R} \right\} = \left\{ b\left[\begin{array}{c} 1\\ 1\\ 0\\ 0\end{array}\right] +c\left[\begin{array}{c} -1\\ 0\\ 1\\ 0\end{array}\right] +d\left[\begin{array}{c} 1\\ 0\\ 0\\ 1\end{array}\right] ~:~ b,c,d\in\mathbb{R} \right\}\nonumber \]

This shows that \(V\) is a subspace of \(\mathbb{R}^4\), since \(V=\mathrm{span}\{ \vec{u}_1, \vec{u}_2, \vec{u}_3 \}\) where

\[\vec{u}_1 = \left[\begin{array}{r} 1 \\ 1 \\ 0 \\ 0 \end{array}\right], \vec{u}_2 = \left[\begin{array}{r} -1 \\ 0 \\ 1 \\ 0 \end{array}\right], \vec{u}_3 = \left[\begin{array}{r} 1 \\ 0 \\ 0 \\ 1 \end{array}\right]\nonumber \]

Furthermore,

\[\left\{ \left[\begin{array}{c} 1\\ 1\\ 0\\ 0\end{array}\right], \left[\begin{array}{c} -1\\ 0\\ 1\\ 0\end{array}\right], \left[\begin{array}{c} 1\\ 0\\ 0\\ 1\end{array}\right] \right\}\nonumber \] is linearly independent, as can be seen by taking the reduced row-echelon form of the matrix whose columns are \(\vec{u}_1, \vec{u}_2\) and \(\vec{u}_3\).

\[\left[\begin{array}{rrr} 1 & -1 & 1 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{array}\right] \rightarrow \left[\begin{array}{rrr} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{array}\right]\nonumber \]

Since every column of the reduced row-echelon form matrix has a leading one, the columns are linearly independent.

Therefore \(\{ \vec{u}_1, \vec{u}_2, \vec{u}_3 \}\) is linearly independent and spans \(V\), so it is a basis of \(V\). Hence \(V\) has dimension three.
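This kind of independence check is easy to carry out by machine. The short Python sketch below (the `rref` helper is our own illustration, not part of the text) row reduces the matrix whose columns are \(\vec{u}_1, \vec{u}_2, \vec{u}_3\) using exact fractions; every column turns out to be a pivot column, confirming linear independence and \(\dim(V)=3\).

```python
from fractions import Fraction

def rref(M):
    """Reduced row-echelon form over exact Fractions; returns (R, pivot_columns)."""
    R = [[Fraction(x) for x in row] for row in M]
    rows, cols = len(R), len(R[0])
    pivots, r = [], 0
    for c in range(cols):
        pr = next((i for i in range(r, rows) if R[i][c] != 0), None)
        if pr is None:
            continue                              # no pivot in this column
        R[r], R[pr] = R[pr], R[r]                 # move pivot row into place
        R[r] = [x / R[r][c] for x in R[r]]        # scale the pivot to 1
        for i in range(rows):
            if i != r and R[i][c] != 0:           # clear the rest of column c
                R[i] = [a - R[i][c] * b for a, b in zip(R[i], R[r])]
        pivots.append(c)
        r += 1
    return R, pivots

# Columns are u1, u2, u3 from the example.
A = [[1, -1, 1],
     [1,  0, 0],
     [0,  1, 0],
     [0,  0, 1]]
R, pivots = rref(A)
independent = len(pivots) == 3   # every column is a pivot column
dim_V = len(pivots)
```

Because the arithmetic uses `Fraction`, the reduction matches the hand computation exactly, with no floating-point round-off.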

We continue by stating further properties of a set of vectors in \(\mathbb{R}^{n}\).

The following properties hold in \(\mathbb{R}^{n}\):

- Suppose \(\left\{ \vec{u}_{1},\cdots ,\vec{u}_{n}\right\}\) is linearly independent. Then \(\left\{ \vec{u}_{1},\cdots ,\vec{u}_{n}\right\}\) is a basis for \(\mathbb{R}^{n}\).
- Suppose \(\left\{ \vec{u}_{1},\cdots ,\vec{u}_{m}\right\}\) spans \(\mathbb{R}^{n}.\) Then \(m\geq n.\)
- If \(\left\{ \vec{u}_{1},\cdots ,\vec{u}_{n}\right\}\) spans \(\mathbb{R}^{n},\) then \(\left\{ \vec{u}_{1},\cdots ,\vec{u}_{n}\right\}\) is linearly independent.

**Proof**-
Assume first that \(\left\{ \vec{u}_{1},\cdots ,\vec{u}_{n}\right\}\) is linearly independent; we must show that this set spans \(\mathbb{R}^{n}\). Let \(\vec{v}\) be any vector of \(\mathbb{R}^{n}\); we need to write \(\vec{v}\) as a linear combination of the \(\vec{u}_i\)’s. Consider the matrix \(A\) having the vectors \(\vec{u}_i\) as columns: \[A = \left[ \begin{array}{rrr} \vec{u}_{1} & \cdots & \vec{u}_{n} \end{array} \right]\nonumber \]

By linear independence of the \(\vec{u}_i\)’s, the reduced row-echelon form of \(A\) is the identity matrix. Therefore the system \(A\vec{x}= \vec{v}\) has a (unique) solution, so \(\vec{v}\) is a linear combination of the \(\vec{u}_i\)’s.

To establish the second claim, suppose that \(m<n.\) Then letting \(\vec{u}_{i_{1}},\cdots ,\vec{u}_{i_{k}}\) be the pivot columns of the matrix \[\left[ \begin{array}{ccc} \vec{u}_{1} & \cdots & \vec{u}_{m} \end{array} \right]\nonumber \] it follows \(k\leq m<n\) and these \(k\) pivot columns would be a basis for \(\mathbb{R}^{n}\) having fewer than \(n\) vectors, contrary to Corollary \(\PageIndex{3}\).

Finally consider the third claim. If \(\left\{ \vec{u}_{1},\cdots ,\vec{u}_{n}\right\}\) is not linearly independent, then replace this list with \(\left\{ \vec{u}_{i_{1}},\cdots ,\vec{u}_{i_{k}}\right\}\) where these are the pivot columns of the matrix \[\left[ \begin{array}{ccc} \vec{u}_{1} & \cdots & \vec{u}_{n} \end{array} \right]\nonumber \] Then \(\left\{ \vec{u}_{i_{1}},\cdots ,\vec{u}_{i_{k}}\right\}\) spans \(\mathbb{R}^{n}\) and is linearly independent, so it is a basis having fewer than \(n\) vectors, again contrary to Corollary \(\PageIndex{3}\).
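The first claim can be illustrated concretely. In the Python sketch below (the `rref` helper and the particular vectors are our own illustration), three independent vectors in \(\mathbb{R}^3\) give a matrix whose reduced row-echelon form is the identity, so the augmented system \(A\vec{x}=\vec{v}\) can be solved for any \(\vec{v}\).

```python
from fractions import Fraction

def rref(M):
    """Reduced row-echelon form over exact Fractions; returns (R, pivot_columns)."""
    R = [[Fraction(x) for x in row] for row in M]
    rows, cols = len(R), len(R[0])
    pivots, r = [], 0
    for c in range(cols):
        pr = next((i for i in range(r, rows) if R[i][c] != 0), None)
        if pr is None:
            continue
        R[r], R[pr] = R[pr], R[r]
        R[r] = [x / R[r][c] for x in R[r]]
        for i in range(rows):
            if i != r and R[i][c] != 0:
                R[i] = [a - R[i][c] * b for a, b in zip(R[i], R[r])]
        pivots.append(c)
        r += 1
    return R, pivots

# Three linearly independent vectors in R^3, as columns.
A = [[1, 3, 0],
     [1, 2, 0],
     [0, 0, 1]]
R, pivots = rref(A)
is_identity = R == [[1, 0, 0], [0, 1, 0], [0, 0, 1]]

# Since the RREF is the identity, A x = v is solvable for any v:
# augment with a sample v and read off the unique solution.
v = [2, 5, 7]
Raug, _ = rref([row + [val] for row, val in zip(A, v)])
coeffs = [Raug[i][3] for i in range(3)]
recombined = [sum(A[i][j] * coeffs[j] for j in range(3)) for i in range(3)]
```

Reconstructing `v` from the coefficients confirms that the columns span \(\mathbb{R}^3\).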

The next theorem follows from the above claim.

Let \(V\) be a subspace of \(\mathbb{R}^n\). Then there exists a basis of \(V\) with \(\dim(V)\leq n\).

Consider Corollary \(\PageIndex{4}\) together with Theorem \(\PageIndex{8}\). Let \(\dim(V) = r\). Suppose there exists an independent set of vectors in \(V\). If this set contains \(r\) vectors, then it is a basis for \(V\). If it contains fewer than \(r\) vectors, then vectors can be added to the set to create a basis of \(V\). Similarly, any spanning set of \(V\) which contains more than \(r\) vectors can have vectors removed to create a basis of \(V\).

We illustrate this concept in the next example.

Consider the set \(U\) given by \[U=\left\{ \left.\left[\begin{array}{c} a\\ b\\ c\\ d\end{array}\right] \in\mathbb{R}^4 ~\right|~ a-b=d-c \right\}\nonumber \] Then \(U\) is a subspace of \(\mathbb{R}^4\) and \(\dim(U)=3\).

Then \[S=\left\{ \left[\begin{array}{c} 1\\ 1\\ 1\\ 1\end{array}\right], \left[\begin{array}{c} 2\\ 3\\ 3\\ 2\end{array}\right] \right\}\nonumber \] is an independent subset of \(U\). Therefore \(S\) can be extended to a basis of \(U\).

###### Solution

To extend \(S\) to a basis of \(U\), find a vector in \(U\) that is **not** in \(\mathrm{span}(S)\). \[\left[\begin{array}{rrr} 1 & 2 & ? \\ 1 & 3 & ? \\ 1 & 3 & ? \\ 1 & 2 & ? \end{array}\right]\nonumber \]

\[\left[\begin{array}{rrr} 1 & 2 & 1 \\ 1 & 3 & 0 \\ 1 & 3 & -1 \\ 1 & 2 & 0 \end{array}\right] \rightarrow \left[\begin{array}{rrr} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{array}\right]\nonumber \]

Therefore, \(S\) can be extended to the following basis of \(U\): \[\left\{ \left[\begin{array}{r} 1\\ 1\\ 1\\ 1\end{array}\right], \left[\begin{array}{r} 2\\ 3\\ 3\\ 2\end{array}\right], \left[\begin{array}{r} 1\\ 0\\ -1\\ 0\end{array}\right] \right\},\nonumber \]
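The extension step can be verified computationally. The Python sketch below (the `rref` helper is our own illustration) row reduces the matrix whose columns are the two vectors of \(S\) together with the proposed third vector; all three columns are pivot columns, so the extended set is independent and hence a basis of the three-dimensional subspace \(U\).

```python
from fractions import Fraction

def rref(M):
    """Reduced row-echelon form over exact Fractions; returns (R, pivot_columns)."""
    R = [[Fraction(x) for x in row] for row in M]
    rows, cols = len(R), len(R[0])
    pivots, r = [], 0
    for c in range(cols):
        pr = next((i for i in range(r, rows) if R[i][c] != 0), None)
        if pr is None:
            continue
        R[r], R[pr] = R[pr], R[r]
        R[r] = [x / R[r][c] for x in R[r]]
        for i in range(rows):
            if i != r and R[i][c] != 0:
                R[i] = [a - R[i][c] * b for a, b in zip(R[i], R[r])]
        pivots.append(c)
        r += 1
    return R, pivots

# Columns: the two vectors of S plus the candidate vector [1, 0, -1, 0].
# Note the candidate lies in U, since a - b = 1 - 0 = 1 = 0 - (-1) = d - c.
M = [[1, 2,  1],
     [1, 3,  0],
     [1, 3, -1],
     [1, 2,  0]]
R, pivots = rref(M)
extended_is_independent = len(pivots) == 3
```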

Next we consider the case of removing vectors from a spanning set to result in a basis.

Let \(W\) be a subspace of \(\mathbb{R}^n\). Also suppose that \(W=\mathrm{span}\left\{ \vec{w}_{1},\cdots ,\vec{w}_{m}\right\}\). Then there exists a subset of \(\left\{ \vec{w}_{1},\cdots ,\vec{w}_{m}\right\}\) which is a basis for \(W\).

**Proof**-
Let \(S\) denote the set of positive integers such that for \(k\in S,\) there exists a subset of \(\left\{ \vec{w}_{1},\cdots ,\vec{w}_{m}\right\}\) consisting of exactly \(k\) vectors which is a spanning set for \(W\). Thus \(m\in S\). Pick the smallest positive integer in \(S\) and call it \(k\). Then there exists \(\left\{ \vec{u}_{1},\cdots , \vec{u}_{k}\right\} \subseteq \left\{ \vec{w}_{1},\cdots ,\vec{w}_{m}\right\}\) such that \(\mathrm{span}\left\{ \vec{u}_{1},\cdots ,\vec{u}_{k}\right\} =W.\) If \[\sum_{i=1}^{k}c_{i}\vec{u}_{i}=\vec{0}\nonumber \] and not all of the \(c_{i}=0,\) then you could pick \(c_{j}\neq 0\), divide by it and solve for \(\vec{u}_{j}\) in terms of the others, \[\vec{u}_{j}=\sum_{i\neq j}\left( -\frac{c_{i}}{c_{j}}\right) \vec{u}_{i}\nonumber \] Then you could delete \(\vec{u}_{j}\) from the list and have the same span. Any linear combination involving \(\vec{u}_{j}\) would equal one in which \(\vec{u}_{j}\) is replaced with the above sum, showing that it could have been obtained as a linear combination of the \(\vec{u}_{i}\) for \(i\neq j\). Thus \(k-1\in S\), contrary to the choice of \(k\). Hence each \(c_{i}=0\), and so \(\left\{ \vec{u}_{1},\cdots ,\vec{u}_{k}\right\}\) is a linearly independent spanning set, that is, a basis for \(W\) consisting of vectors of \(\left\{ \vec{w}_{1},\cdots ,\vec{w}_{m}\right\}\).

The following example illustrates how to carry out this shrinking process to obtain a subset of a spanning set which is linearly independent.

Let \(W\) be the subspace \[\mathrm{span}\left\{ \left[ \begin{array}{r} 1 \\ 2 \\ -1 \\ 1 \end{array} \right] ,\left[ \begin{array}{r} 1 \\ 3 \\ -1 \\ 1 \end{array} \right] ,\left[ \begin{array}{r} 8 \\ 19 \\ -8 \\ 8 \end{array} \right] ,\left[ \begin{array}{r} -6 \\ -15 \\ 6 \\ -6 \end{array} \right] ,\left[ \begin{array}{r} 1 \\ 3 \\ 0 \\ 1 \end{array} \right] ,\left[ \begin{array}{r} 1 \\ 5 \\ 0 \\ 1 \end{array} \right] \right\}\nonumber \] Find a basis for \(W\) which consists of a subset of the given vectors.

###### Solution

You can use the reduced row-echelon form to accomplish this reduction. Form the matrix which has the given vectors as columns. \[\left[ \begin{array}{rrrrrr} 1 & 1 & 8 & -6 & 1 & 1 \\ 2 & 3 & 19 & -15 & 3 & 5 \\ -1 & -1 & -8 & 6 & 0 & 0 \\ 1 & 1 & 8 & -6 & 1 & 1 \end{array} \right]\nonumber \] Then take the reduced row-echelon form

\[\left[ \begin{array}{rrrrrr} 1 & 0 & 5 & -3 & 0 & -2 \\ 0 & 1 & 3 & -3 & 0 & 2 \\ 0 & 0 & 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 0 & 0 & 0 \end{array} \right]\nonumber \] It follows that a basis for \(W\) is

\[\left\{ \left[ \begin{array}{r} 1 \\ 2 \\ -1 \\ 1 \end{array} \right] ,\left[ \begin{array}{r} 1 \\ 3 \\ -1 \\ 1 \end{array} \right] ,\left[ \begin{array}{c} 1 \\ 3 \\ 0 \\ 1 \end{array} \right] \right\}\nonumber \] Since the first, second, and fifth columns of the reduced row-echelon form are obviously a basis for its column space, the same is true for the matrix having the given vectors as columns.
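The pivot-column selection can be automated. The Python sketch below (the `rref` helper is our own illustration) row reduces the matrix whose columns are the six spanning vectors; the pivot columns come out as the first, second, and fifth, matching the basis found above.

```python
from fractions import Fraction

def rref(M):
    """Reduced row-echelon form over exact Fractions; returns (R, pivot_columns)."""
    R = [[Fraction(x) for x in row] for row in M]
    rows, cols = len(R), len(R[0])
    pivots, r = [], 0
    for c in range(cols):
        pr = next((i for i in range(r, rows) if R[i][c] != 0), None)
        if pr is None:
            continue
        R[r], R[pr] = R[pr], R[r]
        R[r] = [x / R[r][c] for x in R[r]]
        for i in range(rows):
            if i != r and R[i][c] != 0:
                R[i] = [a - R[i][c] * b for a, b in zip(R[i], R[r])]
        pivots.append(c)
        r += 1
    return R, pivots

# The six spanning vectors of W, as the columns of a matrix.
M = [[ 1,  1,  8,  -6, 1, 1],
     [ 2,  3, 19, -15, 3, 5],
     [-1, -1, -8,   6, 0, 0],
     [ 1,  1,  8,  -6, 1, 1]]
R, pivots = rref(M)           # pivots holds 0-based indices of basis columns
dim_W = len(pivots)
```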

Consider the following theorems regarding a subspace contained in another subspace.

Let \(V\) and \(W\) be subspaces of \(\mathbb{R}^n\), and suppose that \(W\subseteq V\). Then \(\dim(W) \leq \dim(V)\), with equality precisely when \(W=V\).

Let \(W\) be any non-zero subspace of \(\mathbb{R}^{n}\) and let \(W\subseteq V\) where \(V\) is also a subspace of \(\mathbb{R}^{n}\). Then every basis of \(W\) can be extended to a basis for \(V\).

The proof is left as an exercise but proceeds as follows. Begin with a basis for \(W\), \(\left\{ \vec{w}_{1},\cdots ,\vec{w}_{s}\right\}\), and add in vectors from \(V\) until you obtain a basis for \(V\). Note that the process will stop because the dimension of \(V\) is no more than \(n\).

Consider the following example.

Let \(V=\mathbb{R}^{4}\) and let \[W=\mathrm{span}\left\{ \left[ \begin{array}{c} 1 \\ 0 \\ 1 \\ 1 \end{array} \right] ,\left[ \begin{array}{c} 0 \\ 1 \\ 0 \\ 1 \end{array} \right] \right\}\nonumber \] Extend this basis of \(W\) to a basis of \(\mathbb{R}^{4}\).

###### Solution

An easy way to do this is to take the reduced row-echelon form of the matrix

\[\left[ \begin{array}{cccccc} 1 & 0 & 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 & 1 & 0 \\ 1 & 1 & 0 & 0 & 0 & 1 \end{array} \right] \label{basiseq1}\]

Note how the given vectors were placed as the first two columns and then the matrix was extended in such a way that it is clear that the span of the columns of this matrix yield all of \(\mathbb{R}^{4}\). Now determine the pivot columns. The reduced row-echelon form is

\[\left[ \begin{array}{rrrrrr} 1 & 0 & 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 & -1 & 1 \\ 0 & 0 & 1 & 0 & -1 & 0 \\ 0 & 0 & 0 & 1 & 1 & -1 \end{array} \right] \label{basiseq2}\]

Therefore the pivot columns are \[\left[ \begin{array}{c} 1 \\ 0 \\ 1 \\ 1 \end{array} \right] ,\left[ \begin{array}{c} 0 \\ 1 \\ 0 \\ 1 \end{array} \right] ,\left[ \begin{array}{c} 1 \\ 0 \\ 0 \\ 0 \end{array} \right] ,\left[ \begin{array}{c} 0 \\ 1 \\ 0 \\ 0 \end{array} \right]\nonumber \]

and now this is an extension of the given basis for \(W\) to a basis for \(\mathbb{R}^{4}\).

Why does this work? The columns of \(\eqref{basiseq1}\) obviously span \(\mathbb{R}^{4}\). In fact the span of the first four is the same as the span of all six.
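This extension procedure is also easy to automate. In the Python sketch below (the `rref` helper is our own illustration), the two given vectors are placed first, followed by the four standard basis vectors; the pivot columns come out as the first four, reproducing the extended basis found above.

```python
from fractions import Fraction

def rref(M):
    """Reduced row-echelon form over exact Fractions; returns (R, pivot_columns)."""
    R = [[Fraction(x) for x in row] for row in M]
    rows, cols = len(R), len(R[0])
    pivots, r = [], 0
    for c in range(cols):
        pr = next((i for i in range(r, rows) if R[i][c] != 0), None)
        if pr is None:
            continue
        R[r], R[pr] = R[pr], R[r]
        R[r] = [x / R[r][c] for x in R[r]]
        for i in range(rows):
            if i != r and R[i][c] != 0:
                R[i] = [a - R[i][c] * b for a, b in zip(R[i], R[r])]
        pivots.append(c)
        r += 1
    return R, pivots

# Columns: the two basis vectors of W, then e1, e2, e3, e4.
M = [[1, 0, 1, 0, 0, 0],
     [0, 1, 0, 1, 0, 0],
     [1, 0, 0, 0, 1, 0],
     [1, 1, 0, 0, 0, 1]]
R, pivots = rref(M)
```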

Consider another example.

Let \(W\) be the span of \(\left[ \begin{array}{c} 1 \\ 0 \\ 1 \\ 0 \end{array} \right]\) in \(\mathbb{R}^{4}\). Let \(V\) consist of the span of the vectors \[\left[ \begin{array}{c} 1 \\ 0 \\ 1 \\ 0 \end{array} \right] ,\left[ \begin{array}{c} 0 \\ 1 \\ 1 \\ 1 \end{array} \right] ,\left[ \begin{array}{r} 7 \\ -6 \\ 1 \\ -6 \end{array} \right] ,\left[ \begin{array}{r} -5 \\ 7 \\ 2 \\ 7 \end{array} \right] ,\left[ \begin{array}{c} 0 \\ 0 \\ 0 \\ 1 \end{array} \right]\nonumber \] Find a basis for \(V\) which extends the basis for \(W\).

###### Solution

Note that the above vectors are not linearly independent, but their span, denoted \(V\), is a subspace which does include the subspace \(W\).

Using the process outlined in the previous example, form the following matrix

\[\left[ \begin{array}{rrrrr} 1 & 0 & 7 & -5 & 0 \\ 0 & 1 & -6 & 7 & 0 \\ 1 & 1 & 1 & 2 & 0 \\ 0 & 1 & -6 & 7 & 1 \end{array} \right]\nonumber \]

Next find its reduced row-echelon form \[\left[ \begin{array}{rrrrr} 1 & 0 & 7 & -5 & 0 \\ 0 & 1 & -6 & 7 & 0 \\ 0 & 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 & 0 \end{array} \right]\nonumber \]

It follows that a basis for \(V\) consists of the first two vectors and the last. \[\left\{ \left[ \begin{array}{c} 1 \\ 0 \\ 1 \\ 0 \end{array} \right] ,\left[ \begin{array}{c} 0 \\ 1 \\ 1 \\ 1 \end{array} \right] ,\left[ \begin{array}{c} 0 \\ 0 \\ 0 \\ 1 \end{array} \right] \right\}\nonumber \] Thus \(V\) is of dimension 3 and it has a basis which extends the basis for \(W\).

## Row Space, Column Space, and Null Space of a Matrix

We begin this section with a new definition.

Let \(A\) be an \(m\times n\) matrix. The **column space** of \(A\), written \(\mathrm{col}(A)\), is the span of the columns. The **row space** of \(A\), written \(\mathrm{row}(A)\), is the span of the rows.

Using the reduced row-echelon form, we can obtain an efficient description of the row and column space of a matrix. Consider the following lemma.

Let \(A\) and \(B\) be \(m\times n\) matrices such that \(A\) can be carried to \(B\) by elementary row \(\left[ \mbox{column} \right]\) operations. Then \(\mathrm{row}(A)=\mathrm{row}(B)\) \(\left[\mathrm{col}(A)=\mathrm{col}(B) \right]\).

**Proof**-
We will prove the claim for row operations; the argument for column operations is analogous.

Let \(\vec{r}_1, \vec{r}_2, \ldots, \vec{r}_m\) denote the rows of \(A\).

- If \(B\) is obtained from \(A\) by interchanging two rows of \(A\), then \(A\) and \(B\) have exactly the same rows, so \(\mathrm{row}(B)=\mathrm{row}(A)\).
- Suppose \(p\neq 0\), and suppose that for some \(j\), \(1\leq j\leq m\), \(B\) is obtained from \(A\) by multiplying row \(j\) by \(p\). Then \[\mathrm{row}(B)=\mathrm{span}\{ \vec{r}_1, \ldots, p\vec{r}_{j}, \ldots, \vec{r}_m\}.\nonumber \] Since \[\{ \vec{r}_1, \ldots, p\vec{r}_{j}, \ldots, \vec{r}_m\} \subseteq\mathrm{row}(A),\nonumber \] it follows that \(\mathrm{row}(B)\subseteq\mathrm{row}(A)\). Conversely, since \[\{ \vec{r}_1, \ldots, \vec{r}_m\}\subseteq\mathrm{row}(B),\nonumber \] it follows that \(\mathrm{row}(A)\subseteq\mathrm{row}(B)\). Therefore, \(\mathrm{row}(B)=\mathrm{row}(A)\).

Suppose \(p\neq 0\), and suppose that for some \(i\) and \(j\), \(1\leq i,j\leq m\), \(B\) is obtained from \(A\) by adding \(p\) times row \(j\) to row \(i\). Without loss of generality, we may assume \(i<j\).

Then \[\mathrm{row}(B)=\mathrm{span}\{ \vec{r}_1, \ldots, \vec{r}_{i-1}, \vec{r}_i+p\vec{r}_j, \ldots,\vec{r}_j,\ldots, \vec{r}_m\}.\nonumber \]

Since \[\{ \vec{r}_1, \ldots, \vec{r}_{i-1}, \vec{r}_i+p\vec{r}_{j}, \ldots, \vec{r}_m\} \subseteq\mathrm{row}(A),\nonumber \] it follows that \(\mathrm{row}(B)\subseteq\mathrm{row}(A)\).

Conversely, since \[\{ \vec{r}_1, \ldots, \vec{r}_m\}\subseteq\mathrm{row}(B),\nonumber \] it follows that \(\mathrm{row}(A)\subseteq\mathrm{row}(B)\). Therefore, \(\mathrm{row}(B)=\mathrm{row}(A)\).

Consider the following lemma.

Let \(A\) be an \(m \times n\) matrix and let \(R\) be its reduced row-echelon form. Then the nonzero rows of \(R\) form a basis of \(\mathrm{row}(R)\), and consequently of \(\mathrm{row}(A)\).

This lemma suggests that we can examine the reduced row-echelon form of a matrix in order to obtain the row space. Consider now the column space. The column space can be obtained by simply saying that it equals the span of all the columns. However, you can often get the column space as the span of fewer columns than this. A variation of the previous lemma provides a solution. Suppose \(A\) is row reduced to its reduced row-echelon form \(R\). Identify the pivot columns of \(R\) (columns which have leading ones), and take the corresponding columns of \(A\). It turns out that this forms a basis of \(\mathrm{col}(A)\).

Before proceeding to an example of this concept, we revisit the definition of rank.

Previously, we defined \(\mathrm{rank}(A)\) to be the number of leading entries in the row-echelon form of \(A\). Using an understanding of dimension and row space, we can now define rank as follows: \[\mbox{rank}(A) = \dim(\mathrm{row}(A))\nonumber \]

Consider the following example.

Find the rank of the following matrix and describe the column and row spaces. \[A = \left[ \begin{array}{rrrrr} 1 & 2 & 1 & 3 & 2 \\ 1 & 3 & 6 & 0 & 2 \\ 3 & 7 & 8 & 6 & 6 \end{array} \right]\nonumber \]

###### Solution

The reduced row-echelon form of \(A\) is \[\left[ \begin{array}{rrrrr} 1 & 0 & -9 & 9 & 2 \\ 0 & 1 & 5 & -3 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{array} \right]\nonumber \] Therefore, the rank is \(2\).

Notice that the first two columns of \(R\) are pivot columns. By the discussion following Lemma \(\PageIndex{2}\), we find the corresponding columns of \(A\), in this case the first two columns. Therefore a basis for \(\mathrm{col}(A)\) is given by \[\left\{\left[ \begin{array}{r} 1 \\ 1 \\ 3 \end{array} \right] , \left[ \begin{array}{r} 2 \\ 3 \\ 7 \end{array} \right] \right\}\nonumber \]

For example, consider the third column of the original matrix. It can be written as a linear combination of the first two columns of the original matrix as follows. \[\left[ \begin{array}{r} 1 \\ 6 \\ 8 \end{array} \right] =-9\left[ \begin{array}{r} 1 \\ 1 \\ 3 \end{array} \right] +5\left[ \begin{array}{r} 2 \\ 3 \\ 7 \end{array} \right]\nonumber \]

What about an efficient description of the row space? By Lemma \(\PageIndex{2}\) we know that the nonzero rows of \(R\) create a basis of \(\mathrm{row}(A)\). For the above matrix, the row space equals \[\mathrm{row}(A) = \mathrm{span} \left\{ \left[ \begin{array}{rrrrr} 1 & 0 & -9 & 9 & 2 \end{array} \right], \left[ \begin{array}{rrrrr} 0 & 1 & 5 & -3 & 0 \end{array} \right] \right\}\nonumber \]

Notice that the column space of \(A\) is given as the span of columns of the original matrix, while the row space of \(A\) is the span of rows of the reduced row-echelon form of \(A\).
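The whole computation in this example can be reproduced by machine. The Python sketch below (the `rref` helper is our own illustration) finds the rank, a row space basis from the nonzero rows of the reduced row-echelon form, and a column space basis from the pivot columns of the **original** matrix.

```python
from fractions import Fraction

def rref(M):
    """Reduced row-echelon form over exact Fractions; returns (R, pivot_columns)."""
    R = [[Fraction(x) for x in row] for row in M]
    rows, cols = len(R), len(R[0])
    pivots, r = [], 0
    for c in range(cols):
        pr = next((i for i in range(r, rows) if R[i][c] != 0), None)
        if pr is None:
            continue
        R[r], R[pr] = R[pr], R[r]
        R[r] = [x / R[r][c] for x in R[r]]
        for i in range(rows):
            if i != r and R[i][c] != 0:
                R[i] = [a - R[i][c] * b for a, b in zip(R[i], R[r])]
        pivots.append(c)
        r += 1
    return R, pivots

A = [[1, 2, 1, 3, 2],
     [1, 3, 6, 0, 2],
     [3, 7, 8, 6, 6]]
R, pivots = rref(A)
rank = len(pivots)
row_space_basis = [row for row in R if any(x != 0 for x in row)]   # rows of R
col_space_basis = [[A[i][j] for i in range(3)] for j in pivots]    # columns of A
```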

Consider another example.

Find the rank of the following matrix and describe the column and row spaces. \[A = \left[ \begin{array}{rrrrr} 1 & 2 & 1 & 3 & 2 \\ 1 & 3 & 6 & 0 & 2 \\ 1 & 2 & 1 & 3 & 2 \\ 1 & 3 & 2 & 4 & 0 \end{array} \right]\nonumber \]

###### Solution

The reduced row-echelon form is \[\left[ \begin{array}{rrrrr} 1 & 0 & 0 & 0 & \frac{13}{2} \\ 0 & 1 & 0 & 2 & -\frac{5}{2} \\ 0 & 0 & 1 & -1 & \frac{1}{2} \\ 0 & 0 & 0 & 0 & 0 \end{array} \right]\nonumber \] and so the rank is \(3\). The row space is given by \[\mathrm{row}(A) = \mathrm{span} \left\{ \left[ \begin{array}{rrrrr} 1 & 0 & 0 & 0 & \frac{13}{2} \end{array} \right], \left[ \begin{array}{rrrrr} 0 & 1 & 0 & 2 & -\frac{5}{2} \end{array} \right] , \left[ \begin{array}{rrrrr} 0 & 0 & 1 & -1 & \frac{1}{2} \end{array} \right] \right\}\nonumber \]

Notice that the first three columns of the reduced row-echelon form are pivot columns. The column space is the span of the first three columns in the **original matrix**, \[\mathrm{col}(A) = \mathrm{span} \left\{ \left[ \begin{array}{r} 1 \\ 1 \\ 1 \\ 1 \end{array} \right], \; \left[ \begin{array}{r} 2 \\ 3 \\ 2 \\ 3 \end{array} \right] , \; \left[ \begin{array}{r} 1 \\ 6 \\ 1 \\ 2 \end{array} \right] \right\}\nonumber \]

Consider the solution given above for Example \(\PageIndex{17}\), where the rank of \(A\) equals \(3\). Notice that the row space and the column space each had dimension equal to \(3\). It turns out that this is not a coincidence, and this essential result is referred to as the Rank Theorem and is given now. Recall that we defined \(\mathrm{rank}(A) = \mathrm{dim}(\mathrm{row}(A))\).

Let \(A\) be an \(m \times n\) matrix. Then \(\mathrm{dim}(\mathrm{col} (A))\), the dimension of the column space, is equal to the dimension of the row space, \(\mathrm{dim}(\mathrm{row}(A))\).

The following statements all follow from the Rank Theorem.

Let \(A\) be a matrix. Then the following are true:

- \(\mathrm{rank}(A) = \mathrm{rank}(A^T)\).
- For \(A\) of size \(m \times n\), \(\mathrm{rank}(A) \leq m\) and \(\mathrm{rank}(A) \leq n\).
- For \(A\) of size \(n \times n\), \(A\) is invertible if and only if \(\mathrm{rank}(A) = n\).
- For invertible matrices \(B\) and \(C\) of appropriate size, \(\mathrm{rank}(A) = \mathrm{rank}(BA) = \mathrm{rank}(AC)\).

Consider the following example.

Let \[A = \left[ \begin{array}{rr} 1 & 2 \\ -1 & 1 \end{array} \right]\nonumber \] Find \(\mathrm{rank}(A)\) and \(\mathrm{rank}(A^T)\).

###### Solution

To find \(\mathrm{rank}(A)\) we first row reduce to find the reduced row-echelon form. \[A = \left[ \begin{array}{rr} 1 & 2 \\ -1 & 1 \end{array} \right] \rightarrow \cdots \rightarrow \left[ \begin{array}{rr} 1 & 0 \\ 0 & 1 \end{array}\right]\nonumber \]

Therefore the rank of \(A\) is \(2\). Now consider \(A^T\) given by \[A^T = \left[ \begin{array}{rr} 1 & -1 \\ 2 & 1 \end{array} \right]\nonumber \] Again we row reduce to find the reduced row-echelon form.

\[\left[ \begin{array}{rr} 1 & -1 \\ 2 & 1 \end{array} \right] \rightarrow \cdots \rightarrow \left[ \begin{array}{rr} 1 & 0 \\ 0 & 1 \end{array} \right]\nonumber \]

You can see that \(\mathrm{rank}(A^T) = 2\), the same as \(\mathrm{rank}(A)\).
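The equality \(\mathrm{rank}(A) = \mathrm{rank}(A^T)\) can be checked directly. The Python sketch below (the `rref` helper is our own illustration) computes both ranks for the matrix of this example.

```python
from fractions import Fraction

def rref(M):
    """Reduced row-echelon form over exact Fractions; returns (R, pivot_columns)."""
    R = [[Fraction(x) for x in row] for row in M]
    rows, cols = len(R), len(R[0])
    pivots, r = [], 0
    for c in range(cols):
        pr = next((i for i in range(r, rows) if R[i][c] != 0), None)
        if pr is None:
            continue
        R[r], R[pr] = R[pr], R[r]
        R[r] = [x / R[r][c] for x in R[r]]
        for i in range(rows):
            if i != r and R[i][c] != 0:
                R[i] = [a - R[i][c] * b for a, b in zip(R[i], R[r])]
        pivots.append(c)
        r += 1
    return R, pivots

A = [[1, 2], [-1, 1]]
At = [list(col) for col in zip(*A)]   # transpose of A
rank_A = len(rref(A)[1])
rank_At = len(rref(At)[1])
```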

We now define what is meant by the null space of a general \(m\times n\) matrix.

The null space of a matrix \(A\), also referred to as the kernel of \(A\), is defined as follows. \[\mathrm{null} \left( A\right) =\left\{ \vec{x} :A \vec{x} =\vec{0}\right\}\nonumber \]

It can also be referred to using the notation \(\ker \left( A\right)\). Similarly, we can discuss the image of \(A\), denoted by \(\mathrm{im}\left( A\right)\). The image of \(A\) consists of the vectors of \(\mathbb{R}^{m}\) which “get hit” by \(A\). The formal definition is as follows.

The image of \(A\), written \(\mathrm{im}\left( A\right)\) is given by \[\mathrm{im}\left( A \right) = \left\{ A\vec{x} : \vec{x} \in \mathbb{R}^n \right\}\nonumber \]

Consider \(A\) as a mapping from \(\mathbb{R}^{n}\) to \(\mathbb{R}^{m}\) whose action is given by multiplication. The following diagram displays this scenario. \[\overset{\mathrm{null} \left( A\right) }{\mathbb{R}^{n}}\ \overset{A}{\rightarrow }\ \overset{ \mathrm{im}\left( A\right) }{\mathbb{R}^{m}}\nonumber \] As indicated, \(\mathrm{im}\left( A\right)\) is a subset of \(\mathbb{R}^{m}\) while \(\mathrm{null} \left( A\right)\) is a subset of \(\mathbb{R}^{n}\).

It turns out that the null space and image of \(A\) are both subspaces. Consider the following example.

Let \(A\) be an \(m\times n\) matrix. Then the null space of \(A\), \(\mathrm{null}(A)\) is a subspace of \(\mathbb{R}^n\).

###### Solution

- Since \(A\vec{0}_n=\vec{0}_m\), \(\vec{0}_n\in\mathrm{null}(A)\).
- Let \(\vec{x},\vec{y}\in\mathrm{null}(A)\). Then \(A\vec{x}=\vec{0}_m\) and \(A\vec{y}=\vec{0}_m\), so \[A(\vec{x}+\vec{y})=A\vec{x}+A\vec{y} = \vec{0}_m+\vec{0}_m=\vec{0}_m,\nonumber \] and thus \(\vec{x}+\vec{y}\in\mathrm{null}(A)\).
- Let \(\vec{x}\in\mathrm{null}(A)\) and \(k\in\mathbb{R}\). Then \(A\vec{x}=\vec{0}_m\), so \[A(k\vec{x}) = k(A\vec{x})=k\vec{0}_m=\vec{0}_m,\nonumber \] and thus \(k\vec{x}\in\mathrm{null}(A)\).

Therefore by the subspace test, \(\mathrm{null}(A)\) is a subspace of \(\mathbb{R}^n\).

The proof that \(\mathrm{im}(A)\) is a subspace of \(\mathbb{R}^m\) is similar and is left as an exercise to the reader.

We now wish to find a way to describe \(\mathrm{null}(A)\) for a matrix \(A\). However, finding \(\mathrm{null} \left( A\right)\) is not new! There is just some new terminology being used, as \(\mathrm{null} \left( A\right)\) is simply the solution set of the system \(A\vec{x}=\vec{0}\).

Let \(A\) be an \(m \times n\) matrix such that \(\mathrm{rank}(A) = r\). Then the system \(A\vec{x}=\vec{0}_m\) has \(n-r\) basic solutions, providing a basis of \(\mathrm{null}(A)\) with \(\dim(\mathrm{null}(A))=n-r\).

Consider the following example.

Let \[A=\left[ \begin{array}{rrr} 1 & 2 & 1 \\ 0 & -1 & 1 \\ 2 & 3 & 3 \end{array} \right]\nonumber \] Find \(\mathrm{null} \left( A\right)\) and \(\mathrm{im}\left( A\right)\).

###### Solution

In order to find \(\mathrm{null} \left( A\right)\), we simply need to solve the equation \(A\vec{x}=\vec{0}\). This is the usual procedure of writing the augmented matrix, finding the reduced row-echelon form and then the solution. The augmented matrix and corresponding reduced row-echelon form are \[\left[ \begin{array}{rrr|r} 1 & 2 & 1 & 0 \\ 0 & -1 & 1 & 0 \\ 2 & 3 & 3 & 0 \end{array} \right] \rightarrow \cdots \rightarrow \left[ \begin{array}{rrr|r} 1 & 0 & 3 & 0 \\ 0 & 1 & -1 & 0 \\ 0 & 0 & 0 & 0 \end{array} \right]\nonumber \]

The third column is not a pivot column, and therefore the solution will contain a parameter. The solution to the system \(A\vec{x}=\vec{0}\) is given by \[\left[ \begin{array}{r} -3t \\ t \\ t \end{array} \right] :t\in \mathbb{R}\nonumber \] which can be written as \[t \left[ \begin{array}{r} -3 \\ 1 \\ 1 \end{array} \right] :t\in \mathbb{R}\nonumber \]

Therefore, the null space of \(A\) is all multiples of this vector, which we can write as \[\mathrm{null} (A) = \mathrm{span} \left\{ \left[ \begin{array}{r} -3 \\ 1 \\ 1 \end{array} \right] \right\}\nonumber \]

Finally \(\mathrm{im}\left( A\right)\) is just \(\left\{ A\vec{x} : \vec{x} \in \mathbb{R}^n \right\}\) and hence consists of the span of all columns of \(A\), that is \(\mathrm{im}\left( A\right) = \mathrm{col} (A)\).

Notice from the above calculation that the first two columns of the reduced row-echelon form are pivot columns. Thus the column space is the span of the first two columns in the **original matrix**, and we get \[\mathrm{im}\left( A\right) = \mathrm{col}(A) = \mathrm{span} \left\{ \left[ \begin{array}{r} 1 \\ 0 \\ 2 \end{array} \right], \; \left[ \begin{array}{r} 2 \\ -1 \\ 3 \end{array} \right] \right\}\nonumber \]
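The basic-solution construction generalizes directly to code. The Python sketch below (the `rref` helper is our own illustration) reads the basic solutions off the reduced row-echelon form: each free column contributes one basis vector of the null space, with a \(1\) in the free position and the negated column entries in the pivot positions. It then checks that \(A\) times each basic solution is the zero vector.

```python
from fractions import Fraction

def rref(M):
    """Reduced row-echelon form over exact Fractions; returns (R, pivot_columns)."""
    R = [[Fraction(x) for x in row] for row in M]
    rows, cols = len(R), len(R[0])
    pivots, r = [], 0
    for c in range(cols):
        pr = next((i for i in range(r, rows) if R[i][c] != 0), None)
        if pr is None:
            continue
        R[r], R[pr] = R[pr], R[r]
        R[r] = [x / R[r][c] for x in R[r]]
        for i in range(rows):
            if i != r and R[i][c] != 0:
                R[i] = [a - R[i][c] * b for a, b in zip(R[i], R[r])]
        pivots.append(c)
        r += 1
    return R, pivots

A = [[1,  2, 1],
     [0, -1, 1],
     [2,  3, 3]]
R, pivots = rref(A)
n = 3
free = [c for c in range(n) if c not in pivots]
basis = []
for f in free:                       # one basic solution per free column
    v = [Fraction(0)] * n
    v[f] = Fraction(1)
    for r, p in enumerate(pivots):
        v[p] = -R[r][f]              # negate the free column's pivot-row entries
    basis.append(v)

# Check: A times each basic solution is the zero vector.
checks = [all(sum(A[i][j] * v[j] for j in range(n)) == 0 for i in range(n))
          for v in basis]
```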

Here is a larger example, but the method is entirely similar.

Let \[A=\left[ \begin{array}{rrrrr} 1 & 2 & 1 & 0 & 1 \\ 2 & -1 & 1 & 3 & 0 \\ 3 & 1 & 2 & 3 & 1 \\ 4 & -2 & 2 & 6 & 0 \end{array} \right]\nonumber \] Find the null space of \(A\).

###### Solution

To find the null space, we need to solve the equation \(AX=0\). The augmented matrix and corresponding reduced row-echelon form are given by

\[\left[ \begin{array}{rrrrr|r} 1 & 2 & 1 & 0 & 1 & 0 \\ 2 & -1 & 1 & 3 & 0 & 0 \\ 3 & 1 & 2 & 3 & 1 & 0 \\ 4 & -2 & 2 & 6 & 0 & 0 \end{array} \right] \rightarrow \cdots \rightarrow \left[ \begin{array}{rrrrr|r} 1 & 0 & \frac{3}{5} & \frac{6}{5} & \frac{1}{5} & 0 \\ 0 & 1 & \frac{1}{5} & -\frac{3}{5} & \frac{2}{5} & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \end{array} \right]\nonumber \] It follows that the first two columns are pivot columns, and the remaining three correspond to parameters \(s\), \(t\), and \(r\). Therefore, \(\mathrm{null} \left( A\right)\) is given by \[\left[ \begin{array}{c} \left( -\frac{3}{5}\right) s +\left( -\frac{6}{5}\right) t+\left( -\frac{1}{5}\right) r \\ \left( -\frac{1}{5}\right) s +\left( \frac{3}{5}\right) t +\left( - \frac{2}{5}\right) r \\ s \\ t \\ r \end{array} \right] :s ,t ,r\in \mathbb{R}\text{.}\nonumber \] We write this in the form \[s \left[ \begin{array}{r} -\frac{3}{5} \\ -\frac{1}{5} \\ 1 \\ 0 \\ 0 \end{array} \right] + t \left[ \begin{array}{r} -\frac{6}{5} \\ \frac{3}{5} \\ 0 \\ 1 \\ 0 \end{array} \right] + r \left[ \begin{array}{r} -\frac{1}{5} \\ -\frac{2}{5} \\ 0 \\ 0 \\ 1 \end{array} \right] :s , t , r\in \mathbb{R}\text{.}\nonumber \] In other words, the null space of this matrix equals the span of the three vectors above. Thus \[\mathrm{null} \left( A\right) =\mathrm{span}\left\{ \left[ \begin{array}{r} -\frac{3}{5} \\ -\frac{1}{5} \\ 1 \\ 0 \\ 0 \end{array} \right] ,\left[ \begin{array}{r} -\frac{6}{5} \\ \frac{3}{5} \\ 0 \\ 1 \\ 0 \end{array} \right] ,\left[ \begin{array}{r} -\frac{1}{5} \\ -\frac{2}{5} \\ 0 \\ 0 \\ 1 \end{array} \right] \right\}\nonumber \]

Notice also that the three vectors above are linearly independent, so the dimension of \(\mathrm{null} \left( A\right)\) is 3. The following is true in general: the number of parameters in the solution of \(AX=0\) equals the dimension of the null space. Recall also that the number of leading ones in the reduced row-echelon form equals the number of pivot columns, which is the rank of the matrix, which in turn equals the dimension of either the column or row space.

Before we proceed to an important theorem, we first define what is meant by the nullity of a matrix.

The dimension of the null space of a matrix is called the **nullity**, denoted \(\dim( \mathrm{null}\left(A\right))\).

From our observation above we can now state an important theorem.

Let \(A\) be an \(m\times n\) matrix. Then \(\mathrm{rank}\left( A\right) + \dim( \mathrm{null}\left(A\right)) =n\).

Let \[A=\left[ \begin{array}{rrr} 1 & 2 & 1 \\ 0 & -1 & 1 \\ 2 & 3 & 3 \end{array} \right]\nonumber \]

Find \(\mathrm{rank}\left( A\right)\) and \(\dim( \mathrm{null}\left(A\right))\).

###### Solution

In the above Example \(\PageIndex{20}\) we determined that the reduced row-echelon form of \(A\) is given by \[\left[ \begin{array}{rrr} 1 & 0 & 3 \\ 0 & 1 & -1 \\ 0 & 0 & 0 \end{array} \right]\nonumber \]

Therefore the rank of \(A\) is \(2\). We also determined that the null space of \(A\) is given by \[\mathrm{null} (A) = \mathrm{span} \left\{ \left[ \begin{array}{r} -3 \\ 1 \\ 1 \end{array} \right] \right\}\nonumber \]

Therefore the nullity of \(A\) is \(1\). It follows from Theorem \(\PageIndex{14}\) that \(\mathrm{rank}\left( A\right) + \dim( \mathrm{null}\left(A\right)) = 2 + 1 = 3\), which is the number of columns of \(A\).
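The rank-nullity relation \(\mathrm{rank}(A) + \dim(\mathrm{null}(A)) = n\) can be checked on the larger example as well. In the Python sketch below (the `rref` helper is our own illustration), the rank is the number of pivot columns and the nullity is the number of free columns, i.e. the number of parameters in the solution of \(A\vec{x}=\vec{0}\).

```python
from fractions import Fraction

def rref(M):
    """Reduced row-echelon form over exact Fractions; returns (R, pivot_columns)."""
    R = [[Fraction(x) for x in row] for row in M]
    rows, cols = len(R), len(R[0])
    pivots, r = [], 0
    for c in range(cols):
        pr = next((i for i in range(r, rows) if R[i][c] != 0), None)
        if pr is None:
            continue
        R[r], R[pr] = R[pr], R[r]
        R[r] = [x / R[r][c] for x in R[r]]
        for i in range(rows):
            if i != r and R[i][c] != 0:
                R[i] = [a - R[i][c] * b for a, b in zip(R[i], R[r])]
        pivots.append(c)
        r += 1
    return R, pivots

# The 4 x 5 matrix from the larger null-space example.
A = [[1,  2, 1, 0, 1],
     [2, -1, 1, 3, 0],
     [3,  1, 2, 3, 1],
     [4, -2, 2, 6, 0]]
R, pivots = rref(A)
n = 5
rank = len(pivots)
nullity = n - rank   # number of free columns = number of parameters
```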

We conclude this section with two similar, and important, theorems.

Let \(A\) be an \(m\times n\) matrix. The following are equivalent.

- \(\mathrm{rank}(A)=n\).
- \(\mathrm{row}(A)=\mathbb{R}^n\), i.e., the rows of \(A\) span \(\mathbb{R}^n\).
- The columns of \(A\) are independent in \(\mathbb{R}^m\).
- The \(n\times n\) matrix \(A^TA\) is invertible.
- There exists an \(n\times m\) matrix \(C\) so that \(CA=I_n\).
- If \(A\vec{x}=\vec{0}_m\) for some \(\vec{x}\in\mathbb{R}^n\), then \(\vec{x}=\vec{0}_n\).

Let \(A\) be an \(m\times n\) matrix. The following are equivalent.

- \(\mathrm{rank}(A)=m\).
- \(\mathrm{col}(A)=\mathbb{R}^m\), i.e., the columns of \(A\) span \(\mathbb{R}^m\).
- The rows of \(A\) are independent in \(\mathbb{R}^n\).
- The \(m\times m\) matrix \(AA^T\) is invertible.
- There exists an \(n\times m\) matrix \(C\) so that \(AC=I_m\).
- The system \(A\vec{x}=\vec{b}\) is consistent for every \(\vec{b}\in\mathbb{R}^m\).