4.1: Eigenvalues and Eigenvectors
- T/F: Given any matrix A, we can always find a vector →x where A→x=→x.
- When is the zero vector an eigenvector for a matrix?
- If →v is an eigenvector of a matrix A with eigenvalue of 2, then what is A→v?
- T/F: If A is a 5×5 matrix, to find the eigenvalues of A, we would need to find the roots of a 5th degree polynomial.
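We start by considering the matrix A and vector →x as given below.[1]
\[ A = \begin{bmatrix} 1 & 4 \\ 2 & 3 \end{bmatrix} \qquad \vec{x} = \begin{bmatrix} 1 \\ 1 \end{bmatrix} \]
Multiplying A→x gives:
\[ A\vec{x} = \begin{bmatrix} 1 & 4 \\ 2 & 3 \end{bmatrix} \begin{bmatrix} 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 5 \\ 5 \end{bmatrix} = 5 \begin{bmatrix} 1 \\ 1 \end{bmatrix}! \]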
Wow! It looks like multiplying A→x is the same as 5→x! This makes us wonder lots of things: Is this the only case in the world where something like this happens?[2] Is A somehow a special matrix, and A→x=5→x for any vector →x we pick?[3] Or maybe →x was a special vector, and no matter what 2×2 matrix A we picked, we would have A→x=5→x.[4]
A more likely explanation is this: given the matrix A, the number 5 and the vector →x formed a special pair that happened to work together in a nice way. It is then natural to wonder if other “special” pairs exist. For instance, could we find a vector →x where A→x=3→x?
This equation is hard to solve at first; we are not used to matrix equations where →x appears on both sides of “=.” Therefore we put off solving this for just a moment to state a definition and make a few comments.
Let A be an n×n matrix, →x a nonzero n×1 column vector and λ a scalar. If
A→x=λ→x,
then →x is an eigenvector of A and λ is an eigenvalue of A.
The word “eigen” is German for “proper” or “characteristic.” Therefore, an eigenvector of A is a “characteristic vector of A.” This vector tells us something about A.
Why do we use the Greek letter λ (lambda)? It is pure tradition. Above, we used specific numbers such as 5 and 3 for the scalar. We now write λ for an unknown scalar because that is how everyone else does it.[5] Don’t get hung up on this; λ is just a number.
Note that our definition requires that A be a square matrix. If A isn’t square then A→x and λ→x will have different sizes, and so they cannot be equal. Also note that →x must be nonzero. Why? What if →x=→0? Then no matter what λ is, A→x=λ→x. This would then imply that every number is an eigenvalue; if every number is an eigenvalue, then we wouldn’t need a definition for it.[6] Therefore we specify that →x≠→0.
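The definition can be turned into a small test. The following is a minimal Python sketch, assuming a matrix is stored as a list of rows with exact (integer) entries; the helper names mat_vec and is_eigenpair are illustrative and do not come from the text.

```python
def mat_vec(A, x):
    """Multiply a matrix (list of rows) by a vector (list of numbers)."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def is_eigenpair(A, lam, x):
    """Return True when x is nonzero and A x equals lam * x."""
    if all(xi == 0 for xi in x):
        return False  # the definition excludes the zero vector
    return mat_vec(A, x) == [lam * xi for xi in x]

A = [[1, 4], [2, 3]]
print(is_eigenpair(A, 5, [1, 1]))            # True: the pair from the start of the section
print(is_eigenpair(A, 5, [0, 0]))            # False: the zero vector is rejected
print(mat_vec(A, [0, 0]) == [7 * 0, 7 * 0])  # True: A0 = λ0 holds for any λ, here λ = 7
```

The last line shows why the zero vector has to be excluded: without that rule, every number would pass the test.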
Our last comment before trying to find eigenvalues and eigenvectors for given matrices deals with “why we care.” Did we stumble upon a mathematical curiosity, or does this somehow help us build better bridges, heal the sick, send astronauts into orbit, design optical equipment, and understand quantum mechanics? The answer, of course, is “Yes.”[7] This is a wonderful topic in and of itself: we need no external application to appreciate its worth. At the same time, it has many, many applications to “the real world.” A simple Internet search on “applications of eigenvalues” will confirm this.
Back to our math. Given a square matrix A, we want to find a nonzero vector →x and a scalar λ such that A→x=λ→x. We will solve this using the skills we developed in Chapter 2.
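\[ \begin{aligned}
A\vec{x} &= \lambda\vec{x} && \text{original equation} \\
A\vec{x} - \lambda\vec{x} &= \vec{0} && \text{subtract } \lambda\vec{x} \text{ from both sides} \\
(A - \lambda I)\vec{x} &= \vec{0} && \text{factor out } \vec{x}
\end{aligned} \]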
Think about this last factorization. We are likely tempted to say
A→x−λ→x=(A−λ)→x,
but this really doesn’t make sense. After all, what does “a matrix minus a number” mean? We need the identity matrix in order for this to be logical.
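A quick numerical check of this factorization, as a minimal Python sketch. The helper names mat_vec and shift are illustrative, and the vector x and scalar lam below are arbitrary choices made only for the demonstration.

```python
def mat_vec(A, x):
    """Multiply a matrix (list of rows) by a vector (list of numbers)."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def shift(A, lam):
    """Return A - lam*I: lam is subtracted from the diagonal entries only."""
    n = len(A)
    return [[A[i][j] - (lam if i == j else 0) for j in range(n)] for i in range(n)]

A = [[1, 4], [2, 3]]
x = [2, -1]  # arbitrary vector, chosen only for illustration
lam = 3      # arbitrary scalar

left = [ax - lam * xi for ax, xi in zip(mat_vec(A, x), x)]  # A x - lam x
right = mat_vec(shift(A, lam), x)                           # (A - lam I) x
print(left, right)  # [-8, 4] [-8, 4]
```

Subtracting lam from every entry of A, rather than only from the diagonal, would give a different matrix and a different product; the identity matrix is what keeps the two sides equal.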
Let us now think about the equation (A−λI)→x=→0. While it looks complicated, it really is just a matrix equation of the type we solved in Section 2.4. We are just trying to solve B→x=→0, where B=(A−λI).
We know from our previous work that this type of equation[8] always has a solution, namely, →x=→0. However, we want →x to be an eigenvector and, by the definition, eigenvectors cannot be →0.
This means that we want solutions to (A−λI)→x=→0 other than →x=→0. Recall that Theorem 2.6.4 says that if the matrix (A−λI) is invertible, then the only solution to (A−λI)→x=→0 is →x=→0. Therefore, in order to have other solutions, we need (A−λI) to not be invertible.
Finally, recall from Theorem 3.4.3 that noninvertible matrices all have a determinant of 0. Therefore, if we want to find eigenvalues λ and eigenvectors →x, we need det(A−λI)=0.
Let’s start our practice of this theory by finding λ such that det(A−λI)=0; that is, let’s find the eigenvalues of a matrix.
Find the eigenvalues of A, that is, find λ such that det(A−λI)=0, where
\[ A = \begin{bmatrix} 1 & 4 \\ 2 & 3 \end{bmatrix}. \]
Solution
(Note that this is the matrix we used at the beginning of this section.) First, we write out what A−λI is:
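\[ A - \lambda I = \begin{bmatrix} 1 & 4 \\ 2 & 3 \end{bmatrix} - \lambda \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} 1 & 4 \\ 2 & 3 \end{bmatrix} - \begin{bmatrix} \lambda & 0 \\ 0 & \lambda \end{bmatrix} = \begin{bmatrix} 1-\lambda & 4 \\ 2 & 3-\lambda \end{bmatrix} \]
Therefore,
\[ \det(A - \lambda I) = \begin{vmatrix} 1-\lambda & 4 \\ 2 & 3-\lambda \end{vmatrix} = (1-\lambda)(3-\lambda) - 8 = \lambda^2 - 4\lambda - 5 \]
Since we want det(A−λI)=0, we want \(\lambda^2 - 4\lambda - 5 = 0\). This is a simple quadratic equation that is easy to factor:
\[ \begin{aligned}
\lambda^2 - 4\lambda - 5 &= 0 \\
(\lambda - 5)(\lambda + 1) &= 0 \\
\lambda &= -1,\ 5
\end{aligned} \]
According to our above work, det(A−λI)=0 when λ=−1,5. Thus, the eigenvalues of A are −1 and 5.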
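The same computation can be sketched in Python for any 2×2 matrix. This is only an illustration: the function names char_poly_2x2 and real_roots are made up here, and the root finder assumes the quadratic has real roots.

```python
import math

def char_poly_2x2(A):
    """Coefficients (1, b, c) of det(A - λI) = λ² + bλ + c for a 2×2 matrix."""
    (a, b), (c, d) = A
    # (a - λ)(d - λ) - bc = λ² - (a + d)λ + (ad - bc)
    return 1, -(a + d), a * d - b * c

def real_roots(b, c):
    """Real roots of λ² + bλ + c via the quadratic formula (assumes they exist)."""
    disc = b * b - 4 * c
    if disc < 0:
        raise ValueError("no real roots")
    r = math.sqrt(disc)
    return (-b - r) / 2, (-b + r) / 2

_, b, c = char_poly_2x2([[1, 4], [2, 3]])
print(b, c)              # -4 -5
print(real_roots(b, c))  # (-1.0, 5.0)
```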
Earlier, when looking at the same matrix as used in our example, we wondered if we could find a vector →x such that A→x=3→x. According to this example, the answer is “No.” With this matrix A, the only values of λ that work are −1 and 5.
Let’s restate the above in a different way: It is pointless to try to find a nonzero →x where A→x=3→x, for there is no such →x. There are only 2 equations of this form that have a nonzero solution, namely
\[ A\vec{x} = -\vec{x} \quad \text{and} \quad A\vec{x} = 5\vec{x}. \]
When we introduced this section, we gave a vector →x such that A→x=5→x. Is this the only one? Let’s find out while calling our work an example; this will amount to finding the eigenvectors of A that correspond to the eigenvalue 5.
Find →x such that A→x=5→x, where
\[ A = \begin{bmatrix} 1 & 4 \\ 2 & 3 \end{bmatrix}. \]
Solution
Recall that our algebra from before showed that if
\[ A\vec{x} = \lambda\vec{x} \quad \text{then} \quad (A - \lambda I)\vec{x} = \vec{0}. \]
Therefore, we need to solve the equation (A−λI)→x=→0 for →x when λ=5.
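\[ A - 5I = \begin{bmatrix} 1 & 4 \\ 2 & 3 \end{bmatrix} - 5 \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} -4 & 4 \\ 2 & -2 \end{bmatrix} \]
To solve (A−5I)→x=→0, we form the augmented matrix and put it into reduced row echelon form:
\[ \begin{bmatrix} -4 & 4 & 0 \\ 2 & -2 & 0 \end{bmatrix} \overrightarrow{\text{rref}} \begin{bmatrix} 1 & -1 & 0 \\ 0 & 0 & 0 \end{bmatrix}. \]
Thus
\[ \begin{aligned} x_1 &= x_2 \\ x_2 &\text{ is free} \end{aligned} \]
and
\[ \vec{x} = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = x_2 \begin{bmatrix} 1 \\ 1 \end{bmatrix}. \]
We have infinitely many solutions to the equation A→x=5→x; any nonzero scalar multiple of the vector \(\begin{bmatrix} 1 \\ 1 \end{bmatrix}\) is an eigenvector. We can do a few examples to confirm this:
\[ \begin{bmatrix} 1 & 4 \\ 2 & 3 \end{bmatrix} \begin{bmatrix} 2 \\ 2 \end{bmatrix} = \begin{bmatrix} 10 \\ 10 \end{bmatrix} = 5 \begin{bmatrix} 2 \\ 2 \end{bmatrix}; \quad \begin{bmatrix} 1 & 4 \\ 2 & 3 \end{bmatrix} \begin{bmatrix} 7 \\ 7 \end{bmatrix} = \begin{bmatrix} 35 \\ 35 \end{bmatrix} = 5 \begin{bmatrix} 7 \\ 7 \end{bmatrix}; \quad \begin{bmatrix} 1 & 4 \\ 2 & 3 \end{bmatrix} \begin{bmatrix} -3 \\ -3 \end{bmatrix} = \begin{bmatrix} -15 \\ -15 \end{bmatrix} = 5 \begin{bmatrix} -3 \\ -3 \end{bmatrix}. \]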
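The hand checks above can be repeated for more multiples with a short loop. This is a minimal sketch; the helper mat_vec and the particular scalars tried are choices made for illustration only.

```python
def mat_vec(A, x):
    """Multiply a matrix (list of rows) by a vector (list of numbers)."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

A = [[1, 4], [2, 3]]
base = [1, 1]

results = {}
for t in (2, 7, -3, 10, -1):                 # nonzero scalars to try
    x = [t * b for b in base]                # the multiple t * [1, 1]
    results[t] = mat_vec(A, x) == [5 * xi for xi in x]

print(results)  # every entry is True when A x = 5 x holds for that multiple
```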
Our method of finding the eigenvalues of a matrix A boils down to determining which values of λ give the matrix (A−λI) a determinant of 0. In computing det(A−λI), we get a polynomial in λ whose roots are the eigenvalues of A. This polynomial is important and so it gets its own name.
Let A be an n×n matrix. The characteristic polynomial of A is the nth degree polynomial p(λ)=det(A−λI).
Our definition just states what the characteristic polynomial is. We know from our work so far why we care: the roots of the characteristic polynomial of an n×n matrix A are the eigenvalues of A.
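Read as a recipe, the definition says: subtract λ from each diagonal entry, then take a determinant. A minimal Python sketch of that recipe follows, assuming matrices are lists of rows and using cofactor expansion along the first row for the determinant. The names det and char_poly_value are illustrative, and cofactor expansion is only practical for small matrices.

```python
def det(M):
    """Determinant by cofactor expansion along the first row."""
    n = len(M)
    if n == 1:
        return M[0][0]
    total = 0
    for j in range(n):
        # delete row 0 and column j to form the minor
        minor = [row[:j] + row[j + 1:] for row in M[1:]]
        total += (-1) ** j * M[0][j] * det(minor)
    return total

def char_poly_value(A, lam):
    """Evaluate p(lam) = det(A - lam*I)."""
    n = len(A)
    shifted = [[A[i][j] - (lam if i == j else 0) for j in range(n)] for i in range(n)]
    return det(shifted)

A = [[1, 4], [2, 3]]
print([char_poly_value(A, lam) for lam in (-1, 0, 3, 5)])  # [0, -5, -8, 0]
```

The zeros appear exactly at λ=−1 and λ=5, the eigenvalues found earlier, while λ=3 gives a nonzero value.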
In Examples 4.1.1 and 4.1.2, we found eigenvalues and eigenvectors, respectively, of a given matrix. That is, given a matrix A, we found values λ and vectors →x such that A→x=λ→x. The steps that follow outline the general procedure for finding eigenvalues and eigenvectors; we’ll follow this up with some examples.
Let A be an n×n matrix.
- To find the eigenvalues of A, compute p(λ), the characteristic polynomial of A, set it equal to 0, then solve for λ.
- To find the eigenvectors of A, for each eigenvalue solve the homogeneous system (A−λI)→x=→0.
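For a 2×2 matrix, the second step can be sketched in a few lines of Python. The function name eigenvector_2x2 is hypothetical, and the sketch assumes that lam really is an eigenvalue of A, so that the two rows of A−λI are multiples of one another.

```python
def eigenvector_2x2(A, lam):
    """One nonzero solution of (A - lam*I) x = 0 for a 2×2 matrix A.

    Assumes lam is an eigenvalue of A, so the rows of A - lam*I are dependent.
    """
    (a, b), (c, d) = A
    p, q = a - lam, b      # first row of A - lam*I
    r, s = c, d - lam      # second row of A - lam*I
    if (p, q) != (0, 0):
        return [-q, p]     # p*(-q) + q*p = 0, so the first equation holds
    if (r, s) != (0, 0):
        return [-s, r]     # same idea using the second row
    return [1, 0]          # A - lam*I is the zero matrix: any nonzero vector works

A = [[1, 4], [2, 3]]
print(eigenvector_2x2(A, 5))   # [-4, -4], a multiple of [1, 1]
print(eigenvector_2x2(A, -1))  # [-4, 2]
```

Any nonzero multiple of a returned vector is an equally valid answer, so results may differ from a hand computation by a scalar factor.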
Find the eigenvalues of A, and for each eigenvalue, find an eigenvector where
\[ A = \begin{bmatrix} -3 & 15 \\ 3 & 9 \end{bmatrix}. \]
Solution
To find the eigenvalues, we must compute det(A−λI) and set it equal to 0.
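\[ \begin{aligned}
\det(A - \lambda I) &= \begin{vmatrix} -3-\lambda & 15 \\ 3 & 9-\lambda \end{vmatrix} \\
&= (-3-\lambda)(9-\lambda) - 45 \\
&= \lambda^2 - 6\lambda - 27 - 45 \\
&= \lambda^2 - 6\lambda - 72 \\
&= (\lambda - 12)(\lambda + 6)
\end{aligned} \]
Therefore, det(A−λI)=0 when λ=−6 and 12; these are our eigenvalues. (We should note that \(p(\lambda) = \lambda^2 - 6\lambda - 72\) is our characteristic polynomial.) It sometimes helps to give them “names,” so we’ll say \(\lambda_1 = -6\) and \(\lambda_2 = 12\). Now we find eigenvectors.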
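For \(\lambda_1 = -6\):
We need to solve the equation (A−(−6)I)→x=→0. To do this, we form the appropriate augmented matrix and put it into reduced row echelon form.
\[ \begin{bmatrix} 3 & 15 & 0 \\ 3 & 15 & 0 \end{bmatrix} \overrightarrow{\text{rref}} \begin{bmatrix} 1 & 5 & 0 \\ 0 & 0 & 0 \end{bmatrix}. \]
Our solution is
\[ \begin{aligned} x_1 &= -5x_2 \\ x_2 &\text{ is free;} \end{aligned} \]
in vector form, we have
\[ \vec{x} = x_2 \begin{bmatrix} -5 \\ 1 \end{bmatrix}. \]
We may pick any nonzero value for \(x_2\) to get an eigenvector; a simple option is \(x_2 = 1\). Thus we have the eigenvector
\[ \vec{x}_1 = \begin{bmatrix} -5 \\ 1 \end{bmatrix}. \]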
(We used the notation →x1 to associate this eigenvector with the eigenvalue λ1.)
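We now repeat this process to find an eigenvector for \(\lambda_2 = 12\):
In solving (A−12I)→x=→0, we find
\[ \begin{bmatrix} -15 & 15 & 0 \\ 3 & -3 & 0 \end{bmatrix} \overrightarrow{\text{rref}} \begin{bmatrix} 1 & -1 & 0 \\ 0 & 0 & 0 \end{bmatrix}. \]
In vector form, we have
\[ \vec{x} = x_2 \begin{bmatrix} 1 \\ 1 \end{bmatrix}. \]
Again, we may pick any nonzero value for \(x_2\), and so we choose \(x_2 = 1\). Thus an eigenvector for \(\lambda_2\) is
\[ \vec{x}_2 = \begin{bmatrix} 1 \\ 1 \end{bmatrix}. \]
To summarize, we have:
eigenvalue \(\lambda_1 = -6\) with eigenvector \(\vec{x}_1 = \begin{bmatrix} -5 \\ 1 \end{bmatrix}\)
and
eigenvalue \(\lambda_2 = 12\) with eigenvector \(\vec{x}_2 = \begin{bmatrix} 1 \\ 1 \end{bmatrix}\)
We should take a moment and check our work: is it true that \(A\vec{x}_1 = \lambda_1\vec{x}_1\)?
\[ A\vec{x}_1 = \begin{bmatrix} -3 & 15 \\ 3 & 9 \end{bmatrix} \begin{bmatrix} -5 \\ 1 \end{bmatrix} = \begin{bmatrix} 30 \\ -6 \end{bmatrix} = (-6) \begin{bmatrix} -5 \\ 1 \end{bmatrix} = \lambda_1\vec{x}_1. \]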
Yes; it appears we have truly found an eigenvalue/eigenvector pair for the matrix A.
Let’s do another example.
Let \(A = \begin{bmatrix} -3 & 0 \\ 5 & 1 \end{bmatrix}\). Find the eigenvalues of A and an eigenvector for each eigenvalue.
Solution
We first compute the characteristic polynomial, set it equal to 0, then solve for λ.
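\[ \det(A - \lambda I) = \begin{vmatrix} -3-\lambda & 0 \\ 5 & 1-\lambda \end{vmatrix} = (-3-\lambda)(1-\lambda) \]
From this, we see that det(A−λI)=0 when λ=−3,1. We’ll set \(\lambda_1 = -3\) and \(\lambda_2 = 1\).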
Finding an eigenvector for λ1:
We solve (A−(−3)I)→x=→0 for →x by row reducing the appropriate matrix:
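\[ \begin{bmatrix} 0 & 0 & 0 \\ 5 & 4 & 0 \end{bmatrix} \overrightarrow{\text{rref}} \begin{bmatrix} 1 & 4/5 & 0 \\ 0 & 0 & 0 \end{bmatrix}. \]
Our solution, in vector form, is
\[ \vec{x} = x_2 \begin{bmatrix} -4/5 \\ 1 \end{bmatrix}. \]
Again, we can pick any nonzero value for \(x_2\); a nice choice would eliminate the fraction. Therefore we pick \(x_2 = 5\), and find
\[ \vec{x}_1 = \begin{bmatrix} -4 \\ 5 \end{bmatrix}. \]
(As a check, \(A\vec{x}_1 = \begin{bmatrix} 12 \\ -15 \end{bmatrix} = -3\vec{x}_1\).)
Finding an eigenvector for \(\lambda_2\):
We solve (A−(1)I)→x=→0 for →x by row reducing the appropriate matrix:
\[ \begin{bmatrix} -4 & 0 & 0 \\ 5 & 0 & 0 \end{bmatrix} \overrightarrow{\text{rref}} \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}. \]
We’ve seen a matrix like this before,[9] but we may need a bit of refreshing. Our first row tells us that \(x_1 = 0\), and we see that no rows/equations involve \(x_2\). We conclude that \(x_2\) is free. Therefore, our solution, in vector form, is
\[ \vec{x} = x_2 \begin{bmatrix} 0 \\ 1 \end{bmatrix}. \]
We pick \(x_2 = 1\), and find
\[ \vec{x}_2 = \begin{bmatrix} 0 \\ 1 \end{bmatrix}. \]
To summarize, we have: eigenvalue \(\lambda_1 = -3\) with eigenvector \(\vec{x}_1 = \begin{bmatrix} -4 \\ 5 \end{bmatrix}\) and eigenvalue \(\lambda_2 = 1\) with eigenvector \(\vec{x}_2 = \begin{bmatrix} 0 \\ 1 \end{bmatrix}\).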
So far, our examples have involved 2×2 matrices. Let’s do an example with a 3×3 matrix.
Find the eigenvalues of A, and for each eigenvalue, give one eigenvector, where
\[ A = \begin{bmatrix} -7 & -2 & 10 \\ -3 & 2 & 3 \\ -6 & -2 & 9 \end{bmatrix}. \]
Solution
We first compute the characteristic polynomial, set it equal to 0, then solve for λ. A warning: this process is rather long. We’ll use cofactor expansion along the first row; don’t get bogged down with the arithmetic that comes from each step; just try to get the basic idea of what was done from step to step.
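\[ \begin{aligned}
\det(A - \lambda I) &= \begin{vmatrix} -7-\lambda & -2 & 10 \\ -3 & 2-\lambda & 3 \\ -6 & -2 & 9-\lambda \end{vmatrix} \\
&= (-7-\lambda) \begin{vmatrix} 2-\lambda & 3 \\ -2 & 9-\lambda \end{vmatrix} - (-2) \begin{vmatrix} -3 & 3 \\ -6 & 9-\lambda \end{vmatrix} + 10 \begin{vmatrix} -3 & 2-\lambda \\ -6 & -2 \end{vmatrix} \\
&= (-7-\lambda)(\lambda^2 - 11\lambda + 24) + 2(3\lambda - 9) + 10(-6\lambda + 18) \\
&= -\lambda^3 + 4\lambda^2 - \lambda - 6 \\
&= -(\lambda + 1)(\lambda - 2)(\lambda - 3)
\end{aligned} \]
In the last step we factored the characteristic polynomial \(-\lambda^3 + 4\lambda^2 - \lambda - 6\). Factoring polynomials of degree greater than 2 is not trivial; we’ll assume the reader has access to methods for doing this accurately.[10]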
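One such method, for a polynomial with integer coefficients and leading coefficient ±1, is to test every divisor of the constant term. The following Python sketch applies that idea to the cubic above; the function names are illustrative, and the approach finds only integer roots.

```python
def integer_root_candidates(constant):
    """All positive and negative divisors of a nonzero constant term."""
    n = abs(constant)
    divisors = [d for d in range(1, n + 1) if n % d == 0]
    return sorted(set(divisors) | {-d for d in divisors})

def p(lam):
    """The characteristic polynomial -λ³ + 4λ² - λ - 6 from this example."""
    return -lam**3 + 4 * lam**2 - lam - 6

candidates = integer_root_candidates(-6)
roots = [c for c in candidates if p(c) == 0]
print(candidates)  # [-6, -3, -2, -1, 1, 2, 3, 6]
print(roots)       # [-1, 2, 3]
```

Since a cubic has at most three roots, finding three of them this way accounts for the whole factorization.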
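Our eigenvalues are \(\lambda_1 = -1\), \(\lambda_2 = 2\) and \(\lambda_3 = 3\). We now find corresponding eigenvectors.
For \(\lambda_1 = -1\):
We need to solve the equation (A−(−1)I)→x=→0. To do this, we form the appropriate augmented matrix and put it into reduced row echelon form.
\[ \begin{bmatrix} -6 & -2 & 10 & 0 \\ -3 & 3 & 3 & 0 \\ -6 & -2 & 10 & 0 \end{bmatrix} \overrightarrow{\text{rref}} \begin{bmatrix} 1 & 0 & -1.5 & 0 \\ 0 & 1 & -0.5 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix} \]
Our solution, in vector form, is
\[ \vec{x} = x_3 \begin{bmatrix} 3/2 \\ 1/2 \\ 1 \end{bmatrix}. \]
We can pick any nonzero value for \(x_3\); a nice choice would get rid of the fractions. So we’ll set \(x_3 = 2\) and choose \(\vec{x}_1 = \begin{bmatrix} 3 \\ 1 \\ 2 \end{bmatrix}\) as our eigenvector.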
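For \(\lambda_2 = 2\):
We need to solve the equation (A−2I)→x=→0. To do this, we form the appropriate augmented matrix and put it into reduced row echelon form.
\[ \begin{bmatrix} -9 & -2 & 10 & 0 \\ -3 & 0 & 3 & 0 \\ -6 & -2 & 7 & 0 \end{bmatrix} \overrightarrow{\text{rref}} \begin{bmatrix} 1 & 0 & -1 & 0 \\ 0 & 1 & -0.5 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix} \]
Our solution, in vector form, is
\[ \vec{x} = x_3 \begin{bmatrix} 1 \\ 1/2 \\ 1 \end{bmatrix}. \]
We can pick any nonzero value for \(x_3\); again, a nice choice would get rid of the fractions. So we’ll set \(x_3 = 2\) and choose \(\vec{x}_2 = \begin{bmatrix} 2 \\ 1 \\ 2 \end{bmatrix}\) as our eigenvector.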
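For \(\lambda_3 = 3\):
We need to solve the equation (A−3I)→x=→0. To do this, we form the appropriate augmented matrix and put it into reduced row echelon form.
\[ \begin{bmatrix} -10 & -2 & 10 & 0 \\ -3 & -1 & 3 & 0 \\ -6 & -2 & 6 & 0 \end{bmatrix} \overrightarrow{\text{rref}} \begin{bmatrix} 1 & 0 & -1 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix} \]
Our solution, in vector form, is (note that \(x_2 = 0\)):
\[ \vec{x} = x_3 \begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix}. \]
We can pick any nonzero value for \(x_3\); an easy choice is \(x_3 = 1\), so we choose \(\vec{x}_3 = \begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix}\) as our eigenvector.
To summarize, we have the following eigenvalue/eigenvector pairs:
eigenvalue \(\lambda_1 = -1\) with eigenvector \(\vec{x}_1 = \begin{bmatrix} 3 \\ 1 \\ 2 \end{bmatrix}\)
eigenvalue \(\lambda_2 = 2\) with eigenvector \(\vec{x}_2 = \begin{bmatrix} 2 \\ 1 \\ 2 \end{bmatrix}\)
eigenvalue \(\lambda_3 = 3\) with eigenvector \(\vec{x}_3 = \begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix}\)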
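The row reductions in this example are mechanical enough to automate. Below is a minimal Python sketch using exact fractions from the standard library; the function names rref and one_eigenvector are made up for this illustration. The sketch sets the first free variable to 1 (and any further free variables to 0), so it returns the unscaled vector that appears before the fractions are cleared.

```python
from fractions import Fraction

def rref(M):
    """Reduced row echelon form with exact fractions; also returns the pivot columns."""
    M = [[Fraction(v) for v in row] for row in M]
    rows, cols = len(M), len(M[0])
    pivots, pivot_row = [], 0
    for col in range(cols):
        # look for a nonzero entry in this column at or below pivot_row
        pr = next((r for r in range(pivot_row, rows) if M[r][col] != 0), None)
        if pr is None:
            continue
        M[pivot_row], M[pr] = M[pr], M[pivot_row]
        pv = M[pivot_row][col]
        M[pivot_row] = [v / pv for v in M[pivot_row]]       # scale the pivot to 1
        for r in range(rows):
            if r != pivot_row and M[r][col] != 0:
                f = M[r][col]
                M[r] = [a - f * b for a, b in zip(M[r], M[pivot_row])]
        pivots.append(col)
        pivot_row += 1
        if pivot_row == rows:
            break
    return M, pivots

def one_eigenvector(A, lam):
    """Solve (A - lam*I) x = 0 with the first free variable set to 1."""
    n = len(A)
    shifted = [[A[i][j] - (lam if i == j else 0) for j in range(n)] for i in range(n)]
    R, pivots = rref(shifted)
    free = [j for j in range(n) if j not in pivots]
    if not free:
        return None  # only the zero solution, so lam is not an eigenvalue
    x = [Fraction(0)] * n
    x[free[0]] = Fraction(1)
    for row, col in enumerate(pivots):
        x[col] = -R[row][free[0]]   # pivot variable in terms of the free one
    return x

A = [[-7, -2, 10], [-3, 2, 3], [-6, -2, 9]]
for lam in (-1, 2, 3):
    print(lam, [str(v) for v in one_eigenvector(A, lam)])
# -1 ['3/2', '1/2', '1']
# 2 ['1', '1/2', '1']
# 3 ['1', '0', '1']
```

Multiplying the first two results by 2 gives the fraction-free eigenvectors chosen in the example.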
Let’s practice once more.
Find the eigenvalues of A, and for each eigenvalue, give one eigenvector, where
\[ A = \begin{bmatrix} 2 & -1 & 1 \\ 0 & 1 & 6 \\ 0 & 3 & 4 \end{bmatrix}. \]
Solution
We first compute the characteristic polynomial, set it equal to 0, then solve for λ. We’ll use cofactor expansion down the first column (since it has lots of zeros).
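\[ \begin{aligned}
\det(A - \lambda I) &= \begin{vmatrix} 2-\lambda & -1 & 1 \\ 0 & 1-\lambda & 6 \\ 0 & 3 & 4-\lambda \end{vmatrix} \\
&= (2-\lambda) \begin{vmatrix} 1-\lambda & 6 \\ 3 & 4-\lambda \end{vmatrix} \\
&= (2-\lambda)(\lambda^2 - 5\lambda - 14) \\
&= (2-\lambda)(\lambda - 7)(\lambda + 2)
\end{aligned} \]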
Notice that while the characteristic polynomial is cubic, we never actually saw a cubic; we never distributed the (2−λ) across the quadratic. Instead, we realized that this was a factor of the cubic, and just factored the remaining quadratic. (This makes this example quite a bit simpler than the previous example.)
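For the curious, the cubic we avoided writing down can be recovered by multiplying the factors. This is a small Python sketch with polynomials stored as coefficient lists, lowest degree first; the helper names poly_mul and poly_eval are illustrative.

```python
def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists (lowest degree first)."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def poly_eval(p, x):
    """Evaluate a coefficient list at x."""
    return sum(c * x**k for k, c in enumerate(p))

linear = [2, -1]          # 2 - λ
quadratic = [-14, -5, 1]  # λ² - 5λ - 14
cubic = poly_mul(linear, quadratic)

print(cubic)                                      # [-28, 4, 7, -1], i.e. -λ³ + 7λ² + 4λ - 28
print([poly_eval(cubic, r) for r in (-2, 2, 7)])  # [0, 0, 0]
```

Keeping the polynomial in factored form saved us from having to factor this cubic afterward.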
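Our eigenvalues are \(\lambda_1 = -2\), \(\lambda_2 = 2\) and \(\lambda_3 = 7\). We now find corresponding eigenvectors.
For \(\lambda_1 = -2\):
We need to solve the equation (A−(−2)I)→x=→0. To do this, we form the appropriate augmented matrix and put it into reduced row echelon form.
\[ \begin{bmatrix} 4 & -1 & 1 & 0 \\ 0 & 3 & 6 & 0 \\ 0 & 3 & 6 & 0 \end{bmatrix} \overrightarrow{\text{rref}} \begin{bmatrix} 1 & 0 & 3/4 & 0 \\ 0 & 1 & 2 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix} \]
Our solution, in vector form, is
\[ \vec{x} = x_3 \begin{bmatrix} -3/4 \\ -2 \\ 1 \end{bmatrix}. \]
We can pick any nonzero value for \(x_3\); a nice choice would get rid of the fractions. So we’ll set \(x_3 = 4\) and choose \(\vec{x}_1 = \begin{bmatrix} -3 \\ -8 \\ 4 \end{bmatrix}\) as our eigenvector.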
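For \(\lambda_2 = 2\):
We need to solve the equation (A−2I)→x=→0. To do this, we form the appropriate augmented matrix and put it into reduced row echelon form.
\[ \begin{bmatrix} 0 & -1 & 1 & 0 \\ 0 & -1 & 6 & 0 \\ 0 & 3 & 2 & 0 \end{bmatrix} \overrightarrow{\text{rref}} \begin{bmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix} \]
This looks funny, so we’ll remind ourselves how to solve this. The first two rows tell us that \(x_2 = 0\) and \(x_3 = 0\), respectively. Notice that no row/equation uses \(x_1\); we conclude that it is free. Therefore, our solution in vector form is
\[ \vec{x} = x_1 \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}. \]
We can pick any nonzero value for \(x_1\); an easy choice is \(x_1 = 1\), so we choose \(\vec{x}_2 = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}\) as our eigenvector.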
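For \(\lambda_3 = 7\):
We need to solve the equation (A−7I)→x=→0. To do this, we form the appropriate augmented matrix and put it into reduced row echelon form.
\[ \begin{bmatrix} -5 & -1 & 1 & 0 \\ 0 & -6 & 6 & 0 \\ 0 & 3 & -3 & 0 \end{bmatrix} \overrightarrow{\text{rref}} \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & -1 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix} \]
Our solution, in vector form, is (note that \(x_1 = 0\)):
\[ \vec{x} = x_3 \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix}. \]
We can pick any nonzero value for \(x_3\); an easy choice is \(x_3 = 1\), so we choose \(\vec{x}_3 = \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix}\) as our eigenvector.
To summarize, we have the following eigenvalue/eigenvector pairs:
eigenvalue \(\lambda_1 = -2\) with eigenvector \(\vec{x}_1 = \begin{bmatrix} -3 \\ -8 \\ 4 \end{bmatrix}\)
eigenvalue \(\lambda_2 = 2\) with eigenvector \(\vec{x}_2 = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}\)
eigenvalue \(\lambda_3 = 7\) with eigenvector \(\vec{x}_3 = \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix}\)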
In this section we have learned about a new concept: given a matrix A we can find certain values λ and vectors →x where A→x=λ→x. In the next section we will continue the pattern we have established in this text: after learning a new concept, we see how it interacts with other concepts we know about. That is, we’ll look for connections between eigenvalues and eigenvectors and things like the inverse, determinants, the trace, the transpose, etc.
Footnotes
[1] Recall this matrix and vector were used in Example 2.3.10.
[2] Probably not.
[3] Probably not.
[4] See footnote 2.
[5] An example of mathematical peer pressure.
[7] Except for the “understand quantum mechanics” part. Nobody truly understands that stuff; they just probably understand it.
[8] Recall this is a homogeneous system of equations.
[9] Our future need of knowing how to handle this situation is foretold in footnote 3 in Section 1.4.
[10] You probably learned how to do this in an algebra course. As a reminder, because the leading coefficient here is −1, any rational root must be an integer that divides the constant term (in this case, −6) of the polynomial. That is, the rational roots of this equation could only be ±1, ±2, ±3 and ±6. That’s 8 things to check.
One could also graph this polynomial to find the roots. Graphing will show us that λ=3 looks like a root, and a simple calculation will confirm that it is.