1.6: Operations with Matrices
In the previous section we saw the important connection between linear functions and matrices. In this section we will discuss various operations on matrices which we will find useful in our later work with linear functions.
The algebra of matrices
If M is an n×m matrix with aij in the ith row and jth column, i=1,2,…,n,j=1,2,…,m, then we will write M=[aij]. With this notation the definitions of addition, subtraction, and scalar multiplication for matrices are straightforward.
Definition 1.6.1
Suppose M=[aij] and N=[bij] are n×m matrices and c is a real number. Then we define
M+N=[aij+bij],
M−N=[aij−bij],
and
cM=[caij].
In other words, we define addition, subtraction, and scalar multiplication for matrices by performing these operations on the individual elements of the matrices, in a manner similar to the way we perform these operations on vectors.
Example 1.6.1
If
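\[ M = \begin{bmatrix} 1 & 2 & 3 \\ -5 & 3 & -1 \end{bmatrix} \]
and
\[ N = \begin{bmatrix} 3 & 1 & 4 \\ 1 & -3 & 2 \end{bmatrix}, \]
then, for example,
\[ M + N = \begin{bmatrix} 1+3 & 2+1 & 3+4 \\ -5+1 & 3-3 & -1+2 \end{bmatrix} = \begin{bmatrix} 4 & 3 & 7 \\ -4 & 0 & 1 \end{bmatrix}, \]
\[ M - N = \begin{bmatrix} 1-3 & 2-1 & 3-4 \\ -5-1 & 3+3 & -1-2 \end{bmatrix} = \begin{bmatrix} -2 & 1 & -1 \\ -6 & 6 & -3 \end{bmatrix}, \]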
and
\[ 3M = \begin{bmatrix} 3 & 6 & 9 \\ -15 & 9 & -3 \end{bmatrix}. \]
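As a minimal sketch of Definition 1.6.1, the following Python code stores a matrix as a list of rows and reproduces the computations of Example 1.6.1. The list-of-rows representation and the function names `mat_add`, `mat_sub`, and `scalar_mul` are assumptions made for illustration only.

```python
# Entrywise matrix operations, with a matrix stored as a list of rows
# (a representation assumed for this sketch).

def mat_add(M, N):
    """Return M + N for two matrices of the same size."""
    return [[a + b for a, b in zip(row_m, row_n)] for row_m, row_n in zip(M, N)]

def mat_sub(M, N):
    """Return M - N for two matrices of the same size."""
    return [[a - b for a, b in zip(row_m, row_n)] for row_m, row_n in zip(M, N)]

def scalar_mul(c, M):
    """Return cM, multiplying every entry of M by the scalar c."""
    return [[c * a for a in row] for row in M]

M = [[1, 2, 3], [-5, 3, -1]]
N = [[3, 1, 4], [1, -3, 2]]

print(mat_add(M, N))     # [[4, 3, 7], [-4, 0, 1]]
print(mat_sub(M, N))     # [[-2, 1, -1], [-6, 6, -3]]
print(scalar_mul(3, M))  # [[3, 6, 9], [-15, 9, -3]]
```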
These operations have natural interpretations in terms of linear functions. Suppose L:Rm→Rn and K:Rm→Rn are linear with L(x)=Mx and K(x)=Nx for n×m matrices M and N. If we define L+K:Rm→Rn by
(L+K)(x)=L(x)+K(x),
then
(L+K)(ej)=L(ej)+K(ej)
for j=1,2,…,m. Hence the jth column of the matrix which represents L+K is the sum of the jth columns of M and N. In other words,
(L+K)(x)=(M+N)x
for all x in Rm. Similarly, if we define L−K:Rm→Rn by
(L−K)(x)=L(x)−K(x),
then
(L−K)(x)=(M−N)x.
If, for any scalar c, we define cL:Rm→Rn by
cL(x)=c(L(x)),
then
cL(ej)=c(L(ej))
for j=1,2,…,m. Hence the jth column of the matrix which represents cL is the scalar c times the jth column of M. That is,
cL(x)=(cM)x
for all x in Rm. In short, the operations of addition, subtraction, and scalar multiplication for matrices correspond in a natural way to the operations of addition, subtraction, and scalar multiplication for linear functions.
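The correspondence can be spot-checked numerically. The sketch below reuses the matrices of Example 1.6.1 as the matrices of two linear functions from R3 to R2 and compares both sides of each identity at a single test vector. The helper `mat_vec` and the test vector are assumptions chosen for illustration, and one vector is of course a check, not a proof.

```python
def mat_vec(M, x):
    """Return Mx, where M is a list of rows and x is a list of coordinates."""
    return [sum(a * t for a, t in zip(row, x)) for row in M]

M = [[1, 2, 3], [-5, 3, -1]]   # matrix of L: R^3 -> R^2
N = [[3, 1, 4], [1, -3, 2]]    # matrix of K: R^3 -> R^2
x = [1, -2, 4]                 # an arbitrary test vector

# (L + K)(x) = L(x) + K(x), computed from the two functions separately.
sum_of_values = [u + v for u, v in zip(mat_vec(M, x), mat_vec(N, x))]

# (M + N)x, computed from the sum of the matrices.
M_plus_N = [[a + b for a, b in zip(rm, rn)] for rm, rn in zip(M, N)]
value_of_sum = mat_vec(M_plus_N, x)

print(sum_of_values, value_of_sum)  # [26, 0] [26, 0]

# c L(x) = (cM)x for the scalar c = 3.
c = 3
scaled_value = [c * u for u in mat_vec(M, x)]
value_of_scaled = mat_vec([[c * a for a in row] for row in M], x)

print(scaled_value, value_of_scaled)  # [27, -45] [27, -45]
```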
Now consider the case where L:Rm→Rp and K:Rp→Rn are linear functions. Let M be the p×m matrix such that L(x)=Mx for all x in Rm and let N be the n×p matrix such that K(x)=Nx for all x in Rp. Since for any x in Rm, L(x) is in Rp, we can form K∘L:Rm→Rn, the composition of K with L, defined by
K∘L(x)=K(L(x)).
Now
K(L(x))=N(Mx),
so it would be natural to define NM, the product of the matrices N and M, to be the matrix of K∘L, in which case we would have
N(Mx)=(NM)x.
Thus we want the jth column of NM, j=1,2,…,m, to be
K∘L(ej)=N(L(ej)),
whose entries are the dot products of L(ej) with the rows of N. But L(ej) is the jth column of M, so the jth column of NM is formed by taking the dot products of the jth column of M with the rows of N. In other words, the entry in the ith row and jth column of NM is the dot product of the ith row of N with the jth column of M. We write this out explicitly in the following definition.
Definition 1.6.2
If N=[aij] is an n×p matrix and M=[bij] is a p×m matrix, then we define the product of N and M to be the n×m matrix NM=[cij], where
\[ c_{ij} = \sum_{k=1}^{p} a_{ik} b_{kj}, \]
i=1,2,…,n and j=1,2,…,m.
Note that NM is an n×m matrix since K∘L:Rm→Rn. Moreover, the product NM of two matrices N and M is defined only when the number of columns of N is equal to the number of rows of M.
Example 1.6.2
If
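\[ N = \begin{bmatrix} 1 & 2 \\ -1 & 3 \\ 2 & -2 \end{bmatrix} \]
and
\[ M = \begin{bmatrix} 2 & -2 & 1 & 3 \\ 1 & 2 & -1 & -2 \end{bmatrix}, \]
then
\[
\begin{aligned}
NM &= \begin{bmatrix} 1 & 2 \\ -1 & 3 \\ 2 & -2 \end{bmatrix}\begin{bmatrix} 2 & -2 & 1 & 3 \\ 1 & 2 & -1 & -2 \end{bmatrix} \\
&= \begin{bmatrix} 2+2 & -2+4 & 1-2 & 3-4 \\ -2+3 & 2+6 & -1-3 & -3-6 \\ 4-2 & -4-4 & 2+2 & 6+4 \end{bmatrix} \\
&= \begin{bmatrix} 4 & 2 & -1 & -1 \\ 1 & 8 & -4 & -9 \\ 2 & -8 & 4 & 10 \end{bmatrix}.
\end{aligned}
\]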
Note that N is 3×2, M is 2×4, and NM is 3×4. Also, note that it is not possible to form the product in the other order.
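The formula of Definition 1.6.2 translates directly into a short program. The sketch below is one way to write it, assuming matrices are stored as lists of rows; the function name `mat_mul` is hypothetical. It reproduces Example 1.6.2 and reports an error when the sizes do not allow a product.

```python
def mat_mul(N, M):
    """Return the product NM, using c_ij = sum over k of a_ik * b_kj."""
    n, p = len(N), len(N[0])
    if len(M) != p:
        raise ValueError("the number of columns of N must equal the number of rows of M")
    m = len(M[0])
    return [[sum(N[i][k] * M[k][j] for k in range(p)) for j in range(m)]
            for i in range(n)]

N = [[1, 2], [-1, 3], [2, -2]]        # 3 x 2
M = [[2, -2, 1, 3], [1, 2, -1, -2]]   # 2 x 4

for row in mat_mul(N, M):
    print(row)
# [4, 2, -1, -1]
# [1, 8, -4, -9]
# [2, -8, 4, 10]

# The product in the other order is not defined:
# M has 4 columns, but N has only 3 rows.
try:
    mat_mul(M, N)
except ValueError as err:
    print("MN is not defined:", err)
```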
Example 1.6.3
Let L:R2→R3 be the linear function defined by
L(x,y)=(3x−2y,x+y,4y)
and let K:R3→R2 be the linear function defined by
K(x,y,z)=(2x−y+z,x−y−z).
Then the matrix for L is
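\[ M = \begin{bmatrix} 3 & -2 \\ 1 & 1 \\ 0 & 4 \end{bmatrix}, \]
the matrix for K is
\[ N = \begin{bmatrix} 2 & -1 & 1 \\ 1 & -1 & -1 \end{bmatrix}, \]
and the matrix for K∘L:R2→R2 is
\[ NM = \begin{bmatrix} 2 & -1 & 1 \\ 1 & -1 & -1 \end{bmatrix}\begin{bmatrix} 3 & -2 \\ 1 & 1 \\ 0 & 4 \end{bmatrix} = \begin{bmatrix} 6-1+0 & -4-1+4 \\ 3-1+0 & -2-1-4 \end{bmatrix} = \begin{bmatrix} 5 & -1 \\ 2 & -7 \end{bmatrix}. \]
In other words,
\[ K \circ L(x, y) = \begin{bmatrix} 5 & -1 \\ 2 & -7 \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 5x - y \\ 2x - 7y \end{bmatrix}. \]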
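Note that in this case it is possible to form the composition in the other order. The matrix for L∘K:R3→R3 is
\[ MN = \begin{bmatrix} 3 & -2 \\ 1 & 1 \\ 0 & 4 \end{bmatrix}\begin{bmatrix} 2 & -1 & 1 \\ 1 & -1 & -1 \end{bmatrix} = \begin{bmatrix} 6-2 & -3+2 & 3+2 \\ 2+1 & -1-1 & 1-1 \\ 0+4 & 0-4 & 0-4 \end{bmatrix} = \begin{bmatrix} 4 & -1 & 5 \\ 3 & -2 & 0 \\ 4 & -4 & -4 \end{bmatrix}, \]
and so
\[ L \circ K(x, y, z) = \begin{bmatrix} 4 & -1 & 5 \\ 3 & -2 & 0 \\ 4 & -4 & -4 \end{bmatrix}\begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 4x - y + 5z \\ 3x - 2y \\ 4x - 4y - 4z \end{bmatrix}. \]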
In particular, note that not only is NM≠MN, but in fact NM and MN are not even the same size.
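The matrices of the two compositions can also be recovered without any matrix multiplication, by applying the composed function to each standard basis vector and recording the results as columns. The sketch below does this for the functions of Example 1.6.3; the helper `matrix_of` and the names `K_after_L` and `L_after_K` are hypothetical, introduced only for this illustration.

```python
def L(v):
    """L(x, y) = (3x - 2y, x + y, 4y)."""
    x, y = v
    return [3 * x - 2 * y, x + y, 4 * y]

def K(v):
    """K(x, y, z) = (2x - y + z, x - y - z)."""
    x, y, z = v
    return [2 * x - y + z, x - y - z]

def matrix_of(f, m):
    """Matrix of a linear function f on R^m: column j is f(e_j)."""
    columns = []
    for j in range(m):
        e_j = [1 if k == j else 0 for k in range(m)]
        columns.append(f(e_j))
    # Convert the list of columns into a list of rows.
    return [list(row) for row in zip(*columns)]

def K_after_L(v):
    return K(L(v))

def L_after_K(v):
    return L(K(v))

print(matrix_of(K_after_L, 2))  # [[5, -1], [2, -7]]
print(matrix_of(L_after_K, 3))  # [[4, -1, 5], [3, -2, 0], [4, -4, -4]]
```

The first result is 2×2 and the second is 3×3, matching NM and MN as computed by hand.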
Determinants
The notion of the determinant of a matrix is closely related to the idea of area and volume. To begin our definition, consider the 2×2 matrix
\[ M = \begin{bmatrix} a_1 & a_2 \\ b_1 & b_2 \end{bmatrix} \]

and let a=(a1,a2) and b=(b1,b2). If P is the parallelogram which has a and b for adjacent sides and A is the area of P (see Figure 1.6.1), then we saw in Section 1.3 that
\[ A = \|(a_1, a_2, 0) \times (b_1, b_2, 0)\| = \|(0, 0, a_1 b_2 - a_2 b_1)\| = |a_1 b_2 - a_2 b_1|. \]
This motivates the following definition.
Definition 1.6.3
Given a 2×2 matrix
\[ M = \begin{bmatrix} a_1 & a_2 \\ b_1 & b_2 \end{bmatrix}, \]
the determinant of M, denoted det(M), is
\[ \det(M) = a_1 b_2 - a_2 b_1. \]
Hence we have A=|det(M)|. In words, for a 2×2 matrix M, the absolute value of the determinant of M equals the area of the parallelogram which has the rows of M for adjacent sides.
Example 1.6.4
We have
\[ \det\begin{bmatrix} 1 & 3 \\ -4 & 5 \end{bmatrix} = (1)(5) - (3)(-4) = 5 + 12 = 17. \]
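A short sketch of Definition 1.6.3 in Python follows, with the function name `det2` chosen here for illustration. It also shows why the area is the absolute value of the determinant: listing the two sides in the opposite order changes the sign of the determinant but not the parallelogram.

```python
def det2(M):
    """Determinant of a 2 x 2 matrix given as [[a1, a2], [b1, b2]]."""
    (a1, a2), (b1, b2) = M
    return a1 * b2 - a2 * b1

M = [[1, 3], [-4, 5]]
print(det2(M))       # 17
print(abs(det2(M)))  # 17, the area of the parallelogram with sides (1, 3) and (-4, 5)

# Swapping the rows describes the same parallelogram,
# but the determinant changes sign.
swapped = [[-4, 5], [1, 3]]
print(det2(swapped))       # -17
print(abs(det2(swapped)))  # 17
```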
Now consider a 3×3 matrix
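\[ M = \begin{bmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{bmatrix} \]
and let a=(a1,a2,a3), b=(b1,b2,b3), and c=(c1,c2,c3). If V is the volume of the parallelepiped P with adjacent edges a, b, and c, then, again from Section 1.3,
\[
\begin{aligned}
V &= |\mathbf{a} \cdot (\mathbf{b} \times \mathbf{c})| \\
&= |a_1(b_2 c_3 - b_3 c_2) + a_2(b_3 c_1 - b_1 c_3) + a_3(b_1 c_2 - b_2 c_1)| \\
&= \left| a_1 \det\begin{bmatrix} b_2 & b_3 \\ c_2 & c_3 \end{bmatrix} - a_2 \det\begin{bmatrix} b_1 & b_3 \\ c_1 & c_3 \end{bmatrix} + a_3 \det\begin{bmatrix} b_1 & b_2 \\ c_1 & c_2 \end{bmatrix} \right|.
\end{aligned}
\]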
Definition 1.6.4
Given a 3×3 matrix
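\[ M = \begin{bmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{bmatrix}, \]
the determinant of M, denoted det(M), is
\[ \det(M) = a_1 \det\begin{bmatrix} b_2 & b_3 \\ c_2 & c_3 \end{bmatrix} - a_2 \det\begin{bmatrix} b_1 & b_3 \\ c_1 & c_3 \end{bmatrix} + a_3 \det\begin{bmatrix} b_1 & b_2 \\ c_1 & c_2 \end{bmatrix}. \]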
Similar to the 2×2 case, we have V=|det(M)|.
Example 1.6.5
We have
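\[
\begin{aligned}
\det\begin{bmatrix} 2 & 3 & 9 \\ 2 & 1 & -4 \\ 5 & 1 & -1 \end{bmatrix} &= 2\det\begin{bmatrix} 1 & -4 \\ 1 & -1 \end{bmatrix} - 3\det\begin{bmatrix} 2 & -4 \\ 5 & -1 \end{bmatrix} + 9\det\begin{bmatrix} 2 & 1 \\ 5 & 1 \end{bmatrix} \\
&= 2(-1+4) - 3(-2+20) + 9(2-5) \\
&= 6 - 54 - 27 \\
&= -75.
\end{aligned}
\]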
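The 3×3 determinant was defined so that it agrees with the scalar triple product a⋅(b×c) of the rows. The sketch below computes both quantities for the matrix of Example 1.6.5; the helper names `cross`, `dot`, `det2`, and `det3` are hypothetical and exist only for this illustration.

```python
def cross(b, c):
    """Cross product of two vectors in R^3."""
    return [b[1] * c[2] - b[2] * c[1],
            b[2] * c[0] - b[0] * c[2],
            b[0] * c[1] - b[1] * c[0]]

def dot(a, b):
    """Dot product of two vectors of the same length."""
    return sum(s * t for s, t in zip(a, b))

def det2(p, q, r, s):
    """Determinant of the 2 x 2 matrix with rows (p, q) and (r, s)."""
    return p * s - q * r

def det3(M):
    """Determinant of a 3 x 3 matrix, expanded along the first row."""
    a, b, c = M
    return (a[0] * det2(b[1], b[2], c[1], c[2])
            - a[1] * det2(b[0], b[2], c[0], c[2])
            + a[2] * det2(b[0], b[1], c[0], c[1]))

M = [[2, 3, 9], [2, 1, -4], [5, 1, -1]]
a, b, c = M

print(det3(M))              # -75
print(dot(a, cross(b, c)))  # -75
print(abs(det3(M)))         # 75, the volume of the parallelepiped with edges a, b, c
```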
Given an n×n matrix M=[aij], let Mij be the (n−1)×(n−1) matrix obtained by deleting the ith row and jth column of M. If for n=1 we first define det(M)=a11 (that is, the determinant of a 1×1 matrix is just the value of its single entry), then, for n=2, we could express the definition of the determinant of a 2×2 matrix given in Definition 1.6.3 in the form
\[ \det(M) = a_{11}\det(M_{11}) - a_{12}\det(M_{12}) = a_{11}a_{22} - a_{12}a_{21}. \]
Similarly, with n=3, we could express the definition of the determinant of M given in Definition 1.6.4 in the form
\[ \det(M) = a_{11}\det(M_{11}) - a_{12}\det(M_{12}) + a_{13}\det(M_{13}). \]
Following this pattern, we may form a recursive definition for the determinant of an n×n matrix.
Definition 1.6.5
Suppose M=[aij] is an n×n matrix and let Mij be the (n−1)×(n−1) matrix obtained by deleting the ith row and jth column of M, i=1,2,…,n and j=1,2,…,n. For n=1, we define the determinant of M, denoted det(M), by
\[ \det(M) = a_{11}. \]
For n>1, we define the determinant of M, denoted det(M), by
\[ \det(M) = a_{11}\det(M_{11}) - a_{12}\det(M_{12}) + \cdots + (-1)^{1+n} a_{1n}\det(M_{1n}) = \sum_{j=1}^{n} (-1)^{1+j} a_{1j}\det(M_{1j}). \]
We call the definition recursive because we have defined the determinant of an n×n matrix in terms of the determinants of (n−1)×(n−1) matrices, which in turn are defined in terms of the determinants of (n−2)×(n−2) matrices, and so on, until we have reduced the problem to computing the determinants of 1×1 matrices.
Example 1.6.6
For an example of the determinant of a 4×4 matrix, we have
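\[
\begin{aligned}
\det\begin{bmatrix} 2 & 1 & 3 & 2 \\ 2 & 1 & 4 & 1 \\ -2 & 3 & -1 & 2 \\ 1 & 2 & 1 & 1 \end{bmatrix}
&= 2\det\begin{bmatrix} 1 & 4 & 1 \\ 3 & -1 & 2 \\ 2 & 1 & 1 \end{bmatrix} - \det\begin{bmatrix} 2 & 4 & 1 \\ -2 & -1 & 2 \\ 1 & 1 & 1 \end{bmatrix} \\
&\qquad + 3\det\begin{bmatrix} 2 & 1 & 1 \\ -2 & 3 & 2 \\ 1 & 2 & 1 \end{bmatrix} - 2\det\begin{bmatrix} 2 & 1 & 4 \\ -2 & 3 & -1 \\ 1 & 2 & 1 \end{bmatrix} \\
&= 2\bigl((-1-2) - 4(3-4) + (3+2)\bigr) - \bigl(2(-1-2) - 4(-2-2) + (-2+1)\bigr) \\
&\qquad + 3\bigl(2(3-4) - (-2-2) + (-4-3)\bigr) - 2\bigl(2(3+2) - (-2+1) + 4(-4-3)\bigr) \\
&= 2(-3+4+5) - (-6+16-1) + 3(-2+4-7) - 2(10+1-28) \\
&= 12 - 9 - 15 + 34 \\
&= 22.
\end{aligned}
\]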
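Because the definition is recursive, it can be written as a recursive function almost word for word. The following is a minimal sketch, assuming a matrix is stored as a list of rows; the names `minor` and `det` are hypothetical. Indices in the code start at 0 rather than 1, so the sign (−1)^(1+j) of the definition appears as `(-1) ** j`.

```python
def minor(M, i, j):
    """Return M with row i and column j deleted (0-based indices)."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(M) if k != i]

def det(M):
    """Determinant of a square matrix by expansion along the first row."""
    n = len(M)
    if n == 1:
        # Base case: the determinant of a 1 x 1 matrix is its single entry.
        return M[0][0]
    return sum((-1) ** j * M[0][j] * det(minor(M, 0, j)) for j in range(n))

M = [[2, 1, 3, 2],
     [2, 1, 4, 1],
     [-2, 3, -1, 2],
     [1, 2, 1, 1]]

print(det(M))                                     # 22
print(det([[2, 3, 9], [2, 1, -4], [5, 1, -1]]))   # -75
print(det([[1, 3], [-4, 5]]))                     # 17
```

This mirrors the definition rather than aiming for efficiency: an n×n determinant computed this way involves n! products of entries, so the approach is practical only for small matrices.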
The next theorem states that there is nothing special about using the first row of the matrix in the expansion of the determinant specified in Definition 1.6.5, nor is there anything special about expanding along a row instead of a column. The practical effect is that we may compute the determinant of a given matrix by expanding along whichever row or column is most convenient. The proof of this theorem would take us too far afield at this point, so we will omit it (but you will be asked to verify the theorem for the special cases n=2 and n=3 in Exercise 10).
Theorem 1.6.1
Let M=[aij] be an n×n matrix and let Mij be the (n−1)×(n−1) matrix obtained by deleting the ith row and jth column of M. Then for any i=1,2,…,n,
\[ \det(M) = \sum_{j=1}^{n} (-1)^{i+j} a_{ij}\det(M_{ij}), \]
and for any j=1,2,…,n,
\[ \det(M) = \sum_{i=1}^{n} (-1)^{i+j} a_{ij}\det(M_{ij}). \]
Example 1.6.7
The simplest way to compute the determinant of the matrix
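\[ M = \begin{bmatrix} 4 & 0 & 3 \\ 2 & 3 & 1 \\ -3 & 0 & -2 \end{bmatrix} \]
is to expand along the second column. Namely,
\[
\begin{aligned}
\det(M) &= (-1)^{1+2}(0)\det\begin{bmatrix} 2 & 1 \\ -3 & -2 \end{bmatrix} + (-1)^{2+2}(3)\det\begin{bmatrix} 4 & 3 \\ -3 & -2 \end{bmatrix} + (-1)^{3+2}(0)\det\begin{bmatrix} 4 & 3 \\ 2 & 1 \end{bmatrix} \\
&= 3(-8+9) \\
&= 3.
\end{aligned}
\]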
You should verify that expanding along the first row, as we did in the definition of the determinant, gives the same result.
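For this particular matrix, the claim of Theorem 1.6.1 can be checked by brute force: expand along each of the three rows and each of the three columns and compare. The sketch below does so. The function names are hypothetical, indices are 0-based (which leaves the parity of i+j unchanged), and agreement on one example illustrates the theorem without proving it.

```python
def minor(M, i, j):
    """Return M with row i and column j deleted (0-based indices)."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(M) if k != i]

def det(M):
    """Determinant by expansion along the first row."""
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] * det(minor(M, 0, j)) for j in range(len(M)))

def expand_along_row(M, i):
    """Cofactor expansion of det(M) along row i."""
    return sum((-1) ** (i + j) * M[i][j] * det(minor(M, i, j))
               for j in range(len(M)))

def expand_along_column(M, j):
    """Cofactor expansion of det(M) along column j."""
    return sum((-1) ** (i + j) * M[i][j] * det(minor(M, i, j))
               for i in range(len(M)))

M = [[4, 0, 3], [2, 3, 1], [-3, 0, -2]]

print([expand_along_row(M, i) for i in range(3)])     # [3, 3, 3]
print([expand_along_column(M, j) for j in range(3)])  # [3, 3, 3]
```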
In order to return to the problem of computing volumes, we need to define a parallelepiped in Rn. First note that if P is a parallelogram in R2 with adjacent sides given by the vectors a and b, then
P={y:y=ta+sb,0≤t≤1,0≤s≤1}.
That is, for 0≤t≤1, ta is a point between 0 and a, and for 0≤s≤1, sb is a point between 0 and b; hence ta+sb is a point in the parallelogram P. Moreover, every point in P may be expressed in this form. See Figure 1.6.2. The following definition generalizes this characterization of parallelograms.

Definition 1.6.6
Let a1,a2,…,an be linearly independent vectors in Rn. We call
P={y:y=t1a1+t2a2+⋯+tnan,0≤ti≤1,i=1,2,…,n}
an n-dimensional parallelepiped with adjacent edges a1,a2,…,an.
Definition 1.6.7
Let P be an n-dimensional parallelepiped with adjacent edges a1,a2,…,an and let M be the n×n matrix which has a1,a2,…,an for its rows. Then the volume of P is defined to be |det(M)|.
It may be shown, using Theorem 1.6.1 and induction, that if N is the matrix obtained by interchanging the rows and columns of an n×n matrix M, then det(N)=det(M) (see Exercise 12). Thus we could have defined M in the previous definition using a1,a2,…,an for columns rather than rows.
Now suppose L:Rn→Rn is linear and let M be the n×n matrix such that L(x)=Mx for all x in Rn. Let C be the n-dimensional parallelepiped with adjacent edges e1,e2,…,en, the standard basis vectors for Rn. Then C is a 1×1 square when n=2 and a 1×1×1 cube when n=3. In general, we may think of C as an n-dimensional unit cube. Note that the volume of C is, by definition,
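\[ \det\begin{bmatrix} 1 & 0 & 0 & \cdots & 0 \\ 0 & 1 & 0 & \cdots & 0 \\ 0 & 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 1 \end{bmatrix} = 1. \]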
Suppose L(e1),L(e2),…,L(en) are linearly independent and let P be the n-dimensional parallelepiped with adjacent edges L(e1),L(e2),…,L(en). Note that if
x=t1e1+t2e2+⋯+tnen,
where 0≤tk≤1 for k=1,2,…,n, is a point in C, then
L(x)=t1L(e1)+t2L(e2)+⋯+tnL(en)
is a point in P. In fact, L maps the n-dimensional unit cube C exactly onto the n-dimensional parallelepiped P. Since L(e1),L(e2),…,L(en) are the columns of M, it follows that the volume of P equals |det(M)|. In other words, |det(M)| measures how much L stretches or shrinks the volume of a unit cube.
Theorem 1.6.2
Suppose L:Rn→Rn is linear and M is the n×n matrix such that L(x)=Mx. If L(e1),L(e2),…,L(en) are linearly independent and P is the n-dimensional parallelepiped with adjacent edges L(e1),L(e2),…,L(en), then the volume of P is equal to |det(M)|.
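In the case n=2 the theorem can be checked against an independent area computation. The sketch below takes the linear function with matrix [[5, −1], [2, −7]] (the matrix of K∘L from Example 1.6.3, reused here purely as a convenient example), maps the corners of the unit square, and measures the resulting parallelogram with the shoelace formula for polygon areas. The shoelace formula is not part of this section; it is brought in only as a second way of computing the area, and the function names are hypothetical.

```python
def L(v):
    """Linear function on R^2 with matrix [[5, -1], [2, -7]]."""
    x, y = v
    return (5 * x - y, 2 * x - 7 * y)

def shoelace_area(vertices):
    """Area of a simple polygon from its vertices listed in boundary order."""
    total = 0
    n = len(vertices)
    for k in range(n):
        x0, y0 = vertices[k]
        x1, y1 = vertices[(k + 1) % n]
        total += x0 * y1 - x1 * y0
    return abs(total) / 2

# Corners of the unit square C, in order around the boundary.
square = [(0, 0), (1, 0), (1, 1), (0, 1)]

# Their images are the corners of the parallelogram P = L(C).
image = [L(v) for v in square]

M = [[5, -1], [2, -7]]
det_M = M[0][0] * M[1][1] - M[0][1] * M[1][0]

print(shoelace_area(square))  # 1.0
print(shoelace_area(image))   # 33.0
print(abs(det_M))             # 33
```

The unit square has area 1 and its image has area 33, so L scales area by the factor |det(M)| = 33.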