2.2: Multiplication of Matrices
The next matrix operation we will explore is multiplication of matrices, one of the most important and useful of the matrix operations. Throughout this section, we will also demonstrate how matrix multiplication relates to linear systems of equations.
First, we provide a formal definition of row and column vectors.
Matrices of size \(n\times 1\) or \(1\times n\) are called vectors. If \(X\) is such a matrix, then we write \(x_{i}\) to denote the entry of \(X\) in the \(i^{th}\) row of a column matrix, or the \(i^{th}\) column of a row matrix.
The \(n\times 1\) matrix \[X=\left[ \begin{array}{c} x_{1} \\ \vdots \\ x_{n} \end{array} \right]\nonumber \] is called a column vector. The \(1\times n\) matrix \[X = \left[ \begin{array}{ccc} x_{1} & \cdots & x_{n} \end{array} \right]\nonumber \] is called a row vector.
We may simply use the term vector throughout this text to refer to either a column or row vector. If we do so, the context will make it clear which we are referring to.
In this chapter, we will again use the notion of linear combination of vectors as in Definition 9.2.2. In this context, a linear combination is a sum consisting of vectors multiplied by scalars. For example, \[\left[ \begin{array}{r} 50 \\ 122 \end{array} \right] = 7\left[ \begin{array}{r} 1 \\ 4 \end{array} \right] +8\left[ \begin{array}{r} 2 \\ 5 \end{array} \right] +9\left[ \begin{array}{r} 3 \\ 6 \end{array} \right]\nonumber \] is a linear combination of three vectors.
It turns out that we can express any system of linear equations as a linear combination of vectors. In fact, the vectors that we will use are just the columns of the corresponding augmented matrix!
Suppose we have a system of equations given by \[\begin{array}{c} a_{11}x_{1}+\cdots +a_{1n}x_{n}=b_{1} \\ \vdots \\ a_{m1}x_{1}+\cdots +a_{mn}x_{n}=b_{m} \end{array}\nonumber \] We can express this system in vector form which is as follows: \[x_1 \left[ \begin{array}{c} a_{11}\\ a_{21}\\ \vdots \\ a_{m1} \end{array} \right] + x_2 \left[ \begin{array}{c} a_{12}\\ a_{22}\\ \vdots \\ a_{m2} \end{array} \right] + \cdots + x_n \left[ \begin{array}{c} a_{1n}\\ a_{2n}\\ \vdots \\ a_{mn} \end{array} \right] = \left[ \begin{array}{c} b_1\\ b_2\\ \vdots \\ b_m \end{array} \right]\nonumber \]
Notice that each vector used here is one column from the corresponding augmented matrix. There is one vector for each variable in the system, along with the constant vector.
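For instance, taking the numbers from the linear combination shown earlier as a small illustration, the system \[\begin{array}{c} x_{1}+2x_{2}+3x_{3}=50 \\ 4x_{1}+5x_{2}+6x_{3}=122 \end{array}\nonumber \] has the vector form \[x_{1}\left[ \begin{array}{r} 1 \\ 4 \end{array} \right] +x_{2}\left[ \begin{array}{r} 2 \\ 5 \end{array} \right] +x_{3}\left[ \begin{array}{r} 3 \\ 6 \end{array} \right] = \left[ \begin{array}{r} 50 \\ 122 \end{array} \right]\nonumber \] and \(x_{1}=7\), \(x_{2}=8\), \(x_{3}=9\) is one solution, since \(7+16+27=50\) and \(28+40+54=122\).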
The first important form of matrix multiplication is multiplying a matrix by a vector. Consider the product given by \[\left[ \begin{array}{rrr} 1 & 2 & 3 \\ 4 & 5 & 6 \end{array} \right] \left[ \begin{array}{r} 7 \\ 8 \\ 9 \end{array} \right]\nonumber \] We will soon see that this equals \[7\left[ \begin{array}{c} 1 \\ 4 \end{array} \right] +8\left[ \begin{array}{c} 2 \\ 5 \end{array} \right] +9\left[ \begin{array}{c} 3 \\ 6 \end{array} \right] =\left[ \begin{array}{c} 50 \\ 122 \end{array} \right]\nonumber \]
In general terms, \[\begin{aligned} \left[ \begin{array}{ccc} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \end{array} \right] \left[ \begin{array}{c} x_{1} \\ x_{2} \\ x_{3} \end{array} \right] &= \ x_{1}\left[ \begin{array}{c} a_{11} \\ a_{21} \end{array} \right] +x_{2}\left[ \begin{array}{c} a_{12} \\ a_{22} \end{array} \right] +x_{3}\left[ \begin{array}{c} a_{13} \\ a_{23} \end{array} \right] \\ &=\left[ \begin{array}{c} a_{11}x_{1}+a_{12}x_{2}+a_{13}x_{3} \\ a_{21}x_{1}+a_{22}x_{2}+a_{23}x_{3} \end{array} \right] \end{aligned}\] Thus you take \(x_{1}\) times the first column, add \(x_{2}\) times the second column, and finally add \(x_{3}\) times the third column. The above sum is a linear combination of the columns of the matrix. When you multiply a matrix on the left by a vector on the right, the entries of the vector are just the scalars used in the linear combination of the columns, as illustrated above.
Here is the formal definition of how to multiply an \(m\times n\) matrix by an \(n\times 1\) column vector.
Let \(A=\left[ a_{ij} \right]\) be an \(m\times n\) matrix and let \(X\) be an \(n\times 1\) matrix given by \[A=\left[ A_{1} \cdots A_{n}\right], X = \left[ \begin{array}{r} x_{1} \\ \vdots \\ x_{n} \end{array} \right]\nonumber \]
Then the product \(AX\) is the \(m\times 1\) column vector which equals the following linear combination of the columns of \(A\): \[x_{1}A_{1}+x_{2}A_{2}+\cdots +x_{n}A_{n} = \sum_{j=1}^{n}x_{j}A_{j}\nonumber \]
If we write the columns of \(A\) in terms of their entries, they are of the form \[A_{j} = \left[ \begin{array}{c} a_{1j} \\ a_{2j} \\ \vdots \\ a_{mj} \end{array} \right]\nonumber \] Then, we can write the product \(AX\) as \[AX = x_{1}\left[ \begin{array}{c} a_{11} \\ a_{21} \\ \vdots \\ a_{m1} \end{array} \right] + x_{2}\left[ \begin{array}{c} a_{12} \\ a_{22} \\ \vdots \\ a_{m2} \end{array} \right] +\cdots + x_{n}\left[ \begin{array}{c} a_{1n} \\ a_{2n} \\ \vdots \\ a_{mn} \end{array} \right]\nonumber \]
Note that multiplication of an \(m \times n\) matrix and an \(n \times 1\) vector produces an \(m \times 1\) vector.
Here is an example.
Compute the product \(AX\) for \[A = \left[ \begin{array}{rrrr} 1 & 2 & 1 & 3 \\ 0 & 2 & 1 & -2 \\ 2 & 1 & 4 & 1 \end{array} \right], X = \left[ \begin{array}{r} 1 \\ 2 \\ 0 \\ 1 \end{array} \right]\nonumber \]
Solution
We will use Definition \(\PageIndex{3}\) to compute the product. Therefore, we compute the product \(AX\) as follows. \[\begin{aligned} & 1\left[ \begin{array}{r} 1 \\ 0 \\ 2 \end{array} \right] + 2\left[ \begin{array}{r} 2 \\ 2 \\ 1 \end{array} \right] + 0\left[ \begin{array}{r} 1 \\ 1 \\ 4 \end{array} \right] + 1 \left[ \begin{array}{r} 3 \\ -2\\ 1 \end{array} \right] \\ &= \left[ \begin{array}{r} 1 \\ 0 \\ 2 \end{array} \right] + \left[ \begin{array}{r} 4 \\ 4 \\ 2 \end{array} \right] + \left[ \begin{array}{r} 0 \\ 0 \\ 0 \end{array} \right] + \left[ \begin{array}{r} 3 \\ -2\\ 1 \end{array} \right] \\ &= \left[ \begin{array}{r} 8 \\ 2 \\ 5 \end{array} \right]\end{aligned}\]
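The computation above can be sketched in code. The following is a minimal illustration, assuming matrices are stored as plain Python lists of rows; the function name `mat_vec` is a hypothetical helper chosen for this sketch, not part of the text. It builds the product one column at a time, exactly as in the linear combination \(x_{1}A_{1}+\cdots +x_{n}A_{n}\).

```python
def mat_vec(A, x):
    """Return the product A x as a linear combination of the columns of A.

    A is a list of m rows, each a list of n numbers.
    x is a list of n scalars (the entries of the column vector X).
    """
    m, n = len(A), len(A[0])
    if len(x) != n:
        raise ValueError("x needs one entry for each column of A")
    result = [0] * m
    for j in range(n):
        # Extract column j of A, then add x_j times that column.
        column_j = [A[i][j] for i in range(m)]
        result = [r + x[j] * c for r, c in zip(result, column_j)]
    return result


# The matrix and vector from the example above.
A = [[1, 2, 1, 3],
     [0, 2, 1, -2],
     [2, 1, 4, 1]]
X = [1, 2, 0, 1]
print(mat_vec(A, X))  # [8, 2, 5]
```

The printed list matches the column vector found in the solution.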
Using the above operation, we can also write a system of linear equations in matrix form. In this form, we express the system as a matrix multiplied by a vector. Consider the following definition.
Suppose we have a system of equations given by \[\begin{array}{c} a_{11}x_{1}+\cdots +a_{1n}x_{n}=b_{1} \\ a_{21}x_{1}+ \cdots + a_{2n}x_{n} = b_{2} \\ \vdots \\ a_{m1}x_{1}+\cdots +a_{mn}x_{n}=b_{m} \end{array}\nonumber \] Then we can express this system in matrix form as follows. \[\left[ \begin{array}{cccc} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{array} \right] \left[ \begin{array}{c} x_{1} \\ x_{2} \\ \vdots \\ x_{n} \end{array} \right] = \left[ \begin{array}{c} b_{1}\\ b_{2}\\ \vdots \\ b_{m} \end{array} \right]\nonumber \]
The expression \(AX=B\) is also known as the Matrix Form of the corresponding system of linear equations. The matrix \(A\) is simply the coefficient matrix of the system, the vector \(X\) is the column vector constructed from the variables of the system, and finally the vector \(B\) is the column vector constructed from the constants of the system. It is important to note that any system of linear equations can be written in this form.
Notice that if we write a homogeneous system of equations in matrix form, it would have the form \(AX=0\), for the zero vector \(0\).
You can see from this definition that a vector \[X = \left[ \begin{array}{c} x_{1} \\ x_{2} \\ \vdots \\ x_{n} \end{array} \right]\nonumber \] will satisfy the equation \(AX=B\) exactly when the entries \(x_{1}, x_{2}, \ldots, x_{n}\) of the vector \(X\) form a solution to the original system.
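This check can be sketched in a few lines of code. The example below is only an illustration: the function name `satisfies` is hypothetical, and the system used is an assumed one built from the numbers in the linear combination at the start of this section. Each row of \(A\) together with the matching entry of \(B\) represents one equation of the system.

```python
def satisfies(A, X, B):
    """Return True when AX = B, checked one equation (row of A) at a time.

    Exact comparison is used, so this sketch is meant for integer entries.
    """
    if len(A) != len(B) or any(len(row) != len(X) for row in A):
        raise ValueError("sizes of A, X and B do not fit together")
    for row, b in zip(A, B):
        # Left-hand side of one equation: a_i1*x_1 + ... + a_in*x_n
        lhs = sum(a * x for a, x in zip(row, X))
        if lhs != b:
            return False
    return True


A = [[1, 2, 3],
     [4, 5, 6]]
B = [50, 122]
print(satisfies(A, [7, 8, 9], B))  # True
print(satisfies(A, [1, 1, 1], B))  # False
```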
Now that we have examined how to multiply a matrix by a vector, we wish to consider the case where we multiply two matrices of more general sizes, although these sizes still need to be appropriate as we will see. For example, in Example \(\PageIndex{1}\), we multiplied a \(3 \times 4\) matrix by a \(4 \times 1\) vector. We want to investigate how to multiply other sizes of matrices.
We have not yet given any conditions on when matrix multiplication is possible! For matrices \(A\) and \(B\), in order to form the product \(AB\), the number of columns of \(A\) must equal the number of rows of \(B.\) Consider a product \(AB\) where \(A\) has size \(m\times n\) and \(B\) has size \(n \times p\). Then, the product in terms of size of matrices is given by \[(m\times\overset{\text{these must match!}}{\widehat{n)\;(n}\times p})=m\times p\nonumber \]
Note the two outside numbers give the size of the product. One of the most important rules regarding matrix multiplication is the following. If the two middle numbers don’t match, you can’t multiply the matrices!
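The size rule can be written as a short check. This is a minimal sketch in which sizes are given as `(rows, columns)` pairs; the function name `product_shape` is a hypothetical helper for illustration only.

```python
def product_shape(shape_a, shape_b):
    """Return the size of AB, or None when the product is not defined.

    shape_a is (m, n) and shape_b is (n2, p); the product exists only
    when the two middle numbers n and n2 match.
    """
    m, n = shape_a
    n2, p = shape_b
    if n != n2:
        return None
    # The two outside numbers give the size of the product.
    return (m, p)


print(product_shape((2, 3), (3, 3)))  # (2, 3)
print(product_shape((3, 1), (1, 4)))  # (3, 4)
print(product_shape((3, 3), (2, 3)))  # None
```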
When the number of columns of \(A\) equals the number of rows of \(B\) the two matrices are said to be conformable and the product \(AB\) is obtained as follows.
Let \(A\) be an \(m\times n\) matrix and let \(B\) be an \(n\times p\) matrix of the form \[B=\left[ B_{1} \cdots B_{p}\right]\nonumber \] where \(B_{1},...,B_{p}\) are the \(n\times 1\) columns of \(B\). Then the \(m\times p\) matrix \(AB\) is defined as follows: \[AB = A \left[ B_{1} \cdots B_{p}\right] = \left[ (A B)_{1} \cdots (AB)_{p}\right]\nonumber \] where \((AB)_{k} = AB_{k}\) is the \(m\times 1\) column vector obtained by multiplying \(A\) by the \(k^{th}\) column of \(B\); it gives the \(k^{th}\) column of \(AB\).
Consider the following example.
Find \(AB\) if possible. \[A = \left[ \begin{array}{rrr} 1 & 2 & 1 \\ 0 & 2 & 1 \end{array} \right], B = \left[ \begin{array}{rrr} 1 & 2 & 0 \\ 0 & 3 & 1 \\ -2 & 1 & 1 \end{array} \right]\nonumber \]
Solution
The first thing you need to verify when calculating a product is whether the multiplication is possible. The first matrix has size \(2\times 3\) and the second matrix has size \(3\times 3\). The inside numbers are equal, so \(A\) and \(B\) are conformable matrices. According to the above discussion \(AB\) will be a \(2\times 3\) matrix. Definition \(\PageIndex{5}\) gives us a way to calculate each column of \(AB\), as follows.
\[\left[ \overset{ \text{First column}}{\overbrace{\left[ \begin{array}{rrr} 1 & 2 & 1 \\ 0 & 2 & 1 \end{array} \right] \left[ \begin{array}{r} 1 \\ 0 \\ -2 \end{array} \right] }},\overset{\text{Second column}}{\overbrace{\left[ \begin{array}{rrr} 1 & 2 & 1 \\ 0 & 2 & 1 \end{array} \right] \left[ \begin{array}{r} 2 \\ 3 \\ 1 \end{array} \right] }},\overset{\text{Third column}}{\overbrace{\left[ \begin{array}{rrr} 1 & 2 & 1 \\ 0 & 2 & 1 \end{array} \right] \left[ \begin{array}{r} 0 \\ 1 \\ 1 \end{array} \right] }}\right]\nonumber \] You know how to multiply a matrix times a vector, using Definition \(\PageIndex{3}\) for each of the three columns. Thus \[\left[ \begin{array}{rrr} 1 & 2 & 1 \\ 0 & 2 & 1 \end{array} \right] \left[ \begin{array}{rrr} 1 & 2 & 0 \\ 0 & 3 & 1 \\ -2 & 1 & 1 \end{array} \right] = \ \left[ \begin{array}{rrr} -1 & 9 & 3 \\ -2 & 7 & 3 \end{array} \right]\nonumber \]
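The column-by-column procedure can be sketched as follows. This is an illustrative sketch, assuming matrices are stored as lists of rows; `mat_vec` and `mat_mul` are hypothetical helper names, not notation from the text. Each column of \(B\) is multiplied by \(A\) to give the corresponding column of \(AB\).

```python
def mat_vec(A, x):
    """Return A x as a linear combination of the columns of A."""
    m, n = len(A), len(A[0])
    result = [0] * m
    for j in range(n):
        # Add x_j times column j of A.
        result = [result[i] + x[j] * A[i][j] for i in range(m)]
    return result


def mat_mul(A, B):
    """Return AB, computed one column at a time."""
    n, p = len(B), len(B[0])
    if len(A[0]) != n:
        raise ValueError("columns of A must equal rows of B")
    # columns[k] is A times the k-th column of B, i.e. the k-th column of AB.
    columns = [mat_vec(A, [B[i][k] for i in range(n)]) for k in range(p)]
    # Turn the list of columns back into a list of rows.
    return [list(row) for row in zip(*columns)]


A = [[1, 2, 1],
     [0, 2, 1]]
B = [[1, 2, 0],
     [0, 3, 1],
     [-2, 1, 1]]
print(mat_mul(A, B))  # [[-1, 9, 3], [-2, 7, 3]]
```

The output agrees with the \(2\times 3\) matrix computed above.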
Since vectors are simply \(n \times 1\) or \(1 \times m\) matrices, we can also multiply a vector by another vector.
Multiply if possible \(\left[ \begin{array}{r} 1 \\ 2 \\ 1 \end{array} \right] \left[ \begin{array}{rrrr} 1 & 2 & 1 & 0 \end{array} \right] .\)
Solution
In this case we are multiplying a matrix of size \(3 \times 1\) by a matrix of size \(1 \times 4.\) The inside numbers match so the product is defined. Note that the product will be a matrix of size \(3 \times 4\). Using Definition \(\PageIndex{5}\), we can compute this product as follows: \[\left[ \begin{array}{r} 1 \\ 2 \\ 1 \end{array} \right] \left[ \begin{array}{rrrr} 1 & 2 & 1 & 0 \end{array} \right] = \left[ \overset{ \text{First column}}{\overbrace{\left[ \begin{array}{r} 1 \\ 2 \\ 1 \end{array} \right] \left[ \begin{array}{r} 1 \end{array} \right] }},\overset{\text{Second column}}{\overbrace{\left[ \begin{array}{r} 1 \\ 2\\ 1 \end{array} \right] \left[ \begin{array}{r} 2 \end{array} \right] }},\overset{\text{Third column}}{\overbrace{\left[ \begin{array}{r} 1 \\ 2 \\ 1 \end{array} \right] \left[ \begin{array}{r} 1 \end{array} \right] }}, \overset {\text{Fourth column}}{\overbrace{\left[ \begin{array}{r} 1\\ 2\\ 1 \end{array} \right] \left[ \begin{array}{r} 0 \end{array} \right]}} \right]\nonumber \]
You can use Definition \(\PageIndex{3}\) to verify that this product is \[\left[ \begin{array}{cccc} 1 & 2 & 1 & 0 \\ 2 & 4 & 2 & 0 \\ 1 & 2 & 1 & 0 \end{array} \right]\nonumber \]
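Because each column of this product is the column vector scaled by one entry of the row vector, the whole computation fits in one line of code. The sketch below is only an illustration, and the function name `column_times_row` is hypothetical.

```python
def column_times_row(col, row):
    """Multiply an m x 1 column vector by a 1 x p row vector.

    Entry (i, j) of the m x p result is col[i] * row[j].
    """
    return [[c * r for r in row] for c in col]


print(column_times_row([1, 2, 1], [1, 2, 1, 0]))
# [[1, 2, 1, 0], [2, 4, 2, 0], [1, 2, 1, 0]]
```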
Find \(BA\) if possible. \[B = \left[ \begin{array}{ccc} 1 & 2 & 0 \\ 0 & 3 & 1 \\ -2 & 1 & 1 \end{array} \right], A = \left[ \begin{array}{ccc} 1 & 2 & 1 \\ 0 & 2 & 1 \end{array} \right]\nonumber \]
Solution
First check if it is possible. This product is of the form \(\left( 3\times 3\right) \left( 2\times 3\right) .\) The inside numbers do not match and so you can’t do this multiplication.
In this case, we say that the multiplication is not defined. Notice that these are the same matrices which we used in Example \(\PageIndex{2}\). In this example, we tried to calculate \(BA\) instead of \(AB\). This demonstrates another property of matrix multiplication. While the product \(AB\) may be defined, we cannot assume that the product \(BA\) will be possible. Therefore, it is important to always check that the product is defined before carrying out any calculations.
Earlier, we defined the zero matrix \(0\) to be the matrix (of appropriate size) containing zeros in all entries. Consider the following example for multiplication by the zero matrix.
Compute the product \(A0\) for the matrix \[A= \left[ \begin{array}{rr} 1 & 2 \\ 3 & 4 \end{array} \right]\nonumber \] and the \(2 \times 2\) zero matrix given by \[0= \left[ \begin{array}{rr} 0 & 0 \\ 0 & 0 \end{array} \right]\nonumber \]
Solution
In this product, we compute \[\left[ \begin{array}{rr} 1 & 2 \\ 3 & 4 \end{array} \right] \left[ \begin{array}{rr} 0 & 0 \\ 0 & 0 \end{array} \right] = \left[ \begin{array}{rr} 0 & 0 \\ 0 & 0 \end{array} \right]\nonumber \]
Hence, \(A0=0\).
Notice that we could also multiply \(A\) by the \(2 \times 1\) zero vector given by \(\left[ \begin{array}{r} 0 \\ 0 \end{array} \right]\). The result would be the \(2 \times 1\) zero vector. Therefore, it is always the case that \(A0=0\), for an appropriately sized zero matrix or vector.
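One way to see this for the zero vector is through the linear combination in Definition \(\PageIndex{3}\): every column of \(A\) is multiplied by the scalar \(0\), so \[\left[ \begin{array}{rr} 1 & 2 \\ 3 & 4 \end{array} \right] \left[ \begin{array}{r} 0 \\ 0 \end{array} \right] = 0\left[ \begin{array}{r} 1 \\ 3 \end{array} \right] + 0\left[ \begin{array}{r} 2 \\ 4 \end{array} \right] = \left[ \begin{array}{r} 0 \\ 0 \end{array} \right]\nonumber \] The same computation applies to each column of a zero matrix, which is why every column of \(A0\) is a zero vector.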