# 3.1: The Matrix Transpose

- T/F: If \(A\) is a \(3\times 5\) matrix, then \(A^{T}\) will be a \(5\times 3\) matrix.
- Where are there zeros in an upper triangular matrix?
- T/F: A matrix is symmetric if it doesn’t change when you take its transpose.
- What is the transpose of the transpose of \(A\)?
- Give 2 other terms to describe symmetric matrices besides “interesting.”

We jump right in with a definition.

Let \(A\) be an \(m\times n\) matrix. The *transpose* of \(A\), denoted \(A^{T}\), is the \(n\times m\) matrix whose columns are the respective rows of \(A\).

Examples will make this definition clear.

Find the transpose of \(A=\left[\begin{array}{ccc}{1}&{2}&{3}\\{4}&{5}&{6}\end{array}\right].\)

**Solution**

Note that \(A\) is a \(2\times 3\) matrix, so \(A^{T}\) will be a \(3 \times 2\) matrix. By the definition, the first column of \(A^{T}\) is the first row of \(A\); the second column of \(A^{T}\) is the second row of \(A\). Therefore,

\[A^{T}=\left[\begin{array}{cc}{1}&{4}\\{2}&{5}\\{3}&{6}\end{array}\right]. \nonumber \]

Find the transpose of the following matrices.

\[A=\left[\begin{array}{cccc}{7}&{2}&{9}&{1}\\{2}&{-1}&{3}&{0}\\{-5}&{3}&{0}&{11}\end{array}\right]\quad B=\left[\begin{array}{ccc}{1}&{10}&{-2}\\{3}&{-5}&{7}\\{4}&{2}&{-3}\end{array}\right]\quad C=\left[\begin{array}{ccccc}{1}&{-1}&{7}&{8}&{3}\end{array}\right] \nonumber \]

**Solution**

We find each transpose using the definition without explanation. Make note of the dimensions of the original matrix and the dimensions of its transpose.

\[A^{T}=\left[\begin{array}{ccc}{7}&{2}&{-5}\\{2}&{-1}&{3}\\{9}&{3}&{0}\\{1}&{0}&{11}\end{array}\right]\quad B^{T}=\left[\begin{array}{ccc}{1}&{3}&{4}\\{10}&{-5}&{2}\\{-2}&{7}&{-3}\end{array}\right]\quad C^{T}=\left[\begin{array}{c}{1}\\{-1}\\{7}\\{8}\\{3}\end{array}\right] \nonumber \]
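The definition translates almost word for word into code. Below is a minimal Python sketch, assuming a matrix is stored as a list of rows; the helper name `transpose` is our own choice for illustration and not part of any library.

```python
def transpose(matrix):
    """Return the transpose of a matrix stored as a list of rows."""
    # zip(*matrix) groups the first entries of every row, then the second
    # entries, and so on. Each group is a column of the original matrix
    # and therefore a row of the transpose.
    return [list(column) for column in zip(*matrix)]


# Matrices A and C from the example above.
A = [[7, 2, 9, 1],
     [2, -1, 3, 0],
     [-5, 3, 0, 11]]
C = [[1, -1, 7, 8, 3]]

A_T = transpose(A)
C_T = transpose(C)

print(A_T)  # [[7, 2, -5], [2, -1, 3], [9, 3, 0], [1, 0, 11]]
print(C_T)  # [[1], [-1], [7], [8], [3]]
print(len(A), len(A[0]), "->", len(A_T), len(A_T[0]))  # 3 4 -> 4 3
```

The last line prints the dimensions before and after: a \(3\times 4\) matrix becomes a \(4\times 3\) matrix.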

Notice that with matrix \(B\), when we took the transpose, the *diagonal* did not change. We can see what the diagonal is below where we rewrite \(B\) and \(B^{T}\) with the diagonal in bold. We’ll follow this by a definition of what we mean by “the diagonal of a matrix,” along with a few other related definitions.

\[B=\left[\begin{array}{ccc}{\mathbf{1}}&{10}&{-2}\\{3}&{\mathbf{-5}}&{7}\\{4}&{2}&{\mathbf{-3}}\end{array}\right]\quad B^{T}=\left[\begin{array}{ccc}{\mathbf{1}}&{3}&{4}\\{10}&{\mathbf{-5}}&{2}\\{-2}&{7}&{\mathbf{-3}}\end{array}\right] \nonumber \]

It is probably pretty clear why we call those entries “the diagonal.” Here is the formal definition.

Let \(A\) be an \(m\times n\) matrix. The *diagonal* of \(A\) consists of the entries \(a_{11},\: a_{22},\cdots\) of \(A\).

- A *diagonal matrix* is an \(n\times n\) matrix in which the only nonzero entries lie on the diagonal.
- An *upper (lower) triangular* matrix is a matrix in which any nonzero entries lie on or above (below) the diagonal.

Consider the matrices \(A\), \(B\), \(C\) and \(I_{4}\), as well as their transposes, where

\[A=\left[\begin{array}{ccc}{1}&{2}&{3}\\{0}&{4}&{5}\\{0}&{0}&{6}\end{array}\right]\quad B=\left[\begin{array}{ccc}{3}&{0}&{0}\\{0}&{7}&{0}\\{0}&{0}&{-1}\end{array}\right]\quad C=\left[\begin{array}{ccc}{1}&{2}&{3}\\{0}&{4}&{5}\\{0}&{0}&{6}\\{0}&{0}&{0}\end{array}\right]. \nonumber \]

Identify the diagonal of each matrix, and state whether each matrix is diagonal, upper triangular, lower triangular, or none of the above.

**Solution**

We first compute the transpose of each matrix.

\[A^{T}=\left[\begin{array}{ccc}{1}&{0}&{0}\\{2}&{4}&{0}\\{3}&{5}&{6}\end{array}\right]\quad B^{T}=\left[\begin{array}{ccc}{3}&{0}&{0}\\{0}&{7}&{0}\\{0}&{0}&{-1}\end{array}\right]\quad C^{T}=\left[\begin{array}{cccc}{1}&{0}&{0}&{0}\\{2}&{4}&{0}&{0}\\{3}&{5}&{6}&{0}\end{array}\right] \nonumber \]

Note that \(I_{4}^{T}=I_{4}\).

The diagonals of \(A\) and \(A^{T}\) are the same, consisting of the entries \(1,\: 4\) and \(6\). The diagonals of \(B\) and \(B^{T}\) are also the same, consisting of the entries \(3,\: 7\) and \(-1\). Finally, the diagonals of \(C\) and \(C^{T}\) are the same, consisting of the entries \(1,\: 4\) and \(6\).

The matrix \(A\) is upper triangular; the only nonzero entries lie on or above the diagonal. Likewise, \(A^{T}\) is lower triangular.

The matrix \(B\) is diagonal. By their definitions, we can also see that \(B\) is both upper and lower triangular. Likewise, \(I_{4}\) is diagonal, as well as upper and lower triangular.

Finally, \(C\) is upper triangular, with \(C^{T}\) being lower triangular.

Make note of the definitions of diagonal and triangular matrices. We specify that a diagonal matrix must be square, but triangular matrices don’t have to be. (“Most” of the time, however, the ones we study are.) Also, as we mentioned before in the example, by definition a diagonal matrix is also both upper and lower triangular. Finally, notice that by definition, the transpose of an upper triangular matrix is a lower triangular matrix, and vice-versa.
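These definitions can be tested mechanically. The following Python sketch assumes matrices are stored as lists of rows; the function names are hypothetical helpers written for this illustration.

```python
def diagonal(matrix):
    """Entries a_11, a_22, ... of a matrix stored as a list of rows."""
    return [matrix[i][i] for i in range(min(len(matrix), len(matrix[0])))]


def is_upper_triangular(matrix):
    """True if every entry below the diagonal is zero."""
    return all(entry == 0
               for i, row in enumerate(matrix)
               for j, entry in enumerate(row)
               if i > j)


def is_lower_triangular(matrix):
    """True if every entry above the diagonal is zero."""
    return all(entry == 0
               for i, row in enumerate(matrix)
               for j, entry in enumerate(row)
               if i < j)


def is_diagonal(matrix):
    """True if the matrix is square and zero everywhere off the diagonal."""
    is_square = len(matrix) == len(matrix[0])
    return (is_square
            and is_upper_triangular(matrix)
            and is_lower_triangular(matrix))


# Matrices A, B and C from the example above.
A = [[1, 2, 3], [0, 4, 5], [0, 0, 6]]
B = [[3, 0, 0], [0, 7, 0], [0, 0, -1]]
C = [[1, 2, 3], [0, 4, 5], [0, 0, 6], [0, 0, 0]]

print(diagonal(C))                                     # [1, 4, 6]
print(is_upper_triangular(A), is_lower_triangular(A))  # True False
print(is_diagonal(B))                                  # True
print(is_upper_triangular(C), is_diagonal(C))          # True False
```

Note that `is_diagonal` insists on a square matrix while the triangular checks do not, mirroring the definitions.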

There are many questions to probe concerning the transpose operation.\(^{1}\) The first set of questions we’ll investigate involves the matrix arithmetic we learned in the last chapter. We do this investigation by way of examples, and then summarize what we have learned at the end.

Let

\[A=\left[\begin{array}{ccc}{1}&{2}&{3}\\{4}&{5}&{6}\end{array}\right]\quad\text{and}\quad B=\left[\begin{array}{ccc}{1}&{2}&{1}\\{3}&{-1}&{0}\end{array}\right]. \nonumber \]

Find \(A^{T}+B^{T}\) and \((A+B)^{T}\).

**Solution**

We note that

\[A^{T}=\left[\begin{array}{cc}{1}&{4}\\{2}&{5}\\{3}&{6}\end{array}\right]\quad\text{and}\quad B^{T}=\left[\begin{array}{cc}{1}&{3}\\{2}&{-1}\\{1}&{0}\end{array}\right]. \nonumber \]

Therefore

\[\begin{align}\begin{aligned}A^{T}+B^{T}&=\left[\begin{array}{cc}{1}&{4}\\{2}&{5}\\{3}&{6}\end{array}\right]+\left[\begin{array}{cc}{1}&{3}\\{2}&{-1}\\{1}&{0}\end{array}\right] \\ &=\left[\begin{array}{cc}{2}&{7}\\{4}&{4}\\{4}&{6}\end{array}\right].\end{aligned}\end{align} \nonumber \]

Also,

\[\begin{align}\begin{aligned}(A+B)^{T}&=\left(\left[\begin{array}{ccc}{1}&{2}&{3}\\{4}&{5}&{6}\end{array}\right]+\left[\begin{array}{ccc}{1}&{2}&{1}\\{3}&{-1}&{0}\end{array}\right]\right)^{T} \\ &=\left(\left[\begin{array}{ccc}{2}&{4}&{4}\\{7}&{4}&{6}\end{array}\right]\right)^{T} \\ &=\left[\begin{array}{cc}{2}&{7}\\{4}&{4}\\{4}&{6}\end{array}\right].\end{aligned}\end{align} \nonumber \]

It looks like “the sum of the transposes is the transpose of the sum.”\(^{2}\) This should lead us to wonder how the transpose works with multiplication.

Let

\[A=\left[\begin{array}{cc}{1}&{2}\\{3}&{4}\end{array}\right]\quad\text{and}\quad B=\left[\begin{array}{ccc}{1}&{2}&{-1}\\{1}&{0}&{1}\end{array}\right]. \nonumber \]

Find \((AB)^{T}\), \(A^{T}B^{T}\) and \(B^{T}A^{T}\).

**Solution**

We first note that

\[A^{T}=\left[\begin{array}{cc}{1}&{3}\\{2}&{4}\end{array}\right]\quad\text{and}\quad B^{T}=\left[\begin{array}{cc}{1}&{1}\\{2}&{0}\\{-1}&{1}\end{array}\right]. \nonumber \]

Find \((AB)^{T}\):

\[\begin{align}\begin{aligned}(AB)^{T}&=\left(\left[\begin{array}{cc}{1}&{2}\\{3}&{4}\end{array}\right]\left[\begin{array}{ccc}{1}&{2}&{-1}\\{1}&{0}&{1}\end{array}\right]\right)^{T} \\ &=\left(\left[\begin{array}{ccc}{3}&{2}&{1}\\{7}&{6}&{1}\end{array}\right]\right)^{T} \\ &=\left[\begin{array}{cc}{3}&{7}\\{2}&{6}\\{1}&{1}\end{array}\right]\end{aligned}\end{align} \nonumber \]

Now find \(A^{T}B^{T}\):

\[\begin{align}\begin{aligned}A^{T}B^{T}&=\left[\begin{array}{cc}{1}&{3}\\{2}&{4}\end{array}\right]\left[\begin{array}{cc}{1}&{1}\\{2}&{0}\\{-1}&{1}\end{array}\right] \\ &=\text{Not defined!}\end{aligned}\end{align} \nonumber \]

So we can’t compute \(A^{T}B^{T}\). Let’s finish by computing \(B^{T}A^{T}\):

\[\begin{align}\begin{aligned}B^{T}A^{T}&=\left[\begin{array}{cc}{1}&{1}\\{2}&{0}\\{-1}&{1}\end{array}\right]\left[\begin{array}{cc}{1}&{3}\\{2}&{4}\end{array}\right] \\ &=\left[\begin{array}{cc}{3}&{7}\\{2}&{6}\\{1}&{1}\end{array}\right]\end{aligned}\end{align} \nonumber \]

We may have suspected that \((AB)^{T}=A^{T}B^{T}\). We saw that this wasn’t the case, though – and not only was it not equal, the second product wasn’t even defined! Oddly enough, though, we saw that \((AB)^{T}=B^{T}A^{T}\).\(^{3}\) To help understand why this is true, look back at the work above and confirm the steps of each multiplication.
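The sum and product computations above can be repeated in a few lines of Python. This is only a sketch under the assumption that matrices are lists of rows; `transpose`, `add` and `multiply` are helper names invented here, and `multiply` raises an error when the product is not defined.

```python
def transpose(matrix):
    """Rows become columns."""
    return [list(column) for column in zip(*matrix)]


def add(X, Y):
    """Entrywise sum of two matrices of the same size."""
    return [[x + y for x, y in zip(row_x, row_y)]
            for row_x, row_y in zip(X, Y)]


def multiply(X, Y):
    """Matrix product XY; raises ValueError when the sizes do not match."""
    if len(X[0]) != len(Y):
        raise ValueError("product not defined: inner dimensions differ")
    Y_columns = transpose(Y)
    return [[sum(x * y for x, y in zip(row, column)) for column in Y_columns]
            for row in X]


# Sum example from the text.
A = [[1, 2, 3], [4, 5, 6]]
B = [[1, 2, 1], [3, -1, 0]]
print(transpose(add(A, B)) == add(transpose(A), transpose(B)))  # True

# Product example from the text.
A = [[1, 2], [3, 4]]
B = [[1, 2, -1], [1, 0, 1]]
print(transpose(multiply(A, B)))             # [[3, 7], [2, 6], [1, 1]]
print(multiply(transpose(B), transpose(A)))  # [[3, 7], [2, 6], [1, 1]]

try:
    multiply(transpose(A), transpose(B))     # a 2x2 times a 3x2 matrix
except ValueError as error:
    print(error)  # product not defined: inner dimensions differ
```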

We have one more arithmetic operation to look at: the inverse.

Let

\[A=\left[\begin{array}{cc}{2}&{7}\\{1}&{4}\end{array}\right]. \nonumber \]

Find \((A^{-1})^{T}\) and \((A^{T})^{-1}\).

**Solution**

We first find \(A^{-1}\) and \(A^{T}\):

\[A^{-1}=\left[\begin{array}{cc}{4}&{-7}\\{-1}&{2}\end{array}\right]\quad\text{and}\quad A^{T}=\left[\begin{array}{cc}{2}&{1}\\{7}&{4}\end{array}\right]. \nonumber \]

Finding \((A^{-1})^{T}\):

\[\begin{align}\begin{aligned}(A^{-1})^{T}&=\left[\begin{array}{cc}{4}&{-7}\\{-1}&{2}\end{array}\right]^{T} \\ &=\left[\begin{array}{cc}{4}&{-1}\\{-7}&{2}\end{array}\right]\end{aligned}\end{align} \nonumber \]

Finding \((A^{T})^{-1}\):

\[\begin{align}\begin{aligned}(A^{T})^{-1}&=\left[\begin{array}{cc}{2}&{1}\\{7}&{4}\end{array}\right]^{-1} \\ &=\left[\begin{array}{cc}{4}&{-1}\\{-7}&{2}\end{array}\right]\end{aligned}\end{align} \nonumber \]

It seems that “the inverse of the transpose is the transpose of the inverse.”\(^{4}\)
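We can confirm this example numerically. The sketch below assumes the standard formula for the inverse of a \(2\times 2\) matrix (swap the diagonal entries, negate the other two, and divide by the determinant) and uses Python's `fractions` module so the arithmetic stays exact; `inverse_2x2` is a hypothetical helper written for this illustration.

```python
from fractions import Fraction


def transpose(matrix):
    """Rows become columns."""
    return [list(column) for column in zip(*matrix)]


def inverse_2x2(matrix):
    """Inverse of a 2x2 matrix via the determinant formula."""
    (a, b), (c, d) = matrix
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is not invertible")
    scale = Fraction(1, det)
    return [[d * scale, -b * scale],
            [-c * scale, a * scale]]


A = [[2, 7], [1, 4]]

left = transpose(inverse_2x2(A))   # (A^{-1})^T
right = inverse_2x2(transpose(A))  # (A^T)^{-1}

print(left == right)                            # True
print([[int(x) for x in row] for row in left])  # [[4, -1], [-7, 2]]
```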

We have just looked at some examples of how the transpose operation interacts with matrix arithmetic operations.\(^{5}\) We now give a theorem that tells us that what we saw wasn’t a coincidence, but rather is always true.

Let \(A\) and \(B\) be matrices where the following operations are defined. Then:

- \((A+B)^{T}=A^{T}+B^{T}\) and \((A-B)^{T}=A^{T}-B^{T}\)
- \((kA)^{T}=kA^{T}\)
- \((AB)^{T}=B^{T}A^{T}\)
- \((A^{-1})^{T}=(A^{T})^{-1}\)
- \((A^{T})^{T}=A\)

We included in the theorem two ideas we didn’t discuss already. The first is that \((kA)^{T}=kA^{T}\). This is probably obvious: when dealing with transposes, it doesn’t matter whether we multiply by the scalar before or after taking the transpose.

The second “new” item is that \((A^{T})^{T}=A\). That is, if we take the transpose of a matrix, then take its transpose again, what do we have? The original matrix.
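Both of these "new" items are easy to spot-check. Here is a short Python sketch, with the matrix and the scalar \(k=3\) chosen arbitrarily for illustration and `scale` being a helper name of our own. As with the earlier examples, a check like this doesn't prove anything; it only shows the rules holding in one case.

```python
def transpose(matrix):
    """Rows become columns."""
    return [list(column) for column in zip(*matrix)]


def scale(k, matrix):
    """Multiply every entry of the matrix by the scalar k."""
    return [[k * entry for entry in row] for row in matrix]


A = [[1, 2, 3], [4, 5, 6]]
k = 3

print(transpose(scale(k, A)) == scale(k, transpose(A)))  # True: (kA)^T = kA^T
print(transpose(transpose(A)) == A)                      # True: (A^T)^T = A
```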

Now that we know some properties of the transpose operation, we are tempted to play around with it and see what happens. For instance, if \(A\) is an \(m\times n\) matrix, we know that \(A^{T}\) is an \(n\times m\) matrix. So no matter what matrix \(A\) we start with, we can always perform the multiplication \(AA^{T}\) (and also \(A^{T}A\)) and the result is a square matrix!

Another thing to ask ourselves as we “play around” with the transpose: suppose \(A\) is a square matrix. Is there anything special about \(A+A^{T}\)? The following example has us try out these ideas.

Let

\[A=\left[\begin{array}{ccc}{2}&{1}&{3}\\{2}&{-1}&{1}\\{1}&{0}&{1}\end{array}\right]. \nonumber \]

Find \(AA^{T}\), \(A+A^{T}\) and \(A-A^{T}\).

**Solution**

Finding \(AA^{T}\):

\[\begin{align}\begin{aligned}AA^{T}&=\left[\begin{array}{ccc}{2}&{1}&{3}\\{2}&{-1}&{1}\\{1}&{0}&{1}\end{array}\right]\left[\begin{array}{ccc}{2}&{2}&{1}\\{1}&{-1}&{0}\\{3}&{1}&{1}\end{array}\right] \\ &=\left[\begin{array}{ccc}{14}&{6}&{5}\\{6}&{6}&{3}\\{5}&{3}&{2}\end{array}\right]\end{aligned}\end{align} \nonumber \]

Finding \(A+A^{T}\):

\[\begin{align}\begin{aligned}A+A^{T}&=\left[\begin{array}{ccc}{2}&{1}&{3}\\{2}&{-1}&{1}\\{1}&{0}&{1}\end{array}\right]+\left[\begin{array}{ccc}{2}&{2}&{1}\\{1}&{-1}&{0}\\{3}&{1}&{1}\end{array}\right] \\ &=\left[\begin{array}{ccc}{4}&{3}&{4}\\{3}&{-2}&{1}\\{4}&{1}&{2}\end{array}\right]\end{aligned}\end{align} \nonumber \]

Finding \(A-A^{T}\):

\[\begin{align}\begin{aligned}A-A^{T}&=\left[\begin{array}{ccc}{2}&{1}&{3}\\{2}&{-1}&{1}\\{1}&{0}&{1}\end{array}\right]-\left[\begin{array}{ccc}{2}&{2}&{1}\\{1}&{-1}&{0}\\{3}&{1}&{1}\end{array}\right] \\ &=\left[\begin{array}{ccc}{0}&{-1}&{2}\\{1}&{0}&{1}\\{-2}&{-1}&{0}\end{array}\right]\end{aligned}\end{align} \nonumber \]

Let’s look at the matrices we’ve formed in this example. First, consider \(AA^{T}\). Something seems to be nice about this matrix – look at the location of the off-diagonal 6’s, the 5’s and the 3’s. More precisely, let’s look at the transpose of \(AA^{T}\). We should notice that if we take the transpose of this matrix, we have the very same matrix. That is,

\[\left(\left[\begin{array}{ccc}{14}&{6}&{5}\\{6}&{6}&{3}\\{5}&{3}&{2}\end{array}\right]\right)^{T}=\left[\begin{array}{ccc}{14}&{6}&{5}\\{6}&{6}&{3}\\{5}&{3}&{2}\end{array}\right]! \nonumber \]

We’ll formally define this in a moment, but a matrix that is equal to its transpose is called *symmetric*.

Look at the next part of the example; what do we notice about \(A+A^{T}\)? We should see that it, too, is symmetric. Finally, consider the last part of the example: do we notice anything about \(A-A^{T}\)?

We should immediately notice that it is not symmetric, although it does seem “close.” Instead of it being equal to its transpose, we notice that this matrix is the *opposite* of its transpose. We call this type of matrix *skew symmetric*.\(^{6}\) We formally define these matrices here.

A matrix \(A\) is *symmetric* if \(A^{T}=A\).

A matrix \(A\) is *skew symmetric* if \(A^{T}=-A\).

Note that in order for a matrix to be either symmetric or skew symmetric, it must be square.
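Both definitions reduce to a single comparison, so they are easy to test. The Python sketch below applies them to the three matrices built from \(A\) in the previous example; the helper names are our own, and matrices are again assumed to be lists of rows.

```python
def transpose(matrix):
    """Rows become columns."""
    return [list(column) for column in zip(*matrix)]


def negate(matrix):
    """Multiply every entry by -1."""
    return [[-entry for entry in row] for row in matrix]


def add(X, Y):
    """Entrywise sum of two matrices of the same size."""
    return [[x + y for x, y in zip(row_x, row_y)]
            for row_x, row_y in zip(X, Y)]


def multiply(X, Y):
    """Matrix product XY (sizes are assumed to be compatible)."""
    Y_columns = transpose(Y)
    return [[sum(x * y for x, y in zip(row, column)) for column in Y_columns]
            for row in X]


def is_symmetric(matrix):
    """True when the matrix equals its transpose."""
    return transpose(matrix) == matrix


def is_skew_symmetric(matrix):
    """True when the transpose equals the negative of the matrix."""
    return transpose(matrix) == negate(matrix)


A = [[2, 1, 3], [2, -1, 1], [1, 0, 1]]
A_T = transpose(A)

print(is_symmetric(multiply(A, A_T)))          # True
print(is_symmetric(add(A, A_T)))               # True
print(is_skew_symmetric(add(A, negate(A_T))))  # True
print(is_symmetric(A))                         # False
```

A non-square matrix fails both tests automatically, because its transpose has different dimensions and so cannot equal the matrix or its negative.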

So why was \(AA^{T}\) symmetric in our previous example? Did we just luck out?\(^{7}\) Let’s take the transpose of \(AA^{T}\) and see what happens.

\[\begin{align}\begin{aligned} (AA^{T})^{T}&=(A^{T})^{T}(A)^{T} &\text{transpose multiplication rule} \\ &=AA^{T} &(A^{T})^{T}=A\end{aligned}\end{align} \nonumber \]

We have just *proved* that no matter what matrix \(A\) we start with, the matrix \(AA^{T}\) will be symmetric. Nothing in our string of equalities even demanded that \(A\) be a square matrix; it is always true.

We can do a similar proof to show that as long as \(A\) is square, \(A+A^{T}\) is a symmetric matrix.\(^{8}\) We’ll instead show here that if \(A\) is a square matrix, then \(A-A^{T}\) is skew symmetric.

\[\begin{align}\begin{aligned} (A-A^{T})^{T}&=A^{T}-(A^{T})^{T} &\text{transpose subtraction rule} \\ &=A^{T}-A \\ &=-(A-A^{T})\end{aligned}\end{align} \nonumber \]

So we took the transpose of \(A-A^{T}\) and we got \(-(A-A^{T})\); this is the definition of being skew symmetric.

We’ll take what we learned from the preceding example and put it in a box. (We’ve already proved most of this is true; the rest we leave to solve in the Exercises.)

**Symmetric and Skew Symmetric Matrices**

- Given any matrix \(A\), the matrices \(AA^{T}\) and \(A^{T}A\) are symmetric.
- Let \(A\) be a square matrix. The matrix \(A + A^{T}\) is symmetric.
- Let \(A\) be a square matrix. The matrix \(A - A^{T}\) is skew symmetric.

Why do we care about the transpose of a matrix? Why do we care about symmetric matrices?

There are two answers that each answer both of these questions. First, we are interested in the transpose of a matrix and symmetric matrices because they are interesting.\(^{9}\) One particularly interesting thing about symmetric and skew symmetric matrices is this: consider the sum of \((A+A^{T})\) and \((A-A^{T})\):

\[(A+A^{T})+(A-A^{T})=2A. \nonumber \]

This gives us an idea: if we were to multiply both sides of this equation by \(\frac12\), then the right hand side would just be \(A\). This means that

\[A=\underbrace{\frac{1}{2}(A+A^{T})}_{\text{symmetric}}+\underbrace{\frac{1}{2}(A-A^{T})}_{\text{skew symmetric}}. \nonumber \]

That is, any square matrix \(A\) can be written as the sum of a symmetric matrix and a skew symmetric matrix. That’s interesting.
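To see the decomposition in action, here is a small Python sketch that splits the matrix \(A\) from the earlier example into its two parts. The function names `symmetric_part` and `skew_part` are hypothetical, and `Fraction` is used so that multiplying by \(\frac12\) stays exact.

```python
from fractions import Fraction


def transpose(matrix):
    """Rows become columns."""
    return [list(column) for column in zip(*matrix)]


def symmetric_part(A):
    """(1/2)(A + A^T) for a square matrix A."""
    n = len(A)
    return [[Fraction(A[i][j] + A[j][i], 2) for j in range(n)]
            for i in range(n)]


def skew_part(A):
    """(1/2)(A - A^T) for a square matrix A."""
    n = len(A)
    return [[Fraction(A[i][j] - A[j][i], 2) for j in range(n)]
            for i in range(n)]


A = [[2, 1, 3], [2, -1, 1], [1, 0, 1]]
S = symmetric_part(A)
K = skew_part(A)

# Adding the two parts entry by entry should give back A.
total = [[s + k for s, k in zip(row_s, row_k)] for row_s, row_k in zip(S, K)]

print(transpose(S) == S)                                       # True
print(transpose(K) == [[-entry for entry in row] for row in K])  # True
print(total == A)                                              # True
```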

The second reason we care about them is that they are very useful and important in various areas of mathematics. The transpose of a matrix turns out to be an important operation; symmetric matrices have many nice properties that make solving certain types of problems possible.

Most of this text focuses on the preliminaries of matrix algebra, and the actual uses are beyond our current scope. One easy-to-describe example is curve fitting. Suppose we are given a large set of data points that, when plotted, look roughly quadratic. How do we find the quadratic that “best fits” this data? The solution can be found using matrix algebra, and specifically a matrix called the *pseudoinverse*. If \(A\) is a matrix, the pseudoinverse of \(A\) is the matrix \(A^{\dagger}=(A^{T}A)^{-1}A^{T}\) (assuming that the inverse exists). We aren’t going to worry about what all the above means; just notice that it has a cool sounding name and the transpose appears twice.
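For the curious, here is a rough Python sketch of that idea. It relies on assumptions not developed in this text: each data point \((x,y)\) contributes a row \([1,\: x,\: x^{2}]\) to \(A\), and multiplying \(A^{\dagger}\) by the column of \(y\)-values gives the coefficients of the least-squares quadratic. The data is made up and lies exactly on \(y=x^{2}+1\), so we know what answer to expect; all helper names are our own.

```python
from fractions import Fraction


def transpose(matrix):
    """Rows become columns."""
    return [list(column) for column in zip(*matrix)]


def multiply(X, Y):
    """Matrix product XY (sizes are assumed to be compatible)."""
    Y_columns = transpose(Y)
    return [[sum(x * y for x, y in zip(row, column)) for column in Y_columns]
            for row in X]


def inverse(matrix):
    """Invert a square matrix by Gauss-Jordan elimination on [M | I]."""
    n = len(matrix)
    # Augment each row with the matching row of the identity matrix.
    rows = [[Fraction(x) for x in row]
            + [Fraction(int(i == j)) for j in range(n)]
            for i, row in enumerate(matrix)]
    for col in range(n):
        # Find a row at or below the pivot position with a nonzero entry.
        pivot = next((r for r in range(col, n) if rows[r][col] != 0), None)
        if pivot is None:
            raise ValueError("matrix is not invertible")
        rows[col], rows[pivot] = rows[pivot], rows[col]
        pivot_value = rows[col][col]
        rows[col] = [x / pivot_value for x in rows[col]]
        # Clear the pivot column in every other row.
        for r in range(n):
            if r != col:
                factor = rows[r][col]
                rows[r] = [x - factor * p for x, p in zip(rows[r], rows[col])]
    return [row[n:] for row in rows]


def pseudoinverse(A):
    """(A^T A)^{-1} A^T, assuming the inverse exists."""
    A_T = transpose(A)
    return multiply(inverse(multiply(A_T, A)), A_T)


# Made-up data lying exactly on y = x^2 + 1 (for illustration only).
xs = [-2, -1, 0, 1, 2]
ys = [5, 2, 1, 2, 5]

A = [[1, x, x * x] for x in xs]  # one row [1, x, x^2] per data point
y_column = [[y] for y in ys]

coefficients = multiply(pseudoinverse(A), y_column)
print([int(c[0]) for c in coefficients])  # [1, 0, 1], i.e. y = 1 + 0x + 1x^2
```

Note that \(A^{T}A\) here is a symmetric \(3\times 3\) matrix even though \(A\) itself is \(5\times 3\).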

In the next section we’ll learn about the trace, another operation that can be performed on a matrix that is relatively simple to compute but can lead to some deep results.

## Footnotes

[1] Remember, this is what mathematicians do. We learn something new, and then we ask lots of questions about it. Often the first questions we ask are along the lines of “How does this new thing relate to the old things I already know about?”

[2] This is kind of fun to say, especially when said fast. Regardless of how fast we say it, we should think about this statement. The “is” represents “equals.” The stuff before “is” equals the stuff afterwards.

[3] Then again, maybe this isn’t all that “odd.” It is reminiscent of the fact that, when invertible, \((AB)^{-1}=B^{-1}A^{-1}\).

[4] Again, we should think about this statement. The part before “is” states that we take the transpose of a matrix, then find the inverse. The part after “is” states that we find the inverse of the matrix, then take the transpose. Since these two statements are linked by an “is,” they are equal.

[5] These examples don’t *prove* anything, other than it worked in specific examples.

[6] Some mathematicians use the term *antisymmetric*.

[7] Of course not.

[8] Why do we say that \(A\) has to be square?

[9] Or: “neat,” “cool,” “bad,” “wicked,” “phat,” “fo-shizzle.”