10.2: Change of Basis Transformation


\( \newcommand{\vecs}[1]{\overset { \scriptstyle \rightharpoonup} {\mathbf{#1}} } \) \( \newcommand{\vecd}[1]{\overset{-\!-\!\rightharpoonup}{\vphantom{a}\smash {#1}}} \)\(\newcommand{\id}{\mathrm{id}}\) \( \newcommand{\Span}{\mathrm{span}}\) \( \newcommand{\kernel}{\mathrm{null}\,}\) \( \newcommand{\range}{\mathrm{range}\,}\) \( \newcommand{\RealPart}{\mathrm{Re}}\) \( \newcommand{\ImaginaryPart}{\mathrm{Im}}\) \( \newcommand{\Argument}{\mathrm{Arg}}\) \( \newcommand{\norm}[1]{\| #1 \|}\) \( \newcommand{\inner}[2]{\langle #1, #2 \rangle}\)\(\newcommand{\AA}{\unicode[.8,0]{x212B}}\)

Recall that we can associate a matrix \(A\in \mathbb{F}^{n\times n}\) to every operator \(T\in\mathcal{L}(V,V)\). More precisely, the \(j^{\text{th}}\) column of the matrix \(A=M(T)\) with respect to a basis \(e=(e_1,\ldots,e_n)\) is obtained by expanding \(Te_j\) in terms of the basis \(e\). If the basis \(e\) is orthonormal, then the coefficient of \(e_i\) in this expansion is just the inner product \(\inner{Te_j}{e_i}\).
Hence,

\begin{equation*}
M(T) = (\inner{Te_j}{e_i})_{1\le i,j\le n},
\end{equation*}
where \(i\) is the row index and \(j\) is the column index of the matrix.

Conversely, if \(A\in \mathbb{F}^{n\times n}\) is a matrix, then we can associate a linear operator \(T\in\mathcal{L}(V,V)\) to \(A\) by setting
\begin{equation*}
\begin{split}
Tv &= \sum_{j=1}^n \inner{v}{e_j} Te_j
= \sum_{j=1}^n \sum_{i=1}^n \inner{Te_j}{e_i} \inner{v}{e_j} e_i\\
&= \sum_{i=1}^n \left( \sum_{j=1}^n a_{ij} \inner{v}{e_j} \right) e_i
= \sum_{i=1}^n (A [v]_e)_i e_i,
\end{split}
\end{equation*}

where \((A[v]_e)_i\) denotes the \(i^{\text{th}}\) component of the column vector \(A [v]_e\). With this construction, we have \(M(T)=A\).
The coefficients of \(Tv\) in the basis \((e_1,\ldots,e_n)\) are recorded by the column vector obtained by multiplying the \(n\times n\) matrix \(A\) with the \(n\times 1\) column vector \([v]_e\), whose components are \(([v]_e)_j=\inner{v}{e_j}\).

Example 10.2.1. Given

\begin{equation*}
A = \begin{bmatrix} 1 & -i\\ i & 1 \end{bmatrix},
\end{equation*}

we can define \(T\in \mathcal{L}(V,V)\) on \(V=\mathbb{C}^2\) with respect to the canonical basis as follows:

\begin{equation*}
T \begin{bmatrix} z_1\\ z_2 \end{bmatrix} = \begin{bmatrix} 1&-i\\ i&1\end{bmatrix}
\begin{bmatrix} z_1\\ z_2 \end{bmatrix} = \begin{bmatrix} z_1 -iz_2\\ iz_1+z_2\end{bmatrix}.
\end{equation*}
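
As a quick numerical check of the formula \(M(T) = (\inner{Te_j}{e_i})\), the following minimal Python sketch rebuilds the matrix of this example from the operator itself. The helper names `inner` and `T` are illustrative and not part of the text; the inner product is assumed to be linear in the first slot and conjugate-linear in the second, which matches \(([v]_e)_j=\inner{v}{e_j}\).

```python
# Inner product on C^n: linear in the first argument, conjugate-linear in
# the second (an assumption chosen to match ([v]_e)_j = <v, e_j>).
def inner(u, v):
    return sum(a * b.conjugate() for a, b in zip(u, v))

# The operator T of this example, written as a function on C^2.
def T(z):
    z1, z2 = z
    return [z1 - 1j * z2, 1j * z1 + z2]

# Canonical (orthonormal) basis of C^2.
e = [[1, 0], [0, 1]]

# Entry (i, j) of M(T) is <T e_j, e_i>: i is the row, j is the column.
M = [[inner(T(e[j]), e[i]) for j in range(2)] for i in range(2)]

print(M)
```

The computed `M` agrees entry by entry with the matrix \(A\) given at the start of the example.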

Suppose that we want to use another orthonormal basis \(f=(f_1,\ldots,f_n)\) for \(V\). Then, as before, we have \(v = \sum_{i=1}^n \inner{v}{f_i} f_i.\) Comparing this with \(v = \sum_{j=1}^n \inner{v}{e_j} e_j\), we find that

\begin{equation*}
v = \sum_{i,j=1}^n \inner{\inner{v}{e_j}e_j}{f_i} f_i
= \sum_{i=1}^n \left( \sum_{j=1}^n \inner{e_j}{f_i}\inner{v}{e_j}\right) f_i.
\end{equation*}

Hence,

\begin{equation*}
[v]_f = S [v]_e,
\end{equation*}

where

\begin{equation*}
S=(s_{ij})_{i,j=1}^n \qquad \text{with \(s_{ij}=\inner{e_j}{f_i}\).}
\end{equation*}

The \(j^{\text{th}}\) column of \(S\) is given by the coefficients of the expansion of \(e_j\) in terms of the basis \(f=(f_1,\ldots,f_n)\). The matrix \(S\) describes a linear map in \(\mathcal{L}(\mathbb{F}^n)\), which is called the change of basis transformation.
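
The relation \([v]_f = S[v]_e\) can be checked numerically. The sketch below is only an illustration: the basis `f`, the vector `v`, and all function names are hypothetical choices, not taken from the text, and the inner product is again assumed to be linear in its first argument.

```python
import math

def inner(u, v):
    # <u, v> = sum_k u_k * conj(v_k)
    return sum(a * b.conjugate() for a, b in zip(u, v))

def change_of_basis(e, f):
    # s_ij = <e_j, f_i>: column j holds the coordinates of e_j in the basis f.
    n = len(e)
    return [[inner(e[j], f[i]) for j in range(n)] for i in range(n)]

def coords(v, basis):
    # Coordinates of v in an orthonormal basis: ([v]_b)_j = <v, b_j>.
    return [inner(v, b) for b in basis]

def matvec(A, x):
    return [sum(a * xk for a, xk in zip(row, x)) for row in A]

r = 1 / math.sqrt(2)
e = [[1, 0], [0, 1]]
f = [[r, 1j * r], [r, -1j * r]]  # illustrative orthonormal basis of C^2

S = change_of_basis(e, f)
v = [2, 3j]

v_f_direct = coords(v, f)             # [v]_f computed from inner products
v_f_via_S = matvec(S, coords(v, e))   # S [v]_e

max_err = max(abs(a - b) for a, b in zip(v_f_direct, v_f_via_S))
print(max_err)
```

Up to floating-point rounding, the two coordinate vectors coincide, so `max_err` is at the level of machine precision.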

We may also interchange the role of bases \(e\) and \(f\). In this case, we obtain the
matrix \(R=(r_{ij})_{i,j=1}^n\), where
\begin{equation*}
r_{ij} = \inner{f_j}{e_i}.
\end{equation*}

Then, by the uniqueness of the expansion in a basis, we obtain
\begin{equation*}
[v]_e = R[v]_f
\end{equation*}
so that
\begin{equation*}
RS[v]_e = [v]_e, \qquad \text{for all \(v\in V\).}
\end{equation*}

Since this equation holds for all \([v]_e\in \mathbb{F}^n\), it follows that \(RS=I\), or equivalently \(R=S^{-1}\). In particular, \(S\) and \(R\) are invertible. We can also check this explicitly by using the properties of orthonormal bases. Namely,
\begin{equation*}
\begin{split}
(RS)_{ij} &= \sum_{k=1}^n r_{ik} s_{kj} = \sum_{k=1}^n \inner{f_k}{e_i}\inner{e_j}{f_k}\\
&= \sum_{k=1}^n \inner{e_j}{f_k} \overline{\inner{e_i}{f_k}}
= \inner{[e_j]_f}{[e_i]_f}_{\mathbb{F}^n} = \delta_{ij}.
\end{split}
\end{equation*}

The matrix \(S\) (and similarly \(R\)) has the interesting property that its columns are orthonormal. This follows from the fact that the columns are the coordinate vectors of orthonormal vectors with respect to another orthonormal basis. A similar statement holds for the rows of \(S\) (and of \(R\)).
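
One way to see the statement about rows, using only the definitions of \(r_{ij}\) and \(s_{ij}\) and the conjugate symmetry of the inner product, is the following short computation:
\begin{equation*}
r_{ij} = \inner{f_j}{e_i} = \overline{\inner{e_i}{f_j}} = \overline{s_{ji}},
\end{equation*}
so \(R\) is the conjugate transpose of \(S\). Since \(R=S^{-1}\), we also have \(SR=I\), and hence
\begin{equation*}
\delta_{ij} = (SR)_{ij} = \sum_{k=1}^n s_{ik} r_{kj} = \sum_{k=1}^n s_{ik}\overline{s_{jk}},
\end{equation*}
which says precisely that the rows of \(S\) are orthonormal in \(\mathbb{F}^n\).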

Example 10.2.2. Let \(V=\mathbb{C}^2\), and choose the orthonormal bases \(e=(e_1,e_2)\) and \(f=(f_1,f_2)\) with
\begin{align*}
e_1 &= \begin{bmatrix} 1\\ 0 \end{bmatrix}, &
e_2 &= \begin{bmatrix} 0\\ 1 \end{bmatrix},\\
f_1 &= \frac{1}{\sqrt{2}} \begin{bmatrix} 1\\ 1\end{bmatrix}, &
f_2 &= \frac{1}{\sqrt{2}} \begin{bmatrix} -1\\ 1\end{bmatrix}.
\end{align*}
Then
\begin{equation*}
S = \begin{bmatrix} \inner{e_1}{f_1} & \inner{e_2}{f_1}\\
\inner{e_1}{f_2} & \inner{e_2}{f_2} \end{bmatrix}
= \frac{1}{\sqrt{2}} \begin{bmatrix} 1 & 1\\ -1 & 1 \end{bmatrix}
\end{equation*}
and
\begin{equation*}
R = \begin{bmatrix} \inner{f_1}{e_1} & \inner{f_2}{e_1}\\
\inner{f_1}{e_2} & \inner{f_2}{e_2} \end{bmatrix}
= \frac{1}{\sqrt{2}} \begin{bmatrix} 1 & -1\\ 1 & 1 \end{bmatrix}.
\end{equation*}
One can then check explicitly that indeed
\begin{equation*}
RS = \frac{1}{2} \begin{bmatrix} 1 & -1\\ 1 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 1\\ -1 & 1 \end{bmatrix}
= \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} = I.
\end{equation*}
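
The matrices of this example can also be reproduced numerically. The following is a minimal sketch in plain Python; the helper names are illustrative and the inner product is assumed to be linear in its first argument.

```python
import math

def inner(u, v):
    # <u, v> = sum_k u_k * conj(v_k)
    return sum(a * b.conjugate() for a, b in zip(u, v))

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

r = 1 / math.sqrt(2)
e = [[1, 0], [0, 1]]
f = [[r, r], [-r, r]]
n = 2

# s_ij = <e_j, f_i> and r_ij = <f_j, e_i>, as defined in the text.
S = [[inner(e[j], f[i]) for j in range(n)] for i in range(n)]
R = [[inner(f[j], e[i]) for j in range(n)] for i in range(n)]

RS = matmul(R, S)

# Largest deviation of RS from the identity matrix.
identity_err = max(abs(RS[i][j] - (1 if i == j else 0))
                   for i in range(n) for j in range(n))
print(S, R, identity_err)
```

Here `identity_err` is zero up to floating-point rounding, in line with \(RS=I\).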

So far we have only discussed how the coordinate vector of a given vector \(v\in V\) changes under the change of basis from \(e\) to \(f\). The next question we can ask is how the matrix \(M(T)\) of an operator \(T\in \mathcal{L}(V)\) changes if we change the basis. Let \(A\) be the matrix of \(T\) with respect to the basis \(e=(e_1,\ldots,e_n)\), and let \(B\) be the matrix for \(T\) with respect to the basis \(f=(f_1,\ldots,f_n)\). How do we determine \(B\) from \(A\)? Note that

\begin{equation*}
[Tv]_e = A[v]_e
\end{equation*}
so that
\begin{equation*}
[Tv]_f = S [Tv]_e = SA [v]_e = SAR[v]_f = SAS^{-1} [v]_f.
\end{equation*}

This implies that
\begin{equation*}
B = SAS^{-1}.
\end{equation*}

Example 10.2.3. Continuing Example 10.2.2, let
\begin{equation*}
A = \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix}
\end{equation*}
be the matrix of a linear operator with respect to the basis \(e\). Then the matrix \(B\) with respect to the basis \(f\) is given by
\begin{equation*}
B = SAS^{-1} = \frac{1}{2} \begin{bmatrix} 1 & 1\\ -1 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 1\\ 1 & 1 \end{bmatrix}
\begin{bmatrix} 1 & -1\\ 1 & 1 \end{bmatrix}
= \frac{1}{2} \begin{bmatrix} 1 & 1\\ -1 & 1 \end{bmatrix}
\begin{bmatrix} 2 & 0\\ 2 & 0 \end{bmatrix}
= \begin{bmatrix} 2 & 0\\ 0 & 0 \end{bmatrix}.
\end{equation*}
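
The product \(SAS^{-1}\) in this example can be checked with a few lines of Python. This is a sketch under the assumption that \(S\) and \(R=S^{-1}\) are entered directly from Example 10.2.2; the helper `matmul` is illustrative.

```python
import math

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

r = 1 / math.sqrt(2)
S = [[r, r], [-r, r]]   # change of basis from e to f (Example 10.2.2)
R = [[r, -r], [r, r]]   # R = S^{-1}
A = [[1, 1], [1, 1]]    # matrix of the operator with respect to e

# B = S A S^{-1} is the matrix of the same operator with respect to f.
B = matmul(matmul(S, A), R)

# Round away floating-point noise before printing.
B_rounded = [[round(x, 12) for x in row] for row in B]
print(B_rounded)
```

In the basis \(f\) the matrix is diagonal, which reflects that \(f_1\) and \(f_2\) are eigenvectors of this operator with eigenvalues \(2\) and \(0\).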

Contributors

Both hardbound and softbound versions of this textbook are available online at WorldScientific.com.