3.8: Matrix exponentials

Last updated
Save as PDF

Page ID: 368

\( \newcommand{\vecs}[1]{\overset { \scriptstyle \rightharpoonup} {\mathbf{#1}} } \) \( \newcommand{\vecd}[1]{\overset{-\!-\!\rightharpoonup}{\vphantom{a}\smash {#1}}} \)\(\newcommand{\id}{\mathrm{id}}\) \( \newcommand{\Span}{\mathrm{span}}\) \( \newcommand{\kernel}{\mathrm{null}\,}\) \( \newcommand{\range}{\mathrm{range}\,}\) \( \newcommand{\RealPart}{\mathrm{Re}}\) \( \newcommand{\ImaginaryPart}{\mathrm{Im}}\) \( \newcommand{\Argument}{\mathrm{Arg}}\) \( \newcommand{\norm}[1]{\| #1 \|}\) \( \newcommand{\inner}[2]{\langle #1, #2 \rangle}\) \( \newcommand{\Span}{\mathrm{span}}\) \(\newcommand{\id}{\mathrm{id}}\) \( \newcommand{\Span}{\mathrm{span}}\) \( \newcommand{\kernel}{\mathrm{null}\,}\) \( \newcommand{\range}{\mathrm{range}\,}\) \( \newcommand{\RealPart}{\mathrm{Re}}\) \( \newcommand{\ImaginaryPart}{\mathrm{Im}}\) \( \newcommand{\Argument}{\mathrm{Arg}}\) \( \newcommand{\norm}[1]{\| #1 \|}\) \( \newcommand{\inner}[2]{\langle #1, #2 \rangle}\) \( \newcommand{\Span}{\mathrm{span}}\)\(\newcommand{\AA}{\unicode[.8,0]{x212B}}\)

Definition

In this section we present a different way of finding the fundamental matrix solution of a system. Suppose that we have the constant coefficient equation

\[\vec{x}' = P \vec{x} \nonumber \]

as usual. Now suppose that this was one equation (\(P\) is a number or a \( 1 \times 1 \) matrix). Then the solution to this would be

\[\vec{x} = e^{Pt}. \nonumber \]

That doesn’t make sense if \(P\) is a larger matrix, but essentially the same computation that led to the above works for matrices when we define \(e^{Pt}\) properly. First let us write down the Taylor series for \(e^{at} \)for some number \(a\).

\[e^{at} = 1 + at + \frac{(at)^2}{2} + \frac{(at)^3}{6} + \frac{(at)^4}{24} + \cdots = \sum_{k=0}^{\infty} \frac{(at)^k}{k!} \nonumber \]

Recall \( k! = 1 \cdot 2\cdot 3 \cdots k \) is the factorial, and \(0! = 1 \). We differentiate this series term by term

\[\frac{\mathrm{d}}{\mathrm{d} t} \left(e^{at}\right) = a + a^2 t + \frac{a^3 t^2}{2} + \frac{a^4 t^3}{6} + \cdots = a \left( 1 + at \frac{(at)^2}{2} + \frac{(at)^3}{6} + \cdots \right) = a e^{at}. \nonumber \]

Maybe we can try the same trick with matrices. Suppose that for an \( n \times n \) matrix \(A\) we define the matrix exponential as

\[ e^A \stackrel{\text{def}}{=} \mathit{I} + A + \frac{1}{2} A^2 + \frac{1}{6} A^3 + \cdots + \frac{1}{k!} A^k + \cdots \nonumber \]

Let us not worry about convergence. The series really does always converge. We usually write \(Pt \)as \(tP\) by convention when \(P\) is a matrix. With this small change and by the exact same calculation as above we have that

\[ \frac{\mathrm{d}}{\mathrm{d}t} \left ( e^{tP} \right ) = P e^{tP}. \nonumber \]

Now \(P\) and hence \( e^{tP} \) is an \( n \times n \) matrix. What we are looking for is a vector. We note that in the \( 1 \times 1 \) case we would at this point multiply by an arbitrary constant to get the general solution. In the matrix case we multiply by a column vector \( \vec{c} \).

Theorem \(\PageIndex{1}\)

Let \(P\) be an \(n \times n\) matrix. Then the general solution to \( \vec{x}' = P \vec{x} \) is

\[\vec{x} = e^{tP} \vec{c} , \nonumber \]

where \( \vec{c} \) is an arbitrary constant vector. In fact \( \vec{x}(0) = \vec{c} \).

Let us check.

\[ \frac{\mathrm{d}}{\mathrm{d}t} \vec{x} = \frac{\mathrm{d}}{\mathrm{d}t} \left( e^{tP} \vec{c} \right) = P e^{tP} \vec{c} = P \vec{x}. \nonumber \]

Hence \(e^{tP} \)is the fundamental matrix solution of the homogeneous system. If we find a way to compute the matrix exponential, we will have another method of solving constant coefficient homogeneous systems. It also makes it easy to solve for initial conditions. To solve \(\vec{x}' = A \vec{x} \), \(\vec{x}(0) = \vec{b} \) we take the solution

\[ \vec{x} = e^{tA} \vec{b} \nonumber \]

This equation follows because \(e^{0A} = \mathit{I} \), so \(\vec{x}(0) = e^{0A}\vec{b} = \vec{b} \).

We mention a drawback of matrix exponentials. In general \(e^{A+B} \neq e^A e^B \). The trouble is that matrices do not commute, that is, in general \(AB \neq BA\). If you try to prove \( e^{A+B} \neq e^A e^B \) using the Taylor series, you will see why the lack of commutativity becomes a problem. However, it is still true that if \(AB=BA\), that is, if \(A\) and \(B\) commute, then \(e^{A+B} = e^A e^B \). We will find this fact useful. Let us restate this as a theorem to make a point.

Theorem \(\PageIndex{2}\)

If \(AB = BA\), then \(e^{A+B} = e^Ae^B\). Otherwise \(e^{A+B} \neq e^Ae^B\) in general.

Simple cases

In some instances it may work to just plug into the series definition. Suppose the matrix is diagonal. For example, \( D = \begin{bmatrix} a&0 \\ 0&b \end{bmatrix} \). Then

\[D^k = \begin{bmatrix} a^k & 0 \\ 0&b^k \end{bmatrix} \nonumber \]

and

\[\begin{align}\begin{aligned} e^D &= \mathit{I} +D+\frac{1}{2}D^2+\frac{1}{6}D^3+\cdots \\ &=\begin{bmatrix}1&0\\0&1\end{bmatrix}+\begin{bmatrix}a&0\\0&b\end{bmatrix}+\frac{1}{2}\begin{bmatrix}a^2&0\\0&b^2\end{bmatrix}+\frac{1}{6}\begin{bmatrix}a^3&0\\0&b^3\end{bmatrix}+\cdots = \begin{bmatrix}e^a&0\\0&e^b\end{bmatrix}\end{aligned}\end{align} \nonumber \]

So by this rationale we have that

\[e^{\mathit{I}} =\begin{bmatrix}e&0\\0&e \end{bmatrix} \quad\text{and}\quad e^{a\mathit{I}} =\begin{bmatrix} e^a&0\\0&e^a\end{bmatrix} \nonumber \]

This makes exponentials of certain other matrices easy to compute. Notice for example that the matrix \(A=\begin{bmatrix}5&4\\-1&1\end{bmatrix} \) can be written as \(3\mathit{I+B}\) where \(\mathit{B} =\begin{bmatrix}2&4\\-1&-2\end{bmatrix} \). Notice that \(\mathit{B^2} =\begin{bmatrix}0&0\\0&0\end{bmatrix} \). So \(\mathit{B^k} = 0 \) for all \(k \geq 2 \). Therefore, \(e^B = \mathit{I+B} \). Suppose we actually want to compute \(e^{tA} \). The matrices \(3\mathit{tI} \) and \(tB\) commute (exercise: check this) and \(e^{tB}=\mathit{I+tB} \), since \( \left( tB \right)^2 = t^2 B^2 = 0 \). We write

\[\begin{align}\begin{aligned} e^{tA} &= e^{\mathit{3t I}+t\mathit{B}} = e^{3t\mathit{I}} e^{t\mathit{B}} = \begin{bmatrix} e^{3t}&0\\0&e^{3t}\end{bmatrix} \left(I +tB \right) \\ &= \begin{bmatrix} e^{3t}&0\\0&e^{3t}\end{bmatrix} \begin{bmatrix}1+2t&4t\\-t&1-2t\end{bmatrix} =\begin{bmatrix}\left(1+2t \right)e^{3t}&4te^{3t}\\-te^{3t}&\left(1-2t \right) e^{3t} \end{bmatrix}\end{aligned}\end{align} \nonumber \]

So we have found the fundamental matrix solution for the system \(\vec{x}'= A\vec{x}\). Note that this matrix has a repeated eigenvalue with a defect; there is only one eigenvector for the eigenvalue 3. So we have found a perhaps easier way to handle this case. In fact, if a matrix \(A\) is \(2\times 2\) and has an eigenvalue \(\lambda\) of multiplicity 2, then either \(A\) is diagonal, or \(A =\lambda\mathit{I} +B \) where \( B^2=0 \). This is a good exercise.

Exercise \(\PageIndex{1}\)

Suppose that \(A\) is \(2\times 2\) and \(\lambda \) is the only eigenvalue. Then show that \( \left( A -\lambda \mathit{I} \right)^2 =0 \). Then we can write \(A = \lambda\mathit{I} + B\), where \(B^2 = 0 \). Hint: First write down what does it mean for the eigenvalue to be of multiplicity \(2\). You will get an equation for the entries. Now compute the square of \( B\).

Matrices \(B\) such that \(B^k = 0\) for some \(k\) are called nilpotent. Computation of the matrix exponential for nilpotent matrices is easy by just writing down the first \(k\) terms of the Taylor series.

General Matrices

In general, the exponential is not as easy to compute as above. We usually cannot write a matrix as a sum of commuting matrices where the exponential is simple for each one. But fear not, it is still not too difficult provided we can find enough eigenvectors. First we need the following interesting result about matrix exponentials. For two square matrices \(A\) and \(B\), with \(B\) invertible, we have

\[ e^{BAB^{-1}} = Be^AB^{-1}. \nonumber \]

This can be seen by writing down the Taylor series. First note that

\[\left(B A B^{-1} \right)^2 = B A B^{-1} B A B^{-1} = BA\mathit{I}AB^{-1} = BA^2B^{-1} \nonumber \]

And hence by the same reasoning \( \left( BAB^{-1} \right)^k = B A^k B^{-1} \). Now write down the Taylor series for \(e^{BAB^{-1}}\).

\[\begin{align}\begin{aligned} e^{BAB^{-1}} &= \mathit{I} + BAB^{-1} +\frac{1}{2}\left(BAB^{-1}\right)^2+ \frac{1}{6} \left( BAB^{-1} \right)^3 +\cdots \\ &= BB^{-1} +BAB^{-1} +\frac{1}{2}BA^2 B^{-1} +\frac{1}{6}BA^3 B^{-1} +\cdots \\ &=B\left( \mathit{I} + A +\frac{1}{2}A^2+\frac{1}{6}A^3 +\cdots \right)B^{-1} \\ &= Be^A B^{-1}.\end{aligned}\end{align} \nonumber \]

Given a square matrix \(A\), we can sometimes write \(A=EDE^{-1}\), where \(D\) is diagonal and \(E\) invertible. This procedure is called diagonalization. If we can do that, the computation of the exponential becomes easy. Adding \(t\) into the mix we see that we can then easily compute the exponential

\[ e^{tA}=Ee^{tD}E^{-1}. \nonumber \]

To diagonalize \(A\) we will need \(n\) linearly independent eigenvectors of \(A\). Otherwise this method of computing the exponential does not work and we need to be trickier, but we will not get into such details. We let \(E\) be the matrix with the eigenvectors as columns. Let \(\lambda_1,\:\lambda_{2}, \cdots, \lambda_n \) be the eigenvalues and let \(\vec{v}_1,\:\vec{v_{2}}, \cdots, \vec{v}_n \) be the eigenvectors, then \( E=\begin{bmatrix} \vec{v}_1~~ \vec{v}_2~~ \cdots~~ \vec{v}_n \end{bmatrix} \). Let \(D\) be the diagonal matrix with the eigenvalues on the main diagonal. That is

\[D= \begin{bmatrix} \lambda_1&0 &\cdots &0\\0&\lambda_2& \cdots& 0\\ \vdots&\vdots & \ddots &\vdots \\0&0&\cdots&\lambda_{n} \end{bmatrix} \nonumber \]

We compute

\[\begin{align}\begin{aligned} AE &= A\begin{bmatrix} \vec{v}_1&\vec{v}_2&\cdots &\vec{v}_n \end{bmatrix} \\ &= \begin{bmatrix} A\vec{v}_1&A\vec{v}_2&\cdots&A\vec{v}_3\end{bmatrix} \\ &=\begin{bmatrix} \lambda_1\vec{v}_1&\lambda_2\vec{v}_2 &\cdots& \lambda_n\vec{v}_n \end{bmatrix} \\ &=\begin{bmatrix} \vec{v}_1&\vec{v}_2&\cdots &\vec{v}_n\end{bmatrix} D \\ &=ED.\end{aligned}\end{align} \nonumber \]

The columns of \(E\) are linearly independent as these are linearly independent eigenvectors of \(A\). Hence \(E\) is invertible. Since \(AE=ED\), we right multiply by \(E^{-1}\) and we get

\[ A=EDE^{-1}. \nonumber \]

This means that . \(e^A = E e^D E^{-1} \) Multiplying the matrix by \(t\) we obtain

\[\label{eq:21} e^{tA} = E e^{tD} E^{-1} = E \begin{bmatrix}e^{\lambda_1 t}&0&\cdots&0\\ 0&e^{\lambda_2 t}&\cdots &0 \\ \vdots&\vdots&\ddots&\vdots\\ 0&0&\cdots&e^{\lambda_n t } \end{bmatrix} E^{-1} \]

The formula \(\eqref{eq:21}\), therefore, gives the formula for computing the fundamental matrix solution \(e^{tA} \) for the system \(\vec{x}' = A\vec{x} \), in the case where we have \(n\) linearly independent eigenvectors.

Notice that this computation still works when the eigenvalues and eigenvectors are complex, though then you will have to compute with complex numbers. It is clear from the definition that if \(A\) is real, then \(e^{tA} \) is real. So you will only need complex numbers in the computation and you may need to apply Euler’s formula to simplify the result. If simplified properly the final matrix will not have any complex numbers in it.

Example \(\PageIndex{1}\)

Compute the fundamental matrix solution using the matrix exponentials for the system

\[\begin{bmatrix} x\\y \end{bmatrix}' = \begin{bmatrix} 1&2\\ 2&1 \end{bmatrix} \begin{bmatrix} x\\y \end{bmatrix}. \nonumber \]

Then compute the particular solution for the initial conditions \(x(0) = 4 \) and \(y(0) = 2 \).

Let \( A \) be the coefficient matrix \(\begin{bmatrix} 1&2 \\ 2&1 \end{bmatrix} \). We first compute (exercise) that the eigenvalues are 3 and -1 and corresponding eigenvectors are \(\begin{bmatrix} 1\\1 \end{bmatrix} \) and \(\begin{bmatrix} 1\\-1 \end{bmatrix} \). Hence the diagonalization of \(A\) is

\[\underbrace{ \begin{bmatrix} 1 & 2 \\ 2 & 1 \end{bmatrix} }_{A} = \underbrace{ \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix} }_{E} \underbrace{ \begin{bmatrix} 3 & 0 \\ 0 & -1 \end{bmatrix} }_{D} \underbrace{ \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}^{-1} }_{E^{-1}} . \nonumber \]

We write

\[\begin{align}\begin{aligned} e^{t A} = E e^{tD} E^{-1} & = \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix} \begin{bmatrix} e^{3t} & 0 \\ 0 & e^{-t} \end{bmatrix} \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}^{-1} \\ & = \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix} \begin{bmatrix} e^{3t} & 0 \\ 0 & e^{-t} \end{bmatrix} \frac{-1}{2} \begin{bmatrix} -1 & -1 \\ -1 & 1 \end{bmatrix} \\ & = \frac{-1}{2} \begin{bmatrix} e^{3t} & e^{-t} \\ e^{3t} & -e^{-t} \end{bmatrix} \begin{bmatrix} -1 & -1 \\ -1 & 1 \end{bmatrix} \\ & = \frac{-1}{2} \begin{bmatrix} -e^{3t}-e^{-t} & -e^{3t}+e^{-t} \\ -e^{3t}+e^{-t} & -e^{3t}-e^{-t} \end{bmatrix} = \begin{bmatrix} \frac{e^{3t}+e^{-t}}{2} & \frac{e^{3t}-e^{-t}}{2} \\ \frac{e^{3t}-e^{-t}}{2} & \frac{e^{3t}+e^{-t}}{2} \end{bmatrix} . \end{aligned}\end{align} \nonumber \]

The initial conditions are \(x(0) = 4 \) and \(y(0) = 2 \). Hence, by the property that \(e^{0A} = \mathit{I} \) we find that the particular solution we are looking for is \(e^{tA} \vec{b} \) where \(\vec{b} \) is \(\begin{bmatrix}4\\2 \end{bmatrix} \). Then the particular solution we are looking for is

\[\begin{bmatrix} x\\y\end{bmatrix} =\begin{bmatrix}\frac{e^{3t}+e^{-t}}{2}&\frac{e^{3t}-e^{-t}}{2}\\ \frac{e^{3t}-e^{-t}}{2}&\frac{e^{3t}+e^{-t}}{2} \end{bmatrix} \begin{bmatrix} 4\\2\end{bmatrix} =\begin{bmatrix}2e^{3t}+2e^{-t}+e^{3t}-e^{-t} \\ 2e^{3t}-2e^{-t}+e^{3t} +e^{-t} \end{bmatrix} = \begin{bmatrix} 3e^{3t}+e^{-t} \\ 3e^{3t}-e^{-t} \end{bmatrix} \nonumber \]

Fundamental Matrix Solutions

We note that if you can compute the fundamental matrix solution in a different way, you can use this to find the matrix exponential \( e^{tA} \). The fundamental matrix solution of a system of ODEs is not unique. The exponential is the fundamental matrix solution with the property that for \(t = 0\) we get the identity matrix. So we must find the right fundamental matrix solution. Let \(X\) be any fundamental matrix solution to \( \vec{x}' = A \vec{x} \). Then we claim

\[e^{tA} = X(t) [X(0)]^{-1}. \nonumber \]

Clearly, if we plug \(t = 0\) into \( X(t)[X(0)]^{-1} \) we get the identity. We can multiply a fundamental matrix solution on the right by any constant invertible matrix and we still get a fundamental matrix solution. All we are doing is changing what the arbitrary constants are in the general solution \(\vec{x}(t) = X(t)\vec{c} \).

Approximations

If you think about it, the computation of any fundamental matrix solution \(X\) using the eigenvalue method is just as difficult as the computation of \( e^{tA} \). So perhaps we did not gain much by this new tool. However, the Taylor series expansion actually gives us a very easy way to approximate solutions, which the eigenvalue method did not.

The simplest thing we can do is to just compute the series up to a certain number of terms. There are better ways to approximate the exponential\(^{1}\). In many cases however, few terms of the Taylor series give a reasonable approximation for the exponential and may suffice for the application. For example, let us compute the first \(4\) terms of the series for the matrix \(A=\left[\begin{array}{c}{1}&{2}\\{2}&{1}\end{array}\right]\).

\[\begin{gathered} e^{tA} \approx I + tA + \frac{t^2}{2}A^2 + \frac{t^3}{6}A^3 = I + t \begin{bmatrix} 1 & 2 \\ 2 & 1 \end{bmatrix} + t^2 \begin{bmatrix} \frac{5}{2} & 2 \\ 2 & \frac{5}{2} \end{bmatrix} + t^3 \begin{bmatrix} \frac{13}{6} & \frac{7}{3} \\ \frac{7}{3} & \frac{13}{6} \end{bmatrix} = \\ = \begin{bmatrix} 1 + t + \frac{5}{2}\, t^2 + \frac{13}{6}\, t^3 & 2\,t + 2\, t^2 + \frac{7}{3}\, t^3 \\ 2\,t + 2\, t^2 + \frac{7}{3}\, t^3 & 1 + t + \frac{5}{2}\, t^2 + \frac{13}{6}\, t^3 \end{bmatrix} .\end{gathered} \nonumber \]

Just like the scalar version of the Taylor series approximation, the approximation will be better for small \(t\) and worse for larger \(t\). For larger \(t\), we will generally have to compute more terms. Let us see how we stack up against the real solution with \(t = 0.1\). The approximate solution is approximately (rounded to \(8\) decimal places)

\[e^{0.1A} \approx \mathit{I} + 0.1A + \frac{0.1^2}{2} + \frac{0.1^3}{6} A^3 = \begin{bmatrix} 1.12716667& 0.22233333\\0.22233333&1.12716667 \end{bmatrix} \nonumber \]

And plugging \(t = 0.1\) into the real solution (rounded to \(8\) decimal places) we get

\[e^{0.1A} = \begin{bmatrix} 1.12734811&0.22251069\\ 0.22251069&1.12734811 \end{bmatrix} \nonumber \]

Not bad at all! Although if we take the same approximation for \(t = 1\) we get

\[ \mathit{I} + A +\frac{1}{2} A^2 +\frac{1}{6} A^3 = \begin{bmatrix} 6.66666667&6.33333333\\ 6.33333333&6.66666667\end{bmatrix} \nonumber \]

while the real value is (again rounded to \(8\) decimal places)

\[ e^A = \begin{bmatrix}10.22670818&9.85882874\\9.85882874&10.22670818\end{bmatrix} \nonumber \]

So the approximation is not very good once we get up to \(t=1\). To get a good approximation at \(t=1\) (say up to \(2\) decimal places) we would need to go up to the \(11^{th} \) power (exercise).

Footnotes

[1] C. Moler and C.F. Van Loan, Nineteen Dubious Ways to Compute the Exponential of a Matrix, Twenty-Five Years Later, SIAM Review 45 (1), 2003, 3–49