
Contents

Preface

1 Introduction
1.1 Review of the First Course
1.1.1 First Order Differential Equations
1.1.2 Second Order Linear Differential Equations
1.1.3 Constant Coefficient Equations
1.1.4 Method of Undetermined Coefficients
1.1.5 Cauchy-Euler Equations
1.2 Overview of the Course
1.3 Appendix: Reduction of Order and Complex Roots
Problems

2 Systems of Differential Equations
2.1 Introduction
2.2 Equilibrium Solutions and Nearby Behaviors
2.2.1 Polar Representation of Spirals
2.3 Matrix Formulation
2.4 Eigenvalue Problems
2.5 Solving Constant Coefficient Systems in 2D
2.6 Examples of the Matrix Method
2.6.1 Planar Systems - Summary
2.7 Theory of Homogeneous Constant Coefficient Systems
2.8 Nonhomogeneous Systems
2.9 Applications
2.9.1 Spring-Mass Systems
2.9.2 Electrical Circuits
2.9.3 Love Affairs
2.9.4 Predator Prey Models
2.9.5 Mixture Problems
2.9.6 Chemical Kinetics
2.9.7 Epidemics
2.10 Appendix: Diagonalization and Linear Systems
Problems

3 Nonlinear Systems
3.1 Introduction
3.2 Autonomous First Order Equations
3.3 Solution of the Logistic Equation
3.4 Bifurcations for First Order Equations
3.5 Nonlinear Pendulum
3.5.1 In Search of Solutions
3.6 The Stability of Fixed Points in Nonlinear Systems
3.7 Nonlinear Population Models
3.8 Limit Cycles
3.9 Nonautonomous Nonlinear Systems
3.9.1 Maple Code for Phase Plane Plots
3.10 Appendix: Period of the Nonlinear Pendulum
Problems

4 Boundary Value Problems
4.1 Introduction
4.2 Partial Differential Equations
4.2.1 Solving the Heat Equation
4.3 Connections to Linear Algebra
4.3.1 Eigenfunction Expansions for PDEs
4.3.2 Eigenfunction Expansions for Nonhomogeneous ODEs
4.3.3 Linear Vector Spaces
Problems

5 Fourier Series
5.1 Introduction
5.2 Fourier Trigonometric Series
5.3 Fourier Series Over Other Intervals
5.3.1 Fourier Series on [a,b]
5.4 Sine and Cosine Series
5.5 Appendix: The Gibbs Phenomenon
Problems

6 Sturm-Liouville Eigenvalue Problems
6.1 Introduction
6.2 Properties of Sturm-Liouville Eigenvalue Problems
6.2.1 Adjoint Operators
6.2.2 Lagrange’s and Green’s Identities
6.2.3 Orthogonality and Reality
6.2.4 The Rayleigh Quotient
6.3 The Eigenfunction Expansion Method
6.4 The Fredholm Alternative Theorem
Problems

7 Special Functions
7.1 Classical Orthogonal Polynomials
7.2 Legendre Polynomials
7.2.1 The Rodrigues Formula
7.2.2 Three Term Recursion Formula
7.2.3 The Generating Function
7.2.4 Eigenfunction Expansions
7.3 Gamma Function
7.4 Bessel Functions
7.5 Hypergeometric Functions
7.6 Appendix: The Binomial Expansion
Problems

8 Green’s Functions
8.1 The Method of Variation of Parameters
8.2 Initial and Boundary Value Green’s Functions
8.2.1 Initial Value Green’s Function
8.2.2 Boundary Value Green’s Function
8.3 Properties of Green’s Functions
8.3.1 The Dirac Delta Function
8.3.2 Green’s Function Differential Equation
8.4 Series Representations of Green’s Functions
Problems

1 Introduction

These are notes for a second course in differential equations originally taught in the Spring semester of 2005 at the University of North Carolina Wilmington to upper level and first year graduate students and later updated in Fall 2007 and Fall 2008. It is assumed that you have had an introductory course in differential equations. However, we will begin this chapter with a review of some of the material from your first course in differential equations and then give an overview of the material we are about to cover.

Typically an introductory course in differential equations introduces students to analytical solutions of first order differential equations which are separable, to first order linear differential equations, and sometimes to some other special types of equations. Students then explore the theory of second order differential equations, generally restricted to the study of exact solutions of constant coefficient linear differential equations or even equations of the Cauchy-Euler type. These are later followed by the study of special techniques, such as power series methods or Laplace transform methods. If time permits, one explores a few special functions, such as Legendre polynomials and Bessel functions, while exploring power series methods for solving differential equations.

More recently, variations on this inventory of topics have been introduced through the early introduction of systems of differential equations, qualitative studies of these systems and a more intense use of technology for understanding the behavior of solutions of differential equations. This is typically done at the expense of not covering power series methods, special functions, or Laplace transforms. In either case, the types of problems solved are initial value problems in which the differential equation to be solved is accompanied by a set of initial conditions.

In this course we will assume some exposure to the overlap of these two approaches. We will first give a quick review of the solution of separable and linear first order equations. Then we will review second order linear differential equations and Cauchy-Euler equations. This will then be followed by an overview of some of the topics covered. As with any course in differential equations, we will emphasize analytical, graphical and (sometimes) approximate solutions of differential equations. Throughout we will present applications from physics, chemistry and biology.

1.1 Review of the First Course

In this section we review a few of the solution techniques encountered in a first course in differential equations. We will not review the basic theory, except for occasional reminders of what we are doing.

We first recall that an n-th order ordinary differential equation is an equation for an unknown function y(x) that expresses a relationship between the unknown function and its first n derivatives. One could write this generally as

F\left(y^{(n)}(x), y^{(n-1)}(x), \ldots, y^{\prime}(x), y(x), x\right)=0 . \nonumber

Here y^{(n)}(x) represents the n-th derivative of y(x).

An initial value problem consists of the differential equation plus the values of the unknown function and its first n-1 derivatives at a particular value of the independent variable, say x_{0}:

y^{(n-1)}\left(x_{0}\right)=y_{n-1}, \quad y^{(n-2)}\left(x_{0}\right)=y_{n-2}, \quad \ldots, \quad y\left(x_{0}\right)=y_{0} . \nonumber

A linear nth order differential equation takes the form

a_{n}(x) y^{(n)}(x)+a_{n-1}(x) y^{(n-1)}(x)+\cdots+a_{1}(x) y^{\prime}(x)+a_{0}(x) y(x)=f(x) . \nonumber

If f(x) \equiv 0, then the equation is said to be homogeneous, otherwise it is nonhomogeneous.

1.1.1 First Order Differential Equations

Typically, the first differential equations encountered are first order equations. A first order differential equation takes the form

F\left(y^{\prime}, y, x\right)=0 . \nonumber

There are two general forms for which one can formally obtain a solution. The first is the separable case and the second is the linear first order equation. We indicate that we can formally obtain solutions, as one can display the needed integration that leads to a solution. However, the resulting integrals are not always reducible to elementary functions, nor does one always obtain an explicit solution even when the integrals are doable.

A first order equation is separable if it can be written in the form

\dfrac{d y}{d x}=f(x) g(y) . \nonumber

Special cases result when either f(x)=1 or g(y)=1. In the first case the equation is said to be autonomous.

The general solution to equation (1.5) is obtained in terms of two integrals:

\int \dfrac{d y}{g(y)}=\int f(x) d x+C, \nonumber

where C is an integration constant. This yields a 1-parameter family of solutions to the differential equation corresponding to different values of C. If one can solve (1.6) for y(x), then one obtains an explicit solution. Otherwise, one has a family of implicit solutions. If an initial condition is given as well, then one might be able to find a member of the family that satisfies this condition, which is often called a particular solution.

Example 1.1. y^{\prime}=2 x y, \quad y(0)=2.

Applying (1.6), one has

\int \dfrac{d y}{y}=\int 2 x d x+C . \nonumber

Integrating yields

\ln |y|=x^{2}+C . \nonumber

Exponentiating, one obtains the general solution,

y(x)=\pm e^{x^{2}+C}=A e^{x^{2}} . \nonumber

Here we have defined A=\pm e^{C}. Since C is an arbitrary constant, A is an arbitrary constant. Several solutions in this 1-parameter family are shown in Figure 1.1.

Next, one seeks a particular solution satisfying the initial condition. For y(0)=2, one finds that A=2. So, the particular solution satisfying the initial condition is y(x)=2 e^{x^{2}}.
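This kind of separable initial value problem is also easy to check with a computer algebra system. The following is a minimal sketch, not part of the original notes, which uses SymPy (assumed to be installed) to confirm the particular solution just found.

```python
# Not in the original notes: a SymPy check of Example 1.1, y' = 2xy, y(0) = 2.
import sympy as sp

x = sp.symbols('x')
y = sp.Function('y')

ode = sp.Eq(y(x).diff(x), 2*x*y(x))
sol = sp.dsolve(ode, y(x), ics={y(0): 2})
print(sol)  # Eq(y(x), 2*exp(x**2)), i.e. y(x) = 2 e^{x^2} as above
```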

Example 1.2. y y^{\prime}=-x.

Following the same procedure as in the last example, one obtains:

\int y d y=-\int x d x+C \Rightarrow y^{2}=-x^{2}+A, \quad \text{where } A=2 C . \nonumber

Thus, we obtain an implicit solution. Writing the solution as x^{2}+y^{2}=A, we see that this is a family of circles for A>0 and the origin for A=0. Plots of some solutions in this family are shown in Figure 1.2.

Figure 1.1. Plots of solutions from the 1-parameter family of solutions of Example 1.1 for several initial conditions.

The second type of first order equation for which we can formally obtain a solution is the linear first order differential equation, which can be written in the form

y^{\prime}(x)+p(x) y(x)=q(x) . \nonumber

In this case one seeks an integrating factor, μ(x), which is a function that one can multiply through the equation, making the left side a perfect derivative. Thus, one obtains

\dfrac{d}{d x}[\mu(x) y(x)]=\mu(x) q(x) . \nonumber

The integrating factor that works is \mu(x)=\exp \left(\int^{x} p(\xi) d \xi\right). One can show this by expanding the derivative in Equation (1.8),

\mu(x) y^{\prime}(x)+\mu^{\prime}(x) y(x)=\mu(x) q(x) \nonumber

and comparing this equation to the one obtained from multiplying (1.7) by μ(x) :

\mu(x) y^{\prime}(x)+\mu(x) p(x) y(x)=\mu(x) q(x) . \nonumber

Note that these last two equations would be the same if

\dfrac{d \mu(x)}{d x}=\mu(x) p(x) . \nonumber

This is a separable first order equation whose solution is the above given form for the integrating factor,

Figure 1.2. Plots of solutions of Example 1.2 for several initial conditions.

\mu(x)=\exp \left(\int^{x} p(\xi) d \xi\right) . \nonumber

Equation (1.8) is easily integrated to obtain

y(x)=\dfrac{1}{\mu(x)}\left[\int^{x} \mu(\xi) q(\xi) d \xi+C\right] . \nonumber

Example 1.3. x y^{\prime}+y=x, \quad x>0, \quad y(1)=0.

One first notes that this is a linear first order differential equation. Solving for y^{\prime}, one can see that the original equation is not separable. However, it is not in the standard form. So, we first rewrite the equation as

\dfrac{d y}{d x}+\dfrac{1}{x} y=1 . \nonumber

Noting that p(x)=\dfrac{1}{x}, we determine the integrating factor

\mu(x)=\exp \left[\int^{x} \dfrac{d \xi}{\xi}\right]=e^{\ln x}=x . \nonumber

Multiplying equation (1.13) by \mu(x)=x, we actually get back the original equation! In this case we have found that x y^{\prime}+y must have been the derivative of something to start. In fact, (x y)^{\prime}=x y^{\prime}+y. Therefore, equation (1.8) becomes

(x y)^{\prime}=x \nonumber

Integrating one obtains

x y=\dfrac{1}{2} x^{2}+C, \nonumber

or

y(x)=\dfrac{1}{2} x+\dfrac{C}{x} . \nonumber

Inserting the initial condition into this solution, we have 0=\dfrac{1}{2}+C. Therefore, C=-\dfrac{1}{2}. Thus, the solution of the initial value problem is y(x)=\dfrac{1}{2}\left(x-\dfrac{1}{x}\right).
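Again, a computer algebra system provides a quick check. The following sketch, not part of the original notes, uses SymPy (assumed installed) to solve the same initial value problem and compare with the answer above.

```python
# Not in the original notes: a SymPy check of Example 1.3, x y' + y = x, y(1) = 0.
import sympy as sp

x = sp.symbols('x', positive=True)
y = sp.Function('y')

ode = sp.Eq(x*y(x).diff(x) + y(x), x)
sol = sp.dsolve(ode, y(x), ics={y(1): 0})
print(sp.simplify(sol.rhs - (x - 1/x)/2))  # 0, so y(x) = (x - 1/x)/2 as above
```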

Example 1.4. (\sin x) y^{\prime}+(\cos x) y=x^{2} \sin x.

Actually, this problem is easy if you realize that

\dfrac{d}{d x}((\sin x) y)=(\sin x) y^{\prime}+(\cos x) y \nonumber

But, we will go through the process of finding the integrating factor for practice.

First, rewrite the original differential equation in standard form:

y^{\prime}+(\cot x) y=x^{2} \nonumber

Then, compute the integrating factor as

\mu(x)=\exp \left(\int^{x} \cot \xi d \xi\right)=e^{\ln (\sin x)}=\sin x \nonumber

Using the integrating factor, the original equation becomes

\dfrac{d}{d x}((\sin x) y)=x^{2} \sin x . \nonumber

Integrating (the right side requires two integrations by parts), we have

y \sin x=-x^{2} \cos x+2 x \sin x+2 \cos x+C . \nonumber

So, the solution is

y=\left(2 x \sin x+\left(2-x^{2}\right) \cos x+C\right) \csc x . \nonumber

There are other first order equations that one can solve for closed form solutions. However, many equations are not solvable, or one is simply interested in the behavior of solutions. In such cases one turns to direction fields. We will return to a discussion of the qualitative behavior of differential equations later in the course.

1.1.2 Second Order Linear Differential Equations

Second order differential equations are typically harder than first order. In most cases students are only exposed to second order linear differential equations. A general form for a second order linear differential equation is given by

a(x) y^{\prime \prime}(x)+b(x) y^{\prime}(x)+c(x) y(x)=f(x) . \nonumber

One can rewrite this equation using operator terminology. Namely, one first defines the differential operator L=a(x) D^{2}+b(x) D+c(x), where D=\dfrac{d}{d x}. Then equation (1.14) becomes

L y=f \nonumber

The solutions of linear differential equations are found by making use of the linearity of L. Namely, we consider the vector space { }^{1} consisting of real-valued functions over some domain. Let f and g be vectors in this function space. L is a linear operator if for two vectors f and g and scalar a, we have that

a. L(f+g)=Lf+Lg

b. L(af)=aLf.

One typically solves (1.14) by finding the general solution of the homogeneous problem,

L y_{h}=0 \nonumber

and a particular solution of the nonhomogeneous problem,

L y_{p}=f \nonumber

Then the general solution of (1.14) is simply given as y=yh+yp. This is true because of the linearity of L. Namely,

L y=L\left(y_{h}+y_{p}\right)=L y_{h}+L y_{p}=0+f=f \nonumber

There are methods for finding a particular solution of a differential equation. These range from pure guessing, to the Method of Undetermined Coefficients, to the Method of Variation of Parameters. We will review some of these methods later.

Determining solutions to the homogeneous problem, L y_{h}=0, is not always easy. However, others have studied a variety of second order linear equations and have saved us the trouble for some of the differential equations that often appear in applications.

{ }^{1} We assume that the reader has been introduced to concepts in linear algebra. Later in the text we will recall the definition of a vector space and see that linear algebra is in the background of the study of many concepts in the solution of differential equations.

Again, linearity is useful in producing the general solution of a homogeneous linear differential equation. If y_{1} and y_{2} are solutions of the homogeneous equation, then the linear combination y=c_{1} y_{1}+c_{2} y_{2} is also a solution of the homogeneous equation. In fact, if y_{1} and y_{2} are linearly independent, { }^{2} then y=c_{1} y_{1}+c_{2} y_{2} is the general solution of the homogeneous problem. As you may recall, linear independence is established if the Wronskian of the solutions is not zero. In this case, we have

W\left(y_{1}, y_{2}\right)=y_{1}(x) y_{2}^{\prime}(x)-y_{1}^{\prime}(x) y_{2}(x) \neq 0 \nonumber

1.1.3 Constant Coefficient Equations

The simplest, and most frequently encountered, second order differential equations are those with constant coefficients. The general form for a homogeneous constant coefficient second order linear differential equation is given as

a y^{\prime \prime}(x)+b y^{\prime}(x)+c y(x)=0, \nonumber

where a, b, and c are constants.

Solutions to (1.18) are obtained by making a guess of y(x)=e^{r x}. Inserting this guess into (1.18) leads to the characteristic equation

a r^{2}+b r+c=0 . \nonumber

The roots of this equation in turn lead to three types of solution depending upon the nature of the roots as shown below.

Example 1.5. y^{\prime \prime}-y^{\prime}-6 y=0, \quad y(0)=2, \quad y^{\prime}(0)=0.

The characteristic equation for this problem is r^{2}-r-6=0. The roots of this equation are found as r=-2,3. Therefore, the general solution can be quickly written down:

y(x)=c_{1} e^{-2 x}+c_{2} e^{3 x} . \nonumber

Note that there are two arbitrary constants in the general solution. Therefore, one needs two pieces of information to find a particular solution. Of course, we have the needed information in the form of the initial conditions.

One also needs to evaluate the first derivative

y^{\prime}(x)=-2 c_{1} e^{-2 x}+3 c_{2} e^{3 x} \nonumber

in order to attempt to satisfy the initial conditions. Evaluating y and y^{\prime} at x=0 yields

\begin{aligned} &2=c_{1}+c_{2} \\[4pt] &0=-2 c_{1}+3 c_{2} \end{aligned} \nonumber

These two equations in two unknowns can readily be solved to give c_{1}=6 / 5 and c_{2}=4 / 5. Therefore, the solution of the initial value problem is obtained as y(x)=\dfrac{6}{5} e^{-2 x}+\dfrac{4}{5} e^{3 x}.

{ }^{2} Recall, a set of functions \left\{y_{i}(x)\right\}_{i=1}^{n} is a linearly independent set if and only if

c_{1} y_{1}(x)+\ldots+c_{n} y_{n}(x)=0 \nonumber

implies c_{i}=0, for i=1, \ldots, n.
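For a constant coefficient initial value problem such as Example 1.5, SymPy can carry out the entire computation. The sketch below is not part of the original notes and assumes SymPy is installed.

```python
# Not in the original notes: a SymPy check of Example 1.5,
# y'' - y' - 6y = 0 with y(0) = 2, y'(0) = 0.
import sympy as sp

x = sp.symbols('x')
y = sp.Function('y')

ode = sp.Eq(y(x).diff(x, 2) - y(x).diff(x) - 6*y(x), 0)
sol = sp.dsolve(ode, y(x), ics={y(0): 2, y(x).diff(x).subs(x, 0): 0})
print(sol)  # y(x) = 6/5 e^{-2x} + 4/5 e^{3x}, as above
```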

Classification of Roots of the Characteristic Equation for Second Order Constant Coefficient ODEs

  1. Real, distinct roots r_{1}, r_{2}. In this case the solutions corresponding to each root are linearly independent. Therefore, the general solution is simply y(x)=c_{1} e^{r_{1} x}+c_{2} e^{r_{2} x}.
  2. Real, equal roots r_{1}=r_{2}=r. In this case the solutions corresponding to each root are linearly dependent. To find a second linearly independent solution, one uses the Method of Reduction of Order. This gives the second solution as x e^{r x}. Therefore, the general solution is found as y(x)=\left(c_{1}+c_{2} x\right) e^{r x}. [This is covered in the appendix to this chapter.]
  3. Complex conjugate roots r_{1}, r_{2}=\alpha \pm i \beta. In this case the solutions corresponding to each root are linearly independent. Making use of Euler’s identity, e^{i \theta}=\cos (\theta)+i \sin (\theta), these complex exponentials can be rewritten in terms of trigonometric functions. Namely, one has that e^{\alpha x} \cos (\beta x) and e^{\alpha x} \sin (\beta x) are two linearly independent solutions. Therefore, the general solution becomes y(x)=e^{\alpha x}\left(c_{1} \cos (\beta x)+c_{2} \sin (\beta x)\right). [This is covered in the appendix to this chapter.]

Example 1.6. y^{\prime \prime}+6 y^{\prime}+9 y=0.

In this example we have r^{2}+6 r+9=0. There is only one root, r=-3. Again, the solution is easily obtained as y(x)=\left(c_{1}+c_{2} x\right) e^{-3 x}.

Example 1.7. y^{\prime \prime}+4 y=0.

The characteristic equation in this case is r^{2}+4=0. The roots are pure imaginary roots, r=\pm 2 i and the general solution consists purely of sinusoidal functions: y(x)=c_{1} \cos (2 x)+c_{2} \sin (2 x)

Example 1.8. y^{\prime \prime}+2 y^{\prime}+4 y=0.

The characteristic equation in this case is r^{2}+2 r+4=0. The roots are complex, r=-1 \pm \sqrt{3} i, and the general solution can be written as y(x)=\left[c_{1} \cos (\sqrt{3} x)+c_{2} \sin (\sqrt{3} x)\right] e^{-x}.

One of the most important applications of the equations in the last two examples is in the study of oscillations. Typical systems are a mass on a spring, or a simple pendulum. For a mass m on a spring with spring constant k>0, one has from Hooke’s law that the position as a function of time, x(t), satisfies the equation

m x^{\prime \prime}+k x=0 . \nonumber

This constant coefficient equation has pure imaginary roots (\alpha=0) and the solutions are pure sines and cosines. Such motion is called simple harmonic motion.

Adding a damping term and periodic forcing complicates the dynamics, but is nonetheless solvable. The next example shows a forced harmonic oscillator.

Example 1.9. y^{\prime \prime}+4 y=\sin x.

This is an example of a nonhomogeneous problem. The homogeneous problem was actually solved in Example 1.7. According to the theory, we need only seek a particular solution to the nonhomogeneous problem and add it to the solution of the last example to get the general solution.

The particular solution can be obtained by purely guessing, making an educated guess, or using the Method of Variation of Parameters. We will not review all of these techniques at this time. Due to the simple form of the driving term, we will make an intelligent guess of y_{p}(x)=A \sin x and determine what A needs to be. Recall, this is the Method of Undetermined Coefficients which we review in the next section. Inserting our guess in the equation gives (-A+4 A) \sin x=\sin x. So, we see that A=1 / 3 works. The general solution of the nonhomogeneous problem is therefore y(x)= c_{1} \cos (2 x)+c_{2} \sin (2 x)+\dfrac{1}{3} \sin x
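A one line computation confirms this choice of A; the following check is not part of the original notes and assumes SymPy is available.

```python
# Not in the original notes: verify that y_p = (1/3) sin x solves y'' + 4y = sin x.
import sympy as sp

x = sp.symbols('x')
yp = sp.sin(x)/3
print(sp.simplify(yp.diff(x, 2) + 4*yp - sp.sin(x)))  # 0
```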

1.1.4 Method of Undetermined Coefficients

To date, we only know how to solve constant coefficient, homogeneous equations. How does one solve a nonhomogeneous equation like that in Equation (1.14)

a(x) y^{\prime \prime}(x)+b(x) y^{\prime}(x)+c(x) y(x)=f(x) \nonumber

Recall, that one solves this equation by finding the general solution of the homogeneous problem,

L y_{h}=0 \nonumber

and a particular solution of the nonhomogeneous problem,

L y_{p}=f \nonumber

Then the general solution of (1.14) is simply given as y=y_{h}+y_{p}. So, how do we find the particular solution? You could guess a solution, but that is not usually possible without a little bit of experience. So we need some other methods. There are two main methods. In the first case, the Method of Undetermined Coefficients, one makes an intelligent guess based on the form of f(x). In the second method, one can systematically develop the particular solution. We will come back to this method, the Method of Variation of Parameters, later in the book.

Let’s solve a simple differential equation highlighting how we can handle nonhomogeneous equations.

Example 1.10. Consider the equation

y^{\prime \prime}+2 y^{\prime}-3 y=4 \nonumber

The first step is to determine the solution of the homogeneous equation. Thus, we solve

y_{h}^{\prime \prime}+2 y_{h}^{\prime}-3 y_{h}=0 . \nonumber

The characteristic equation is r^{2}+2 r-3=0. The roots are r=1,-3. So, we can immediately write the solution

y_{h}(x)=c_{1} e^{x}+c_{2} e^{-3 x} . \nonumber

The second step is to find a particular solution of (1.22). What possible function can we insert into this equation such that only a 4 remains? If we try something proportional to x, then we are left with a linear function after inserting x and its derivatives. Perhaps a constant function you might think. y=4 does not work. But, we could try an arbitrary constant, y=A.

Let’s see. Inserting y=A into (1.22), we obtain

-3 A=4 . \nonumber

Ah ha! We see that we can choose A=-\dfrac{4}{3} and this works. So, we have a particular solution, y_{p}(x)=-\dfrac{4}{3}. This step is done.

Combining our two solutions, we have the general solution to the original nonhomogeneous equation (1.22). Namely,

y(x)=y_{h}(x)+y_{p}(x)=c_{1} e^{x}+c_{2} e^{-3 x}-\dfrac{4}{3} \nonumber

Insert this solution into the equation and verify that it is indeed a solution. If we had been given initial conditions, we could now use them to determine our arbitrary constants.

What if we had a different source term? Consider the equation

y^{\prime \prime}+2 y^{\prime}-3 y=4 x . \nonumber

The only thing that would change is our particular solution. So, we need a guess. We know a constant function does not work by the last example. So, let’s try y_{p}=A x. Inserting this function into Equation (1.24), we obtain

2 A-3 A x=4 x . \nonumber

Picking A=-4 / 3 would get rid of the x terms, but will not cancel everything. We still have a constant left. So, we need something more general.

Let’s try a linear function, y_{p}(x)=A x+B. Then, after substitution into (1.24), we get

2 A-3(A x+B)=4 x . \nonumber

Equating the coefficients of the different powers of x on both sides, we find a system of equations for the undetermined coefficients:

\begin{array}{r} 2 A-3 B=0 \\[4pt] -3 A=4 . \end{array} \nonumber

These are easily solved to obtain

\begin{aligned} &A=-\dfrac{4}{3} \\[4pt] &B=\dfrac{2}{3} A=-\dfrac{8}{9} \end{aligned} \nonumber

So, our particular solution is

y_{p}(x)=-\dfrac{4}{3} x-\dfrac{8}{9} . \nonumber

This gives the general solution to the nonhomogeneous problem as

y(x)=y_{h}(x)+y_{p}(x)=c_{1} e^{x}+c_{2} e^{-3 x}-\dfrac{4}{3} x-\dfrac{8}{9} \nonumber

There are general forms that you can guess based upon the form of the driving term, f(x). Some examples are given in Table 1.1.4. More general applications are covered in a standard text on differential equations. However, the procedure is simple. Given f(x) in a particular form, you make an appropriate guess up to some unknown parameters, or coefficients. Inserting the guess leads to a system of equations for the unknown coefficients. Solve the system and you have your solution. This solution is then added to the general solution of the homogeneous differential equation.

f(x)                                                Guess
a_{n} x^{n}+a_{n-1} x^{n-1}+\cdots+a_{1} x+a_{0}    A_{n} x^{n}+A_{n-1} x^{n-1}+\cdots+A_{1} x+A_{0}
a e^{b x}                                           A e^{b x}
a \cos \omega x+b \sin \omega x                     A \cos \omega x+B \sin \omega x
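The bookkeeping described above can also be automated. The sketch below, not part of the original notes, assumes SymPy and applies the polynomial guess from the table to y^{\prime \prime}+2 y^{\prime}-3 y=4 x, recovering the coefficients found earlier.

```python
# Not in the original notes: Method of Undetermined Coefficients with the
# guess y_p = A x + B for y'' + 2y' - 3y = 4x, using SymPy.
import sympy as sp

x, A, B = sp.symbols('x A B')
yp = A*x + B

residual = yp.diff(x, 2) + 2*yp.diff(x) - 3*yp - 4*x
coeff_eqs = sp.Poly(residual, x).coeffs()   # coefficients of each power of x
print(sp.solve(coeff_eqs, [A, B]))          # {A: -4/3, B: -8/9}, as above
```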

Example 1.11. As a final example, let’s consider the equation

y^{\prime \prime}+2 y^{\prime}-3 y=2 e^{-3 x} \nonumber

According to the above, we would guess a solution of the form y_{p}=A e^{-3 x}. Inserting our guess, we find

0=2 e^{-3 x} \nonumber

Oops! The coefficient, A, disappeared! We cannot solve for it. What went wrong?

The answer lies in the general solution of the homogeneous problem. Note that e^{x} and e^{-3 x} are solutions to the homogeneous problem. So, a multiple of e^{-3 x} will not get us anywhere. It turns out that there is one further modification of the method. If our driving term contains terms that are solutions of the homogeneous problem, then we need to make a guess consisting of the smallest possible power of x times the function which is no longer a solution of the homogeneous problem. Namely, we guess y_{p}(x)=A x e^{-3 x}. We compute the derivative of our guess, y_{p}^{\prime}=A(1-3 x) e^{-3 x} and y_{p}^{\prime \prime}=A(9 x-6) e^{-3 x}. Inserting these into the equation, we obtain

[(9 x-6)+2(1-3 x)-3 x] A e^{-3 x}=2 e^{-3 x} \nonumber

-4 A=2 . \nonumber

So, A=-1 / 2 and y_{p}(x)=-\dfrac{1}{2} x e^{-3 x}.
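A short symbolic check of the modified guess is given below; this sketch is not from the original notes and assumes SymPy is installed.

```python
# Not in the original notes: confirm the modified guess y_p = A x e^{-3x}
# for y'' + 2y' - 3y = 2 e^{-3x}.
import sympy as sp

x, A = sp.symbols('x A')
yp = A*x*sp.exp(-3*x)

residual = sp.simplify(yp.diff(x, 2) + 2*yp.diff(x) - 3*yp)
print(residual)                                        # -4*A*exp(-3*x)
print(sp.solve(sp.Eq(residual, 2*sp.exp(-3*x)), A))    # [-1/2]
```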

Modified Method of Undetermined Coefficients

In general, if any term in the guess y_{p}(x) is a solution of the homogeneous equation, then multiply the guess by x^{k}, where k is the smallest positive integer such that no term in x^{k} y_{p}(x) is a solution of the homogeneous problem.

1.1.5 Cauchy-Euler Equations

Another class of solvable linear differential equations that is of interest is the Cauchy-Euler type of equation. These are given by

a x^{2} y^{\prime \prime}(x)+b x y^{\prime}(x)+c y(x)=0 . \nonumber

Note that in such equations the power of x in each of the coefficients matches the order of the derivative in that term. These equations are solved in a manner similar to the constant coefficient equations.

One begins by making the guess y(x)=x^{r}. Inserting this function and its derivatives,

y^{\prime}(x)=r x^{r-1}, \quad y^{\prime \prime}(x)=r(r-1) x^{r-2}, \nonumber

into Equation (1.28), we have

[a r(r-1)+b r+c] x^{r}=0 \nonumber

Since this has to be true for all x in the problem domain, we obtain the characteristic equation

a r(r-1)+b r+c=0 \nonumber

Just like the constant coefficient differential equation, we have a quadratic equation and the nature of the roots again leads to three classes of solutions. These are shown below. Some of the details are provided in the next section.

Classification of Roots of the Characteristic Equation for Cauchy-Euler Differential Equations

  1. Real, distinct roots r_{1}, r_{2}. In this case the solutions corresponding to each root are linearly independent. Therefore, the general solution is simply y(x)=c_{1} x^{r_{1}}+c_{2} x^{r_{2}}.
  2. Real, equal roots r_{1}=r_{2}=r. In this case the solutions corresponding to each root are linearly dependent. To find a second linearly independent solution, one uses the Method of Reduction of Order. This gives the second solution as x^{r} \ln |x|. Therefore, the general solution is found as y(x)=\left(c_{1}+c_{2} \ln |x|\right) x^{r}.
  3. Complex conjugate roots r_{1}, r_{2}=\alpha \pm i \beta. In this case the solutions corresponding to each root are linearly independent. These complex exponentials can be rewritten in terms of trigonometric functions. Namely, one has that x^{\alpha} \cos (\beta \ln |x|) and x^{\alpha} \sin (\beta \ln |x|) are two linearly independent solutions. Therefore, the general solution becomes y(x)= x^{\alpha}\left(c_{1} \cos (\beta \ln |x|)+c_{2} \sin (\beta \ln |x|)\right).

Example 1.12. x^{2} y^{\prime \prime}+5 x y^{\prime}+12 y=0

As with the constant coefficient equations, we begin by writing down the characteristic equation. Doing a simple computation,

\begin{aligned} 0 &=r(r-1)+5 r+12 \\[4pt] &=r^{2}+4 r+12 \\[4pt] &=(r+2)^{2}+8 \\[4pt] -8 &=(r+2)^{2} \end{aligned} \nonumber

one determines the roots are r=-2 \pm 2 \sqrt{2} i. Therefore, the general solution is y(x)=\left[c_{1} \cos (2 \sqrt{2} \ln |x|)+c_{2} \sin (2 \sqrt{2} \ln |x|)\right] x^{-2}.

Example 1.13. t^{2} y^{\prime \prime}+3 t y^{\prime}+y=0, \quad y(1)=0, y^{\prime}(1)=1.

For this example the characteristic equation takes the form

r(r-1)+3 r+1=0, \nonumber

r^{2}+2 r+1=0 . \nonumber

There is only one real root, r=-1. Therefore, the general solution is

y(t)=\left(c_{1}+c_{2} \ln |t|\right) t^{-1} . \nonumber

However, this problem is an initial value problem. At t=1 we know the values of y and y^{\prime}. Using the general solution, we first have that

0=y(1)=c_{1} \nonumber

Thus, we have so far that y(t)=c_{2} \ln |t| t^{-1}. Now, using the second condition and

y^{\prime}(t)=c_{2}(1-\ln |t|) t^{-2}, \nonumber

we have

1=y^{\prime}(1)=c_{2} . \nonumber

Therefore, the solution of the initial value problem is y(t)=\ln |t| t^{-1}.
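This initial value problem can also be handed to a computer algebra system. The sketch below is not part of the original notes and assumes a recent SymPy.

```python
# Not in the original notes: a SymPy check of Example 1.13,
# t^2 y'' + 3t y' + y = 0 with y(1) = 0, y'(1) = 1.
import sympy as sp

t = sp.symbols('t', positive=True)
y = sp.Function('y')

ode = sp.Eq(t**2*y(t).diff(t, 2) + 3*t*y(t).diff(t) + y(t), 0)
sol = sp.dsolve(ode, y(t), ics={y(1): 0, y(t).diff(t).subs(t, 1): 1})
print(sol)  # Eq(y(t), log(t)/t), i.e. y(t) = ln|t| / t for t > 0
```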

Nonhomogeneous Cauchy-Euler Equations

We can also solve some nonhomogeneous Cauchy-Euler equations using the Method of Undetermined Coefficients. We will demonstrate this with a couple of examples.

Example 1.14. Find the solution of x^{2} y^{\prime \prime}-x y^{\prime}-3 y=2 x^{2}.

First we find the solution of the homogeneous equation. The characteristic equation is r^{2}-2 r-3=0. So, the roots are r=-1,3 and the solution is y_{h}(x)=c_{1} x^{-1}+c_{2} x^{3}

We next need a particular solution. Let’s guess y_{p}(x)=A x^{2}. Inserting the guess into the nonhomogeneous differential equation, we have

\begin{aligned} 2 x^{2} &=x^{2} y^{\prime \prime}-x y^{\prime}-3 y \\[4pt] &=2 A x^{2}-2 A x^{2}-3 A x^{2} \\[4pt] &=-3 A x^{2} \end{aligned} \nonumber

So, A=-2 / 3. Therefore, the general solution of the problem is

y(x)=c_{1} x^{-1}+c_{2} x^{3}-\dfrac{2}{3} x^{2} . \nonumber

Example 1.15. Find the solution of x^{2} y^{\prime \prime}-x y^{\prime}-3 y=2 x^{3}.

In this case the nonhomogeneous term is a solution of the homogeneous problem, which we solved in the last example. So, we will need a modification of the method. We have a problem of the form

a x^{2} y^{\prime \prime}+b x y^{\prime}+c y=d x^{r} \nonumber

where r is a solution of a r(r-1)+b r+c=0. Let’s guess a solution of the form y=A x^{r} \ln x. Then one finds that the differential equation reduces to A x^{r}(2 a r-a+b)=d x^{r}. [You should verify this for yourself.]
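That reduction is straightforward, if tedious, to verify by hand; the sketch below, not part of the original notes, performs the suggested check symbolically with SymPy (assumed installed).

```python
# Not in the original notes: verify that y = A x^r ln x gives
# a x^2 y'' + b x y' + c y = A x^r (2ar - a + b) when a r(r-1) + b r + c = 0.
import sympy as sp

x, A, a, b, c, r = sp.symbols('x A a b c r', positive=True)
y = A*x**r*sp.log(x)

lhs = a*x**2*y.diff(x, 2) + b*x*y.diff(x) + c*y
lhs = lhs.subs(c, -(a*r*(r - 1) + b*r))     # impose the characteristic equation
print(sp.simplify(lhs - A*x**r*(2*a*r - a + b)))  # 0
```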

With this in mind, we can now solve the problem at hand. Let y_{p}= A x^{3} \ln x. Inserting into the equation, we obtain 4 A x^{3}=2 x^{3}, or A=1 / 2. The general solution of the problem can now be written as

y(x)=c_{1} x^{-1}+c_{2} x^{3}+\dfrac{1}{2} x^{3} \ln x . \nonumber

1.2 Overview of the Course

For the most part, your first course in differential equations was about solving initial value problems. When second order equations did not fall into the above cases, then you might have learned how to obtain approximate solutions using power series methods, or even finding new functions from these methods. In this course we will explore two broad topics: systems of differential equations and boundary value problems.

We will see that there are interesting initial value problems when studying systems of differential equations. In fact, many of the second order equations that you have seen in the past can be written as a system of two first order equations. For example, the equation for simple harmonic motion,

x^{\prime \prime}+\omega^{2} x=0 \nonumber

can be written as the system

\begin{gathered} x^{\prime}=y \\[4pt] y^{\prime}=-\omega^{2} x \end{gathered} . \nonumber

Just note that x^{\prime \prime}=y^{\prime}=-\omega^{2} x. Of course, one can generalize this to systems with more complicated right hand sides. The behavior of such systems can be fairly interesting and these systems result from a variety of physical models.
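This rewriting is exactly what a numerical integrator expects as input. The sketch below is not part of the original notes; it uses SciPy (assumed installed) with the illustrative choice \omega=2 and initial data x(0)=1, y(0)=0, for which the exact solution is x(t)=\cos 2 t.

```python
# Not in the original notes: integrate x'' + omega^2 x = 0 as the first order
# system x' = y, y' = -omega^2 x, and compare with the exact solution.
import numpy as np
from scipy.integrate import solve_ivp

omega = 2.0

def rhs(t, u):
    x, y = u                      # u = (x, x')
    return [y, -omega**2 * x]

t_eval = np.linspace(0, 10, 200)
sol = solve_ivp(rhs, (0, 10), [1.0, 0.0], t_eval=t_eval, rtol=1e-8)
print(np.max(np.abs(sol.y[0] - np.cos(omega*t_eval))))  # small numerical error
```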

In the second part of the course we will explore boundary value problems. Often these problems evolve from the study of partial differential equations. Such examples stem from vibrating strings, temperature distributions, bending beams, etc. Boundary conditions are conditions that are imposed at more than one point, while for initial value problems the conditions are specified at one point. For example, we could take the oscillation equation above and ask when solutions of the equation would satisfy the conditions x(0)=0 and x(1)=0. The general solution, as we have determined earlier, is

x(t)=c_{1} \cos \omega t+c_{2} \sin \omega t \nonumber

Requiring x(0)=0, we find that c_{1}=0, leaving x(t)=c_{2} \sin \omega t. Also imposing that 0=x(1)=c_{2} \sin \omega, we are forced to make \omega=n \pi, for n=1,2, \ldots. (Making c_{2}=0 would not give a nonzero solution of the problem.) Thus, there are an infinite number of solutions possible, if we have the freedom to choose our \omega. In the second half of the course we will investigate techniques for solving boundary value problems and look at several applications, including seeing the connections with partial differential equations and Fourier series.

1.3 Appendix: Reduction of Order and Complex Roots

In this section we provide some of the details leading to the general forms for the constant coefficient and Cauchy-Euler differential equations. In the first subsection we review how the Method of Reduction of Order is used to obtain the second linearly independent solutions for the case of one repeated root. In the second subsection we review how the complex solutions can be used to produce two linearly independent real solutions.

Method of Reduction of Order

First we consider constant coefficient equations. In the case when there is a repeated real root, one has only one independent solution, y_{1}(x)=e^{r x}. The question is how does one obtain the second solution? Since the solutions are independent, we must have that the ratio y_{2}(x) / y_{1}(x) is not a constant. So, we guess the form y_{2}(x)=v(x) y_{1}(x)=v(x) e^{r x}. For constant coefficient second order equations, we can write the equation as

(D-r)^{2} y=0 \nonumber

where D=\dfrac{d}{d x}

We now insert y_{2}(x) into this equation. First we compute

(D-r) v e^{r x}=v^{\prime} e^{r x} . \nonumber

Then,

(D-r)^{2} v e^{r x}=(D-r) v^{\prime} e^{r x}=v^{\prime \prime} e^{r x} . \nonumber

So, if y_{2}(x) is to be a solution to the differential equation, (D-r)^{2} y_{2}=0, then v^{\prime \prime}(x) e^{r x}=0 for all x. So, v^{\prime \prime}(x)=0, which implies that

v(x)=a x+b \nonumber

So,

y_{2}(x)=(a x+b) e^{r x} . \nonumber

Without loss of generality, we can take b=0 and a=1 to obtain the second linearly independent solution, y_{2}(x)=x e^{r x}.

Deriving the solution for Case 2 for the Cauchy-Euler equations is messier, but works in the same way. First note that for the real root, r=r_{1}, the characteristic equation has to factor as \left(r-r_{1}\right)^{2}=0. Expanding, we have

r^{2}-2 r_{1} r+r_{1}^{2}=0 \nonumber

The general characteristic equation is

a r(r-1)+b r+c=0 \nonumber

Rewriting this, we have

r^{2}+\left(\dfrac{b}{a}-1\right) r+\dfrac{c}{a}=0 . \nonumber

Comparing equations, we find

\dfrac{b}{a}=1-2 r_{1}, \quad \dfrac{c}{a}=r_{1}^{2} \nonumber

So, the general Cauchy-Euler equation in this case takes the form

x^{2} y^{\prime \prime}+\left(1-2 r_{1}\right) x y^{\prime}+r_{1}^{2} y=0 . \nonumber

Now we seek the second linearly independent solution in the form y_{2}(x)= v(x) x^{r_{1}}. We first list this function and its derivatives,

\begin{aligned} y_{2}(x) &=v x^{r_{1}} \\[4pt] y_{2}^{\prime}(x) &=\left(x v^{\prime}+r_{1} v\right) x^{r_{1}-1}, \\[4pt] y_{2}^{\prime \prime}(x) &=\left(x^{2} v^{\prime \prime}+2 r_{1} x v^{\prime}+r_{1}\left(r_{1}-1\right) v\right) x^{r_{1}-2} . \end{aligned} \nonumber

Inserting these forms into the differential equation, we have

\begin{aligned} 0 &=x^{2} y^{\prime \prime}+\left(1-2 r_{1}\right) x y^{\prime}+r_{1}^{2} y \\[4pt] &=\left(x v^{\prime \prime}+v^{\prime}\right) x^{r_{1}+1} \end{aligned} \nonumber

Thus, we need to solve the equation

x v^{\prime \prime}+v^{\prime}=0, \nonumber

\dfrac{v^{\prime \prime}}{v^{\prime}}=-\dfrac{1}{x} \nonumber

Integrating, we have

\ln \left|v^{\prime}\right|=-\ln |x|+C . \nonumber

Exponentiating, we have one last differential equation to solve,

v^{\prime}=\dfrac{A}{x} . \nonumber

Thus,

v(x)=A \ln |x|+k . \nonumber

So, we have found that the second linearly independent solution can be written as

y_{2}(x)=x^{r_{1}} \ln |x| \nonumber

Complex Roots

When one has complex roots in the solution of constant coefficient equations, one needs to look at the solutions

y_{1,2}(x)=e^{(\alpha \pm i \beta) x} \nonumber

We make use of Euler’s formula

e^{i \beta x}=\cos \beta x+i \sin \beta x . \nonumber

Then the linear combination of y_{1}(x) and y_{2}(x) becomes

\begin{aligned} A e^{(\alpha+i \beta) x}+B e^{(\alpha-i \beta) x} &=e^{\alpha x}\left[A e^{i \beta x}+B e^{-i \beta x}\right] \\[4pt] &=e^{\alpha x}[(A+B) \cos \beta x+i(A-B) \sin \beta x] \\[4pt] & \equiv e^{\alpha x}\left(c_{1} \cos \beta x+c_{2} \sin \beta x\right) . \end{aligned} \nonumber

Thus, we see that we have a linear combination of two real, linearly independent solutions, e^{\alpha x} \cos \beta x and e^{\alpha x} \sin \beta x.

When dealing with the Cauchy-Euler equations, we have solutions of the form y(x)=x^{\alpha+i \beta}. The key to obtaining real solutions is to first recall that

x^{y}=e^{\ln x^{y}}=e^{y \ln x} . \nonumber

Thus, a power can be written as an exponential and the solution can be written as

y(x)=x^{\alpha+i \beta}=x^{\alpha} e^{i \beta \ln x}, \quad x>0 . \nonumber

We can now find two real, linearly independent solutions, x^{\alpha} \cos (\beta \ln |x|) and x^{\alpha} \sin (\beta \ln |x|) following the same steps as above for the constant coefficient case.

Problems

1.1. Find all of the solutions of the first order differential equations. When an initial condition is given, find the particular solution satisfying that condition.
a. \dfrac{d y}{d x}=\dfrac{\sqrt{1-y^{2}}}{x}
b. x y^{\prime}=y(1-2 y), \quad y(1)=2.
c. y^{\prime}-(\sin x) y=\sin x.
d. x y^{\prime}-2 y=x^{2}, y(1)=1.
e. \dfrac{d s}{d t}+2 s=s t^{2}, \quad s(0)=1.
f. x^{\prime}-2 x=t e^{2 t}.

1.2. Find all of the solutions of the second order differential equations. When an initial condition is given, find the particular solution satisfying that condition.
a. y^{\prime \prime}-9 y^{\prime}+20 y=0
b. y^{\prime \prime}-3 y^{\prime}+4 y=0, \quad y(0)=0, \quad y^{\prime}(0)=1.
c. x^{2} y^{\prime \prime}+5 x y^{\prime}+4 y=0, \quad x>0.
d. x^{2} y^{\prime \prime}-2 x y^{\prime}+3 y=0, \quad x>0.

1.3. Consider the differential equation

\dfrac{d y}{d x}=\dfrac{x}{y}-\dfrac{x}{1+y} . \nonumber

a. Find the 1-parameter family of solutions (general solution) of this equation.

b. Find the solution of this equation satisfying the initial condition y(0)=1. Is this a member of the 1-parameter family?

1.4. The initial value problem

\dfrac{d y}{d x}=\dfrac{y^{2}+x y}{x^{2}}, \quad y(1)=1 \nonumber

does not fall into the class of problems considered in our review. However, if one substitutes y(x)=x z(x) into the differential equation, one obtains an equation for z(x) which can be solved. Use this substitution to solve the initial value problem for y(x)

1.5. Consider the nonhomogeneous differential equation x^{\prime \prime}-3 x^{\prime}+2 x=6 e^{3 t}.

a. Find the general solution of the homogenous equation.

b. Find a particular solution using the Method of Undetermined Coefficients by guessing x_{p}(t)=A e^{3 t}.

c. Use your answers in the previous parts to write down the general solution for this problem.

1.6. Find the general solution of each differential equation. When an initial condition is given, find the particular solution satisfying that condition.

a. y^{\prime \prime}-3 y^{\prime}+2 y=20 e^{-2 x}, \quad y(0)=0, \quad y^{\prime}(0)=6.

b. y^{\prime \prime}+y=2 \sin 3 x.

c. y^{\prime \prime}+y=1+2 \cos x.

d. x^{2} y^{\prime \prime}-2 x y^{\prime}+2 y=3 x^{2}-x, \quad x>0

1.7. Verify that the given function is a solution and use Reduction of Order to find a second linearly independent solution.

a. x^{2} y^{\prime \prime}-2 x y^{\prime}-4 y=0, \quad y_{1}(x)=x^{4}.

b. x y^{\prime \prime}-y^{\prime}+4 x^{3} y=0, \quad y_{1}(x)=\sin \left(x^{2}\right).

1.8. A certain model of the motion of a tossed whiffle ball is given by

m x^{\prime \prime}+c x^{\prime}+m g=0, \quad x(0)=0, \quad x^{\prime}(0)=v_{0} . \nonumber

Here m is the mass of the ball, g=9.8 \mathrm{~m} / \mathrm{s}^{2} is the acceleration due to gravity and c is a measure of the damping. Since there is no x term, we can write this as a first order equation for the velocity v(t)=x^{\prime}(t) :

m v^{\prime}+c v+m g=0 . \nonumber

a. Find the general solution for the velocity v(t) of the linear first order differential equation above.

b. Use the solution of part a to find the general solution for the position x(t).

c. Find an expression to determine how long it takes for the ball to reach its maximum height.

d. Assume that c / m=10 \mathrm{~s}^{-1}. For v_{0}=5,10,15,20 \mathrm{~m} / \mathrm{s}, plot the solution, x(t), versus the time.

e. From your plots and the expression in part c, determine the rise time. Do these answers agree?

f. What can you say about the time it takes for the ball to fall as compared to the rise time?

2 Systems of Differential Equations

2.1 Introduction

In this chapter we will begin our study of systems of differential equations. After defining first order systems, we will look at constant coefficient systems and the behavior of solutions for these systems. Also, most of the discussion will focus on planar, or two dimensional, systems. For such systems we will be able to look at a variety of graphical representations of the family of solutions and discuss the qualitative features of systems we can solve in preparation for the study of systems whose solutions cannot be found in an algebraic form.

A general form for first order systems in the plane is given by a system of two equations for unknowns x(t) and y(t) :

\begin{aligned} &x^{\prime}(t)=P(x, y, t) \\[4pt] &y^{\prime}(t)=Q(x, y, t) \end{aligned} \nonumber

An autonomous system is one in which there is no explicit time dependence:

\begin{aligned} &x^{\prime}(t)=P(x, y) \\[4pt] &y^{\prime}(t)=Q(x, y) \end{aligned} \nonumber

Otherwise the system is called nonautonomous.

A linear system takes the form

\begin{aligned} x^{\prime} &=a(t) x+b(t) y+e(t) \\[4pt] y^{\prime} &=c(t) x+d(t) y+f(t) \end{aligned} \nonumber

A homogeneous linear system results when e(t)=0 and f(t)=0.

A linear, constant coefficient system of first order differential equations is given by

\begin{aligned} &x^{\prime}=a x+b y+e \\[4pt] &y^{\prime}=c x+d y+f \end{aligned} \nonumber

We will focus on linear, homogeneous systems of constant coefficient first order differential equations:

\begin{aligned} &x^{\prime}=a x+b y \\[4pt] &y^{\prime}=c x+d y \end{aligned} \nonumber

As we will see later, such systems can result by a simple translation of the unknown functions. These equations are said to be coupled if either b \neq 0 or c \neq 0.

We begin by noting that the system (2.5) can be rewritten as a second order constant coefficient linear differential equation, which we already know how to solve. We differentiate the first equation in system (2.5) and systematically replace occurrences of y and y^{\prime}, since we also know from the first equation that y=\dfrac{1}{b}\left(x^{\prime}-a x\right). Thus, we have

\begin{aligned} x^{\prime \prime} &=a x^{\prime}+b y^{\prime} \\[4pt] &=a x^{\prime}+b(c x+d y) \\[4pt] &=a x^{\prime}+b c x+d\left(x^{\prime}-a x\right) . \end{aligned} \nonumber

Rewriting the last line, we have

x^{\prime \prime}-(a+d) x^{\prime}+(a d-b c) x=0 .

This is a linear, homogeneous, constant coefficient ordinary differential equation. We know that we can solve this by first looking at the roots of the characteristic equation

r^{2}-(a+d) r+a d-b c=0 \nonumber

and writing down the appropriate general solution for x(t). Then we can find y(t) using Equation (2.5):

y=\dfrac{1}{b}\left(x^{\prime}-a x\right) . \nonumber

We now demonstrate this for a specific example.

Example 2.1. Consider the system of differential equations

\begin{aligned} &x^{\prime}=-x+6 y \\[4pt] &y^{\prime}=x-2 y . \end{aligned} \nonumber

Carrying out the above outlined steps, we have that x^{\prime \prime}+3 x^{\prime}-4 x=0. This can be shown as follows:

\begin{aligned} x^{\prime \prime} &=-x^{\prime}+6 y^{\prime} \\[4pt] &=-x^{\prime}+6(x-2 y) \\[4pt] &=-x^{\prime}+6 x-12\left(\dfrac{x^{\prime}+x}{6}\right) \\[4pt] &=-3 x^{\prime}+4 x \end{aligned} \nonumber

The resulting differential equation has a characteristic equation of r^{2}+3 r- 4=0. The roots of this equation are r=1,-4. Therefore, x(t)=c_{1} e^{t}+c_{2} e^{-4 t} . But, we still need y(t). From the first equation of the system we have

y(t)=\dfrac{1}{6}\left(x^{\prime}+x\right)=\dfrac{1}{6}\left(2 c_{1} e^{t}-3 c_{2} e^{-4 t}\right) . \nonumber

Thus, the solution to our system is

\begin{aligned} &x(t)=c_{1} e^{t}+c_{2} e^{-4 t}, \\[4pt] &y(t)=\dfrac{1}{3} c_{1} e^{t}-\dfrac{1}{2} c_{2} e^{-4 t} \end{aligned} \nonumber

Sometimes one needs initial conditions. For these systems we would specify conditions like x(0)=x_{0} and y(0)=y_{0}. These would allow the determination of the arbitrary constants as before.

Example 2.2. Solve

\begin{aligned} &x^{\prime}=-x+6 y \\[4pt] &y^{\prime}=x-2 y \end{aligned} \nonumber

given x(0)=2, y(0)=0.

We already have the general solution of this system in (2.11). Inserting the initial conditions, we have

\begin{aligned} &2=c_{1}+c_{2}, \\[4pt] &0=\dfrac{1}{3} c_{1}-\dfrac{1}{2} c_{2} . \end{aligned} \nonumber

Solving for c_{1} and c_{2} gives c_{1}=6 / 5 and c_{2}=4 / 5. Therefore, the solution of the initial value problem is

\begin{aligned} &x(t)=\dfrac{2}{5}\left(3 e^{t}+2 e^{-4 t}\right) \\[4pt] &y(t)=\dfrac{2}{5}\left(e^{t}-e^{-4 t}\right) \end{aligned} \nonumber
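Recent versions of SymPy can solve such linear systems directly; the sketch below is not part of the original notes and assumes SymPy is installed.

```python
# Not in the original notes: solve the system of Example 2.2 with SymPy,
# x' = -x + 6y, y' = x - 2y, x(0) = 2, y(0) = 0.
import sympy as sp

t = sp.symbols('t')
x = sp.Function('x')
y = sp.Function('y')

eqs = [sp.Eq(x(t).diff(t), -x(t) + 6*y(t)),
       sp.Eq(y(t).diff(t), x(t) - 2*y(t))]
sol = sp.dsolve(eqs, [x(t), y(t)], ics={x(0): 2, y(0): 0})
print(sol)  # x(t) = (2/5)(3 e^t + 2 e^{-4t}), y(t) = (2/5)(e^t - e^{-4t})
```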

2.2 Equilibrium Solutions and Nearby Behaviors

In studying systems of differential equations, it is often useful to study the behavior of solutions without obtaining an algebraic form for the solution. This is done by exploring equilibrium solutions and solutions nearby equilibrium solutions. Such techniques will be seen to be useful later in studying nonlinear systems.

We begin this section by studying equilibrium solutions of system (2.4). For equilibrium solutions the system does not change in time. Therefore, equilibrium solutions satisfy the equations x^{\prime}=0 and y^{\prime}=0. Of course, this can only happen for constant solutions. Let x_{0} and y_{0} be the (constant) equilibrium solutions. Then, x_{0} and y_{0} must satisfy the system

\begin{aligned} &0=a x_{0}+b y_{0}+e \\[4pt] &0=c x_{0}+d y_{0}+f \end{aligned} \nonumber

This is a linear system of nonhomogeneous algebraic equations. One only has a unique solution when the determinant of the system is not zero, i.e., a d-b c \neq 0. Using Cramer’s (determinant) Rule for solving such systems, we have

x_{0}=-\dfrac{\left|\begin{array}{ll} e & b \\[4pt] f & d \end{array}\right|}{\left|\begin{array}{ll} a & b \\[4pt] c & d \end{array}\right|}, \quad y_{0}=-\dfrac{\left|\begin{array}{ll} a & e \\[4pt] c & f \end{array}\right|}{\left|\begin{array}{ll} a & b \\[4pt] c & d \end{array}\right|} . \quad \text { (2.16) } \nonumber

If the system is homogeneous, e=f=0, then we have that the origin is the equilibrium solution; i.e., \left(x_{0}, y_{0}\right)=(0,0). Often we will have this case since one can always make a change of coordinates from (x, y) to (u, v) by u=x-x_{0} and v=y-y_{0}. Then, u_{0}=v_{0}=0.
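Numerically, finding an equilibrium is just a 2 x 2 linear solve. The sketch below is not part of the original notes; the coefficient values are hypothetical and only illustrate the computation (NumPy assumed installed).

```python
# Not in the original notes: equilibrium of x' = ax + by + e, y' = cx + dy + f
# for hypothetical coefficients, via a linear solve (requires ad - bc != 0).
import numpy as np

a, b, c, d, e, f = 1.0, 2.0, 3.0, -1.0, 4.0, -5.0   # illustrative values only

M = np.array([[a, b], [c, d]])
x0, y0 = np.linalg.solve(M, -np.array([e, f]))
print(x0, y0)   # the equilibrium point (x_0, y_0)
```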

Next we are interested in the behavior of solutions near the equilibrium solutions. Later this behavior will be useful in analyzing more complicated nonlinear systems. We will look at some simple systems that are readily solved.

Example 2.3. Stable Node (sink)

Consider the system

\begin{aligned} &x^{\prime}=-2 x \\[4pt] &y^{\prime}=-y \end{aligned} \nonumber

This is a simple uncoupled system. Each equation is simply solved to give

x(t)=c_{1} e^{-2 t} \text { and } y(t)=c_{2} e^{-t} . \nonumber

In this case we see that all solutions tend towards the equilibrium point, (0,0). This will be called a stable node, or a sink.

Before looking at other types of solutions, we will explore the stable node in the above example. There are several methods of looking at the behavior of solutions. We can look at solution plots of the dependent versus the independent variables, or we can look in the x y-plane at the parametric curves (x(t), y(t))

Solution Plots: One can plot each solution as a function of t given a set of initial conditions. Examples are shown in Figure 2.1 for several initial conditions. Note that the solutions decay for large t. Special cases result for various initial conditions. Note that for t=0, x(0)=c_{1} and y(0)=c_{2}. (Of course, one can provide initial conditions at any t=t_{0}. It is generally easier to pick t=0 in our general explanations.) If we pick an initial condition with c_{1}=0, then x(t)=0 for all t. One obtains similar results when setting y(0)=0.

Figure 2.1. Plots of solutions of Example 2.3 for several initial conditions.

Phase Portrait: There are other types of plots which can provide additional information about our solutions even if we cannot find the exact solutions as we can for these simple examples. In particular, one can consider the solutions x(t) and y(t) as the coordinates along a parameterized path, or curve, in the plane: \mathbf{r}=(x(t), y(t)). Such curves are called trajectories or orbits. The x y-plane is called the phase plane and a collection of such orbits gives a phase portrait for the family of solutions of the given system.

One method for determining the equations of the orbits in the phase plane is to eliminate the parameter t between the known solutions to get a relationship between x and y. In the above example we can do this, since the solutions are known. In particular, we have

x=c_{1} e^{-2 t}=c_{1}\left(\dfrac{y}{c_{2}}\right)^{2} \equiv A y^{2} . \nonumber

Another way to obtain information about the orbits comes from noting that the slopes of the orbits in the x y-plane are given by d y / d x. For autonomous systems, we can write this slope just in terms of x and y. This leads to a first order differential equation, which possibly could be solved analytically, solved numerically, or just used to produce a direction field. We will see that direction fields are useful in determining qualitative behaviors of the solutions without actually finding explicit solutions.

First we will obtain the orbits for Example 2.3 by solving the corresponding slope equation. First, recall that for trajectories defined parametrically by x=x(t) and y=y(t), we have from the Chain Rule for y=y(x(t)) that

\dfrac{d y}{d t}=\dfrac{d y}{d x} \dfrac{d x}{d t} \nonumber

Therefore,

\dfrac{d y}{d x}=\dfrac{\dfrac{d y}{d t}}{\dfrac{d x}{d t}} \nonumber

For the system in (2.17) we use Equation (2.18) to obtain the equation for the slope at a point on the orbit:

\dfrac{d y}{d x}=\dfrac{y}{2 x} \nonumber

The general solution of this first order differential equation is found using separation of variables as x=A y^{2} for A an arbitrary constant. Plots of these solutions in the phase plane are given in Figure 2.2. [Note that this is the same form for the orbits that we had obtained above by eliminating t from the solution of the system.]

Figure 2.2. Orbits for Example 2.3.

Once one has solutions to differential equations, we often are interested in the long time behavior of the solutions. Given a particular initial condition \left(x_{0}, y_{0}\right), how does the solution behave as time increases? For orbits near an equilibrium solution, do the solutions tend towards, or away from, the equilibrium point? The answer is obvious when one has the exact solutions x(t) and y(t). However, this is not always the case. Let’s consider the above example for initial conditions in the first quadrant of the phase plane. For a point in the first quadrant we have that

d x / d t=-2 x<0 \nonumber

meaning that as t \rightarrow \infty, x(t) decreases. Similarly,

d y / d t=-y<0, \nonumber

indicates that y(t) is also getting smaller for this problem. Thus, these orbits tend towards the origin as t \rightarrow \infty. This qualitative information was obtained without relying on the known solutions to the problem.

Direction Fields: Another way to determine the behavior of our system is to draw the direction field. Recall that a direction field is a vector field in which one plots arrows in the direction of tangents to the orbits. This is done because the slopes of the tangent lines are given by d y / d x. For our system (2.5), the slope is

\dfrac{d y}{d x}=\dfrac{c x+d y}{a x+b y} . \nonumber

In general, for autonomous systems, we obtain a first order differential equation of the form

\dfrac{d y}{d x}=F(x, y) . \nonumber

This particular equation can be solved by the reader. See homework problem 2.2

Example 2.4. Draw the direction field for Example 2.3.

We can use software to draw direction fields. However, one can sketch these fields by hand. At a given point (x, y) in the plane, the slope of the tangent to the orbit through that point is given by

\dfrac{d y}{d x}=\dfrac{-y}{-2 x}=\dfrac{y}{2 x} \nonumber

For each point in the plane one draws a piece of tangent line with this slope. In Figure 2.3 we show a few of these. For (x, y)=(1,1) the slope is d y / d x=1 / 2. So, we draw an arrow with slope 1 / 2 at this point. From system (2.17), we have that x^{\prime} and y^{\prime} are both negative at this point. Therefore, the vector points down and to the left.

We can do this for several points, as shown in Figure 2.3. Sometimes one can quickly sketch vectors with the same slope. For this example, when y=0, the slope is zero and when x=0 the slope is infinite, so several such vectors can be drawn at once. Vectors with the same slope lie along curves known as isoclines, on which \dfrac{d y}{d x}= constant.

It is often difficult to provide an accurate sketch of a direction field. Computer software can be used to provide a better rendition. For Example 2.3 the direction field is shown in Figure 2.4. Looking at this direction field, one can begin to "see" the orbits by following the tangent vectors.
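A direction field like the one in Figure 2.4 can be produced with a few lines of plotting code. The sketch below is not part of the original notes and assumes NumPy and Matplotlib are installed.

```python
# Not in the original notes: direction field for Example 2.3, x' = -2x, y' = -y.
import numpy as np
import matplotlib.pyplot as plt

x, y = np.meshgrid(np.linspace(-2, 2, 20), np.linspace(-2, 2, 20))
u, v = -2*x, -y                       # right hand sides of the system
norm = np.hypot(u, v) + 1e-12         # normalize so arrows have equal length
plt.quiver(x, y, u/norm, v/norm, angles='xy')
plt.xlabel('x'); plt.ylabel('y')
plt.title('Direction field for Example 2.3')
plt.show()
```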

Figure 2.3. A sketch of several tangent vectors for Example 2.3.
Figure 2.4. Direction field for Example 2.3.

Of course, one can superimpose the orbits on the direction field. This is shown in Figure 2.5. Are these the patterns you saw in Figure 2.4 ?

In this example we see all orbits "flow" towards the origin, or equilibrium point. Again, this is an example of what is called a stable node or a sink. (Imagine what happens to the water in a sink when the drain is unplugged.)

Example 2.5. Saddle

image
Figure 2.5. Phase portrait for Example 2.3.

\begin{aligned} &x^{\prime}=-x \\[4pt] &y^{\prime}=y \end{aligned} \nonumber

This is another uncoupled system. The solutions are again simply gotten by integration. We have that x(t)=c_{1} e^{-t} and y(t)=c_{2} e^{t}. Here we have that x decays as t gets large and y increases as t gets large. In particular, if one picks initial conditions with c_{2}=0, then orbits follow the x-axis towards the origin. For initial points with c_{1}=0, orbits originating on the y-axis will flow away from the origin. Of course, in these cases the origin is an equilibrium point and once at equilibrium, one remains there.

In fact, there is only one line on which to pick initial conditions such that the orbit leads towards the equilibrium point. No matter how small c_{2} is, sooner or later the exponential growth term will dominate the solution. One can see this behavior in Figure 2.6.

Similar to the first example, we can look at a variety of plots. These are given by Figures 2.6-2.7. The orbits can be obtained from the system as

\dfrac{d y}{d x}=\dfrac{d y / d t}{d x / d t}=-\dfrac{y}{x} \nonumber

The solution is y=\dfrac{A}{x}. For different values of A \neq 0 we obtain a family of hyperbolae. These are the same curves one might obtain for the level curves of a surface known as a saddle surface, z=x y. Thus, this type of equilibrium point is classified as a saddle point. From the phase portrait we can verify that there are many orbits that lead away from the origin (the equilibrium point), but there is one line of initial conditions, the x-axis, that leads to the origin.

image
Figure 2.6. Plots of solutions of Example 2.5 for several initial conditions.
image
Figure 2.7. Phase portrait for Example 2.5, a saddle.

Example 2.6. Unstable Node (source)

\begin{aligned} &x^{\prime}=2 x \\[4pt] &y^{\prime}=y \end{aligned} \nonumber

This example is similar to Example 2.3. The solutions are obtained by replacing t with -t. The solutions, orbits and direction fields can be seen in Figures 2.8-2.9. This is once again a node, but all orbits lead away from the equilibrium point. It is called an unstable node or a source.

image
Figure 2.8. Plots of solutions of Example 2.6 for several initial conditions.

Example 2.7. Center

\begin{aligned} &x^{\prime}=y \\[4pt] &y^{\prime}=-x \end{aligned} \nonumber

This system is a simple, coupled system. Neither equation can be solved without some information about the other unknown function. However, we can differentiate the first equation and use the second equation to obtain

x^{\prime \prime}+x=0 . \nonumber

We recognize this equation from the last chapter as one that appears in the study of simple harmonic motion. The solutions are pure sinusoidal oscillations:

x(t)=c_{1} \cos t+c_{2} \sin t, \quad y(t)=-c_{1} \sin t+c_{2} \cos t . \nonumber

In the phase plane the trajectories can be determined either by looking at the direction field, or solving the first order equation

image
Figure 2.9. Phase portrait for Example 2.6, an unstable node or source.

\dfrac{d y}{d x}=-\dfrac{x}{y} \text {. } \nonumber

Performing a separation of variables and integrating, we find that

x^{2}+y^{2}=C . \nonumber

Thus, we have a family of circles for C>0. (Can you prove this using the general solution?) Looking at the results graphically in Figures 2.10-2.11 confirms this result. This type of point is called a center.
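One way to answer the question posed above is to compute x^{2}+y^{2} directly from the general solution:

\begin{aligned} x^{2}(t)+y^{2}(t) &=\left(c_{1} \cos t+c_{2} \sin t\right)^{2}+\left(-c_{1} \sin t+c_{2} \cos t\right)^{2} \\[4pt] &=c_{1}^{2}\left(\cos ^{2} t+\sin ^{2} t\right)+c_{2}^{2}\left(\sin ^{2} t+\cos ^{2} t\right)+2 c_{1} c_{2} \cos t \sin t-2 c_{1} c_{2} \sin t \cos t \\[4pt] &=c_{1}^{2}+c_{2}^{2} . \end{aligned} \nonumber

So each orbit is a circle of radius \sqrt{c_{1}^{2}+c_{2}^{2}} centered at the origin, determined by the initial condition.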

Example 2.8. Focus (spiral)

\begin{aligned} &x^{\prime}=\alpha x+y \\[4pt] &y^{\prime}=-x . \end{aligned} \nonumber

In this example, we will see an additional set of behaviors of equilibrium points in planar systems. We have added one term, \alpha x, to the system in Example 2.7. We will consider the effects for two specific values of the parameter: \alpha=0.1,-0.2. The resulting behaviors are shown in the remaining graphs. We see orbits that look like spirals. These orbits are stable and unstable spirals (or foci, the plural of focus.)

We can understand these behaviors by once again relating the system of first order differential equations to a second order differential equation. Using our usual method for obtaining a second order equation from a system, we find that x(t) satisfies the differential equation

image
Figure 2.10. Plots of solutions of Example 2.7 for several initial conditions.
image
Figure 2.11. Phase portrait for Example 2.7, a center.

x^{\prime \prime}-\alpha x^{\prime}+x=0 \nonumber

We recall from our first course that this is a form of damped simple harmonic motion. We will explore the different types of solutions that will result for various \alpha ’s.

image
Figure 2.12. Plots of solutions of Example 2.8 for several initial conditions with \alpha=0.1
image
Figure 2.13. Plots of solutions of Example 2.8 for several initial conditions with \alpha=-0.2

The characteristic equation is r^{2}-\alpha r+1=0. The solution of this quadratic equation is

r=\dfrac{\alpha \pm \sqrt{\alpha^{2}-4}}{2} \nonumber

There are seven special cases to consider, as shown below.

\text { Classification of Solutions of } x^{\prime \prime}-\alpha x^{\prime}+x=0 \nonumber

  1. \alpha=-2. There is one real solution. This case is called critical damping since the solution r=-1 leads to exponential decay. The solution is x(t)=\left(c_{1}+c_{2} t\right) e^{-t}.
  2. \alpha<-2. There are two real, negative solutions, r=-\mu,-\nu, \mu, \nu>0. The solution is x(t)=c_{1} e^{-\mu t}+c_{2} e^{-\nu t}. In this case we have what is called overdamped motion. There are no oscillations.
  3. -2<\alpha<0. There are two complex conjugate solutions r= \alpha / 2 \pm i \beta with real part less than zero and \beta=\dfrac{\sqrt{4-\alpha^{2}}}{2}. The solution is x(t)=\left(c_{1} \cos \beta t+c_{2} \sin \beta t\right) e^{\alpha t / 2}. Since \alpha<0, this consists of a decaying exponential times oscillations. This is often called an underdamped oscillation.
  4. \alpha=0. This leads to simple harmonic motion.
  5. 0<\alpha<2. This is similar to the underdamped case, except \alpha>0. The solutions are growing oscillations.
  6. \alpha=2. There is one real solution. The solution is x(t)=\left(c_{1}+\right. \left.c_{2} t\right) e^{t}. It leads to unbounded growth in time.
  7. \alpha>2. There are two real, positive solutions, r=\mu, \nu>0. The solution is x(t)=c_{1} e^{\mu t}+c_{2} e^{\nu t}, which grows in time.

For \alpha<0 the solutions lose energy, so they can oscillate with a diminishing amplitude. For \alpha>0, the amplitude grows in time, which is not typical of damped motion. Of course, if the magnitude of \alpha is 2 or larger there are no oscillations at all.
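This classification is easy to reproduce numerically. The short Python sketch below (the sample values of \alpha are arbitrary illustrative choices) computes the roots of r^{2}-\alpha r+1=0 and reports their type.

import numpy as np

def classify(alpha):
    # Roots of the characteristic equation r^2 - alpha r + 1 = 0
    roots = np.roots([1.0, -alpha, 1.0])
    if abs(alpha) == 2:
        kind = "one repeated real root"
    elif abs(alpha) > 2:
        kind = "two real roots"
    else:
        kind = "complex conjugate roots"
    return roots, kind

for alpha in [-3.0, -2.0, -0.2, 0.0, 0.1, 2.0, 3.0]:
    roots, kind = classify(alpha)
    print(f"alpha = {alpha:5.1f}: {kind}, r = {np.round(roots, 3)}")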

Example 2.9. Degenerate Node

\begin{aligned} &x^{\prime}=-x \\[4pt] &y^{\prime}=-2 x-y \end{aligned} \nonumber

For this example, we write out the solutions. While it is a coupled system, only the second equation is coupled. There are two possible approaches.

a. We could solve the first equation to find x(t)=c_{1} e^{-t}. Inserting this solution into the second equation, we have

y^{\prime}+y=-2 c_{1} e^{-t} \nonumber

This is a relatively simple linear first order equation for y=y(t). The integrating factor is \mu=e^{t}. The solution is found as y(t)=\left(c_{2}-2 c_{1} t\right) e^{-t}.

b. Another method would be to proceed to rewrite this as a second order equation. Computing x^{\prime \prime} does not get us very far. So, we look at

image
Figure 2.14. Phase portrait for Example 2.8 with \alpha=0.1. This is an unstable focus, or spiral.

\begin{aligned} y^{\prime \prime} &=-2 x^{\prime}-y^{\prime} \\[4pt] &=2 x-y^{\prime} \\[4pt] &=-2 y^{\prime}-y \end{aligned} \nonumber

Therefore, y satisfies

y^{\prime \prime}+2 y^{\prime}+y=0 . \nonumber

The characteristic equation has one real root, r=-1. So, we write

y(t)=\left(k_{1}+k_{2} t\right) e^{-t} . \nonumber

This is a stable degenerate node. Combining this with the solution x(t)= c_{1} e^{-t}, we can show that y(t)=\left(c_{2}-2 c_{1} t\right) e^{-t} as before.

In Figure 2.16 we see several orbits in this system. It differs from the stable node shown in Figure 2.2 in that there is only one direction along which the orbits approach the origin instead of two. If one picks c_{1}=0, then x(t)=0 and y(t)=c_{2} e^{-t}. This leads to orbits running along the y-axis as seen in the figure.

image
Figure 2.15. Phase portrait for Example 2.8 with \alpha=-0.2. This is a stable focus, or spiral.

Example 2.10. In this last example, we have the coupled set of equations

\begin{aligned} &x^{\prime}=2 x-y \\[4pt] &y^{\prime}=-2 x+y \end{aligned} \nonumber

We rewrite this system as a second order differential equation:

\begin{aligned} x^{\prime \prime} &=2 x^{\prime}-y^{\prime} \\[4pt] &=2 x^{\prime}-(-2 x+y) \\[4pt] &=2 x^{\prime}+2 x+\left(x^{\prime}-2 x\right)=3 x^{\prime} \end{aligned} \nonumber

So, the second order equation is

x^{\prime \prime}-3 x^{\prime}=0 \nonumber

and the characteristic equation is 0=r(r-3). This gives the general solution as

x(t)=c_{1}+c_{2} e^{3 t} \nonumber

and thus

y=2 x-x^{\prime}=2\left(c_{1}+c_{2} e^{3 t}\right)-\left(3 c_{2} e^{3 t}\right)=2 c_{1}-c_{2} e^{3 t} \nonumber

In Figure 2.17 we show the direction field. The constant slope field seen in this example is confirmed by a simple computation:

\dfrac{d y}{d x}=\dfrac{-2 x+y}{2 x-y}=-1 \nonumber

Furthermore, looking at initial conditions with y=2 x, we have at t=0,

image
Figure 2.16. Plots of solutions of Example 2.9 for several initial conditions.

2 c_{1}-c_{2}=2\left(c_{1}+c_{2}\right) \quad \Rightarrow \quad c_{2}=0 \nonumber

Therefore, points on this line remain on this line forever: (x, y)=\left(c_{1}, 2 c_{1}\right). This line of fixed points is called a line of equilibria.
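A quick numerical experiment (a sketch using scipy; the two initial points are arbitrary choices) illustrates the line of equilibria: a point on y=2 x does not move, while any other point runs off along a line of slope -1.

import numpy as np
from scipy.integrate import solve_ivp

# Example 2.10: x' = 2x - y, y' = -2x + y
def rhs(t, u):
    x, y = u
    return [2 * x - y, -2 * x + y]

for u0 in ([1.0, 2.0], [1.0, 0.0]):      # on the line y = 2x, and off it
    sol = solve_ivp(rhs, (0.0, 1.0), u0, t_eval=[0.0, 0.5, 1.0])
    print("start", u0, "->", np.round(sol.y[:, -1], 3))
# (1, 2) stays fixed; (1, 0) moves away along a line of slope -1.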

Polar Representation of Spirals

In the examples with a center or a spiral, one might be able to write the solutions in polar coordinates. Recall that a point in the plane can be described by either Cartesian (x, y) or polar (r, \theta) coordinates. Given the polar form, one can find the Cartesian components using

x=r \cos \theta \text { and } y=r \sin \theta . \nonumber

Given the Cartesian coordinates, one can find the polar coordinates using

r^{2}=x^{2}+y^{2} \text { and } \tan \theta=\dfrac{y}{x} . \nonumber

Since x and y are functions of t, then naturally we can think of r and \theta as functions of t. The equations that they satisfy are obtained by differentiating the above relations with respect to t.

image
Figure 2.17. Plots of direction field of Example 2.10.

Differentiating the first equation in (2.27) gives

r r^{\prime}=x x^{\prime}+y y^{\prime} . \nonumber

Inserting the expressions for x^{\prime} and y^{\prime} from system 2.5, we have

r r^{\prime}=x(a x+b y)+y(c x+d y) . \nonumber

In some cases this may be written entirely in terms of r ’s. Similarly, we have that

\theta^{\prime}=\dfrac{x y^{\prime}-y x^{\prime}}{r^{2}} \nonumber

which the reader can prove for homework.

In summary, when converting first order equations from rectangular to polar form, one needs the relations below.

Time Derivatives of Polar Variables

\begin{aligned} r^{\prime} &=\dfrac{x x^{\prime}+y y^{\prime}}{r} \\[4pt] \theta^{\prime} &=\dfrac{x y^{\prime}-y x^{\prime}}{r^{2}} \end{aligned} \nonumber

Example 2.11. Rewrite the following system in polar form and solve the resulting system.

\begin{aligned} &x^{\prime}=a x+b y \\[4pt] &y^{\prime}=-b x+a y \end{aligned} \nonumber

We first compute r^{\prime} and \theta^{\prime} :

\begin{gathered} r r^{\prime}=x x^{\prime}+y y^{\prime}=x(a x+b y)+y(-b x+a y)=a r^{2} \\[4pt] r^{2} \theta^{\prime}=x y^{\prime}-y x^{\prime}=x(-b x+a y)-y(a x+b y)=-b r^{2} . \end{gathered} \nonumber

This leads to the simpler system

\begin{aligned} &r^{\prime}=a r \\[4pt] &\theta^{\prime}=-b \end{aligned} \nonumber

This system is uncoupled. The second equation in this system indicates that, for b>0, we traverse the orbit at a constant rate in the clockwise direction. Solving these equations, we have that r(t)=r_{0} e^{a t}, \quad \theta(t)=\theta_{0}-b t. Eliminating t between these solutions, we finally find the polar equation of the orbits:

r=r_{0} e^{-a\left(\theta-\theta_{0}\right) / b} \nonumber

If you graph this for a \neq 0, you will get stable or unstable spirals.

Example 2.12. Consider the specific system

\begin{aligned} &x^{\prime}=-y+x \\[4pt] &y^{\prime}=x+y . \end{aligned} \nonumber

In order to convert this system into polar form, we compute

\begin{gathered} r r^{\prime}=x x^{\prime}+y y^{\prime}=x(-y+x)+y(x+y)=r^{2} \\[4pt] r^{2} \theta^{\prime}=x y^{\prime}-y x^{\prime}=x(x+y)-y(-y+x)=r^{2} \end{gathered} \nonumber

This leads to the simpler system

\begin{aligned} &r^{\prime}=r \\[4pt] &\theta^{\prime}=1 \end{aligned} \nonumber

Solving these equations yields

r(t)=r_{0} e^{t}, \quad \theta(t)=t+\theta_{0} . \nonumber
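Before eliminating t, it is worth checking this solution against a direct numerical integration of the Cartesian system. In the Python sketch below (using scipy; the initial point is an arbitrary choice), the numerically computed r(t)=\sqrt{x^{2}+y^{2}} matches r_{0} e^{t}.

import numpy as np
from scipy.integrate import solve_ivp

# Example 2.12: x' = -y + x, y' = x + y
def rhs(t, u):
    x, y = u
    return [-y + x, x + y]

r0 = np.hypot(0.5, 0.0)
sol = solve_ivp(rhs, (0.0, 2.0), [0.5, 0.0], t_eval=np.linspace(0.0, 2.0, 5),
                rtol=1e-9, atol=1e-12)
r_numeric = np.hypot(sol.y[0], sol.y[1])
print(np.round(r_numeric, 5))
print(np.round(r0 * np.exp(sol.t), 5))   # agrees with r(t) = r0 * e^t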

Eliminating t from this solution gives the orbits in the phase plane, r(\theta)=r_{0} e^{\theta-\theta_{0}}.

A more complicated example arises for a nonlinear system of differential equations. Consider the following example.

Example 2.13.

\begin{aligned} &x^{\prime}=-y+x\left(1-x^{2}-y^{2}\right) \\[4pt] &y^{\prime}=x+y\left(1-x^{2}-y^{2}\right) \end{aligned} \nonumber

Transforming to polar coordinates as before, one can show that

r^{\prime}=r\left(1-r^{2}\right), \quad \theta^{\prime}=1 \nonumber

This uncoupled system can be solved and such nonlinear systems will be studied in the next chapter.

Matrix Formulation

We have investigated several linear systems in the plane and in the next chapter we will use some of these ideas to investigate nonlinear systems. We need a deeper insight into the solutions of planar systems. So, in this section we will recast the first order linear systems into matrix form. This will lead to a better understanding of first order systems and allow for extensions to higher dimensions and the solution of nonhomogeneous equations later in this chapter.

We start with the usual homogeneous system in Equation (2.5). Let the unknowns be represented by the vector

\mathbf{x}(t)=\left(\begin{array}{c} x(t) \\[4pt] y(t) \end{array}\right) \nonumber

Then we have that

\mathbf{x}^{\prime}=\left(\begin{array}{l} x^{\prime} \\[4pt] y^{\prime} \end{array}\right)=\left(\begin{array}{c} a x+b y \\[4pt] c x+d y \end{array}\right)=\left(\begin{array}{ll} a & b \\[4pt] c & d \end{array}\right)\left(\begin{array}{l} x \\[4pt] y \end{array}\right) \equiv A \mathbf{x} \nonumber

Here we have introduced the coefficient matrix A. This is a first order vector differential equation,

\mathbf{x}^{\prime}=A \mathbf{x} \nonumber

Formally, we can write the solution as

\mathbf{x}(t)=e^{A t} \mathbf{x}_{0} \nonumber

We would like to investigate the solution of our system. Our investigations will lead to new techniques for solving linear systems using matrix methods.

Note: the exponential of a matrix is defined using the Maclaurin series expansion of the exponential function,

e^{x}=\sum_{k=0}^{\infty} \dfrac{x^{k}}{k !}=1+x+\dfrac{x^{2}}{2 !}+\dfrac{x^{3}}{3 !}+\cdots \nonumber

So, we define

e^{A}=\sum_{k=0}^{\infty} \dfrac{A^{k}}{k !}=I+A+\dfrac{A^{2}}{2 !}+\dfrac{A^{3}}{3 !}+\cdots \nonumber

In general, it is difficult to compute e^{A} unless A is diagonal.
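In practice, numerical libraries compute the matrix exponential reliably. A short Python sketch (using scipy's expm; the matrix and initial vector are arbitrary illustrative choices) checks that \mathbf{x}(t)=e^{A t} \mathbf{x}_{0} agrees with a direct numerical integration of \mathbf{x}^{\prime}=A \mathbf{x}.

import numpy as np
from scipy.linalg import expm
from scipy.integrate import solve_ivp

A = np.array([[1.0, 2.0],
              [3.0, 2.0]])          # an arbitrary 2x2 example
x0 = np.array([1.0, 0.0])
t = 0.7

x_expm = expm(A * t) @ x0           # x(t) = e^{At} x0
sol = solve_ivp(lambda s, u: A @ u, (0.0, t), x0, rtol=1e-10, atol=1e-12)
print(np.round(x_expm, 6), np.round(sol.y[:, -1], 6))   # the two agree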

We begin by recalling the solution to the specific problem (2.12). We obtained the solution to this system as

\begin{gathered} x(t)=c_{1} e^{t}+c_{2} e^{-4 t}, \\[4pt] y(t)=\dfrac{1}{3} c_{1} e^{t}-\dfrac{1}{2} c_{2} e^{-4 t} \end{gathered} \nonumber

This can be rewritten using matrix operations. Namely, we first write the solution in vector form.

\begin{aligned} \mathbf{x} &=\left(\begin{array}{c} x(t) \\[4pt] y(t) \end{array}\right) \\[4pt] &=\left(\begin{array}{c} c_{1} e^{t}+c_{2} e^{-4 t} \\[4pt] \dfrac{1}{3} c_{1} e^{t}-\dfrac{1}{2} c_{2} e^{-4 t} \end{array}\right) \\[4pt] &=\left(\begin{array}{c} c_{1} e^{t} \\[4pt] \dfrac{1}{3} c_{1} e^{t} \end{array}\right)+\left(\begin{array}{c} c_{2} e^{-4 t} \\[4pt] -\dfrac{1}{2} c_{2} e^{-4 t} \end{array}\right) \\[4pt] &=c_{1}\left(\begin{array}{c} 1 \\[4pt] \dfrac{1}{3} \end{array}\right) e^{t}+c_{2}\left(\begin{array}{c} 1 \\[4pt] -\dfrac{1}{2} \end{array}\right) e^{-4 t} \end{aligned} \nonumber

We see that our solution is in the form of a linear combination of vectors of the form

\mathbf{x}=\mathbf{v} e^{\lambda t} \nonumber

with \mathbf{v} a constant vector and \lambda a constant number. This is similar to how we began to find solutions to second order constant coefficient equations. So, for the general problem (2.3) we insert this guess. Thus,

\begin{aligned} \mathbf{x}^{\prime} &=A \mathbf{x} \Rightarrow \\[4pt] \lambda \mathbf{v} e^{\lambda t} &=A \mathbf{v} e^{\lambda t} . \end{aligned} \nonumber

For this to be true for all t, we have that

A \mathbf{v}=\lambda \mathbf{v} . \nonumber

This is an eigenvalue problem. A is a 2 \times 2 matrix for our problem, but could easily be generalized to a system of n first order differential equations. We will confine our remarks for now to planar systems. However, we need to recall how to solve eigenvalue problems and then see how solutions of eigenvalue problems can be used to obtain solutions to our systems of differential equations.


2.4 Eigenvalue Problems

We seek nontrivial solutions to the eigenvalue problem

A \mathbf{v}=\lambda \mathbf{v} . \nonumber

We note that \mathbf{v}=\mathbf{0} is an obvious solution. Furthermore, it does not lead to anything useful. So, it is called a trivial solution. Typically, we are given the matrix A and have to determine the eigenvalues, \lambda, and the associated eigenvectors, \mathbf{v}, satisfying the above eigenvalue problem. Later in the course we will explore other types of eigenvalue problems.

For now we begin to solve the eigenvalue problem for \mathbf{v}=\left(\begin{array}{l}v_{1} \\[4pt] v_{2}\end{array}\right). Inserting this into Equation (2.39), we obtain the homogeneous algebraic system

\begin{aligned} &(a-\lambda) v_{1}+b v_{2}=0 \\[4pt] &c v_{1}+(d-\lambda) v_{2}=0 . \end{aligned} \nonumber

The solution of such a system would be unique if the determinant of the system is not zero. However, this would give the trivial solution v_{1}=0, v_{2}=0. To get a nontrivial solution, we need to force the determinant to be zero. This yields the eigenvalue equation

0=\left|\begin{array}{cc} a-\lambda & b \\[4pt] c & d-\lambda \end{array}\right|=(a-\lambda)(d-\lambda)-b c . \nonumber

This is a quadratic equation for the eigenvalues that would lead to nontrivial solutions. If we expand the right side of the equation, we find that

\lambda^{2}-(a+d) \lambda+a d-b c=0 . \nonumber

This is the same equation as the characteristic equation (2.8) for the general constant coefficient differential equation considered in the first chapter. Thus, the eigenvalues correspond to the solutions of the characteristic polynomial for the system.

Once we find the eigenvalues, then there are possibly an infinite number of solutions to the algebraic system. We will see this in the examples.

So, the process is to

a) Write the coefficient matrix;

b) Find the eigenvalues from the equation \operatorname{det}(A-\lambda I)=0; and,

c) Find the eigenvectors by solving the linear system (A-\lambda I) \mathbf{v}=0 for each \lambda.
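These steps are exactly what numerical linear algebra routines carry out. The Python sketch below (using numpy) applies them to the coefficient matrix of Example 2.14, which is worked by hand in the next section; numpy returns unit-length eigenvectors, which are scalar multiples of the ones found by hand.

import numpy as np

A = np.array([[4.0, 2.0],
              [3.0, 3.0]])            # the matrix of Example 2.14 below

evals, evecs = np.linalg.eig(A)       # steps (b) and (c) together
print(np.round(evals, 6))             # the eigenvalues 1 and 6 (in some order)
for lam, v in zip(evals, evecs.T):
    print(f"lambda = {lam:.1f}, v = {np.round(v, 4)}")
    print("  check A v - lambda v =", np.round(A @ v - lam * v, 10))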

Solving Constant Coefficient Systems in 2 \mathrm{D}

Before proceeding to examples, we first indicate the types of solutions that could result from the solution of a homogeneous, constant coefficient system of first order differential equations. We begin with the linear system of differential equations in matrix form.

\dfrac{d \mathbf{x}}{d t}=\left(\begin{array}{ll} a & b \\[4pt] c & d \end{array}\right) \mathbf{x}=A \mathbf{x} . \nonumber

The type of behavior depends upon the eigenvalues of matrix A. The procedure is to determine the eigenvalues and eigenvectors and use them to construct the general solution.

If we have an initial condition, \mathbf{x}\left(t_{0}\right)=\mathbf{x}_{0}, we can determine the two arbitrary constants in the general solution in order to obtain the particular solution. Thus, if \mathbf{x}_{1}(t) and \mathbf{x}_{2}(t) are two linearly independent solutions { }^{2}, then the general solution is given as

\mathbf{x}(t)=c_{1} \mathbf{x}_{1}(t)+c_{2} \mathbf{x}_{2}(t) \nonumber

Then, setting t=0, we get two linear equations for c_{1} and c_{2} :

c_{1} \mathbf{x}_{1}(0)+c_{2} \mathbf{x}_{2}(0)=\mathbf{x}_{0} \nonumber

The major work is in finding the linearly independent solutions. This depends upon the different types of eigenvalues that one obtains from solving the eigenvalue equation, \operatorname{det}(A-\lambda I)=0. The nature of these roots indicate the form of the general solution. On the next page we summarize the classification of solutions in terms of the eigenvalues of the coefficient matrix. We first make some general remarks about the plausibility of these solutions and then provide examples in the following section to clarify the matrix methods for our two dimensional systems.

The construction of the general solution in Case I is straightforward. However, the other two cases need a little explanation.

{ }^{2} Recall that linear independence means that c_{1} \mathbf{x}_{1}(t)+c_{2} \mathbf{x}_{2}(t)=\mathbf{0} for all t only if c_{1}=c_{2}=0. The reader should derive the condition on the \mathbf{x}_{i} for linear independence.

Classification of the Solutions for Two
Linear First Order Differential Equations
  1. Case I: Two real, distinct roots.

Solve the eigenvalue problem A \mathbf{v}=\lambda \mathbf{v} for each eigenvalue obtaining two eigenvectors \mathbf{v}_{1}, \mathbf{v}_{2}. Then write the general solution as a linear combination \mathbf{x}(t)=c_{1} e^{\lambda_{1} t} \mathbf{v}_{1}+c_{2} e^{\lambda_{2} t} \mathbf{v}_{2}

  2. Case II: One Repeated Root.

Solve the eigenvalue problem A \mathbf{v}=\lambda \mathbf{v} for one eigenvalue \lambda, obtaining the first eigenvector \mathbf{v}_{1}. One then needs a second linearly independent solution. This is obtained by solving the nonhomogeneous problem A \mathbf{v}_{2}-\lambda \mathbf{v}_{2}=\mathbf{v}_{1} for \mathbf{v}_{2}.

The general solution is then given by \mathbf{x}(t)=c_{1} e^{\lambda t} \mathbf{v}_{1}+c_{2} e^{\lambda t}\left(\mathbf{v}_{2}+t \mathbf{v}_{1}\right).

  3. Case III: Two complex conjugate roots.

Solve the eigenvalue problem A \mathbf{x}=\lambda \mathbf{x} for one eigenvalue, \lambda=\alpha+i \beta, obtaining one eigenvector \mathbf{v}. Note that this eigenvector may have complex entries. Thus, one can write the vector \mathbf{y}(t)=e^{\lambda t} \mathbf{v}=e^{\alpha t}(\cos \beta t+ i \sin \beta t) \mathbf{v}. Now, construct two linearly independent solutions to the problem using the real and imaginary parts of \mathbf{y}(t): \mathbf{y}_{1}(t)=\operatorname{Re}(\mathbf{y}(t)) and \mathbf{y}_{2}(t)=\operatorname{Im}(\mathbf{y}(t)). Then the general solution can be written as \mathbf{x}(t)=c_{1} \mathbf{y}_{1}(t)+c_{2} \mathbf{y}_{2}(t)

Let’s consider Case III. Note that since the original system of equations does not have any i ’s, then we would expect real solutions. So, we look at the real and imaginary parts of the complex solution. We have that the complex solution satisfies the equation

\dfrac{d}{d t}[\operatorname{Re}(\mathbf{y}(t))+i \operatorname{Im}(\mathbf{y}(t))]=A[\operatorname{Re}(\mathbf{y}(t))+i \operatorname{Im}(\mathbf{y}(t))] \nonumber

Differentiating the sum and splitting the real and imaginary parts of the equation, gives

\dfrac{d}{d t} \operatorname{Re}(\mathbf{y}(t))+i \dfrac{d}{d t} \operatorname{Im}(\mathbf{y}(t))=A[\operatorname{Re}(\mathbf{y}(t))]+i A[\operatorname{Im}(\mathbf{y}(t))] . \nonumber

Setting the real and imaginary parts equal, we have

\dfrac{d}{d t} \operatorname{Re}(\mathbf{y}(t))=A[\operatorname{Re}(\mathbf{y}(t))] \nonumber

and

\dfrac{d}{d t} \operatorname{Im}(\mathbf{y}(t))=A[\operatorname{Im}(\mathbf{y}(t))] \nonumber

Therefore, the real and imaginary parts each are linearly independent solutions of the system and the general solution can be written as a linear combination of these expressions. We now turn to Case II. Writing the system of first order equations as a second order equation for x(t) with the sole solution of the characteristic equation, \lambda=\dfrac{1}{2}(a+d), we have that the general solution takes the form

x(t)=\left(c_{1}+c_{2} t\right) e^{\lambda t} . \nonumber

This suggests that the second linearly independent solution involves a term of the form v t e^{\lambda t}. It turns out that the guess that works is

\mathbf{x}=t e^{\lambda t} \mathbf{v}_{1}+e^{\lambda t} \mathbf{v}_{2} \nonumber

Inserting this guess into the system \mathbf{x}^{\prime}=A \mathbf{x} yields

\begin{aligned} \left(t e^{\lambda t} \mathbf{v}_{1}+e^{\lambda t} \mathbf{v}_{2}\right)^{\prime} &=A\left[t e^{\lambda t} \mathbf{v}_{1}+e^{\lambda t} \mathbf{v}_{2}\right] . \\[4pt] e^{\lambda t} \mathbf{v}_{1}+\lambda t e^{\lambda t} \mathbf{v}_{1}+\lambda e^{\lambda t} \mathbf{v}_{2} &=\lambda t e^{\lambda t} \mathbf{v}_{1}+e^{\lambda t} A \mathbf{v}_{2} . \\[4pt] e^{\lambda t}\left(\mathbf{v}_{1}+\lambda \mathbf{v}_{2}\right) &=e^{\lambda t} A \mathbf{v}_{2} . \end{aligned} \nonumber

Noting this is true for all t, we find that

\mathbf{v}_{1}+\lambda \mathbf{v}_{2}=A \mathbf{v}_{2} . \nonumber

Therefore,

(A-\lambda I) \mathbf{v}_{2}=\mathbf{v}_{1} \text {. } \nonumber

We know everything except for \mathbf{v}_{2}. So, we just solve for it and obtain the second linearly independent solution.
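For a concrete instance, here is a short Python sketch (a hypothetical numerical check, using the repeated-eigenvalue matrix that appears in Example 2.16 below). Since A-\lambda I is singular, np.linalg.lstsq is used to pick one solution; it differs from the hand-computed \mathbf{v}_{2} only by a multiple of \mathbf{v}_{1}, which can be absorbed into the arbitrary constants.

import numpy as np

A = np.array([[7.0, -1.0],
              [9.0,  1.0]])           # the matrix of Example 2.16 below
lam = 4.0
v1 = np.array([1.0, 3.0])             # eigenvector for lambda = 4

# (A - lam I) is singular, so use least squares to pick one solution v2.
v2, *_ = np.linalg.lstsq(A - lam * np.eye(2), v1, rcond=None)
print(np.round(v2, 4))                                  # one valid choice of v2
print(np.round((A - lam * np.eye(2)) @ v2 - v1, 10))    # residual is zero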

Examples of the Matrix Method

Here we will give some examples for typical systems for the three cases mentioned in the last section.

Example 2.14. A=\left(\begin{array}{ll}4 & 2 \\[4pt] 3 & 3\end{array}\right).

Eigenvalues: We first determine the eigenvalues.

0=\left|\begin{array}{cc} 4-\lambda & 2 \\[4pt] 3 & 3-\lambda \end{array}\right| \nonumber

Therefore,

\begin{aligned} &0=(4-\lambda)(3-\lambda)-6 \\[4pt] &0=\lambda^{2}-7 \lambda+6 \\[4pt] &0=(\lambda-1)(\lambda-6) \end{aligned} \nonumber

The eigenvalues are then \lambda=1,6. This is an example of Case I.

Eigenvectors: Next we determine the eigenvectors associated with each of these eigenvalues. We have to solve the system A \mathbf{v}=\lambda \mathbf{v} in each case. Case \lambda=1

\begin{gathered} \left(\begin{array}{ll} 4 & 2 \\[4pt] 3 & 3 \end{array}\right)\left(\begin{array}{l} v_{1} \\[4pt] v_{2} \end{array}\right)=\left(\begin{array}{l} v_{1} \\[4pt] v_{2} \end{array}\right) \\[4pt] \left(\begin{array}{ll} 3 & 2 \\[4pt] 3 & 2 \end{array}\right)\left(\begin{array}{l} v_{1} \\[4pt] v_{2} \end{array}\right)=\left(\begin{array}{l} 0 \\[4pt] 0 \end{array}\right) \end{gathered} \nonumber

This gives 3 v_{1}+2 v_{2}=0. One possible solution yields an eigenvector of

\left(\begin{array}{l} v_{1} \\[4pt] v_{2} \end{array}\right)=\left(\begin{array}{c} 2 \\[4pt] -3 \end{array}\right) \nonumber

Case \lambda=6

\begin{aligned} &\left(\begin{array}{cc} 4 & 2 \\[4pt] 3 & 3 \end{array}\right)\left(\begin{array}{l} v_{1} \\[4pt] v_{2} \end{array}\right)=6\left(\begin{array}{l} v_{1} \\[4pt] v_{2} \end{array}\right) \\[4pt] &\left(\begin{array}{cc} -2 & 2 \\[4pt] 3 & -3 \end{array}\right)\left(\begin{array}{l} v_{1} \\[4pt] v_{2} \end{array}\right)=\left(\begin{array}{l} 0 \\[4pt] 0 \end{array}\right) \end{aligned} \nonumber

For this case we need to solve -2 v_{1}+2 v_{2}=0. This yields

\left(\begin{array}{l} v_{1} \\[4pt] v_{2} \end{array}\right)=\left(\begin{array}{l} 1 \\[4pt] 1 \end{array}\right) \nonumber

General Solution: We can now construct the general solution.

\begin{aligned} \mathbf{x}(t) &=c_{1} e^{\lambda_{1} t} \mathbf{v}_{1}+c_{2} e^{\lambda_{2} t} \mathbf{v}_{2} \\[4pt] &=c_{1} e^{t}\left(\begin{array}{c} 2 \\[4pt] -3 \end{array}\right)+c_{2} e^{6 t}\left(\begin{array}{l} 1 \\[4pt] 1 \end{array}\right) \\[4pt] &=\left(\begin{array}{c} 2 c_{1} e^{t}+c_{2} e^{6 t} \\[4pt] -3 c_{1} e^{t}+c_{2} e^{6 t} \end{array}\right) . \end{aligned} \nonumber

Example 2.15. A=\left(\begin{array}{ll}3 & -5 \\[4pt] 1 & -1\end{array}\right).

Eigenvalues: Again, one solves the eigenvalue equation.

0=\left|\begin{array}{cc} 3-\lambda & -5 \\[4pt] 1 & -1-\lambda \end{array}\right| \nonumber

Therefore,

\begin{aligned} &0=(3-\lambda)(-1-\lambda)+5 \\[4pt] &0=\lambda^{2}-2 \lambda+2 \\[4pt] &\lambda=\dfrac{-(-2) \pm \sqrt{4-4(1)(2)}}{2}=1 \pm i \end{aligned} \nonumber

The eigenvalues are then \lambda=1+i, 1-i. This is an example of Case III.

Eigenvectors: In order to find the general solution, we need only find the eigenvector associated with 1+i.

\begin{gathered} \left(\begin{array}{cc} 3 & -5 \\[4pt] 1 & -1 \end{array}\right)\left(\begin{array}{l} v_{1} \\[4pt] v_{2} \end{array}\right)=(1+i)\left(\begin{array}{l} v_{1} \\[4pt] v_{2} \end{array}\right) \\[4pt] \left(\begin{array}{cc} 2-i & -5 \\[4pt] 1 & -2-i \end{array}\right)\left(\begin{array}{l} v_{1} \\[4pt] v_{2} \end{array}\right)=\left(\begin{array}{l} 0 \\[4pt] 0 \end{array}\right) \end{gathered} \nonumber

We need to solve (2-i) v_{1}-5 v_{2}=0. Thus,

\left(\begin{array}{l} v_{1} \\[4pt] v_{2} \end{array}\right)=\left(\begin{array}{c} 2+i \\[4pt] 1 \end{array}\right) \nonumber

Complex Solution: In order to get the two real linearly independent solutions, we need to compute the real and imaginary parts of \mathbf{v} e^{\lambda t}.

\begin{aligned} e^{\lambda t}\left(\begin{array}{c} 2+i \\[4pt] 1 \end{array}\right) &=e^{(1+i) t}\left(\begin{array}{c} 2+i \\[4pt] 1 \end{array}\right) \\[4pt] &=e^{t}(\cos t+i \sin t)\left(\begin{array}{c} 2+i \\[4pt] 1 \end{array}\right) \\[4pt] &=e^{t}\left(\begin{array}{c} (2+i)(\cos t+i \sin t) \\[4pt] \cos t+i \sin t \end{array}\right) \\[4pt] &=e^{t}\left(\begin{array}{c} (2 \cos t-\sin t)+i(\cos t+2 \sin t) \\[4pt] \cos t+i \sin t \end{array}\right) \\[4pt] &=e^{t}\left(\begin{array}{c} 2 \cos t-\sin t \\[4pt] \cos t \end{array}\right)+i e^{t}\left(\begin{array}{c} \cos t+2 \sin t \\[4pt] \sin t \end{array}\right) \end{aligned} \nonumber

General Solution: Now we can construct the general solution.

\begin{aligned} \mathbf{x}(t) &=c_{1} e^{t}\left(\begin{array}{l} 2 \cos t-\sin t \\[4pt] \cos t \end{array}\right)+c_{2} e^{t}\left(\begin{array}{c} \cos t+2 \sin t \\[4pt] \sin t \end{array}\right) \\[4pt] &=e^{t}\left(\begin{array}{c} c_{1}(2 \cos t-\sin t)+c_{2}(\cos t+2 \sin t) \\[4pt] c_{1} \cos t+c_{2} \sin t \end{array}\right) \end{aligned} \nonumber

Note: This can be rewritten as

\mathbf{x}(t)=e^{t} \cos t\left(\begin{array}{c} 2 c_{1}+c_{2} \\[4pt] c_{1} \end{array}\right)+e^{t} \sin t\left(\begin{array}{c} 2 c_{2}-c_{1} \\[4pt] c_{2} \end{array}\right) \nonumber

Example 2.16. A=\left(\begin{array}{cc}7 & -1 \\[4pt] 9 & 1\end{array}\right).

Eigenvalues:

0=\left|\begin{array}{cc} 7-\lambda & -1 \\[4pt] 9 & 1-\lambda \end{array}\right| \nonumber

Therefore,

\begin{aligned} &0=(7-\lambda)(1-\lambda)+9 \\[4pt] &0=\lambda^{2}-8 \lambda+16 \\[4pt] &0=(\lambda-4)^{2} \end{aligned} \nonumber

There is only one real eigenvalue, \lambda=4. This is an example of Case II.

Eigenvectors: In this case we first solve for \mathbf{v}_{1} and then get the second linearly independent vector.

\begin{aligned} &\left(\begin{array}{cc} 7 & -1 \\[4pt] 9 & 1 \end{array}\right)\left(\begin{array}{l} v_{1} \\[4pt] v_{2} \end{array}\right)=4\left(\begin{array}{l} v_{1} \\[4pt] v_{2} \end{array}\right) \\[4pt] &\left(\begin{array}{ll} 3 & -1 \\[4pt] 9 & -3 \end{array}\right)\left(\begin{array}{l} v_{1} \\[4pt] v_{2} \end{array}\right)=\left(\begin{array}{l} 0 \\[4pt] 0 \end{array}\right) \end{aligned} \nonumber

Therefore, we have

3 v_{1}-v_{2}=0, \quad \Rightarrow \quad\left(\begin{array}{l} v_{1} \\[4pt] v_{2} \end{array}\right)=\left(\begin{array}{l} 1 \\[4pt] 3 \end{array}\right) \text {. } \nonumber

Second Linearly Independent Solution:

Now we need to solve A \mathbf{v}_{2}-\lambda \mathbf{v}_{2}=\mathbf{v}_{1}. That is,

\begin{aligned} & \left(\begin{array}{cc}7 & -1 \\[4pt]9 & 1\end{array}\right)\left(\begin{array}{l}u_{1} \\[4pt]u_{2}\end{array}\right)-4\left(\begin{array}{l}u_{1} \\[4pt]u_{2}\end{array}\right)=\left(\begin{array}{l}1 \\[4pt]3\end{array}\right) \\[4pt] & \left(\begin{array}{ll}3 & -1 \\[4pt]9 & -3\end{array}\right)\left(\begin{array}{l}u_{1} \\[4pt]u_{2}\end{array}\right)=\left(\begin{array}{l}1 \\[4pt]3\end{array}\right) . \end{aligned} \nonumber

Expanding the matrix product, we obtain the system of equations

\begin{array}{r} 3 u_{1}-u_{2}=1 \\[4pt] 9 u_{1}-3 u_{2}=3 . \end{array} \nonumber

The solution of this system is \left(\begin{array}{l}u_{1} \\[4pt] u_{2}\end{array}\right)=\left(\begin{array}{l}1 \\[4pt] 2\end{array}\right).

General Solution: We construct the general solution as

\begin{aligned} \mathbf{y}(t) &=c_{1} e^{\lambda t} \mathbf{v}_{1}+c_{2} e^{\lambda t}\left(\mathbf{v}_{2}+t \mathbf{v}_{1}\right) \\[4pt] &=c_{1} e^{4 t}\left(\begin{array}{l} 1 \\[4pt] 3 \end{array}\right)+c_{2} e^{4 t}\left[\left(\begin{array}{l} 1 \\[4pt] 2 \end{array}\right)+t\left(\begin{array}{l} 1 \\[4pt] 3 \end{array}\right)\right] \\[4pt] &=e^{4 t}\left(\begin{array}{c} c_{1}+c_{2}(1+t) \\[4pt] 3 c_{1}+c_{2}(2+3 t) \end{array}\right) \end{aligned} \nonumber


Planar Systems - Summary

The reader should have noted by now that there is a connection between the behavior of the solutions obtained in Section 2.2 and the eigenvalues found from the coefficient matrices in the previous examples. Here we summarize some of these cases.

image

Table 2.1. List of typical behaviors in planar systems.

The connection, as we have seen, is that the characteristic equation for the associated second order differential equation is the same as the eigenvalue equation of the coefficient matrix for the linear system. However, one should be a little careful in cases in which the coefficient matrix is not diagonalizable. In Table 2.2 are three examples of systems with repeated roots. The reader should study these systems, noting the commonalities and differences in the systems and their solutions. In these cases one has unstable nodes, though they are degenerate in that there is only one accessible eigenvector.

Theory of Homogeneous Constant Coefficient Systems

There is a general theory for solving homogeneous, constant coefficient systems of first order differential equations. We begin by once again recalling the specific problem (2.12). We obtained the solution to this system as

\begin{gathered} x(t)=c_{1} e^{t}+c_{2} e^{-4 t}, \\[4pt] y(t)=\dfrac{1}{3} c_{1} e^{t}-\dfrac{1}{2} c_{2} e^{-4 t} \end{gathered} \nonumber

image

Table 2.2. Three examples of systems with a repeated root of \lambda=2.

This time we rewrite the solution as

\begin{aligned} \mathbf{x} &=\left(\begin{array}{c} c_{1} e^{t}+c_{2} e^{-4 t} \\[4pt] \dfrac{1}{3} c_{1} e^{t}-\dfrac{1}{2} c_{2} e^{-4 t} \end{array}\right) \\[4pt] &=\left(\begin{array}{cc} e^{t} & e^{-4 t} \\[4pt] \dfrac{1}{3} e^{t} & -\dfrac{1}{2} e^{-4 t} \end{array}\right)\left(\begin{array}{c} c_{1} \\[4pt] c_{2} \end{array}\right) \\[4pt] & \equiv \Phi(t) \mathbf{C} \end{aligned} \nonumber

Thus, we can write the general solution as a 2 \times 2 matrix \Phi times an arbitrary constant vector. The matrix \Phi consists of two columns that are linearly independent solutions of the original system. This matrix is an example of what we will define as the Fundamental Matrix of solutions of the system. So, determining the Fundamental Matrix will allow us to find the general solution of the system upon multiplication by a constant matrix. In fact, we will see that it will also lead to a simple representation of the solution of the initial value problem for our system. We will outline the general theory.

Consider the homogeneous, constant coefficient system of first order differential equations

\begin{aligned} \dfrac{d x_{1}}{d t} &=a_{11} x_{1}+a_{12} x_{2}+\ldots+a_{1 n} x_{n} \\[4pt] \dfrac{d x_{2}}{d t} &=a_{21} x_{1}+a_{22} x_{2}+\ldots+a_{2 n} x_{n} \\[4pt] & \vdots \\[4pt] \dfrac{d x_{n}}{d t} &=a_{n 1} x_{1}+a_{n 2} x_{2}+\ldots+a_{n n} x_{n} \end{aligned} \nonumber

As we have seen, this can be written in the matrix form \mathbf{x}^{\prime}=A \mathbf{x}, where

\mathbf{x}=\left(\begin{array}{c} x_{1} \\[4pt] x_{2} \\[4pt] \vdots \\[4pt] x_{n} \end{array}\right) \nonumber

and

A=\left(\begin{array}{cccc} a_{11} & a_{12} & \cdots & a_{1 n} \\[4pt] a_{21} & a_{22} & \cdots & a_{2 n} \\[4pt] \vdots & \vdots & \ddots & \vdots \\[4pt] a_{n 1} & a_{n 2} & \cdots & a_{n n} \end{array}\right) \nonumber

Now, consider m vector solutions of this system: \phi_{1}(t), \phi_{2}(t), \ldots \phi_{m}(t). These solutions are said to be linearly independent on some domain if

c_{1} \phi_{1}(t)+c_{2} \phi_{2}(t)+\ldots+c_{m} \phi_{m}(t)=0 \nonumber

for all t in the domain implies that c_{1}=c_{2}=\ldots=c_{m}=0.

Let \phi_{1}(t), \phi_{2}(t), \ldots, \phi_{n}(t) be a set of n linearly independent solutions of our system, called a fundamental set of solutions. We construct a matrix from these solutions, using them as the columns of that matrix. We define this matrix to be the fundamental matrix solution. This matrix takes the form

\Phi=\left(\begin{array}{lll} \phi_{1} & \ldots & \phi_{n} \end{array}\right)=\left(\begin{array}{cccc} \phi_{11} & \phi_{12} & \cdots & \phi_{1 n} \\[4pt] \phi_{21} & \phi_{22} & \cdots & \phi_{2 n} \\[4pt] \vdots & \vdots & \ddots & \vdots \\[4pt] \phi_{n 1} & \phi_{n 2} & \cdots & \phi_{n n} \end{array}\right) \nonumber

What do we mean by a "matrix" solution? We have assumed that each \phi_{k} is a solution of our system. Therefore, we have that \phi_{k}^{\prime}=A \phi_{k}, for k=1, \ldots, n. We say that \Phi is a matrix solution because we can show that \Phi also satisfies the matrix formulation of the system of differential equations. We can show this using the properties of matrices.

\begin{aligned} \dfrac{d}{d t} \Phi &=\left(\phi_{1}^{\prime} \ldots \phi_{n}^{\prime}\right) \\[4pt] &=\left(A \phi_{1} \ldots A \phi_{n}\right) \\[4pt] &=A\left(\phi_{1} \ldots \phi_{n}\right) \\[4pt] &=A \Phi \end{aligned} \nonumber

Given a set of vector solutions of the system, when are they linearly independent? We consider a matrix solution \Omega(t) of the system in which we have n vector solutions. Then, we define the Wronskian of \Omega(t) to be

W=\operatorname{det} \Omega(t) \nonumber

If W(t) \neq 0, then \Omega(t) is a fundamental matrix solution. Before continuing, we list the fundamental matrix solutions for the set of examples in the last section. (Refer to the solutions from those examples.) Furthermore, note that the fundamental matrix solutions are not unique as one can multiply any column by a nonzero constant and still have a fundamental matrix solution.

Example 2.14 A=\left(\begin{array}{ll}4 & 2 \\[4pt] 3 & 3\end{array}\right).

\Phi(t)=\left(\begin{array}{cc} 2 e^{t} & e^{6 t} \\[4pt] -3 e^{t} & e^{6 t} \end{array}\right) \nonumber

We should note in this case that the Wronskian is found as

\begin{aligned} W &=\operatorname{det} \Phi(t) \\[4pt] &=\left|\begin{array}{cc} 2 e^{t} & e^{6 t} \\[4pt] -3 e^{t} & e^{6 t} \end{array}\right| \\[4pt] &=5 e^{7 t} \neq 0 . \end{aligned} \nonumber
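A quick numerical check (a Python sketch; the sample time t=0.3 is an arbitrary choice) confirms this Wronskian.

import numpy as np

t = 0.3
Phi = np.array([[2 * np.exp(t),   np.exp(6 * t)],
                [-3 * np.exp(t),  np.exp(6 * t)]])
print(np.linalg.det(Phi), 5 * np.exp(7 * t))   # both equal 5 e^{7t}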

Example 2.15 A=\left(\begin{array}{ll}3 & -5 \\[4pt] 1 & -1\end{array}\right).

\Phi(t)=\left(\begin{array}{cc} e^{t}(2 \cos t-\sin t) & e^{t}(\cos t+2 \sin t) \\[4pt] e^{t} \cos t & e^{t} \sin t \end{array}\right) \nonumber

Example 2.16 A=\left(\begin{array}{cc}7 & -1 \\[4pt] 9 & 1\end{array}\right).

\Phi(t)=\left(\begin{array}{cc} e^{4 t} & e^{4 t}(1+t) \\[4pt] 3 e^{4 t} & e^{4 t}(2+3 t) \end{array}\right) \nonumber

So far we have only determined the general solution. This is done by the following steps:

Procedure for Determining the General Solution

  1. Solve the eigenvalue problem (A-\lambda I) \mathbf{v}=0.
  2. Construct vector solutions from \mathbf{v} e^{\lambda t}. The method depends on whether one has real or complex conjugate eigenvalues.
  3. Form the fundamental solution matrix \Phi(t) from the vector solutions.
  4. The general solution is given by \mathbf{x}(t)=\Phi(t) \mathbf{C} for \mathbf{C} an arbitrary constant vector.

We are now ready to solve the initial value problem:

\mathbf{x}^{\prime}=A \mathbf{x}, \quad \mathbf{x}\left(t_{0}\right)=\mathbf{x}_{0} . \nonumber

Starting with the general solution, we have that

\mathbf{x}_{0}=\mathbf{x}\left(t_{0}\right)=\Phi\left(t_{0}\right) \mathbf{C} . \nonumber

As usual, we need to solve for the c_{k} ’s. Using matrix methods, this is now easy. Since the Wronskian is not zero, then we can invert \Phi at any value of t. So, we have

\mathbf{C}=\Phi^{-1}\left(t_{0}\right) \mathbf{x}_{0} . \nonumber

Putting \mathbf{C} back into the general solution, we obtain the solution to the initial value problem:

\mathbf{x}(t)=\Phi(t) \Phi^{-1}\left(t_{0}\right) \mathbf{x}_{0} \nonumber

You can easily verify that this is a solution of the system and satisfies the initial condition at t=t_{0}.

The matrix combination \Phi(t) \Phi^{-1}\left(t_{0}\right) is useful. So, we will define the resulting product to be the principal matrix solution, denoting it by

\Psi(t)=\Phi(t) \Phi^{-1}\left(t_{0}\right) . \nonumber

Thus, the solution of the initial value problem is \mathbf{x}(t)=\Psi(t) \mathbf{x}_{0}. Furthermore, we note that \Psi(t) is a solution to the matrix initial value problem

\Psi^{\prime}=A \Psi, \quad \Psi\left(t_{0}\right)=I, \nonumber

where I is the n \times n identity matrix.

Matrix Solution of the Homogeneous Problem
In summary, the matrix solution of

\dfrac{d \mathbf{x}}{d t}=A \mathbf{x}, \quad \mathbf{x}\left(t_{0}\right)=\mathbf{x}_{0}, \nonumber

is given by

\mathbf{x}(t)=\Psi(t) \mathbf{x}_{0}=\Phi(t) \Phi^{-1}\left(t_{0}\right) \mathbf{x}_{0} . \nonumber

Example 2.17. Let’s consider the matrix initial value problem

\begin{aligned} &x^{\prime}=5 x+3 y \\[4pt] &y^{\prime}=-6 x-4 y \end{aligned} \nonumber

satisfying x(0)=1, y(0)=2. Find the solution of this problem.

We first note that the coefficient matrix is

A=\left(\begin{array}{cc} 5 & 3 \\[4pt] -6 & -4 \end{array}\right) \nonumber

The eigenvalue equation is easily found from

\begin{aligned} 0 &=-(5-\lambda)(4+\lambda)+18 \\[4pt] &=\lambda^{2}-\lambda-2 \\[4pt] &=(\lambda-2)(\lambda+1) \end{aligned} \nonumber

So, the eigenvalues are \lambda=-1,2. The corresponding eigenvectors are found to be

\mathbf{v}_{1}=\left(\begin{array}{c} 1 \\[4pt] -2 \end{array}\right), \quad \mathbf{v}_{2}=\left(\begin{array}{c} 1 \\[4pt] -1 \end{array}\right) \nonumber

Now we construct the fundamental matrix solution. The columns are obtained using the eigenvectors and the exponentials, e^{\lambda t} :

\phi_{1}(t)=\left(\begin{array}{c} 1 \\[4pt] -2 \end{array}\right) e^{-t}, \quad \phi_{2}(t)=\left(\begin{array}{c} 1 \\[4pt] -1 \end{array}\right) e^{2 t} \nonumber

So, the fundamental matrix solution is

\Phi(t)=\left(\begin{array}{cc} e^{-t} & e^{2 t} \\[4pt] -2 e^{-t} & -e^{2 t} \end{array}\right) \nonumber

The general solution to our problem is then

\mathbf{x}(t)=\left(\begin{array}{cc} e^{-t} & e^{2 t} \\[4pt] -2 e^{-t} & -e^{2 t} \end{array}\right) \mathbf{C} \nonumber

for \mathbf{C} an arbitrary constant vector.

In order to find the particular solution of the initial value problem, we need the principal matrix solution. We first evaluate \Phi(0), then we invert it:

\Phi(0)=\left(\begin{array}{cc} 1 & 1 \\[4pt] -2 & -1 \end{array}\right) \quad \Rightarrow \quad \Phi^{-1}(0)=\left(\begin{array}{cc} -1 & -1 \\[4pt] 2 & 1 \end{array}\right) \nonumber

The particular solution is then

\begin{aligned} \mathbf{x}(t) &=\left(\begin{array}{cc}e^{-t} & e^{2 t} \\[4pt]-2 e^{-t} & -e^{2 t}\end{array}\right)\left(\begin{array}{cc}-1 & -1 \\[4pt]2 & 1\end{array}\right)\left(\begin{array}{l}1 \\[4pt]2\end{array}\right) \\[4pt] &=\left(\begin{array}{cc}e^{-t} & e^{2 t} \\[4pt]-2 e^{-t} & -e^{2 t}\end{array}\right)\left(\begin{array}{c}-3 \\[4pt]4\end{array}\right) \\[4pt] &=\left(\begin{array}{c}-3 e^{-t}+4 e^{2 t} \\[4pt]6 e^{-t}-4 e^{2 t}\end{array}\right) . \end{aligned} \nonumber

Thus, x(t)=-3 e^{-t}+4 e^{2 t} and y(t)=6 e^{-t}-4 e^{2 t}.
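Since A is constant and t_{0}=0 here, the principal matrix solution is the matrix exponential e^{A t}, so \mathbf{x}(t)=e^{A t} \mathbf{x}_{0}. A short Python check (using scipy's expm; t=1 is an arbitrary sample time):

import numpy as np
from scipy.linalg import expm

A = np.array([[5.0, 3.0],
              [-6.0, -4.0]])
x0 = np.array([1.0, 2.0])
t = 1.0

print(np.round(expm(A * t) @ x0, 6))
print(np.round([-3 * np.exp(-t) + 4 * np.exp(2 * t),
                 6 * np.exp(-t) - 4 * np.exp(2 * t)], 6))   # same values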

Nonhomogeneous Systems

Before leaving the theory of systems of linear, constant coefficient systems, we will discuss nonhomogeneous systems. We would like to solve systems of the form


\mathbf{x}^{\prime}=A(t) \mathbf{x}+\mathbf{f}(t) \nonumber

We will assume that we have found the fundamental matrix solution of the homogeneous equation. Furthermore, we will assume that A(t) and \mathbf{f}(t) are continuous on some common domain.

As with second order equations, we can look for solutions that are a sum of the general solution to the homogeneous problem plus a particular solution of the nonhomogeneous problem. Namely, we can write the general solution as

\mathbf{x}(t)=\Phi(t) \mathbf{C}+\mathbf{x}_{p}(t), \nonumber

where \mathbf{C} is an arbitrary constant vector, \Phi(t) is the fundamental matrix solution of \mathbf{x}^{\prime}=A(t) \mathbf{x}, and

\mathbf{x}_{p}^{\prime}=A(t) \mathbf{x}_{p}+\mathbf{f}(t) . \nonumber

Such a representation is easily verified.

We need to find the particular solution, \mathbf{x}_{p}(t). We can do this by applying The Method of Variation of Parameters for Systems. We consider a solution in the form of the solution of the homogeneous problem, but replace the constant vector by unknown parameter functions. Namely, we assume that

\mathbf{x}_{p}(t)=\Phi(t) \mathbf{c}(t) . \nonumber

Differentiating, we have that

\mathbf{x}_{p}^{\prime}=\Phi^{\prime} \mathbf{c}+\Phi \mathbf{c}^{\prime}=A \Phi \mathbf{c}+\Phi \mathbf{c}^{\prime} \nonumber

\mathbf{x}_{p}^{\prime}-A \mathbf{x}_{p}=\Phi \mathbf{c}^{\prime} . \nonumber

But the left side is \mathbf{f}. So, we have that,

\Phi \mathbf{c}^{\prime}=\mathbf{f} \nonumber

or, since \Phi is invertible (why?),

\mathbf{c}^{\prime}=\Phi^{-1} \mathbf{f} \nonumber

In principle, this can be integrated to give c. Therefore, the particular solution can be written as

\mathbf{x}_{p}(t)=\Phi(t) \int^{t} \Phi^{-1}(s) \mathbf{f}(s) d s . \nonumber

This is the variation of parameters formula.

The general solution of Equation (2.70) has been found as

\mathbf{x}(t)=\Phi(t) \mathbf{C}+\Phi(t) \int^{t} \Phi^{-1}(s) \mathbf{f}(s) d s . \nonumber

We can use the general solution to find the particular solution of an initial value problem consisting of Equation (2.70) and the initial condition \mathbf{x}\left(t_{0}\right)= \mathbf{x}_{0}. This condition is satisfied for a solution of the form

\mathbf{x}(t)=\Phi(t) \mathbf{C}+\Phi(t) \int_{t_{0}}^{t} \Phi^{-1}(s) \mathbf{f}(s) d s \nonumber

provided

\mathbf{x}_{0}=\mathbf{x}\left(t_{0}\right)=\Phi\left(t_{0}\right) \mathbf{C} . \nonumber

This can be solved for \mathbf{C} as in the last section. Inserting the solution back into the general solution (2.73), we have

\mathbf{x}(t)=\Phi(t) \Phi^{-1}\left(t_{0}\right) \mathbf{x}_{0}+\Phi(t) \int_{t_{0}}^{t} \Phi^{-1}(s) \mathbf{f}(s) d s \nonumber

This solution can be written a little neater in terms of the principal matrix solution, \Psi(t)=\Phi(t) \Phi^{-1}\left(t_{0}\right) :

\mathbf{x}(t)=\Psi(t) \mathbf{x}_{0}+\Psi(t) \int_{t_{0}}^{t} \Psi^{-1}(s) \mathbf{f}(s) d s \nonumber

Finally, one further simplification occurs when A is a constant matrix, which is the only type of problem we have solved in this chapter. In this case (taking t_{0}=0), we have that \Psi^{-1}(t)=\Psi(-t). So, computing \Psi^{-1}(t) is relatively easy.

Example 2.18. x^{\prime \prime}+x=2 \cos t, x(0)=4, x^{\prime}(0)=0. This example can be solved using the Method of Undetermined Coefficients. However, we will use the matrix method described in this section.

First, we write the problem in matrix form. The system can be written as

\begin{gathered} x^{\prime}=y \\[4pt] y^{\prime}=-x+2 \cos t \end{gathered} \nonumber

Thus, we have a nonhomogeneous system of the form

\mathbf{x}^{\prime}=A \mathbf{x}+\mathbf{f}=\left(\begin{array}{cc} 0 & 1 \\[4pt] -1 & 0 \end{array}\right)\left(\begin{array}{l} x \\[4pt] y \end{array}\right)+\left(\begin{array}{c} 0 \\[4pt] 2 \cos t \end{array}\right) \nonumber

Next we need the fundamental matrix of solutions of the homogeneous problem. We have that

A=\left(\begin{array}{cc} 0 & 1 \\[4pt] -1 & 0 \end{array}\right) \text {. } \nonumber

The eigenvalues of this matrix are \lambda=\pm i. An eigenvector associated with \lambda=i is easily found as \left(\begin{array}{l}1 \\[4pt] i\end{array}\right). This leads to a complex solution

\left(\begin{array}{l} 1 \\[4pt] i \end{array}\right) e^{i t}=\left(\begin{array}{c} \cos t+i \sin t \\[4pt] i \cos t-\sin t \end{array}\right) . \nonumber

From this solution we can construct the fundamental solution matrix

\Phi(t)=\left(\begin{array}{cc} \cos t & \sin t \\[4pt] -\sin t & \cos t \end{array}\right) \nonumber

So, the general solution to the homogeneous problem is

\mathbf{x}_{h}=\Phi(t) \mathbf{C}=\left(\begin{array}{c} c_{1} \cos t+c_{2} \sin t \\[4pt] -c_{1} \sin t+c_{2} \cos t \end{array}\right) \nonumber

Next we seek a particular solution to the nonhomogeneous problem. From Equation (2.73) we see that we need \Phi^{-1}(s) \mathbf{f}(s). Thus, we have

\begin{aligned} \Phi^{-1}(s) \mathbf{f}(s) &=\left(\begin{array}{cc} \cos s & -\sin s \\[4pt] \sin s & \cos s \end{array}\right)\left(\begin{array}{c} 0 \\[4pt] 2 \cos s \end{array}\right) \\[4pt] &=\left(\begin{array}{c} -2 \sin s \cos s \\[4pt] 2 \cos ^{2} s \end{array}\right) \end{aligned} \nonumber

We now compute

\begin{aligned} \Phi(t) \int_{0}^{t} \Phi^{-1}(s) \mathbf{f}(s) d s &=\left(\begin{array}{cc}\cos t & \sin t \\[4pt]-\sin t & \cos t\end{array}\right)\left(\begin{array}{c}-\sin ^{2} t \\[4pt]t+\dfrac{1}{2} \sin (2 t)\end{array}\right) \\[4pt] &=\left(\begin{array}{c}t \sin t \\[4pt]\sin t+t \cos t\end{array}\right) . \end{aligned} \nonumber

Therefore, the general solution is

\mathbf{x}=\left(\begin{array}{c} c_{1} \cos t+c_{2} \sin t \\[4pt] -c_{1} \sin t+c_{2} \cos t \end{array}\right)+\left(\begin{array}{c} t \sin t \\[4pt] \sin t+t \cos t \end{array}\right) \nonumber

The solution to the initial value problem is

\mathbf{x}=\left(\begin{array}{cc} \cos t & \sin t \\[4pt] -\sin t & \cos t \end{array}\right)\left(\begin{array}{l} 4 \\[4pt] 0 \end{array}\right)+\left(\begin{array}{c} t \sin t \\[4pt] \sin t+t \cos t \end{array}\right) \nonumber

\mathbf{x}=\left(\begin{array}{c} 4 \cos t+t \sin t \\[4pt] -3 \sin t+t \cos t \end{array}\right) \nonumber
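As a check, one can integrate the nonhomogeneous system directly. The Python sketch below (using scipy; the sample times are arbitrary) compares the numerical solution with x(t)=4 \cos t+t \sin t.

import numpy as np
from scipy.integrate import solve_ivp

# x' = y, y' = -x + 2 cos t, with x(0) = 4, y(0) = 0
def rhs(t, u):
    x, y = u
    return [y, -x + 2 * np.cos(t)]

ts = np.linspace(0.0, 6.0, 4)
sol = solve_ivp(rhs, (0.0, 6.0), [4.0, 0.0], t_eval=ts, rtol=1e-9, atol=1e-12)
print(np.round(sol.y[0], 5))
print(np.round(4 * np.cos(ts) + ts * np.sin(ts), 5))   # matches the x-component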

2.9 Applications

In this section we will describe several applications leading to systems of differential equations. In keeping with common practice in areas like physics, we will denote differentiation with respect to time as


\dot{x}=\dfrac{d x}{d t} \nonumber

We will look mostly at linear models and later modify some of these models to include nonlinear terms.

Spring-Mass Systems

There are many problems in physics that result in systems of equations. This is because the most basic law of physics is given by Newton’s Second Law, which states that if a body experiences a net force, it will accelerate. In particular, the net force is proportional to the acceleration with a proportionality constant called the mass, m. This is summarized as

\sum \mathbf{F}=m \mathbf{a} \nonumber

Since \mathbf{a}=\ddot{\mathbf{x}}, Newton’s Second Law is mathematically a system of second order differential equations for three dimensional problems, or one second order differential equation for one dimensional problems. If there are several masses, then we would naturally end up with systems no matter how many dimensions are involved.

A standard problem encountered in a first course in differential equations is that of a single block on a spring as shown in Figure 2.18. The net force in this case is the restoring force of the spring given by Hooke’s Law,

F_{s}=-k x \nonumber

where k>0 is the spring constant. Here x is the elongation of the spring, or the displacement of the block from equilibrium. When x is positive, the spring force is negative and when x is negative the spring force is positive. We have depicted a horizontal system sitting on a frictionless surface.

A similar model can be provided for vertically oriented springs. Place the block on a vertically hanging spring. It comes to equilibrium, stretching the spring by \ell_{0}. Newton’s Second Law gives

-m g+k \ell_{0}=0 . \nonumber

Now, pulling the mass further by x_{0}, and releasing it, the mass begins to oscillate. Letting x be the displacement from the new equilibrium, Newton’s Second Law now gives m \ddot{x}=-m g+k\left(\ell_{0}-x\right)=-k x.

In both examples (a horizontally or vertically oscillating mass) Newton’s Second Law of motion results in the differential equation

m \ddot{x}+k x=0 . \nonumber

This is the equation for simple harmonic motion which we have already encountered in Chapter 1 .

image
Figure 2.18. Spring-Mass system.

This second order equation can be written as a system of two first order equations.

\begin{aligned} &x^{\prime}=y \\[4pt] &y^{\prime}=-\dfrac{k}{m} x \end{aligned} \nonumber

The coefficient matrix for this system is

A=\left(\begin{array}{cc} 0 & 1 \\[4pt] -\omega^{2} & 0 \end{array}\right) \nonumber

where \omega^{2}=\dfrac{k}{m}. The eigenvalues of this system are \lambda=\pm i \omega and the solutions are simple sines and cosines,

\begin{aligned} &x(t)=c_{1} \cos \omega t+c_{2} \sin \omega t, \\[4pt] &y(t)=\omega\left(-c_{1} \sin \omega t+c_{2} \cos \omega t\right) . \end{aligned} \nonumber

We further note that \omega is called the angular frequency of oscillation and is given in \mathrm{rad} / \mathrm{s}. The frequency of oscillation is

f=\dfrac{\omega}{2 \pi} \nonumber

It typically has units of \mathrm{s}^{-1}, cps, or Hz. The multiplicative inverse has units of time and is called the period,

T=\dfrac{1}{f} . \nonumber

Thus, the period of oscillation for a mass m on a spring with spring constant k is given by

T=2 \pi \sqrt{\dfrac{m}{k}} \nonumber

Of course, we did not need to convert the last problem into a system. In fact, we had seen this equation in Chapter 1 . However, when one considers

image
Figure 2.19. Spring-Mass system for two masses and two springs.

more complicated spring-mass systems, systems of differential equations occur naturally. Consider two blocks attached with two springs as shown in Figure 2.19. In this case we apply Newton’s second law for each block.

First, consider the forces acting on the first block. The first spring is stretched by x_{1}. This gives a force of F_{1}=-k_{1} x_{1}. The second spring may also exert a force on the block, depending on whether it is stretched or not. If both blocks are displaced by the same amount, then the second spring is not stretched at all. So, the amount by which the second spring is stretched depends on the relative displacement of the two masses. This results in a second force of F_{2}=k_{2}\left(x_{2}-x_{1}\right).

There is only one spring connected to mass two. Again the force depends on the relative displacement of the masses. It is just oppositely directed to the force which mass one feels from this spring.

Combining these forces and using Newton’s Second Law for both masses, we have the system of second order differential equations

\begin{aligned} &m_{1} \ddot{x}_{1}=-k_{1} x_{1}+k_{2}\left(x_{2}-x_{1}\right) \\[4pt] &m_{2} \ddot{x}_{2}=-k_{2}\left(x_{2}-x_{1}\right) \end{aligned} \nonumber

One can rewrite this system of two second order equations as a system of four first order equations. This is done by introducing two new variables x_{3}=\dot{x}_{1} and x_{4}=\dot{x}_{2}. Note that these physically are the velocities of the two blocks.

The resulting system of first order equations is given as

\begin{aligned} \dot{x}_{1} &=x_{3} \\[4pt] \dot{x}_{2} &=x_{4} \\[4pt] \dot{x}_{3} &=-\dfrac{k_{1}}{m_{1}} x_{1}+\dfrac{k_{2}}{m_{1}}\left(x_{2}-x_{1}\right) \\[4pt] \dot{x}_{4} &=-\dfrac{k_{2}}{m_{2}}\left(x_{2}-x_{1}\right) \end{aligned} \nonumber

We can write our new system in matrix form as

\left(\begin{array}{c} \dot{x}_{1} \\[4pt] \dot{x}_{2} \\[4pt] \dot{x}_{3} \\[4pt] \dot{x}_{4} \end{array}\right)=\left(\begin{array}{cccc} 0 & 0 & 1 & 0 \\[4pt] 0 & 0 & 0 & 1 \\[4pt] -\dfrac{k_{1}+k_{2}}{m_{1}} & \dfrac{k_{2}}{m_{1}} & 0 & 0 \\[4pt] \dfrac{k_{2}}{m_{2}} & -\dfrac{k_{2}}{m_{2}} & 0 & 0 \end{array}\right)\left(\begin{array}{l} x_{1} \\[4pt] x_{2} \\[4pt] x_{3} \\[4pt] x_{4} \end{array}\right) \nonumber
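The eigenvalues of this 4 \times 4 coefficient matrix are purely imaginary pairs \pm i \omega, and the \omega's are the normal-mode frequencies of the coupled system. The Python sketch below builds the matrix for sample values m_{1}=m_{2}=1 and k_{1}=k_{2}=1 (arbitrary illustrative choices, not from the text) and computes them.

import numpy as np

m1 = m2 = 1.0        # illustrative values
k1 = k2 = 1.0

A = np.array([[0.0, 0.0, 1.0, 0.0],
              [0.0, 0.0, 0.0, 1.0],
              [-(k1 + k2) / m1, k2 / m1, 0.0, 0.0],
              [k2 / m2, -k2 / m2, 0.0, 0.0]])

evals = np.linalg.eig(A)[0]
print(np.round(evals, 4))                         # pairs +- i*omega
print(np.round(np.sort(np.abs(evals.imag)), 4))   # each mode frequency appears twice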

Electrical Circuits

Another problem often encountered in a first year physics class is that of an LRC series circuit. This circuit is pictured in Figure 2.20. The resistor is a circuit element satisfying Ohm’s Law. The capacitor is a device that stores electrical energy and an inductor, or coil, stores magnetic energy.

The physics for this problem stems from Kirchhoff’s Rules for circuits. Since there is only one loop, we will only need Kirchhoff’s Loop Rule. Namely, the sum of the drops in electric potential is set equal to the rises in electric potential. The potential drops across each circuit element are given by

  1. Resistor: V_{R}=I R.
  2. Capacitor: V_{C}=\dfrac{q}{C}.
  3. Inductor: V_{L}=L \dfrac{d I}{d t}.
image
Figure 2.20. Series LRC Circuit.

Adding these potential drops and setting the sum equal to the voltage supplied by the voltage source, V(t), we obtain

I R+\dfrac{q}{C}+L \dfrac{d I}{d t}=V(t) \nonumber

Furthermore, we recall that the current is defined as I=\dfrac{d q}{d t}, where q is the charge in the circuit. Since both q and I are unknown, we can replace the current by its expression in terms of the charge to obtain

L \ddot{q}+R \dot{q}+\dfrac{1}{C} q=V(t) . \nonumber

This is a second order differential equation for q(t). One can set up a system of equations and proceed to solve them. However, this is a constant coefficient differential equation and can also be solved using the methods in Chapter 1.

In the next examples we will look at special cases that arise for the series LRC circuit equation. These include R C circuits, solvable by first order methods and L C circuits, leading to oscillatory behavior.

Example 2.19. RC Circuits

We first consider the case of an RC circuit in which there is no inductor. Also, we will consider what happens when one charges a capacitor with a DC battery \left(V(t)=V_{0}\right) and when one discharges a charged capacitor (V(t)=0).

For charging a capacitor, we have the initial value problem

R \dfrac{d q}{d t}+\dfrac{q}{C}=V_{0}, \quad q(0)=0 \nonumber

This equation is an example of a linear first order equation for q(t). However, we can also rewrite this equation and solve it as a separable equation, since V_{0} is a constant. We will do the former only as another example of finding the integrating factor.

We first write the equation in standard form:

\dfrac{d q}{d t}+\dfrac{q}{R C}=\dfrac{V_{0}}{R} . \nonumber

The integrating factor is then

\mu(t)=e^{\int \dfrac{d t}{R C}}=e^{t / R C} . \nonumber

Thus,

\dfrac{d}{d t}\left(q e^{t / R C}\right)=\dfrac{V_{0}}{R} e^{t / R C} \nonumber

Integrating, we have

q e^{t / R C}=\dfrac{V_{0}}{R} \int e^{t / R C} d t=C V_{0} e^{t / R C}+K \nonumber

Note that we introduced the integration constant, K. Now divide out the exponential to get the general solution:

q=C V_{0}+K e^{-t / R C} \nonumber

(If we had forgotten the K, we would not have gotten a correct solution for the differential equation.)

Next, we use the initial condition to get our particular solution. Namely, setting t=0, we have that

0=q(0)=C V_{0}+K \nonumber

So, K=-C V_{0}. Inserting this into our solution, we have

q(t)=C V_{0}\left(1-e^{-t / R C}\right) \nonumber

Now we can study the behavior of this solution. For large times the second term goes to zero. Thus, the capacitor charges up, asymptotically, to the final value of q_{0}=C V_{0}. This is what we expect: once the current stops flowing, q_{0}=C V_{0} is just the relation between the charge established on the plates and the potential difference across them.


image
Figure 2.21. The charge as a function of time for a charging capacitor with R=2.00 \mathrm{k} \Omega, C=6.00 \mathrm{mF}, and V_{0}=12 \mathrm{~V}

Let’s put in some values for the parameters. We let R=2.00 \mathrm{k} \Omega, C=6.00 \mathrm{mF}, and V_{0}=12 \mathrm{~V}. A plot of the solution is given in Figure 2.21. We see that the charge builds up to the value of C V_{0}=72 \mathrm{mC}. If we use a smaller resistance, R=200 \Omega, we see in Figure 2.22 that the capacitor charges to the same value, but much faster.

The rate at which a capacitor charges, or discharges, is governed by the time constant, \tau=R C. This is the constant factor in the exponential. The larger it is, the slower the exponential term decays. If we set t=\tau, we find that

q(\tau)=C V_{0}\left(1-e^{-1}\right)=(1-0.3678794412 \ldots) q_{0} \approx 0.63 q_{0} \nonumber

Thus, at time t=\tau, the capacitor has almost charged to two thirds of its final value. For the first set of parameters, \tau=12 \mathrm{~s}. For the second set, \tau=1.2 \mathrm{~s}.
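
The numbers quoted here are easy to reproduce. The short sketch below evaluates the time constant and q(\tau) for the two parameter sets used in Figures 2.21 and 2.22.

```python
# Sketch: time constants and q(tau) for the two charging examples in the text.
import numpy as np

V0 = 12.0          # volts
C = 6.00e-3        # farads (6.00 mF)
q0 = C * V0        # final charge, 72 mC

for R in (2.00e3, 200.0):                  # ohms
    tau = R * C
    q_tau = q0 * (1 - np.exp(-1))          # charge at t = tau
    print(f"R = {R:6.0f} ohm: tau = {tau:4.1f} s, "
          f"q(tau) = {q_tau*1e3:.1f} mC of {q0*1e3:.0f} mC")
```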


image
Figure 2.22. The charge as a function of time for a charging capacitor with R=200 \Omega, C=6.00 \mathrm{mF}, and V_{0}=12 \mathrm{~V}.

Now, let’s assume the capacitor is charged with charge \pm q_{0} on its plates. If we disconnect the battery and reconnect the wires to complete the circuit, the charge will then move off the plates, discharging the capacitor. The relevant form of our initial value problem becomes

R \dfrac{d q}{d t}+\dfrac{q}{C}=0, \quad q(0)=q_{0} \nonumber

This equation is simpler to solve. Rearranging, we have

\dfrac{d q}{d t}=-\dfrac{q}{R C} \nonumber

This is a simple exponential decay problem, which you can solve using separation of variables. However, by now you should know how to immediately write down the solution to such problems of the form y^{\prime}=k y. The solution is

q(t)=q_{0} e^{-t / \tau}, \quad \tau=R C . \nonumber

We see that the charge decays exponentially. In principle, the capacitor never fully discharges. That is why you are often instructed to place a shunt across a discharged capacitor to fully discharge it.

In Figure 2.23 we show the discharging of our two previous RC circuits. Once again, \tau=R C determines the behavior. At t=\tau we have

q(\tau)=q_{0} e^{-1}=(0.3678794412 \ldots) q_{0} \approx 0.37 q_{0} . \nonumber

So, at this time the capacitor only holds about a third of its original charge.


image


Figure 2.23. The charge as a function of time for a discharging capacitor with R=2.00 \mathrm{k} \Omega or R=200 \Omega, and C=6.00 \mathrm{mF}, and q_{0}=72 \mathrm{mC}.

Example 2.20. LC Circuits

Another simple result comes from studying L C circuits. We will now connect a charged capacitor to an inductor. In this case, we consider the initial value problem

L \ddot{q}+\dfrac{1}{C} q=0, \quad q(0)=q_{0}, \dot{q}(0)=I(0)=0 \nonumber

Dividing out the inductance, we have

\ddot{q}+\dfrac{1}{L C} q=0 . \nonumber

This equation is a second order, constant coefficient equation. It is of the same form as the one we saw earlier for simple harmonic motion of a mass on a spring. So, we expect oscillatory behavior. The characteristic equation is

r^{2}+\dfrac{1}{L C}=0 . \nonumber

The solutions are

r_{1,2}=\pm \dfrac{i}{\sqrt{L C}} . \nonumber

Thus, the solution of (2.96) is of the form

q(t)=c_{1} \cos (\omega t)+c_{2} \sin (\omega t), \quad \omega=(L C)^{-1 / 2} \nonumber

Inserting the initial conditions yields

q(t)=q_{0} \cos (\omega t) . \nonumber

The oscillations that result are understandable. As the charge leaves the plates, the changing current induces a changing magnetic field in the inductor. The stored electrical energy in the capacitor changes to stored magnetic energy in the inductor. The process continues until the plates are charged with the opposite polarity, and then it begins in reverse. The capacitor discharges again, eventually returns to its original state, and the whole cycle repeats over and over.

The frequency of this simple harmonic motion is easily found. It is given by

f=\dfrac{\omega}{2 \pi}=\dfrac{1}{2 \pi} \dfrac{1}{\sqrt{L C}} . \nonumber

This is called the tuning frequency because of its role in tuning circuits.

Of course, this is an ideal situation. There is always resistance in the circuit, even if only a small amount from the wires. So, we really need to account for resistance, or even add a resistor. This leads to a slightly more complicated system in which damping will be present. More complicated circuits are possible by looking at parallel connections, or other combinations, of resistors, capacitors and inductors. This will result in several equations for each loop in the circuit, leading to larger systems of differential equations. An example of another circuit setup is shown in Figure 2.24. This is not a problem that can be covered in the first year physics course.

image
Figure 2.24. A circuit with two loops containing several different circuit elements.
image
Figure 2.25. The previous parallel circuit with the directions indicated for traversing the loops in Kirchhoff’s Laws.

We have three unknown functions for the charge. Once we know the charge functions, differentiation will yield the currents. However, we only have two equations. We need a third equation. This is found from Kirchhoff’s Point (Junction) Rule. Consider the points A and B in Figure 2.25. Any charge (current) entering these junctions must be the same as the total charge (current) leaving the junctions. For point A we have

I_{1}=I_{2}+I_{3}, \nonumber

\dot{q}_{1}=\dot{q}_{2}+\dot{q}_{3} . \nonumber

Equations (2.100), (2.101), and (2.103) form a coupled system of differential equations for this problem. There are both first and second order derivatives involved. We can write the whole system in terms of charges as

\begin{array}{r} R_{1} \dot{q}_{1}+\dfrac{q_{2}}{C}=V(t) \\[4pt] R_{2} \dot{q}_{3}+L \ddot{q}_{3}=\dfrac{q_{2}}{C} \\[4pt] \dot{q}_{1}=\dot{q}_{2}+\dot{q}_{3} \end{array} \nonumber

The question is whether, or not, we can write this as a system of first order differential equations. Since there is only one second order derivative, we can introduce the new variable q_{4}=\dot{q}_{3}. The first equation can be solved for \dot{q}_{1}. The third equation can be solved for \dot{q}_{2} with appropriate substitutions for the other terms. \dot{q}_{3} is obtained from the definition of q_{4}, and the second equation can be solved for \ddot{q}_{3} and substitutions made to obtain the system

\begin{aligned} \dot{q}_{1} &=\dfrac{V}{R_{1}}-\dfrac{q_{2}}{R_{1} C} \\[4pt] \dot{q}_{2} &=\dfrac{V}{R_{1}}-\dfrac{q_{2}}{R_{1} C}-q_{4} \\[4pt] \dot{q}_{3} &=q_{4} \\[4pt] \dot{q}_{4} &=\dfrac{q_{2}}{L C}-\dfrac{R_{2}}{L} q_{4} \end{aligned} \nonumber

So, we have a nonhomogeneous first order system of differential equations. In the last section we learned how to solve such systems.
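
Before moving on, it is worth noting that this system is also easy to handle numerically. The sketch below integrates the four first order equations for an assumed constant source V and assumed element values; none of these numbers come from the text.

```python
# Sketch: integrate the two-loop circuit system derived above.
# All element values and the constant source voltage are assumed for illustration.
import numpy as np
from scipy.integrate import solve_ivp

R1, R2 = 100.0, 100.0      # ohms (assumed)
L, C = 1.0, 1.0e-4         # henries, farads (assumed)
V = 12.0                   # volts, constant source (assumed)

def rhs(t, q):
    q1, q2, q3, q4 = q
    dq1 = V / R1 - q2 / (R1 * C)
    dq2 = V / R1 - q2 / (R1 * C) - q4
    dq3 = q4
    dq4 = q2 / (L * C) - (R2 / L) * q4
    return [dq1, dq2, dq3, dq4]

sol = solve_ivp(rhs, (0.0, 0.1), [0.0, 0.0, 0.0, 0.0], max_step=1e-4)
print(sol.y[:, -1])        # q1, q2, q3 and the current q4 = dq3/dt at t = 0.1 s
```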

Love Affairs

The next application is one that has been studied by several authors as a cute system involving relationships. One considers what happens to the affections that two people have for each other over time. Let R denote the affection that Romeo has for Juliet and J be the affection that Juliet has for Romeo. Positive values indicate love and negative values indicate dislike.

One possible model is given by

\begin{aligned} &\dfrac{d R}{d t}=b J \\[4pt] &\dfrac{d J}{d t}=c R \end{aligned} \nonumber

with b>0 and c<0. In this case Romeo loves Juliet the more she likes him. But Juliet backs away when she finds his love for her increasing.

A typical system relating the combined changes in affection can be modeled as

\begin{aligned} &\dfrac{d R}{d t}=a R+b J \\[4pt] &\dfrac{d J}{d t}=c R+d J \end{aligned} \nonumber

Several scenarios are possible for various choices of the constants. For example, if a>0 and b>0, Romeo gets more and more excited by Juliet’s love for him. If c>0 and d<0, Juliet is being cautious about her relationship with Romeo. For specific values of the parameters and initial conditions, one can explore this match of an overly zealous lover with a cautious lover.
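
One convenient way to explore such scenarios is to look at the eigenvalues of the coefficient matrix, since they determine the type of fixed point at the origin. The sketch below uses one illustrative choice of the constants; the parameter values are assumptions, not ones prescribed by the text.

```python
# Sketch: classify the affection system R' = aR + bJ, J' = cR + dJ by eigenvalues.
# The parameter values are assumed, modeling an eager Romeo and a cautious Juliet.
import numpy as np

a, b = 1.0, 1.0       # Romeo responds to his own feelings and to Juliet's (assumed)
c, d = 1.0, -1.0      # Juliet responds to Romeo but damps her own feelings (assumed)

M = np.array([[a, b], [c, d]])
print(np.linalg.eigvals(M))
# Real eigenvalues of opposite sign give a saddle; complex eigenvalues give a
# spiral (or a center if the real part vanishes).
```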

Predator Prey Models

Another common model studied is that of competing species. For example, we could consider a population of rabbits and foxes. Left to themselves, rabbits would tend to multiply, thus

\dfrac{d R}{d t}=a R, \nonumber

with a>0. In such a model the rabbit population would grow exponentially. Similarly, a population of foxes would decay without the rabbits to feed on. So, we have that

\dfrac{d F}{d t}=-b F \nonumber

for b>0.

Now, if we put these populations together on a deserted island, they would interact. The more foxes there are, the more the rabbit population decreases. However, the more rabbits there are, the more food the foxes have, and their population thrives. Thus, we could model the competing populations as

\begin{gathered} \dfrac{d R}{d t}=a R-c F, \\[4pt] \dfrac{d F}{d t}=-b F+d R, \end{gathered} \nonumber

where all of the constants are positive numbers. Studying this coupled system would lead to a study of the dynamics of these populations. We will discuss other (nonlinear) systems in the next chapter.

Mixture Problems

There are many types of mixture problems. Such problems are standard in a first course on differential equations as examples of first order differential equations. Typically these examples consist of a tank of brine, that is, water containing a specific amount of salt, with pure water entering and the mixture leaving, or the flow of a pollutant into, or out of, a lake.

In general one has a rate of flow of some concentration of mixture entering a region and a mixture leaving the region. The goal is to determine how much stuff is in the region at a given time. This is governed by the equation

\text { Rate of change of substance }=\text { Rate In }-\text { Rate Out. } \nonumber

This can be generalized to the case of two interconnected tanks. We provide some examples.

Example 2.21. Single Tank Problem

A 50 gallon tank of pure water has a brine mixture with concentration of 2 pounds per gallon entering at the rate of 5 gallons per minute. [See Figure 2.26.] At the same time the well-mixed contents drain out at the rate of 5 gallons per minute. Find the amount of salt in the tank at time t. In all such problems one assumes that the solution is well mixed at each instant of time.

image
Figure 2.26. A typical mixing problem.

Let x(t) be the amount of salt at time t. Then the rate at which the salt in the tank increases is due to the amount of salt entering the tank less that leaving the tank. To figure out these rates, one notes that d x / d t has units of pounds per minute. The amount of salt entering per minute is given by the product of the entering concentration times the rate at which the brine enters. This gives the correct units:

\left(2 \dfrac{\text { pounds }}{\text { gal }}\right)\left(5 \dfrac{\text { gal }}{\text { min }}\right)=10 \dfrac{\text { pounds }}{\text { min }} . \nonumber

Similarly, one can determine the rate out as

\left(\dfrac{x \text { pounds }}{50 \text { gal }}\right)\left(5 \dfrac{\text { gal }}{\mathrm{min}}\right)=\dfrac{x}{10} \dfrac{\text { pounds }}{\mathrm{min}} . \nonumber

Thus, we have

\dfrac{d x}{d t}=10-\dfrac{x}{10} \nonumber

This equation is easily solved using the methods for first order equations.
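
For reference, solving this equation with the integrating factor e^{t / 10} and x(0)=0 gives x(t)=100\left(1-e^{-t / 10}\right), so the salt content approaches 100 pounds. The short sketch below checks this against a direct numerical integration.

```python
# Sketch: check x(t) = 100*(1 - exp(-t/10)) against a numerical integration
# of dx/dt = 10 - x/10 with x(0) = 0.
import numpy as np
from scipy.integrate import solve_ivp

sol = solve_ivp(lambda t, x: 10 - x[0] / 10, (0.0, 60.0), [0.0], rtol=1e-9)
t_end = sol.t[-1]
print(sol.y[0, -1], 100 * (1 - np.exp(-t_end / 10)))   # both near the limit of 100 lb
```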

Example 2.22. Double Tank Problem

image
Figure 2.27. The two tank problem.

One has two tanks connected together, labelled tank X and tank Y, as shown in Figure 2.27. Let tank X initially have 100 gallons of brine made with 100 pounds of salt. Tank Y initially has 100 gallons of pure water. Now pure water is pumped into tank X at a rate of 2.0 gallons per minute. Some of the mixture of brine and pure water flows into tank Y at 3 gallons per minute. To keep the tank levels the same, the mixture in tank Y flows back into tank X at a rate of one gallon per minute, and 2.0 gallons per minute drains out. Find the amount of salt at any given time in the tanks. What happens over a long period of time?

In this problem we set up two equations. Let x(t) be the amount of salt in tank X and y(t) the amount of salt in tank Y. Again, we carefully look at the rates into and out of each tank in order to set up the system of differential equations. We obtain the system

\begin{aligned} &\dfrac{d x}{d t}=\dfrac{y}{100}-\dfrac{3 x}{100} \\[4pt] &\dfrac{d y}{d t}=\dfrac{3 x}{100}-\dfrac{3 y}{100} \end{aligned} \nonumber

This is a linear, homogeneous constant coefficient system of two first order equations, which we know how to solve.
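
For instance, the eigenvalues of the coefficient matrix are both negative real numbers, so the salt in both tanks eventually washes out. The sketch below confirms this numerically for the stated initial data x(0)=100 and y(0)=0.

```python
# Sketch: integrate the two-tank system numerically with x(0) = 100 lb, y(0) = 0 lb.
import numpy as np
from scipy.integrate import solve_ivp

A = np.array([[-3 / 100, 1 / 100],
              [ 3 / 100, -3 / 100]])

sol = solve_ivp(lambda t, u: A @ u, (0.0, 600.0), [100.0, 0.0], rtol=1e-9)
print("eigenvalues:", np.linalg.eigvals(A))   # both negative -> all the salt drains away
print("x, y at t = 600 min:", sol.y[:, -1])
```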

Chemical Kinetics

There are many problems that come from studying chemical reactions. The simplest reaction is when a chemical A turns into chemical B. This happens at a certain rate, k>0. This can be represented by the chemical formula

A \underset{k}{\longrightarrow} B \nonumber

In this case we have that the rates of change of the concentrations of A,[A], and B,[B], are given by

\begin{aligned} &\dfrac{d[A]}{d t}=-k[A] \\[4pt] &\dfrac{d[B]}{d t}=k[A] \end{aligned} \nonumber

Think about this as it is a key to understanding the next reactions.

A more complicated reaction is given by

A \underset{k_{1}}{\longrightarrow} B \underset{k_{2}}{\longrightarrow} C \text {. } \nonumber

In this case we can add to the above equation the rates of change of concentrations [B] and [C]. The resulting system of equations is

\begin{aligned} \dfrac{d[A]}{d t} &=-k_{1}[A] \\[4pt] \dfrac{d[B]}{d t} &=k_{1}[A]-k_{2}[B] \\[4pt] \dfrac{d[C]}{d t} &=k_{2}[B] \end{aligned} \nonumber

One can further consider reactions in which a reverse reaction is possible. Thus, a further generalization occurs for the reaction

A \underset{k_{3}}{\stackrel{k_{1}}{\rightleftharpoons}} B \underset{k_{2}}{\longrightarrow} C \text {. } \nonumber

The resulting system of equations is

\begin{aligned} &\dfrac{d[A]}{d t}=-k_{1}[A]+k_{3}[B] \\[4pt] &\dfrac{d[B]}{d t}=k_{1}[A]-k_{2}[B]-k_{3}[B] \\[4pt] &\dfrac{d[C]}{d t}=k_{2}[B] \end{aligned} \nonumber

More complicated chemical reactions will be discussed at a later time.
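
Even before more complicated reactions appear, the linear systems above are easy to study numerically. The sketch below integrates the reversible-reaction system; the rate constants and the initial concentration are assumed values chosen for illustration (compare Problem 2.16).

```python
# Sketch: integrate the reversible-reaction system for [A], [B], [C].
# Rate constants and the initial concentration are assumed for illustration.
import numpy as np
from scipy.integrate import solve_ivp

k1, k2, k3 = 0.20, 0.05, 0.10     # 1/ms (assumed)

def rates(t, conc):
    A, B, C = conc
    return [-k1 * A + k3 * B,
             k1 * A - k2 * B - k3 * B,
             k2 * B]

sol = solve_ivp(rates, (0.0, 50.0), [1.0, 0.0, 0.0], rtol=1e-9)
print(sol.y[:, -1], "total:", sol.y[:, -1].sum())   # total concentration is conserved
```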

Epidemics

Another interesting area of application of differential equations is in predicting the spread of disease. Typically, one has a population of susceptible people or animals. Several infected individuals are introduced into the population and one is interested in how the infection spreads and if the number of infected people drastically increases or dies off. Such models are typically nonlinear and we will look at what is called the SIR model in the next chapter. In this section we will consider a simple linear model.

Let’s break the population into three classes. First, S(t) are the healthy people, who are susceptible to infection. Let I(t) be the number of infected people. Of these infected people, some will die from the infection and others recover. Let’s assume that initially there is one infected person and the rest, say N, are healthy. Can we predict how many deaths have occurred by time t?

Let’s try and model this problem using the compartmental analysis we had seen in the mixing problems. The total rate of change of any population would be due to those entering the group less those leaving the group. For example, the number of healthy people decreases due to infection and can increase when some of the infected group recovers. Let’s assume that the rate of infection is proportional to the number of healthy people, a S. Also, we assume that the number who recover is proportional to the number of infected, r I. Thus, the rate of change of the healthy people is found as

\dfrac{d S}{d t}=-a S+r I . \nonumber

Let the number of deaths be D(t). Then, the death rate could be taken to be proportional to the number of infected people. So,

\dfrac{d D}{d t}=d I \nonumber

Finally, the rate of change of infectives is due to healthy people getting infected and the infectives who either recover or die. Using the corresponding terms in the other equations, we can write

\dfrac{d I}{d t}=a S-r I-d I . \nonumber

This linear system can be written in matrix form.

\dfrac{d}{d t}\left(\begin{array}{c} S \\[4pt] I \\[4pt] D \end{array}\right)=\left(\begin{array}{ccc} -a & r & 0 \\[4pt] a & -d-r & 0 \\[4pt] 0 & d & 0 \end{array}\right)\left(\begin{array}{c} S \\[4pt] I \\[4pt] D \end{array}\right) \nonumber

The eigenvalue equation for this system is

\lambda\left[\lambda^{2}+(a+r+d) \lambda+a d\right]=0 . \nonumber

The reader can find the solutions of this system and determine if this is a realistic model.
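
As a starting point, one can compute the eigenvalues numerically for sample rate constants; the values below are assumed (they match those suggested later in Problem 2.17).

```python
# Sketch: eigenvalues of the epidemic coefficient matrix for sample rates.
import numpy as np

a, r, d = 2.0, 1.0, 3.0     # infection, recovery, and death rates in 1/days (assumed)
M = np.array([[-a,      r, 0.0],
              [ a, -d - r, 0.0],
              [0.0,     d, 0.0]])
print(np.linalg.eigvals(M))
# One eigenvalue is zero; the other two come from lambda^2 + (a + r + d)*lambda + a*d = 0.
```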

Appendix: Diagonalization and Linear Systems

As we have seen, the matrix formulation for linear systems can be powerful, especially for n differential equations involving n unknown functions. Our ability to proceed towards solutions depended upon the solution of eigenvalue problems. However, in the case of repeated eigenvalues we saw some additional complications. This all depends deeply on the background linear algebra. Namely, we relied on being able to diagonalize the given coefficient matrix. In this section we will discuss the limitations of diagonalization and introduce the Jordan canonical form.

We begin with the notion of similarity. Matrix A is similar to matrix B if and only if there exists a nonsingular matrix P such that

B=P^{-1} A P . \nonumber

Recall that a nonsingular matrix has a nonzero determinant and is invertible.

We note that the similarity relation is an equivalence relation. Namely, it satisfies the following

  1. A is similar to itself.
  2. If A is similar to B, then B is similar to A.
  3. If A is similar to B and B is similar to C, then A is similar to C.

Also, if A is similar to B, then they have the same eigenvalues. This follows from a simple computation of the eigenvalue equation. Namely,

\begin{aligned} 0 &=\operatorname{det}(B-\lambda I) \\[4pt] &=\operatorname{det}\left(P^{-1} A P-\lambda P^{-1} I P\right) \\[4pt] &=\operatorname{det}(P)^{-1} \operatorname{det}(A-\lambda I) \operatorname{det}(P) \\[4pt] &=\operatorname{det}(A-\lambda I) \end{aligned} \nonumber

Therefore, \operatorname{det}(A-\lambda I)=0 and \lambda is an eigenvalue of both A and B.

An n \times n matrix A is diagonalizable if and only if A is similar to a diagonal matrix D; i.e., there exists a nonsingular matrix P such that

D=P^{-1} A P . \nonumber

One of the most important theorems in linear algebra is the Spectral Theorem. This theorem tells us when a matrix can be diagonalized. In fact, it goes beyond matrices to the diagonalization of linear operators. We learn in linear algebra that linear operators can be represented by matrices once we pick a particular representation basis. Diagonalization is simplest for finite dimensional vector spaces and requires some generalization for infinite dimensional vector spaces. Examples of operators to which the spectral theorem applies are self-adjoint operators (more generally, normal operators on Hilbert spaces). We will explore some of these ideas later in the course. The spectral theorem provides a canonical decomposition, called the spectral decomposition, or eigendecomposition, of the underlying vector space on which the operator acts.

The next theorem tells us how to diagonalize a matrix:

Theorem 2.23. Let A be an n \times n matrix. Then A is diagonalizable if and only if A has n linearly independent eigenvectors. If so, then

D=P^{-1} A P . \nonumber

If \left\{v_{1}, \ldots, v_{n}\right\} are the eigenvectors of A and \left\{\lambda_{1}, \ldots, \lambda_{n}\right\} are the corresponding eigenvalues, then v_{j} is the jth column of P and D_{j j}=\lambda_{j}.

A simpler determination results by noting

Theorem 2.24. Let A be an n \times n matrix with n real and distinct eigenvalues. Then A is diagonalizable.

Therefore, we need only look at the eigenvalues and determine diagonalizability. In fact, one also has from linear algebra the following result.

Theorem 2.25. Let A be an n \times n real symmetric matrix. Then A is diagonalizable.

Recall that a symmetric matrix is one whose transpose is the same as the matrix, or A_{i j}=A_{j i}.

Example 2.26. Consider the matrix

A=\left(\begin{array}{lll} 1 & 2 & 2 \\[4pt] 2 & 3 & 0 \\[4pt] 2 & 0 & 3 \end{array}\right) \nonumber

This is a real symmetric matrix. The characteristic polynomial is found to be

\operatorname{det}(A-\lambda I)=-(\lambda-5)(\lambda-3)(\lambda+1)=0 \nonumber

As before, we can determine the corresponding eigenvectors (for \lambda=-1,3,5, respectively) as

\left(\begin{array}{c} -2 \\[4pt] 1 \\[4pt] 1 \end{array}\right), \quad\left(\begin{array}{c} 0 \\[4pt] -1 \\[4pt] 1 \end{array}\right), \quad\left(\begin{array}{l} 1 \\[4pt] 1 \\[4pt] 1 \end{array}\right) \text {. } \nonumber

We can use these to construct the diagonalizing matrix P. Namely, we have

P^{-1} A P=\left(\begin{array}{ccc} -2 & 0 & 1 \\[4pt] 1 & -1 & 1 \\[4pt] 1 & 1 & 1 \end{array}\right)^{-1}\left(\begin{array}{lll} 1 & 2 & 2 \\[4pt] 2 & 3 & 0 \\[4pt] 2 & 0 & 3 \end{array}\right)\left(\begin{array}{ccc} -2 & 0 & 1 \\[4pt] 1 & -1 & 1 \\[4pt] 1 & 1 & 1 \end{array}\right)=\left(\begin{array}{ccc} -1 & 0 & 0 \\[4pt] 0 & 3 & 0 \\[4pt] 0 & 0 & 5 \end{array}\right) \nonumber
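
This computation is easy to verify numerically. The short sketch below forms P from the eigenvectors above and checks that P^{-1} A P is indeed \operatorname{diag}(-1,3,5).

```python
# Sketch: numerical check of the diagonalization in Example 2.26.
import numpy as np

A = np.array([[1.0, 2.0, 2.0],
              [2.0, 3.0, 0.0],
              [2.0, 0.0, 3.0]])
P = np.array([[-2.0,  0.0, 1.0],
              [ 1.0, -1.0, 1.0],
              [ 1.0,  1.0, 1.0]])

D = np.linalg.inv(P) @ A @ P
print(np.round(D, 10))          # diag(-1, 3, 5) up to round-off
print(np.linalg.eigvalsh(A))    # eigenvalues of the symmetric matrix A
```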

Now diagonalization is an important idea in solving linear systems of first order equations, as we have seen for simple systems. If our system is originally diagonal, that means our equations are completely uncoupled. Let our system take the form

\dfrac{d \mathbf{y}}{d t}=D \mathbf{y} \nonumber

where D is diagonal with entries \lambda_{i}, i=1, \ldots, n. The system of equations, y_{i}^{\prime}=\lambda_{i} y_{i}, has solutions

y_{i}(t)=c_{i} e^{\lambda_{i} t} \nonumber

Thus, it is easy to solve a diagonal system.

Let A be similar to this diagonal matrix. Then

\dfrac{d \mathbf{y}}{d t}=P^{-1} A P \mathbf{y} \nonumber

This can be rewritten as

\dfrac{d P \mathbf{y}}{d t}=A P \mathbf{y} \nonumber

Defining \mathbf{x}=P \mathbf{y}, we have

\dfrac{d \mathbf{x}}{d t}=A \mathbf{x} \nonumber

This simple derivation shows that if A is diagonalizable, then a transformation of the original system in \mathbf{x} to new coordinates, or a new basis, results in a simpler system in \mathbf{y}.

However, it is not always possible to diagonalize a given square matrix. This is because some matrices do not have enough linearly independent vectors, or we have repeated eigenvalues. However, we have the following theorem:

Theorem 2.27. Every n \times n matrix A is similar to a matrix of the form

J=\operatorname{diag}\left[J_{1}, J_{2}, \ldots, J_{n}\right] \nonumber

where

J_{i}=\left(\begin{array}{ccccc} \lambda_{i} & 1 & 0 & \cdots & 0 \\[4pt] 0 & \lambda_{i} & 1 & \cdots & 0 \\[4pt] \vdots & \ddots & \ddots & \ddots & \vdots \\[4pt] 0 & \cdots & 0 & \lambda_{i} & 1 \\[4pt] 0 & 0 & \cdots & 0 & \lambda_{i} \end{array}\right) \nonumber

We will not go into the details of how one finds this Jordan Canonical Form or prove the theorem. In practice you can use a computer algebra system to determine this and the similarity matrix. However, we would still need to know how to use it to solve our system of differential equations.

Example 2.28. Let’s consider a simple system with the 3 \times 3 Jordan block

A=\left(\begin{array}{lll} 2 & 1 & 0 \\[4pt] 0 & 2 & 1 \\[4pt] 0 & 0 & 2 \end{array}\right) \nonumber

The corresponding system of coupled first order differential equations takes the form

\begin{aligned} &\dfrac{d x_{1}}{d t}=2 x_{1}+x_{2}, \\[4pt] &\dfrac{d x_{2}}{d t}=2 x_{2}+x_{3}, \\[4pt] &\dfrac{d x_{3}}{d t}=2 x_{3} . \end{aligned} \nonumber

The last equation is simple to solve, giving x_{3}(t)=c_{3} e^{2 t}. Inserting into the second equation, you have

\dfrac{d x_{2}}{d t}=2 x_{2}+c_{3} e^{2 t} \nonumber

Using the integrating factor, e^{-2 t}, one can solve this equation to get x_{2}(t)=\left(c_{2}+c_{3} t\right) e^{2 t}. Similarly, one can solve the first equation to obtain x_{1}(t)=\left(c_{1}+c_{2} t+\dfrac{1}{2} c_{3} t^{2}\right) e^{2 t}.

This should remind you of a problem we had solved earlier leading to the generalized eigenvalue problem in (2.43). This suggests that there is a more general theory when there are multiple eigenvalues and relating to Jordan canonical forms.

Let’s write the solution we just obtained in vector form. We have

\mathbf{x}(t)=\left[c_{1}\left(\begin{array}{l} 1 \\[4pt] 0 \\[4pt] 0 \end{array}\right)+c_{2}\left(\begin{array}{l} t \\[4pt] 1 \\[4pt] 0 \end{array}\right)+c_{3}\left(\begin{array}{c} \dfrac{1}{2} t^{2} \\[4pt] t \\[4pt] 1 \end{array}\right)\right] e^{2 t} \nonumber

It looks like this solution is a linear combination of three linearly independent solutions,

\begin{aligned} &\mathbf{x}=\mathbf{v}_{1} e^{\lambda t} \\[4pt] &\mathbf{x}=\left(t \mathbf{v}_{1}+\mathbf{v}_{2}\right) e^{\lambda t} \\[4pt] &\mathbf{x}=\left(\dfrac{1}{2} t^{2} \mathbf{v}_{1}+t \mathbf{v}_{2}+\mathbf{v}_{3}\right) e^{\lambda t} \end{aligned} \nonumber

where \lambda=2 and the vectors satisfy the equations

\begin{aligned} &(A-\lambda I) \mathbf{v}_{1}=0 \\[4pt] &(A-\lambda I) \mathbf{v}_{2}=\mathbf{v}_{1} \\[4pt] &(A-\lambda I) \mathbf{v}_{3}=\mathbf{v}_{2} \end{aligned} \nonumber

and

\begin{aligned} (A-\lambda I) \mathbf{v}_{1} &=0 \\[4pt] (A-\lambda I)^{2} \mathbf{v}_{2} &=0 \\[4pt] (A-\lambda I)^{3} \mathbf{v}_{3} &=0 \end{aligned} \nonumber

It is easy to generalize this result to build linearly independent solutions corresponding to multiple roots (eigenvalues) of the characteristic equation.
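
One can also check the Jordan-block solution above against the matrix exponential, since x(t)=e^{A t} x(0) for any constant coefficient linear system. The sketch below uses arbitrary constants c_{1}, c_{2}, c_{3} as the initial values.

```python
# Sketch: compare the Jordan-block solution with the matrix exponential e^{At} x(0).
import numpy as np
from scipy.linalg import expm

A = np.array([[2.0, 1.0, 0.0],
              [0.0, 2.0, 1.0],
              [0.0, 0.0, 2.0]])

c1, c2, c3 = 1.0, -2.0, 3.0                 # arbitrary constants = initial values
x0 = np.array([c1, c2, c3])
t = 0.7

x_expm = expm(A * t) @ x0
x_formula = np.exp(2 * t) * np.array([c1 + c2 * t + 0.5 * c3 * t**2,
                                      c2 + c3 * t,
                                      c3])
print(np.allclose(x_expm, x_formula))       # True
```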

Problems

2.1. Consider the system

\begin{array}{r} x^{\prime}=-4 x-y \\[4pt] y^{\prime}=x-2 y \end{array} \nonumber

a. Determine the second order differential equation satisfied by x(t).

b. Solve the differential equation for x(t).

c. Using this solution, find y(t).

d. Verify your solutions for x(t) and y(t).

e. Find a particular solution to the system given the initial conditions x(0)= 1 and y(0)=0.

2.2. Consider the following systems. Determine the families of orbits for each system and sketch several orbits in the phase plane and classify them by their type (stable node, etc.)

a.

\begin{aligned} &x^{\prime}=3 x \\[4pt] &y^{\prime}=-2 y \end{aligned} \nonumber

b.

\begin{aligned} &x^{\prime}=-y \\[4pt] &y^{\prime}=-5 x \end{aligned} \nonumber

c.

\begin{aligned} &x^{\prime}=2 y \\[4pt] &y^{\prime}=-3 x \end{aligned} \nonumber

d.

\begin{aligned} &x^{\prime}=x-y \\[4pt] &y^{\prime}=y \end{aligned} \nonumber

e.

\begin{aligned} &x^{\prime}=2 x+3 y \\[4pt] &y^{\prime}=-3 x+2 y \end{aligned} \nonumber

2.3. Use the transformations relating polar and Cartesian coordinates to prove that

\dfrac{d \theta}{d t}=\dfrac{1}{r^{2}}\left[x \dfrac{d y}{d t}-y \dfrac{d x}{d t}\right] \nonumber

2.4. In Equation (2.34) the exponential of a matrix was defined.

a. Let

A=\left(\begin{array}{ll} 2 & 0 \\[4pt] 0 & 0 \end{array}\right) \nonumber

Compute e^{A}.

b. Give a definition of \cos A and compute \cos \left(\begin{array}{ll}1 & 0 \\[4pt] 0 & 2\end{array}\right) in simplest form.

c. Prove e^{P A P^{-1}}=P e^{A} P^{-1}.

2.5. Consider the general system

\begin{aligned} &x^{\prime}=a x+b y \\[4pt] &y^{\prime}=c x+d y . \end{aligned} \nonumber

Can one determine the family of trajectories for the general case? Recall, this means we have to solve the first order equation

\dfrac{d y}{d x}=\dfrac{c x+d y}{a x+b y} . \nonumber

[Actually, this equation is homogeneous of degree 0.] It can be written in the form \dfrac{d y}{d x}=F\left(\dfrac{y}{x}\right). For such equations, one can make the substitution z=\dfrac{y}{x}, or y(x)=x z(x), and obtain a separable equation for z.

a. Using the general system, show that z=z(x) satisfies an equation of the form

x \dfrac{d z}{d x}=F(z)-z . \nonumber

Identify the function F(z).

b. Use the equation for z(x) in part a to find the family of trajectories of the system

\begin{aligned} x^{\prime} &=x-y \\[4pt] y^{\prime} &=x+y . \end{aligned} \nonumber

First determine the appropriate F(z) and then solve the resulting separable equation as a relation between z and x. Then write the solution of the original equation in terms of x and y.

c. Use polar coordinates to describe the family of solutions obtained. You can rewrite the solution in polar coordinates and/or solve the system rewritten in polar coordinates.

2.6. Find the eigenvalue(s) and eigenvector(s) for the following:
a. \left(\begin{array}{ll}4 & 2 \\[4pt] 3 & 3\end{array}\right)
b. \left(\begin{array}{ll}3 & -5 \\[4pt] 1 & -1\end{array}\right)
c. \left(\begin{array}{ll}4 & 1 \\[4pt] 0 & 4\end{array}\right)
d. \left(\begin{array}{ccc}1 & -1 & 4 \\[4pt] 3 & 2 & -1 \\[4pt] 2 & 1 & -1\end{array}\right)

2.7. Consider the following systems. For each system determine the coefficient matrix. When possible, solve the eigenvalue problem for each matrix and use the eigenvalues and eigenvectors to provide solutions to the given systems. Finally, in the common cases which you investigated in Problem 2.2, make comparisons with your previous answers, such as what type of eigenvalues correspond to stable nodes.

a.

\begin{aligned} &x^{\prime}=3 x-y \\[4pt] &y^{\prime}=2 x-2 y \end{aligned} \nonumber

b.

\begin{aligned} &x^{\prime}=-y \\[4pt] &y^{\prime}=-5 x \end{aligned} \nonumber

c.

\begin{aligned} &x^{\prime}=x-y \\[4pt] &y^{\prime}=y \end{aligned} \nonumber

d.

\begin{aligned} &x^{\prime}=2 x+3 y \\[4pt] &y^{\prime}=-3 x+2 y \end{aligned} \nonumber

e.

\begin{aligned} &x^{\prime}=-4 x-y \\[4pt] &y^{\prime}=x-2 y . \end{aligned} \nonumber

f.

\begin{aligned} &x^{\prime}=x-y \\[4pt] &y^{\prime}=x+y \end{aligned} \nonumber

2.8. For each of the following matrices consider the system \mathbf{x}^{\prime}=A \mathbf{x} and

a. Find the fundamental solution matrix.

b. Find the principal solution matrix.

a.

A=\left(\begin{array}{ll} 1 & 1 \\[4pt] 4 & 1 \end{array}\right) \nonumber

b.

A=\left(\begin{array}{ll} 2 & 5 \\[4pt] 0 & 2 \end{array}\right) \nonumber

c.

A=\left(\begin{array}{cc} 4 & -13 \\[4pt] 2 & -6 \end{array}\right) \nonumber

d.

A=\left(\begin{array}{ccc} 1 & -1 & 4 \\[4pt] 3 & 2 & -1 \\[4pt] 2 & 1 & -1 \end{array}\right) \nonumber

2.9. For the following problems

  1. Rewrite the problem in matrix form.
  2. Find the fundamental matrix solution.
  3. Determine the general solution of the nonhomogeneous system.
  4. Find the principal matrix solution.
  5. Determine the particular solution of the initial value problem.

a. y^{\prime \prime}+y=2 \sin 3 x, \quad y(0)=2, \quad y^{\prime}(0)=0.

b. y^{\prime \prime}-3 y^{\prime}+2 y=20 e^{-2 x}, \quad y(0)=0, \quad y^{\prime}(0)=6.

2.10. Prove Equation (2.75)

\mathbf{x}(t)=\Psi(t) \mathbf{x}_{0}+\Psi(t) \int_{t_{0}}^{t} \Psi^{-1}(s) \mathbf{f}(s) d s \nonumber

starting with Equation (2.73).

2.11. Add a third spring connected to mass two in the coupled system shown in Figure 2.19 to a wall on the far right. Assume that the masses are the same and the springs are the same.

a. Model this system with a set of first order differential equations.

b. If the masses are all 2.0 \mathrm{~kg} and the spring constants are all 10.0 \mathrm{~N} / \mathrm{m}, then find the general solution for the system.

c. Move mass one to the left (of equilibrium) 10.0 \mathrm{~cm} and mass two to the right 5.0 \mathrm{~cm}. Let them go. Find the solution and plot it as a function of time. Where is each mass at 5.0 seconds?

2.12. Consider the series circuit in Figure 2.20 with L=1.00 \mathrm{H}, R=1.00 \times 10^{2} \Omega, C=1.00 \times 10^{-4} \mathrm{~F}, and V_{0}=1.00 \times 10^{3} \mathrm{~V} .

a. Set up the problem as a system of two first order differential equations for the charge and the current.

b. Suppose that no charge is present and no current is flowing at time t=0 when V_{0} is applied. Find the current and the charge on the capacitor as functions of time.

c. Plot your solutions and describe how the system behaves over time.

2.13. You live in a cabin in the mountains and you would like to provide yourself with water from a water tank that is 25 feet above the level of the pipe going into the cabin. [See Figure 2.28.] The tank is filled from an aquifer 125 \mathrm{ft} below the surface, with the water being pumped at a maximum rate of 7 gallons per minute. As this flow rate is not sufficient to meet your daily needs, you would like to store water in the tank and have gravity supply the needed pressure. So, you design a cylindrical tank that is 35 \mathrm{ft} high and has a 10 \mathrm{ft} diameter. The water then flows through a pipe at the bottom of the tank. You are interested in the height h of the water at time t. This in turn will allow you to figure out the water pressure.

image
Figure 2.28. A water tank problem in the mountains.

First, the differential equation governing the flow of water from a tank through an orifice is given as

\dfrac{d h}{d t}=\dfrac{K-\alpha a \sqrt{2 g h}}{A} \nonumber

Here K is the rate at which water is being pumped into the top of the tank. A is the cross sectional area of the tank. \alpha is called the contraction coefficient, which measures the flow through the orifice of cross section a. We will assume that \alpha=0.63 and that the water enters a 6 in diameter PVC pipe.

a. Assuming that the water tank is initially full, find the minimum flow rate in the system during the first two hours.

b. What is the minimum water pressure during the first two hours? Namely, what is the gauge pressure at the house? Note that \Delta P=\rho g H, where \rho is the water density and H is the total height of the fluid (tank plus vertical pipe). Note that \rho g=0.434 psi (pounds per square inch).

c. How long will it take for the tank to drain to 10 \mathrm{ft} above the base of the tank?

Other information you may need is 1 gallon =231 \mathrm{in}^{3} and g=32.2 \mathrm{ft} / \mathrm{s}^{2}.

2.14. Initially a 200 gallon tank is filled with pure water. At time t=0 a salt concentration with 3 pounds of salt per gallon is added to the container at the rate of 4 gallons per minute, and the well-stirred mixture is drained from the container at the same rate.

a. Find the number of pounds of salt in the container as a function of time.

b. How many minutes does it take for the concentration to reach 2 pounds per gallon?

c. What does the concentration in the container approach for large values of time? Does this agree with your intuition?

d. Assuming that the tank holds much more than 200 gallons, and everything is the same except that the mixture is drained at 3 gallons per minute, what would the answers to parts a and b become?

2.15. You make two gallons of chili for a party. The recipe calls for two teaspoons of hot sauce per gallon, but you had accidentally put in two tablespoons per gallon. You decide to feed your guests the chili anyway. Assume that the guests take 1 \mathrm{cup} / \mathrm{min} of chili and you replace what was taken with beans and tomatoes without any hot sauce. [1 gal =16 cups and 1 \mathrm{~Tb}=3 \mathrm{tsp} .]

a. Write down the differential equation and initial condition for the amount of hot sauce as a function of time in this mixture-type problem.

b. Solve this initial value problem.

c. How long will it take to get the chili back to the recipe’s suggested concentration?

2.16. Consider the chemical reaction leading to the system in (2.111). Let the rate constants be k_{1}=0.20 \mathrm{~ms}^{-1}, k_{2}=0.05 \mathrm{~ms}^{-1}, and k_{3}=0.10 \mathrm{~ms}^{-1}. What do the eigenvalues of the coefficient matrix say about the behavior of the system? Find the solution of the system assuming [A](0)=A_{0}=1.0 \mu \mathrm{mol},[B](0)=0, and [C](0)=0. Plot the solutions for t=0.0 to 50.0 \mathrm{~ms} and describe what is happening over this time.

2.17. Consider the epidemic model leading to the system in (2.112). Choose the constants as a=2.0 days ^{-1}, d=3.0 days ^{-1}, and r=1.0 days ^{-1}. What are the eigenvalues of the coefficient matrix? Find the solution of the system assuming an initial population of 1,000 and one infected individual. Plot the solutions for t=0.0 to 5.0 days and describe what is happening over this time. Is this model realistic?

Nonlinear Systems

Introduction

Most of your studies of differential equations to date have been the study of linear differential equations and common methods for solving them. However, the real world is very nonlinear. So, why study linear equations? Because they are more readily solved. As you may recall, we can use the property of linear superposition of solutions of linear differential equations to obtain general solutions. We will see that we can sometimes approximate the solutions of nonlinear systems with linear systems in small regions of phase space.

In general, nonlinear equations cannot be solved to obtain general solutions. However, we can often investigate the behavior of the solutions without actually being able to find simple expressions in terms of elementary functions. When we want to follow the evolution of these solutions, we resort to numerically solving our differential equations. Such numerical methods need to be executed with care and there are many techniques that can be used. We will not go into these techniques in this course. However, we can make use of computer algebra systems, or computer programs, already developed for obtaining such solutions.

Nonlinear problems occur naturally. We will see problems from many of the same fields we explored in Section 2.9. One example is that of population dynamics. Typically, we have a certain population, y(t), and the differential equation governing the growth behavior of this population is developed in a manner similar to that used previously for mixing problems. We note that the rate of change of the population is given by the Rate In minus the Rate Out. The Rate In is given by the number of the species born per unit time. The Rate Out is given by the number that die per unit time.

A simple population model can be obtained if one assumes that these rates are linear in the population. Thus, we assume that the Rate In =b y and the Rate Out =m y. Here we have denoted the birth rate as b and the mortality rate as m. This gives the rate of change of population as

\dfrac{d y}{d t}=b y-m y \nonumber

Generally, these rates could depend upon time. In the case that they are both constant rates, we can define k=b-m and we obtain the familiar exponential model:

\dfrac{d y}{d t}=k y . \nonumber

This is easily solved and one obtains exponential growth (k>0) or decay (k<0). This model has been named after Malthus^{1}, a clergyman who used this model to warn of the impending doom of the human race if its reproductive practices continued.

However, when populations get large enough, there is competition for resources, such as space and food, which can lead to a higher mortality rate. Thus, the mortality rate may be a function of the population size, m=m(y). The simplest model would be a linear dependence, m=\tilde{m}+c y. Then, with k=b-\tilde{m}, the previous exponential model takes the form

\dfrac{d y}{d t}=k y-c y^{2} \nonumber

This is known as the logistic model of population growth. Typically, c is small and the added nonlinear term does not really kick in until the population gets large enough.

While one can solve this particular equation, it is instructive to study the qualitative behavior of the solutions without actually writing down the explicit solutions. Such methods are useful for more difficult nonlinear equations. We will investigate some simple first order equations in the next section. In the following section we present the analytic solution for completeness.

We will resume our studies of systems of equations and various applications throughout the rest of this chapter. We will see that we can get quite a bit of information about the behavior of solutions by using some of our earlier methods for linear systems.

Autonomous First Order Equations

In this section we will review the techniques for studying the stability of nonlinear first order autonomous equations. We will then extend this study to looking at families of first order equations which are connected through a parameter.

Recall that a first order autonomous equation is given in the form

{ }^{1} Malthus, Thomas Robert. An Essay on the Principle of Population. Library of Economics and Liberty. Retrieved August 2, 2007 from the World Wide Web: http://www.econlib.org/library/Malthus/malPop1.html

\dfrac{d y}{d t}=f(y) . \nonumber

We will assume that f and \dfrac{\partial f}{\partial y} are continuous functions of y, so that we know that solutions of initial value problems exist and are unique.

We will recall the qualitative methods for studying autonomous equations by considering the example

\dfrac{d y}{d t}=y-y^{2} . \nonumber

This is just an example of a logistic equation.

First, one determines the equilibrium, or constant, solutions given by y^{\prime}= 0 . For this case, we have y-y^{2}=0. So, the equilibrium solutions are y=0 and y=1. Sketching these solutions, we divide the t y-plane into three regions. Solutions that originate in one of these regions at t=t_{0} will remain in that region for all t>t_{0} since solutions cannot intersect. [Note that if two solutions intersect then they have common values y_{1} at time t_{1}. Using this information, we could set up an initial value problem for which the initial condition is y\left(t_{1}\right)=y_{1}. Since the two different solutions intersect at this point in the phase plane, we would have an initial value problem with two different solutions corresponding to the same initial condition. This contradicts the uniqueness assumption stated above. We will leave the reader to explore this further in the homework.]

Next, we determine the behavior of solutions in the three regions. Noting that d y / d t gives the slope of any solution in the plane, then we find that the solutions are monotonic in each region. Namely, in regions where d y / d t>0, we have monotonically increasing functions. We determine this from the right side of our equation.

For example, in this problem y-y^{2}>0 only for the middle region and y-y^{2}<0 for the other two regions. Thus, the slope is positive in the middle region, giving a rising solution as shown in Figure 3.1. Note that this solution does not cross the equilibrium solutions. Similar statements can be made about the solutions in the other regions.

We further note that the solutions on either side of y=1 tend to approach this equilibrium solution for large values of t. In fact, no matter how close one is to y=1, eventually one will approach this solution as t \rightarrow \infty. So, the equilibrium solution is a stable solution. Similarly, we see that y=0 is an unstable equilibrium solution.

If we are only interested in the behavior of the equilibrium solutions, we could just construct a phase line. In Figure 3.2 we place a vertical line to the right of the t y-plane plot. On this line one first places dots at the corresponding equilibrium solutions and labels the solutions. These points at the equilibrium solutions are end points for three intervals. In each interval one then places arrows pointing upward (downward) indicating solutions with positive (negative) slopes. Looking at the phase line one can now determine if a given equilibrium is stable (arrows pointing towards the point) or unstable

image
Figure 3.1. Representative solution behavior for y^{\prime}=y-y^{2}.

(arrows pointing away from the point). In Figure 3.3 we draw the final phase line by itself.

image
Figure 3.2. Representative solution behavior and phase line for y^{\prime}=y-y^{2}.

Solution of the Logistic Equation

We have seen that one does not need an explicit solution of the logistic equation (3.2) in order to study the behavior of its solutions. However, the logistic equation is an example of a nonlinear first order equation that is solvable. It is an example of a Riccati equation.

The general form of the Riccati equation is

image
Figure 3.3. Phase line for y^{\prime}=y-y^{2}.

\dfrac{d y}{d t}=a(t)+b(t) y+c(t) y^{2} \nonumber

As long as c(t) \neq 0, this equation can be reduced to a second order linear differential equation through the transformation

y(t)=-\dfrac{1}{c(t)} \dfrac{\dot{x}(t)}{x(t)} . \nonumber

We will demonstrate this using the simple case of the logistic equation,

\dfrac{d y}{d t}=k y-c y^{2} . \nonumber

We let

y(t)=\dfrac{1}{c} \dfrac{\dot{x}}{x} \nonumber

Then

\begin{aligned} \dfrac{d y}{d t} &=\dfrac{1}{c}\left[\dfrac{\ddot{x}}{x}-\left(\dfrac{\dot{x}}{x}\right)^{2}\right] \\[4pt] &=\dfrac{1}{c}\left[\dfrac{\ddot{x}}{x}-(c y)^{2}\right] \\[4pt] &=\dfrac{1}{c} \dfrac{\ddot{x}}{x}-c y^{2} \end{aligned} \nonumber

Inserting this into the logistic equation (3.5), we have

\dfrac{1}{c} \dfrac{\ddot{x}}{x}-c y^{2}=k \dfrac{1}{c}\left(\dfrac{\dot{x}}{x}\right)-c y^{2}, \nonumber

or

\ddot{x}=k \dot{x} . \nonumber

This equation is readily solved to give

x(t)=A+B e^{k t} . \nonumber

Therefore, the solution of the logistic equation is

y(t)=\dfrac{1}{c} \dfrac{\dot{x}}{x}=\dfrac{k B e^{k t}}{c\left(A+B e^{k t}\right)} \nonumber

It appears that we have two arbitrary constants. But, we started out with a first order differential equation and expect only one arbitrary constant. However, we can resolve this by dividing the numerator and denominator by k B e^{k t} and defining C=\dfrac{A}{B}. Then we have

y(t)=\dfrac{k / c}{1+C e^{-k t}}, \nonumber

showing that there really is only one arbitrary constant in the solution.

We should note that this is not the only way to obtain the solution to the logistic equation, though it does provide an introduction to Riccati equations. A more direct approach would be to use separation of variables on the logistic equation. The reader should verify this.
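
As a consistency check, the closed-form solution can be compared with a direct numerical integration of the logistic equation. The parameter values and the initial condition in the sketch below are assumed for illustration.

```python
# Sketch: compare y(t) = (k/c)/(1 + C*exp(-k*t)) with a numerical solution of
# y' = k*y - c*y^2. Parameters and the initial condition are assumed.
import numpy as np
from scipy.integrate import solve_ivp

k, c = 1.0, 0.5
y0 = 0.1
Cconst = (k / c) / y0 - 1.0        # fix C from the initial condition y(0) = y0

exact = lambda t: (k / c) / (1 + Cconst * np.exp(-k * t))
sol = solve_ivp(lambda t, y: k * y[0] - c * y[0] ** 2, (0.0, 10.0), [y0], rtol=1e-9)
print(abs(sol.y[0, -1] - exact(sol.t[-1])))   # small; both approach k/c = 2
```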

Bifurcations for First Order Equations

In this section we introduce families of first order differential equations of the form

\dfrac{d y}{d t}=f(y ; \mu) . \nonumber

Here \mu is a parameter that we can change and then observe the resulting effects on the behaviors of the solutions of the differential equation. When a small change in the parameter leads to large changes in the behavior of the solution, then the system is said to undergo a bifurcation. We will turn to some generic examples, leading to special bifurcations of first order autonomous differential equations.

Example 3.1. y^{\prime}=y^{2}-\mu.

First note that equilibrium solutions occur for y^{2}=\mu. In this problem, there are three cases to consider.

  1. \mu>0.

In this case there are two real solutions, y=\pm \sqrt{\mu}. Note that y^{2}-\mu<0 for |y|<\sqrt{\mu}. So, we have the left phase line in Figure 3.4.

  2. \mu=0.

There is only one equilibrium point at y=0. The equation becomes y^{\prime}=y^{2}. It is obvious that the right side of this equation is never negative. So, the phase line is shown as the middle line in Figure 3.4.

  3. \mu<0.

In this case there are no equilibrium solutions. Since y^{2}-\mu>0, the slopes for all solutions are positive as indicated by the last phase line in Figure 3.4

image
Figure 3.4. Phase lines for y^{\prime}=y^{2}-\mu. On the left \mu>0 and on the right \mu<0.

We can combine these results into one diagram known as a bifurcation diagram. We plot the equilibrium solutions y vs \mu. We begin by lining up the phase lines for various \mu ’s. We display these in Figure 3.5. Note the pattern of equilibrium points satisfies \mu=y^{2}, as it should. This is easily seen to be a parabolic curve. The upper branch of this curve is a collection of unstable equilibria and the bottom is a stable branch. So, we can dispose of the phase lines and just keep the equilibria. However, we will draw the unstable branch as a dashed line and the stable branch as a solid line.

The bifurcation diagram is displayed in Figure 3.6. This type of bifurcation is called a saddle-node bifurcation. The point \mu=0 at which the behavior changes is called the bifurcation point. As \mu goes from negative to positive, we go from having no equilibria to having one stable and one unstable equilibrium point.

Example 3.2. y^{\prime}=y^{2}-\mu y.

In this example we have two equilibrium points, y=0 and y=\mu. The behavior of the solutions depends upon the sign of y^{2}-\mu y=y(y-\mu). This leads to four cases with the indicated signs of the derivative.

  1. y>0, y-\mu>0 \Rightarrow y^{\prime}>0.
  2. y<0, y-\mu>0 \Rightarrow y^{\prime}<0
  3. y>0, y-\mu<0 \Rightarrow y^{\prime}<0.
  4. y<0, y-\mu<0 \Rightarrow y^{\prime}>0.

The corresponding phase lines and superimposed bifurcation diagram are shown in 3.7. The bifurcation diagram is in Figure 3.8 and this is called a transcritical bifurcation.

image
Figure 3.5. The typical phase lines for y^{\prime}=y^{2}-\mu.
image
Figure 3.6. Bifurcation diagram for y^{\prime}=y^{2}-\mu. This is an example of a saddle-node bifurcation.
image
Figure 3.7. Collection of phase lines for y^{\prime}=y^{2}-\mu y.

Example 3.3. y^{\prime}=y^{3}-\mu y.

For this last example, we find from y^{3}-\mu y=y\left(y^{2}-\mu\right)=0 that there are two cases.

  1. \mu<0 In this case there is only one equilibrium point at y=0. For positive values of y we have that y^{\prime}>0 and for negative values of y we have that y^{\prime}<0. Therefore, this is an unstable equilibrium point.
image
Figure 3.8. Bifurcation diagram for y^{\prime}=y^{2}-\mu y. This is an example of a transcritical bifurcation.
  2. \mu>0 Here we have three equilibria, y=0, \pm \sqrt{\mu}. A careful investigation shows that y=0 is a stable equilibrium point and that the other two equilibria are unstable.

In Figure 3.9 we show the phase lines for these two cases. The corresponding bifurcation diagram is then sketched in Figure 3.10. For obvious reasons this has been labeled a pitchfork bifurcation.

image
Figure 3.9. The phase lines for y^{\prime}=y^{3}-\mu y. The left one corresponds to \mu<0 and the right phase line is for \mu>0.
image
Figure 3.10. Bifurcation diagram for y^{\prime}=y^{3}-\mu y. This is an example of a pitchfork bifurcation.

Nonlinear Pendulum

In this section we will introduce the nonlinear pendulum as our first example of periodic motion in a nonlinear system. Oscillations are important in many areas of physics. We have already seen the motion of a mass on a spring, leading to simple, damped, and forced harmonic motions. Later we will explore these effects on a simple nonlinear system. In this section we will introduce the nonlinear pendulum and determine its period of oscillation.

We begin by deriving the pendulum equation. The simple pendulum consists of a point mass m hanging on a string of length L from some support. [See Figure 3.11.] One pulls the mass back to some starting angle, \theta_{0}, and releases it. The goal is to find the angular position as a function of time, \theta(t).

image
Figure 3.11. A simple pendulum consists of a point mass m attached to a string of length L. It is released from an angle \theta_{0}.

There are a couple of derivations possible. We could either use Newton’s Second Law of Motion, F=m a, or its rotational analogue in terms of torque. We will use the former only to limit the amount of physics background needed.

There are two forces acting on the point mass, the weight and the tension in the string. The weight points downward and has a magnitude of m g, where g is the standard symbol for the acceleration due to gravity. At the surface of the earth we can take this to be 9.8 \mathrm{~m} / \mathrm{s}^{2} or 32.2 \mathrm{ft} / \mathrm{s}^{2}. In Figure 3.12 we show both the weight and the tension acting on the mass. The net force is also shown.

The tension balances the projection of the weight vector, leaving an unbalanced component of the weight in the direction of the motion. Thus, the magnitude of the sum of the forces is easily found from this unbalanced component as F=m g \sin \theta.

Newton’s Second Law of Motion tells us that the net force is the mass times the acceleration. So, we can write

m \ddot{x}=-m g \sin \theta . \nonumber

Next, we need to relate x and \theta. Here x is the distance traveled, which is the length of the arc traced out by our point mass. The arclength is related to the angle, provided the angle is measured in radians. Namely, x=r \theta for r=L. Thus, we can write

image
Figure 3.12. There are two forces acting on the mass, the weight m g and the tension T. The magnitude of the net force is found to be F=m g \sin \theta.

m L \ddot{\theta}=-m g \sin \theta \nonumber

Canceling the masses leads to the nonlinear pendulum equation

L \ddot{\theta}+g \sin \theta=0 . \nonumber

There are several variations of Equation (3.8) which will be used in this text. The first one is the linear pendulum. This is obtained by making a small angle approximation. For small angles we know that \sin \theta \approx \theta. Under this approximation (3.8) becomes

L \ddot{\theta}+g \theta=0 . \nonumber

We can also make the system more realistic by adding damping. This could be due to energy loss in the way the string is attached to the support or due to the drag on the mass, etc. Assuming that the damping is proportional to the angular velocity, we have equations for the damped nonlinear and damped linear pendula:

\begin{gathered} L \ddot{\theta}+b \dot{\theta}+g \sin \theta=0 . \\[4pt] L \ddot{\theta}+b \dot{\theta}+g \theta=0 . \end{gathered} \nonumber

Finally, we can add forcing. Imagine that the support is attached to a device to make the system oscillate horizontally at some frequency. Then we could have equations such as

L \ddot{\theta}+b \dot{\theta}+g \sin \theta=F \cos \omega t . \nonumber

We will look at these and other oscillation problems later in the exercises. These are summarized in the table below.

image

In Search of Solutions

Before returning to the study of the equilibrium solutions of the nonlinear pendulum, we will look at how far we can get in obtaining analytical solutions. First, we investigate the simple linear pendulum.

The linear pendulum equation (3.9) is a constant coefficient second order linear differential equation. The roots of the characteristic equation are r= \pm \sqrt{\dfrac{g}{L}} i. Thus, the general solution takes the form

\theta(t)=c_{1} \cos \left(\sqrt{\dfrac{g}{L}} t\right)+c_{2} \sin \left(\sqrt{\dfrac{g}{L}} t\right) \nonumber

We note that this is usually simplified by introducing the angular frequency

\omega \equiv \sqrt{\dfrac{g}{L}} . \nonumber

One consequence of this solution, which is used often in introductory physics, is an expression for the period of oscillation of a simple pendulum. Recall that the period is the time it takes to complete one cycle of the oscillation. The period is found to be

T=\dfrac{2 \pi}{\omega}=2 \pi \sqrt{\dfrac{L}{g}} \nonumber

This value for the period of a simple pendulum is based on the linear pendulum equation, which was derived assuming a small angle approximation. How good is this approximation? What is meant by a small angle? We recall the Taylor series approximation of \sin \theta about \theta=0 :

\sin \theta=\theta-\dfrac{\theta^{3}}{3 !}+\dfrac{\theta^{5}}{5 !}-\ldots \nonumber

One learns how to obtain a bound on the error made in truncating this series to one term in a numerical analysis course. Here we can simply plot the relative error, which is defined as Relative Error =\left|\dfrac{\sin \theta-\theta}{\sin \theta}\right| \times 100 \%.

A plot of the relative error is given in Figure 3.13. We note that a one percent relative error corresponds to about 0.24 radians, which is less than fourteen degrees. Further discussion on this is provided at the end of this section.

image
Figure 3.13. The relative error in percent when approximating \sin \theta by \theta.
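
For readers who wish to reproduce this estimate, a short Python sketch along the following lines (the one percent threshold and the angle range are the only inputs) locates the angle at which the relative error reaches one percent:

import numpy as np

theta = np.linspace(0.01, 0.5, 1000)   # angles in radians, avoiding theta = 0
rel_error = np.abs((np.sin(theta) - theta) / np.sin(theta)) * 100

# largest angle at which the relative error is still at most one percent
theta_1pct = theta[rel_error <= 1.0].max()
print(f"{theta_1pct:.2f} rad = {np.degrees(theta_1pct):.1f} degrees")

This gives roughly 0.24 radians, in agreement with the discussion above.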

We now turn to the nonlinear pendulum. We first rewrite Equation (3.8) in the simpler form

\ddot{\theta}+\omega^{2} \sin \theta=0 . \nonumber

We next employ a technique that is useful for equations of the form

\ddot{\theta}+F(\theta)=0 \nonumber

when it is easy to integrate the function F(\theta). Namely, we note that

\dfrac{d}{d t}\left[\dfrac{1}{2} \dot{\theta}^{2}+\int^{\theta(t)} F(\phi) d \phi\right]=[\ddot{\theta}+F(\theta)] \dot{\theta} \nonumber

For our problem, we multiply Equation (3.17) by \dot{\theta},

\ddot{\theta} \dot{\theta}+\omega^{2} \sin \theta \dot{\theta}=0 \nonumber

and note that the left side of this equation is a perfect derivative. Thus,

\dfrac{d}{d t}\left[\dfrac{1}{2} \dot{\theta}^{2}-\omega^{2} \cos \theta\right]=0 \nonumber

Therefore, the quantity in the brackets is a constant. So, we can write

\dfrac{1}{2} \dot{\theta}^{2}-\omega^{2} \cos \theta=c . \nonumber

Solving for \dot{\theta}, we obtain

\dfrac{d \theta}{d t}=\sqrt{2\left(c+\omega^{2} \cos \theta\right)} \nonumber

This equation is a separable first order equation and we can rearrange and integrate the terms to find that

t=\int d t=\int \dfrac{d \theta}{\sqrt{2\left(c+\omega^{2} \cos \theta\right)}} . \nonumber

Of course, one needs to be able to do the integral. When one gets a solution in this implicit form, one says that the problem has been solved by quadratures. Namely, the solution is given in terms of some integral. In the appendix to this chapter we show that this solution can be written in terms of elliptic integrals and derive corrections to the formula for the period of a pendulum.
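
Although the quadrature above is implicit, the conservation law behind it is easy to check numerically. The following Python sketch (with an assumed value \omega = 2 and initial angle \theta(0)=1, \dot{\theta}(0)=0) integrates \ddot{\theta}+\omega^{2} \sin \theta=0 and verifies that \dfrac{1}{2} \dot{\theta}^{2}-\omega^{2} \cos \theta stays essentially constant along the solution:

import numpy as np
from scipy.integrate import solve_ivp

omega = 2.0                       # assumed value of sqrt(g/L)

def pendulum(t, u):
    theta, v = u                  # v = d(theta)/dt
    return [v, -omega**2 * np.sin(theta)]

sol = solve_ivp(pendulum, (0, 20), [1.0, 0.0], rtol=1e-10, atol=1e-12, dense_output=True)
t = np.linspace(0, 20, 500)
theta, v = sol.sol(t)
c = 0.5 * v**2 - omega**2 * np.cos(theta)
print("variation in c over the run:", c.max() - c.min())   # should be tiny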

The Stability of Fixed Points in Nonlinear Systems

We are now interested in studying the stability of the equilibrium solutions of the nonlinear pendulum. Along the way we will develop some basic methods for studying the stability of equilibria in nonlinear systems.

We begin with the linear differential equation for damped oscillations as given earlier in Equation (3.9). In this case, we have a second order equation of the form

x^{\prime \prime}+b x^{\prime}+\omega^{2} x=0 . \nonumber

Using the methods of Chapter 2, this second order equation can be written as a system of two first order equations:

\begin{aligned} &x^{\prime}=y \\[4pt] &y^{\prime}=-b y-\omega^{2} x . \end{aligned} \nonumber

This system has only one equilibrium solution, x=0, y=0.

Turning to the damped nonlinear pendulum, we have the system

\begin{aligned} x^{\prime} &=y \\[4pt] y^{\prime} &=-b y-\omega^{2} \sin x . \end{aligned} \nonumber

This system also has the equilibrium solution x=0, y=0. However, there are actually an infinite number of equilibrium solutions. The equilibria are determined from y=0 and -b y-\omega^{2} \sin x=0. This implies that \sin x=0. There are an infinite number of solutions: x=n \pi, n=0, \pm 1, \pm 2, \ldots So, we have an infinite number of equilibria, (n \pi, 0), n=0, \pm 1, \pm 2, \ldots

Next, we need to determine their stability. To do this we need a more general theory for nonlinear systems. We begin with the n-dimensional system

\mathbf{x}^{\prime}=\mathbf{f}(\mathbf{x}), \quad \mathrm{x} \in \mathrm{R}^{n} \nonumber

Here \mathbf{f}: \mathrm{R}^{n} \rightarrow \mathrm{R}^{n}. We define fixed points, or equilibrium solutions, of this system as points \mathrm{x}^{*} satisfying \mathbf{f}\left(\mathrm{x}^{*}\right)=\mathbf{0}.

The stability in the neighborhood of fixed points can now be determined. We are interested in what happens to solutions of our system with initial conditions starting near a fixed point. We can represent a point near a fixed point in the form \mathbf{x}=\mathbf{x}^{*}+\boldsymbol{\xi}, where the length of \boldsymbol{\xi} gives an indication of how close we are to the fixed point. So, we consider that initially, |\boldsymbol{\xi}| \ll 1.

As the system evolves, \boldsymbol{\xi} will change. The change of \boldsymbol{\xi} in time is in turn governed by a system of equations. We can approximate this evolution as follows. First, we note that

\mathbf{x}^{\prime}=\boldsymbol{\xi}^{\prime} \nonumber

Next, we have that

\mathbf{f}(\mathbf{x})=\mathbf{f}\left(\mathbf{x}^{*}+\boldsymbol{\xi}\right) \nonumber

We can expand the right side about the fixed point using a multidimensional version of Taylor’s Theorem. Thus, we have that

\mathbf{f}\left(\mathbf{x}^{*}+\boldsymbol{\xi}\right)=\mathbf{f}\left(\mathbf{x}^{*}\right)+D \mathbf{f}\left(\mathbf{x}^{*}\right) \boldsymbol{\xi}+O\left(|\boldsymbol{\xi}|^{2}\right) \nonumber

Here Df is the Jacobian matrix, defined as

D \mathbf{f}=\left(\begin{array}{cccc} \dfrac{\partial f_{1}}{\partial x_{1}} & \dfrac{\partial f_{1}}{\partial x_{2}} & \cdots & \dfrac{\partial f_{1}}{\partial x_{n}} \\[4pt] \dfrac{\partial f_{2}}{\partial x_{1}} & \ddots & \ddots & \vdots \\[4pt] \vdots & \ddots & \ddots & \vdots \\[4pt] \dfrac{\partial f_{n}}{\partial x_{1}} & \cdots & \cdots & \dfrac{\partial f_{n}}{\partial x_{n}} \end{array}\right) \nonumber

Noting that \mathbf{f}\left(\mathbf{x}^{*}\right)=\mathbf{0}, we then have that system (3.22) becomes

\xi^{\prime} \approx D \mathbf{f}\left(\mathbf{x}^{*}\right) \boldsymbol{\xi} \nonumber

It is this equation which describes the behavior of the system near the fixed point. We say that system (3.22) has been linearized or that Equation (3.23) is the linearization of system (3.22).

Example 3.4. As an example of the application of this linearization, we look at the system

\begin{aligned} &x^{\prime}=-2 x-3 x y \\[4pt] &y^{\prime}=3 y-y^{2} \end{aligned} \nonumber

We first determine the fixed points:

\begin{aligned} &0=-2 x-3 x y=-x(2+3 y) \\[4pt] &0=3 y-y^{2}=y(3-y) \end{aligned} \nonumber

From the second equation, we have that either y=0 or y=3. The first equation then gives x=0 in either case. So, there are two fixed points: (0,0) and (0,3).

Next, we linearize about each fixed point separately. First, we write down the Jacobian matrix.

D \mathbf{f}(x, y)=\left(\begin{array}{cc} -2-3 y & -3 x \\[4pt] 0 & 3-2 y \end{array}\right) \nonumber

  1. Case I (0,0).

In this case we find that

D \mathbf{f}(0,0)=\left(\begin{array}{cc} -2 & 0 \\[4pt] 0 & 3 \end{array}\right) \nonumber

Therefore, the linearized equation becomes

\xi^{\prime}=\left(\begin{array}{cc} -2 & 0 \\[4pt] 0 & 3 \end{array}\right) \boldsymbol{\xi} \nonumber

This is equivalently written out as the system

\begin{aligned} &\xi_{1}^{\prime}=-2 \xi_{1} \\[4pt] &\xi_{2}^{\prime}=3 \xi_{2} \end{aligned} \nonumber

This is the linearized system about the origin. Note the similarity with the original system. We emphasize that the linearized equations are constant coefficient equations and we can use earlier matrix methods to determine the nature of the equilibrium point. The eigenvalues of the system are obviously \lambda=-2,3. Therefore, we have that the origin is a saddle point.

  2. Case II (0,3).

In this case we proceed as before. We write down the Jacobian matrix and look at its eigenvalues to determine the type of fixed point. So, we have that the Jacobian matrix is

D \mathbf{f}(0,3)=\left(\begin{array}{cc} -11 & 0 \\[4pt] 0 & -3 \end{array}\right) \nonumber

Here, we have the eigenvalues \lambda=-11,-3. So, this fixed point is a stable node. This analysis has given us a saddle and a stable node. We know what the behavior is like near each fixed point, but we have to resort to other means to say anything about the behavior far from these points. The phase portrait for this system is given in Figure 3.14. You should be able to find the saddle point and the node. Notice how solutions behave in regions far from these points.

image
Figure 3.14. Phase plane for the system x^{\prime}=-2 x-3 x y, y^{\prime}=3 y-y^{2}.
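
The computations in Example 3.4 can be reproduced symbolically. The following SymPy sketch builds the Jacobian matrix of this system and evaluates its eigenvalues at the two fixed points:

import sympy as sp

x, y = sp.symbols('x y')
f = sp.Matrix([-2*x - 3*x*y, 3*y - y**2])
J = f.jacobian([x, y])

for pt in [(0, 0), (0, 3)]:
    print(pt, J.subs({x: pt[0], y: pt[1]}).eigenvals())
# (0, 0): eigenvalues -2 and 3 (a saddle); (0, 3): eigenvalues -11 and -3 (a stable node)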

We can expect to be able to perform a linearization under general conditions. These are given in the Hartman-Grobman Theorem:

Theorem 3.5. A continuous map exists between the linear and nonlinear systems when D \mathbf{f}\left(\mathbf{x}^{*}\right) does not have any eigenvalues with zero real part.

Generally, there are several types of behavior that one can see in nonlinear systems. One can see sinks or sources, hyperbolic (saddle) points, elliptic points (centers) or foci. We have defined some of these for planar systems. In general, if at least two eigenvalues have real parts with opposite signs, then the fixed point is a hyperbolic point. If the real part of a nonzero eigenvalue is zero, then we have a center, or elliptic point.

Example 3.6. Return to the Nonlinear Pendulum

We are now ready to establish the behavior of the fixed points of the damped nonlinear pendulum in Equation (3.21). The system was

\begin{aligned} x^{\prime} &=y \\[4pt] y^{\prime} &=-b y-\omega^{2} \sin x \end{aligned} \nonumber

We found that there are an infinite number of fixed points at (n \pi, 0), n= 0, \pm 1, \pm 2, \ldots

We note that the Jacobian matrix is

D \mathbf{f}(x, y)=\left(\begin{array}{cc} 0 & 1 \\[4pt] -\omega^{2} \cos x & -b \end{array}\right) \text {. } \nonumber

Evaluating this at the fixed points, we find that

D \mathbf{f}(n \pi, 0)=\left(\begin{array}{cc} 0 & 1 \\[4pt] -\omega^{2} \cos n \pi & -b \end{array}\right)=\left(\begin{array}{cc} 0 & 1 \\[4pt] \omega^{2}(-1)^{n+1} & -b \end{array}\right) \text {. } \nonumber

There are two cases to consider: n even and n odd. For the first case, we find the eigenvalue equation

\lambda^{2}+b \lambda+\omega^{2}=0 \nonumber

This has the roots

\lambda=\dfrac{-b \pm \sqrt{b^{2}-4 \omega^{2}}}{2} \nonumber

For b^{2}<4 \omega^{2}, we have two complex conjugate roots with a negative real part. Thus, we have stable foci for even n values. If there is no damping, then we obtain centers.

In the second case, n odd, we have that

\lambda^{2}+b \lambda-\omega^{2}=0 \nonumber

In this case we find

\lambda=\dfrac{-b \pm \sqrt{b^{2}+4 \omega^{2}}}{2} . \nonumber

Since b^{2}+4 \omega^{2}>b^{2}, these roots will be real with opposite signs. Thus, we have hyperbolic points, or saddles.
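
A quick numerical check of this classification, with assumed sample values b = 0.5 and \omega = 2, is given by the following Python sketch, which evaluates the Jacobian at representative even and odd fixed points:

import numpy as np

b, omega = 0.5, 2.0               # assumed sample values
for n in (0, 1):                  # even and odd representatives
    J = np.array([[0.0, 1.0],
                  [-omega**2 * np.cos(n * np.pi), -b]])
    print("n =", n, "eigenvalues:", np.linalg.eigvals(J))
# n = 0: complex pair with negative real part (stable focus)
# n = 1: real eigenvalues of opposite sign (saddle)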

In Figure 3.15 we show the phase plane for the undamped nonlinear pendulum. We see that we have a mixture of centers and saddles. There are orbits for which there is periodic motion. At \theta=\pi the behavior is unstable, since it is difficult to keep the mass balanced vertically; this equilibrium makes more physical sense if we replace the string by a massless rod. There are also unbounded orbits, passing through all of the angles. These correspond to the mass spinning around the pivot in one direction forever. We have indicated in the figure solution curves with the initial conditions \left(x_{0}, y_{0}\right)=(0,3),(0,2),(0,1),(5,1).

When there is damping, we see that we can have a variety of other behaviors, as seen in Figure 3.16. In particular, energy loss leads to the mass settling around one of the stable fixed points. This helps explain why there are an infinite number of equilibria, even though physically the mass traces out a bounded set of Cartesian points. We have indicated in Figure 3.16 solution curves with the initial conditions \left(x_{0}, y_{0}\right)=(0,3),(0,2),(0,1),(5,1).

image
Figure 3.15. Phase plane for the undamped nonlinear pendulum. Solution curves are shown for initial conditions \left(x_{0}, y_{0}\right)=(0,3),(0,2),(0,1),(5,1).

Nonlinear Population Models

We have already encountered several models of population dynamics. Of course, one could dream up several other examples. There are two standard types of models: predator-prey and competing species. In the predator-prey model, one typically has one species, the predator, feeding on the other, the prey. We will look at the standard Lotka-Volterra model in this section. The competing species model looks similar, except there are a few sign changes, since one species is not feeding on the other. Also, we can build logistic terms into our model. We will save this latter type of model for the homework.

The Lotka-Volterra model takes the form

\begin{aligned} &\dot{x}=a x-b x y, \\[4pt] &\dot{y}=-d y+c x y \end{aligned} \nonumber

In this case, we can think of x as the population of rabbits (prey) and y as the population of foxes (predators). Choosing all constants to be positive, we can describe the terms.

  • ax: When left alone, the rabbit population will grow. Thus a is the natural growth rate without predators.
  • -d y : When there are no rabbits, the fox population should decay. Thus, the coefficient needs to be negative.
  • -b x y : We add a nonlinear term corresponding to the depletion of the rabbits when the foxes are around.
  • cxy: The more rabbits there are, the more food for the foxes. So, we add a nonlinear term giving rise to an increase in fox population.

image
Figure 3.16. Phase plane for the damped nonlinear pendulum. Solution curves are shown for initial conditions \left(x_{0}, y_{0}\right)=(0,3),(0,2),(0,1),(5,1).

The analysis of the Lotka-Volterra model begins with determining the fixed points. So, we have from Equation (3.34)

\begin{gathered} x(a-b y)=0, \\[4pt] y(-d+c x)=0 . \end{gathered} \nonumber

Therefore, the origin and \left(\dfrac{d}{c}, \dfrac{a}{b}\right) are the fixed points.

Next, we determine their stability, by linearization about the fixed points. We can use the Jacobian matrix, or we could just expand the right hand side of each equation in (3.34). The Jacobian matrix is D f(x, y)=\left(\begin{array}{cc}a-b y & -b x \\[4pt] c y & -d+c x\end{array}\right). Evaluating at each fixed point, we have

\begin{gathered} D f(0,0)=\left(\begin{array}{cc} a & 0 \\[4pt] 0 & -d \end{array}\right), \\[4pt] D f\left(\dfrac{d}{c}, \dfrac{a}{b}\right)=\left(\begin{array}{cc} 0 & -\dfrac{b d}{c} \\[4pt] \dfrac{a c}{b} & 0 \end{array}\right) . \end{gathered} \nonumber

The eigenvalues of (3.36) are \lambda=a,-d. So, the origin is a saddle point. The eigenvalues of (3.37) satisfy \lambda^{2}+a d=0. So, the other point is a center. In Figure 3.17 we show a sample direction field for the Lotka-Volterra system.

Another way to linearize is to expand the equations about the fixed points. Even though this is equivalent to computing the Jacobian matrix, it sometimes might be faster.

image
Figure 3.17. Phase plane for the Lotka-Volterra system given by \dot{x}=x-0.2 x y, \dot{y}= -y+0.2 x y. Solution curves are shown for initial conditions \left(x_{0}, y_{0}\right)=(8,3),(1,5).
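
The closed orbits about the center can be seen by integrating the system numerically. The following Python sketch uses the parameters of Figure 3.17 (a = d = 1, b = c = 0.2) and the same initial conditions:

import numpy as np
from scipy.integrate import solve_ivp
import matplotlib.pyplot as plt

def lv(t, u):
    x, y = u
    return [x - 0.2*x*y, -y + 0.2*x*y]

for x0, y0 in [(8, 3), (1, 5)]:          # initial conditions from Figure 3.17
    sol = solve_ivp(lv, (0, 30), [x0, y0], max_step=0.01)
    plt.plot(sol.y[0], sol.y[1])
plt.xlabel('x (prey)')
plt.ylabel('y (predators)')
plt.show()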

Limit Cycles

So far we have just been concerned with equilibrium solutions and their behavior. However, asymptotically stable fixed points are not the only attractors. There are other types of solutions, known as limit cycles, towards which a solution may tend. In this section we will look at some examples of these periodic solutions.

Such solutions are common in nature. Rayleigh investigated the problem

x^{\prime \prime}+c\left(\dfrac{1}{3}\left(x^{\prime}\right)^{2}-1\right) x^{\prime}+x=0 \nonumber

in the study of the vibrations of a violin string. Van der Pol studied an electrical circuit that models this behavior. Others have looked at biological systems, such as neural systems, and at chemical reactions, such as Michaelis-Menten kinetics and systems leading to chemical oscillations. One of the most important models in the historical study of dynamical systems is that of planetary motion and investigating the stability of planetary orbits. As is well known, these orbits are periodic.

Limit cycles are isolated periodic solutions towards which neighboring states might tend when stable. A key example exhibiting a limit cycle is given by the system

\begin{aligned} &x^{\prime}=\mu x-y-x\left(x^{2}+y^{2}\right) \\[4pt] &y^{\prime}=x+\mu y-y\left(x^{2}+y^{2}\right) \end{aligned} \nonumber

It is clear that the origin is a fixed point. The Jacobian matrix is given as

\operatorname{Df}(0,0)=\left(\begin{array}{cc} \mu & -1 \\[4pt] 1 & \mu \end{array}\right) \nonumber

The eigenvalues are found to be \lambda=\mu \pm i. For \mu=0 we have a center. For \mu<0 we have a stable spiral and for \mu>0 we have an unstable spiral. However, this spiral does not wander off to infinity. We see in Figure 3.18 that the equilibrium point is a spiral. However, in Figure 3.19 it is clear that the solution does not spiral out to infinity. It is bounded by a circle.

image
Figure 3.18. Phase plane for system (3.39) with \mu=0.4.
image
Figure 3.19. Phase plane for system (3.39) with \mu=0.4 showing that the inner spiral is bounded by a limit cycle.

One can actually find the radius of this circle. This requires rewriting the system in polar form. Recall from Chapter 2 that this is done using

\begin{gathered} r r^{\prime}=x x^{\prime}+y y^{\prime}, \\[4pt] r^{2} \theta^{\prime}=x y^{\prime}-y x^{\prime} . \end{gathered} \nonumber

Inserting the system (3.39) into these expressions, we have

r r^{\prime}=\mu r^{2}-r^{4}, \quad r^{2} \theta^{\prime}=r^{2}, \nonumber

r^{\prime}=\mu r-r^{3}, \theta^{\prime}=1 . \nonumber

Of course, for a circle r is constant, so we need to look at the equilibrium solutions of Equation (3.43). This amounts to solving \mu r-r^{3}=0 for r. The solutions of this equation are r=0, \pm \sqrt{\mu}. We need only keep the positive radius solution, r=\sqrt{\mu}. In Figures 3.18-3.19, \mu=0.4, so we expect a circle with r=\sqrt{0.4} \approx 0.63. The \theta equation just tells us that we follow the limit cycle in a counterclockwise direction.
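
This prediction is easy to test numerically. The sketch below integrates system (3.39) with \mu=0.4 from a point near the origin and reports the final radius, which should approach \sqrt{0.4} \approx 0.63:

import numpy as np
from scipy.integrate import solve_ivp

mu = 0.4

def rhs(t, u):
    x, y = u
    r2 = x**2 + y**2
    return [mu*x - y - x*r2, x + mu*y - y*r2]

sol = solve_ivp(rhs, (0, 100), [0.05, 0.0], rtol=1e-9)
print(np.hypot(sol.y[0, -1], sol.y[1, -1]), np.sqrt(mu))   # both about 0.632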

Limit cycles are not always circles. In Figures 3.20-3.21 we show the behavior of the Rayleigh system (3.38) for c=0.4 and c=2.0. In this case we see that solutions tend towards a noncircular limit cycle.

image
Figure 3.20. Phase plane for the Rayleigh system (3.38) with c=0.4.

The limit cycle for c=2.0 is shown in Figure 3.22.

image
Figure 3.21. Phase plane for the Rayleigh system (3.38) with c=2.0.
image
Figure 3.22. Phase plane for the Rayleigh system (3.44) with c=0.4.

Can one determine ahead of time if a given nonlinear system will have a limit cycle? In order to answer this question, we will introduce some definitions.

image
Figure 3.23. A sketch depicting the idea of trajectory, or orbit, passing through x.

We first describe different trajectories and families of trajectories. A flow on R^{2} is a function \phi that satisfies the following

  1. \phi(\mathbf{x}, t) is continuous in both arguments.
  2. \phi(\mathbf{x}, 0)=\mathbf{x} for all \mathbf{x} \in R^{2}
  3. \phi\left(\phi\left(\mathbf{x}, t_{1}\right), t_{2}\right)=\phi\left(\mathbf{x}, t_{1}+t_{2}\right).

The orbit, or trajectory, through \mathbf{x} is defined as \gamma=\{\phi(\mathbf{x}, t) \mid t \in I\}. In Figure 3.23 we demonstrate these properties. For t=0, \phi(\mathbf{x}, 0)=\mathbf{x}. Increasing t, one follows the trajectory until one reaches the point \phi\left(\mathbf{x}, t_{1}\right). Continuing for an additional time t_{2}, one is then at \phi\left(\phi\left(\mathbf{x}, t_{1}\right), t_{2}\right). By the third property, this is the same as going from \mathbf{x} to \phi\left(\mathbf{x}, t_{1}+t_{2}\right) for t=t_{1}+t_{2}.

Having defined the orbits, we need to define the asymptotic behavior of the orbit for both positive and negative large times. We define the positive semiorbit through \mathbf{x} as \gamma^{+}=\{\phi(\mathbf{x}, t) \mid t>0\}. The negative semiorbit through \mathbf{x} is defined as \gamma^{-}=\{\phi(\mathbf{x}, t) \mid t<0\}. Thus, we have \gamma=\gamma^{+} \cup \gamma^{-}.

The positive limit set, or \omega-limit set, of point \mathbf{x} is defined as

\Lambda^{+}=\left\{\mathbf{y} \mid \text { there exists a sequence of } t_{n} \rightarrow \infty \text { such that } \phi\left(\mathbf{x}, t_{n}\right) \rightarrow \mathbf{y}\right\} \nonumber

The \mathbf{y} ’s are referred to as \omega-limit points. This is shown in Figure 3.24.

image
Figure 3.24. A sketch depicting an \omega-limit set. Note that the orbit tends towards the set as t increases.
image
Figure 3.25. A sketch depicting an \alpha-limit set. Note that the orbit tends away from the set as t increases.

Similarly, we define the negative limit set, or \alpha-limit set, of the point \mathbf{x} as \Lambda^{-}=\left\{\mathbf{y} \mid\right. there exists a sequence of t_{n} \rightarrow-\infty such that \left.\phi\left(\mathbf{x}, t_{n}\right) \rightarrow \mathbf{y}\right\}

and the corresponding \mathbf{y} ’s are \alpha-limit points. This is shown in Figure 3.25.

There are several types of orbits that a system might possess. A cycle or periodic orbit is any closed orbit which is not an equilibrium point. A periodic orbit is stable if, for every neighborhood of the orbit, all nearby orbits stay inside the neighborhood. Otherwise, it is unstable. The orbit is asymptotically stable if all nearby orbits converge to the periodic orbit.

A limit cycle is a cycle which is the \alpha- or \omega-limit set of some trajectory other than the limit cycle. A limit cycle \Gamma is stable if \Lambda^{+}=\Gamma for all \mathbf{x} in some neighborhood of \Gamma. A limit cycle \Gamma is unstable if \Lambda^{-}=\Gamma for all \mathbf{x} in some neighborhood of \Gamma. Finally, a limit cycle is semistable if it is attracting on one side and repelling on the other. In the previous examples, we saw limit cycles that were stable. Figures 3.24 and 3.25 depict stable and unstable limit cycles, respectively.

We now state a theorem which describes the type of orbits we might find in our system.

Theorem 3.7. Poincaré-Bendixson Theorem. Let \gamma^{+} be contained in a bounded region in which there are finitely many critical points. Then \Lambda^{+} is either

  1. a single critical point;
  2. a single closed orbit;
  3. a set of critical points joined by heteroclinic orbits. [Compare Figures 3.26 and 3.27.]
image
Figure 3.26. A heteroclinic orbit connecting two critical points.

We are interested in determining when limit cycles may, or may not, exist. A consequence of the Poincaré-Bendixson Theorem is given by the following corollary.

Corollary Let D be a bounded closed set containing no critical points and suppose that \gamma^{+} \subset D. Then there exists a limit cycle contained in D.

More specific criteria allow us to determine if there is a limit cycle in a given region. These are given by Dulac’s Criteria and Bendixson’s Criteria.

image
Figure 3.27. A homoclinic orbit returning to the point it left.

Dulac’s Criteria Consider the autonomous planar system

x^{\prime}=f(x, y), \quad y^{\prime}=g(x, y) \nonumber

and a continuously differentiable function \psi defined on an annular region D contained in some open set. If

\dfrac{\partial}{\partial x}(\psi f)+\dfrac{\partial}{\partial y}(\psi g) \nonumber

does not change sign in D, then there is at most one limit cycle contained entirely in D.

Bendixson’s Criteria Consider the autonomous planar system

x^{\prime}=f(x, y), \quad y^{\prime}=g(x, y) \nonumber

defined on a simply connected domain D such that

\dfrac{\partial}{\partial x}(\psi f)+\dfrac{\partial}{\partial y}(\psi g) \neq 0 \nonumber

in D. Then there are no limit cycles entirely in D.

These are easily proved using Green’s Theorem in the plane. We prove Bendixson’s Criteria. Let \mathbf{f}=(f, g). Assume that \Gamma is a closed orbit lying in D. Let S be the interior of \Gamma. Then

\begin{aligned} \int_{S} \nabla \cdot \mathbf{f} d x d y &=\oint_{\Gamma}(f d y-g d x) \\[4pt] &=\int_{0}^{T}(f \dot{y}-g \dot{x}) d t \\[4pt] &=\int_{0}^{T}(f g-g f) d t=0 \end{aligned} \nonumber

So, if \nabla \cdot \mathbf{f} is not identically zero and does not change sign in S, then from the continuity of \nabla \cdot \mathbf{f} in S we have that the left side above is either positive or negative. Thus, we have a contradiction and there is no closed orbit lying in D.

Example 3.8. Consider the earlier example in (3.39) with \mu=1.

\begin{aligned} &x^{\prime}=x-y-x\left(x^{2}+y^{2}\right) \\[4pt] &y^{\prime}=x+y-y\left(x^{2}+y^{2}\right) . \end{aligned} \nonumber

We already know that a limit cycle exists at x^{2}+y^{2}=1. A simple computation gives that

\nabla \cdot \mathbf{f}=2-4 x^{2}-4 y^{2} \nonumber

For an arbitrary annulus a<x^{2}+y^{2}<b, we have

2-4 b<\nabla \cdot \mathbf{f}<2-4 a . \nonumber

For a=3 / 4 and b=5 / 4,-3<\nabla \cdot \mathbf{f}<-1. Thus, \nabla \cdot \mathbf{f}<0 in the annulus 3 / 4<x^{2}+y^{2}<5 / 4. Therefore, by Dulac’s Criteria there is at most one limit cycle in this annulus.

Example 3.9. Consider the system

\begin{aligned} &x^{\prime}=y \\[4pt] &y^{\prime}=-a x-b y+c x^{2}+d y^{2} . \end{aligned} \nonumber

Let \psi(x, y)=e^{-2 d x}. Then,

\dfrac{\partial}{\partial x}(\psi y)+\dfrac{\partial}{\partial y}\left(\psi\left(-a x-b y+c x^{2}+d y^{2}\right)\right)=-b e^{-2 d x} \neq 0 \nonumber

Assuming b \neq 0, we conclude by Bendixson’s Criteria that there are no limit cycles for this system.
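
Both divergence computations in Examples 3.8 and 3.9 can be verified symbolically, as in the following SymPy sketch:

import sympy as sp

x, y, a, b, c, d = sp.symbols('x y a b c d')

# Example 3.8: divergence of f for x' = x - y - x(x^2 + y^2), y' = x + y - y(x^2 + y^2)
f1 = x - y - x*(x**2 + y**2)
g1 = x + y - y*(x**2 + y**2)
print(sp.simplify(sp.diff(f1, x) + sp.diff(g1, y)))           # equals 2 - 4x^2 - 4y^2

# Example 3.9: divergence of psi*f with psi = exp(-2*d*x)
psi = sp.exp(-2*d*x)
f2 = y
g2 = -a*x - b*y + c*x**2 + d*y**2
print(sp.simplify(sp.diff(psi*f2, x) + sp.diff(psi*g2, y)))   # equals -b*exp(-2*d*x)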

Nonautonomous Nonlinear Systems

In this section we discuss nonautonomous systems. Recall that an autonomous system is one in which there is no explicit time dependence. A simple example of a nonautonomous system is the forced nonlinear pendulum given by the nonhomogeneous equation

\ddot{x}+\omega^{2} \sin x=f(t) . \nonumber

We can set this up as a system of two first order equations:

\begin{aligned} &\dot{x}=y \\[4pt] &\dot{y}=-\omega^{2} \sin x+f(t) \end{aligned} \nonumber

This system is not in a form for which we could use the earlier methods. Namely, it is a nonautonomous system. However, we introduce a new variable z(t)=t and turn it into an autonomous system in one more dimension. The new system takes the form

\begin{aligned} &\dot{x}=y \\[4pt] &\dot{y}=-\omega^{2} \sin x+f(z) \\[4pt] &\dot{z}=1 \end{aligned} \nonumber

This system is a three dimensional autonomous, possibly nonlinear, system and can be explored using our earlier methods.
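
For a concrete forcing, say f(t)=F \cos \omega t (an assumed choice, just for illustration), the augmented system can be integrated directly, as in this Python sketch:

import numpy as np
from scipy.integrate import solve_ivp

omega, F, w = 1.0, 0.5, 1.25      # assumed sample parameters

def rhs(t, u):
    x, y, z = u
    return [y, -omega**2 * np.sin(x) + F*np.cos(w*z), 1.0]

sol = solve_ivp(rhs, (0, 50), [0.0, 0.0, 0.0], max_step=0.01)
# sol.y[2] recovers t itself, confirming that z(t) = t simply plays the role of time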

A more interesting model is provided by the Duffing Equation. This equation models hard spring and soft spring oscillations. It also models a periodically forced beam as shown in Figure 3.28. It is of interest because it is a simple system which exhibits chaotic dynamics and will motivate us towards using new visualization methods for nonautonomous systems.

image
Figure 3.28. One model of the Duffing equation describes a periodically forced beam which interacts with two magnets.

The most general form of Duffing’s equation is given by

\ddot{x}+k \dot{x}+\left(\beta x^{3} \pm \omega_{0}^{2} x\right)=\Gamma \cos (\omega t+\phi) . \nonumber

This equation models hard spring (\beta>0) and soft spring (\beta<0) oscillations. However, we will use a simpler version of the Duffing equation:

\ddot{x}+k \dot{x}+x^{3}-x=\Gamma \cos \omega t . \nonumber

Let’s first look at the behavior of some of the orbits of the system as we vary the parameters. In Figures 3.29-3.31 we show some typical solution plots superimposed on the direction field. We start with the undamped (k=0) and unforced (\Gamma=0) Duffing equation,

\ddot{x}+x^{3}-x=0 . \nonumber

We can write this second order equation as the autonomous system

\begin{aligned} &\dot{x}=y \\[4pt] &\dot{y}=x\left(1-x^{2}\right) \end{aligned} \nonumber

We see there are three equilibrium points at (0,0),(\pm 1,0). In Figure 3.29 we plot several orbits for this system. We see that the three equilibrium points consist of two centers and a saddle.

image
Figure 3.29. Phase plane for the undamped, unforced Duffing equation (k=0, \Gamma=0).

We now turn on the damping. The system becomes

\begin{aligned} &\dot{x}=y \\[4pt] &\dot{y}=-k y+x\left(1-x^{2}\right) . \end{aligned} \nonumber

In Figure 3.30 we show what happens when k=0.1. These plots are reminiscent of the plots for the nonlinear pendulum; however, there are fewer equilibria. The centers become stable spirals.

Next we turn on the forcing to obtain a damped, forced Duffing equation. The system is now nonautonomous.

image
Figure 3.30. Phase plane for the unforced Duffing equation with k=0.1 and \Gamma=0.

\begin{aligned} &\dot{x}=y \\[4pt] &\dot{y}=-k y+x\left(1-x^{2}\right)+\Gamma \cos \omega t \end{aligned} \nonumber

In Figure 3.31 we show only one orbit with k=0.1, \Gamma=0.5, and \omega=1.25. The solution intersects itself and looks a bit messy. We can imagine what we would get if we added any more orbits. For completeness, we show in Figure 3.32 an example with four different orbits.

In cases for which one has periodic orbits, such as the Duffing equation, Poincaré introduced the notion of surfaces of section. One embeds the orbit in a higher dimensional space so that there are no self intersections, like those we saw in Figures 3.31 and 3.32. In Figure 3.33 we show an example in which a simple orbit periodically pierces a given surface.

In order to simplify the resulting pictures, one only plots the points at which the orbit pierces the surface as sketched in Figure 3.34. In practice, there is a natural frequency, such as \omega in the forced Duffing equation. Then, one plots points at times that are multiples of the period, T=\dfrac{2 \pi}{\omega}. In Figure 3.35 we show what the plot for one orbit would look like for the damped, unforced Duffing equation.
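
A surface of section plot of this kind can be produced with a few lines of Python. The sketch below (using the parameter values quoted in the figure captions, k=0.1, \Gamma=0.5, \omega=1.25) integrates the damped, forced Duffing system and records the orbit once every forcing period T=2\pi/\omega:

import numpy as np
from scipy.integrate import solve_ivp
import matplotlib.pyplot as plt

k, Gamma, w = 0.1, 0.5, 1.25

def duffing(t, u):
    x, y = u
    return [y, -k*y + x*(1 - x**2) + Gamma*np.cos(w*t)]

T = 2*np.pi/w
t_samples = np.arange(20, 520) * T          # skip the first twenty periods (transients)
sol = solve_ivp(duffing, (0, t_samples[-1]), [1.0, 0.5],
                t_eval=t_samples, rtol=1e-8, atol=1e-10)
plt.plot(sol.y[0], sol.y[1], '.', markersize=2)
plt.xlabel('x')
plt.ylabel('y')
plt.show()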

The more interesting case is when there is forcing and damping. In this case the surface of section plot is given in Figure 3.36. While this is not as busy as the solution plot in Figure 3.31, it still provides some interesting behavior. What one finds is what is called a strange attractor. Plotting many orbits, we find that after a long time, all of the orbits are attracted to a small region in the plane, much like a stable node attracts nearby orbits. However, this

image
Figure 3.31. Phase plane for the Duffing equation with k=0.1, \Gamma=0.5, and \omega=1.25. In this case we show only one orbit which was generated from the initial condition \left(x_{0}=1.0, \quad y_{0}=0.5\right).
image
Figure 3.32. Phase plane for the Duffing equation with k=0.1, \Gamma=0.5, and \omega=1.25. In this case four initial conditions were used to generate four orbits.
image
Figure 3.33. Poincaré’s surface of section. One notes each time the orbit pierces the surface.
image
Figure 3.34. As an orbit pierces the surface of section, one plots the point of intersection in that plane to produce the surface of section plot.

set consists of more than one point. Also, the flow on the attractor is chaotic in nature. Thus, points wander in an irregular way throughout the attractor. This is one of the interesting topics in chaos theory, and the whole theory of dynamical systems has only been touched upon in this text, leaving the reader to wander off into this fascinating field in further depth.

Maple Code for Phase Plane Plots

For reference, the plots in Figures 3.29 and 3.30 were generated in Maple using the following commands:

image
Figure 3.35. Poincaré’s surface of section plot for the damped, unforced Duffing equation.
image
Figure 3.36. Poincaré’s surface of section plot for the damped, forced Duffing equation.

This leads to what is known as a strange attractor.

image

The surface of section plots at the end of the last section were obtained using code from S. Lynch’s book Dynamical Systems with Applications Using Maple. The Maple code is given by

image
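
For readers without access to Maple, a rough Python/Matplotlib stand-in for the phase plane plots of Figures 3.29 and 3.30 is sketched below; the initial conditions are assumed, chosen only for illustration, and this is not the code used to generate the figures:

import numpy as np
from scipy.integrate import solve_ivp
import matplotlib.pyplot as plt

k = 0.1                                    # set k = 0 for the undamped case of Figure 3.29

def rhs(t, u):
    x, y = u
    return [y, -k*y + x*(1 - x**2)]

X, Y = np.meshgrid(np.linspace(-2, 2, 25), np.linspace(-2, 2, 25))
plt.streamplot(X, Y, Y, -k*Y + X*(1 - X**2), density=1.2)
for x0, y0 in [(0.5, 0.0), (1.5, 0.0), (0.0, 1.0)]:    # assumed initial conditions
    sol = solve_ivp(rhs, (0, 40), [x0, y0], max_step=0.01)
    plt.plot(sol.y[0], sol.y[1])
plt.xlabel('x')
plt.ylabel('y')
plt.show()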

Appendix: Period of the Nonlinear Pendulum

In Section 3.5.1 we saw that the solution of the nonlinear pendulum problem can be found up to quadrature. In fact, the integral in Equation (3.19) can be transformed into what is known as an elliptic integral of the first kind. We will rewrite our result and then use it to obtain an approximation to the period of oscillation of the nonlinear pendulum, leading to corrections to the linear result found earlier.

We will first rewrite the constant found in (3.18). This requires a little physics. The swinging of a mass on a string, assuming no energy loss at the pivot point, is a conservative process. Namely, the total mechanical energy is conserved. Thus, the total of the kinetic and gravitational potential energies is a constant. Noting that v=L \dot{\theta}, the kinetic energy of the mass on the string is given as

T=\dfrac{1}{2} m v^{2}=\dfrac{1}{2} m L^{2} \dot{\theta}^{2} . \nonumber

The potential energy is the gravitational potential energy. If we set the potential energy to zero at the bottom of the swing, then the potential energy is U=m g h, where h is the height that the mass is from the bottom of the swing. A little trigonometry gives that h=L(1-\cos \theta). This gives the potential energy as

U=m g L(1-\cos \theta) . \nonumber

So, the total mechanical energy is

E=\dfrac{1}{2} m L^{2} \theta^{\prime 2}+m g L(1-\cos \theta) . \nonumber

We note that a little rearranging shows that we can relate this to Equation (3.18)

\dfrac{1}{2}\left(\theta^{\prime}\right)^{2}-\omega^{2} \cos \theta=\dfrac{1}{m L^{2}} E-\omega^{2}=c . \nonumber

We can use Equation (3.56) to get a value for the total energy. At the top of the swing the mass is not moving, if only for a moment. Thus, the kinetic energy is zero and the total energy is pure potential energy. Letting \theta_{0} denote the angle at the highest position, we have that

E=m g L\left(1-\cos \theta_{0}\right)=m L^{2} \omega^{2}\left(1-\cos \theta_{0}\right) . \nonumber

Here we have used the relation g=L \omega^{2}.

Therefore, we have found that

\dfrac{1}{2} \dot{\theta}^{2}-\omega^{2} \cos \theta=-\omega^{2} \cos \theta_{0} . \nonumber

Using the half angle formula,

\sin ^{2} \dfrac{\theta}{2}=\dfrac{1}{2}(1-\cos \theta) \nonumber

we can rewrite Equation (3.57) as

\dfrac{1}{2} \dot{\theta}^{2}=2 \omega^{2}\left[\sin ^{2} \dfrac{\theta_{0}}{2}-\sin ^{2} \dfrac{\theta}{2}\right] \nonumber

Solving for \theta^{\prime}, we have

\dfrac{d \theta}{d t}=2 \omega\left[\sin ^{2} \dfrac{\theta_{0}}{2}-\sin ^{2} \dfrac{\theta}{2}\right]^{1 / 2} \nonumber

One can now apply separation of variables and obtain an integral similar to the solution we had obtained previously. Noting that a motion from \theta=0 to \theta=\theta_{0} is a quarter of a cycle, we have that

T=\dfrac{2}{\omega} \int_{0}^{\theta_{0}} \dfrac{d \phi}{\sqrt{\sin ^{2} \dfrac{\theta_{0}}{2}-\sin ^{2} \dfrac{\phi}{2}}} \nonumber

This result is not much different than our previous result, but we can now easily transform the integral into an elliptic integral. We define

z=\dfrac{\sin \dfrac{\theta}{2}}{\sin \dfrac{\theta_{0}}{2}} \nonumber

and

k=\sin \dfrac{\theta_{0}}{2} \nonumber

Then Equation (3.60) becomes

T=\dfrac{4}{\omega} \int_{0}^{1} \dfrac{d z}{\sqrt{\left(1-z^{2}\right)\left(1-k^{2} z^{2}\right)}} . \nonumber

This is done by noting that d z=\dfrac{1}{2 k} \cos \dfrac{\theta}{2} d \theta=\dfrac{1}{2 k}\left(1-k^{2} z^{2}\right)^{1 / 2} d \theta and that \sin ^{2} \dfrac{\theta_{0}}{2}-\sin ^{2} \dfrac{\theta}{2}=k^{2}\left(1-z^{2}\right). The integral in this result is an elliptic integral of the first kind. In particular, the elliptic integral of the first kind is defined

F(\phi, k) \equiv \int_{0}^{\phi} \dfrac{d \theta}{\sqrt{1-k^{2} \sin ^{2} \theta}}=\int_{0}^{\sin \phi} \dfrac{d z}{\sqrt{\left(1-z^{2}\right)\left(1-k^{2} z^{2}\right)}} . \nonumber

In some contexts, this is known as the incomplete elliptic integral of the first kind and K(k)=F\left(\dfrac{\pi}{2}, k\right) is called the complete elliptic integral of the first kind.

There are tables of values for elliptic integrals. Historically, that is how one found values of elliptic integrals. However, we now have access to computer algebra systems which can be used to compute values of such integrals. For small angles, we have that k is small. So, we can develop a series expansion for the period, T, for small k. This is done by first expanding

\left(1-k^{2} z^{2}\right)^{-1 / 2}=1+\dfrac{1}{2} k^{2} z^{2}+\dfrac{3}{8} k^{4} z^{4}+O\left((k z)^{6}\right) \nonumber

Substituting this in the integrand and integrating term by term, one finds that

T=2 \pi \sqrt{\dfrac{L}{g}}\left[1+\dfrac{1}{4} k^{2}+\dfrac{9}{64} k^{4}+\ldots\right] \nonumber

This expression gives further corrections to the linear result, which only provides the first term. In Figure 3.37 we show the relative errors incurred when keeping the k^{2} and k^{4} terms versus not keeping them. The reader is asked to explore this further in Problem 3.8.

image
Figure 3.37. The relative error in percent when approximating the exact period of a nonlinear pendulum with one, two, or three terms in Equation (3.62).
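
The comparison in Figure 3.37 is straightforward to reproduce. The Python sketch below evaluates the exact period T=\dfrac{4}{\omega} K(k) using SciPy (whose ellipk routine takes the parameter m=k^{2}) and compares it with the series above, using L=1.0 m and g=9.8 m/s²:

import numpy as np
from scipy.special import ellipk

L, g = 1.0, 9.8
omega = np.sqrt(g / L)
T0 = 2*np.pi*np.sqrt(L/g)                     # linear (small angle) period

for theta0 in np.radians([10, 45, 90, 150]):
    k = np.sin(theta0 / 2)
    T_exact = 4.0/omega * ellipk(k**2)        # note: ellipk expects m = k^2
    T_series = T0 * (1 + k**2/4 + 9*k**4/64)
    print(f"theta0 = {np.degrees(theta0):5.1f} deg, exact {T_exact:.4f} s, series {T_series:.4f} s")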

Problems

3.1. Find the equilibrium solutions and determine their stability for the following systems. For each case draw representative solutions and phase lines.
a. y^{\prime}=y^{2}-6 y-16.
b. y^{\prime}=\cos y.
c. y^{\prime}=y(y-2)(y+3).
d. y^{\prime}=y^{2}(y+1)(y-4).

3.2. For y^{\prime}=y-y^{2}, find the general solution corresponding to y(0)=y_{0}. Provide specific solutions for the following initial conditions and sketch them: a. y(0)=0.25, b. y(0)=1.5, and c. y(0)=-0.5.

3.3. For each problem determine equilibrium points, bifurcation points and construct a bifurcation diagram. Discuss the different behaviors in each system.
a. y^{\prime}=y-\mu y^{2}
b. y^{\prime}=y(\mu-y)(\mu-2 y)
c. x^{\prime}=\mu-x^{3}
d. x^{\prime}=x-\dfrac{\mu x}{1+x^{2}}

3.4. Consider the family of differential equations x^{\prime}=x^{3}+\delta x^{2}-\mu x.

a. Sketch a bifurcation diagram in the x \mu-plane for \delta=0.

b. Sketch a bifurcation diagram in the x \mu-plane for \delta>0. Hint: Pick a few values of \delta and \mu in order to get a feel for how this system behaves.

3.5. Consider the system

\begin{aligned} &x^{\prime}=-y+x\left[\mu-x^{2}-y^{2}\right], \\[4pt] &y^{\prime}=x+y\left[\mu-x^{2}-y^{2}\right] \end{aligned} \nonumber

Rewrite this system in polar form. Look at the behavior of the r equation and construct a bifurcation diagram in \mu r space. What might this diagram look like in the three dimensional \mu x y space? (Think about the symmetry in this problem.) This leads to what is called a Hopf bifurcation.

3.6. Find the fixed points of the following systems. Linearize the system about each fixed point and determine the nature and stability in the neighborhood of each fixed point, when possible. Verify your findings by plotting phase portraits using a computer.

a.

\begin{aligned} &x^{\prime}=x(100-x-2 y), \\[4pt] &y^{\prime}=y(150-x-6 y) \end{aligned} \nonumber

b.

\begin{aligned} &x^{\prime}=x+x^{3}, \\[4pt] &y^{\prime}=y+y^{3} \end{aligned} \nonumber

c.

\begin{aligned} &x^{\prime}=x-x^{2}+x y \\[4pt] &y^{\prime}=2 y-x y-6 y^{2} \end{aligned} \nonumber

d.

\begin{aligned} &x^{\prime}=-2 x y \\[4pt] &y^{\prime}=-x+y+x y-y^{3} . \end{aligned} \nonumber

3.7. Plot phase portraits for the Lienard system

\begin{aligned} &x^{\prime}=y-\mu\left(x^{3}-x\right) \\[4pt] &y^{\prime}=-x . \end{aligned} \nonumber

for a small and a not so small value of \mu. Describe what happens as one varies \mu.

3.8. Consider the period of a nonlinear pendulum. Let the length be L=1.0 \mathrm{m} and g=9.8 \mathrm{~m} / \mathrm{s}^{2}. Sketch T vs the initial angle \theta_{0} and compare the linear and nonlinear values for the period. For what angles can you use the linear approximation confidently?

3.9. Another population model is one in which species compete for resources, such as a limited food supply. Such a model is given by

\begin{aligned} &x^{\prime}=a x-b x^{2}-c x y \\[4pt] &y^{\prime}=d y-e y^{2}-f x y . \end{aligned} \nonumber

In this case, assume that all constants are positive.

a. Describe the effect/purpose of each term.

b. Find the fixed points of the model.

c. Linearize the system about each fixed point and determine the stability.

d. From the above, describe the types of solution behavior you might expect, in terms of the model.

3.10. Consider a model of a food chain of three species. Assume that each population on its own can be modeled by logistic growth. Let the species be labeled by x(t), y(t), and z(t). Assume that population x is at the bottom of the chain. That population will be depleted by population y. Population y is sustained by x ’s, but eaten by z ’s. A simple, but scaled, model for this system can be given by the system

\begin{aligned} &x^{\prime}=x(1-x)-x y \\[4pt] &y^{\prime}=y(1-y)+x y-y z \\[4pt] &z^{\prime}=z(1-z)+y z \end{aligned} \nonumber

a. Find the equilibrium points of the system.

b. Find the Jacobian matrix for the system and evaluate it at the equilibrium points.

c. Find the eigenvalues and eigenvectors.

d. Describe the solution behavior near each equilibrium point.

e. Which of these equilibria are important in the study of the population model? Describe the interactions of the species in the neighborhood of these point(s).

3.11. Show that the system x^{\prime}=x-y-x^{3}, y^{\prime}=x+y-y^{3}, has a unique limit cycle by picking an appropriate \psi(x, y) in Dulac’s Criteria.

Boundary Value Problems

Introduction

Until this point we have solved initial value problems. For an initial value problem one has to solve a differential equation subject to conditions on the unknown function and its derivatives at one value of the independent variable. For example, for x=x(t) we could have the initial value problem

x^{\prime \prime}+x=2, \quad x(0)=1, \quad x^{\prime}(0)=0 \nonumber

In the next chapters we will study boundary value problems and various tools for solving such problems. In this chapter we will motivate our interest in boundary value problems by looking into solving the one-dimensional heat equation, which is a partial differential equation. For the rest of the section, we will use this solution to show that in the background of our solution of boundary value problems is a structure based upon linear algebra and analysis, leading to the study of inner product spaces. Technically, we should be led to Hilbert spaces, which are complete inner product spaces.

For a boundary value problem, on the other hand, one has to solve a differential equation subject to conditions on the unknown function or its derivatives at more than one value of the independent variable. As an example, we have a slight modification of the above problem: Find the solution x=x(t) for 0 \leq t \leq 1 that satisfies the problem

x^{\prime \prime}+x=2, \quad x(0)=0, \quad x(1)=0 . \nonumber

Typically, initial value problems involve time dependent functions and boundary value problems are spatial. So, with an initial value problem one knows how a system evolves in terms of the differential equation and the state of the system at some fixed time. Then one seeks to determine the state of the system at a later time.

For boundary values problems, one knows how each point responds to its neighbors, but there are conditions that have to be satisfied at the endpoints. An example would be a horizontal beam supported at the ends, like a bridge. The shape of the beam under the influence of gravity, or other forces, would lead to a differential equation and the boundary conditions at the beam ends would affect the solution of the problem. There are also a variety of other types of boundary conditions. In the case of a beam, one end could be fixed and the other end could be free to move. We will explore the effects of different boundary value conditions in our discussions and exercises.

Let’s solve the above boundary value problem. As with initial value problems, we need to find the general solution and then apply any conditions that we may have. This is a nonhomogeneous differential equation, so we have that the solution is a sum of a solution of the homogeneous equation and a particular solution of the nonhomogeneous equation, x(t)=x_{h}(t)+x_{p}(t). The solution of x^{\prime \prime}+x=0 is easily found as

x_{h}(t)=c_{1} \cos t+c_{2} \sin t \nonumber

The particular solution is easily found using the Method of Undetermined Coefficients,

x_{p}(t)=2 \nonumber

Thus, the general solution is

x(t)=2+c_{1} \cos t+c_{2} \sin t . \nonumber

We now apply the boundary conditions and see if there are values of c_{1} and c_{2} that yield a solution to our problem. The first condition, x(0)=0, gives

0=2+c_{1} \nonumber

Thus, c_{1}=-2. Using this value for c_{1}, the second condition, x(1)=0, gives

0=2-2 \cos 1+c_{2} \sin 1 \nonumber

This yields

c_{2}=\dfrac{2(\cos 1-1)}{\sin 1} . \nonumber

We have found that there is a solution to the boundary value problem and it is given by

x(t)=2\left(1-\cos t+\dfrac{\cos 1-1}{\sin 1} \sin t\right) \nonumber
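
As a quick check, the following Python sketch solves the same boundary value problem numerically with SciPy’s solve_bvp and compares the result with the solution found above:

import numpy as np
from scipy.integrate import solve_bvp

def rhs(t, u):                    # u[0] = x, u[1] = x'
    return np.vstack([u[1], 2 - u[0]])

def bc(ua, ub):
    return np.array([ua[0], ub[0]])          # x(0) = 0, x(1) = 0

t = np.linspace(0, 1, 11)
sol = solve_bvp(rhs, bc, t, np.zeros((2, t.size)))

exact = 2*(1 - np.cos(t) + (np.cos(1) - 1)/np.sin(1)*np.sin(t))
print(np.max(np.abs(sol.sol(t)[0] - exact)))  # should be very small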

Boundary value problems arise in many physical systems, just as do many of the initial value problems we have seen. We will see in the next section that boundary value problems for ordinary differential equations often appear in the solution of partial differential equations.

Partial Differential Equations

In this section we will introduce some generic partial differential equations and see how the discussion of such equations leads naturally to the study of boundary value problems for ordinary differential equations. However, we will not derive the particular equations, leaving that to courses in differential equations, mathematical physics, etc.

For ordinary differential equations, the unknown functions are functions of a single variable, e.g., y=y(x). Partial differential equations are equations involving an unknown function of several variables, such as u=u(x, y), u=u(x, y, t), u=u(x, y, z, t), and its (partial) derivatives. Therefore, the derivatives are partial derivatives. We will use the standard notations u_{x}=\dfrac{\partial u}{\partial x}, u_{x x}=\dfrac{\partial^{2} u}{\partial x^{2}}, etc.

There are a few standard equations that one encounters. These can be studied in one to three dimensions and are all linear differential equations. A list is provided in Table 4.1. Here we have introduced the Laplacian operator, \nabla^{2} u=u_{x x}+u_{y y}+u_{z z}. Depending on the types of boundary conditions imposed and on the geometry of the system (rectangular, cylindrical, spherical, etc.), one encounters many interesting boundary value problems for ordinary differential equations.

Name | 2 Vars | 3D
Heat Equation | u_{t}=k u_{x x} | u_{t}=k \nabla^{2} u
Wave Equation | u_{t t}=c^{2} u_{x x} | u_{t t}=c^{2} \nabla^{2} u
Laplace’s Equation | u_{x x}+u_{y y}=0 | \nabla^{2} u=0
Poisson’s Equation | u_{x x}+u_{y y}=F(x, y) | \nabla^{2} u=F(x, y, z)
Schrödinger’s Equation | i u_{t}=u_{x x}+F(x, t) u | i u_{t}=\nabla^{2} u+F(x, y, z, t) u

Table 4.1. List of generic partial differential equations.

Let’s look at the heat equation in one dimension. This could describe the heat conduction in a thin insulated rod of length L. It could also describe the diffusion of pollutant in a long narrow stream, or the flow of traffic down a road. In problems involving diffusion processes, one instead calls this equation the diffusion equation.

A typical initial-boundary value problem for the heat equation would be that initially one has a temperature distribution u(x, 0)=f(x). Placing the bar in an ice bath and assuming the heat flow is only through the ends of the bar, one has the boundary conditions u(0, t)=0 and u(L, t)=0. Of course, we are dealing with Celsius temperatures and we assume there is plenty of ice to keep that temperature fixed at each end for all time. So, the problem one would need to solve is given as

\begin{aligned} &u_{t}=k u_{x x}, \quad 0<x<L, \quad t>0, \\[4pt] &u(0, t)=0, \quad u(L, t)=0, \quad t>0, \\[4pt] &u(x, 0)=f(x), \quad 0<x<L . \end{aligned} \nonumber

Another problem that will come up in later discussions is that of the vibrating string. A string of length L is stretched out horizontally with both ends fixed. Think of a violin string or a guitar string. Then the string is plucked, giving the string an initial profile. Let u(x, t) be the vertical displacement of the string at position x and time t. The motion of the string is governed by the one dimensional wave equation. The initial-boundary value problem for this problem is given as

image

Solving the Heat Equation

We would like to see how the solution of such problems involving partial differential equations will lead naturally to studying boundary value problems for ordinary differential equations. We will see this as we attempt the solution of the heat equation problem 4.3. We will employ a method typically used in studying linear partial differential equations, called the method of separation of variables.

We assume that u can be written as a product of single variable functions of each independent variable,

u(x, t)=X(x) T(t) \nonumber

Substituting this guess into the heat equation, we find that

X T^{\prime}=k X^{\prime \prime} T \nonumber

Dividing both sides by k and by u=X T, we then get

\dfrac{1}{k} \dfrac{T^{\prime}}{T}=\dfrac{X^{\prime \prime}}{X} \nonumber

We have separated the functions of time on one side and space on the other side. The only way that a function of t can equal a function of x for all values of t and x is if both functions are constant. Therefore, we set each function equal to a constant, \lambda :

\dfrac{1}{k} \dfrac{T^{\prime}}{T}=\dfrac{X^{\prime \prime}}{X}=\lambda . \nonumber

This leads to two equations:

\begin{aligned} &T^{\prime}=k \lambda T \\[4pt] &X^{\prime \prime}=\lambda X \end{aligned} \nonumber

These are ordinary differential equations. The general solutions to these equations are readily found as

\begin{gathered} T(t)=A e^{k \lambda t} \\[4pt] X(x)=c_{1} e^{\sqrt{\lambda} x}+c_{2} e^{-\sqrt{\lambda} x} \end{gathered} \nonumber

We need to be a little careful at this point. The aim is to force our product solutions to satisfy both the boundary conditions and initial conditions. Also, we should note that \lambda is arbitrary and may be positive, zero, or negative. We first look at how the boundary conditions on u lead to conditions on X.

The first condition is u(0, t)=0. This implies that

X(0) T(t)=0 \nonumber

for all t. The only way that this is true is if X(0)=0. Similarly, u(L, t)=0 implies that X(L)=0. So, we have to solve the boundary value problem

X^{\prime \prime}-\lambda X=0, \quad X(0)=0=X(L) \nonumber

We are seeking nonzero solutions, as X \equiv 0 is an obvious and uninteresting solution. We call such solutions trivial solutions.

There are three cases to consider, depending on the sign of \lambda.

I. \underline{\lambda>0}

In this case we have the exponential solutions

X(x)=c_{1} e^{\sqrt{\lambda} x}+c_{2} e^{-\sqrt{\lambda} x} . \nonumber

For X(0)=0, we have

0=c_{1}+c_{2} \nonumber

We will take c_{2}=-c_{1}. Then, X(x)=c_{1}\left(e^{\sqrt{\lambda} x}-e^{-\sqrt{\lambda} x}\right)=2 c_{1} \sinh \sqrt{\lambda} x.

Applying the second condition, X(L)=0 yields

c_{1} \sinh \sqrt{\lambda} L=0 . \nonumber

This will be true only if c_{1}=0, since \lambda>0. Thus, the only solution in this case is X(x)=0. This leads to a trivial solution, u(x, t)=0.

II. \underline{\lambda=0}

For this case it is easier to set \lambda to zero in the differential equation. So, X^{\prime \prime}=0. Integrating twice, one finds

X(x)=c_{1} x+c_{2} . \nonumber

Setting x=0, we have c_{2}=0, leaving X(x)=c_{1} x. Setting x=L, we find c_{1} L=0. So, c_{1}=0 and we are once again left with a trivial solution.

III. \underline{\lambda<0}

In this case it would be simpler to write \lambda=-\mu^{2}. Then the differential equation is

X^{\prime \prime}+\mu^{2} X=0 \nonumber

The general solution is

X(x)=c_{1} \cos \mu x+c_{2} \sin \mu x . \nonumber

At x=0 we get 0=c_{1}. This leaves X(x)=c_{2} \sin \mu x. At x=L, we find

0=c_{2} \sin \mu L . \nonumber

So, either c_{2}=0 or \sin \mu L=0. Taking c_{2}=0 leads to a trivial solution again. But, there are cases when the sine is zero. Namely,

\mu L=n \pi, \quad n=1,2, \ldots \nonumber

Note that n=0 is not included since this leads to a trivial solution. Also, negative values of n are redundant, since the sine function is an odd function.

In summary, we can find solutions to the boundary value problem (4.9) for particular values of \lambda. The solutions are

X_{n}(x)=\sin \dfrac{n \pi x}{L}, \quad n=1,2,3, \ldots \nonumber

for

\lambda_{n}=-\mu_{n}^{2}=-\left(\dfrac{n \pi}{L}\right)^{2}, \quad n=1,2,3, \ldots \nonumber

Product solutions of the heat equation (4.3) satisfying the boundary conditions are therefore

u_{n}(x, t)=b_{n} e^{k \lambda_{n} t} \sin \dfrac{n \pi x}{L}, \quad n=1,2,3, \ldots, \nonumber

where b_{n} is an arbitrary constant. However, these do not necessarily satisfy the initial condition u(x, 0)=f(x). What we do get is

u_{n}(x, 0)=b_{n} \sin \dfrac{n \pi x}{L}, \quad n=1,2,3, \ldots \nonumber

So, if our initial condition is in one of these forms, we can pick out the right n and we are done.

For other initial conditions, we have to do more work. Note, since the heat equation is linear, we can write a linear combination of our product solutions and obtain the general solution satisfying the given boundary conditions as

u(x, t)=\sum_{n=1}^{\infty} b_{n} e^{k \lambda_{n} t} \sin \dfrac{n \pi x}{L} . \nonumber

The only thing to impose is the initial condition:

f(x)=u(x, 0)=\sum_{n=1}^{\infty} b_{n} \sin \dfrac{n \pi x}{L} . \nonumber

So, if we are given f(x), can we find the constants b_{n} ? If we can, then we will have the solution to the full initial-boundary value problem. This will be the subject of the next chapter. However, first we will look at the general form of our boundary value problem and relate what we have done to the theory of infinite dimensional vector spaces.
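
Even before we know how to compute the b_{n} for a general f(x), we can evaluate such a series for assumed coefficients. The following Python sketch (with k=1, L=1 and two nonzero coefficients chosen only for illustration) shows how the higher modes decay faster in time, since \lambda_{n} \propto -n^{2}:

import numpy as np

k, L = 1.0, 1.0
b = {1: 1.0, 3: 0.5}                 # assumed coefficients, for illustration only

def u(x, t):
    total = np.zeros_like(x)
    for n, bn in b.items():
        lam = -(n*np.pi/L)**2
        total += bn * np.exp(k*lam*t) * np.sin(n*np.pi*x/L)
    return total

x = np.linspace(0, L, 200)
for t in (0.0, 0.01, 0.1):
    print(f"t = {t:5.2f}, max |u| = {np.max(np.abs(u(x, t))):.4f}")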

Connections to Linear Algebra

We have already seen in earlier chapters that ideas from linear algebra crop up in our studies of differential equations. Namely, we solved eigenvalue problems associated with our systems of differential equations in order to determine the local behavior of dynamical systems near fixed points. In our study of boundary value problems we will find more connections with the theory of vector spaces. However, we will find that our problems lie in the realm of infinite dimensional vector spaces. In this section we will begin to see these connections.

Eigenfunction Expansions for PDEs

In the last section we sought solutions of the heat equation. Let’s formally write the heat equation in the form

\dfrac{1}{k} u_{t}=L[u] \nonumber

where

L=\dfrac{\partial^{2}}{\partial x^{2}} \nonumber

L is another example of a linear differential operator. [See Section 1.1.2.] It is a differential operator because it involves derivative operators. We sometimes define D_{x}=\dfrac{\partial}{\partial x}, so that L=D_{x}^{2}. It is linear, because for functions f(x) and g(x) and constants \alpha, \beta we have

L[\alpha f+\beta g]=\alpha L[f]+\beta L[g] \nonumber

When solving the heat equation, using the method of separation of variables, we found an infinite number of product solutions u_{n}(x, t)=T_{n}(t) X_{n}(x). We did this by solving the boundary value problem

L[X]=\lambda X, \quad X(0)=0=X(L) \nonumber

Here we see that an operator acts on an unknown function and spits out an unknown constant times that unknown. Where have we done this before? This is the same form as A \mathbf{v}=\lambda \mathbf{v}. So, we see that Equation (4.14) is really an eigenvalue problem for the operator L and given boundary conditions. When we solved the heat equation in the last section, we found the eigenvalues

\lambda_{n}=-\left(\dfrac{n \pi}{L}\right)^{2} \nonumber

and the eigenfunctions

X_{n}(x)=\sin \dfrac{n \pi x}{L} . \nonumber

We used these to construct the general solution that is essentially a linear combination over the eigenfunctions,

u(x, t)=\sum_{n=1}^{\infty} T_{n}(t) X_{n}(x) \nonumber

Note that these eigenfunctions live in an infinite dimensional function space.

We would like to generalize this method to problems in which L comes from an assortment of linear differential operators. So, we consider the more general partial differential equation

u_{t}=L[u], \quad a \leq x \leq b, \quad t>0 \nonumber

satisfying the boundary conditions

B[u](a, t)=0, \quad B[u](b, t)=0, \quad t>0 \nonumber

and initial condition

u(x, 0)=f(x), \quad a \leq x \leq b \nonumber

The form of the allowed boundary conditions B[u] will be taken up later. Also, we will later see specific examples and properties of linear differential operators that will allow for this procedure to work.

We assume product solutions of the form u_{n}(x, t)=b_{n}(t) \phi_{n}(x), where the \phi_{n} ’s are the eigenfunctions of the operator L,

L \phi_{n}=\lambda_{n} \phi_{n}, \quad n=1,2, \ldots, \nonumber

satisfying the boundary conditions

B\left[\phi_{n}\right](a)=0, \quad B\left[\phi_{n}\right](b)=0 . \nonumber

Inserting the general solution

u(x, t)=\sum_{n=1}^{\infty} b_{n}(t) \phi_{n}(x) \nonumber

into the partial differential equation, we have

\begin{aligned} u_{t} &=L[u] \\[4pt] \dfrac{\partial}{\partial t} \sum_{n=1}^{\infty} b_{n}(t) \phi_{n}(x) &=L\left[\sum_{n=1}^{\infty} b_{n}(t) \phi_{n}(x)\right] \end{aligned} \nonumber

On the left we differentiate term by term { }^{1} and on the right side we use the linearity of L :

\sum_{n=1}^{\infty} \dfrac{d b_{n}(t)}{d t} \phi_{n}(x)=\sum_{n=1}^{\infty} b_{n}(t) L\left[\phi_{n}(x)\right] \nonumber

Now, we make use of the result of applying L to the eigenfunction \phi_{n} :

\sum_{n=1}^{\infty} \dfrac{d b_{n}(t)}{d t} \phi_{n}(x)=\sum_{n=1}^{\infty} b_{n}(t) \lambda_{n} \phi_{n}(x) \nonumber

Comparing both sides, or using the linear independence of the eigenfunctions, we see that

\dfrac{d b_{n}(t)}{d t}=\lambda_{n} b_{n}(t) \nonumber

whose solution is

b_{n}(t)=b_{n}(0) e^{\lambda_{n} t} \nonumber

So, the general solution becomes

{ }^{1} Infinite series cannot always be differentiated, so one must be careful. When we ignore such details for the time being, we say that we formally differentiate the series and formally apply the differential operator to the series. Such operations need to be justified later.

u(x, t)=\sum_{n=1}^{\infty} b_{n}(0) e^{\lambda_{n} t} \phi_{n}(x) \nonumber

This solution satisfies, at least formally, the partial differential equation and satisfies the boundary conditions.

Finally, we need to determine the b_{n}(0) ’s, which are so far arbitrary. We use the initial condition u(x, 0)=f(x) to find that

f(x)=\sum_{n=1}^{\infty} b_{n}(0) \phi_{n}(x) . \nonumber

So, given f(x), we are left with the problem of extracting the coefficients b_{n}(0) in an expansion of f in the eigenfunctions \phi_{n}. We will see that this is related to Fourier series expansions, which we will take up in the next chapter.

Eigenfunction Expansions for Nonhomogeneous ODEs

Partial differential equations, as seen in the last section, are not the only application of the method of eigenfunction expansions. We can apply this method to nonhomogeneous two point boundary value problems for ordinary differential equations, assuming that we can solve the associated eigenvalue problem.

Let’s begin with the nonhomogeneous boundary value problem:

\begin{gathered} L[u]=f(x), \quad a \leq x \leq b \\[4pt] B[u](a)=0, \quad B[u](b)=0 . \end{gathered} \nonumber

We first solve the eigenvalue problem,

\begin{array}{r} L[\phi]=\lambda \phi, \quad a \leq x \leq b \\[4pt] B[\phi](a)=0, \quad B[\phi](b)=0, \end{array} \nonumber

and obtain a family of eigenfunctions, \left\{\phi_{n}(x)\right\}_{n=1}^{\infty}. Then we assume that u(x) can be represented as a linear combination of these eigenfunctions:

u(x)=\sum_{n=1}^{\infty} b_{n} \phi_{n}(x) \nonumber

Inserting this into the differential equation, we have

\begin{aligned} f(x) &=L[u] \\[4pt] &=L\left[\sum_{n=1}^{\infty} b_{n} \phi_{n}(x)\right] \\[4pt] &=\sum_{n=1}^{\infty} b_{n} L\left[\phi_{n}(x)\right] \end{aligned} \nonumber

\begin{aligned} &=\sum_{n=1}^{\infty} \lambda_{n} b_{n} \phi_{n}(x) \\[4pt] &\equiv \sum_{n=1}^{\infty} c_{n} \phi_{n}(x) \end{aligned} \nonumber

Therefore, we have to find the expansion coefficients c_{n}=\lambda_{n} b_{n} of the given f(x) in a series expansion over the eigenfunctions. This is similar to what we had found for the heat equation problem and its generalization in the last section.
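As a small illustration (a sketch, using the sine series that we formally develop in the next chapter), take L=\dfrac{d^{2}}{d x^{2}} on [0, \pi] with u(0)=0=u(\pi). Then \phi_{n}(x)=\sin n x, \lambda_{n}=-n^{2}, and if f(x)=\sum_{n=1}^{\infty} c_{n} \sin n x, matching coefficients gives b_{n}=\dfrac{c_{n}}{\lambda_{n}}=-\dfrac{c_{n}}{n^{2}}, so that

u(x)=-\sum_{n=1}^{\infty} \dfrac{c_{n}}{n^{2}} \sin n x . \nonumber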

There are a lot of questions and details that have been glossed over in our formal derivations. Can we always find such eigenfunctions for a given operator? Do the infinite series expansions converge? Can we differentiate our expansions term by term? Can one find expansions that converge to given functions like the f(x) above? We will begin to explore these questions in the case that the eigenfunctions are simple trigonometric functions like the \phi_{n}(x)=\sin \dfrac{n \pi x}{L} in the solution of the heat equation.

Linear Vector Spaces

Much of the discussion and terminology that we will use comes from the theory of vector spaces. Until now you may only have dealt with finite dimensional vector spaces in your classes. Even then, you might only be comfortable with two and three dimensions. We will review a little of what we know about finite dimensional spaces so that we can deal with the more general function spaces, which is where our eigenfunctions live.

The notion of a vector space is a generalization of our three dimensional vector spaces. In three dimensions, we have things called vectors, which are arrows of a specific length and pointing in a given direction. To each vector, we can associate a point in a three dimensional Cartesian system. We just attach the tail of the vector \mathbf{v} to the origin and the head lands at (x, y, z). We then use unit vectors \mathbf{i}, \mathbf{j} and \mathbf{k} along the coordinate axes to write

\mathbf{v}=x \mathbf{i}+y \mathbf{j}+z \mathbf{k} \nonumber

Having defined vectors, we then learned how to add vectors and multiply vectors by numbers, or scalars. Under these operations, we expected to get back new vectors. Then we learned that there were two types of multiplication of vectors. We could multiply them to get a scalar or a vector. This led to the dot and cross products, respectively. The dot product was useful for determining the length of a vector, the angle between two vectors, or whether the vectors were orthogonal.

These notions were later generalized to spaces of more than three dimensions in your linear algebra class. The properties outlined roughly above need to be preserved. So, we have to start with a space of vectors and the operations between them. We also need a set of scalars, which generally come from some field. However, in our applications the field will either be the set of real numbers or the set of complex numbers.

Definition 4.1. A vector space V over a field F is a set that is closed under addition and scalar multiplication and satisfies the following conditions: For any u, v, w \in V and a, b \in F

  1. u+v=v+u.
  2. (u+v)+w=u+(v+w).
  3. There exists a 0 such that 0+v=v.
  4. There exists a -v such that v+(-v)=0.
  5. a(b v)=(a b) v.
  6. (a+b) v=a v+b v.
  7. a(u+v)=a u+a v.
  8. 1(v)=v.

Now, for an n-dimensional vector space, we have the idea that any vector in the space can be represented as the sum over n linearly independent vectors. Recall that a linearly independent set of vectors \left\{\mathbf{v}_{j}\right\}_{j=1}^{n} satisfies

\sum_{j=1}^{n} c_{j} \mathbf{v}_{j}=\mathbf{0} \quad \Leftrightarrow \quad c_{j}=0 \nonumber

This leads to the idea of a basis set. The standard basis in an n-dimensional vector space is a generalization of the standard basis in three dimensions (\mathbf{i}, \mathbf{j} and \mathbf{k}). We define

\mathbf{e}_{k}=(0, \ldots, 0, \underbrace{1}_{k \text { th space }}, 0, \ldots, 0), \quad k=1, \ldots, n . \nonumber

Then, we can expand any \mathbf{v} \in V as

\mathbf{v}=\sum_{k=1}^{n} v_{k} \mathbf{e}_{k} \nonumber

where the v_{k} ’s are called the components of the vector in this basis and one can write \mathbf{v} as an n-tuple \left(v_{1}, v_{2}, \ldots, v_{n}\right).

The only other thing we will need at this point is to generalize the dot product, or scalar product. Recall that there are two forms for the dot product in three dimensions. First, one has that

\mathbf{u} \cdot \mathbf{v}=u v \cos \theta \nonumber

where u and v denote the lengths of the vectors. The other form is the component form:

\mathbf{u} \cdot \mathbf{v}=u_{1} v_{1}+u_{2} v_{2}+u_{3} v_{3}=\sum_{k=1}^{3} u_{k} v_{k} \nonumber

Of course, this form is easier to generalize. So, we define the scalar product between two n-dimensional vectors as

<\mathbf{u}, \mathbf{v}>=\sum_{k=1}^{n} u_{k} v_{k} . \nonumber

Actually, there are a number of notations that are used in other texts. One can write the scalar product as (\mathbf{u}, \mathbf{v}) or even use the Dirac notation <\mathbf{u} \mid \mathbf{v}> for applications in quantum mechanics.

While it does not always make sense to talk about angles between general vectors in higher dimensional vector spaces, there is one concept that is useful. It is that of orthogonality, which in three dimensions is another way of saying that two vectors are perpendicular to each other. So, we also say that vectors \mathbf{u} and \mathbf{v} are orthogonal if and only if \langle\mathbf{u}, \mathbf{v}\rangle=0. If \left\{\mathbf{a}_{k}\right\}_{k=1}^{n} is a set of basis vectors such that

<\mathbf{a}_{j}, \mathbf{a}_{k}>=0, \quad k \neq j, \nonumber

then it is called an orthogonal basis. If in addition each basis vector is a unit vector, then one has an orthonormal basis.

Let \left\{\mathbf{a}_{k}\right\}_{k=1}^{n}, be a set of basis vectors for vector space V. We know that any vector \mathbf{v} can be represented in terms of this basis, \mathbf{v}=\sum_{k=1}^{n} v_{k} \mathbf{a}_{k}. If we know the basis and vector, can we find the components? The answer is, yes. We can use the scalar product of \mathbf{v} with each basis element \mathbf{a}_{j}. So, we have for j=1, \ldots, n

\begin{aligned} <\mathbf{a}_{j}, \mathbf{v}>&=<\mathbf{a}_{j}, \sum_{k=1}^{n} v_{k} \mathbf{a}_{k}>\\[4pt] &=\sum_{k=1}^{n} v_{k}<\mathbf{a}_{j}, \mathbf{a}_{k}> \end{aligned} \nonumber

Since we know the basis elements, we can easily compute the numbers

A_{j k} \equiv<\mathbf{a}_{j}, \mathbf{a}_{k}> \nonumber

and

b_{j} \equiv<\mathbf{a}_{j}, \mathbf{v}>. \nonumber

Therefore, the system (4.28) for the v_{k} ’s is a linear algebraic system, which takes the form A \mathbf{v}=\mathbf{b}. However, if the basis is orthogonal, then the matrix A is diagonal and the system is easily solvable. We have that

<\mathbf{a}_{j}, \mathbf{v}>=v_{j}<\mathbf{a}_{j}, \mathbf{a}_{j}>, \nonumber

v_{j}=\dfrac{<\mathbf{a}_{j}, \mathbf{v}>}{<\mathbf{a}_{j}, \mathbf{a}_{j}>} . \nonumber

In fact, if the basis is orthonormal, A is the identity matrix and the solution is simpler:

v_{j}=<\mathbf{a}_{j}, \mathbf{v}>. \nonumber
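Here is a small numerical illustration of this recipe in Python, using an arbitrarily chosen orthogonal (but not orthonormal) basis of R^3:

import numpy as np

# An orthogonal (not orthonormal) basis of R^3, chosen only for illustration.
a1 = np.array([1.0, 1.0, 0.0])
a2 = np.array([1.0, -1.0, 0.0])
a3 = np.array([0.0, 0.0, 2.0])
v = np.array([2.0, 3.0, 4.0])

# Since the basis is orthogonal, the matrix A is diagonal and
# v_j = <a_j, v> / <a_j, a_j>.
comps = [np.dot(a, v) / np.dot(a, a) for a in (a1, a2, a3)]
print(comps)                                      # [2.5, -0.5, 2.0]
print(comps[0]*a1 + comps[1]*a2 + comps[2]*a3)    # recovers [2. 3. 4.]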

We spent some time looking at this simple case of extracting the components of a vector in a finite dimensional space. The keys to doing this simply were to have a scalar product and an orthogonal basis set. These are the key ingredients that we will need in the infinite dimensional case. Recall that when we solved the heat equation, we had a function (vector) that we wanted to expand in a set of eigenfunctions (basis) and we needed to find the expansion coefficients (components). As you can see, we need to extend the concepts for finite dimensional spaces to their analogs in infinite dimensional spaces. Linear algebra will provide some of the backdrop for what is to follow: The study of many boundary value problems amounts to the solution of eigenvalue problems over infinite dimensional vector spaces (complete inner product spaces, the space of square integrable functions, or Hilbert spaces).

We will consider the space of functions of a certain type. They could be the space of continuous functions on [0,1], or the space of continuously differentiable functions, or the set of functions integrable from a to b. Later, we will specify the types of functions needed. We will further need to be able to add functions and multiply them by scalars. So, we can easily obtain a vector space of functions.

We will also need a scalar product defined on this space of functions. There are several types of scalar products, or inner products, that we can define. For a real vector space, we define

Definition 4.2. An inner product <,> on a real vector space V is a mapping from V \times V into R such that for u, v, w \in V and \alpha \in R one has

  1. <u+v, w>=<u, w>+<v, w>.
  2. <\alpha v, w>=\alpha<v, w>.
  3. <v, w>=<w, v>.
  4. <v, v>\geq 0 and <v, v>=0 iff v=0.

A real vector space equipped with the above inner product leads to a real inner product space. A more general definition, with the third item replaced with \langle v, w\rangle=\overline{\langle w, v\rangle}, is needed for complex inner product spaces.

For the time being, we are dealing just with real valued functions. We need an inner product appropriate for such spaces. One such definition is the following. Let f(x) and g(x) be functions defined on [a, b]. Then, we define the inner product, if the integral exists, as

<f, g>=\int_{a}^{b} f(x) g(x) d x . \nonumber

So far, we have function spaces equipped with an inner product. Can we find a basis for the space? For an n-dimensional space we need n basis vectors. For an infinite dimensional space, how many will we need? How do we know when we have enough? We will think about those things later.

Let’s assume that we have a basis of functions \left\{\phi_{n}(x)\right\}_{n=1}^{\infty}. Given a function f(x), how can we go about finding the components of f in this basis? In other words, let

f(x)=\sum_{n=1}^{\infty} c_{n} \phi_{n}(x) \nonumber

How do we find the c_{n} ’s? Does this remind you of the problem we had earlier?

Formally, we take the inner product of f with each \phi_{j}, to find

\begin{aligned} <\phi_{j}, f>&=<\phi_{j}, \sum_{n=1}^{\infty} c_{n} \phi_{n}>\\[4pt] &=\sum_{n=1}^{\infty} c_{n}<\phi_{j}, \phi_{n}> \end{aligned} \nonumber

If our basis is an orthogonal basis, then we have

<\phi_{j}, \phi_{n}>=N_{j} \delta_{j n}, \nonumber

where \delta_{i j} is the Kronecker delta defined as

\delta_{i j}= \begin{cases}0, & i \neq j \\[4pt] 1, & i=j\end{cases} \nonumber

Thus, we have

\begin{aligned} <\phi_{j}, f>&=\sum_{n=1}^{\infty} c_{n}<\phi_{j}, \phi_{n}>\\[4pt] &=\sum_{n=1}^{\infty} c_{n} N_{j} \delta_{j n} \\[4pt] &=c_{1} N_{j} \delta_{j 1}+c_{2} N_{j} \delta_{j 2}+\ldots+c_{j} N_{j} \delta_{j j}+\ldots \\[4pt] &=c_{j} N_{j} \end{aligned} \nonumber

So, the expansion coefficient is

c_{j}=\dfrac{<\phi_{j}, f>}{N_{j}}=\dfrac{<\phi_{j}, f>}{<\phi_{j}, \phi_{j}>} \nonumber

We summarize this important result:

Generalized Basis Expansion

Let f(x) be represented by an expansion over a basis of orthogonal functions, \left\{\phi_{n}(x)\right\}_{n=1}^{\infty},

f(x)=\sum_{n=1}^{\infty} c_{n} \phi_{n}(x) . \nonumber

Then, the expansion coefficients are formally determined as

c_{n}=\dfrac{<\phi_{n}, f>}{<\phi_{n}, \phi_{n}>} . \nonumber

In our preparation for later sections, let’s determine if the set of functions \phi_{n}(x)=\sin n x for n=1,2, \ldots is orthogonal on the interval [-\pi, \pi]. We need to show that <\phi_{n}, \phi_{m}>=0 for n \neq m. Thus, we have for n \neq m

\begin{aligned} <\phi_{n}, \phi_{m}>&=\int_{-\pi}^{\pi} \sin n x \sin m x d x \\[4pt] &=\dfrac{1}{2} \int_{-\pi}^{\pi}[\cos (n-m) x-\cos (n+m) x] d x \\[4pt] &=\dfrac{1}{2}\left[\dfrac{\sin (n-m) x}{n-m}-\dfrac{\sin (n+m) x}{n+m}\right]_{-\pi}^{\pi}=0 \end{aligned} \nonumber

Here we have made use of a trigonometric identity for the product of two sines. We recall how this identity is derived. Recall the addition formulae for cosines:

\begin{aligned} &\cos (A+B)=\cos A \cos B-\sin A \sin B \\[4pt] &\cos (A-B)=\cos A \cos B+\sin A \sin B \end{aligned} \nonumber

Adding, or subtracting, these equations gives

\begin{aligned} &2 \cos A \cos B=\cos (A+B)+\cos (A-B), \\[4pt] &2 \sin A \sin B=\cos (A-B)-\cos (A+B) . \end{aligned} \nonumber

So, we have determined that the set \phi_{n}(x)=\sin n x for n=1,2, \ldots is an orthogonal set of functions on the interval [-\pi, \pi]. Just as with vectors in three dimensions, we can normalize our basis functions to arrive at an orthonormal basis, <\phi_{n}, \phi_{m}>=\delta_{n m}, m, n=1,2, \ldots This is simply done by dividing by the length of the vector. Recall that the length of a vector is obtained as v=\sqrt{\mathbf{v} \cdot \mathbf{v}}. In the same way, we define the norm of our functions by

\|f\|=\sqrt{<f, f>} . \nonumber

Note, there are many types of norms, but this will be sufficient for us. For the above basis of sine functions, we want to first compute the norm of each function. Then we would like to find a new basis from this one such that each basis function has unit length, giving an orthonormal basis. We first compute

\begin{aligned} \left\|\phi_{n}\right\|^{2} &=\int_{-\pi}^{\pi} \sin ^{2} n x d x \\[4pt] &=\dfrac{1}{2} \int_{-\pi}^{\pi}[1-\cos 2 n x] d x \\[4pt] &=\dfrac{1}{2}\left[x-\dfrac{\sin 2 n x}{2 n}\right]_{-\pi}^{\pi}=\pi \end{aligned} \nonumber

We have found for our example that

<\phi_{n}, \phi_{m}>=\pi \delta_{n m} \nonumber

and that \left\|\phi_{n}\right\|=\sqrt{\pi}. Defining \psi_{n}(x)=\dfrac{1}{\sqrt{\pi}} \phi_{n}(x), we have normalized the \phi_{n} ’s and have obtained an orthonormal basis of functions on [-\pi, \pi].
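These relations are easy to confirm symbolically. A brief check in Python with SymPy, for the first few integer values of n and m, is:

import sympy as sp

x = sp.symbols('x')
# <phi_n, phi_m> = int_{-pi}^{pi} sin(nx) sin(mx) dx should equal pi*delta_{nm}.
for n in range(1, 4):
    for m in range(1, 4):
        val = sp.integrate(sp.sin(n * x) * sp.sin(m * x), (x, -sp.pi, sp.pi))
        print(n, m, val)   # pi when n == m, 0 otherwise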

Expansions of functions in trigonometric bases occur often and originally resulted from the study of partial differential equations. They have been named Fourier series and will be the topic of the next chapter.

Problems

4.1. Solve the following problem:

x^{\prime \prime}+x=2, \quad x(0)=0, \quad x^{\prime}(1)=0 . \nonumber

4.2. Find product solutions, u(x, t)=b(t) \phi(x), to the heat equation satisfying the boundary conditions u_{x}(0, t)=0 and u(L, t)=0. Use these solutions to find a general solution of the heat equation satisfying these boundary conditions.

4.3. Consider the following boundary value problems. Determine the eigenvalues, \lambda, and eigenfunctions, y(x) for each problem. { }^{2}
a. y^{\prime \prime}+\lambda y=0, \quad y(0)=0, \quad y^{\prime}(1)=0.
b. y^{\prime \prime}-\lambda y=0, \quad y(-\pi)=0, \quad y^{\prime}(\pi)=0.
c. x^{2} y^{\prime \prime}+x y^{\prime}+\lambda y=0, \quad y(1)=0, \quad y(2)=0.
d. \left(x^{2} y^{\prime}\right)^{\prime}+\lambda y=0, \quad y(1)=0, \quad y^{\prime}(e)=0 .

{ }^{2} In problem d you will not get exact eigenvalues. Show that you obtain a transcendental equation for the eigenvalues in the form \tan z=2 z. Find the first three eigenvalues numerically.

4.4. For the following sets of functions: i) show that each is orthogonal on the given interval, and ii) determine the corresponding orthonormal set.
a. \{\sin 2 n x\}, \quad n=1,2,3, \ldots, \quad 0 \leq x \leq \pi.
b. \{\cos n \pi x\}, \quad n=0,1,2, \ldots, \quad 0 \leq x \leq 2.
c. \left\{\sin \dfrac{n \pi x}{L}\right\}, \quad n=1,2,3, \ldots, \quad x \in[-L, L].

4.5. Consider the boundary value problem for the deflection of a horizontal beam fixed at one end,

\dfrac{d^{4} y}{d x^{4}}=C, \quad y(0)=0, \quad y^{\prime}(0)=0, \quad y^{\prime \prime}(L)=0, \quad y^{\prime \prime \prime}(L)=0 \nonumber

Solve this problem assuming that C is a constant.

Fourier Series

Introduction

In this chapter we will look at trigonometric series. Previously, we saw that such series expansions occurred naturally in the solution of the heat equation and other boundary value problems. In the last chapter we saw that such functions could be viewed as a basis in an infinite dimensional vector space of functions. Given a function in that space, when will it have a representation as a trigonometric series? For what values of x will it converge? Finding such series is at the heart of Fourier, or spectral, analysis.

There are many applications using spectral analysis. At the root of these studies is the belief that many continuous waveforms are comprised of a number of harmonics. Such ideas stretch back to the Pythagorean study of the vibrations of strings, which led to their view of a world of harmony. This idea was carried further by Johannes Kepler in his harmony of the spheres approach to planetary orbits. In the 1700s others worked on the superposition theory for vibrating waves on a stretched string, starting with the wave equation and leading to the superposition of right and left traveling waves. This work was carried out by people such as John Wallis, Brook Taylor and Jean le Rond d’Alembert.

In 1742 d’Alembert solved the wave equation

c^{2} \dfrac{\partial^{2} y}{\partial x^{2}}-\dfrac{\partial^{2} y}{\partial t^{2}}=0 \nonumber

where y is the string height and c is the wave speed. However, his solution led him and others, like Leonhard Euler and Daniel Bernoulli, to investigate what "functions" could be the solutions of this equation. In fact, this led to a more rigorous approach to the study of analysis by first coming to grips with the concept of a function. For example, in 1749 Euler sought the solution for a plucked string in which case the initial condition y(x, 0)=h(x) has a discontinuous derivative! In 1753 Daniel Bernoulli viewed the solutions as a superposition of simple vibrations, or harmonics. Such superpositions amounted to looking at solutions of the form

y(x, t)=\sum_{k} a_{k} \sin \dfrac{k \pi x}{L} \cos \dfrac{k \pi c t}{L}, \nonumber

where the string extends over the interval [0, L] with fixed ends at x=0 and x=L. However, the initial conditions for such superpositions are

y(x, 0)=\sum_{k} a_{k} \sin \dfrac{k \pi x}{L} . \nonumber

It was determined that many functions could not be represented by a finite number of harmonics, even for the simply plucked string given by an initial condition of the form

y(x, 0)=\left\{\begin{array}{cc} c x, & 0 \leq x \leq L / 2 \\[4pt] c(L-x), & L / 2 \leq x \leq L \end{array}\right. \nonumber

Thus, the solution consists generally of an infinite series of trigonometric functions.

Such series expansions were also of importance in Joseph Fourier’s solution of the heat equation. The use of such Fourier expansions became an important tool in the solution of linear partial differential equations, such as the wave equation and the heat equation. As seen in the last chapter, using the Method of Separation of Variables allows higher dimensional problems to be reduced to several one dimensional boundary value problems. However, these studies led to very important questions, which in turn opened the doors to whole fields of analysis. Some of the problems raised were

  1. What functions can be represented as the sum of trigonometric functions?
  2. How can a function with discontinuous derivatives be represented by a sum of smooth functions, such as the above sums?
  3. Do such infinite sums of trigonometric functions actually converge to the functions they represent?

Sums over sinusoidal functions naturally occur in music and in studying sound waves. A pure note can be represented as

y(t)=A \sin (2 \pi f t), \nonumber

where A is the amplitude, f is the frequency in hertz (\mathrm{Hz}), and t is time in seconds. The amplitude is related to the volume, or intensity, of the sound. The larger the amplitude, the louder the sound. In Figure 5.1 we show plots of two such tones with f=2 \mathrm{~Hz} in the top plot and f=5 \mathrm{~Hz} in the bottom one.

Figure 5.1. Plots of y(t)=\sin (2 \pi f t) on [0,5] for f=2 \mathrm{~Hz} and f=5 \mathrm{~Hz}.

We can also add sinusoids of different amplitudes and frequencies. In Figure 5.2 we see what happens when we add several sinusoids. Note that as one adds more and more tones with different characteristics, the resulting signal gets more complicated. However, we still have a function of time. In this chapter we will ask, "Given a function f(t), can we find a set of sinusoidal functions whose sum converges to f(t) ?"

Looking at the superpositions in Figure 5.2, we see that the sums yield functions that appear to be periodic. This is not to be unexpected. We recall that a periodic function is one in which the function values repeat over the domain of the function. The length of the smallest part of the domain which repeats is called the period. We can define this more precisely.

Definition 5.1. A function is said to be periodic with period T if f(t+T)= f(t) for all t and the smallest such positive number T is called the period.

For example, we consider the functions used in Figure 5.2. We began with y(t)=2 \sin (4 \pi t). Recall from your first studies of trigonometric functions that one can determine the period by dividing the coefficient of t into 2 \pi to get the period. In this case we have

T=\dfrac{2 \pi}{4 \pi}=\dfrac{1}{2} . \nonumber

Looking at the top plot in Figure 5.1 we can verify this result. (You can count the full number of cycles in the graph and divide this into the total time to get a more accurate value of the period.)

Figure 5.2. Superposition of several sinusoids. Top: Sum of signals with f=2 \mathrm{~Hz} and f=5 \mathrm{~Hz}. Bottom: Sum of signals with f=2 \mathrm{~Hz}, f=5 \mathrm{~Hz}, and f=8 \mathrm{~Hz}.

Of course, this result makes sense, as the unit of frequency, the hertz, is also defined as s^{-1}, or cycles per second.

Returning to the superpositions in Figure 5.2, we have that y(t)=\sin (10 \pi t) has a period of 0.2 \mathrm{~s} and y(t)=\sin (16 \pi t) has a period of 0.125 \mathrm{~s}. The two superpositions retain the largest period of the signals added, which is 0.5 \mathrm{~s}.

Our goal will be to start with a function and then determine the amplitudes of the simple sinusoids needed to sum to that function. First of all, we will see that this might involve an infinite number of such terms. Thus, we will be studying an infinite series of sinusoidal functions.

Secondly, we will find that using just sine functions will not be enough either. This is because we can add sinusoidal functions that do not necessarily peak at the same time. We will consider two signals that originate at different times. This is similar to when your music teacher would make sections of the class sing a song like "Row, Row, Row your Boat" starting at slightly different times.

We can easily add shifted sine functions. In Figure 5.3 we show the functions y(t)=2 \sin (4 \pi t) and y(t)=2 \sin (4 \pi t+7 \pi / 8) and their sum. Note that this shifted sine function can be written as y(t)=2 \sin (4 \pi(t+7 / 32)). Thus, this corresponds to a time shift of -7 / 32.

Figure 5.3. Plot of the functions y(t)=2 \sin (4 \pi t) and y(t)=2 \sin (4 \pi t+7 \pi / 8) and their sum.

We are now in a position to state our goal in this chapter.

Goal

Given a signal f(t), we would like to determine its frequency content by finding out what combinations of sines and cosines of varying frequencies and amplitudes will sum to the given function. This is called Fourier Analysis.

Fourier Trigonometric Series

As we have seen in the last section, we are interested in finding representations of functions in terms of sines and cosines. Given a function f(x) we seek a representation in the form

f(x) \sim \dfrac{a_{0}}{2}+\sum_{n=1}^{\infty}\left[a_{n} \cos n x+b_{n} \sin n x\right] \nonumber

Notice that we have opted to drop reference to the frequency form of the phase. This will lead to a simpler discussion for now and one can always make the transformation n x=2 \pi f_{n} t when applying these ideas to applications.

The series representation in Equation (5.1) is called a Fourier trigonometric series. We will simply refer to this as a Fourier series for now. The set of constants a_{0}, a_{n}, b_{n}, n=1,2, \ldots are called the Fourier coefficients. The constant term is chosen in this form to make later computations simpler, though some other authors choose to write the constant term as a_{0}. Our goal is to find the Fourier series representation given f(x). Having found the Fourier series representation, we will be interested in determining when the Fourier series converges and to what function it converges.

From our discussion in the last section, we see that the infinite series is periodic. The largest period of the terms comes from the n=1 terms. The periods of \cos x and \sin x are T=2 \pi. Thus, the Fourier series has period 2 \pi. This means that the series should be able to represent functions that are periodic of period 2 \pi.

While this appears restrictive, we could also consider functions that are defined over one period. In Figure 5.4 we show a function defined on [0,2 \pi]. In the same figure, we show its periodic extension. These are just copies of the original function shifted by the period and glued together. The extension can now be represented by a Fourier series and restricting the Fourier series to [0,2 \pi] will give a representation of the original function. Therefore, we will first consider Fourier series representations of functions defined on this interval. Note that we could just as easily have considered functions defined on [-\pi, \pi] or any interval of length 2 \pi.

Fourier Coefficients

Figure 5.4. Plot of a function f(t) defined on [0,2 \pi] and its periodic extension.

Theorem 5.2. The Fourier series (5.1) of a function f(x) defined on [0,2 \pi], when it exists, has Fourier coefficients given by

a_{n}=\dfrac{1}{\pi} \int_{0}^{2 \pi} f(x) \cos n x d x, \quad n=0,1,2, \ldots, \nonumber

b_{n}=\dfrac{1}{\pi} \int_{0}^{2 \pi} f(x) \sin n x d x, \quad n=1,2, \ldots \nonumber

These expressions for the Fourier coefficients are obtained by considering special integrations of the Fourier series. We will look at the derivations of the a_{n}’s. First we obtain a_{0}.

We begin by integrating the Fourier series term by term in Equation (5.1).

\int_{0}^{2 \pi} f(x) d x=\int_{0}^{2 \pi} \dfrac{a_{0}}{2} d x+\int_{0}^{2 \pi} \sum_{n=1}^{\infty}\left[a_{n} \cos n x+b_{n} \sin n x\right] d x \nonumber

We assume that we can integrate the infinite sum term by term. Then we need to compute

\begin{gathered} \int_{0}^{2 \pi} \dfrac{a_{0}}{2} d x=\dfrac{a_{0}}{2}(2 \pi)=\pi a_{0}, \\[4pt] \int_{0}^{2 \pi} \cos n x d x=\left[\dfrac{\sin n x}{n}\right]_{0}^{2 \pi}=0 \\[4pt] \int_{0}^{2 \pi} \sin n x d x=\left[\dfrac{-\cos n x}{n}\right]_{0}^{2 \pi}=0 \end{gathered} \nonumber

From these results we see that only one term in the integrated sum does not vanish leaving

\int_{0}^{2 \pi} f(x) d x=\pi a_{0} \nonumber

This confirms the value for a_{0}.

Next, we need to find a_{n}. We will multiply the Fourier series (5.1) by \cos m x for some positive integer m. This is like multiplying by \cos 2 x, \cos 5 x, etc. We are multiplying by all possible \cos m x functions for different integers m all at the same time. We will see that this will allow us to solve for the a_{n} ’s.

We find the integrated sum of the series times \cos m x is given by

\begin{aligned} \int_{0}^{2 \pi} f(x) \cos m x d x &=\int_{0}^{2 \pi} \dfrac{a_{0}}{2} \cos m x d x \\[4pt] &+\int_{0}^{2 \pi} \sum_{n=1}^{\infty}\left[a_{n} \cos n x+b_{n} \sin n x\right] \cos m x d x \end{aligned} \nonumber

Integrating term by term, the right side becomes

\dfrac{a_{0}}{2} \int_{0}^{2 \pi} \cos m x d x+\sum_{n=1}^{\infty}\left[a_{n} \int_{0}^{2 \pi} \cos n x \cos m x d x+b_{n} \int_{0}^{2 \pi} \sin n x \cos m x d x\right].

We have already established that \int_{0}^{2 \pi} \cos m x d x=0, which implies that the first term vanishes.

Next we need to compute integrals of products of sines and cosines. This requires that we make use of some trigonometric identities. While you have seen such integrals before in your calculus class, we will review how to carry out such integrals. For future reference, we list several useful identities, some of which we will prove along the way.

Useful Trigonometric Identities

\begin{aligned} \sin (x \pm y) &=\sin x \cos y \pm \sin y \cos x \\[4pt] \cos (x \pm y) &=\cos x \cos y \mp \sin x \sin y \\[4pt] \sin ^{2} x &=\dfrac{1}{2}(1-\cos 2 x) \\[4pt] \cos ^{2} x &=\dfrac{1}{2}(1+\cos 2 x) \\[4pt] \sin x \sin y &=\dfrac{1}{2}(\cos (x-y)-\cos (x+y)) \\[4pt] \cos x \cos y &=\dfrac{1}{2}(\cos (x+y)+\cos (x-y)) \\[4pt] \sin x \cos y &=\dfrac{1}{2}(\sin (x+y)+\sin (x-y)) \end{aligned} \nonumber

We first want to evaluate \int_{0}^{2 \pi} \cos n x \cos m x d x. We do this by using the product identity. We had done this in the last chapter, but will repeat the derivation for the reader’s benefit. Recall the addition formulae for cosines:

\cos (A+B)=\cos A \cos B-\sin A \sin B \nonumber

\cos (A-B)=\cos A \cos B+\sin A \sin B \nonumber

Adding these equations gives

2 \cos A \cos B=\cos (A+B)+\cos (A-B) . \nonumber

We can use this identity with A=m x and B=n x to complete the integration. We have

\begin{aligned} \int_{0}^{2 \pi} \cos n x \cos m x d x &=\dfrac{1}{2} \int_{0}^{2 \pi}[\cos (m+n) x+\cos (m-n) x] d x \\[4pt] &=\dfrac{1}{2}\left[\dfrac{\sin (m+n) x}{m+n}+\dfrac{\sin (m-n) x}{m-n}\right]_{0}^{2 \pi} \\[4pt] &=0 . \end{aligned} \nonumber

There is one caveat when doing such integrals. What if one of the denominators m \pm n vanishes? For our problem m+n \neq 0, since both m and n are positive integers. However, it is possible for m=n. This means that the vanishing of the integral can only happen when m \neq n. So, what can we do about the m=n case? One way is to start from scratch with our integration. (Another way is to compute the limit as n approaches m in our result and use L’Hopital’s Rule. Try it!)

So, for n=m we have to compute \int_{0}^{2 \pi} \cos ^{2} m x d x. This can also be handled using a trigonometric identity. Recall that

\cos ^{2} \theta=\dfrac{1}{2}(1+\cos 2 \theta) . \nonumber

Inserting this into the integral, we find

\begin{aligned} \int_{0}^{2 \pi} \cos ^{2} m x d x &=\dfrac{1}{2} \int_{0}^{2 \pi}(1+\cos 2 m x) d x \\[4pt] &=\dfrac{1}{2}\left[x+\dfrac{1}{2 m} \sin 2 m x\right]_{0}^{2 \pi} \\[4pt] &=\dfrac{1}{2}(2 \pi)=\pi \end{aligned} \nonumber

To summarize, we have shown that

\int_{0}^{2 \pi} \cos n x \cos m x d x=\left\{\begin{array}{l} 0, m \neq n \\[4pt] \pi, m=n \end{array}\right. \nonumber

This holds true for m, n=0,1, \ldots [Why did we include m, n=0 ?] When we have such a set of functions, they are said to be an orthogonal set over the integration interval.

Definition 5.3. A set of (real) functions \left\{\phi_{n}(x)\right\} is said to be orthogonal on [a, b] if \int_{a}^{b} \phi_{n}(x) \phi_{m}(x) d x=0 when n \neq m. Furthermore, if we also have that \int_{a}^{b} \phi_{n}^{2}(x) d x=1, these functions are called orthonormal.

The set of functions \{\cos n x\}_{n=0}^{\infty} is orthogonal on [0,2 \pi]. Actually, these functions are orthogonal on any interval of length 2 \pi. We can make them orthonormal by dividing each function by \sqrt{\pi} as indicated by Equation (5.15).

The notion of orthogonality is actually a generalization of the orthogonality of vectors in finite dimensional vector spaces. The integral \int_{a}^{b} f(x) g(x) d x is the generalization of the dot product, and is called the scalar product of f(x) and g(x), which are thought of as vectors in an infinite dimensional vector space spanned by a set of orthogonal functions. But that is another topic for later.

Returning to the evaluation of the integrals in equation (5.6), we still have to evaluate \int_{0}^{2 \pi} \sin n x \cos m x d x. This can also be evaluated using trigonometric identities. In this case, we need an identity involving products of sines and cosines. Such products occur in the addition formulae for sine functions:

\begin{aligned} &\sin (A+B)=\sin A \cos B+\sin B \cos A \\[4pt] &\sin (A-B)=\sin A \cos B-\sin B \cos A \end{aligned} \nonumber

Adding these equations, we find that

\sin (A+B)+\sin (A-B)=2 \sin A \cos B \nonumber

Setting A=n x and B=m x, we find that

\begin{aligned} \int_{0}^{2 \pi} \sin n x \cos m x d x &=\dfrac{1}{2} \int_{0}^{2 \pi}[\sin (n+m) x+\sin (n-m) x] d x \\[4pt] &=\dfrac{1}{2}\left[\dfrac{-\cos (n+m) x}{n+m}+\dfrac{-\cos (n-m) x}{n-m}\right]_{0}^{2 \pi} \\[4pt] &=\dfrac{1}{2}\left[\left(-\dfrac{1}{n+m}+\dfrac{1}{n+m}\right)+\left(-\dfrac{1}{n-m}+\dfrac{1}{n-m}\right)\right]=0 \end{aligned} \nonumber

For these integrals we also should be careful about setting n=m. In this special case, we have the integrals

\int_{0}^{2 \pi} \sin m x \cos m x d x=\dfrac{1}{2} \int_{0}^{2 \pi} \sin 2 m x d x=\dfrac{1}{2}\left[\dfrac{-\cos 2 m x}{2 m}\right]_{0}^{2 \pi}=0 \nonumber

Finally, we can finish our evaluation of (5.6). We have determined that all but one integral vanishes. In that case, n=m. This leaves us with

\int_{0}^{2 \pi} f(x) \cos m x d x=a_{m} \pi \nonumber

Solving for a_{m} gives

a_{m}=\dfrac{1}{\pi} \int_{0}^{2 \pi} f(x) \cos m x d x \nonumber

Since this is true for all m=1,2, \ldots, we have proven this part of the theorem. The only part left is finding the b_{n}’s. This will be left as an exercise for the reader.

We now consider examples of finding Fourier coefficients for given functions. In all of these cases we define f(x) on [0,2 \pi].

Example 5.4. f(x)=3 \cos 2 x, x \in[0,2 \pi].

We first compute the integrals for the Fourier coefficients.

\begin{aligned} &a_{0}=\dfrac{1}{\pi} \int_{0}^{2 \pi} 3 \cos 2 x d x=0 . \\[4pt] &a_{n}=\dfrac{1}{\pi} \int_{0}^{2 \pi} 3 \cos 2 x \cos n x d x=0, \quad n \neq 2 . \\[4pt] &a_{2}=\dfrac{1}{\pi} \int_{0}^{2 \pi} 3 \cos ^{2} 2 x d x=3, \\[4pt] &b_{n}=\dfrac{1}{\pi} \int_{0}^{2 \pi} 3 \cos 2 x \sin n x d x=0, \forall n . \end{aligned} \nonumber

Therefore, we have that the only nonvanishing coefficient is a_{2}=3. So there is one term and f(x)=3 \cos 2 x. Well, we should have known this before doing all of these integrals. So, if we have a function expressed simply in terms of sums of simple sines and cosines, then it should be easy to write down the Fourier coefficients without much work.
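Such coefficients are also easy to check numerically. A minimal sketch in Python, using SciPy quadrature to reproduce the integrals of this example, is:

import numpy as np
from scipy.integrate import quad

f = lambda x: 3 * np.cos(2 * x)

a0 = quad(f, 0, 2 * np.pi)[0] / np.pi
an = [quad(lambda x, n=n: f(x) * np.cos(n * x), 0, 2 * np.pi)[0] / np.pi for n in range(1, 5)]
bn = [quad(lambda x, n=n: f(x) * np.sin(n * x), 0, 2 * np.pi)[0] / np.pi for n in range(1, 5)]

print(a0)   # approximately 0
print(an)   # approximately [0, 3, 0, 0]; only a_2 = 3 survives
print(bn)   # approximately all 0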

Example 5.5. f(x)=\sin ^{2} x, x \in[0,2 \pi].

We could determine the Fourier coefficients by integrating as in the last example. However, it is easier to use trigonometric identities. We know that

\sin ^{2} x=\dfrac{1}{2}(1-\cos 2 x)=\dfrac{1}{2}-\dfrac{1}{2} \cos 2 x . \nonumber

There are no sine terms, so b_{n}=0, n=1,2, \ldots There is a constant term, implying a_{0} / 2=1 / 2. So, a_{0}=1. There is a \cos 2 x term, corresponding to n=2, so a_{2}=-\dfrac{1}{2}. That leaves a_{n}=0 for n \neq 0,2.

Example 5.6. f(x)=\left\{\begin{array}{c}1, \quad 0<x<\pi \\[4pt] -1, \pi<x<2 \pi\end{array}\right..

This example will take a little more work. We cannot bypass evaluating any integrals at this time. This function is discontinuous, so we will have to compute each integral by breaking up the integration into two integrals, one over [0, \pi] and the other over [\pi, 2 \pi].

\begin{aligned} a_{0} &=\dfrac{1}{\pi} \int_{0}^{2 \pi} f(x) d x \\[4pt] &=\dfrac{1}{\pi} \int_{0}^{\pi} d x+\dfrac{1}{\pi} \int_{\pi}^{2 \pi}(-1) d x \\[4pt] &=\dfrac{1}{\pi}(\pi)+\dfrac{1}{\pi}(-2 \pi+\pi)=0 . \end{aligned} \nonumber

\begin{aligned} a_{n} &=\dfrac{1}{\pi} \int_{0}^{2 \pi} f(x) \cos n x d x \\[4pt] &=\dfrac{1}{\pi}\left[\int_{0}^{\pi} \cos n x d x-\int_{\pi}^{2 \pi} \cos n x d x\right] \\[4pt] &=\dfrac{1}{\pi}\left[\left(\dfrac{1}{n} \sin n x\right)_{0}^{\pi}-\left(\dfrac{1}{n} \sin n x\right)_{\pi}^{2 \pi}\right] \\[4pt] &=0 . \end{aligned} \nonumber

\begin{aligned} b_{n} &=\dfrac{1}{\pi} \int_{0}^{2 \pi} f(x) \sin n x d x \\[4pt] &=\dfrac{1}{\pi}\left[\int_{0}^{\pi} \sin n x d x-\int_{\pi}^{2 \pi} \sin n x d x\right] \\[4pt] &=\dfrac{1}{\pi}\left[\left(-\dfrac{1}{n} \cos n x\right)_{0}^{\pi}+\left(\dfrac{1}{n} \cos n x\right)_{\pi}^{2 \pi}\right] \\[4pt] &=\dfrac{1}{\pi}\left[-\dfrac{1}{n} \cos n \pi+\dfrac{1}{n}+\dfrac{1}{n}-\dfrac{1}{n} \cos n \pi\right] \\[4pt] &=\dfrac{2}{n \pi}(1-\cos n \pi) . \end{aligned} \nonumber

We have found the Fourier coefficients for this function. Before inserting them into the Fourier series (5.1), we note that \cos n \pi=(-1)^{n}. Therefore,

1-\cos n \pi=\left\{\begin{array}{l} 0, n \text { even } \\[4pt] 2, n \text { odd. } \end{array}\right. \nonumber

So, half of the b_{n}’s are zero. While we could write the Fourier series representation as

f(x) \sim \dfrac{4}{\pi} \sum_{n=1, \text { odd }}^{\infty} \dfrac{1}{n} \sin n x \nonumber

we could let n=2 k-1 and write

f(x)=\dfrac{4}{\pi} \sum_{k=1}^{\infty} \dfrac{\sin (2 k-1) x}{2 k-1} \nonumber

But does this series converge? Does it converge to f(x) ? We will discuss this question later in the chapter.
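As a numerical preview of this question, the following sketch in Python evaluates partial sums of the series at a few points; away from the jumps the values tend toward \pm 1:

import numpy as np

def partial_sum(x, K):
    # Sum of the first K terms of (4/pi) * sum_k sin((2k-1)x)/(2k-1).
    k = np.arange(1, K + 1)
    return (4 / np.pi) * np.sum(np.sin(np.outer(x, 2 * k - 1)) / (2 * k - 1), axis=1)

x = np.array([0.5, 1.0, np.pi / 2, 4.0, 5.0])
for K in (5, 50, 500):
    print(K, partial_sum(x, K))   # approaches 1 on (0, pi) and -1 on (pi, 2*pi)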

Fourier Series Over Other Intervals

In many applications we are interested in determining Fourier series representations of functions defined on intervals other than [0,2 \pi]. In this section we will determine the form of the series expansion and the Fourier coefficients in these cases.

The most general type of interval is given as [a, b]. However, this often is too general. More common intervals are of the form [-\pi, \pi],[0, L], or [-L / 2, L / 2]. The simplest generalization is to the interval [0, L]. Such intervals arise often in applications. For example, one can study vibrations of a one dimensional string of length L and set up the axes with the left end at x=0 and the right end at x=L. Another problem would be to study the temperature distribution along a one dimensional rod of length L. Such problems lead to the original studies of Fourier series. As we will see later, symmetric intervals, [-a, a], are also useful.

Given an interval [0, L], we could apply a transformation to an interval of length 2 \pi by simply rescaling our interval. Then we could apply this transformation to our Fourier series representation to obtain an equivalent one useful for functions defined on [0, L].

We define x \in[0,2 \pi] and t \in[0, L]. A linear transformation relating these intervals is simply x=\dfrac{2 \pi t}{L} as shown in Figure 5.5. So, t=0 maps to x=0 and t=L maps to x=2 \pi. Furthermore, this transformation maps f(x) to a new function g(t)=f(x(t)), which is defined on [0, L]. We will determine the Fourier series representation of this function using the representation for f(x).

Figure 5.5. A sketch of the transformation between intervals x \in[0,2 \pi] and t \in[0, L].

Recall the form of the Fourier representation for f(x) in Equation (5.1):

f(x) \sim \dfrac{a_{0}}{2}+\sum_{n=1}^{\infty}\left[a_{n} \cos n x+b_{n} \sin n x\right] \nonumber

Inserting the transformation relating x and t, we have

g(t) \sim \dfrac{a_{0}}{2}+\sum_{n=1}^{\infty}\left[a_{n} \cos \dfrac{2 n \pi t}{L}+b_{n} \sin \dfrac{2 n \pi t}{L}\right] . \nonumber

This gives the form of the series expansion for g(t) with t \in[0, L]. But, we still need to determine the Fourier coefficients. Recall that

a_{n}=\dfrac{1}{\pi} \int_{0}^{2 \pi} f(x) \cos n x d x \nonumber

We need to make a substitution in the integral of x=\dfrac{2 \pi t}{L}. We also will need to transform the differential, d x=\dfrac{2 \pi}{L} d t. Thus, the resulting form for our coefficient is

a_{n}=\dfrac{2}{L} \int_{0}^{L} g(t) \cos \dfrac{2 n \pi t}{L} d t \nonumber

Similarly, we find that

b_{n}=\dfrac{2}{L} \int_{0}^{L} g(t) \sin \dfrac{2 n \pi t}{L} d t \nonumber

We note first that when L=2 \pi we get back the series representation that we first studied. Also, the period of \cos \dfrac{2 n \pi t}{L} is L / n, which means that the representation for g(t) has a period of L.
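As a quick worked illustration of these formulas, take g(t)=t on [0, L]. Then a_{0}=\dfrac{2}{L} \int_{0}^{L} t d t=L, while integration by parts gives a_{n}=0 and b_{n}=-\dfrac{L}{n \pi} for n \geq 1, so that

g(t) \sim \dfrac{L}{2}-\dfrac{L}{\pi} \sum_{n=1}^{\infty} \dfrac{1}{n} \sin \dfrac{2 n \pi t}{L} . \nonumber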

At the end of this section we present the derivation of the Fourier series representation for a general interval for the interested reader. In Table 5.1 we summarize some commonly used Fourier series representations.

We will end our discussion for now with some special cases and an example for a function defined on [-\pi, \pi].

Example 5.7. Let f(x)=|x| on [-\pi, \pi]. We compute the coefficients, beginning as usual with a_{0}. We have

\begin{aligned} a_{0} &=\dfrac{1}{\pi} \int_{-\pi}^{\pi}|x| d x \\[4pt] &=\dfrac{2}{\pi} \int_{0}^{\pi}|x| d x=\pi \end{aligned} \nonumber

At this point we need to remind the reader about the integration of even and odd functions.

  1. Even Functions: In this evaluation we made use of the fact that the integrand is an even function. Recall that f(x) is an even function if f(-x)=f(x) for all x. One can recognize even functions as they are symmetric with respect to the y-axis as shown in Figure 5.6(A). If one integrates an even function over a symmetric interval, then one has that

\int_{-a}^{a} f(x) d x=2 \int_{0}^{a} f(x) d x \nonumber

One can prove this by splitting off the integration over negative values of x, using the substitution x=-y, and employing the evenness of f(x). Thus,

\begin{aligned} \int_{-a}^{a} f(x) d x &=\int_{-a}^{0} f(x) d x+\int_{0}^{a} f(x) d x \\[4pt] &=-\int_{a}^{0} f(-y) d y+\int_{0}^{a} f(x) d x \\[4pt] &=\int_{0}^{a} f(y) d y+\int_{0}^{a} f(x) d x \\[4pt] &=2 \int_{0}^{a} f(x) d x \end{aligned} \nonumber

This can be visually verified by looking at Figure 5.6(A).

Table 5.1. Special Fourier Series Representations on Different Intervals

Fourier Series on [0, L]

\begin{aligned} f(x) & \sim \dfrac{a_{0}}{2}+\sum_{n=1}^{\infty}\left[a_{n} \cos \dfrac{2 n \pi x}{L}+b_{n} \sin \dfrac{2 n \pi x}{L}\right] \\[4pt] a_{n} &=\dfrac{2}{L} \int_{0}^{L} f(x) \cos \dfrac{2 n \pi x}{L} d x . \quad n=0,1,2, \ldots, \\[4pt] b_{n} &=\dfrac{2}{L} \int_{0}^{L} f(x) \sin \dfrac{2 n \pi x}{L} d x . \quad n=1,2, \ldots \end{aligned} \nonumber

Fourier Series on \left[-\dfrac{L}{2}, \dfrac{L}{2}\right]

\begin{aligned} f(x) & \sim \dfrac{a_{0}}{2}+\sum_{n=1}^{\infty}\left[a_{n} \cos \dfrac{2 n \pi x}{L}+b_{n} \sin \dfrac{2 n \pi x}{L}\right] . \\[4pt] a_{n} &=\dfrac{2}{L} \int_{-\dfrac{L}{2}}^{\dfrac{L}{2}} f(x) \cos \dfrac{2 n \pi x}{L} d x . \quad n=0,1,2, \ldots, \\[4pt] b_{n} &=\dfrac{2}{L} \int_{-\dfrac{L}{2}}^{\dfrac{L}{2}} f(x) \sin \dfrac{2 n \pi x}{L} d x . \quad n=1,2, \ldots \end{aligned} \nonumber

Fourier Series on [-\pi, \pi]

\begin{gathered} f(x) \sim \dfrac{a_{0}}{2}+\sum_{n=1}^{\infty}\left[a_{n} \cos n x+b_{n} \sin n x\right] . \\[4pt] a_{n}=\dfrac{1}{\pi} \int_{-\pi}^{\pi} f(x) \cos n x d x . \quad n=0,1,2, \ldots, \\[4pt] b_{n}=\dfrac{1}{\pi} \int_{-\pi}^{\pi} f(x) \sin n x d x . \quad n=1,2, \ldots \end{gathered} \nonumber

  2. Odd Functions: A similar computation could be done for odd functions. f(x) is an odd function if f(-x)=-f(x) for all x. The graphs of such functions are symmetric with respect to the origin as shown in Figure 5.6(B). If one integrates an odd function over a symmetric interval, then one has that

\int_{-a}^{a} f(x) d x=0 . \nonumber

Figure 5.6. Examples of the areas under (A) even and (B) odd functions on symmetric intervals, [-a, a].

We now continue with our computation of the Fourier coefficients for f(x)=|x| on [-\pi, \pi]. We have

a_{n}=\dfrac{1}{\pi} \int_{-\pi}^{\pi}|x| \cos n x d x=\dfrac{2}{\pi} \int_{0}^{\pi} x \cos n x d x . \nonumber

Here we have made use of the fact that |x| \cos n x is an even function. In order to compute the resulting integral, we need to use integration by parts,

\int_{a}^{b} u d v=\left.u v\right|_{a} ^{b}-\int_{a}^{b} v d u \nonumber

by letting u=x and d v=\cos n x d x. Thus, d u=d x and v=\int d v=\dfrac{1}{n} \sin n x. Continuing with the computation, we have

\begin{aligned} a_{n} &=\dfrac{2}{\pi} \int_{0}^{\pi} x \cos n x d x \\[4pt] &=\dfrac{2}{\pi}\left[\left.\dfrac{1}{n} x \sin n x\right|_{0} ^{\pi}-\dfrac{1}{n} \int_{0}^{\pi} \sin n x d x\right] \\[4pt] &=-\dfrac{2}{n \pi}\left[-\dfrac{1}{n} \cos n x\right]_{0}^{\pi} \\[4pt] &=-\dfrac{2}{\pi n^{2}}\left(1-(-1)^{n}\right) \end{aligned} \nonumber

Here we have used the fact that \cos n \pi=(-1)^{n} for any integer n. This led to a factor \left(1-(-1)^{n}\right). This factor can be simplified as

1-(-1)^{n}=\left\{\begin{array}{l} 2, n \text { odd } \\[4pt] 0, n \text { even } \end{array}\right. \nonumber

So, a_{n}=0 for n even and a_{n}=-\dfrac{4}{\pi n^{2}} for n odd.

Computing the b_{n} ’s is simpler. We note that we have to integrate |x| \sin n x from x=-\pi to \pi. The integrand is an odd function and this is a symmetric interval. So, the result is that b_{n}=0 for all n.

Putting this all together, the Fourier series representation of f(x)=|x| on [-\pi, \pi] is given as

f(x) \sim \dfrac{\pi}{2}-\dfrac{4}{\pi} \sum_{n=1, \text { odd }}^{\infty} \dfrac{\cos n x}{n^{2}} \nonumber

While this is correct, we can rewrite the sum over only odd n by reindexing. We let n=2 k-1 for k=1,2,3, \ldots Then we only get the odd integers. The series can then be written as

f(x) \sim \dfrac{\pi}{2}-\dfrac{4}{\pi} \sum_{k=1}^{\infty} \dfrac{\cos (2 k-1) x}{(2 k-1)^{2}} . \nonumber

Throughout our discussion we have referred to such results as Fourier representations. We have not looked at the convergence of these series. Here is an example of an infinite series of functions. What does this series sum to? We show in Figure 5.7 the first few partial sums. They appear to be converging to f(x)=|x| fairly quickly.

Even though f(x) was defined on [-\pi, \pi] we can still evaluate the Fourier series at values of x outside this interval. In Figure 5.8, we see that the representation agrees with f(x) on the interval [-\pi, \pi]. Outside this interval we have a periodic extension of f(x) with period 2 \pi.

Another example is the Fourier series representation of f(x)=x on [-\pi, \pi], which is left as Problem 5.1. This is determined to be

f(x) \sim 2 \sum_{n=1}^{\infty} \dfrac{(-1)^{n+1}}{n} \sin n x . \nonumber

As seen in Figure 5.9 we again obtain the periodic extension of our function. In this case we needed many more terms. Also, the vertical parts of the first plot are nonexistent. In the second plot we only plot the points and not the typical connected points that most software packages plot as the default style.

Figure 5.7. Plot of the first partial sums of the Fourier series representation for f(x)= |x|
Figure 5.8. Plot of the first 10 terms of the Fourier series representation for f(x)=|x| on the interval [-2 \pi, 4 \pi].
Figure 5.9. Plot of the first 10 terms and 200 terms of the Fourier series representation for f(x)=x on the interval [-2 \pi, 4 \pi]

As a side note, evaluating this last series at x=\dfrac{\pi}{2}, where it converges to f\left(\dfrac{\pi}{2}\right)=\dfrac{\pi}{2}, gives a series representation of \pi:

\pi=4\left[1-\dfrac{1}{3}+\dfrac{1}{5}-\dfrac{1}{7}+\ldots\right] \nonumber
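A quick numerical check in Python shows how slowly these partial sums approach \pi:

import math

# Partial sums of 4*(1 - 1/3 + 1/5 - 1/7 + ...) approach pi, but slowly.
for N in (10, 1000, 100000):
    approx = 4 * sum((-1) ** k / (2 * k + 1) for k in range(N))
    print(N, approx, abs(approx - math.pi))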

Fourier Series on [a, b]

A Fourier series representation is also possible for a general interval, t \in[a, b]. As before, we just need to transform this interval to [0,2 \pi]. Let

x=2 \pi \dfrac{t-a}{b-a} . \nonumber

Inserting this into the Fourier series (5.1) representation for f(x) we obtain

g(t) \sim \dfrac{a_{0}}{2}+\sum_{n=1}^{\infty}\left[a_{n} \cos \dfrac{2 n \pi(t-a)}{b-a}+b_{n} \sin \dfrac{2 n \pi(t-a)}{b-a}\right] \nonumber

Well, this expansion is ugly. It is not like the last example, where the transformation was straightforward. If one were to apply the theory to applications, it might seem to make sense to just shift the data so that a=0 and be done with any complicated expressions. However, mathematics students enjoy the challenge of developing such generalized expressions. So, let’s see what is involved.

First, we apply the addition identities for trigonometric functions and rearrange the terms.

\begin{aligned} g(t) \sim & \dfrac{a_{0}}{2}+\sum_{n=1}^{\infty}\left[a_{n} \cos \dfrac{2 n \pi(t-a)}{b-a}+b_{n} \sin \dfrac{2 n \pi(t-a)}{b-a}\right] \\[4pt] =& \dfrac{a_{0}}{2}+\sum_{n=1}^{\infty}\left[a_{n}\left(\cos \dfrac{2 n \pi t}{b-a} \cos \dfrac{2 n \pi a}{b-a}+\sin \dfrac{2 n \pi t}{b-a} \sin \dfrac{2 n \pi a}{b-a}\right)\right.\\[4pt] &\left.+b_{n}\left(\sin \dfrac{2 n \pi t}{b-a} \cos \dfrac{2 n \pi a}{b-a}-\cos \dfrac{2 n \pi t}{b-a} \sin \dfrac{2 n \pi a}{b-a}\right)\right] \\[4pt] =& \dfrac{a_{0}}{2}+\sum_{n=1}^{\infty}\left[\cos \dfrac{2 n \pi t}{b-a}\left(a_{n} \cos \dfrac{2 n \pi a}{b-a}-b_{n} \sin \dfrac{2 n \pi a}{b-a}\right)\right.\\[4pt] &\left.+\sin \dfrac{2 n \pi t}{b-a}\left(a_{n} \sin \dfrac{2 n \pi a}{b-a}+b_{n} \cos \dfrac{2 n \pi a}{b-a}\right)\right] \end{aligned} \nonumber

Defining A_{0}=a_{0} and

\begin{aligned} A_{n} & \equiv a_{n} \cos \dfrac{2 n \pi a}{b-a}-b_{n} \sin \dfrac{2 n \pi a}{b-a} \\[4pt] B_{n} & \equiv a_{n} \sin \dfrac{2 n \pi a}{b-a}+b_{n} \cos \dfrac{2 n \pi a}{b-a} \end{aligned} \nonumber

we arrive at the more desirable form for the Fourier series representation of a function defined on the interval [a, b].

g(t) \sim \dfrac{A_{0}}{2}+\sum_{n=1}^{\infty}\left[A_{n} \cos \dfrac{2 n \pi t}{b-a}+B_{n} \sin \dfrac{2 n \pi t}{b-a}\right] . \nonumber

We next need to find expressions for the Fourier coefficients. We insert the known expressions for a_{n} and b_{n} and rearrange. First, we note that under the transformation x=2 \pi \dfrac{t-a}{b-a} we have

\begin{aligned} a_{n} &=\dfrac{1}{\pi} \int_{0}^{2 \pi} f(x) \cos n x d x \\[4pt] &=\dfrac{2}{b-a} \int_{a}^{b} g(t) \cos \dfrac{2 n \pi(t-a)}{b-a} d t \end{aligned} \nonumber

and

\begin{aligned} b_{n} &=\dfrac{1}{\pi} \int_{0}^{2 \pi} f(x) \sin n x d x \\[4pt] &=\dfrac{2}{b-a} \int_{a}^{b} g(t) \sin \dfrac{2 n \pi(t-a)}{b-a} d t \end{aligned} \nonumber

Then, inserting these integrals in A_{n}, combining integrals and making use of the addition formula for the cosine of the sum of two angles, we obtain

\begin{aligned} A_{n} & \equiv a_{n} \cos \dfrac{2 n \pi a}{b-a}-b_{n} \sin \dfrac{2 n \pi a}{b-a} \\[4pt] &=\dfrac{2}{b-a} \int_{a}^{b} g(t)\left[\cos \dfrac{2 n \pi(t-a)}{b-a} \cos \dfrac{2 n \pi a}{b-a}-\sin \dfrac{2 n \pi(t-a)}{b-a} \sin \dfrac{2 n \pi a}{b-a}\right] d t \\[4pt] &=\dfrac{2}{b-a} \int_{a}^{b} g(t) \cos \dfrac{2 n \pi t}{b-a} d t \end{aligned} \nonumber

A similar computation gives

B_{n}=\dfrac{2}{b-a} \int_{a}^{b} g(t) \sin \dfrac{2 n \pi t}{b-a} d t \nonumber

Summarizing, we have shown that:

Theorem 5.9. The Fourier series representation of f(x) defined on [a, b], when it exists, is given by

f(x) \sim \dfrac{a_{0}}{2}+\sum_{n=1}^{\infty}\left[a_{n} \cos \dfrac{2 n \pi x}{b-a}+b_{n} \sin \dfrac{2 n \pi x}{b-a}\right] . \nonumber

with Fourier coefficients

\begin{aligned} a_{n} &=\dfrac{2}{b-a} \int_{a}^{b} f(x) \cos \dfrac{2 n \pi x}{b-a} d x . \quad n=0,1,2, \ldots, \\[4pt] b_{n} &=\dfrac{2}{b-a} \int_{a}^{b} f(x) \sin \dfrac{2 n \pi x}{b-a} d x . \quad n=1,2, \ldots \end{aligned} \nonumber

Sine and Cosine Series

In the last two examples (f(x)=|x| and f(x)=x on [-\pi, \pi]) we have seen Fourier series representations that contain only sine or cosine terms. As we know, the sine functions are odd functions and thus sum to odd functions. Similarly, cosine functions sum to even functions. Such occurrences happen often in practice. Fourier representations involving just sines are called sine series and those involving just cosines (and the constant term) are called cosine series.

Another interesting result, based upon these examples, is that the original functions, |x| and x agree on the interval [0, \pi]. Note from Figures 5.7-5.9 that their Fourier series representations do as well. Thus, more than one series can be used to represent functions defined on finite intervals. All they need to do is to agree with the function over that particular interval. Sometimes one of these series is more useful because it has additional properties needed in the given application.

We have made the following observations from the previous examples:

  1. There are several trigonometric series representations for a function defined on a finite interval.
  2. Odd functions on a symmetric interval are represented by sine series and even functions on a symmetric interval are represented by cosine series.

These two observations are related and are the subject of this section. We begin by defining a function f(x) on interval [0, L]. We have seen that the Fourier series representation of this function appears to converge to a periodic extension of the function.

Figure 5.10. This is a sketch of a function and its various extensions. The original function f(x) is defined on [0,1] and graphed in the upper left corner. To its right is the periodic extension, obtained by adding replicas. The two lower plots are obtained by first making the original function even or odd and then creating the periodic extensions of the new function.

In general, we obtain three different periodic representations. In order to distinguish these we will refer to them simply as the periodic, even and odd extensions. Now, starting with f(x) defined on [0, L], we would like to determine the Fourier series representations leading to these extensions. [For easy reference, the results are summarized in Table 5.2] We have already seen that the periodic extension of f(x) is obtained through the Fourier series representation in Equation (5.53).

Table 5.2. Fourier Cosine and Sine Series Representations on [0, L]

Fourier Series on [0, L]

\begin{aligned} f(x) & \sim \dfrac{a_{0}}{2}+\sum_{n=1}^{\infty}\left[a_{n} \cos \dfrac{2 n \pi x}{L}+b_{n} \sin \dfrac{2 n \pi x}{L}\right] . \\[4pt] a_{n} &=\dfrac{2}{L} \int_{0}^{L} f(x) \cos \dfrac{2 n \pi x}{L} d x . \quad n=0,1,2, \ldots, \\[4pt] b_{n} &=\dfrac{2}{L} \int_{0}^{L} f(x) \sin \dfrac{2 n \pi x}{L} d x . \quad n=1,2, \ldots \end{aligned} \nonumber

Fourier Cosine Series on [0, L]

f(x) \sim a_{0} / 2+\sum_{n=1}^{\infty} a_{n} \cos \dfrac{n \pi x}{L} . \nonumber

where

a_{n}=\dfrac{2}{L} \int_{0}^{L} f(x) \cos \dfrac{n \pi x}{L} d x . \quad n=0,1,2, \ldots \nonumber

Fourier Sine Series on [0, L]

f(x) \sim \sum_{n=1}^{\infty} b_{n} \sin \dfrac{n \pi x}{L} . \nonumber

where

b_{n}=\dfrac{2}{L} \int_{0}^{L} f(x) \sin \dfrac{n \pi x}{L} d x . \quad n=1,2, \ldots \nonumber

Given f(x) defined on [0, L], the even periodic extension is obtained by simply computing the Fourier series representation for the even function

f_{e}(x) \equiv\left\{\begin{array}{cc} f(x), & 0<x<L, \\[4pt] f(-x), & -L<x<0 \end{array}\right. \nonumber

Since f_{e}(x) is an even function on a symmetric interval [-L, L], we expect that the resulting Fourier series will not contain sine terms. Therefore, the series expansion will be given by [Use the general case in (5.51) with a=-L and b=L .] :

f_{e}(x) \sim \dfrac{a_{0}}{2}+\sum_{n=1}^{\infty} a_{n} \cos \dfrac{n \pi x}{L} . \nonumber

with Fourier coefficients

a_{n}=\dfrac{1}{L} \int_{-L}^{L} f_{e}(x) \cos \dfrac{n \pi x}{L} d x . \quad n=0,1,2, \ldots \nonumber

However, we can simplify this by noting that the integrand is even and the interval of integration can be replaced by [0, L]. On this interval f_{e}(x)=f(x). So, the Cosine Series Representation of f(x) for x \in[0, L] is given as

f(x) \sim \dfrac{a_{0}}{2}+\sum_{n=1}^{\infty} a_{n} \cos \dfrac{n \pi x}{L} . \nonumber

where

a_{n}=\dfrac{2}{L} \int_{0}^{L} f(x) \cos \dfrac{n \pi x}{L} d x . \quad n=0,1,2, \ldots \nonumber

Similarly, given f(x) defined on [0, L], the odd periodic extension is obtained by simply computing the Fourier series representation for the odd function

f_{o}(x) \equiv\left\{\begin{array}{cc} f(x), & 0<x<L, \\[4pt] -f(-x), & -L<x<0 \end{array}\right. \nonumber

The resulting series expansion leads to defining the Sine Series Representation of f(x) for x \in[0, L] as

f(x) \sim \sum_{n=1}^{\infty} b_{n} \sin \dfrac{n \pi x}{L} . \nonumber

where

b_{n}=\dfrac{2}{L} \int_{0}^{L} f(x) \sin \dfrac{n \pi x}{L} d x . \quad n=1,2, \ldots \nonumber

Example 5.10. In Figure 5.10 we actually provided plots of the various extensions of the function f(x)=x^{2} for x \in[0,1]. Let’s determine the representations of the periodic, even and odd extensions of this function.

For a change, we will use a CAS (Computer Algebra System) package to do the integrals. In this case we can use Maple. A general code for doing this for the periodic extension is shown in Table 5.3.

Example 5.11. Periodic Extension - Trigonometric Fourier Series

Using the above code, we have that a_{0}=\dfrac{2}{3}, a_{n}=\dfrac{1}{n^{2} \pi^{2}}, and b_{n}=-\dfrac{1}{n \pi}. Thus, the resulting series is given as

f(x) \sim \dfrac{1}{3}+\sum_{n=1}^{\infty}\left[\dfrac{1}{n^{2} \pi^{2}} \cos 2 n \pi x-\dfrac{1}{n \pi} \sin 2 n \pi x\right] \nonumber

Table 5.3. Maple code for computing Fourier coefficients and plotting partial sums of the Fourier series.

image
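The Maple code of Table 5.3 is reproduced only as an image above. As a rough stand-in (not the author's code), the following SymPy sketch computes the same coefficients for the periodic extension of Example 5.10; only the function f(x)=x^{2} and the interval [0,1] are taken from the text.

import sympy as sp

x = sp.symbols('x')
f = x**2                     # the function of Example 5.10
a, b = 0, 1                  # the interval [a, b]
L = b - a                    # length of the interval; the series has period L here

# a_0 and the first few Fourier coefficients from Theorem 5.9
a0 = 2*sp.integrate(f, (x, a, b))/L
print('a0 =', a0)                                   # expect 2/3
for k in range(1, 4):
    ak = 2*sp.integrate(f*sp.cos(2*k*sp.pi*x/L), (x, a, b))/L
    bk = 2*sp.integrate(f*sp.sin(2*k*sp.pi*x/L), (x, a, b))/L
    print(k, sp.simplify(ak), sp.simplify(bk))      # expect 1/(k**2*pi**2), -1/(k*pi)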

In Figure 5.11 we see the sum of the first 50 terms of this series. Generally, we see that the series seems to be converging to the periodic extension of f. There appear to be some problems with the convergence around integer values of x. We will later see that this is because of the discontinuities in the periodic extension; the resulting overshoot is referred to as the Gibbs phenomenon, which is discussed in the appendix.

Example 5.12. Even Periodic Extension - Cosine Series

In this case we compute a_{0}=\dfrac{2}{3} and a_{n}=\dfrac{4(-1)^{n}}{n^{2} \pi^{2}}. Therefore, we have

f(x) \sim \dfrac{1}{3}+\dfrac{4}{\pi^{2}} \sum_{n=1}^{\infty} \dfrac{(-1)^{n}}{n^{2}} \cos n \pi x . \nonumber

In Figure 5.12 we see the sum of the first 50 terms of this series. In this case the convergence seems to be much better than in the periodic extension case. We also see that it is converging to the even extension.

Example 5.13. Odd Periodic Extension - Sine Series

Finally, we look at the sine series for this function. We find that b_{n}=-\dfrac{2}{n^{3} \pi^{3}}\left(n^{2} \pi^{2}(-1)^{n}-2(-1)^{n}+2\right). Therefore, the sine series is

f(x) \sim-\dfrac{2}{\pi^{3}} \sum_{n=1}^{\infty} \dfrac{n^{2} \pi^{2}(-1)^{n}-2(-1)^{n}+2}{n^{3}} \sin n \pi x . \nonumber

image
Figure 5.11. The periodic extension of f(x)=x^{2} on [0,1].
image
Figure 5.12. The even periodic extension of f(x)=x^{2} on [0,1].

Once again we see discontinuities in the extension, as seen in Figure 5.13. However, we have verified that our sine series appears to be converging to the odd extension as we first sketched in Figure 5.10.
image
Figure 5.13. The odd periodic extension of f(x)=x^{2} on [0,1].

5.5 Appendix: The Gibbs Phenomenon

We have seen that jump discontinuities arise in the periodic extensions of our functions, whether the function originally had a discontinuity or developed one due to a mismatch in the values at the endpoints. This can be seen in Figures 5.9, 5.11 and 5.13. The Fourier series has a difficult time converging at the point of discontinuity, and these graphs of the Fourier series show a distinct overshoot which does not go away. This is called the Gibbs phenomenon and the amount of overshoot can be computed.

In one of our first examples, Example 5.6, we found the Fourier series representation of the piecewise defined function

f(x)=\left\{\begin{array}{c} 1, \quad 0<x<\pi \\[4pt] -1, \quad \pi<x<2 \pi \end{array}\right. \nonumber

to be

f(x) \sim \dfrac{4}{\pi} \sum_{k=1}^{\infty} \dfrac{\sin (2 k-1) x}{2 k-1} . \nonumber

In Figure 5.14 we display the sum of the first ten terms. Note the wiggles, overshoots and undershoots near x=0, \pm \pi. These are seen more clearly when we plot the representation for x \in[-3 \pi, 3 \pi], as shown in Figure 5.15. We note that the overshoots and undershoots occur at discontinuities in the periodic extension of f(x). These occur whenever f(x) has a discontinuity or if the values of f(x) at the endpoints of the domain do not agree.

One might expect that we only need to add more terms. In Figure 5.16 we show the sum for twenty terms. Note that the sum appears to converge better for points far from the discontinuities, but the overshoots and undershoots are still present. Figures 5.17 and 5.18 show magnified plots of the overshoot at x=0 for N=100 and N=500, respectively. We see that the overshoot persists. The peak is at about the same height, but its location seems to be getting closer to the origin. We will show how one can estimate the size of the overshoot.

image
Figure 5.14. The Fourier series representation of a step function on [-\pi, \pi] for N=10.

We can study the Gibbs phenomenon by looking at the partial sums of general Fourier trigonometric series for functions f(x) defined on the interval [-L, L]. Writing out the partial sums, inserting the Fourier coefficients and rearranging, we have

image
Figure 5.15. The Fourier series representation of a step function on [-\pi, \pi] for N=10 plotted on [-3 \pi, 3 \pi] displaying the periodicity.
image
Figure 5.16. The Fourier series representation of a step function on [-\pi, \pi] for N=20.
image
Figure 5.17. The Fourier series representation of a step function on [-\pi, \pi] for N= 100 .

\begin{aligned} S_{N}(x)=& \dfrac{a_{0}}{2}+\sum_{n=1}^{N}\left[a_{n} \cos \dfrac{n \pi x}{L}+b_{n} \sin \dfrac{n \pi x}{L}\right] \\[4pt] =& \dfrac{1}{2 L} \int_{-L}^{L} f(y) d y+\sum_{n=1}^{N}\left[\left(\dfrac{1}{L} \int_{-L}^{L} f(y) \cos \dfrac{n \pi y}{L} d y\right) \cos \dfrac{n \pi x}{L}\right.\\[4pt] &\left.+\left(\dfrac{1}{L} \int_{-L}^{L} f(y) \sin \dfrac{n \pi y}{L} d y\right) \sin \dfrac{n \pi x}{L}\right] \\[4pt] =& \dfrac{1}{L} \int_{-L}^{L}\left\{\dfrac{1}{2}+\sum_{n=1}^{N}\left(\cos \dfrac{n \pi y}{L} \cos \dfrac{n \pi x}{L}+\sin \dfrac{n \pi y}{L} \sin \dfrac{n \pi x}{L}\right)\right\} f(y) d y \\[4pt] =& \dfrac{1}{L} \int_{-L}^{L}\left\{\dfrac{1}{2}+\sum_{n=1}^{N} \cos \dfrac{n \pi(y-x)}{L}\right\} f(y) d y \\[4pt] \equiv & \dfrac{1}{L} \int_{-L}^{L} D_{N}(y-x) f(y) d y . \end{aligned} \nonumber

We have defined

D_{N}(x)=\dfrac{1}{2}+\sum_{n=1}^{N} \cos \dfrac{n \pi x}{L}, \nonumber

which is called the N-th Dirichlet Kernel. We now prove

image
Figure 5.18. The Fourier series representation of a step function on [-\pi, \pi] for N= 500 .

Proposition:

D_{n}(x)= \begin{cases}\dfrac{\sin \left(\left(n+\dfrac{1}{2}\right) \dfrac{\pi x}{L}\right)}{2 \sin \dfrac{\pi x}{2 L}}, & \sin \dfrac{\pi x}{2 L} \neq 0 \\[4pt] n+\dfrac{1}{2}, & \sin \dfrac{\pi x}{2 L}=0\end{cases} \nonumber

Proof: Let \theta=\dfrac{\pi x}{L} and multiply D_{n}(x) by 2 \sin \dfrac{\theta}{2} to obtain:

\begin{aligned} 2 \sin \dfrac{\theta}{2} D_{n}(x)=& 2 \sin \dfrac{\theta}{2}\left[\dfrac{1}{2}+\cos \theta+\cdots+\cos n \theta\right] \\[4pt] =& \sin \dfrac{\theta}{2}+2 \cos \theta \sin \dfrac{\theta}{2}+2 \cos 2 \theta \sin \dfrac{\theta}{2}+\cdots+2 \cos n \theta \sin \dfrac{\theta}{2} \\[4pt] =& \sin \dfrac{\theta}{2}+\left(\sin \dfrac{3 \theta}{2}-\sin \dfrac{\theta}{2}\right)+\left(\sin \dfrac{5 \theta}{2}-\sin \dfrac{3 \theta}{2}\right)+\cdots \\[4pt] &+\left[\sin \left(n+\dfrac{1}{2}\right) \theta-\sin \left(n-\dfrac{1}{2}\right) \theta\right] \\[4pt] =& \sin \left(n+\dfrac{1}{2}\right) \theta \end{aligned} \nonumber

Thus,

2 \sin \dfrac{\theta}{2} D_{n}(x)=\sin \left(n+\dfrac{1}{2}\right) \theta \nonumber

or if \sin \dfrac{\theta}{2} \neq 0,

D_{n}(x)=\dfrac{\sin \left(n+\dfrac{1}{2}\right) \theta}{2 \sin \dfrac{\theta}{2}}, \quad \theta=\dfrac{\pi x}{L} \nonumber

If \sin \dfrac{\theta}{2}=0, then one needs to apply L’Hospital’s Rule:

\begin{aligned} \lim _{\theta \rightarrow 2 m \pi} \dfrac{\sin \left(n+\dfrac{1}{2}\right) \theta}{2 \sin \dfrac{\theta}{2}} &=\lim _{\theta \rightarrow 2 m \pi} \dfrac{\left(n+\dfrac{1}{2}\right) \cos \left(n+\dfrac{1}{2}\right) \theta}{\cos \dfrac{\theta}{2}} \\[4pt] &=\dfrac{\left(n+\dfrac{1}{2}\right) \cos (2 m n \pi+m \pi)}{\cos m \pi} \\[4pt] &=n+\dfrac{1}{2} . \end{aligned} \nonumber
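As a quick numerical sanity check on the proposition (added here, not part of the text), one can compare the defining sum for D_{N} with the closed form at a few points; the values of N, L and the sample points below are arbitrary.

import numpy as np

def dirichlet_sum(x, N, L):
    n = np.arange(1, N + 1)
    return 0.5 + np.cos(np.outer(x, n)*np.pi/L).sum(axis=1)

def dirichlet_closed(x, N, L):
    return np.sin((N + 0.5)*np.pi*x/L)/(2*np.sin(np.pi*x/(2*L)))

x = np.linspace(0.1, 1.9, 7)   # stay away from x = 0, 2L, where sin(pi x/(2L)) = 0
print(np.allclose(dirichlet_sum(x, 10, 1.0), dirichlet_closed(x, 10, 1.0)))   # True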

We further note that D_{N}(x) is periodic with period 2 L and is an even function. So far, we have found that

S_{N}(x)=\dfrac{1}{L} \int_{-L}^{L} D_{N}(y-x) f(y) d y \nonumber

Now, make the substitution \xi=y-x. Then,

\begin{aligned} S_{N}(x) &=\dfrac{1}{L} \int_{-L-x}^{L-x} D_{N}(\xi) f(\xi+x) d \xi \\[4pt] &=\dfrac{1}{L} \int_{-L}^{L} D_{N}(\xi) f(\xi+x) d \xi \end{aligned} \nonumber

In the second integral we have made use of the fact that f(x) and D_{N}(x) are periodic with period 2 L and shifted the interval back to [-L, L].

Now split the integration and use the fact that D_{N}(x) is an even function. Then,

\begin{aligned} S_{N}(x) &=\dfrac{1}{L} \int_{-L}^{0} D_{N}(\xi) f(\xi+x) d \xi+\dfrac{1}{L} \int_{0}^{L} D_{N}(\xi) f(\xi+x) d \xi \\[4pt] &=\dfrac{1}{L} \int_{0}^{L}[f(x-\xi)+f(\xi+x)] D_{N}(\xi) d \xi \end{aligned} \nonumber

We can use this result to study the Gibbs phenomenon whenever it occurs. In particular, we will only concentrate on our earlier example. Namely,

f(x)=\left\{\begin{array}{c} 1, \quad 0<x<\pi \\[4pt] -1, \pi<x<2 \pi \end{array}\right. \nonumber

For this case, we have

S_{N}(x)=\dfrac{1}{\pi} \int_{0}^{\pi}[f(x-\xi)+f(\xi+x)] D_{N}(\xi) d \xi \nonumber

for

D_{N}(x)=\dfrac{1}{2}+\sum_{n=1}^{N} \cos n x \nonumber

Also, one can show that, for 0<x<\pi / 2,

f(x-\xi)+f(\xi+x)=\left\{\begin{array}{c} 2, \quad 0 \leq \xi<x \\[4pt] 0, \quad x \leq \xi<\pi-x \\[4pt] -2, \pi-x \leq \xi<\pi \end{array}\right. \nonumber

Thus, we have

\begin{aligned} S_{N}(x) &=\dfrac{2}{\pi} \int_{0}^{x} D_{N}(\xi) d \xi-\dfrac{2}{\pi} \int_{\pi-x}^{\pi} D_{N}(\xi) d \xi \\[4pt] &=\dfrac{2}{\pi} \int_{0}^{x} D_{N}(z) d z-\dfrac{2}{\pi} \int_{0}^{x} D_{N}(\pi-z) d z . \end{aligned} \nonumber

Here we made the substitution z=\pi-\xi in the second integral. The Dirichlet kernel in the proposition for L=\pi is given by

D_{N}(x)=\dfrac{\sin \left(N+\dfrac{1}{2}\right) x}{2 \sin \dfrac{x}{2}} . \nonumber

For N large, we have N+\dfrac{1}{2} \approx N, and for small x, we have \sin \dfrac{x}{2} \approx \dfrac{x}{2}. So, under these assumptions,

D_{N}(x) \approx \dfrac{\sin N x}{x} . \nonumber

Therefore,

S_{N}(x) \rightarrow \dfrac{2}{\pi} \int_{0}^{x} \dfrac{\sin N \xi}{\xi} d \xi . \nonumber

If we want to determine the locations of the minima and maxima, where the undershoot and overshoot occur, then we apply the first derivative test for extrema to S_{N}(x). Thus,

\dfrac{d}{d x} S_{N}(x)=\dfrac{2}{\pi} \dfrac{\sin N x}{x}=0 . \nonumber

The extrema occur for N x=m \pi, m=\pm 1, \pm 2, \ldots One can show that there is a maximum at x=\pi / N and a minimum for x=2 \pi / N. The value for the overshoot can be computed as

\begin{aligned} S_{N}(\pi / N) &=\dfrac{2}{\pi} \int_{0}^{\pi / N} \dfrac{\sin N \xi}{\xi} d \xi \\[4pt] &=\dfrac{2}{\pi} \int_{0}^{\pi} \dfrac{\sin t}{t} d t \\[4pt] &=\dfrac{2}{\pi} \operatorname{Si}(\pi) \\[4pt] &=1.178979744 \ldots . \end{aligned} \nonumber

Note that this value is independent of N and is given in terms of the sine integral,

\operatorname{Si}(x) \equiv \int_{0}^{x} \dfrac{\sin t}{t} d t \nonumber
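For a numerical check (added here for illustration), SciPy's sine integral reproduces the overshoot value quoted above.

import numpy as np
from scipy.special import sici

si_pi, _ = sici(np.pi)        # sici returns (Si(x), Ci(x))
print(2/np.pi*si_pi)          # approximately 1.178979744...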

Problems

5.1. Find the Fourier Series of each function f(x) of period 2 \pi. For each series, plot the N th partial sum,

S_{N}=\dfrac{a_{0}}{2}+\sum_{n=1}^{N}\left[a_{n} \cos n x+b_{n} \sin n x\right], \nonumber

for N=5,10,50 and describe the convergence (is it fast? what is it converging to, etc.) [Some simple Maple code for computing partial sums is shown below.]
a. f(x)=x,|x|<\pi.
b. f(x)=\dfrac{x^{2}}{4},|x|<\pi.
c. f(x)=\pi-|x|,|x|<\pi.
d. f(x)= \begin{cases}\dfrac{\pi}{2}, & 0<x<\pi \\[4pt] -\dfrac{\pi}{2}, & \pi<x<2 \pi\end{cases}
e. f(x)=\left\{\begin{array}{l}0,-\pi<x<0 \\[4pt] 1,0<x<\pi\end{array}\right.

A simple set of commands in Maple is shown below, where you fill in the Fourier coefficients that you have computed by hand and f(x) so that you can compare your results. Of course, other modifications may be needed.

image
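The Maple commands referred to above appear only as an image in this version. The Python skeleton below is an illustrative substitute (not the original code): fill in f(x), a_{0}, a_{n} and b_{n} with the values you computed by hand and compare the partial sum with the function.

import numpy as np
import matplotlib.pyplot as plt

def f(x):
    return x                  # fill in the function from the problem

a0 = 0.0                      # fill in your a_0
def a(n):
    return 0.0                # fill in your a_n
def b(n):
    return 0.0                # fill in your b_n

N = 10
x = np.linspace(-np.pi, np.pi, 400)
S = a0/2 + sum(a(n)*np.cos(n*x) + b(n)*np.sin(n*x) for n in range(1, N + 1))

plt.plot(x, f(x), label='f(x)')
plt.plot(x, S, label='S_N(x)')
plt.legend()
plt.show()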

5.2. Consider the function f(x)=4 \sin ^{3} 2 x

a. Derive an identity expressing \sin ^{3} \theta in terms of \sin \theta and \sin 3 \theta, and express f(x) in terms of simple sine functions.

b. Determine the Fourier coefficients of f(x) in a Fourier series expansion on [0,2 \pi] without computing any integrals!

5.3. Find the Fourier series of f(x)=x on the given interval with the given period T. Plot the N th partial sums and describe what you see.

a. 0<x<2, T=2.

b. -2<x<2, T=4.

5.4. The result in Problem 5.1b above gives a Fourier series representation of \dfrac{x^{2}}{4}. By picking the right value for x and a little rearrangement of the series, show that [See Example 5.8.]

a.

\dfrac{\pi^{2}}{6}=1+\dfrac{1}{2^{2}}+\dfrac{1}{3^{2}}+\dfrac{1}{4^{2}}+\cdots \nonumber

b.

\dfrac{\pi^{2}}{8}=1+\dfrac{1}{3^{2}}+\dfrac{1}{5^{2}}+\dfrac{1}{7^{2}}+\cdots \nonumber

5.5. Sketch (by hand) the graphs of each of the following functions over four periods. Then sketch the extensions of each of the functions as both an even and odd periodic function. Determine the corresponding Fourier sine and cosine series and verify the convergence to the desired function using Maple.
a. f(x)=x^{2}, 0<x<1.
b. f(x)=x(2-x), 0<x<2.
c. f(x)=\left\{\begin{array}{l}0,0<x<1 \text {, } \\[4pt] 1,1<x<2 \text {. }\end{array}\right.
d. f(x)=\left\{\begin{array}{c}\pi, \quad 0<x<\pi \\[4pt] 2 \pi-x, \pi<x<2 \pi\end{array}\right.

6 Sturm-Liouville Eigenvalue Problems

Introduction

In the last chapters we have explored the solution of boundary value problems that led to trigonometric eigenfunctions. Such functions can be used to represent functions in Fourier series expansions. We would like to generalize some of those techniques in order to solve other boundary value problems. A class of problems to which our previous examples belong, and which have eigenfunctions with similar properties, is the class of Sturm-Liouville eigenvalue problems. These problems involve self-adjoint (differential) operators which play an important role in the spectral theory of linear operators and the existence of the eigenfunctions we described in Section 4.3.2. These ideas will be introduced in this chapter.

In physics many problems arise in the form of boundary value problems involving second order ordinary differential equations. For example, we might want to solve the equation

a_{2}(x) y^{\prime \prime}+a_{1}(x) y^{\prime}+a_{0}(x) y=f(x) \nonumber

subject to boundary conditions. We can write such an equation in operator form by defining the differential operator

L=a_{2}(x) \dfrac{d^{2}}{d x^{2}}+a_{1}(x) \dfrac{d}{d x}+a_{0}(x) . \nonumber

Then, Equation (6.1) takes the form

L y=f \text {. } \nonumber

As we saw in the general boundary value problem (4.20) in Section 4.3.2, we can solve some equations using eigenvalue expansions. Namely, we seek solutions to the eigenvalue problem

L \phi=\lambda \phi \nonumber

with homogeneous boundary conditions and then seek a solution as an expansion of the eigenfunctions. Formally, we let

y=\sum_{n=1}^{\infty} c_{n} \phi_{n} \nonumber

However, we are not guaranteed a nice set of eigenfunctions. We need an appropriate set to form a basis in the function space. Also, it would be nice to have orthogonality so that we can easily solve for the expansion coefficients as was done in Section 4.3.2. [Otherwise, we would have to solve an infinite coupled system of algebraic equations instead of an uncoupled and diagonal system.]

It turns out that any linear second order operator can be turned into an operator that possesses just the right properties (self-adjointness) to carry out this procedure. The resulting operator is referred to as a Sturm-Liouville operator. We will highlight some of the properties of such operators and prove a few key theorems, though this will not be an extensive review of Sturm-Liouville theory. The interested reader can review the literature and more advanced texts for a more in-depth analysis.

We define the Sturm-Liouville operator as

\mathcal{L}=\dfrac{d}{d x} p(x) \dfrac{d}{d x}+q(x) \nonumber

The Sturm-Liouville eigenvalue problem is given by the differential equation

\mathcal{L} u=-\lambda \sigma(x) u, \nonumber

or, equivalently,

\dfrac{d}{d x}\left(p(x) \dfrac{d u}{d x}\right)+q(x) u+\lambda \sigma(x) u=0 \nonumber

for x \in(a, b). The functions p(x), p^{\prime}(x), q(x) and \sigma(x) are assumed to be continuous on (a, b) and p(x)>0, \sigma(x)>0 on [a, b]. If the interval is finite and these assumptions on the coefficients are true on [a, b], then the problem is said to be regular. Otherwise, it is called singular.

We also need to impose the set of homogeneous boundary conditions

\begin{array}{r} \alpha_{1} u(a)+\beta_{1} u^{\prime}(a)=0 \\[4pt] \alpha_{2} u(b)+\beta_{2} u^{\prime}(b)=0 \end{array} \nonumber

The \alpha ’s and \beta ’s are constants. For different values, one has special types of boundary conditions. For \beta_{i}=0, we have what are called Dirichlet boundary conditions. Namely, u(a)=0 and u(b)=0. For \alpha_{i}=0, we have Neumann boundary conditions. In this case, u^{\prime}(a)=0 and u^{\prime}(b)=0. In terms of the heat equation example, Dirichlet conditions correspond to maintaining a fixed temperature at the ends of the rod. The Neumann boundary conditions would correspond to no heat flow across the ends, or insulating conditions, as there would be no temperature gradient at those points. The more general boundary conditions allow for partially insulated boundaries.

Another type of boundary condition that is often encountered is the periodic boundary condition. Consider the heated rod that has been bent to form a circle. Then the two end points are physically the same. So, we would expect that the temperature and the temperature gradient should agree at those points. For this case we write u(a)=u(b) and u^{\prime}(a)=u^{\prime}(b). Boundary value problems using these conditions have to be handled differently than the above homogeneous conditions. These conditions lead to different types of eigenfunctions and eigenvalues.

As previously mentioned, equations of the form (6.1) occur often. We now show that Equation (6.1) can be turned into a differential equation of Sturm-Liouville form:

\dfrac{d}{d x}\left(p(x) \dfrac{d y}{d x}\right)+q(x) y=F(x) \nonumber

Another way to phrase this is provided in the theorem:

Theorem 6.1. Any second order linear operator can be put into the form of the Sturm-Liouville operator (6.2).

The proof of this is straightforward, as we shall soon show. Consider Equation (6.1). If a_{1}(x)=a_{2}^{\prime}(x), then we can write the equation in the form

\begin{aligned} f(x) &=a_{2}(x) y^{\prime \prime}+a_{1}(x) y^{\prime}+a_{0}(x) y \\[4pt] &=\left(a_{2}(x) y^{\prime}\right)^{\prime}+a_{0}(x) y \end{aligned} \nonumber

This is in the correct form. We just identify p(x)=a_{2}(x) and q(x)=a_{0}(x).

However, consider the differential equation

x^{2} y^{\prime \prime}+x y^{\prime}+2 y=0 . \nonumber

In this case a_{2}(x)=x^{2} and a_{2}^{\prime}(x)=2 x \neq a_{1}(x). The linear differential operator in this equation is not of Sturm-Liouville type. But, we can change it to a Sturm-Liouville operator.

In the Sturm-Liouville operator the derivative terms are gathered together into one perfect derivative. This is similar to what we saw in the first chapter when we solved linear first order equations. In that case we sought an integrating factor. We can do the same thing here. We seek a multiplicative function \mu(x) by which we can multiply (6.1) so that it can be written in Sturm-Liouville form. We first divide out the a_{2}(x), giving

y^{\prime \prime}+\dfrac{a_{1}(x)}{a_{2}(x)} y^{\prime}+\dfrac{a_{0}(x)}{a_{2}(x)} y=\dfrac{f(x)}{a_{2}(x)} . \nonumber

Now, we multiply the differential equation by \mu :

\mu(x) y^{\prime \prime}+\mu(x) \dfrac{a_{1}(x)}{a_{2}(x)} y^{\prime}+\mu(x) \dfrac{a_{0}(x)}{a_{2}(x)} y=\mu(x) \dfrac{f(x)}{a_{2}(x)} \nonumber

The first two terms can now be combined into an exact derivative \left(\mu y^{\prime}\right)^{\prime} if \mu(x) satisfies

\dfrac{d \mu}{d x}=\mu(x) \dfrac{a_{1}(x)}{a_{2}(x)} . \nonumber

This is formally solved to give

\mu(x)=e^{\int \dfrac{a_{1}(x)}{a_{2}(x)} d x} . \nonumber

Thus, the original equation can be multiplied by the factor

\dfrac{\mu(x)}{a_{2}(x)}=\dfrac{1}{a_{2}(x)} e^{\int \dfrac{a_{1}(x)}{a_{2}(x)} d x} \nonumber

to turn it into Sturm-Liouville form.

In summary,

\begin{aligned} &\text { Equation (6.1), } \\[4pt] &\qquad a_{2}(x) y^{\prime \prime}+a_{1}(x) y^{\prime}+a_{0}(x) y=f(x) \end{aligned} \nonumber

can be put into the Sturm-Liouville form

\dfrac{d}{d x}\left(p(x) \dfrac{d y}{d x}\right)+q(x) y=F(x) \nonumber

where

\begin{aligned} p(x) &=e^{\int \dfrac{a_{1}(x)}{a_{2}(x)} d x} \\[4pt] q(x) &=p(x) \dfrac{a_{0}(x)}{a_{2}(x)} \\[4pt] F(x) &=p(x) \dfrac{f(x)}{a_{2}(x)} \end{aligned} \nonumber

Example 6.2. For the example above,

x^{2} y^{\prime \prime}+x y^{\prime}+2 y=0 . \nonumber

We need only multiply this equation by

\dfrac{1}{x^{2}} e^{\int \dfrac{d x}{x}}=\dfrac{1}{x}, \nonumber

to put the equation in Sturm-Liouville form:

\begin{aligned} 0 &=x y^{\prime \prime}+y^{\prime}+\dfrac{2}{x} y \\[4pt] &=\left(x y^{\prime}\right)^{\prime}+\dfrac{2}{x} y \end{aligned} \nonumber
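The same conversion can be done symbolically. The SymPy sketch below (illustrative, not from the text) computes the integrating factor and the Sturm-Liouville coefficients p and q for the equation of Example 6.2.

import sympy as sp

x = sp.symbols('x', positive=True)
a2, a1, a0 = x**2, x, 2                           # x^2 y'' + x y' + 2 y = 0

p = sp.simplify(sp.exp(sp.integrate(a1/a2, x)))   # p = exp(int a1/a2 dx)
q = sp.simplify(p*a0/a2)                          # q = p a0/a2
factor = sp.simplify(p/a2)                        # multiply the equation by mu/a2 = p/a2
print(p, q, factor)                               # expect x, 2/x, 1/x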

Properties of Sturm-Liouville Eigenvalue Problems

There are several properties that can be proven for the (regular) Sturm-Liouville eigenvalue problem. However, we will not prove them all here. We will merely list some of the important facts and focus on a few of the properties.

  1. The eigenvalues are real, countable, ordered and there is a smallest eigenvalue. Thus, we can write them as \lambda_{1}<\lambda_{2}<\ldots. However, there is no largest eigenvalue and, as n \rightarrow \infty, \lambda_{n} \rightarrow \infty.
  2. For each eigenvalue \lambda_{n} there exists an eigenfunction \phi_{n} with n-1 zeros on (a, b).
  3. Eigenfunctions corresponding to different eigenvalues are orthogonal with respect to the weight function, \sigma(x). Defining the inner product of f(x) and g(x) as

<f, g>=\int_{a}^{b} f(x) g(x) \sigma(x) d x \nonumber

then the orthogonality of the eigenfunctions can be written in the form

<\phi_{n}, \phi_{m}>=<\phi_{n}, \phi_{n}>\delta_{n m}, \quad n, m=1,2, \ldots \nonumber

  4. The set of eigenfunctions is complete; i.e., any piecewise smooth function can be represented by a generalized Fourier series expansion of the eigenfunctions,

f(x) \sim \sum_{n=1}^{\infty} c_{n} \phi_{n}(x) \nonumber

where

c_{n}=\dfrac{<f, \phi_{n}>}{<\phi_{n}, \phi_{n}>} \nonumber

Actually, one needs f(x) \in L_{\sigma}^{2}[a, b], the set of square integrable functions over [a, b] with weight function \sigma(x). By square integrable, we mean that <f, f><\infty. One can show that such a space is isomorphic to a Hilbert space, a complete inner product space.

  5. Multiply the eigenvalue problem

\mathcal{L} \phi_{n}=-\lambda_{n} \sigma(x) \phi_{n} \nonumber

by \phi_{n} and integrate. Solve this result for \lambda_{n}, to find the Rayleigh Quotient

\lambda_{n}=\dfrac{-\left.p \phi_{n} \dfrac{d \phi_{n}}{d x}\right|_{a} ^{b}+\int_{a}^{b}\left[p\left(\dfrac{d \phi_{n}}{d x}\right)^{2}-q \phi_{n}^{2}\right] d x}{<\phi_{n}, \phi_{n}>} \nonumber

The Rayleigh quotient is useful for getting estimates of eigenvalues and proving some of the other properties.

Example 6.3. We seek the eigenfunctions of the operator found in Example 6.2. Namely, we want to solve the eigenvalue problem

\mathcal{L} y=\left(x y^{\prime}\right)^{\prime}+\dfrac{2}{x} y=-\lambda \sigma y \nonumber

subject to a set of boundary conditions. Let’s use the boundary conditions

y^{\prime}(1)=0, \quad y^{\prime}(2)=0 \nonumber

[Note that we do not know \sigma(x) yet, but will choose an appropriate function to obtain solutions.]

Expanding the derivative, we have

x y^{\prime \prime}+y^{\prime}+\dfrac{2}{x} y=-\lambda \sigma y . \nonumber

Multiply through by x to obtain

x^{2} y^{\prime \prime}+x y^{\prime}+(2+\lambda x \sigma) y=0 \nonumber

Notice that if we choose \sigma(x)=x^{-1}, then this equation can be made a Cauchy-Euler type equation. Thus, we have

x^{2} y^{\prime \prime}+x y^{\prime}+(\lambda+2) y=0 . \nonumber

The characteristic equation is

r^{2}+\lambda+2=0 . \nonumber

For oscillatory solutions, we need \lambda+2>0. Thus, the general solution is

y(x)=c_{1} \cos (\sqrt{\lambda+2} \ln |x|)+c_{2} \sin (\sqrt{\lambda+2} \ln |x|) . \nonumber

Next we apply the boundary conditions. y^{\prime}(1)=0 forces c_{2}=0. This leaves

y(x)=c_{1} \cos (\sqrt{\lambda+2} \ln x) . \nonumber

The second condition, y^{\prime}(2)=0, yields

\sin (\sqrt{\lambda+2} \ln 2)=0 \nonumber

This will give nontrivial solutions when

\sqrt{\lambda+2} \ln 2=n \pi, \quad n=0,1,2,3 \ldots \nonumber

In summary, the eigenfunctions for this eigenvalue problem are

y_{n}(x)=\cos \left(\dfrac{n \pi}{\ln 2} \ln x\right), \quad 1 \leq x \leq 2 \nonumber

and the eigenvalues are \lambda_{n}=\left(\dfrac{n \pi}{\ln 2}\right)^{2}-2 for n=0,1,2, \ldots

Note: We include the n=0 case because y(x)= constant is a solution of the \lambda=-2 case. More specifically, in this case the characteristic equation reduces to r^{2}=0. Thus, the general solution of this Cauchy-Euler equation is

y(x)=c_{1}+c_{2} \ln |x| \nonumber

Setting y^{\prime}(1)=0 forces c_{2}=0, and y^{\prime}(2)=0 is then automatically satisfied, leaving the solution in this case as y(x)=c_{1}.

We note that some of the properties listed in the beginning of the section hold for this example. The eigenvalues are seen to be real, countable and ordered. There is a smallest one, \lambda_{0}=-2. Next, one can count the zeros of each eigenfunction on (1,2). The argument of the cosine, \dfrac{n \pi}{\ln 2} \ln x, takes values 0 to n \pi for x \in[1,2], so the cosine function has n zeros on this interval. Counting from the smallest eigenvalue, the kth eigenfunction therefore has k-1 zeros, in agreement with the second property.

Orthogonality can be checked as well. We set up the integral and use the substitution y=\pi \ln x / \ln 2. This gives

\begin{aligned} <y_{n}, y_{m}>&=\int_{1}^{2} \cos \left(\dfrac{n \pi}{\ln 2} \ln x\right) \cos \left(\dfrac{m \pi}{\ln 2} \ln x\right) \dfrac{d x}{x} \\[4pt] &=\dfrac{\ln 2}{\pi} \int_{0}^{\pi} \cos n y \cos m y d y \\[4pt] &=\dfrac{\ln 2}{2} \delta_{n, m} \end{aligned} \nonumber
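This orthogonality relation is easy to confirm numerically; the following check (added here for illustration) uses SciPy quadrature.

import numpy as np
from scipy.integrate import quad

def y(n, x):
    return np.cos(n*np.pi*np.log(x)/np.log(2))

def inner(n, m):
    # inner product with weight sigma(x) = 1/x on [1, 2]
    val, _ = quad(lambda x: y(n, x)*y(m, x)/x, 1, 2)
    return val

print(inner(2, 3))                 # approximately 0
print(inner(2, 2), np.log(2)/2)    # both approximately 0.3466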

Adjoint Operators

In the study of the spectral theory of matrices, one learns about the adjoint of a matrix, A^{\dagger}, and the role that self-adjoint, or Hermitian, matrices play in diagonalization. Also, one needs the concept of the adjoint to discuss the existence of solutions to the matrix problem \mathbf{y}=A \mathbf{x}. In the same spirit, one is interested in the existence of solutions of the operator equation L u=f and solutions of the corresponding eigenvalue problem. The study of linear operators on Hilbert spaces is a generalization of what the reader has seen in a linear algebra course.

Just as one can find a basis of eigenvectors and diagonalize Hermitian, or self-adjoint, matrices (or, real symmetric matrices in the real case), we will see that the Sturm-Liouville operator is self-adjoint. In this section we will define the domain of an operator and introduce the notion of adjoint operators. In the last section we discuss the role the adjoint plays in the existence of solutions to the operator equation L u=f.

We first introduce some definitions.

Definition 6.4. The domain of a differential operator L is the set of all u \in L_{\sigma}^{2}[a, b] satisfying a given set of homogeneous boundary conditions.

Definition 6.5. The adjoint, L^{\dagger}, of operator L satisfies

<u, L v>=<L^{\dagger} u, v> \nonumber

for all v in the domain of L and u in the domain of L^{\dagger}.

Example 6.6. As an example, we find the adjoint of the second order linear differential operator L=a_{2}(x) \dfrac{d^{2}}{d x^{2}}+a_{1}(x) \dfrac{d}{d x}+a_{0}(x).

In order to find the adjoint, we place the operator under an integral. So, we consider the inner product

<u, L v>=\int_{a}^{b} u\left(a_{2} v^{\prime \prime}+a_{1} v^{\prime}+a_{0} v\right) d x \nonumber

We have to move the operator L from v and determine what operator is acting on u in order to formally preserve the inner product. For a simple operator like L=\dfrac{d}{d x}, this is easily done using integration by parts. For the given operator, we will need to apply several integrations by parts, so we consider the individual terms.

First we consider the a_{1} v^{\prime} term. Integration by parts yields

\int_{a}^{b} u(x) a_{1}(x) v^{\prime}(x) d x=\left.a_{1}(x) u(x) v(x)\right|_{a} ^{b}-\int_{a}^{b}\left(u(x) a_{1}(x)\right)^{\prime} v(x) d x \nonumber

Now, we consider the a_{2} v^{\prime \prime} term. In this case it will take two integrations by parts:

\begin{aligned} \int_{a}^{b} u(x) a_{2}(x) v^{\prime \prime}(x) d x &=\left.a_{2}(x) u(x) v^{\prime}(x)\right|_{a} ^{b}-\int_{a}^{b}\left(u(x) a_{2}(x)\right)^{\prime} v^{\prime}(x) d x \\[4pt] &=\left.\left[a_{2}(x) u(x) v^{\prime}(x)-\left(a_{2}(x) u(x)\right)^{\prime} v(x)\right]\right|_{a} ^{b}+\int_{a}^{b}\left(u(x) a_{2}(x)\right)^{\prime \prime} v(x) d x \end{aligned} \nonumber

Combining these results, we obtain

\begin{aligned} <u, L v>&=\int_{a}^{b} u\left(a_{2} v^{\prime \prime}+a_{1} v^{\prime}+a_{0} v\right) d x \\[4pt] =& {\left.\left[a_{1}(x) u(x) v(x)+a_{2}(x) u(x) v^{\prime}(x)-\left(a_{2}(x) u(x)\right)^{\prime} v(x)\right]\right|_{a} ^{b} } \\[4pt] &+\int_{a}^{b}\left[\left(a_{2} u\right)^{\prime \prime}-\left(a_{1} u\right)^{\prime}+a_{0} u\right] v d x \end{aligned} \nonumber

Inserting the boundary conditions for v, one has to determine boundary conditions for u such that

\left.\left[a_{1}(x) u(x) v(x)+a_{2}(x) u(x) v^{\prime}(x)-\left(a_{2}(x) u(x)\right)^{\prime} v(x)\right]\right|_{a} ^{b}=0 \nonumber

This leaves

<u, L v>=\int_{a}^{b}\left[\left(a_{2} u\right)^{\prime \prime}-\left(a_{1} u\right)^{\prime}+a_{0} u\right] v d x \equiv<L^{\dagger} u, v>. \nonumber

Therefore,

L^{\dagger}=\dfrac{d^{2}}{d x^{2}} a_{2}(x)-\dfrac{d}{d x} a_{1}(x)+a_{0}(x) \nonumber

When L^{\dagger}=L, the operator is called formally self-adjoint. When the domain of L is the same as the domain of L^{\dagger}, the term self-adjoint is used. As the domain is important in establishing self-adjointness, we need to do a complete example in which the domain of the adjoint is found.

Example 6.7. Determine L^{\dagger} and its domain for operator L u=\dfrac{d u}{d x} where u satisfies the boundary conditions u(0)=2 u(1) on [0,1].

We need to find the adjoint operator satisfying <v, L u>=<L^{\dagger} v, u>. Therefore, we rewrite the integral

<v, L u>=\int_{0}^{1} v \dfrac{d u}{d x} d x=\left.u v\right|_{0} ^{1}-\int_{0}^{1} u \dfrac{d v}{d x} d x=<L^{\dagger} v, u>\text {. } \nonumber

From this we have the adjoint problem consisting of an adjoint operator and the associated boundary condition:

  1. L^{\dagger}=-\dfrac{d}{d x}
  2. \left.u v\right|_{0} ^{1}=0 \Rightarrow 0=u(1)[v(1)-2 v(0)] \Rightarrow v(1)=2 v(0)

Lagrange’s and Green’s Identities

Before turning to the proofs that the eigenvalues of a Sturm-Liouville problem are real and the associated eigenfunctions orthogonal, we will first need to introduce two important identities. For the Sturm-Liouville operator,

\mathcal{L}=\dfrac{d}{d x}\left(p \dfrac{d}{d x}\right)+q \nonumber

we have the two identities:

Lagrange’s Identity u \mathcal{L} v-v \mathcal{L} u=\left[p\left(u v^{\prime}-v u^{\prime}\right)\right]^{\prime}.

Green’s Identity \int_{a}^{b}(u \mathcal{L} v-v \mathcal{L} u) d x=\left.\left[p\left(u v^{\prime}-v u^{\prime}\right)\right]\right|_{a} ^{b}.

Proof. The proof of Lagrange’s identity follows by a simple manipulation of the operator:

\begin{aligned} u \mathcal{L} v-v \mathcal{L} u &=u\left[\dfrac{d}{d x}\left(p \dfrac{d v}{d x}\right)+q v\right]-v\left[\dfrac{d}{d x}\left(p \dfrac{d u}{d x}\right)+q u\right] \\[4pt] &=u \dfrac{d}{d x}\left(p \dfrac{d v}{d x}\right)-v \dfrac{d}{d x}\left(p \dfrac{d u}{d x}\right) \\[4pt] &=u \dfrac{d}{d x}\left(p \dfrac{d v}{d x}\right)+p \dfrac{d u}{d x} \dfrac{d v}{d x}-v \dfrac{d}{d x}\left(p \dfrac{d u}{d x}\right)-p \dfrac{d u}{d x} \dfrac{d v}{d x} \\[4pt] &=\dfrac{d}{d x}\left[p u \dfrac{d v}{d x}-p v \dfrac{d u}{d x}\right] \end{aligned} \nonumber

Green’s identity is simply proven by integrating Lagrange’s identity.
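Lagrange's identity can also be checked symbolically. The short SymPy verification below (not part of the original text) shows that the difference of the two sides vanishes for arbitrary p, q, u and v.

import sympy as sp

x = sp.symbols('x')
p, q, u, v = (sp.Function(s)(x) for s in ('p', 'q', 'u', 'v'))

Lu = sp.diff(p*sp.diff(u, x), x) + q*u    # Sturm-Liouville operator applied to u
Lv = sp.diff(p*sp.diff(v, x), x) + q*v    # ... and to v

lhs = u*Lv - v*Lu
rhs = sp.diff(p*(u*sp.diff(v, x) - v*sp.diff(u, x)), x)
print(sp.simplify(lhs - rhs))             # 0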

Orthogonality and Reality

We are now ready to prove that the eigenvalues of a Sturm-Liouville problem are real and the corresponding eigenfunctions are orthogonal. These are easily established using Green’s identity, which in turn is a statement about the Sturm-Liouville operator being self-adjoint.

Theorem 6.8. The eigenvalues of the Sturm-Liouville problem are real.

Proof. Let \phi_{n}(x) be a solution of the eigenvalue problem associated with \lambda_{n} :

\mathcal{L} \phi_{n}=-\lambda_{n} \sigma \phi_{n} . \nonumber

The complex conjugate of this equation is

\mathcal{L} \bar{\phi}_{n}=-\bar{\lambda}_{n} \sigma \bar{\phi}_{n} . \nonumber

Now, multiply the first equation by \bar{\phi}_{n} and the second equation by \phi_{n} and then subtract the results. We obtain

\bar{\phi}_{n} \mathcal{L} \phi_{n}-\phi_{n} \mathcal{L} \bar{\phi}_{n}=\left(\bar{\lambda}_{n}-\lambda_{n}\right) \sigma \phi_{n} \bar{\phi}_{n} \nonumber

Integrate both sides of this equation:

\int_{a}^{b}\left(\bar{\phi}_{n} \mathcal{L} \phi_{n}-\phi_{n} \mathcal{L} \bar{\phi}_{n}\right) d x=\left(\bar{\lambda}_{n}-\lambda_{n}\right) \int_{a}^{b} \sigma \phi_{n} \bar{\phi}_{n} d x \nonumber

Apply Green’s identity to the left hand side to find

\left.\left[p\left(\bar{\phi}_{n} \phi_{n}^{\prime}-\phi_{n} \bar{\phi}_{n}^{\prime}\right)\right]\right|_{a} ^{b}=\left(\bar{\lambda}_{n}-\lambda_{n}\right) \int_{a}^{b} \sigma \phi_{n} \bar{\phi}_{n} d x \nonumber

Using the homogeneous boundary conditions for a self-adjoint operator, the left side vanishes to give

0=\left(\bar{\lambda}_{n}-\lambda_{n}\right) \int_{a}^{b} \sigma\left|\phi_{n}\right|^{2} d x \nonumber

The integral is positive (since \sigma>0 and \phi_{n} is nontrivial), so we must have \bar{\lambda}_{n}=\lambda_{n}. Therefore, the eigenvalues are real.

Theorem 6.9. The eigenfunctions corresponding to different eigenvalues of the Sturm-Liouville problem are orthogonal.

Proof. This is proven similar to the last theorem. Let \phi_{n}(x) be a solution of the eigenvalue problem associated with \lambda_{n}

\mathcal{L} \phi_{n}=-\lambda_{n} \sigma \phi_{n}, \nonumber

and let \phi_{m}(x) be a solution of the eigenvalue problem associated with \lambda_{m} \neq \lambda_{n}

\mathcal{L} \phi_{m}=-\lambda_{m} \sigma \phi_{m}, \nonumber

Now, multiply the first equation by \phi_{m} and the second equation by \phi_{n}. Subtracting the results, we obtain

\phi_{m} \mathcal{L} \phi_{n}-\phi_{n} \mathcal{L} \phi_{m}=\left(\lambda_{m}-\lambda_{n}\right) \sigma \phi_{n} \phi_{m} \nonumber

Similar to the previous proof, we integrate both sides of the equation and use Green’s identity and the boundary conditions for a self-adjoint operator. This leaves

0=\left(\lambda_{m}-\lambda_{n}\right) \int_{a}^{b} \sigma \phi_{n} \phi_{m} d x . \nonumber

Since the eigenvalues are distinct, we can divide by \lambda_{m}-\lambda_{n}, leaving the desired result,

\int_{a}^{b} \sigma \phi_{n} \phi_{m} d x=0 . \nonumber

Therefore, the eigenfunctions are orthogonal with respect to the weight function \sigma(x).

The Rayleigh Quotient

The Rayleigh quotient is useful for getting estimates of eigenvalues and proving some of the other properties associated with Sturm-Liouville eigenvalue problems. We begin by multiplying the eigenvalue problem

\mathcal{L} \phi_{n}=-\lambda_{n} \sigma(x) \phi_{n} \nonumber

by \phi_{n} and integrating. This gives

\int_{a}^{b}\left[\phi_{n} \dfrac{d}{d x}\left(p \dfrac{d \phi_{n}}{d x}\right)+q \phi_{n}^{2}\right] d x=-\lambda \int_{a}^{b} \phi_{n}^{2} \sigma d x \nonumber

One can solve the last equation for \lambda to find

\lambda=\dfrac{-\int_{a}^{b}\left[\phi_{n} \dfrac{d}{d x}\left(p \dfrac{d \phi_{n}}{d x}\right)+q \phi_{n}^{2}\right] d x}{\int_{a}^{b} \phi_{n}^{2} \sigma d x} \nonumber

It appears that we have solved for the eigenvalue and have not needed the machinery we had developed in Chapter 4 for studying boundary value problems. However, we really cannot evaluate this expression because we do not know the eigenfunctions, \phi_{n}(x) yet. Nevertheless, we will see what we can determine.

One can rewrite this result by performing an integration by parts on the first term in the numerator. Namely, pick u=\phi_{n} and d v=\dfrac{d}{d x}\left(p \dfrac{d \phi_{n}}{d x}\right) d x for the standard integration by parts formula. Then, we have

\int_{a}^{b} \phi_{n} \dfrac{d}{d x}\left(p \dfrac{d \phi_{n}}{d x}\right) d x=\left.p \phi_{n} \dfrac{d \phi_{n}}{d x}\right|_{a} ^{b}-\int_{a}^{b} p\left(\dfrac{d \phi_{n}}{d x}\right)^{2} d x \nonumber

Inserting the new formula into the expression for \lambda leads to the Rayleigh Quotient

\lambda_{n}=\dfrac{-\left.p \phi_{n} \dfrac{d \phi_{n}}{d x}\right|_{a} ^{b}+\int_{a}^{b}\left[p\left(\dfrac{d \phi_{n}}{d x}\right)^{2}-q \phi_{n}^{2}\right] d x}{\int_{a}^{b} \phi_{n}^{2} \sigma d x} . \nonumber

In many applications the sign of the eigenvalue is important. As we saw in the solution of the heat equation, the time dependence satisfies T^{\prime}+k \lambda T=0. Since we expect the heat energy to diffuse, the solutions should decay in time. Thus, we would expect \lambda>0. In studying the wave equation, one expects vibrations, and these are only possible with the correct sign of the eigenvalue (positive again). Thus, in order to have nonnegative eigenvalues, we see from (6.21) that

a. q(x) \leq 0, and

b. -\left.p \phi_{n} \dfrac{d \phi_{n}}{d x}\right|_{a} ^{b} \geq 0

Furthermore, if \lambda is a zero eigenvalue, then q(x) \equiv 0 and \alpha_{1}=\alpha_{2}=0 in the homogeneous boundary conditions. This can be seen by setting the numerator equal to zero. Then, q(x)=0 and \phi_{n}^{\prime}(x)=0. The second of these conditions inserted into the boundary conditions forces the restriction on the type of boundary conditions.

One of the (unproven here) properties of Sturm-Liouville eigenvalue problems with homogeneous boundary conditions is that the eigenvalues are ordered, \lambda_{1}<\lambda_{2}<\ldots. Thus, there is a smallest eigenvalue. It turns out that for any continuous function, y(x)

\lambda_{1}=\min _{y(x)} \dfrac{-\left.p y \dfrac{d y}{d x}\right|_{a} ^{b}+\int_{a}^{b}\left[p\left(\dfrac{d y}{d x}\right)^{2}-q y^{2}\right] d x}{\int_{a}^{b} y^{2} \sigma d x} \nonumber

and this minimum is obtained when y(x)=\phi_{1}(x). This result can be used to get estimates of the minimum eigenvalue by using trial functions which are continuous and satisfy the boundary conditions, but do not necessarily satisfy the differential equation.

Example 6.10. We have already solved the eigenvalue problem \phi^{\prime \prime}+\lambda \phi=0, \phi(0)=0, \phi(1)=0. In this case, the lowest eigenvalue is \lambda_{1}=\pi^{2}. We can pick a nice function satisfying the boundary conditions, say y(x)=x-x^{2}. Inserting this into Equation (6.22), we find

\lambda_{1} \leq \dfrac{\int_{0}^{1}(1-2 x)^{2} d x}{\int_{0}^{1}\left(x-x^{2}\right)^{2} d x}=10 \nonumber

Indeed, 10 \geq \pi^{2} \approx 9.87.
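The estimate of Example 6.10 is easy to reproduce; the SymPy snippet below (illustrative) evaluates the Rayleigh quotient for the trial function y = x - x^{2}, for which p = 1, q = 0, sigma = 1 and the boundary term vanishes.

import sympy as sp

x = sp.symbols('x')
y = x - x**2                                      # trial function with y(0) = y(1) = 0
num = sp.integrate(sp.diff(y, x)**2, (x, 0, 1))   # int (y')^2 dx
den = sp.integrate(y**2, (x, 0, 1))               # int y^2 dx
print(num/den, float(sp.pi**2))                   # 10 versus pi^2 = 9.87...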

The Eigenfunction Expansion Method

In Section 4.3.2 we saw generally how one can use the eigenfunctions of a differential operator to solve a nonhomogeneous boundary value problem. In this chapter we have seen that Sturm-Liouville eigenvalue problems have the requisite set of orthogonal eigenfunctions. In this section we will apply the eigenfunction expansion method to solve a particular nonhomogeneous boundary value problem.

Recall that one starts with a nonhomogeneous differential equation

\mathcal{L} y=f, \nonumber

where y(x) is to satisfy given homogeneous boundary conditions. The method makes use of the eigenfunctions satisfying the eigenvalue problem

\mathcal{L} \phi_{n}=-\lambda_{n} \sigma \phi_{n} \nonumber

subject to the given boundary conditions. Then, one assumes that y(x) can be written as an expansion in the eigenfunctions,

y(x)=\sum_{n=1}^{\infty} c_{n} \phi_{n}(x), \nonumber

and inserts the expansion into the nonhomogeneous equation. This gives

f(x)=\mathcal{L}\left(\sum_{n=1}^{\infty} c_{n} \phi_{n}(x)\right)=-\sum_{n=1}^{\infty} c_{n} \lambda_{n} \sigma(x) \phi_{n}(x) \nonumber

The expansion coefficients are then found by making use of the orthogonality of the eigenfunctions. Namely, we multiply the last equation by \phi_{m}(x) and integrate. We obtain

\int_{a}^{b} f(x) \phi_{m}(x) d x=-\sum_{n=1}^{\infty} c_{n} \lambda_{n} \int_{a}^{b} \phi_{n}(x) \phi_{m}(x) \sigma(x) d x \nonumber

Orthogonality yields

\int_{a}^{b} f(x) \phi_{m}(x) d x=-c_{m} \lambda_{m} \int_{a}^{b} \phi_{m}^{2}(x) \sigma(x) d x \nonumber

Solving for c_{m}, we have

c_{m}=-\dfrac{\int_{a}^{b} f(x) \phi_{m}(x) d x}{\lambda_{m} \int_{a}^{b} \phi_{m}^{2}(x) \sigma(x) d x} . \nonumber

Example 6.11. As an example, we consider the solution of the boundary value problem

\begin{aligned} \left(x y^{\prime}\right)^{\prime}+\dfrac{y}{x} &=\dfrac{1}{x}, \quad x \in[1, e], \\[4pt] y(1) &=0=y(e) . \end{aligned} \nonumber

This equation is already in self-adjoint form. So, we know that the associated Sturm-Liouville eigenvalue problem has an orthogonal set of eigenfunctions. We first determine this set. Namely, we need to solve

\left(x \phi^{\prime}\right)^{\prime}+\dfrac{\phi}{x}=-\lambda \sigma \phi, \quad \phi(1)=0=\phi(e) . \nonumber

Rearranging the terms and multiplying by x, we have that

x^{2} \phi^{\prime \prime}+x \phi^{\prime}+(1+\lambda \sigma x) \phi=0 . \nonumber

This is almost an equation of Cauchy-Euler type. Picking the weight function \sigma(x)=\dfrac{1}{x}, we have

x^{2} \phi^{\prime \prime}+x \phi^{\prime}+(1+\lambda) \phi=0 . \nonumber

This is easily solved. The characteristic equation is

r^{2}+(1+\lambda)=0 . \nonumber

One obtains nontrivial solutions of the eigenvalue problem satisfying the boundary conditions when \lambda>-1. The solutions are

\phi_{n}(x)=A \sin (n \pi \ln x), \quad n=1,2, \ldots \nonumber

where \lambda_{n}=n^{2} \pi^{2}-1

It is often useful to normalize the eigenfunctions. This means that one chooses A so that the norm of each eigenfunction is one. Thus, we have

\begin{aligned} 1 &=\int_{1}^{e} \phi_{n}(x)^{2} \sigma(x) d x \\[4pt] &=A^{2} \int_{1}^{e} \sin ^{2}(n \pi \ln x) \dfrac{1}{x} d x \\[4pt] &=A^{2} \int_{0}^{1} \sin ^{2}(n \pi y) d y=\dfrac{1}{2} A^{2} \end{aligned} \nonumber

Thus, A=\sqrt{2}

We now turn towards solving the nonhomogeneous problem, \mathcal{L} y=\dfrac{1}{x}. We first expand the unknown solution in terms of the eigenfunctions,

y(x)=\sum_{n=1}^{\infty} c_{n} \sqrt{2} \sin (n \pi \ln x) . \nonumber

Inserting this solution into the differential equation, we have

\dfrac{1}{x}=\mathcal{L} y=-\sum_{n=1}^{\infty} c_{n} \lambda_{n} \sqrt{2} \sin (n \pi \ln x) \dfrac{1}{x} \nonumber

Next, we make use of orthogonality. Multiplying both sides by \phi_{m}(x)= \sqrt{2} \sin (m \pi \ln x) and integrating, gives

\lambda_{m} c_{m}=-\int_{1}^{e} \sqrt{2} \sin (m \pi \ln x) \dfrac{1}{x} d x=\dfrac{\sqrt{2}}{m \pi}\left[(-1)^{m}-1\right] . \nonumber

Solving for c_{m}, we have

c_{m}=\dfrac{\sqrt{2}}{m \pi} \dfrac{\left[(-1)^{m}-1\right]}{m^{2} \pi^{2}-1} . \nonumber

Finally, we insert our coefficients into the expansion for y(x). The solution is then

y(x)=\sum_{n=1}^{\infty} \dfrac{2}{n \pi} \dfrac{\left[(-1)^{n}-1\right]}{n^{2} \pi^{2}-1} \sin (n \pi \ln (x)) . \nonumber
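One can check this expansion numerically. The sketch below (not from the text) sums the series and compares it with a closed-form solution of the boundary value problem obtained by solving x^{2} y^{\prime \prime}+x y^{\prime}+y=1 directly; that closed form is my own computation and should be verified independently.

import numpy as np

def series(x, N=200):
    n = np.arange(1, N + 1)
    c = 2/(n*np.pi)*((-1.0)**n - 1)/(n**2*np.pi**2 - 1)
    return np.sin(np.outer(np.log(x), n)*np.pi) @ c

def exact(x):
    # y = 1 - cos(ln x) + (cos 1 - 1) sin(ln x)/sin 1 (derived by hand, not from the text)
    return 1 - np.cos(np.log(x)) + (np.cos(1) - 1)*np.sin(np.log(x))/np.sin(1)

x = np.linspace(1, np.e, 9)
print(np.max(np.abs(series(x) - exact(x))))   # small (roughly 1e-6)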

The Fredholm Alternative Theorem

Given that L y=f, when can one expect to find a solution? Is it unique? These questions are answered by the Fredholm Alternative Theorem. This theorem occurs in many forms, from a statement about solutions to systems of algebraic equations to statements about solutions of boundary value problems and integral equations. The theorem comes in two parts, thus the term "alternative". Either the equation has exactly one solution for all f, or the equation has many solutions for some f 's and none for the rest.

The reader is familiar with the statements of the Fredholm Alternative for the solution of systems of algebraic equations. One seeks solutions of the system A x=b for A an n \times m matrix. Defining the matrix adjoint, A^{*}, through <A x, y>=<x, A^{*} y> for all x \in \mathcal{C}^{m} and y \in \mathcal{C}^{n}, then either

Theorem 6.12. First Alternative

The equation A x=b has a solution if and only if <b, v>=0 for all v such that A^{*} v=0.

Theorem 6.13. Second Alternative

A solution of A x=b, if it exists, is unique if and only if x=0 is the only solution of A x=0.

The second alternative is more familiar when given in the form: The solution of a nonhomogeneous system of n equations and n unknowns is unique if the only solution to the homogeneous problem is the zero solution. Or, equivalently, A is invertible, or has nonzero determinant.

Proof. We prove the second theorem first. Assume that A x=0 for x \neq 0 and A x_{0}=b. Then A\left(x_{0}+\alpha x\right)=b for all \alpha. Therefore, the solution is not unique. Conversely, if there are two different solutions, x_{1} and x_{2}, satisfying A x_{1}=b and A x_{2}=b, then one has a nonzero solution x=x_{1}-x_{2} such that A x=A\left(x_{1}-x_{2}\right)=0.

The proof of the first part of the first theorem is simple. Let A^{*} v=0 and A x_{0}=b. Then we have

<b, v>=<A x_{0}, v>=<x_{0}, A^{*} v>=0 . \nonumber

For the second part we assume that \langle b, v\rangle=0 for all v such that A^{*} v=0. Write b as the sum of a part that is in the range of A and a part that is in the space orthogonal to the range of A, b=b_{R}+b_{O}. Then, 0=<b_{O}, A x>=<A^{*} b_{O}, x> for all x. Thus, A^{*} b_{O}=0. Since \langle b, v\rangle=0 for all v in the nullspace of A^{*}, we have <b, b_{O}>=0. Therefore, 0=<b, b_{O}>=<b_{R}+b_{O}, b_{O}>=<b_{O}, b_{O}>. This means that b_{O}=0, giving b=b_{R}, which is in the range of A. So, A x=b has a solution.

Example 6.14. Determine the allowed forms of \mathbf{b} for a solution of A \mathbf{x}=\mathbf{b} to exist, where

A=\left(\begin{array}{ll} 1 & 2 \\[4pt] 3 & 6 \end{array}\right) \nonumber

First note that A^{*}=\bar{A}^{T}. This is seen by looking at

\begin{aligned} <A \mathbf{x}, \mathbf{y}>&=\sum_{i=1}^{n} \sum_{j=1}^{n} a_{i j} x_{j} \bar{y}_{i} \\[4pt] &=\sum_{j=1}^{n} x_{j} \sum_{i=1}^{n} a_{i j} \bar{y}_{i}=\sum_{j=1}^{n} x_{j} \overline{\sum_{i=1}^{n}\left(\bar{a}^{T}\right)_{j i} y_{i}} \\[4pt] &=<\mathbf{x}, \bar{A}^{T} \mathbf{y}> \end{aligned} \nonumber

For this example,

A^{*}=\left(\begin{array}{ll} 1 & 3 \\[4pt] 2 & 6 \end{array}\right) \nonumber

We next solve A^{*} \mathbf{v}=0. This means, v_{1}+3 v_{2}=0. So, the nullspace of A^{*} is spanned by \mathbf{v}=(3,-1)^{T}. For a solution of A \mathbf{x}=\mathbf{b} to exist, \mathbf{b} would have to be orthogonal to \mathbf{v}. Therefore, a solution exists when

\mathbf{b}=\alpha\left(\begin{array}{l} 1 \\[4pt] 3 \end{array}\right) \nonumber
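A quick numerical illustration (added here) computes the null space of A^{*} with NumPy and tests two candidate right-hand sides.

import numpy as np

A = np.array([[1.0, 2.0], [3.0, 6.0]])
# For a real matrix A* = A^T; its null space is the last right singular vector of A^T.
_, s, Vt = np.linalg.svd(A.T)
v = Vt[-1]                       # proportional to (3, -1)

b_good = np.array([1.0, 3.0])    # orthogonal to v: A x = b is solvable
b_bad = np.array([1.0, 0.0])     # not orthogonal to v: no solution
print(np.isclose(v @ b_good, 0), np.isclose(v @ b_bad, 0))   # True False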

So, what does this say about solutions of boundary value problems? There is a more general theory for linear operators. The matrix formulation follows, since matrices are simply representations of linear transformations. A more general statement would be

Theorem 6.15. If L is a bounded linear operator on a Hilbert space, then L y=f has a solution if and only if <f, v>=0 for every v such that L^{\dagger} v=0

The statement for boundary value problems is similar. However, we need to be careful to treat the boundary conditions in our statement. As we have seen, after several integrations by parts we have that

<\mathcal{L} u, v>=S(u, v)+<u, \mathcal{L}^{\dagger} v> \nonumber

where S(u, v) involves the boundary conditions on u and v. Note that for nonhomogeneous boundary conditions, this term may no longer vanish.

Theorem 6.16. The solution of the boundary value problem \mathcal{L} u=f with boundary conditions B u=g exists if and only if

<f, v>-S(u, v)=0 \nonumber

for all v satisfying \mathcal{L}^{\dagger} v=0 and B^{\dagger} v=0.

Example 6.17. Consider the problem

u^{\prime \prime}+u=f(x), \quad u(0)-u(2 \pi)=\alpha, u^{\prime}(0)-u^{\prime}(2 \pi)=\beta \nonumber

Only certain values of \alpha and \beta will lead to solutions. We first note that

L=L^{\dagger}=\dfrac{d^{2}}{d x^{2}}+1 . \nonumber

Solutions of

L^{\dagger} v=0, \quad v(0)-v(2 \pi)=0, v^{\prime}(0)-v^{\prime}(2 \pi)=0 \nonumber

are easily found to be linear combinations of v=\sin x and v=\cos x. Next one computes

\begin{aligned} S(u, v) &=\left[u^{\prime} v-u v^{\prime}\right]_{0}^{2 \pi} \\[4pt] &=u^{\prime}(2 \pi) v(2 \pi)-u(2 \pi) v^{\prime}(2 \pi)-u^{\prime}(0) v(0)+u(0) v^{\prime}(0) \end{aligned} \nonumber

For v(x)=\sin x, this yields

S(u, \sin x)=-u(2 \pi)+u(0)=\alpha \nonumber

Similarly,

S(u, \cos x)=u^{\prime}(2 \pi)-u^{\prime}(0)=-\beta \nonumber

Using <f, v>-S(u, v)=0, this leads to the conditions

\begin{aligned} &\int_{0}^{2 \pi} f(x) \sin x d x=\alpha \\[4pt] &\int_{0}^{2 \pi} f(x) \cos x d x=-\beta \end{aligned} \nonumber

Problems

6.1. Find the adjoint operator and its domain for L u=u^{\prime \prime}+4 u^{\prime}-3 u, u^{\prime}(0)+ 4 u(0)=0, u^{\prime}(1)+4 u(1)=0.

6.2. Show that a Sturm-Liouville operator with periodic boundary conditions on [a, b] is self-adjoint if and only if p(a)=p(b). [Recall, periodic boundary conditions are given as u(a)=u(b) and u^{\prime}(a)=u^{\prime}(b).]

6.3. The Hermite differential equation is given by y^{\prime \prime}-2 x y^{\prime}+\lambda y=0. Rewrite this equation in self-adjoint form. From the Sturm-Liouville form obtained, verify that the differential operator is self adjoint on (-\infty, \infty). Give the integral form for the orthogonality of the eigenfunctions.

6.4. Find the eigenvalues and eigenfunctions of the given Sturm-Liouville problems.

a. y^{\prime \prime}+\lambda y=0, y^{\prime}(0)=0=y^{\prime}(\pi).

b. \left(x y^{\prime}\right)^{\prime}+\dfrac{\lambda}{x} y=0, y(1)=y\left(e^{2}\right)=0.

6.5. The eigenvalue problem x^{2} y^{\prime \prime}-\lambda x y^{\prime}+\lambda y=0 with y(1)=y(2)=0 is not a Sturm-Liouville eigenvalue problem. Show that none of the eigenvalues are real by solving this eigenvalue problem.

6.6. In Example 6.10 we found a bound on the lowest eigenvalue for the given eigenvalue problem.

a. Verify the computation in the example.

b. Apply the method using

y(x)=\left\{\begin{array}{cc} x, & 0<x<\dfrac{1}{2} \\[4pt] 1-x, & \dfrac{1}{2}<x<1 \end{array}\right. \nonumber

Is this an upper bound on \lambda_{1}?

c. Use the Rayleigh quotient to obtain a good upper bound for the lowest eigenvalue of the eigenvalue problem: \phi^{\prime \prime}+\left(\lambda-x^{2}\right) \phi=0, \phi(0)=0, \phi^{\prime}(1)=0.

6.7. Use the method of eigenfunction expansions to solve the problem:

y^{\prime \prime}+4 y=x^{2}, \quad y(0)=y(1)=0 \nonumber

6.8. Determine the solvability conditions for the nonhomogeneous boundary value problem: u^{\prime \prime}+4 u=f(x), u(0)=\alpha, u^{\prime}(1)=\beta.

7 Special Functions

In this chapter we will look at some additional functions which arise often in physical applications and are eigenfunctions for some Sturm-Liouville boundary value problem. We begin with a collection of special functions, called the classical orthogonal polynomials. These include such polynomial functions as the Legendre polynomials, the Hermite polynomials, the Tchebychef and the Gegenbauer polynomials. Also, Bessel functions occur quite often. We will spend more time exploring the Legendre and Bessel functions. These functions are typically found as solutions of differential equations using power series methods in a first course in differential equations.

Classical Orthogonal Polynomials

We begin by noting that the sequence of functions \left\{1, x, x^{2}, \ldots\right\} is a linearly independent set of functions. In fact, by the Stone-Weierstrass Approximation Theorem this set is a basis of L_{\sigma}^{2}(a, b), the space of square integrable functions over the interval [a, b] relative to the weight \sigma(x). We are familiar with being able to expand functions over this basis, since the expansions are just power series representations of the functions,

f(x) \sim \sum_{n=0}^{\infty} c_{n} x^{n} . \nonumber

However, this basis is not an orthogonal set of basis functions. One can easily see this by integrating the product of two even, or two odd, basis functions with \sigma(x)=1 and (a, b)=(-1,1). For example,

<1, x^{2}>=\int_{-1}^{1} x^{0} x^{2} d x=\dfrac{2}{3} \nonumber

Since we have found that orthogonal bases have been useful in determining the coefficients for expansions of given functions, we might ask if it is possible to obtain an orthogonal basis involving these powers of x. Of course, finite combinations of these basis elements are just polynomials!

OK, we will ask: "Given a set of linearly independent basis vectors, can one find an orthogonal basis of the given space?" The answer is yes. We recall from introductory linear algebra, which mostly covers finite dimensional vector spaces, that there is a method for carrying this out called the Gram-Schmidt Orthogonalization Process. We will recall this process for finite dimensional vectors and then generalize to function spaces.

image
Figure 7.1. The basis \mathbf{a}_{1}, \mathbf{a}_{2}, and \mathbf{a}_{3}, of \mathbf{R}^{3} considered in the text.

Let’s assume that we have three vectors that span \mathbf{R}^{3}, given by \mathbf{a}_{1}, \mathbf{a}_{2}, and \mathbf{a}_{3} and shown in Figure 7.1. We seek an orthogonal basis \mathbf{e}_{1}, \mathbf{e}_{2}, and \mathbf{e}_{3}, beginning one vector at a time.

First we take one of the original basis vectors, say \mathbf{a}_{1}, and define

\mathbf{e}_{1}=\mathbf{a}_{1} \nonumber

Of course, we might want to normalize our new basis vectors, so we would denote such a normalized vector with a "hat":

\hat{\mathbf{e}}_{1}=\dfrac{\mathbf{e}_{1}}{e_{1}}, \nonumber

where e_{1}=\sqrt{\mathbf{e}_{1} \cdot \mathbf{e}_{1}}.

image
Figure 7.2. A plot of the vectors \mathbf{e}_{1}, \mathbf{a}_{2}, and \mathbf{e}_{2} needed to find the projection of \mathbf{a}_{2} on \mathbf{e}_{1}.

To obtain a second vector orthogonal to \mathbf{e}_{1}, we subtract from \mathbf{a}_{2} its vector projection along \mathbf{e}_{1}, as suggested by Figure 7.2: \mathbf{e}_{2}=\mathbf{a}_{2}-\operatorname{pr}_{1} \mathbf{a}_{2}, where \operatorname{pr}_{1} \mathbf{a}_{2}=\dfrac{\mathbf{a}_{2} \cdot \mathbf{e}_{1}}{e_{1}^{2}} \mathbf{e}_{1}. Note that the projection formula is easily proven by writing the projection as a vector of length a_{2} \cos \theta in direction \hat{\mathbf{e}}_{1}, where \theta is the angle between \mathbf{e}_{1} and \mathbf{a}_{2}. Using the definition of the dot product, \mathbf{a} \cdot \mathbf{b}=a b \cos \theta, the projection formula follows.

Combining Equations (7.1)-(7.2), we find that

\mathbf{e}_{2}=\mathbf{a}_{2}-\dfrac{\mathbf{a}_{2} \cdot \mathbf{e}_{1}}{e_{1}^{2}} \mathbf{e}_{1} \nonumber

It is a simple matter to verify that \mathbf{e}_{2} is orthogonal to \mathbf{e}_{1} :

\begin{aligned} \mathbf{e}_{2} \cdot \mathbf{e}_{1} &=\mathbf{a}_{2} \cdot \mathbf{e}_{1}-\dfrac{\mathbf{a}_{2} \cdot \mathbf{e}_{1}}{e_{1}^{2}} \mathbf{e}_{1} \cdot \mathbf{e}_{1} \\[4pt] &=\mathbf{a}_{2} \cdot \mathbf{e}_{1}-\mathbf{a}_{2} \cdot \mathbf{e}_{1}=0 \end{aligned} \nonumber

Now, we seek a third vector \mathbf{e}_{3} that is orthogonal to both \mathbf{e}_{1} and \mathbf{e}_{2}. Pictorially, we can write the given vector \mathbf{a}_{3} as a combination of vector projections along \mathbf{e}_{1} and \mathbf{e}_{2} and the new vector. This is shown in Figure 7.3. Then we have,

\mathbf{e}_{3}=\mathbf{a}_{3}-\dfrac{\mathbf{a}_{3} \cdot \mathbf{e}_{1}}{e_{1}^{2}} \mathbf{e}_{1}-\dfrac{\mathbf{a}_{3} \cdot \mathbf{e}_{2}}{e_{2}^{2}} \mathbf{e}_{2} . \nonumber

Again, it is a simple matter to compute the scalar products with \mathbf{e}_{1} and \mathbf{e}_{2} to verify orthogonality.
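As a quick illustration (not part of the original text), the three steps above can be carried out numerically. The following Python sketch uses NumPy and applies the process to the vectors of Problem 7.1; all variable names are our own.

import numpy as np

a1 = np.array([-1.0, 1.0, 1.0])
a2 = np.array([1.0, -1.0, 1.0])
a3 = np.array([1.0, 1.0, -1.0])

# Gram-Schmidt: subtract projections along the previously found vectors.
e1 = a1
e2 = a2 - (a2 @ e1) / (e1 @ e1) * e1
e3 = a3 - (a3 @ e1) / (e1 @ e1) * e1 - (a3 @ e2) / (e2 @ e2) * e2

# The mutual dot products should vanish (up to roundoff).
print(e1 @ e2, e1 @ e3, e2 @ e3)

# Normalized ("hatted") basis vectors.
e1_hat, e2_hat, e3_hat = [e / np.linalg.norm(e) for e in (e1, e2, e3)]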

We can easily generalize the procedure to the N-dimensional case.

Gram-Schmidt Orthogonalization in N-Dimensions

Given a set of N linearly independent vectors \mathbf{a}_{1}, \ldots, \mathbf{a}_{N}, one sets \mathbf{e}_{1}=\mathbf{a}_{1} and, for n=2,3, \ldots, N, subtracts from \mathbf{a}_{n} its projections along the previously constructed vectors:

\mathbf{e}_{n}=\mathbf{a}_{n}-\sum_{j=1}^{n-1} \dfrac{\mathbf{a}_{n} \cdot \mathbf{e}_{j}}{e_{j}^{2}} \mathbf{e}_{j} . \nonumber

Figure 7.3. A plot of the vectors and their projections for determining \mathbf{e}_{3}.

Now, we can generalize this idea to (real) function spaces.

Gram-Schmidt Orthogonalization for Function Spaces

Let f_{n}(x), n \in N_{0}=\{0,1,2, \ldots\}, be a linearly independent sequence of continuous functions defined for x \in[a, b]. Then, an orthogonal basis of functions, \phi_{n}(x), n \in N_{0} can be found and is given by

\phi_{0}(x)=f_{0}(x) \nonumber

and

\phi_{n}(x)=f_{n}(x)-\sum_{j=0}^{n-1} \dfrac{<f_{n}, \phi_{j}>}{\left\|\phi_{j}\right\|^{2}} \phi_{j}(x), \quad n=1,2, \ldots \nonumber

Here we are using inner products relative to weight \sigma(x),

<f, g>=\int_{a}^{b} f(x) g(x) \sigma(x) d x \nonumber

Note the similarity between the orthogonal basis in (7.7) and the expression for the finite dimensional case in Equation (7.6).

Example 7.1. Apply the Gram-Schmidt Orthogonalization process to the set f_{n}(x)=x^{n}, n \in N_{0}, when x \in(-1,1) and \sigma(x)=1.

First, we have \phi_{0}(x)=f_{0}(x)=1. Note that

\int_{-1}^{1} \phi_{0}^{2}(x) d x=2 . \nonumber

We could use this result to fix the normalization of our new basis, but we will hold off on doing that for now.

Now, we compute the second basis element:

\begin{aligned} \phi_{1}(x) &=f_{1}(x)-\dfrac{<f_{1}, \phi_{0}>}{\left\|\phi_{0}\right\|^{2}} \phi_{0}(x) \\[4pt] &=x-\dfrac{<x, 1>}{\|1\|^{2}} 1=x \end{aligned} \nonumber

since <x, 1> is the integral of an odd function over a symmetric interval.

For \phi_{2}(x), we have

\begin{aligned} \phi_{2}(x) &=f_{2}(x)-\dfrac{<f_{2}, \phi_{0}>}{\left\|\phi_{0}\right\|^{2}} \phi_{0}(x)-\dfrac{<f_{2}, \phi_{1}>}{\left\|\phi_{1}\right\|^{2}} \phi_{1} \\[4pt] &=x^{2}-\dfrac{<x^{2}, 1>}{\|1\|^{2}} 1-\dfrac{<x^{2}, x>}{\|x\|^{2}} x \\[4pt] &=x^{2}-\dfrac{\int_{-1}^{1} x^{2} d x}{\int_{-1}^{1} d x} \\[4pt] &=x^{2}-\dfrac{1}{3} \end{aligned} \nonumber

So far, we have the orthogonal set \left\{1, x, x^{2}-\dfrac{1}{3}\right\}. If one chooses to normalize these by forcing \phi_{n}(1)=1, then one obtains the classical Legendre polynomials, P_{n}(x). Thus,

P_{2}(x)=\dfrac{1}{2}\left(3 x^{2}-1\right) . \nonumber

Note that this normalization is different than the usual one. In fact, we see that P_{2}(x) does not have a unit norm,

\left\|P_{2}\right\|^{2}=\int_{-1}^{1} P_{2}^{2}(x) d x=\dfrac{2}{5} \nonumber
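For readers who want to experiment, the orthogonalization in Example 7.1 can be carried out symbolically. The following Python sketch (our illustration, not part of the text) uses sympy; the helper inner is our own name for the inner product with weight \sigma(x)=1 on (-1,1).

import sympy as sp

x = sp.symbols('x')

def inner(f, g):
    # <f, g> = int_{-1}^{1} f(x) g(x) dx, i.e. weight sigma(x) = 1
    return sp.integrate(f * g, (x, -1, 1))

fs = [sp.Integer(1), x, x**2, x**3]
phis = []
for f in fs:
    phi = f - sum(inner(f, p) / inner(p, p) * p for p in phis)
    phis.append(sp.expand(phi))

# Rescale so that phi_n(1) = 1; this reproduces P_0, ..., P_3.
legendre = [sp.simplify(phi / phi.subs(x, 1)) for phi in phis]
print(legendre)   # [1, x, 3*x**2/2 - 1/2, 5*x**3/2 - 3*x/2]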

The set of Legendre polynomials is just one set of classical orthogonal polynomials that can be obtained in this way. Many of these originally appeared as solutions of important boundary value problems in physics. They all have similar properties and we will just elaborate on some of these for the Legendre functions in the next section. Other orthogonal polynomials in this group are shown in Table 7.1.

For reference, we also note the differential equations satisfied by these functions.

7.2 Legendre Polynomials

In the last section we saw the Legendre polynomials in the context of orthogonal bases for a set of square integrable functions in L^{2}(-1,1). In your first course in differential equations, you saw these polynomials as one of the solutions of the differential equation

Polynomial Symbol Interval \sigma(x)
Hermite H_{n}(x) (-\infty, \infty) e^{-x^{2}}
Laguerre L_{n}^{\alpha}(x) [0, \infty) x^{\alpha} e^{-x}
Legendre P_{n}(x) (-1,1) 1
Gegenbauer C_{n}^{\lambda}(x) (-1,1) \left(1-x^{2}\right)^{\lambda-1 / 2}
Tchebychef of the 1st kind T_{n}(x) (-1,1) \left(1-x^{2}\right)^{-1 / 2}
Tchebychef of the 2nd kind U_{n}(x) (-1,1) \left(1-x^{2}\right)^{1 / 2}
Jacobi P_{n}^{(\nu, \mu)}(x) (-1,1) (1-x)^{\nu}(1+x)^{\mu}

Table 7.1. Common classical orthogonal polynomials with the interval and weight function used to define them.

Polynomial Differential Equation
Hermite y^{\prime \prime}-2 x y^{\prime}+2 n y=0
Laguerre x y^{\prime \prime}+(\alpha+1-x) y^{\prime}+n y=0
Legendre \left(1-x^{2}\right) y^{\prime \prime}-2 x y^{\prime}+n(n+1) y=0
Gegenbauer \left(1-x^{2}\right) y^{\prime \prime}-(2 \lambda+1) x y^{\prime}+n(n+2 \lambda) y=0
Tchebychef of the 1st kind \left(1-x^{2}\right) y^{\prime \prime}-x y^{\prime}+n^{2} y=0
Jacobi \left(1-x^{2}\right) y^{\prime \prime}-(\nu-\mu+(\mu+\nu+2) x) y^{\prime}+n(n+1+\mu+\nu) y=0

Table 7.2. Differential equations satisfied by some of the common classical orthogonal polynomials.

\left(1-x^{2}\right) y^{\prime \prime}-2 x y^{\prime}+n(n+1) y=0, \quad n \in N_{0} . \nonumber

Recall that these were obtained by using power series expansion methods. In this section we will explore a few of the properties of these functions.

For completeness, we recall the solution of Equation (7.11) using the power series method. We assume that the solution takes the form

y(x)=\sum_{k=0}^{\infty} a_{k} x^{k} . \nonumber

The goal is to determine the coefficients, a_{k}. Inserting this series into Equation (7.11), we have

\left(1-x^{2}\right) \sum_{k=0}^{\infty} k(k-1) a_{k} x^{k-2}-\sum_{k=0}^{\infty} 2 a_{k} k x^{k}+\sum_{k=0}^{\infty} n(n+1) a_{k} x^{k}=0 \nonumber

\sum_{k=2}^{\infty} k(k-1) a_{k} x^{k-2}-\sum_{k=2}^{\infty} k(k-1) a_{k} x^{k}+\sum_{k=0}^{\infty}[-2 k+n(n+1)] a_{k} x^{k}=0 \nonumber

We can combine some of these terms:

\sum_{k=2}^{\infty} k(k-1) a_{k} x^{k-2}+\sum_{k=0}^{\infty}[-k(k-1)-2 k+n(n+1)] a_{k} x^{k}=0 . \nonumber

Further simplification yields

\sum_{k=2}^{\infty} k(k-1) a_{k} x^{k-2}+\sum_{k=0}^{\infty}[n(n+1)-k(k+1)] a_{k} x^{k}=0 \nonumber

We need to collect like powers of x. This can be done by reindexing each sum. In the first sum, we let m=k-2, or k=m+2. In the second sum we independently let k=m. Then all powers of x are of the form x^{m}. This gives

\sum_{m=0}^{\infty}(m+2)(m+1) a_{m+2} x^{m}+\sum_{m=0}^{\infty}[n(n+1)-m(m+1)] a_{m} x^{m}=0 \nonumber

Combining these sums, we have

\sum_{m=0}^{\infty}\left[(m+2)(m+1) a_{m+2}+(n(n+1)-m(m+1)) a_{m}\right] x^{m}=0 \nonumber

This has to hold for all x. So, the coefficients of x^{m} must vanish:

(m+2)(m+1) a_{m+2}+(n(n+1)-m(m+1)) a_{m}=0 . \nonumber

Solving for a_{m+2}, we obtain the recursion relation

a_{m+2}=-\dfrac{n(n+1)-m(m+1)}{(m+2)(m+1)} a_{m}, \quad m \geq 0 . \nonumber

Thus, a_{m+2} is proportional to a_{m}. We can iterate and show that each coefficient is either proportional to a_{0} or a_{1}. However, for n an integer we eventually reach m=n and the series truncates: a_{m}=0 for m>n. Thus, we obtain polynomial solutions. These polynomial solutions are the Legendre polynomials, which we designate as y(x)=P_{n}(x). Furthermore, for n an even integer, P_{n}(x) is an even function, and for n an odd integer, P_{n}(x) is an odd function.

Actually, this is a trimmed down version of the method. We would need to find a second linearly independent solution. We will not discuss these solutions and leave that for the interested reader to investigate.
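As a small illustration of the truncation just described (ours, not from the text), the following Python sketch builds the coefficients of P_n(x) from the recursion relation and then rescales so that P_n(1)=1; the function name legendre_coefficients is our own.

from fractions import Fraction

def legendre_coefficients(n):
    # Coefficients a_0, ..., a_n of P_n(x), lowest degree first, as exact fractions.
    a = [Fraction(0)] * (n + 1)
    a[n % 2] = Fraction(1)               # start the even or odd series
    for m in range(n % 2, n - 1, 2):     # apply the recursion up to m = n - 2
        a[m + 2] = -Fraction(n*(n+1) - m*(m+1), (m+2)*(m+1)) * a[m]
    scale = sum(a)                       # value of the unnormalized polynomial at x = 1
    return [c / scale for c in a]

# P_3(x) = (5x^3 - 3x)/2, so the coefficients are 0, -3/2, 0, 5/2.
print(legendre_coefficients(3))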

The Rodrigues Formula

The first property that the Legendre polynomials have is the Rodrigues formula:

P_{n}(x)=\dfrac{1}{2^{n} n !} \dfrac{d^{n}}{d x^{n}}\left(x^{2}-1\right)^{n}, \quad n \in N_{0} . \nonumber

From the Rodrigues formula, one can show that P_{n}(x) is an nth degree polynomial. Also, for n odd, the polynomial is an odd function and for n even, the polynomial is an even function.

As an example, we determine P_{2}(x) from Rodrigues formula:

\begin{aligned} P_{2}(x) &=\dfrac{1}{2^{2} 2 !} \dfrac{d^{2}}{d x^{2}}\left(x^{2}-1\right)^{2} \\[4pt] &=\dfrac{1}{8} \dfrac{d^{2}}{d x^{2}}\left(x^{4}-2 x^{2}+1\right) \\[4pt] &=\dfrac{1}{8} \dfrac{d}{d x}\left(4 x^{3}-4 x\right) \\[4pt] &=\dfrac{1}{8}\left(12 x^{2}-4\right) \\[4pt] &=\dfrac{1}{2}\left(3 x^{2}-1\right) \end{aligned} \nonumber

Note that we get the same result as we found in the last section using orthogonalization.

One can systematically generate the Legendre polynomials in tabular form, as shown in Table 7.3. In Figure 7.4 we show a few Legendre polynomials.

n \left(x^{2}-1\right)^{n} \dfrac{d^{n}}{d x^{n}}\left(x^{2}-1\right)^{n} \dfrac{1}{2^{n} n !} P_{n}(x)
0 1 1 1 1
1 x^{2}-1 2 x \dfrac{1}{2} x
2 x^{4}-2 x^{2}+1 12 x^{2}-4 \dfrac{1}{8} \dfrac{1}{2}\left(3 x^{2}-1\right)
3 x^{6}-3 x^{4}+3 x^{2}-1 120 x^{3}-72 x \dfrac{1}{48} \dfrac{1}{2}\left(5 x^{3}-3 x\right)

Table 7.3. Tabular computation of the Legendre polynomials using the Rodrigues formula.
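The entries of Table 7.3 can also be generated symbolically. The short Python sketch below (ours, not from the text) applies the Rodrigues formula directly using sympy.

import sympy as sp

x = sp.symbols('x')

for n in range(4):
    # P_n(x) = (1/(2^n n!)) d^n/dx^n (x^2 - 1)^n
    Pn = sp.expand(sp.diff((x**2 - 1)**n, x, n) / (2**n * sp.factorial(n)))
    print(n, Pn)
# 0 1
# 1 x
# 2 3*x**2/2 - 1/2
# 3 5*x**3/2 - 3*x/2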

Figure 7.4. Plots of the Legendre polynomials P_{2}(x), P_{3}(x), P_{4}(x), and P_{5}(x).

Three Term Recursion Formula

The classical orthogonal polynomials also satisfy three term recursion formulae. In the case of the Legendre polynomials, we have

(2 n+1) x P_{n}(x)=(n+1) P_{n+1}(x)+n P_{n-1}(x), \quad n=1,2, \ldots \nonumber

This can also be rewritten by replacing n with n-1 as

(2 n-1) x P_{n-1}(x)=n P_{n}(x)+(n-1) P_{n-2}(x), \quad n=1,2, \ldots \nonumber

We will prove this recursion formula in two ways. First we use the orthogonality properties of Legendre polynomials and the following lemma.

Lemma 7.2. The leading coefficient of x^{n} in P_{n}(x) is \dfrac{1}{2^{n} n !} \dfrac{(2 n) !}{n !}.

Proof. We can prove this using the Rodrigues formula. First, we focus on the leading coefficient of \left(x^{2}-1\right)^{n}, which is x^{2 n}. The first derivative of x^{2 n} is 2 n x^{2 n-1}. The second derivative is 2 n(2 n-1) x^{2 n-2}. The j th derivative is

\dfrac{d^{j} x^{2 n}}{d x^{j}}=[2 n(2 n-1) \ldots(2 n-j+1)] x^{2 n-j} \nonumber

Thus, the nth derivative is given by

\dfrac{d^{n} x^{2 n}}{d x^{n}}=[2 n(2 n-1) \ldots(n+1)] x^{n} \nonumber

This proves that P_{n}(x) has degree n. The leading coefficient of P_{n}(x) can now be written as

\begin{aligned} \dfrac{1}{2^{n} n !}[2 n(2 n-1) \ldots(n+1)] &=\dfrac{1}{2^{n} n !}[2 n(2 n-1) \ldots(n+1)] \dfrac{n(n-1) \ldots 1}{n(n-1) \ldots 1} \\[4pt] &=\dfrac{1}{2^{n} n !} \dfrac{(2 n) !}{n !} \end{aligned} \nonumber

In order to prove the three term recursion formula we consider the expression (2 n-1) x P_{n-1}(x)-n P_{n}(x). While each term is a polynomial of degree n, the leading order terms cancel. We need only look at the coefficient of the leading order term in the first expression. It is

(2 n-1) \dfrac{1}{2^{n-1}(n-1) !} \dfrac{(2 n-2) !}{(n-1) !}=\dfrac{1}{2^{n-1}(n-1) !} \dfrac{(2 n-1) !}{(n-1) !}=\dfrac{(2 n-1) !}{2^{n-1}[(n-1) !]^{2}} . \nonumber

The coefficient of the leading term for n P_{n}(x) can be written as

n \dfrac{1}{2^{n} n !} \dfrac{(2 n) !}{n !}=n\left(\dfrac{2 n}{2 n^{2}}\right)\left(\dfrac{1}{2^{n-1}(n-1) !}\right) \dfrac{(2 n-1) !}{(n-1) !}=\dfrac{(2 n-1) !}{2^{n-1}[(n-1) !]^{2}} . \nonumber

It is easy to see that the leading order terms in (2 n-1) x P_{n-1}(x)-n P_{n}(x) cancel.

The next terms will be of degree n-2. This is because the P_{n} ’s are either even or odd functions, thus only containing even, or odd, powers of x. We conclude that

(2 n-1) x P_{n-1}(x)-n P_{n}(x)=\text { polynomial of degree } n-2 . \nonumber

Therefore, since the Legendre polynomials form a basis, we can write this polynomial as a linear combination of Legendre polynomials:

(2 n-1) x P_{n-1}(x)-n P_{n}(x)=c_{0} P_{0}(x)+c_{1} P_{1}(x)+\ldots+c_{n-2} P_{n-2}(x) . \nonumber

Multiplying Equation (7.17) by P_{m}(x) for m=0,1, \ldots, n-3, integrating from -1 to 1 , and using orthogonality, we obtain

0=c_{m}\left\|P_{m}\right\|^{2}, \quad m=0,1, \ldots, n-3 . \nonumber

[Note: \int_{-1}^{1} x^{k} P_{n}(x) d x=0 for k \leq n-1. Thus, \int_{-1}^{1} x P_{n-1}(x) P_{m}(x) d x=0 for m \leq n-3 .]

Thus, all of these c_{m} ’s are zero, leaving Equation (7.17) as

(2 n-1) x P_{n-1}(x)-n P_{n}(x)=c_{n-2} P_{n-2}(x) . \nonumber

The final coefficient can be found by using the normalization condition, P_{n}(1)=1. Thus, c_{n-2}=(2 n-1)-n=n-1.
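A quick numerical spot check of the three term recursion formula (our illustration, not part of the text) can be made with scipy.special.eval_legendre.

import numpy as np
from scipy.special import eval_legendre

x = np.linspace(-1, 1, 5)
for n in range(1, 6):
    lhs = (2*n + 1) * x * eval_legendre(n, x)
    rhs = (n + 1) * eval_legendre(n + 1, x) + n * eval_legendre(n - 1, x)
    assert np.allclose(lhs, rhs)   # (2n+1) x P_n = (n+1) P_{n+1} + n P_{n-1}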

The Generating Function

Figure 7.5. The position vectors used to describe the tidal force on the Earth due to the moon.

Many special functions possess a generating function. For the Legendre polynomials the generating function is

g(x, t)=\dfrac{1}{\sqrt{1-2 x t+t^{2}}}=\sum_{n=0}^{\infty} P_{n}(x) t^{n}, \quad|x| \leq 1,|t|<1 . \nonumber

This function arises naturally in potential theory. For example, the gravitational potential between two masses located at positions \mathbf{r}_{1} and \mathbf{r}_{2} (see Figure 7.5) is proportional to the reciprocal of the distance between them,

\Phi \propto \dfrac{1}{\left|\mathbf{r}_{1}-\mathbf{r}_{2}\right|}=\dfrac{1}{\sqrt{r_{1}^{2}-2 r_{1} r_{2} \cos \theta+r_{2}^{2}}}, \nonumber

where \theta is the angle between \mathbf{r}_{1} and \mathbf{r}_{2}.

Typically, one of the position vectors is much larger than the other. Let’s assume that r_{1} \ll r_{2}. Then, one can write

\Phi \propto \dfrac{1}{\sqrt{r_{1}^{2}-2 r_{1} r_{2} \cos \theta+r_{2}^{2}}}=\dfrac{1}{r_{2}} \dfrac{1}{\sqrt{1-2 \dfrac{r_{1}}{r_{2}} \cos \theta+\left(\dfrac{r_{1}}{r_{2}}\right)^{2}}} \nonumber

Now, define x=\cos \theta and t=\dfrac{r_{1}}{r_{2}}. We then see that the tidal potential is proportional to the generating function for the Legendre polynomials! So, we can write the tidal potential as

\Phi \propto \dfrac{1}{r_{2}} \sum_{n=0}^{\infty} P_{n}(\cos \theta)\left(\dfrac{r_{1}}{r_{2}}\right)^{n} . \nonumber

The first term in the expansion is the gravitational potential that gives the usual force between the Earth and the moon. [Recall that the force is the gradient of the potential, \mathbf{F}=\nabla\left(\dfrac{1}{r}\right).] The next terms will give expressions for the tidal effects.

Now that we have some idea as to where this generating function might have originated, we can proceed to use it. First of all, the generating function can be used to obtain special values of the Legendre polynomials.

Example 7.3. Evaluate P_{n}(0). P_{n}(0) is found by considering g(0, t). Setting x=0 in Equation (7.18), we have

g(0, t)=\dfrac{1}{\sqrt{1+t^{2}}}=\sum_{n=0}^{\infty} P_{n}(0) t^{n} \nonumber

We can use the binomial expansion to find our final answer. [See the last section of this chapter for a review.] Namely, we have

\dfrac{1}{\sqrt{1+t^{2}}}=1-\dfrac{1}{2} t^{2}+\dfrac{3}{8} t^{4}+\ldots \nonumber

Comparing these expansions, we see that P_{n}(0)=0 for n odd, and for even integers one can show (see Problem 7.10) that

P_{2 n}(0)=(-1)^{n} \dfrac{(2 n-1) ! !}{(2 n) ! !} \nonumber

where n ! ! is the double factorial,

n ! !=\left\{\begin{array}{c} n(n-2) \ldots(3) 1, n>0, \text { odd } \\[4pt] n(n-2) \ldots(4) 2, n>0, \text { even } \\[4pt] 1 \quad n=0,-1 \end{array}\right. \nonumber

Example 7.4. Evaluate P_{n}(-1). This is a simpler problem. In this case we have

g(-1, t)=\dfrac{1}{\sqrt{1+2 t+t^{2}}}=\dfrac{1}{1+t}=1-t+t^{2}-t^{3}+\ldots \nonumber

Therefore, P_{n}(-1)=(-1)^{n}.

We can also use the generating function to find recursion relations. To prove the three term recursion (7.14) that we introduced above, we need only differentiate the generating function in Equation (7.18) with respect to t and rearrange the result. First note that

\dfrac{\partial g}{\partial t}=\dfrac{x-t}{\left(1-2 x t+t^{2}\right)^{3 / 2}}=\dfrac{x-t}{1-2 x t+t^{2}} g(x, t) \nonumber

Combining this with

\dfrac{\partial g}{\partial t}=\sum_{n=0}^{\infty} n P_{n}(x) t^{n-1} \nonumber

we have

(x-t) g(x, t)=\left(1-2 x t+t^{2}\right) \sum_{n=0}^{\infty} n P_{n}(x) t^{n-1} \nonumber

Inserting the series expression for g(x, t) and distributing the sum on the right side, we obtain

(x-t) \sum_{n=0}^{\infty} P_{n}(x) t^{n}=\sum_{n=0}^{\infty} n P_{n}(x) t^{n-1}-\sum_{n=0}^{\infty} 2 n x P_{n}(x) t^{n}+\sum_{n=0}^{\infty} n P_{n}(x) t^{n+1} \nonumber

Rearranging leads to three separate sums:

\sum_{n=0}^{\infty} n P_{n}(x) t^{n-1}-\sum_{n=0}^{\infty}(2 n+1) x P_{n}(x) t^{n}+\sum_{n=0}^{\infty}(n+1) P_{n}(x) t^{n+1}=0 \nonumber

Each term contains powers of t that we would like to combine into a single sum. This is done by reindexing. For the first sum, we could use the new index k=n-1. Then, the first sum can be written

\sum_{n=0}^{\infty} n P_{n}(x) t^{n-1}=\sum_{k=-1}^{\infty}(k+1) P_{k+1}(x) t^{k} \nonumber

Using different indices is just another way of writing out the terms. Note that

\sum_{n=0}^{\infty} n P_{n}(x) t^{n-1}=0+P_{1}(x)+2 P_{2}(x) t+3 P_{3}(x) t^{2}+\ldots \nonumber

and

\sum_{k=-1}^{\infty}(k+1) P_{k+1}(x) t^{k}=0+P_{1}(x)+2 P_{2}(x) t+3 P_{3}(x) t^{2}+\ldots \nonumber

actually give the same sum. The indices are sometimes referred to as dummy indices because they do not show up in the expanded expression and can be replaced with another letter.

If we want to do so, we could now replace all of the k ’s with n ’s. However, we will leave the k ’s in the first term and now reindex the next sums in Equation (7.21). The second sum just needs the replacement n=k and the last sum we reindex using k=n+1. Therefore, Equation (7.21) becomes

\sum_{k=-1}^{\infty}(k+1) P_{k+1}(x) t^{k}-\sum_{k=0}^{\infty}(2 k+1) x P_{k}(x) t^{k}+\sum_{k=1}^{\infty} k P_{k-1}(x) t^{k}=0 . \nonumber

We can now combine all of the terms, noting the k=-1 term is automatically zero and the k=0 terms give

P_{1}(x)-x P_{0}(x)=0 . \nonumber

Of course, we know this already. So, that leaves the k>0 terms:

\sum_{k=1}^{\infty}\left[(k+1) P_{k+1}(x)-(2 k+1) x P_{k}(x)+k P_{k-1}(x)\right] t^{k}=0 \nonumber

Since this is true for all t, the coefficients of the t^{k} ’s are zero, or

(k+1) P_{k+1}(x)-(2 k+1) x P_{k}(x)+k P_{k-1}(x)=0, \quad k=1,2, \ldots \nonumber

There are other recursion relations. For example,

P_{n+1}^{\prime}(x)-P_{n-1}^{\prime}(x)=(2 n+1) P_{n}(x) . \nonumber

This can be proven using the generating function by differentiating g(x, t) with respect to x and rearranging the resulting infinite series just as in this last manipulation. This will be left as Problem 7.4.

Another use of the generating function is to obtain the normalization constant, \left\|P_{n}\right\|^{2}. Squaring the generating function, we have

\dfrac{1}{1-2 x t+t^{2}}=\left[\sum_{n=0}^{\infty} P_{n}(x) t^{n}\right]^{2}=\sum_{n=0}^{\infty} \sum_{m=0}^{\infty} P_{n}(x) P_{m}(x) t^{n+m} \nonumber

Integrating from -1 to 1 and using the orthogonality of the Legendre polynomials, we have

\begin{aligned} \int_{-1}^{1} \dfrac{d x}{1-2 x t+t^{2}} &=\sum_{n=0}^{\infty} \sum_{m=0}^{\infty} t^{n+m} \int_{-1}^{1} P_{n}(x) P_{m}(x) d x \\[4pt] &=\sum_{n=0}^{\infty} t^{2 n} \int_{-1}^{1} P_{n}^{2}(x) d x \end{aligned} \nonumber

However, one can show that

\int_{-1}^{1} \dfrac{d x}{1-2 x t+t^{2}}=\dfrac{1}{t} \ln \left(\dfrac{1+t}{1-t}\right) \nonumber

Expanding this expression about t=0, we obtain

\dfrac{1}{t} \ln \left(\dfrac{1+t}{1-t}\right)=\sum_{n=0}^{\infty} \dfrac{2}{2 n+1} t^{2 n} \nonumber

Comparing this result with Equation (7.27), we find that

\left\|P_{n}\right\|^{2}=\int_{-1}^{1} P_{n}^{2}(x) d x=\dfrac{2}{2 n+1} . \nonumber

Eigenfunction Expansions

Finally, we can expand other functions in this orthogonal basis. This is just a generalized Fourier series. A Fourier-Legendre series expansion for f(x) on [-1,1] takes the form

f(x) \sim \sum_{n=0}^{\infty} c_{n} P_{n}(x) . \nonumber

As before, we can determine the coefficients by multiplying both sides by P_{m}(x) and integrating. Orthogonality gives the usual form for the generalized Fourier coefficients. In this case, we have

c_{n}=\dfrac{<f, P_{n}>}{\left\|P_{n}\right\|^{2}}, \nonumber

where

<f, P_{n}>=\int_{-1}^{1} f(x) P_{n}(x) d x \nonumber

We have just found \left\|P_{n}\right\|^{2}=\dfrac{2}{2 n+1}. Therefore, the Fourier-Legendre coefficients are

c_{n}=\dfrac{2 n+1}{2} \int_{-1}^{1} f(x) P_{n}(x) d x . \nonumber

Example 7.5. Expand f(x)=x^{3} in a Fourier-Legendre series.

We simply need to compute

c_{n}=\dfrac{2 n+1}{2} \int_{-1}^{1} x^{3} P_{n}(x) d x . \nonumber

We first note that

\int_{-1}^{1} x^{m} P_{n}(x) d x=0 \quad \text { for } m<n \nonumber

This is simply proven using Rodrigues formula. Inserting Equation (7.12), we have

\int_{-1}^{1} x^{m} P_{n}(x) d x=\dfrac{1}{2^{n} n !} \int_{-1}^{1} x^{m} \dfrac{d^{n}}{d x^{n}}\left(x^{2}-1\right)^{n} d x \nonumber

Since m<n, we can integrate by parts m-times to show the result, using P_{n}(1)=1 and P_{n}(-1)=(-1)^{n}. As a result, we will have for this example that c_{n}=0 for n>3.

We could just compute \int_{-1}^{1} x^{3} P_{m}(x) d x for m=0,1,2, \ldots outright. But, noting that x^{3} is an odd function, we easily confirm that c_{0}=0 and c_{2}=0. This leaves us with only two coefficients to compute. These are

c_{1}=\dfrac{3}{2} \int_{-1}^{1} x^{4} d x=\dfrac{3}{5} \nonumber

and

c_{3}=\dfrac{7}{2} \int_{-1}^{1} x^{3}\left[\dfrac{1}{2}\left(5 x^{3}-3 x\right)\right] d x=\dfrac{2}{5} \nonumber

Thus,

x^{3}=\dfrac{3}{5} P_{1}(x)+\dfrac{2}{5} P_{3}(x) . \nonumber

Of course, this is simple to check using Table 7.3:

\dfrac{3}{5} P_{1}(x)+\dfrac{2}{5} P_{3}(x)=\dfrac{3}{5} x+\dfrac{2}{5}\left[\dfrac{1}{2}\left(5 x^{3}-3 x\right)\right]=x^{3} \nonumber

Well, maybe we could have guessed this without doing any integration. Let’s see,

\begin{aligned} x^{3} &=c_{1} x+\dfrac{1}{2} c_{3}\left(5 x^{3}-3 x\right) \\[4pt] &=\left(c_{1}-\dfrac{3}{2} c_{3}\right) x+\dfrac{5}{2} c_{3} x^{3} \end{aligned} \nonumber

Equating coefficients of like terms, we have that c_{3}=\dfrac{2}{5} and c_{1}=\dfrac{3}{2} c_{3}=\dfrac{3}{5}.

Example 7.6. Expand the Heaviside function in a Fourier-Legendre series.

The Heaviside function is defined as

H(x)=\left\{\begin{array}{l} 1, x>0 \\[4pt] 0, x<0 \end{array}\right. \nonumber

In this case, we cannot find the expansion coefficients without some integration. We have to compute

\begin{aligned} c_{n} &=\dfrac{2 n+1}{2} \int_{-1}^{1} f(x) P_{n}(x) d x \\[4pt] &=\dfrac{2 n+1}{2} \int_{0}^{1} P_{n}(x) d x, \quad n=0,1,2, \ldots \end{aligned} \nonumber

For n=0, we have

c_{0}=\dfrac{1}{2} \int_{0}^{1} d x=\dfrac{1}{2} . \nonumber

For n \geq 1, we make use of the identity (7.25) to find

c_{n}=\dfrac{1}{2} \int_{0}^{1}\left[P_{n+1}^{\prime}(x)-P_{n-1}^{\prime}(x)\right] d x=\dfrac{1}{2}\left[P_{n-1}(0)-P_{n+1}(0)\right] . \nonumber

Thus, the Fourier-Legendre series for the Heaviside function is

f(x) \sim \dfrac{1}{2}+\dfrac{1}{2} \sum_{n=1}^{\infty}\left[P_{n-1}(0)-P_{n+1}(0)\right] P_{n}(x) . \nonumber

We need to evaluate P_{n-1}(0)-P_{n+1}(0). Since P_{n}(0)=0 for n odd, the c_{n} ’s vanish for n even. Letting n=2 k-1, we have

f(x) \sim \dfrac{1}{2}+\dfrac{1}{2} \sum_{k=1}^{\infty}\left[P_{2 k-2}(0)-P_{2 k}(0)\right] P_{2 k-1}(x) . \nonumber

We can use Equation (7.20)

P_{2 k}(0)=(-1)^{k} \dfrac{(2 k-1) ! !}{(2 k) ! !}, \nonumber

to compute the coefficients:

\begin{aligned} f(x) & \sim \dfrac{1}{2}+\dfrac{1}{2} \sum_{k=1}^{\infty}\left[P_{2 k-2}(0)-P_{2 k}(0)\right] P_{2 k-1}(x) \\[4pt] &=\dfrac{1}{2}+\dfrac{1}{2} \sum_{k=1}^{\infty}\left[(-1)^{k-1} \dfrac{(2 k-3) ! !}{(2 k-2) ! !}-(-1)^{k} \dfrac{(2 k-1) ! !}{(2 k) ! !}\right] P_{2 k-1}(x) \\[4pt] &=\dfrac{1}{2}-\dfrac{1}{2} \sum_{k=1}^{\infty}(-1)^{k} \dfrac{(2 k-3) ! !}{(2 k-2) ! !}\left[1+\dfrac{2 k-1}{2 k}\right] P_{2 k-1}(x) \\[4pt] &=\dfrac{1}{2}-\dfrac{1}{2} \sum_{k=1}^{\infty}(-1)^{k} \dfrac{(2 k-3) ! !}{(2 k-2) ! !} \dfrac{4 k-1}{2 k} P_{2 k-1}(x) \end{aligned} \nonumber

The sum of the first 21 terms is shown in Figure 7.6. We note the slow convergence to the Heaviside function. Also, we see that the Gibbs phenomenon is present due to the jump discontinuity at x=0.

Figure 7.6. Sum of the first 21 terms of the Fourier-Legendre series expansion of the Heaviside function.
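A plot like Figure 7.6 can be reproduced with the following Python sketch (ours, not from the text), which uses the coefficients c_{n}=\dfrac{1}{2}\left[P_{n-1}(0)-P_{n+1}(0)\right] derived above.

import numpy as np
import matplotlib.pyplot as plt
from scipy.special import eval_legendre

x = np.linspace(-1, 1, 400)
series = 0.5 * np.ones_like(x)
for n in range(1, 22):            # partial sum through P_21 (cf. Figure 7.6)
    cn = 0.5 * (eval_legendre(n - 1, 0.0) - eval_legendre(n + 1, 0.0))
    series += cn * eval_legendre(n, x)

plt.plot(x, series, label='Fourier-Legendre partial sum')
plt.step(x, (x > 0).astype(float), label='H(x)')
plt.legend()
plt.show()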

Gamma Function

Another function that often occurs in the study of special functions is the Gamma function. We will need the Gamma function in the next section on Bessel functions.

For x>0 we define the Gamma function as

\Gamma(x)=\int_{0}^{\infty} t^{x-1} e^{-t} d t, \quad x>0 . \nonumber

The Gamma function is a generalization of the factorial function. In fact, we have

\Gamma(1)=1 \nonumber

and

\Gamma(x+1)=x \Gamma(x) . \nonumber

The reader can prove this identity by simply performing an integration by parts. (See Problem 7.7.) In particular, for integers n \in Z^{+}, we then have

\Gamma(n+1)=n \Gamma(n)=n(n-1) \Gamma(n-1)=\cdots=n(n-1) \cdots 2 \Gamma(1)=n ! \nonumber

We can also define the Gamma function for negative, non-integer values of x. We first note that by iteration on n \in Z^{+}, we have

\Gamma(x+n)=(x+n-1) \cdots(x+1) x \Gamma(x), \quad x<0, \quad x+n>0 \nonumber

Solving for \Gamma(x), we then find

\Gamma(x)=\dfrac{\Gamma(x+n)}{(x+n-1) \cdots(x+1) x}, \quad-n<x<0 \nonumber

Note that the Gamma function is undefined at zero and the negative integers.

Example 7.7. We now prove that

\Gamma\left(\dfrac{1}{2}\right)=\sqrt{\pi} . \nonumber

This is done by direct computation of the integral:

\Gamma\left(\dfrac{1}{2}\right)=\int_{0}^{\infty} t^{-\dfrac{1}{2}} e^{-t} d t \nonumber

Letting t=z^{2}, we have

\Gamma\left(\dfrac{1}{2}\right)=2 \int_{0}^{\infty} e^{-z^{2}} d z \nonumber

Due to the symmetry of the integrand, we obtain the classic integral

\Gamma\left(\dfrac{1}{2}\right)=\int_{-\infty}^{\infty} e^{-z^{2}} d z \nonumber

which can be performed using a standard trick. Consider the integral

I=\int_{-\infty}^{\infty} e^{-x^{2}} d x \nonumber

Then,

I^{2}=\int_{-\infty}^{\infty} e^{-x^{2}} d x \int_{-\infty}^{\infty} e^{-y^{2}} d y . \nonumber

Note that we changed the integration variable. This will allow us to write this product of integrals as a double integral:

I^{2}=\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} e^{-\left(x^{2}+y^{2}\right)} d x d y . \nonumber

This is an integral over the entire x y-plane. We can transform this Cartesian integration to an integration over polar coordinates. The integral becomes

I^{2}=\int_{0}^{2 \pi} \int_{0}^{\infty} e^{-r^{2}} r d r d \theta \nonumber

This is simple to integrate and we have I^{2}=\pi. So, the final result is found by taking the square root of both sides:

\Gamma\left(\dfrac{1}{2}\right)=I=\sqrt{\pi} . \nonumber

We have seen that the factorial function can be written in terms of Gamma functions. One can write the even and odd double factorials as

(2 n) ! !=2^{n} n !, \quad(2 n+1) ! !=\dfrac{(2 n+1) !}{2^{n} n !} \nonumber

In particular, one can write

\Gamma\left(n+\dfrac{1}{2}\right)=\dfrac{(2 n-1) ! !}{2^{n}} \sqrt{\pi} \nonumber

Another useful relation, which we only state, is

\Gamma(x) \Gamma(1-x)=\dfrac{\pi}{\sin \pi x} \nonumber
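The identities above are easy to check numerically. The following short Python sketch (ours, not from the text) verifies a few of them with scipy.special.

import numpy as np
from scipy.special import gamma, factorial, factorial2

assert np.isclose(gamma(0.5), np.sqrt(np.pi))            # Gamma(1/2) = sqrt(pi)
for n in range(1, 6):
    assert np.isclose(gamma(n + 1), factorial(n))         # Gamma(n+1) = n!
    assert np.isclose(gamma(n + 0.5),                     # Gamma(n+1/2) = (2n-1)!! sqrt(pi)/2^n
                      factorial2(2*n - 1) / 2**n * np.sqrt(np.pi))
x = 0.3
assert np.isclose(gamma(x) * gamma(1 - x), np.pi / np.sin(np.pi * x))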

7.4 Bessel Functions

Another important differential equation that arises in many physics applications is

x^{2} y^{\prime \prime}+x y^{\prime}+\left(x^{2}-p^{2}\right) y=0 . \nonumber

This equation is readily put into self-adjoint form as

\left(x y^{\prime}\right)^{\prime}+\left(x-\dfrac{p^{2}}{x}\right) y=0 . \nonumber

This equation was solved in the first course on differential equations using power series methods, namely by using the Frobenius Method. One assumes a series solution of the form

y(x)=\sum_{n=0}^{\infty} a_{n} x^{n+s} \nonumber

and one seeks allowed values of the constant s and a recursion relation for the coefficients, a_{n}. One finds that s=\pm p and

a_{n}=-\dfrac{a_{n-2}}{(n+s)^{2}-p^{2}}, \quad n \geq 2 . \nonumber

One solution of the differential equation is the Bessel function of the first kind of order p, given as

y(x)=J_{p}(x)=\sum_{n=0}^{\infty} \dfrac{(-1)^{n}}{\Gamma(n+1) \Gamma(n+p+1)}\left(\dfrac{x}{2}\right)^{2 n+p} . \nonumber

In Figure 7.7 we display the first few Bessel functions of the first kind of integer order. Note that these functions can be described as decaying oscillatory functions.

Figure 7.7. Plots of the Bessel functions J_{0}(x), J_{1}(x), J_{2}(x), and J_{3}(x).
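As an illustration (not from the text), a truncation of the series definition of J_{p}(x) can be compared with the library routine scipy.special.jv; the helper J_series below is our own.

import numpy as np
from scipy.special import jv, gamma

def J_series(p, x, terms=30):
    n = np.arange(terms)
    return np.sum((-1)**n / (gamma(n + 1) * gamma(n + p + 1)) * (x / 2)**(2*n + p))

for p in range(4):
    assert np.isclose(J_series(p, 2.5), jv(p, 2.5))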

A second linearly independent solution is obtained, for p not an integer, as J_{-p}(x). However, for p an integer, replacing p by -p in the series makes the \Gamma(n-p+1) factor involve evaluations of the Gamma function at zero or at negative integers. Thus, the above series is not defined in these cases.

Another method for obtaining a second linearly independent solution is through a linear combination of J_{p}(x) and J_{-p}(x) as

N_{p}(x)=Y_{p}(x)=\dfrac{\cos \pi p J_{p}(x)-J_{-p}(x)}{\sin \pi p} . \nonumber

These functions are called the Neumann functions, or Bessel functions of the second kind of order p.

In Figure 7.8 we display the first few Bessel functions of the second kind of integer order. Note that these functions are also decaying oscillatory functions. However, they are singular at x=0.

In many applications one desires a solution which is bounded at x=0, a boundary condition that these functions do not satisfy. For example, one standard problem is to describe the oscillations of a circular drumhead. For this problem one solves the wave equation using separation of variables in cylindrical coordinates. The r equation leads to a Bessel equation. The Bessel function solutions describe the radial part of the solution and one does not expect a singular solution at the center of the drum. The amplitude of the oscillation must remain finite. Thus, only Bessel functions of the first kind can be used.

Bessel functions satisfy a variety of properties, which we will only list at this time for Bessel functions of the first kind.

Derivative Identities

Figure 7.8. Plots of the Neumann functions N_{0}(x), N_{1}(x), N_{2}(x), and N_{3}(x).

\begin{aligned} \dfrac{d}{d x}\left[x^{p} J_{p}(x)\right] &=x^{p} J_{p-1}(x) \\[4pt] \dfrac{d}{d x}\left[x^{-p} J_{p}(x)\right] &=-x^{-p} J_{p+1}(x) \end{aligned} \nonumber

Recursion Formulae

\begin{aligned} &J_{p-1}(x)+J_{p+1}(x)=\dfrac{2 p}{x} J_{p}(x) \\[4pt] &J_{p-1}(x)-J_{p+1}(x)=2 J_{p}^{\prime}(x) \end{aligned} \nonumber

Orthogonality

\int_{0}^{a} x J_{p}\left(j_{p n} \dfrac{x}{a}\right) J_{p}\left(j_{p m} \dfrac{x}{a}\right) d x=\dfrac{a^{2}}{2}\left[J_{p+1}\left(j_{p n}\right)\right]^{2} \delta_{n, m} \nonumber

where j_{p n} is the nth root of J_{p}(x), J_{p}\left(j_{p n}\right)=0, n=1,2, \ldots. A list of some of these roots is provided in Table 7.4.

n p=0 p=1 p=2 p=3 p=4 p=5
1 2.405 3.832 5.135 6.379 7.586 8.780
2 5.520 7.016 8.147 9.760 11.064 12.339
3 8.654 10.173 11.620 13.017 14.373 15.700
4 11.792 13.323 14.796 16.224 17.616 18.982
5 14.931 16.470 17.960 19.410 20.827 22.220
6 18.071 19.616 21.117 22.583 24.018 25.431
7 21.212 22.760 24.270 25.749 27.200 28.628
8 24.353 25.903 27.421 28.909 30.371 31.813
9 27.494 29.047 30.571 32.050 33.512 34.983

Table 7.4. The zeros of Bessel Functions
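The zeros in Table 7.4 can be regenerated (an illustrative sketch, not part of the text) with scipy.special.jn_zeros, which returns the first positive zeros of J_p.

from scipy.special import jn_zeros

for p in range(6):
    print('p =', p, jn_zeros(p, 9).round(3))   # nine zeros per order, as in Table 7.4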

Generating Function

e^{x\left(t-\dfrac{1}{t}\right) / 2}=\sum_{n=-\infty}^{\infty} J_{n}(x) t^{n}, \quad x>0, t \neq 0 \nonumber

Integral Representation

J_{n}(x)=\dfrac{1}{\pi} \int_{0}^{\pi} \cos (x \sin \theta-n \theta) d \theta, \quad x>0, n \in \mathrm{Z} . \nonumber

Fourier-Bessel Series

Since the Bessel functions are an orthogonal set of eigenfunctions of a Sturm-Liouville problem, we can expand square integrable functions in this basis. In fact, the eigenvalue problem is given in the form

x^{2} y^{\prime \prime}+x y^{\prime}+\left(\lambda x^{2}-p^{2}\right) y=0 . \nonumber

The solutions are then of the form J_{p}(\sqrt{\lambda} x), as can be shown by making the substitution t=\sqrt{\lambda} x in the differential equation.

Furthermore, one can solve the differential equation on a finite domain, [0, a], with the boundary conditions: y(x) is bounded at x=0 and y(a)= 0 . One can show that J_{p}\left(j_{p n} \dfrac{x}{a}\right) is a basis of eigenfunctions and the resulting Fourier-Bessel series expansion of f(x) defined on x \in[0, a] is

f(x)=\sum_{n=1}^{\infty} c_{n} J_{p}\left(j_{p n} \dfrac{x}{a}\right), \nonumber

where the Fourier-Bessel coefficients are found using the orthogonality relation as

c_{n}=\dfrac{2}{a^{2}\left[J_{p+1}\left(j_{p n}\right)\right]^{2}} \int_{0}^{a} x f(x) J_{p}\left(j_{p n} \dfrac{x}{a}\right) d x . \nonumber

Example 7.8. Expand f(x)=1 for 0 \leq x \leq 1 in a Fourier-Bessel series of the form

f(x)=\sum_{n=1}^{\infty} c_{n} J_{0}\left(j_{0 n} x\right) \nonumber

We need only compute the Fourier-Bessel coefficients in Equation (7.50):

c_{n}=\dfrac{2}{\left[J_{1}\left(j_{0 n}\right)\right]^{2}} \int_{0}^{1} x J_{0}\left(j_{0 n} x\right) d x . \nonumber

From Equation (7.41) we have

\begin{aligned} \int_{0}^{1} x J_{0}\left(j_{0 n} x\right) d x &=\dfrac{1}{j_{0 n}^{2}} \int_{0}^{j_{0 n}} y J_{0}(y) d y \\[4pt] &=\dfrac{1}{j_{0 n}^{2}} \int_{0}^{j_{0 n}} \dfrac{d}{d y}\left[y J_{1}(y)\right] d y \\[4pt] &=\dfrac{1}{j_{0 n}^{2}}\left[y J_{1}(y)\right]_{0}^{j_{0 n}} \\[4pt] &=\dfrac{1}{j_{0 n}} J_{1}\left(j_{0 n}\right) \end{aligned} \nonumber

As a result, we have found that the desired Fourier-Bessel expansion is

1=2 \sum_{n=1}^{\infty} \dfrac{J_{0}\left(j_{0 n} x\right)}{j_{0 n} J_{1}\left(j_{0 n}\right)}, \quad 0<x<1 \nonumber

In Figure 7.9 we show the partial sum for the first fifty terms of this series. We see that there is slow convergence due to the Gibbs’ phenomenon.

Figure 7.9. Plot of the first 50 terms of the Fourier-Bessel series in Equation (7.53) for f(x)=1 on 0<x<1.
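A plot like Figure 7.9 can be produced with the following Python sketch (ours, not from the text), which sums the first fifty terms of the expansion in Equation (7.53) using scipy's Bessel routines.

import numpy as np
import matplotlib.pyplot as plt
from scipy.special import j0, j1, jn_zeros

x = np.linspace(0, 1, 400)
zeros = jn_zeros(0, 50)                              # the first fifty zeros j_{0n}
partial = sum(2 * j0(j * x) / (j * j1(j)) for j in zeros)

plt.plot(x, partial)
plt.axhline(1, linestyle='--')
plt.show()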

Hypergeometric Functions

Hypergeometric functions are probably the most useful, but least understood, class of functions. They typically do not make it into the undergraduate curriculum and seldom into the graduate curriculum. Most functions that you know can be expressed using hypergeometric functions. There are many approaches to these functions and the literature can fill books. { }^{1}

In 1812 Gauss published a study of the hypergeometric series

\begin{aligned} y(x)=1 &+\dfrac{\alpha \beta}{\gamma} x+\dfrac{\alpha(1+\alpha) \beta(1+\beta)}{2 ! \gamma(1+\gamma)} x^{2} \\[4pt] &+\dfrac{\alpha(1+\alpha)(2+\alpha) \beta(1+\beta)(2+\beta)}{3 ! \gamma(1+\gamma)(2+\gamma)} x^{3}+\ldots \end{aligned} \nonumber

Here \alpha, \beta, \gamma, and x are real numbers. If one sets \alpha=1 and \beta=\gamma, this series reduces to the familiar geometric series

y(x)=1+x+x^{2}+x^{3}+\ldots . \nonumber

The hypergeometric series is actually a solution of the differential equation

x(1-x) y^{\prime \prime}+[\gamma-(\alpha+\beta+1) x] y^{\prime}-\alpha \beta y=0 \nonumber

This equation was first introduced by Euler and later studied extensively by Gauss, Kummer and Riemann. It is sometimes called Gauss’ equation. Note that there is a symmetry in that \alpha and \beta may be interchanged without changing the equation. The points x=0 and x=1 are regular singular points. Series solutions may be sought using the Frobenius method. It can be confirmed that the above hypergeometric series results.

A more compact form for the hypergeometric series may be obtained by introducing new notation. One typically introduces the Pochhammer symbol, (\alpha)_{n}, satisfying (i) (\alpha)_{0}=1 if \alpha \neq 0. and (ii) (\alpha)_{k}=\alpha(1+\alpha) \ldots(k-1+\alpha), for k=1,2, \ldots

Consider (1)_{n}. For n=0,(1)_{0}=1. For n>0,

(1)_{n}=1(1+1)(2+1) \ldots[(n-1)+1] \nonumber

This reduces to (1)_{n}=n !. In fact, one can show that

(k)_{n}=\dfrac{(n+k-1) !}{(k-1) !} \nonumber

for k and n positive integers. In fact, one can extend this result to noninteger values for k by introducing the gamma function:

(\alpha)_{n}=\dfrac{\Gamma(\alpha+n)}{\Gamma(\alpha)} \nonumber

We can now write the hypergeometric series in standard notation as

{ }^{1} See for example Special Functions by G. E. Andrews, R. Askey, and R. Roy, 1999, Cambridge University Press.

{ }_{2} F_{1}(\alpha, \beta ; \gamma ; x)=\sum_{n=0}^{\infty} \dfrac{(\alpha)_{n}(\beta)_{n}}{n !(\gamma)_{n}} x^{n} \nonumber

Using this one can show that the general solution of Gauss’ equation is

y(x)=A \,{ }_{2} F_{1}(\alpha, \beta ; \gamma ; x)+B \, x^{1-\gamma}{ }_{2} F_{1}(1-\gamma+\alpha, 1-\gamma+\beta ; 2-\gamma ; x) . \nonumber

By carefully letting \beta approach \infty, one obtains what is called the confluent hypergeometric function. This in effect changes the nature of the differential equation. Gauss’ equation has three regular singular points at x=0,1, \infty. One can transform Gauss’ equation by letting x=u / \beta. This changes the regular singular points to u=0, \beta, \infty. Letting \beta \rightarrow \infty, two of the singular points merge.

The new confluent hypergeometric function is then given as

{ }_{1} F_{1}(\alpha ; \gamma ; u)=\lim _{\beta \rightarrow \infty}{ }_{2} F_{1}\left(\alpha, \beta ; \gamma ; \dfrac{u}{\beta}\right) . \nonumber

This function satisfies the differential equation

x y^{\prime \prime}+(\gamma-x) y^{\prime}-\alpha y=0 . \nonumber

The purpose of this section is only to introduce the hypergeometric function. Many other special functions are related to the hypergeometric function after making some variable transformations. For example, the Legendre polynomials are given by

P_{n}(x)={ }_{2} F_{1}\left(-n, n+1 ; 1 ; \dfrac{1-x}{2}\right) . \nonumber

In fact, one can also show that

\sin ^{-1} x=x \,{ }_{2} F_{1}\left(\dfrac{1}{2}, \dfrac{1}{2} ; \dfrac{3}{2} ; x^{2}\right) . \nonumber

The Bessel function J_{p}(x) can be written in terms of confluent hypergeometric functions as

J_{p}(x)=\dfrac{1}{\Gamma(p+1)}\left(\dfrac{x}{2}\right)^{p} e^{-i x}{ }_{1} F_{1}\left(\dfrac{1}{2}+p ; 1+2 p ; 2 i x\right) . \nonumber

These are just a few connections of the powerful hypergeometric functions to some of the elementary functions that you know.
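The connections to the Legendre polynomials and to the inverse sine are easy to test numerically. The following Python sketch (ours, not from the text) uses scipy.special.hyp2f1.

import numpy as np
from scipy.special import hyp2f1, eval_legendre

x = np.linspace(-0.9, 0.9, 7)
for n in range(5):
    # P_n(x) = 2F1(-n, n+1; 1; (1-x)/2)
    assert np.allclose(hyp2f1(-n, n + 1, 1, (1 - x) / 2), eval_legendre(n, x))

# arcsin(x) = x 2F1(1/2, 1/2; 3/2; x^2)
assert np.allclose(x * hyp2f1(0.5, 0.5, 1.5, x**2), np.arcsin(x))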

Appendix: The Binomial Expansion

In this section we had to recall the binomial expansion. This is simply the expansion of the expression (a+b)^{p}. We will investigate this expansion first for nonnegative integer powers p and then derive the expansion for other values of p.

Let's list some of the common expansions for nonnegative integer powers.

\begin{aligned} &(a+b)^{0}=1 \\[4pt] &(a+b)^{1}=a+b \\[4pt] &(a+b)^{2}=a^{2}+2 a b+b^{2} \\[4pt] &(a+b)^{3}=a^{3}+3 a^{2} b+3 a b^{2}+b^{3} \\[4pt] &(a+b)^{4}=a^{4}+4 a^{3} b+6 a^{2} b^{2}+4 a b^{3}+b^{4} \end{aligned} \nonumber

We now look at the patterns of the terms in the expansions. First, we note that each term consists of a product of a power of a and a power of b. The powers of a are decreasing from n to 0 in the expansion of (a+b)^{n}. Similarly, the powers of b increase from 0 to n. The sum of the exponents in each term is n. So, we can write the (k+1) st term in the expansion as a^{n-k} b^{k}. For example, in the expansion of (a+b)^{51} the 6 th term is a^{51-5} b^{5}=a^{46} b^{5}. However, we do not know the numerical coefficient in the expansion.

We now list the coefficients for the above expansions:

\begin{aligned} & n=0: \quad 1 \\[4pt] & n=1: \quad 1 \quad 1 \\[4pt] & n=2: \quad 1 \quad 2 \quad 1 \\[4pt] & n=3: \quad 1 \quad 3 \quad 3 \quad 1 \\[4pt] & n=4: \quad 1 \quad 4 \quad 6 \quad 4 \quad 1 \end{aligned} \nonumber

This pattern is the famous Pascal’s triangle. There are many interesting features of this triangle. But we will first ask how each row can be generated.

We see that each row begins and ends with a one. Next, the second term and the next to last term have a coefficient of n. We also note that consecutive pairs in each row can be added to obtain entries in the next row. For example, adding the consecutive pair 1 and 2 in the n=2 row gives the entry 3 in the n=3 row.

With this in mind, we can generate the next several rows of our triangle.

Of course, it would take a while to compute each row up to the desired n. We need a simple expression for computing a specific coefficient. Consider the k th term in the expansion of (a+b)^{n}. Let r=k-1. Then this term is of the form C_{r}^{n} a^{n-r} b^{r}. We have seen that the coefficients satisfy

C_{r}^{n}=C_{r}^{n-1}+C_{r-1}^{n-1} \nonumber

Actually, the coefficients have been found to take a simple form.

C_{r}^{n}=\dfrac{n !}{(n-r) ! r !}=\left(\begin{array}{c} n \\[4pt] r \end{array}\right) \nonumber

This is nothing other than the combinatoric symbol for determining how to choose n things r at a time. In our case, this makes sense. We have to count the number of ways that we can arrange the products of r b’s with n-r a ’s. There are n slots to place the b ’s. For example, the r=2 case for n=4 involves the six products: a a b b, a b a b, a b b a, b a a b, b a b a, and bbaa. Thus, it is natural to use this notation. The original problem that concerned Pascal was in gambling.

So, we have found that

(a+b)^{n}=\sum_{r=0}^{n}\left(\begin{array}{c} n \\[4pt] r \end{array}\right) a^{n-r} b^{r} \nonumber

What if a \gg b ? Can we use this to get an approximation to (a+b)^{n} ? If we neglect b then (a+b)^{n} \simeq a^{n}. How good of an approximation is this? This is where it would be nice to know the order of the next term in the expansion, which we could state using big O notation. In order to do this we first divide out a as

(a+b)^{n}=a^{n}\left(1+\dfrac{b}{a}\right)^{n} . \nonumber

Now we have a small parameter, \dfrac{b}{a}. According to what we have seen above, we can use the binomial expansion to write

\left(1+\dfrac{b}{a}\right)^{n}=\sum_{r=0}^{n}\left(\begin{array}{l} n \\[4pt] r \end{array}\right)\left(\dfrac{b}{a}\right)^{r} \nonumber

Thus, we have a finite sum of terms involving powers of \dfrac{b}{a}. Since a \gg b, most of these terms can be neglected. So, we can write

\left(1+\dfrac{b}{a}\right)^{n}=1+n \dfrac{b}{a}+O\left(\left(\dfrac{b}{a}\right)^{2}\right) \nonumber

Note that we have used the observation that the second coefficient in the nth row of Pascal’s triangle is n.

Summarizing, this then gives

\begin{aligned} (a+b)^{n} &=a^{n}\left(1+\dfrac{b}{a}\right)^{n} \\[4pt] &=a^{n}\left(1+n \dfrac{b}{a}+O\left(\left(\dfrac{b}{a}\right)^{2}\right)\right) \\[4pt] &=a^{n}+n a^{n} \dfrac{b}{a}+a^{n} O\left(\left(\dfrac{b}{a}\right)^{2}\right) \end{aligned} \nonumber

Therefore, we can approximate (a+b)^{n} \simeq a^{n}+n b a^{n-1}, with an error on the order of b^{2} a^{n-2}. Note that the order of the error does not include the constant factor from the expansion. We could also use the approximation that (a+b)^{n} \simeq a^{n}, but it is not as good because the error in this case is of the order b a^{n-1}.

We have seen that

\dfrac{1}{1-x}=1+x+x^{2}+\ldots \nonumber

But, \dfrac{1}{1-x}=(1-x)^{-1}. This is again a binomial to a power, but the power is not a nonnegative integer. It turns out that the coefficients of such a binomial expansion can be written in a form similar to that in Equation (7.60).

This example suggests that our sum may no longer be finite. So, for p a real number, we write

(1+x)^{p}=\sum_{r=0}^{\infty}\left(\begin{array}{l} p \\[4pt] r \end{array}\right) x^{r} . \nonumber

However, we quickly run into problems with this form. Consider the coefficient for r=1 in an expansion of (1+x)^{-1}. This is given by

\left(\begin{array}{c} -1 \\[4pt] 1 \end{array}\right)=\dfrac{(-1) !}{(-1-1) ! 1 !}=\dfrac{(-1) !}{(-2) ! 1 !} \text {. } \nonumber

But what is (-1) ! ? By definition, it is

(-1) !=(-1)(-2)(-3) \cdots . \nonumber

This product does not seem to exist! But with a little care, we note that

\dfrac{(-1) !}{(-2) !}=\dfrac{(-1)(-2) !}{(-2) !}=-1 \text {. } \nonumber

So, we need to be careful not to interpret the combinatorial coefficient literally. There are better ways to write the general binomial expansion. We can write the general coefficient as

\begin{aligned} \left(\begin{array}{l} p \\[4pt] r \end{array}\right) &=\dfrac{p !}{(p-r) ! r !} \\[4pt] &=\dfrac{p(p-1) \cdots(p-r+1)(p-r) !}{(p-r) ! r !} \\[4pt] &=\dfrac{p(p-1) \cdots(p-r+1)}{r !} . \end{aligned} \nonumber

With this in mind we now state the theorem:

General Binomial Expansion The general binomial expansion for (1+ x)^{p} is a simple generalization of Equation (7.60). For p real, we have that

\begin{aligned} (1+x)^{p} &=\sum_{r=0}^{\infty} \dfrac{p(p-1) \cdots(p-r+1)}{r !} x^{r} \\[4pt] &=\sum_{r=0}^{\infty} \dfrac{\Gamma(p+1)}{r ! \Gamma(p-r+1)} x^{r} \end{aligned} \nonumber

Often we need the first few terms for the case that x \ll 1 :

(1+x)^{p}=1+p x+\dfrac{p(p-1)}{2} x^{2}+O\left(x^{3}\right) . \nonumber
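As a small numerical illustration (ours, not from the text), the coefficients p(p-1) \cdots(p-r+1) / r ! can be computed directly and the truncated series compared with (1+x)^{p}; the function name gen_binom is our own.

import math

def gen_binom(p, r):
    # generalized binomial coefficient for real p and nonnegative integer r
    num = 1.0
    for k in range(r):
        num *= (p - k)
    return num / math.factorial(r)

p, x = -0.5, 0.1
approx = sum(gen_binom(p, r) * x**r for r in range(10))
print(approx, (1 + x)**p)    # the truncated series closely matches (1 + x)**(-1/2)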

Problems

7.1. Consider the set of vectors (-1,1,1),(1,-1,1),(1,1,-1).

a. Use the Gram-Schmidt process to find an orthonormal basis for R^{3} using this set in the given order.

b. What do you get if you reverse the order of these vectors?

7.2. Use the Gram-Schmidt process to find the first four orthogonal polynomials satisfying the following:

a. Interval: (-\infty, \infty) Weight Function: e^{-x^{2}} .

b. Interval: (0, \infty) Weight Function: e^{-x}.

7.3. Find P_{4}(x) using

a. The Rodrigues Formula in Equation (7.12)

b. The three term recursion formula in Equation (7.14).

7.4. Use the generating function for Legendre polynomials to derive the recursion formula P_{n+1}^{\prime}(x)-P_{n-1}^{\prime}(x)=(2 n+1) P_{n}(x). Namely, consider \dfrac{\partial g(x, t)}{\partial x} using Equation (7.18) to derive a three term derivative formula. Then use three term recursion formula (7.14) to obtain the above result.

7.5. Use the recursion relation (7.14) to evaluate \int_{-1}^{1} x P_{n}(x) P_{m}(x) d x, n \leq m.

7.6. Expand the following in a Fourier-Legendre series for x \in(-1,1).
a. f(x)=x^{2}.
b. f(x)=5 x^{4}+2 x^{3}-x+3.
c. f(x)=\left\{\begin{array}{c}-1,-1<x<0, \\[4pt] 1, \quad 0<x<1 .\end{array}\right.
d. f(x)=\left\{\begin{array}{l}x,-1<x<0 \\[4pt] 0,0<x<1\end{array}\right.

7.7. Use integration by parts to show \Gamma(x+1)=x \Gamma(x).

7.8. Express the following as Gamma functions. Namely, noting the form \Gamma(x+1)=\int_{0}^{\infty} t^{x} e^{-t} d t and using an appropriate substitution, each expression can be written in terms of a Gamma function.

a. \int_{0}^{\infty} x^{2 / 3} e^{-x} d x

b. \int_{0}^{\infty} x^{5} e^{-x^{2}} d x

c. \int_{0}^{1}\left[\ln \left(\dfrac{1}{x}\right)\right]^{n} d x

7.9. The Hermite polynomials, H_{n}(x), satisfy the following:

i. <H_{n}, H_{m}>=\int_{-\infty}^{\infty} e^{-x^{2}} H_{n}(x) H_{m}(x) d x=\sqrt{\pi} 2^{n} n ! \delta_{n, m}.

ii. H_{n}^{\prime}(x)=2 n H_{n-1}(x).

iii. H_{n+1}(x)=2 x H_{n}(x)-2 n H_{n-1}(x).

iv. H_{n}(x)=(-1)^{n} e^{x^{2}} \dfrac{d^{n}}{d x^{n}}\left(e^{-x^{2}}\right)

Using these, show that

a. H_{n}^{\prime \prime}-2 x H_{n}^{\prime}+2 n H_{n}=0. [Use properties ii. and iii.]

b. \int_{-\infty}^{\infty} x e^{-x^{2}} H_{n}(x) H_{m}(x) d x=\sqrt{\pi} 2^{n-1} n !\left[\delta_{m, n-1}+2(n+1) \delta_{m, n+1}\right]. [Use properties i. and iii.]

c. H_{n}(0)=\left\{\begin{array}{cc}0, & n \text { odd, } \\[4pt] (-1)^{m} \dfrac{(2 m) !}{m !}, & n=2 m\end{array}\right. [Let x=0 in iii. and iterate. Note from iv. that H_{0}(x)=1 and H_{1}(x)=2 x.]

7.10. In Maple one can type simplify(LegendreP(2*n-2,0) - LegendreP(2*n,0)); to find a value for P_{2 n-2}(0)-P_{2 n}(0). It gives the result in terms of Gamma functions. However, in Example 7.6 for Fourier-Legendre series, the value is given in terms of double factorials! So, we have

P_{2 n-2}(0)-P_{2 n}(0)=\dfrac{\sqrt{\pi}(4 n-1)}{2 \Gamma(n+1) \Gamma\left(\dfrac{3}{2}-n\right)}=(-1)^{n-1} \dfrac{(2 n-3) ! !}{(2 n-2) ! !} \dfrac{4 n-1}{2 n} . \nonumber

You will verify that both results are the same by doing the following:

a. Prove that P_{2 n}(0)=(-1)^{n} \dfrac{(2 n-1) ! !}{(2 n) ! !} using the generating function and a binomial expansion.

b. Prove that \Gamma\left(n+\dfrac{1}{2}\right)=\dfrac{(2 n-1) ! !}{2^{n}} \sqrt{\pi} using \Gamma(x)=(x-1) \Gamma(x-1) and iteration.

c. Verify the result from Maple that P_{2 n-2}(0)-P_{2 n}(0)=\dfrac{\sqrt{\pi}(4 n-1)}{2 \Gamma(n+1) \Gamma\left(\dfrac{3}{2}-n\right)}.

d. Can either expression for P_{2 n-2}(0)-P_{2 n}(0) be simplified further?

7.11. A solution of Bessel’s equation, x^{2} y^{\prime \prime}+x y^{\prime}+\left(x^{2}-n^{2}\right) y=0, can be found using the guess y(x)=\sum_{j=0}^{\infty} a_{j} x^{j+n}. One obtains the recurrence relation a_{j}=\dfrac{-1}{j(2 n+j)} a_{j-2}. Show that for a_{0}=\left(n ! 2^{n}\right)^{-1} we get the Bessel function of the first kind of order n from the even values j=2 k :

J_{n}(x)=\sum_{k=0}^{\infty} \dfrac{(-1)^{k}}{k !(n+k) !}\left(\dfrac{x}{2}\right)^{n+2 k} \nonumber

7.12. Use the infinite series in the last problem to derive the derivative identities (7.41) and (7.42):

a. \dfrac{d}{d x}\left[x^{n} J_{n}(x)\right]=x^{n} J_{n-1}(x).

b. \dfrac{d}{d x}\left[x^{-n} J_{n}(x)\right]=-x^{-n} J_{n+1}(x)

7.13. Bessel functions J_{p}(\lambda x) are solutions of x^{2} y^{\prime \prime}+x y^{\prime}+\left(\lambda^{2} x^{2}-p^{2}\right) y=0. Assume that x \in(0,1) and that J_{p}(\lambda)=0 and J_{p}(0) is finite.

a. Put this differential equation into Sturm-Liouville form.

b. Prove that solutions corresponding to different eigenvalues are orthogonal by first writing the corresponding Green’s identity using these Bessel functions.

c. Prove that

\int_{0}^{1} x J_{p}^{2}(\lambda x) d x=\dfrac{1}{2} J_{p+1}^{2}(\lambda)=\dfrac{1}{2} J_{p}^{\prime 2}(\lambda) \nonumber

Note that \lambda is a zero of J_{p}(x)

7.14. We can rewrite our Bessel function in a form which will allow the order to be non-integer by using the Gamma function. You will need the result from Problem 7.10b for \Gamma\left(k+\dfrac{1}{2}\right).

a. Extend the series definition of the Bessel function of the first kind of order \nu, J_{\nu}(x), for \nu \geq 0 by writing the series solution for y(x) in Problem 7.11 using the gamma function.

b. Extend the series to J_{-\nu}(x), for \nu \geq 0. Discuss the resulting series and what happens when \nu is a positive integer.

c. Use these results to obtain closed form expressions for J_{1 / 2}(x) and J_{-1 / 2}(x). Use the recursion formula for Bessel functions to obtain a closed form for J_{3 / 2}(x).

7.15. In this problem you will derive the expansion

x^{2}=\dfrac{c^{2}}{2}+4 \sum_{j=2}^{\infty} \dfrac{J_{0}\left(\alpha_{j} x\right)}{\alpha_{j}^{2} J_{0}\left(\alpha_{j} c\right)}, \quad 0<x<c \nonumber

where the \alpha_{j}'s are the positive roots of J_{1}(\alpha c)=0, by following the steps below.

a. List the first five values of \alpha for J_{1}(\alpha c)=0 using Table 7.4 and Figure 7.7. [Note: Be careful determining \alpha_{1}.]

b. Show that \left\|J_{0}\left(\alpha_{1} x\right)\right\|^{2}=\dfrac{c^{2}}{2}. Recall,

\left\|J_{0}\left(\alpha_{j} x\right)\right\|^{2}=\int_{0}^{c} x J_{0}^{2}\left(\alpha_{j} x\right) d x \nonumber

c. Show that \left\|J_{0}\left(\alpha_{j} x\right)\right\|^{2}=\dfrac{c^{2}}{2}\left[J_{0}\left(\alpha_{j} c\right)\right]^{2}, j=2,3, \ldots (This is the most involved step.) First note from Problem 7.13 that y(x)=J_{0}\left(\alpha_{j} x\right) is a solution of

x^{2} y^{\prime \prime}+x y^{\prime}+\alpha_{j}^{2} x^{2} y=0 . \nonumber

i. Show that the Sturm-Liouville form of this differential equation is \left(x y^{\prime}\right)^{\prime}=-\alpha_{j}^{2} x y

ii. Multiply the equation in part i. by y(x) and integrate from x=0 to x=c to obtain

\begin{aligned} \int_{0}^{c}\left(x y^{\prime}\right)^{\prime} y d x &=-\alpha_{j}^{2} \int_{0}^{c} x y^{2} d x \\[4pt] &=-\alpha_{j}^{2} \int_{0}^{c} x J_{0}^{2}\left(\alpha_{j} x\right) d x \end{aligned} \nonumber

iii. Noting that y(x)=J_{0}\left(\alpha_{j} x\right), integrate the left hand side by parts and use the following to simplify the resulting equation.

  1. J_{0}^{\prime}(x)=-J_{1}(x) from Equation (7.42).
  2. Equation (7.45)
  3. J_{2}\left(\alpha_{j} c\right)+J_{0}\left(\alpha_{j} c\right)=0 from Equation (7.43).

iv. Now you should have enough information to complete this part.

d. Use the results from parts b and c to derive the expansion coefficients for

x^{2}=\sum_{j=1}^{\infty} c_{j} J_{0}\left(\alpha_{j} x\right) \nonumber

in order to obtain the desired expansion.

7.16. Use the derivative identities of Bessel functions, (7.41)-(7.42), and integration by parts to show that

\int x^{3} J_{0}(x) d x=x^{3} J_{1}(x)-2 x^{2} J_{2}(x) \nonumber

Green’s Functions

In this chapter we will investigate the solution of nonhomogeneous differential equations using Green’s functions. Our goal is to solve the nonhomogeneous differential equation

L[u]=f, \nonumber

where L is a differential operator. The solution is formally given by

u=L^{-1}[f] \nonumber

The inverse of a differential operator is an integral operator, which we seek to write in the form

u=\int G(x, \xi) f(\xi) d \xi \nonumber

The function G(x, \xi) is referred to as the kernel of the integral operator and is called the Green’s function.

The history of the Green’s function dates back to 1828 , when George Green published work in which he sought solutions of Poisson’s equation \nabla^{2} u=f for the electric potential u defined inside a bounded volume with specified boundary conditions on the surface of the volume. He introduced a function now identified as what Riemann later coined the "Green’s function".

We will restrict our discussion to Green’s functions for ordinary differential equations. Extensions to partial differential equations are typically one of the subjects of a PDE course. We will begin our investigations by examining solutions of nonhomogeneous second order linear differential equations using the Method of Variation of Parameters, which is typically seen in a first course on differential equations. We will identify the Green’s function for both initial value and boundary value problems. We will then focus on boundary value Green’s functions and their properties. Determination of Green’s functions is also possible using Sturm-Liouville theory. This leads to series representation of Green’s functions, which we will study in the last section of this chapter.

The Method of Variation of Parameters

We are interested in solving nonhomogeneous second order linear differential equations of the form

a_{2}(x) y^{\prime \prime}(x)+a_{1}(x) y^{\prime}(x)+a_{0}(x) y(x)=f(x) . \nonumber

The general solution of this nonhomogeneous second order linear differential equation is found as a sum of the general solution of the homogeneous equation,

a_{2}(x) y^{\prime \prime}(x)+a_{1}(x) y^{\prime}(x)+a_{0}(x) y(x)=0, \nonumber

and a particular solution of the nonhomogeneous equation. Recall from Chapter 1 that there are several approaches to finding particular solutions of nonhomogeneous equations. Any guess that works would be sufficient. An intelligent guess, based upon the Method of Undetermined Coefficients, was reviewed previously in Chapter 1. However, a more methodical approach, which is first seen in a first course in differential equations, is the Method of Variation of Parameters. Also, we explored the matrix version of this method in Section 2.8. We will review this method in this section and extend it to the solution of boundary value problems.

While it is sufficient to derive the method for the general differential equation above, we will instead consider solving equations that are in SturmLiouville, or self-adjoint, form. Therefore, we will apply the Method of Variation of Parameters to the equation

\dfrac{d}{d x}\left(p(x) \dfrac{d y(x)}{d x}\right)+q(x) y(x)=f(x) \nonumber

Note that f(x) in this equation is not the same function as in the general equation posed at the beginning of this section.

We begin by assuming that we have determined two linearly independent solutions of the homogeneous equation. The general solution is then given by

y(x)=c_{1} y_{1}(x)+c_{2} y_{2}(x) . \nonumber

In order to determine a particular solution of the nonhomogeneous equation, we vary the parameters c_{1} and c_{2} in the solution of the homogeneous problem by making them functions of the independent variable. Thus, we seek a particular solution of the nonhomogeneous equation in the form

y_{p}(x)=c_{1}(x) y_{1}(x)+c_{2}(x) y_{2}(x) . \nonumber

In order for this to be a solution, we need to show that it satisfies the differential equation. We first compute the derivatives of y_{p}(x). The first derivative is

y_{p}^{\prime}(x)=c_{1}(x) y_{1}^{\prime}(x)+c_{2}(x) y_{2}^{\prime}(x)+c_{1}^{\prime}(x) y_{1}(x)+c_{2}^{\prime}(x) y_{2}(x) . \nonumber

Without loss of generality, we will set the sum of the last two terms to zero. (One can show that the same results would be obtained if we did not. See Problem 8.2.) Then, we have

c_{1}^{\prime}(x) y_{1}(x)+c_{2}^{\prime}(x) y_{2}(x)=0 . \nonumber

Now, we take the second derivative of the remaining terms to obtain

y_{p}^{\prime \prime}(x)=c_{1}(x) y_{1}^{\prime \prime}(x)+c_{2}(x) y_{2}^{\prime \prime}(x)+c_{1}^{\prime}(x) y_{1}^{\prime}(x)+c_{2}^{\prime}(x) y_{2}^{\prime}(x) \nonumber

Expanding the derivative term in Equation (8.3),

p(x) y_{p}^{\prime \prime}(x)+p^{\prime}(x) y_{p}^{\prime}(x)+q(x) y_{p}(x)=f(x), \nonumber

and inserting the expressions for y_{p}, y_{p}^{\prime}(x), and y_{p}^{\prime \prime}(x), we have

\begin{aligned} f(x)=& p(x)\left[c_{1}(x) y_{1}^{\prime \prime}(x)+c_{2}(x) y_{2}^{\prime \prime}(x)+c_{1}^{\prime}(x) y_{1}^{\prime}(x)+c_{2}^{\prime}(x) y_{2}^{\prime}(x)\right] \\[4pt] &+p^{\prime}(x)\left[c_{1}(x) y_{1}^{\prime}(x)+c_{2}(x) y_{2}^{\prime}(x)\right]+q(x)\left[c_{1}(x) y_{1}(x)+c_{2}(x) y_{2}(x)\right] . \end{aligned} \nonumber

Rearranging terms, we find

\begin{aligned} f(x)=& c_{1}(x)\left[p(x) y_{1}^{\prime \prime}(x)+p^{\prime}(x) y_{1}^{\prime}(x)+q(x) y_{1}(x)\right] \\[4pt] &+c_{2}(x)\left[p(x) y_{2}^{\prime \prime}(x)+p^{\prime}(x) y_{2}^{\prime}(x)+q(x) y_{2}(x)\right] \\[4pt] &+p(x)\left[c_{1}^{\prime}(x) y_{1}^{\prime}(x)+c_{2}^{\prime}(x) y_{2}^{\prime}(x)\right] . \end{aligned} \nonumber

Since y_{1}(x) and y_{2}(x) are both solutions of the homogeneous equation, the first two bracketed expressions vanish. Dividing by p(x), we have that

c_{1}^{\prime}(x) y_{1}^{\prime}(x)+c_{2}^{\prime}(x) y_{2}^{\prime}(x)=\dfrac{f(x)}{p(x)} . \nonumber

Our goal is to determine c_{1}(x) and c_{2}(x). In this analysis, we have found that the derivatives of these functions satisfy a linear system of equations (in the c_{i} ’s):

Linear System for Variation of Parameters
c_{1}^{\prime}(x) y_{1}(x)+c_{2}^{\prime}(x) y_{2}(x)=0 .
c_{1}^{\prime}(x) y_{1}^{\prime}(x)+c_{2}^{\prime}(x) y_{2}^{\prime}(x)=\dfrac{f(x)}{p(x)}

This system is easily solved to give

\begin{aligned} c_{1}^{\prime}(x) &=-\dfrac{f(x) y_{2}(x)}{p(x)\left[y_{1}(x) y_{2}^{\prime}(x)-y_{1}^{\prime}(x) y_{2}(x)\right]} \\[4pt] c_{2}^{\prime}(x) &=\dfrac{f(x) y_{1}(x)}{p(x)\left[y_{1}(x) y_{2}^{\prime}(x)-y_{1}^{\prime}(x) y_{2}(x)\right]} \end{aligned} \nonumber

We note that the denominator in these expressions involves the Wronskian of the solutions to the homogeneous problem. Recall that

W\left(y_{1}, y_{2}\right)(x)=\left|\begin{array}{ll} y_{1}(x) & y_{2}(x) \\[4pt] y_{1}^{\prime}(x) & y_{2}^{\prime}(x) \end{array}\right| \nonumber
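
As a quick check on this algebra, one can let a computer algebra system solve the linear system for c_{1}^{\prime}(x) and c_{2}^{\prime}(x). The following sympy sketch (an addition for verification, not part of the original derivation) reproduces the expressions above, with the Wronskian appearing in the denominator.

import sympy as sp

x = sp.symbols('x')
y1, y2, f, p = (sp.Function(name)(x) for name in ('y1', 'y2', 'f', 'p'))
c1p, c2p = sp.symbols('c1p c2p')      # stand-ins for c_1'(x) and c_2'(x)

# the linear system for Variation of Parameters
eqs = [sp.Eq(c1p*y1 + c2p*y2, 0),
       sp.Eq(c1p*sp.diff(y1, x) + c2p*sp.diff(y2, x), f/p)]
sol = sp.solve(eqs, [c1p, c2p])

W = y1*sp.diff(y2, x) - sp.diff(y1, x)*y2     # Wronskian of y1 and y2
print(sp.simplify(sol[c1p] + f*y2/(p*W)))     # 0, so c1' = -f y2 / (p W)
print(sp.simplify(sol[c2p] - f*y1/(p*W)))     # 0, so c2' =  f y1 / (p W)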

Furthermore, we can show that the denominator, p(x) W(x), is constant. Differentiating this expression and using the homogeneous form of the differential equation proves this assertion.

\begin{aligned} \dfrac{d}{d x}(p(x) W(x)) &= \dfrac{d}{d x}\left[p(x)\left(y_{1}(x) y_{2}^{\prime}(x)-y_{1}^{\prime}(x) y_{2}(x)\right)\right] \\[4pt] &= y_{1}(x) \dfrac{d}{d x}\left(p(x) y_{2}^{\prime}(x)\right)+p(x) y_{2}^{\prime}(x) y_{1}^{\prime}(x) \\[4pt] &\quad -y_{2}(x) \dfrac{d}{d x}\left(p(x) y_{1}^{\prime}(x)\right)-p(x) y_{1}^{\prime}(x) y_{2}^{\prime}(x) \\[4pt] &= -y_{1}(x) q(x) y_{2}(x)+y_{2}(x) q(x) y_{1}(x)=0 . \end{aligned} \nonumber

Therefore,

p(x) W(x)=\text { constant. } \nonumber

So, after an integration, we find the parameters as

\begin{aligned} &c_{1}(x)=-\int_{x_{0}}^{x} \dfrac{f(\xi) y_{2}(\xi)}{p(\xi) W(\xi)} d \xi \\[4pt] &c_{2}(x)=\int_{x_{1}}^{x} \dfrac{f(\xi) y_{1}(\xi)}{p(\xi) W(\xi)} d \xi \end{aligned} \nonumber

where x_{0} and x_{1} are arbitrary constants to be determined later.

Therefore, the particular solution of (8.3) can be written as

y_{p}(x)=y_{2}(x) \int_{x_{1}}^{x} \dfrac{f(\xi) y_{1}(\xi)}{p(\xi) W(\xi)} d \xi-y_{1}(x) \int_{x_{0}}^{x} \dfrac{f(\xi) y_{2}(\xi)}{p(\xi) W(\xi)} d \xi \nonumber

As a further note, we usually do not rewrite our initial value problems in self-adjoint form. Recall that for an equation of the form

a_{2}(x) y^{\prime \prime}(x)+a_{1}(x) y^{\prime}(x)+a_{0}(x) y(x)=g(x) . \nonumber

we obtained the self-adjoint form by multiplying the equation by

\dfrac{1}{a_{2}(x)} e^{\int \dfrac{a_{1}(x)}{a_{2}(x)} d x}=\dfrac{1}{a_{2}(x)} p(x) . \nonumber

This gives the standard form

\left(p(x) y^{\prime}(x)\right)^{\prime}+q(x) y(x)=f(x) \nonumber

where

f(x)=\dfrac{1}{a_{2}(x)} p(x) g(x) . \nonumber

With this in mind, Equation (8.13) becomes

y_{p}(x)=y_{2}(x) \int_{x_{1}}^{x} \dfrac{g(\xi) y_{1}(\xi)}{a_{2}(\xi) W(\xi)} d \xi-y_{1}(x) \int_{x_{0}}^{x} \dfrac{g(\xi) y_{2}(\xi)}{a_{2}(\xi) W(\xi)} d \xi . \nonumber

Example 8.1. Consider the nonhomogeneous differential equation

y^{\prime \prime}-y^{\prime}-6 y=20 e^{-2 x} . \nonumber

We seek a particular solution to this equation. First, we note that two linearly independent solutions of the corresponding homogeneous equation are

y_{1}(x)=e^{3 x}, \quad y_{2}(x)=e^{-2 x} . \nonumber

So, the particular solution takes the form

y_{p}(x)=c_{1}(x) e^{3 x}+c_{2}(x) e^{-2 x} \nonumber

We just need to determine the c_{i} ’s. Since this problem is not in self-adjoint form, we will use

\dfrac{f(x)}{p(x)}=\dfrac{g(x)}{a_{2}(x)}=20 e^{-2 x} \nonumber

as seen above. Then the linear system we have to solve is

\begin{aligned} c_{1}^{\prime}(x) e^{3 x}+c_{2}^{\prime}(x) e^{-2 x} &=0 \\[4pt] 3 c_{1}^{\prime}(x) e^{3 x}-2 c_{2}^{\prime}(x) e^{-2 x} &=20 e^{-2 x} \end{aligned} \nonumber

Multiplying the first equation by 2 and adding the equations yields

5 c_{1}^{\prime}(x) e^{3 x}=20 e^{-2 x} \nonumber

or

c_{1}^{\prime}(x)=4 e^{-5 x} \nonumber

Inserting this back into the first equation in the system, we have

4 e^{-2 x}+c_{2}^{\prime}(x) e^{-2 x}=0, \nonumber

leading to

c_{2}^{\prime}(x)=-4 . \nonumber

These equations are easily integrated to give

c_{1}(x)=-\dfrac{4}{5} e^{-5 x}, \quad c_{2}(x)=-4 x . \nonumber

Therefore, the particular solution has been found as

\begin{aligned} y_{p}(x) &=c_{1}(x) e^{3 x}+c_{2}(x) e^{-2 x} \\[4pt] &=-\dfrac{4}{5} e^{-5 x} e^{3 x}-4 x e^{-2 x} \\[4pt] &=-\dfrac{4}{5} e^{-2 x}-4 x e^{-2 x} \end{aligned} \nonumber

Note that the first term can be absorbed into the solution of the homogeneous problem, so the particular solution can simply be written as

y_{p}(x)=-4 x e^{-2 x} \nonumber

This is the answer you would have found had you used the Modified Method of Undetermined Coefficients.
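
As a sanity check (a short sympy sketch added here, not part of the text), one can confirm that y_{p}(x)=-4 x e^{-2 x} satisfies the nonhomogeneous equation.

import sympy as sp

x = sp.symbols('x')
yp = -4*x*sp.exp(-2*x)
# residual of y'' - y' - 6y - 20 e^{-2x}; it should vanish identically
residual = sp.diff(yp, x, 2) - sp.diff(yp, x) - 6*yp - 20*sp.exp(-2*x)
print(sp.simplify(residual))   # 0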

Example 8.2. Revisiting the last example, y^{\prime \prime}-y^{\prime}-6 y=20 e^{-2 x}.

The formal solution in Equation (8.13) was not used in the last example. Instead, we proceeded from the Linear System for Variation of Parameters earlier in this section. This is the more natural approach towards finding the particular solution of the nonhomogeneous equation. Since we will be using Equation (8.13) to obtain solutions to initial value and boundary value problems, it might be useful to use it to solve this problem.

From the last example we have

y_{1}(x)=e^{3 x}, \quad y_{2}(x)=e^{-2 x} \nonumber

We need to compute the Wronskian:

W(x)=W\left(y_{1}, y_{2}\right)(x)=\left|\begin{array}{cc} e^{3 x} & e^{-2 x} \\[4pt] 3 e^{3 x} & -2 e^{-2 x} \end{array}\right|=-5 e^{x} \nonumber

Also, we need p(x), which is given by

p(x)=\exp \left(-\int d x\right)=e^{-x} . \nonumber

So, we see that p(x) W(x)=-5. It is indeed constant, just as we had proven earlier.

Finally, we need f(x). Here is where one needs to be careful as the original problem was not in self-adjoint form. We have from the original equation that g(x)=20 e^{-2 x} and a_{2}(x)=1. So,

f(x)=\dfrac{p(x)}{a_{2}(x)} g(x)=20 e^{-3 x} \nonumber

Now we are ready to construct the solution.

\begin{aligned} y_{p}(x) &=y_{2}(x) \int_{x_{1}}^{x} \dfrac{f(\xi) y_{1}(\xi)}{p(\xi) W(\xi)} d \xi-y_{1}(x) \int_{x_{0}}^{x} \dfrac{f(\xi) y_{2}(\xi)}{p(\xi) W(\xi)} d \xi \\[4pt] &=e^{-2 x} \int_{x_{1}}^{x} \dfrac{20 e^{-3 \xi} e^{3 \xi}}{-5} d \xi-e^{3 x} \int_{x_{0}}^{x} \dfrac{20 e^{-3 \xi} e^{-2 \xi}}{-5} d \xi \\[4pt] &=-4 e^{-2 x} \int_{x_{1}}^{x} d \xi+4 e^{3 x} \int_{x_{0}}^{x} e^{-5 \xi} d \xi \\[4pt] &=-\left.4 \xi e^{-2 x}\right|_{x_{1}} ^{x}-\left.\dfrac{4}{5} e^{3 x} e^{-5 \xi}\right|_{x_{0}} ^{x} \\[4pt] &=-4 x e^{-2 x}-\dfrac{4}{5} e^{-2 x}+4 x_{1} e^{-2 x}+\dfrac{4}{5} e^{-5 x_{0}} e^{3 x} \end{aligned} \nonumber

Note that the first two terms are the particular solution found in the last example. The remaining two terms are simply a linear combination of y_{1} and y_{2}. Thus, the solution of the homogeneous problem is really contained within the solution when we use arbitrary constants as the lower limits of the integrals. In the next section we will make use of these constants when solving initial value and boundary value problems.

In the next section we will determine the unknown constants subject to either initial conditions or boundary conditions. This will allow us to combine the two integrals and then determine the appropriate Green’s functions.

Initial and Boundary Value Green’s Functions

We begin with the particular solution (8.13) of our nonhomogeneous differential equation (8.3). This can be combined with the general solution of the homogeneous problem to give the general solution of the nonhomogeneous differential equation:

y(x)=c_{1} y_{1}(x)+c_{2} y_{2}(x)+y_{2}(x) \int_{x_{1}}^{x} \dfrac{f(\xi) y_{1}(\xi)}{p(\xi) W(\xi)} d \xi-y_{1}(x) \int_{x_{0}}^{x} \dfrac{f(\xi) y_{2}(\xi)}{p(\xi) W(\xi)} d \xi \nonumber

As seen in the last section, an appropriate choice of x_{0} and x_{1} can be found so that we need not explicitly write out the solution to the homogeneous problem, c_{1} y_{1}(x)+c_{2} y_{2}(x). However, setting up the solution in this form will allow us to use x_{0} and x_{1} to determine particular solutions that satisfy certain homogeneous conditions.

We will now consider initial value and boundary value problems. Each type of problem will lead to a solution of the form

y(x)=c_{1} y_{1}(x)+c_{2} y_{2}(x)+\int_{a}^{b} G(x, \xi) f(\xi) d \xi \nonumber

where the function G(x, \xi) will be identified as the Green’s function and the appropriate integration limits will be determined. Having identified the Green’s function, we will look at other methods for determining it in the last section.

Initial Value Green’s Function

We begin by considering the solution of the initial value problem

\begin{array}{r} \dfrac{d}{d x}\left(p(x) \dfrac{d y(x)}{d x}\right)+q(x) y(x)=f(x) . \\[4pt] y(0)=y_{0}, \quad y^{\prime}(0)=v_{0} . \end{array} \nonumber

Of course, we could have studied the original form of our differential equation without writing it in self-adjoint form. However, this form is useful when studying boundary value problems. We will return to this point later.

We first note that we can solve this initial value problem by solving two separate initial value problems. We assume that the solution of the homogeneous problem satisfies the original initial conditions:

\begin{aligned} \dfrac{d}{d x}\left(p(x) \dfrac{d y_{h}(x)}{d x}\right)+q(x) y_{h}(x) &=0 \\[4pt] y_{h}(0)=y_{0}, \quad y_{h}^{\prime}(0) &=v_{0} \end{aligned} \nonumber

We then assume that the particular solution satisfies the problem

\begin{array}{r} \dfrac{d}{d x}\left(p(x) \dfrac{d y_{p}(x)}{d x}\right)+q(x) y_{p}(x)=f(x) \\[4pt] y_{p}(0)=0, \quad y_{p}^{\prime}(0)=0 \end{array} \nonumber

Since the differential equation is linear, we know that y(x)=y_{h}(x)+y_{p}(x) is a solution of the nonhomogeneous equation. Moreover, this solution satisfies the initial conditions:

\begin{gathered} y(0)=y_{h}(0)+y_{p}(0)=y_{0}+0=y_{0}, \\[4pt] y^{\prime}(0)=y_{h}^{\prime}(0)+y_{p}^{\prime}(0)=v_{0}+0=v_{0} . \end{gathered} \nonumber

Therefore, we need only focus on solving for the particular solution that satisfies homogeneous initial conditions.

Recall Equation (8.13) from the last section,

y_{p}(x)=y_{2}(x) \int_{x_{1}}^{x} \dfrac{f(\xi) y_{1}(\xi)}{p(\xi) W(\xi)} d \xi-y_{1}(x) \int_{x_{0}}^{x} \dfrac{f(\xi) y_{2}(\xi)}{p(\xi) W(\xi)} d \xi \nonumber

We now seek values for x_{0} and x_{1} which satisfy the homogeneous initial conditions, y_{p}(0)=0 and y_{p}^{\prime}(0)=0.

First, we consider y_{p}(0)=0. We have

y_{p}(0)=y_{2}(0) \int_{x_{1}}^{0} \dfrac{f(\xi) y_{1}(\xi)}{p(\xi) W(\xi)} d \xi-y_{1}(0) \int_{x_{0}}^{0} \dfrac{f(\xi) y_{2}(\xi)}{p(\xi) W(\xi)} d \xi \nonumber

Here, y_{1}(x) and y_{2}(x) are taken to be any solutions of the homogeneous differential equation. Let’s assume that y_{1}(0)=0 and y_{2}(0) \neq 0. Then we have

y_{p}(0)=y_{2}(0) \int_{x_{1}}^{0} \dfrac{f(\xi) y_{1}(\xi)}{p(\xi) W(\xi)} d \xi \nonumber

We can force y_{p}(0)=0 if we set x_{1}=0.

Now, we consider y_{p}^{\prime}(0)=0. First we differentiate the solution and find that

y_{p}^{\prime}(x)=y_{2}^{\prime}(x) \int_{0}^{x} \dfrac{f(\xi) y_{1}(\xi)}{p(\xi) W(\xi)} d \xi-y_{1}^{\prime}(x) \int_{x_{0}}^{x} \dfrac{f(\xi) y_{2}(\xi)}{p(\xi) W(\xi)} d \xi \nonumber

since the contributions from differentiating the integrals will cancel. Evaluating this result at x=0, we have

y_{p}^{\prime}(0)=-y_{1}^{\prime}(0) \int_{x_{0}}^{0} \dfrac{f(\xi) y_{2}(\xi)}{p(\xi) W(\xi)} d \xi \nonumber

Assuming that y_{1}^{\prime}(0) \neq 0, we can set x_{0}=0.

Thus, we have found that

\begin{aligned} y_{p}(x) &=y_{2}(x) \int_{0}^{x} \dfrac{f(\xi) y_{1}(\xi)}{p(\xi) W(\xi)} d \xi-y_{1}(x) \int_{0}^{x} \dfrac{f(\xi) y_{2}(\xi)}{p(\xi) W(\xi)} d \xi \\[4pt] &=\int_{0}^{x}\left[\dfrac{y_{1}(\xi) y_{2}(x)-y_{1}(x) y_{2}(\xi)}{p(\xi) W(\xi)}\right] f(\xi) d \xi \end{aligned} \nonumber

This result is in the correct form and we can identify the temporal, or initial value, Green’s function. So, the particular solution is given as

y_{p}(x)=\int_{0}^{x} G(x, \xi) f(\xi) d \xi, \nonumber

where the initial value Green’s function is defined as

G(x, \xi)=\dfrac{y_{1}(\xi) y_{2}(x)-y_{1}(x) y_{2}(\xi)}{p(\xi) W(\xi)} \nonumber

We summarize this result as follows.

Solution of Initial Value Problem (8.21)

The solution of the initial value problem (8.21) takes the form

y(x)=y_{h}(x)+\int_{0}^{x} G(x, \xi) f(\xi) d \xi, \nonumber

where

G(x, \xi)=\dfrac{y_{1}(\xi) y_{2}(x)-y_{1}(x) y_{2}(\xi)}{p(\xi) W(\xi)} \nonumber

and the solution of the homogeneous problem satisfies the initial conditions,

y_{h}(0)=y_{0}, \quad y_{h}^{\prime}(0)=v_{0} \nonumber

Example 8.3. Solve the forced oscillator problem

x^{\prime \prime}+x=2 \cos t, \quad x(0)=4, \quad x^{\prime}(0)=0 . \nonumber

This problem was solved in Chapter 2 using the theory of nonhomogeneous systems. We first solve the homogeneous problem with nonhomogeneous initial conditions:

x_{h}^{\prime \prime}+x_{h}=0, \quad x_{h}(0)=4, \quad x_{h}^{\prime}(0)=0 . \nonumber

The solution is easily seen to be x_{h}(t)=4 \cos t.

Next, we construct the Green’s function. We need two linearly independent solutions, y_{1}(t), y_{2}(t), of the homogeneous differential equation satisfying y_{1}(0)=0 and y_{2}^{\prime}(0)=0. So, we pick y_{1}(t)=\sin t and y_{2}(t)=\cos t. The Wronskian is found as

W(t)=y_{1}(t) y_{2}^{\prime}(t)-y_{1}^{\prime}(t) y_{2}(t)=-\sin ^{2} t-\cos ^{2} t=-1 . \nonumber

Since p(t)=1 in this problem, we have

\begin{aligned} G(t, \tau) &=\dfrac{y_{1}(\tau) y_{2}(t)-y_{1}(t) y_{2}(\tau)}{p(\tau) W(\tau)} \\[4pt] &=\sin t \cos \tau-\sin \tau \cos t \\[4pt] &=\sin (t-\tau) \end{aligned} \nonumber

Note that the Green’s function depends on t-\tau. While this is useful in some contexts, we will use the expanded form.

We can now determine the particular solution of the nonhomogeneous differential equation. We have

\begin{aligned} x_{p}(t) &=\int_{0}^{t} G(t, \tau) f(\tau) d \tau \\[4pt] &=\int_{0}^{t}(\sin t \cos \tau-\sin \tau \cos t)(2 \cos \tau) d \tau \\[4pt] &=2 \sin t \int_{0}^{t} \cos ^{2} \tau d \tau-2 \cos t \int_{0}^{t} \sin \tau \cos \tau d \tau \\[4pt] &=2 \sin t\left[\dfrac{\tau}{2}+\dfrac{1}{4} \sin 2 \tau\right]_{0}^{t}-2 \cos t\left[\dfrac{1}{2} \sin ^{2} \tau\right]_{0}^{t} \\[4pt] &=t \sin t \end{aligned} \nonumber

Therefore, the solution of the initial value problem is x(t)=4 \cos t+t \sin t. This is the same solution we found earlier in Chapter 2.
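
As an aside, the Green’s function integral above can be evaluated symbolically and the full solution checked against the initial value problem. The following sympy lines are an added verification, not part of the original text.

import sympy as sp

t, tau = sp.symbols('t tau')
xp = sp.integrate(sp.sin(t - tau)*2*sp.cos(tau), (tau, 0, t))   # int_0^t G(t, tau) f(tau) dtau
print(sp.simplify(xp - t*sp.sin(t)))                            # 0, so x_p(t) = t sin t

sol = 4*sp.cos(t) + xp
print(sp.simplify(sp.diff(sol, t, 2) + sol - 2*sp.cos(t)))      # 0: the equation is satisfied
print(sp.simplify(sol.subs(t, 0)), sp.simplify(sp.diff(sol, t).subs(t, 0)))  # 4 and 0: the initial conditions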

As noted in the last section, we usually are not given the differential equation in self-adjoint form. Generally, it takes the form

a_{2}(x) y^{\prime \prime}(x)+a_{1}(x) y^{\prime}(x)+a_{0}(x) y(x)=g(x) . \nonumber

The driving term becomes

f(x)=\dfrac{1}{a_{2}(x)} p(x) g(x) . \nonumber

Inserting this into the Green’s function form of the particular solution, we obtain the following:

Solution Using the Green’s Function

The solution of the initial value problem,

a_{2}(x) y^{\prime \prime}(x)+a_{1}(x) y^{\prime}(x)+a_{0}(x) y(x)=g(x) \nonumber

takes the form

y(x)=c_{1} y_{1}(x)+c_{2} y_{2}(x)+\int_{0}^{x} G(x, \xi) g(\xi) d \xi, \nonumber

where the Green’s function is given by

G(x, \xi)=\dfrac{y_{1}(\xi) y_{2}(x)-y_{1}(x) y_{2}(\xi)}{a_{2}(\xi) W(\xi)} \nonumber

and y_{1}(x) and y_{2}(x) are solutions of the homogeneous equation satisfying

y_{1}(0)=0, y_{2}(0) \neq 0, y_{1}^{\prime}(0) \neq 0, y_{2}^{\prime}(0)=0 . \nonumber
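
To make this recipe concrete, here is a short numerical sketch (an illustration added here, not from the text). It applies the boxed formula to the equation of Example 8.1, y''-y'-6y=20e^{-2x}, with homogeneous initial conditions y(0)=y'(0)=0, and compares the Green’s function quadrature with a standard ODE solver. The pair y_{1}, y_{2} below is just one possible choice satisfying the stated conditions at x=0.

import numpy as np
from scipy.integrate import quad, solve_ivp

# homogeneous solutions with y1(0) = 0, y1'(0) != 0, y2(0) != 0, y2'(0) = 0
y1 = lambda x: np.exp(3*x) - np.exp(-2*x)
y2 = lambda x: 2*np.exp(3*x) + 3*np.exp(-2*x)
dy1 = lambda x: 3*np.exp(3*x) + 2*np.exp(-2*x)
dy2 = lambda x: 6*np.exp(3*x) - 6*np.exp(-2*x)
a2 = lambda x: 1.0
g = lambda x: 20*np.exp(-2*x)
W = lambda x: y1(x)*dy2(x) - dy1(x)*y2(x)

def G(x, xi):            # initial value Green's function from the box above
    return (y1(xi)*y2(x) - y1(x)*y2(xi))/(a2(xi)*W(xi))

def yp(x):               # y_p(x) = int_0^x G(x, xi) g(xi) dxi by quadrature
    return quad(lambda xi: G(x, xi)*g(xi), 0, x)[0]

# direct numerical solution of the same initial value problem
sol = solve_ivp(lambda x, Y: [Y[1], Y[1] + 6*Y[0] + g(x)], (0, 1), [0, 0],
                dense_output=True, rtol=1e-10, atol=1e-12)
for xv in (0.25, 0.5, 1.0):
    print(xv, yp(xv), sol.sol(xv)[0])   # the last two columns should agree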

Boundary Value Green’s Function

We now turn to boundary value problems. We will focus on the problem

\begin{array}{r} \dfrac{d}{d x}\left(p(x) \dfrac{d y(x)}{d x}\right)+q(x) y(x)=f(x), \quad a<x<b \\[4pt] y(a)=0, \quad y(b)=0 \end{array} \nonumber

However, the general theory works for other forms of homogeneous boundary conditions.

Once again, we seek values of x_{0} and x_{1} in the solution written in the form

y(x)=y_{2}(x) \int_{x_{1}}^{x} \dfrac{f(\xi) y_{1}(\xi)}{p(\xi) W(\xi)} d \xi-y_{1}(x) \int_{x_{0}}^{x} \dfrac{f(\xi) y_{2}(\xi)}{p(\xi) W(\xi)} d \xi \nonumber

so that the solution to the boundary value problem can be written as a single integral involving a Green’s function. Here we absorb y_{h}(x) into the integrals with an appropriate choice of lower limits on the integrals.

We first pick solutions of the homogeneous differential equation such that y_{1}(a)=0, y_{2}(b)=0 and y_{1}(b) \neq 0, y_{2}(a) \neq 0. So, we have

\begin{aligned} y(a) &=y_{2}(a) \int_{x_{1}}^{a} \dfrac{f(\xi) y_{1}(\xi)}{p(\xi) W(\xi)} d \xi-y_{1}(a) \int_{x_{0}}^{a} \dfrac{f(\xi) y_{2}(\xi)}{p(\xi) W(\xi)} d \xi \\[4pt] &=y_{2}(a) \int_{x_{1}}^{a} \dfrac{f(\xi) y_{1}(\xi)}{p(\xi) W(\xi)} d \xi . \end{aligned} \nonumber

This expression is zero if x_{1}=a.

At x=b we find that

\begin{aligned} y(b) &=y_{2}(b) \int_{x_{1}}^{b} \dfrac{f(\xi) y_{1}(\xi)}{p(\xi) W(\xi)} d \xi-y_{1}(b) \int_{x_{0}}^{b} \dfrac{f(\xi) y_{2}(\xi)}{p(\xi) W(\xi)} d \xi \\[4pt] &=-y_{1}(b) \int_{x_{0}}^{b} \dfrac{f(\xi) y_{2}(\xi)}{p(\xi) W(\xi)} d \xi \end{aligned} \nonumber

This vanishes for x_{0}=b.

So, we have found that

y(x)=y_{2}(x) \int_{a}^{x} \dfrac{f(\xi) y_{1}(\xi)}{p(\xi) W(\xi)} d \xi-y_{1}(x) \int_{b}^{x} \dfrac{f(\xi) y_{2}(\xi)}{p(\xi) W(\xi)} d \xi . \nonumber

We are seeking a Green’s function so that the solution can be written as one integral. We can move the functions of x under the integral. Also, since a<x<b, we can flip the limits in the second integral. This gives

y(x)=\int_{a}^{x} \dfrac{f(\xi) y_{1}(\xi) y_{2}(x)}{p(\xi) W(\xi)} d \xi+\int_{x}^{b} \dfrac{f(\xi) y_{1}(x) y_{2}(\xi)}{p(\xi) W(\xi)} d \xi \nonumber

This result can be written in a compact form:

Boundary Value Green’s Function

The solution of the boundary value problem takes the form

y(x)=\int_{a}^{b} G(x, \xi) f(\xi) d \xi, \nonumber

where the Green’s function is the piecewise defined function

G(x, \xi)= \begin{cases}\dfrac{y_{1}(\xi) y_{2}(x)}{p W}, & a \leq \xi \leq x, \\[4pt] \dfrac{y_{1}(x) y_{2}(\xi)}{p W}, & x \leq \xi \leq b .\end{cases} \nonumber

The Green’s function satisfies several properties, which we will explore further in the next section. For example, the Green’s function satisfies the boundary conditions at x=a and x=b. Thus,

\begin{aligned} G(a, \xi) &=\dfrac{y_{1}(a) y_{2}(\xi)}{p W}=0 \\[4pt] G(b, \xi) &=\dfrac{y_{1}(\xi) y_{2}(b)}{p W}=0 \end{aligned} \nonumber

Also, the Green’s function is symmetric in its arguments. Interchanging the arguments gives

G(\xi, x)=\left\{\begin{array}{ll} \dfrac{y_{1}(x) y_{2}(\xi)}{p W}, & a \leq x \leq \xi, \\[4pt] \dfrac{y_{1}(\xi) y_{2}(x)}{p W}, & \xi \leq x \leq b . \end{array}\right. \nonumber

But a careful look at the original form shows that

G(x, \xi)=G(\xi, x) \nonumber

We will make use of these properties in the next section to quickly determine the Green’s functions for other boundary value problems.

Example 8.4. Solve the boundary value problem y^{\prime \prime}=x^{2}, \quad y(0)=0=y(1) using the boundary value Green’s function.

We first solve the homogeneous equation, y^{\prime \prime}=0. After two integrations, we have y(x)=A x+B, for A and B constants to be determined.

We need one solution satisfying y_{1}(0)=0. Thus, 0=y_{1}(0)=B. So, we can pick y_{1}(x)=x, since A is arbitrary.

The other solution has to satisfy y_{2}(1)=0. So, 0=y_{2}(1)=A+B. This can be solved for B=-A. Again, A is arbitrary and we will choose A=-1. Thus, y_{2}(x)=1-x.

For this problem p(x)=1. Thus, for y_{1}(x)=x and y_{2}(x)=1-x,

p(x) W(x)=y_{1}(x) y_{2}^{\prime}(x)-y_{1}^{\prime}(x) y_{2}(x)=x(-1)-1(1-x)=-1 . \nonumber

Note that p(x) W(x) is a constant, as it should be. Now we construct the Green’s function. We have

G(x, \xi)=\left\{\begin{array}{l} -\xi(1-x), 0 \leq \xi \leq x \\[4pt] -x(1-\xi), x \leq \xi \leq 1 \end{array}\right. \nonumber

Notice the symmetry between the two branches of the Green’s function. Also, the Green’s function satisfies homogeneous boundary conditions: G(0, \xi)=0, from the lower branch, and G(1, \xi)=0, from the upper branch.

Finally, we insert the Green’s function into the integral form of the solution:

\begin{aligned} y(x) &=\int_{0}^{1} G(x, \xi) f(\xi) d \xi \\[4pt] &=\int_{0}^{1} G(x, \xi) \xi^{2} d \xi \\[4pt] &=-\int_{0}^{x} \xi(1-x) \xi^{2} d \xi-\int_{x}^{1} x(1-\xi) \xi^{2} d \xi \\[4pt] &=-(1-x) \int_{0}^{x} \xi^{3} d \xi-x \int_{x}^{1}\left(\xi^{2}-\xi^{3}\right) d \xi \\[4pt] &=-(1-x)\left[\dfrac{\xi^{4}}{4}\right]_{0}^{x}-x\left[\dfrac{\xi^{3}}{3}-\dfrac{\xi^{4}}{4}\right]_{x}^{1} \\[4pt] &=-\dfrac{1}{4}(1-x) x^{4}-\dfrac{1}{12} x(4-3)+\dfrac{1}{12} x\left(4 x^{3}-3 x^{4}\right) \\[4pt] &=\dfrac{1}{12}\left(x^{4}-x\right) \end{aligned} \nonumber
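
A quick sympy check (added here for verification) confirms both the value of this integral and the boundary value problem.

import sympy as sp

x, xi = sp.symbols('x xi')
# integrate f(xi) = xi^2 against the two branches of G(x, xi) found above
y = (sp.integrate(-xi*(1 - x)*xi**2, (xi, 0, x))
     + sp.integrate(-x*(1 - xi)*xi**2, (xi, x, 1)))
print(sp.simplify(y - (x**4 - x)/12))        # 0: matches the result above
print(sp.simplify(sp.diff(y, x, 2) - x**2))  # 0: y'' = x^2
print(y.subs(x, 0), y.subs(x, 1))            # 0 0: the boundary conditions hold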

Properties of Green’s Functions

We have noted some properties of Green’s functions in the last section. In this section we will elaborate on some of these properties as a tool for quickly constructing Green’s functions for boundary value problems. Here is a list of the properties based upon our previous solution.

Properties of the Green’s Function

1. Differential Equation:

\dfrac{\partial}{\partial x}\left(p(x) \dfrac{\partial G(x, \xi)}{\partial x}\right)+q(x) G(x, \xi)=0, x \neq \xi

For x<\xi we are on the second branch and G(x, \xi) is proportional to y_{1}(x). Thus, since y_{1}(x) is a solution of the homogeneous equation, then so is G(x, \xi). For x>\xi we are on the first branch and G(x, \xi) is proportional to y_{2}(x). So, once again G(x, \xi) is a solution of the homogeneous problem.

2. Boundary Conditions:

For x=a we are on the second branch and G(x, \xi) is proportional to y_{1}(x). Thus, whatever condition y_{1}(x) satisfies, G(x, \xi) will satisfy. A similar statement can be made for x=b.

  3. Symmetry or Reciprocity: G(x, \xi)=G(\xi, x). We showed this in the last section.
  4. Continuity of \mathbf{G} at x=\xi: G\left(\xi^{+}, \xi\right)=G\left(\xi^{-}, \xi\right). Here we have defined

\begin{aligned} & G\left(\xi^{+}, \xi\right)=\lim _{x \downarrow \xi} G(x, \xi), \quad x>\xi, \\[4pt] & G\left(\xi^{-}, \xi\right)=\lim _{x \uparrow \xi} G(x, \xi), \quad x<\xi . \end{aligned} \nonumber

Setting x=\xi in both branches, we have

\dfrac{y_{1}(\xi) y_{2}(\xi)}{p W}=\dfrac{y_{1}(\xi) y_{2}(\xi)}{p W} \nonumber

  5. Jump Discontinuity of \dfrac{\partial G}{\partial x} at x=\xi :

\dfrac{\partial G\left(\xi^{+}, \xi\right)}{\partial x}-\dfrac{\partial G\left(\xi^{-}, \xi\right)}{\partial x}=\dfrac{1}{p(\xi)} \nonumber

This case is not as obvious. We first compute the derivatives by noting which branch is involved and then evaluate the derivatives and subtract them. Thus, we have

\begin{aligned} \dfrac{\partial G\left(\xi^{+}, \xi\right)}{\partial x}-\dfrac{\partial G\left(\xi^{-}, \xi\right)}{\partial x} &=\dfrac{1}{p W} y_{1}(\xi) y_{2}^{\prime}(\xi)-\dfrac{1}{p W} y_{1}^{\prime}(\xi) y_{2}(\xi) \\[4pt] &=\dfrac{y_{1}(\xi) y_{2}^{\prime}(\xi)-y_{1}^{\prime}(\xi) y_{2}(\xi)}{p(\xi)\left(y_{1}(\xi) y_{2}^{\prime}(\xi)-y_{1}^{\prime}(\xi) y_{2}(\xi)\right)} \\[4pt] &=\dfrac{1}{p(\xi)} \end{aligned} \nonumber
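
For instance, the Green’s function found in Example 8.4, where p(x)=1, exhibits exactly this unit jump. The following sympy lines (an added illustration) verify the continuity and jump conditions.

import sympy as sp

x, xi = sp.symbols('x xi')
G_plus = -xi*(1 - x)    # branch valid for x >= xi
G_minus = -x*(1 - xi)   # branch valid for x <= xi
print(sp.simplify(G_plus.subs(x, xi) - G_minus.subs(x, xi)))                # 0: continuity at x = xi
print(sp.simplify((sp.diff(G_plus, x) - sp.diff(G_minus, x)).subs(x, xi)))  # 1: the jump equals 1/p(xi)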

We now show how a knowledge of these properties allows one to quickly construct a Green’s function.


Example 8.5. Construct the Green’s function for the problem

\begin{gathered} y^{\prime \prime}+\omega^{2} y=f(x), \quad 0<x<1, \\[4pt] y(0)=0=y(1), \end{gathered} \nonumber

with \omega \neq 0.

I. Find solutions to the homogeneous equation.

A general solution to the homogeneous equation is given as

y_{h}(x)=c_{1} \sin \omega x+c_{2} \cos \omega x . \nonumber

Thus, for x \neq \xi

G(x, \xi)=c_{1}(\xi) \sin \omega x+c_{2}(\xi) \cos \omega x \nonumber

II. Boundary Conditions.

First, we have G(0, \xi)=0 for 0 \leq x \leq \xi. So,

0=G(0, \xi)=c_{1}(\xi) \sin 0+c_{2}(\xi) \cos 0=c_{2}(\xi) . \nonumber

So,

G(x, \xi)=c_{1}(\xi) \sin \omega x, \quad 0 \leq x \leq \xi \nonumber

Second, we have G(1, \xi)=0 for \xi \leq x \leq 1. So,

G(1, \xi)=c_{1}(\xi) \sin \omega+c_{2}(\xi) \cos \omega=0 \nonumber

A solution can be chosen with

c_{2}(\xi)=-c_{1}(\xi) \tan \omega . \nonumber

This gives

G(x, \xi)=c_{1}(\xi) \sin \omega x-c_{1}(\xi) \tan \omega \cos \omega x . \nonumber

This can be simplified by factoring out the c_{1}(\xi) and placing the remaining terms over a common denominator. The result is

\begin{aligned} G(x, \xi) &=\dfrac{c_{1}(\xi)}{\cos \omega}[\sin \omega x \cos \omega-\sin \omega \cos \omega x] \\[4pt] &=-\dfrac{c_{1}(\xi)}{\cos \omega} \sin \omega(1-x) \end{aligned} \nonumber

Since the coefficient is arbitrary at this point, we can write the result as

G(x, \xi)=d_{1}(\xi) \sin \omega(1-x), \quad \xi \leq x \leq 1 \nonumber

We note that we could have started with y_{2}(x)=\sin \omega(1-x) as one of our linearly independent solutions of the homogeneous problem in anticipation that y_{2}(x) satisfies the second boundary condition.

III. Symmetry or Reciprocity

We now impose that G(x, \xi)=G(\xi, x). To this point we have that

G(x, \xi)=\left\{\begin{array}{cl} c_{1}(\xi) \sin \omega x, & 0 \leq x \leq \xi \\[4pt] d_{1}(\xi) \sin \omega(1-x), & \xi \leq x \leq 1 \end{array}\right. \nonumber

We can make the branches symmetric by picking the right forms for c_{1}(\xi) and d_{1}(\xi). We choose c_{1}(\xi)=C \sin \omega(1-\xi) and d_{1}(\xi)=C \sin \omega \xi. Then,

G(x, \xi)=\left\{\begin{array}{l} C \sin \omega(1-\xi) \sin \omega x, 0 \leq x \leq \xi \\[4pt] C \sin \omega(1-x) \sin \omega \xi, \xi \leq x \leq 1 \end{array} .\right. \nonumber

Now the Green’s function is symmetric and we still have to determine the constant C. We note that we could have gotten to this point using the Method of Variation of Parameters result where C=\dfrac{1}{p W}.

IV. Continuity of G(x, \xi)

We note that we already have continuity by virtue of the symmetry imposed in the last step.

V. Jump Discontinuity in \dfrac{\partial}{\partial x} G(x, \xi).

We still need to determine C. We can do this using the jump discontinuity of the derivative:

\dfrac{\partial G\left(\xi^{+}, \xi\right)}{\partial x}-\dfrac{\partial G\left(\xi^{-}, \xi\right)}{\partial x}=\dfrac{1}{p(\xi)} \nonumber

For our problem p(x)=1. So, inserting our Green’s function, we have

\begin{aligned} 1 &=\dfrac{\partial G\left(\xi^{+}, \xi\right)}{\partial x}-\dfrac{\partial G\left(\xi^{-}, \xi\right)}{\partial x} \\[4pt] &=\dfrac{\partial}{\partial x}[C \sin \omega(1-x) \sin \omega \xi]_{x=\xi}-\dfrac{\partial}{\partial x}[C \sin \omega(1-\xi) \sin \omega x]_{x=\xi} \\[4pt] &=-\omega C \cos \omega(1-\xi) \sin \omega \xi-\omega C \sin \omega(1-\xi) \cos \omega \xi \\[4pt] &=-\omega C \sin \omega(\xi+1-\xi) \\[4pt] &=-\omega C \sin \omega . \end{aligned} \nonumber

Therefore,

C=-\dfrac{1}{\omega \sin \omega} . \nonumber

Finally, we have our Green’s function:

G(x, \xi)=\left\{\begin{array}{l} -\dfrac{\sin \omega(1-\xi) \sin \omega x}{\omega \sin \omega}, 0 \leq x \leq \xi \\[4pt] -\dfrac{\sin \omega(1-x) \sin \omega \xi}{\omega \sin \omega}, \xi \leq x \leq 1 \end{array} .\right. \nonumber

It is instructive to compare this result to the Variation of Parameters result. We have the functions y_{1}(x)=\sin \omega x and y_{2}(x)=\sin \omega(1-x) as the solutions of the homogeneous equation satisfying y_{1}(0)=0 and y_{2}(1)=0. We need to compute p W :

\begin{aligned} p(x) W(x) &=y_{1}(x) y_{2}^{\prime}(x)-y_{1}^{\prime}(x) y_{2}(x) \\[4pt] &=-\omega \sin \omega x \cos \omega(1-x)-\omega \cos \omega x \sin \omega(1-x) \\[4pt] &=-\omega \sin \omega \end{aligned} \nonumber

Inserting this result into the Variation of Parameters result for the Green’s function leads to the same Green’s function as above.
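
One can also have sympy confirm the defining properties of this Green’s function for symbolic \omega (a verification sketch added here; it assumes \sin \omega \neq 0, as in the construction above).

import sympy as sp

x, xi, w = sp.symbols('x xi omega', positive=True)
G_minus = -sp.sin(w*(1 - xi))*sp.sin(w*x)/(w*sp.sin(w))   # branch for x <= xi
G_plus = -sp.sin(w*(1 - x))*sp.sin(w*xi)/(w*sp.sin(w))    # branch for xi <= x

print(sp.simplify(sp.diff(G_minus, x, 2) + w**2*G_minus))  # 0: each branch solves y'' + w^2 y = 0
print(G_minus.subs(x, 0), G_plus.subs(x, 1))               # 0 0: the boundary conditions
print(sp.simplify((G_plus - G_minus).subs(x, xi)))         # 0: continuity at x = xi
jump = (sp.diff(G_plus, x) - sp.diff(G_minus, x)).subs(x, xi)
print(float(jump.subs({w: 1.3, xi: 0.4})))                 # approximately 1.0, the jump 1/p with p = 1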

The Dirac Delta Function

We will develop a more general theory of Green’s functions for ordinary differential equations which encompasses some of the listed properties. The Green’s function satisfies a homogeneous differential equation for x \neq \xi,

\dfrac{\partial}{\partial x}\left(p(x) \dfrac{\partial G(x, \xi)}{\partial x}\right)+q(x) G(x, \xi)=0, \quad x \neq \xi \nonumber

When x=\xi, we saw that the derivative has a jump in its value. This is similar to the step, or Heaviside, function,

H(x)=\left\{\begin{array}{l} 1, x>0 \\[4pt] 0, x<0 \end{array}\right. \nonumber

In the case of the step function, the derivative is zero everywhere except at the jump. At the jump, there is an infinite slope, though technically, we have learned that there is no derivative at this point. We will try to remedy this by introducing the Dirac delta function,

\delta(x)=\dfrac{d}{d x} H(x) . \nonumber

We will then show that the Green’s function satisfies the differential equation

\dfrac{\partial}{\partial x}\left(p(x) \dfrac{\partial G(x, \xi)}{\partial x}\right)+q(x) G(x, \xi)=\delta(x-\xi) \nonumber

The Dirac delta function, \delta(x), is one example of what is known as a generalized function, or a distribution. Dirac had introduced this function in the 1930’s in his study of quantum mechanics as a useful tool. It was later studied in a general theory of distributions and found to be more than a simple tool used by physicists. The Dirac delta function, as any distribution, only makes sense under an integral.

Before defining the Dirac delta function and introducing some of its properties, we will look at some representations that lead to the definition. We will consider the limits of two sequences of functions.

First we define the sequence of functions

f_{n}(x)=\left\{\begin{array}{l} 0,|x|>\dfrac{1}{n} \\[4pt] \dfrac{n}{2},|x|<\dfrac{1}{n} \end{array} .\right. \nonumber

This is a sequence of functions as shown in Figure 8.1. As n \rightarrow \infty, we find the limit is zero for x \neq 0 and is infinite for x=0. However, the area under each member of the sequence is one, since each box has height \dfrac{n}{2} and width \dfrac{2}{n}. Thus, the limiting function is zero at most points but has area one. (At this point the reader who is new to this should be doing some head scratching!)

Figure 8.1. A plot of the functions f_{n}(x) for n=2,4,8.

The limit is not really a function. It is a generalized function. It is called the Dirac delta function, which is defined by

  1. \delta(x)=0 for x \neq 0
  2. \int_{-\infty}^{\infty} \delta(x) d x=1

Another example is the sequence defined by

D_{n}(x)=\dfrac{2 \sin n x}{x} \nonumber

We can graph this function. We first rewrite this function as

D_{n}(x)=2 n \dfrac{\sin n x}{n x} . \nonumber

Now it is easy to see that as x \rightarrow 0, D_{n}(x) \rightarrow 2 n. For large x, the function tends to zero. A plot of this function is in Figure 8.2. For large n the peak grows and the values of D_{n}(x) for x \neq 0 tend to zero, as shown in Figure 8.3.

We note that in the limit n \rightarrow \infty, D_{n}(x)=0 for x \neq 0 and it is infinite at x=0. However, using complex analysis one can show that the area is

\int_{-\infty}^{\infty} D_{n}(x) d x=2 \pi \nonumber

Thus, the area is constant for each n.

Figure 8.2. A plot of the function D_{n}(x) for n=4.
Figure 8.3. A plot of the function D_{n}(x) for n=40.

There are two main properties that define a Dirac delta function. First one has that the area under the delta function is one,

\int_{-\infty}^{\infty} \delta(x) d x=1 \nonumber

Integration over more general intervals gives

\int_{a}^{b} \delta(x) d x=1, \quad 0 \in[a, b] \nonumber

and

\int_{a}^{b} \delta(x) d x=0, \quad 0 \notin[a, b] . \nonumber

Another common property is what is sometimes called the sifting property. Namely, integrating the product of a function and the delta function "sifts" out a specific value of the function. It is given by

\int_{-\infty}^{\infty} \delta(x-a) f(x) d x=f(a) \nonumber

This can be seen by noting that the delta function is zero everywhere except at x=a. Therefore, the integrand is zero everywhere and the only contribution from f(x) will be from x=a. So, we can replace f(x) with f(a) under the integral. Since f(a) is a constant, we have that

\int_{-\infty}^{\infty} \delta(x-a) f(x) d x=\int_{-\infty}^{\infty} \delta(x-a) f(a) d x=f(a) \int_{-\infty}^{\infty} \delta(x-a) d x=f(a) \nonumber

Another property results from using a scaled argument, ax. In this case we show that

\delta(a x)=|a|^{-1} \delta(x) \nonumber

As usual, this only has meaning under an integral sign. So, we place \delta(a x) inside an integral and make a substitution y=a x :

\begin{aligned} \int_{-\infty}^{\infty} \delta(a x) d x &=\lim _{L \rightarrow \infty} \int_{-L}^{L} \delta(a x) d x \\[4pt] &=\lim _{L \rightarrow \infty} \dfrac{1}{a} \int_{-a L}^{a L} \delta(y) d y \end{aligned} \nonumber

If a>0 then

\int_{-\infty}^{\infty} \delta(a x) d x=\dfrac{1}{a} \int_{-\infty}^{\infty} \delta(y) d y \nonumber

However, if a<0 then

\int_{-\infty}^{\infty} \delta(a x) d x=\dfrac{1}{a} \int_{\infty}^{-\infty} \delta(y) d y=-\dfrac{1}{a} \int_{-\infty}^{\infty} \delta(y) d y \nonumber

The overall difference in a multiplicative minus sign can be absorbed into one expression by changing the factor 1 / a to 1 /|a|. Thus,

\int_{-\infty}^{\infty} \delta(a x) d x=\dfrac{1}{|a|} \int_{-\infty}^{\infty} \delta(y) d y . \nonumber

Example 8.6. Evaluate \int_{-\infty}^{\infty}(5 x+1) \delta(4(x-2)) d x. This is a straightforward integration:

\int_{-\infty}^{\infty}(5 x+1) \delta(4(x-2)) d x=\dfrac{1}{4} \int_{-\infty}^{\infty}(5 x+1) \delta(x-2) d x=\dfrac{11}{4} \nonumber

A more general scaling of the argument takes the form \delta(f(x)). The integral of \delta(f(x)) can be evaluated depending upon the number of zeros of f(x). If there is only one zero, f\left(x_{1}\right)=0, then one has that

\int_{-\infty}^{\infty} \delta(f(x)) d x=\int_{-\infty}^{\infty} \dfrac{1}{\left|f^{\prime}\left(x_{1}\right)\right|} \delta\left(x-x_{1}\right) d x \nonumber

This can be proven using the substitution y=f(x) and is left as an exercise for the reader. This result is often written as

\delta(f(x))=\dfrac{1}{\left|f^{\prime}\left(x_{1}\right)\right|} \delta\left(x-x_{1}\right) . \nonumber

Example 8.7. Evaluate \int_{-\infty}^{\infty} \delta(3 x-2) x^{2} d x.

This is not a simple \delta(x-a). So, we need to find the zeros of f(x)=3 x-2. There is only one, x=\dfrac{2}{3}. Also, \left|f^{\prime}(x)\right|=3. Therefore, we have

\int_{-\infty}^{\infty} \delta(3 x-2) x^{2} d x=\int_{-\infty}^{\infty} \dfrac{1}{3} \delta\left(x-\dfrac{2}{3}\right) x^{2} d x=\dfrac{1}{3}\left(\dfrac{2}{3}\right)^{2}=\dfrac{4}{27} . \nonumber

Note that this integral can be evaluated the long way by using the substitution y=3 x-2. Then, d y=3 d x and x=(y+2) / 3. This gives

\int_{-\infty}^{\infty} \delta(3 x-2) x^{2} d x=\dfrac{1}{3} \int_{-\infty}^{\infty} \delta(y)\left(\dfrac{y+2}{3}\right)^{2} d y=\dfrac{1}{3}\left(\dfrac{4}{9}\right)=\dfrac{4}{27} \nonumber
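
Both of these examples can be reproduced with sympy’s DiracDelta, which handles delta functions with linear arguments directly (a check added here, not part of the text).

import sympy as sp

x = sp.symbols('x')
I1 = sp.integrate((5*x + 1)*sp.DiracDelta(4*(x - 2)), (x, -sp.oo, sp.oo))
I2 = sp.integrate(x**2*sp.DiracDelta(3*x - 2), (x, -sp.oo, sp.oo))
print(I1, I2)   # 11/4 and 4/27, as computed above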

More generally, one can show that when f\left(x_{j}\right)=0 and f^{\prime}\left(x_{j}\right) \neq 0 for x_{j}, j=1,2, \ldots, n, (i.e., when one has n simple zeros), then

\delta(f(x))=\sum_{j=1}^{n} \dfrac{1}{\left|f^{\prime}\left(x_{j}\right)\right|} \delta\left(x-x_{j}\right) . \nonumber

Example 8.8. Evaluate \int_{0}^{2 \pi} \cos x \delta\left(x^{2}-\pi^{2}\right) d x

In this case the argument of the delta function has two simple roots. Namely, f(x)=x^{2}-\pi^{2}=0 when x=\pm \pi. Furthermore, f^{\prime}(x)=2 x. Therefore, \left|f^{\prime}(\pm \pi)\right|=2 \pi. This gives

\delta\left(x^{2}-\pi^{2}\right)=\dfrac{1}{2 \pi}[\delta(x-\pi)+\delta(x+\pi)] . \nonumber

Inserting this expression into the integral and noting that x=-\pi is not in the integration interval, we have

\begin{aligned} \int_{0}^{2 \pi} \cos x \delta\left(x^{2}-\pi^{2}\right) d x &=\dfrac{1}{2 \pi} \int_{0}^{2 \pi} \cos x[\delta(x-\pi)+\delta(x+\pi)] d x \\[4pt] &=\dfrac{1}{2 \pi} \cos \pi=-\dfrac{1}{2 \pi} \end{aligned} \nonumber
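
The nascent-delta idea from earlier in this section also provides a purely numerical check of this result (an added scipy sketch; the Gaussian width eps below is an arbitrary small parameter, and the root x=\pi is passed to the integrator as a hint).

import numpy as np
from scipy.integrate import quad

def delta_eps(u, eps=0.1):
    # a narrow Gaussian used as an approximation to the delta function
    return np.exp(-(u/eps)**2/2)/(eps*np.sqrt(2*np.pi))

val, _ = quad(lambda x: np.cos(x)*delta_eps(x**2 - np.pi**2), 0, 2*np.pi, points=[np.pi])
print(val, -1/(2*np.pi))   # the two numbers should be close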

Finally, we previously noted there is a relationship between the Heaviside, or step, function and the Dirac delta function. We defined the Heaviside function as

H(x)=\left\{\begin{array}{l} 0, x<0 \\[4pt] 1, x>0 \end{array}\right. \nonumber

Then, it is easy to see that H^{\prime}(x)=\delta(x).

Green’s Function Differential Equation

As noted, the Green’s function satisfies the differential equation

\dfrac{\partial}{\partial x}\left(p(x) \dfrac{\partial G(x, \xi)}{\partial x}\right)+q(x) G(x, \xi)=\delta(x-\xi) \nonumber

and satisfies homogeneous conditions. We have used the Green’s function to solve the nonhomogeneous equation

\dfrac{d}{d x}\left(p(x) \dfrac{d y(x)}{d x}\right)+q(x) y(x)=f(x) \nonumber

These equations can be written in the more compact forms

\begin{gathered} \mathcal{L}[y]=f(x) \\[4pt] \mathcal{L}[G]=\delta(x-\xi) \end{gathered} \nonumber

Multiplying the first equation by G(x, \xi), the second equation by y(x), and then subtracting, we have

G \mathcal{L}[y]-y \mathcal{L}[G]=f(x) G(x, \xi)-\delta(x-\xi) y(x) . \nonumber

Now, integrate both sides from x=a to x=b. The right-hand side becomes

\int_{a}^{b}[f(x) G(x, \xi)-\delta(x-\xi) y(x)] d x=\int_{a}^{b} f(x) G(x, \xi) d x-y(\xi) \nonumber

and, using Green’s Identity, the left-hand side becomes

\int_{a}^{b}(G \mathcal{L}[y]-y \mathcal{L}[G]) d x=\left[p(x)\left(G(x, \xi) y^{\prime}(x)-y(x) \dfrac{\partial G}{\partial x}(x, \xi)\right)\right]_{x=a}^{x=b} \nonumber

Combining these results and rearranging, we obtain

y(\xi)=\int_{a}^{b} f(x) G(x, \xi) d x+\left[p(x)\left(y(x) \dfrac{\partial G}{\partial x}(x, \xi)-G(x, \xi) y^{\prime}(x)\right)\right]_{x=a}^{x=b} \nonumber

Next, one uses the boundary conditions in the problem in order to determine which conditions the Green’s function needs to satisfy. For example, if we have the boundary conditions y(a)=0 and y(b)=0, then the boundary terms yield

\begin{aligned} y(\xi)=& \int_{a}^{b} f(x) G(x, \xi) d x+\left[p(b)\left(y(b) \dfrac{\partial G}{\partial x}(b, \xi)-G(b, \xi) y^{\prime}(b)\right)\right] \\[4pt] &-\left[p(a)\left(y(a) \dfrac{\partial G}{\partial x}(a, \xi)-G(a, \xi) y^{\prime}(a)\right)\right] \\[4pt] =& \int_{a}^{b} f(x) G(x, \xi) d x-p(b) G(b, \xi) y^{\prime}(b)+p(a) G(a, \xi) y^{\prime}(a) . \end{aligned} \nonumber

These boundary terms will vanish only if G(x, \xi) also satisfies the homogeneous boundary conditions. This then leaves us with the solution

y(\xi)=\int_{a}^{b} f(x) G(x, \xi) d x . \nonumber

We should rewrite this as a function of x. So, we replace \xi with x and x with \xi. This gives

y(x)=\int_{a}^{b} f(\xi) G(\xi, x) d \xi . \nonumber

However, this is not yet in the desirable form. The arguments of the Green’s function are reversed. But, G(x, \xi) is symmetric in its arguments. So, we can simply switch the arguments getting the desired result.

We can now see that the theory works for other boundary conditions. If we had y^{\prime}(a)=0, then the y(a) \dfrac{\partial G}{\partial x}(a, \xi) term in the boundary terms could be made to vanish if we set \dfrac{\partial G}{\partial x}(a, \xi)=0. So, this confirms that other boundary value problems can be posed besides the one elaborated upon in the chapter so far.

We can even adapt this theory to nonhomogeneous boundary conditions. We first rewrite Equation (8.62) as

y(x)=\int_{a}^{b} G(x, \xi) f(\xi) d \xi+\left[p(\xi)\left(y(\xi) \dfrac{\partial G}{\partial \xi}(x, \xi)-G(x, \xi) y^{\prime}(\xi)\right)\right]_{\xi=a}^{\xi=b} \nonumber

Let’s consider the boundary conditions y(a)=\alpha and y^{\prime}(b)=\beta. We also assume that G(x, \xi) satisfies homogeneous boundary conditions,

G(a, \xi)=0, \quad \dfrac{\partial G}{\partial \xi}(b, \xi)=0, \nonumber

in both x and \xi since the Green’s function is symmetric in its variables. Then, we need only focus on the boundary terms to examine the effect on the solution. We have

\begin{aligned} {\left[p(\xi)\left(y(\xi) \dfrac{\partial G}{\partial \xi}(x, \xi)-G(x, \xi) y^{\prime}(\xi)\right)\right]_{\xi=a}^{\xi=b} } &=\left[p(b)\left(y(b) \dfrac{\partial G}{\partial \xi}(x, b)-G(x, b) y^{\prime}(b)\right)\right] \\[4pt] &-\left[p(a)\left(y(a) \dfrac{\partial G}{\partial \xi}(x, a)-G(x, a) y^{\prime}(a)\right)\right] \\[4pt] &=-\beta p(b) G(x, b)-\alpha p(a) \dfrac{\partial G}{\partial \xi}(x, a) . \end{aligned} \nonumber

Therefore, we have the solution

y(x)=\int_{a}^{b} G(x, \xi) f(\xi) d \xi-\beta p(b) G(x, b)-\alpha p(a) \dfrac{\partial G}{\partial \xi}(x, a) . \nonumber

This solution satisfies the nonhomogeneous boundary conditions. Let’s see how it works.

Example 8.9. Modify Example 8.4 to solve the boundary value problem y^{\prime \prime}=x^{2}, \quad y(0)=1, y(1)=2 using the boundary value Green’s function that we found:

G(x, \xi)=\left\{\begin{array}{l} -\xi(1-x), 0 \leq \xi \leq x \\[4pt] -x(1-\xi), x \leq \xi \leq 1 \end{array}\right. \nonumber

We insert the Green’s function into the solution and use the given conditions to obtain

\begin{aligned} y(x) &=\int_{0}^{1} G(x, \xi) \xi^{2} d \xi+\left[y(\xi) \dfrac{\partial G}{\partial \xi}(x, \xi)-G(x, \xi) y^{\prime}(\xi)\right]_{\xi=0}^{\xi=1} \\[4pt] &=\int_{0}^{x}(x-1) \xi^{3} d \xi+\int_{x}^{1} x(\xi-1) \xi^{2} d \xi+y(1) \dfrac{\partial G}{\partial \xi}(x, 1)-y(0) \dfrac{\partial G}{\partial \xi}(x, 0) \\[4pt] &=\dfrac{(x-1) x^{4}}{4}+\dfrac{x\left(1-x^{4}\right)}{4}-\dfrac{x\left(1-x^{3}\right)}{3}+2 x-(x-1) \\[4pt] &=\dfrac{x^{4}}{12}+\dfrac{11}{12} x+1 \end{aligned} \nonumber

Of course, this problem can also be solved by direct integration. The general solution is

y(x)=\dfrac{x^{4}}{12}+c_{1} x+c_{2} . \nonumber

Inserting this solution into each boundary condition yields the same result.
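
Since the boundary terms are easy to get wrong, it is worth checking this result with sympy (a verification sketch, not part of the original text).

import sympy as sp

x = sp.symbols('x')
y = x**4/12 + sp.Rational(11, 12)*x + 1
print(sp.simplify(sp.diff(y, x, 2) - x**2))   # 0: y'' = x^2
print(y.subs(x, 0), y.subs(x, 1))             # 1 and 2: the required boundary values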

We have seen how the introduction of the Dirac delta function in the differential equation satisfied by the Green’s function, Equation (8.59), can lead to the solution of boundary value problems. The Dirac delta function also aids in our interpretation of the Green’s function. We note that the Green’s function is a solution of an equation in which the nonhomogeneous function is \delta(x-\xi). Note that if we multiply the delta function by f(\xi) and integrate we obtain

\int_{-\infty}^{\infty} \delta(x-\xi) f(\xi) d \xi=f(x) \nonumber

We can view the delta function as a unit impulse at x=\xi which can be used to build f(x) as a sum of impulses of different strengths, f(\xi). Thus, the Green’s function is the response to the impulse as governed by the differential equation and given boundary conditions.

In particular, the delta function forced equation can be used to derive the jump condition. We begin with the equation in the form

\dfrac{\partial}{\partial x}\left(p(x) \dfrac{\partial G(x, \xi)}{\partial x}\right)+q(x) G(x, \xi)=\delta(x-\xi) \nonumber

Now, integrate both sides from \xi-\epsilon to \xi+\epsilon and take the limit as \epsilon \rightarrow 0. Then,

\begin{aligned} \lim _{\epsilon \rightarrow 0} \int_{\xi-\epsilon}^{\xi+\epsilon}\left[\dfrac{\partial}{\partial x}\left(p(x) \dfrac{\partial G(x, \xi)}{\partial x}\right)+q(x) G(x, \xi)\right] d x &=\lim _{\epsilon \rightarrow 0} \int_{\xi-\epsilon}^{\xi+\epsilon} \delta(x-\xi) d x \\[4pt] &=1 \end{aligned} \nonumber

Since the q(x) term is continuous, the limit of that term vanishes. Using the Fundamental Theorem of Calculus, we then have

\lim _{\epsilon \rightarrow 0}\left[p(x) \dfrac{\partial G(x, \xi)}{\partial x}\right]_{\xi-\epsilon}^{\xi+\epsilon}=1 . \nonumber

This is the jump condition that we have been using!

Series Representations of Green’s Functions

There are times when it might not be so simple to find the Green’s function in the simple closed form that we have seen so far. However, there is a method for determining the Green’s functions of Sturm-Liouville boundary value problems in the form of an eigenfunction expansion. We will finish our discussion of Green’s functions for ordinary differential equations by showing how one obtains such series representations. (Note that we are really just repeating the steps toward developing the eigenfunction expansions which we had seen in Chapter 6.)

We will make use of the complete set of eigenfunctions of the differential operator, \mathcal{L}, satisfying the homogeneous boundary conditions:

\mathcal{L}\left[\phi_{n}\right]=-\lambda_{n} \sigma \phi_{n}, \quad n=1,2, \ldots \nonumber

We want to find the particular solution y satisfying \mathcal{L}[y]=f and homogeneous boundary conditions. We assume that

y(x)=\sum_{n=1}^{\infty} a_{n} \phi_{n}(x) . \nonumber

Inserting this into the differential equation, we obtain

\mathcal{L}[y]=\sum_{n=1}^{\infty} a_{n} \mathcal{L}\left[\phi_{n}\right]=-\sum_{n=1}^{\infty} \lambda_{n} a_{n} \sigma \phi_{n}=f \nonumber

This has resulted in the generalized Fourier expansion

f(x)=\sum_{n=1}^{\infty} c_{n} \sigma \phi_{n}(x) \nonumber

with coefficients

c_{n}=-\lambda_{n} a_{n} \nonumber

We have seen how to compute these coefficients earlier in the text. We multiply both sides by \phi_{k}(x) and integrate. Using the orthogonality of the eigenfunctions,

\int_{a}^{b} \phi_{n}(x) \phi_{k}(x) \sigma(x) d x=N_{k} \delta_{n k} \nonumber

one obtains the expansion coefficients (if \lambda_{k} \neq 0 )

a_{k}=-\dfrac{\left(f, \phi_{k}\right)}{N_{k} \lambda_{k}}, \nonumber

where \left(f, \phi_{k}\right) \equiv \int_{a}^{b} f(x) \phi_{k}(x) d x.

As before, we can rearrange the solution to obtain the Green’s function. Namely, we have

y(x)=\sum_{n=1}^{\infty} \dfrac{\left(f, \phi_{n}\right)}{-N_{n} \lambda_{n}} \phi_{n}(x)=\int_{a}^{b} \underbrace{\sum_{n=1}^{\infty} \dfrac{\phi_{n}(x) \phi_{n}(\xi)}{-N_{n} \lambda_{n}}}_{G(x, \xi)} f(\xi) d \xi \nonumber

Therefore, we have found the Green’s function as an expansion in the eigenfunctions:

G(x, \xi)=\sum_{n=1}^{\infty} \dfrac{\phi_{n}(x) \phi_{n}(\xi)}{-\lambda_{n} N_{n}} . \nonumber

Example 8.10. Eigenfunction Expansion Example

We will conclude this discussion with an example. Consider the boundary value problem

y^{\prime \prime}+4 y=x^{2}, \quad x \in(0,1), \quad y(0)=y(1)=0 . \nonumber

The Green’s function can be constructed fairly quickly once the eigenvalue problem is solved. We will solve this problem three different ways in order to summarize the methods we have used in the text.

The eigenvalue problem is

\phi^{\prime \prime}(x)+4 \phi(x)=-\lambda \phi(x) \nonumber

where \phi(0)=0 and \phi(1)=0. The general solution is obtained by rewriting the equation as

\phi^{\prime \prime}(x)+k^{2} \phi(x)=0 \nonumber

where

k^{2}=4+\lambda \nonumber

Solutions satisfying the boundary condition at x=0 are of the form

\phi(x)=A \sin k x . \nonumber

Forcing \phi(1)=0 gives

0=A \sin k \Rightarrow k=n \pi, \quad n=1,2,3, \ldots \nonumber

So, the eigenvalues are

\lambda_{n}=n^{2} \pi^{2}-4, \quad n=1,2, \ldots \nonumber

and the eigenfunctions are

\phi_{n}=\sin n \pi x, \quad n=1,2, \ldots \nonumber

We need the normalization constant, N_{n}. We have that

N_{n}=\left\|\phi_{n}\right\|^{2}=\int_{0}^{1} \sin ^{2} n \pi x \, d x=\dfrac{1}{2} . \nonumber

We can now construct the Green’s function for this problem using Equation (8.72)

G(x, \xi)=2 \sum_{n=1}^{\infty} \dfrac{\sin n \pi x \sin n \pi \xi}{\left(4-n^{2} \pi^{2}\right)} . \nonumber

We can use this Green’s function to determine the solution of the boundary value problem. Thus, we have

\begin{aligned} y(x) &=\int_{0}^{1} G(x, \xi) f(\xi) d \xi \\[4pt] &=\int_{0}^{1}\left(2 \sum_{n=1}^{\infty} \dfrac{\sin n \pi x \sin n \pi \xi}{\left(4-n^{2} \pi^{2}\right)}\right) \xi^{2} d \xi \\[4pt] &=2 \sum_{n=1}^{\infty} \dfrac{\sin n \pi x}{\left(4-n^{2} \pi^{2}\right)} \int_{0}^{1} \xi^{2} \sin n \pi \xi d \xi \\[4pt] &=2 \sum_{n=1}^{\infty} \dfrac{\sin n \pi x}{\left(4-n^{2} \pi^{2}\right)}\left[\dfrac{\left(2-n^{2} \pi^{2}\right)(-1)^{n}-2}{n^{3} \pi^{3}}\right] \end{aligned} \nonumber

We can compare this solution to the one we would obtain if we did not employ Green’s functions directly. The eigenfunction expansion method for solving boundary value problems, which we saw earlier, proceeds as follows. We assume that our solution is in the form

y(x)=\sum_{n=1}^{\infty} c_{n} \phi_{n}(x) . \nonumber

Inserting this into the differential equation \mathcal{L}[y]=x^{2} gives

\begin{aligned} x^{2} &=\mathcal{L}\left[\sum_{n=1}^{\infty} c_{n} \sin n \pi x\right] \\[4pt] &=\sum_{n=1}^{\infty} c_{n}\left[\dfrac{d^{2}}{d x^{2}} \sin n \pi x+4 \sin n \pi x\right] \\[4pt] &=\sum_{n=1}^{\infty} c_{n}\left[4-n^{2} \pi^{2}\right] \sin n \pi x \end{aligned} \nonumber

We need the Fourier sine series expansion of x^{2} on [0,1] in order to determine the c_{n} ’s. Thus, we need

\begin{aligned} b_{n} &=\dfrac{2}{1} \int_{0}^{1} x^{2} \sin n \pi x \, d x \\[4pt] &=2\left[\dfrac{\left(2-n^{2} \pi^{2}\right)(-1)^{n}-2}{n^{3} \pi^{3}}\right], \quad n=1,2, \ldots \end{aligned} \nonumber

Thus,

x^{2}=2 \sum_{n=1}^{\infty}\left[\dfrac{\left(2-n^{2} \pi^{2}\right)(-1)^{n}-2}{n^{3} \pi^{3}}\right] \sin n \pi x . \nonumber

Inserting this in Equation (8.75), we find

2 \sum_{n=1}^{\infty}\left[\dfrac{\left(2-n^{2} \pi^{2}\right)(-1)^{n}-2}{n^{3} \pi^{3}}\right] \sin n \pi x=\sum_{n=1}^{\infty} c_{n}\left[4-n^{2} \pi^{2}\right] \sin n \pi x . \nonumber

Due to the linear independence of the eigenfunctions, we can solve for the unknown coefficients to obtain

c_{n}=2 \dfrac{\left(2-n^{2} \pi^{2}\right)(-1)^{n}-2}{\left(4-n^{2} \pi^{2}\right) n^{3} \pi^{3}} \nonumber

Therefore, the solution using the eigenfunction expansion method is

\begin{aligned} y(x) &=\sum_{n=1}^{\infty} c_{n} \phi_{n}(x) \\[4pt] &=2 \sum_{n=1}^{\infty} \dfrac{\sin n \pi x}{\left(4-n^{2} \pi^{2}\right)}\left[\dfrac{\left(2-n^{2} \pi^{2}\right)(-1)^{n}-2}{n^{3} \pi^{3}}\right] \end{aligned} \nonumber

We note that this is the same solution as we had obtained using the Green’s function obtained in series form.

One remaining question is the following: Is there a closed form for the Green’s function and the solution to this problem? The answer is yes! We note that the differential operator is a special case of the example done in Section 8.2.2. Namely, we pick \omega=2. The Green’s function was already found in that section. For this special case, we have

G(x, \xi)=\left\{\begin{array}{l} -\dfrac{\sin 2(1-\xi) \sin 2 x}{2 \sin 2}, 0 \leq x \leq \xi \\[4pt] -\dfrac{\sin 2(1-x) \sin 2 \xi}{2 \sin 2}, \xi \leq x \leq 1 \end{array}\right. \nonumber

What about the solution to the boundary value problem? This solution is given by

\begin{aligned} y(x) &=\int_{0}^{1} G(x, \xi) f(\xi) d \xi \\[4pt] &=-\int_{0}^{x} \dfrac{\sin 2(1-x) \sin 2 \xi}{2 \sin 2} \xi^{2} d \xi+\int_{x}^{1} \dfrac{\sin 2(\xi-1) \sin 2 x}{2 \sin 2} \xi^{2} d \xi \\[4pt] &=-\dfrac{1}{4 \sin 2}\left[-x^{2} \sin 2-\sin 2 \cos ^{2} x+\sin 2+\cos 2 \sin x \cos x+\sin x \cos x\right] \\[4pt] &=-\dfrac{1}{4 \sin 2}\left[-x^{2} \sin 2+\left(1-\cos ^{2} x\right) \sin 2+\sin x \cos x(1+\cos 2)\right] \\[4pt] &=-\dfrac{1}{4 \sin 2}\left[-x^{2} \sin 2+2 \sin ^{2} x \sin 1 \cos 1+2 \sin x \cos x \cos ^{2} 1\right] \\[4pt] &=-\dfrac{1}{8 \sin 1 \cos 1}\left[-x^{2} \sin 2+2 \sin x \cos 1(\sin x \sin 1+\cos x \cos 1)\right] \\[4pt] &=\dfrac{x^{2}}{4}-\dfrac{\sin x \cos (1-x)}{4 \sin 1} . \end{aligned} \nonumber

In Figure 8.4 we show a plot of this solution along with the first five terms of the series solution. The series solution converges quickly.

Figure 8.4. Plots of the exact solution to Example 8.10 with the first five terms of the series solution.
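
A comparison like the one in Figure 8.4 can be reproduced with a few lines of numpy (an illustrative sketch added here; the five-term truncation mirrors the figure, and the printed maximum difference measures how close the truncated series is to the exact solution).

import numpy as np

xs = np.linspace(0, 1, 201)
exact = xs**2/4 - np.sin(xs)*np.cos(1 - xs)/(4*np.sin(1))

series = np.zeros_like(xs)
for n in range(1, 6):     # first five terms of the series solution
    bn = 2*((2 - n**2*np.pi**2)*(-1)**n - 2)/(n**3*np.pi**3)
    series += bn*np.sin(n*np.pi*xs)/(4 - n**2*np.pi**2)

print(np.max(np.abs(series - exact)))   # small, confirming the rapid convergence of the series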

As one last check, we solve the boundary value problem directly, as we had done in Chapter 4. Again, the problem is

y^{\prime \prime}+4 y=x^{2}, \quad x \in(0,1), \quad y(0)=y(1)=0 . \nonumber

The problem has the general solution

y(x)=c_{1} \cos 2 x+c_{2} \sin 2 x+y_{p}(x), \nonumber

where y_{p} is a particular solution of the nonhomogeneous differential equation. Using the Method of Undetermined Coefficients, we assume a solution of the form

y_{p}(x)=A x^{2}+B x+C . \nonumber

Inserting this in the nonhomogeneous equation, we have

2 A+4\left(A x^{2}+B x+C\right)=x^{2}, \nonumber

Thus, 4 A=1, B=0, and 2 A+4 C=0. The solution of this system is

A=\dfrac{1}{4}, \quad B=0, \quad C=-\dfrac{1}{8} \nonumber

So, the general solution of the nonhomogeneous differential equation is

y(x)=c_{1} \cos 2 x+c_{2} \sin 2 x+\dfrac{x^{2}}{4}-\dfrac{1}{8} . \nonumber

We now determine the arbitrary constants using the boundary conditions. We have

\begin{aligned} 0 &=y(0) \\[4pt] &=c_{1}-\dfrac{1}{8} \\[4pt] 0 &=y(1) \\[4pt] &=c_{1} \cos 2+c_{2} \sin 2+\dfrac{1}{8} \end{aligned} \nonumber

Thus, c_{1}=\dfrac{1}{8} and

c_{2}=-\dfrac{\dfrac{1}{8}+\dfrac{1}{8} \cos 2}{\sin 2} \nonumber

Inserting these constants into the solution, we find the same solution as before:

\begin{aligned} y(x) &=\dfrac{1}{8} \cos 2 x-\left[\dfrac{\dfrac{1}{8}+\dfrac{1}{8} \cos 2}{\sin 2}\right] \sin 2 x+\dfrac{x^{2}}{4}-\dfrac{1}{8} \\[4pt] &=\dfrac{\cos 2 x \sin 2-\sin 2 x \cos 2-\sin 2 x}{8 \sin 2}+\dfrac{x^{2}}{4}-\dfrac{1}{8} \\[4pt] &=\dfrac{\left(1-2 \sin ^{2} x\right) \sin 1 \cos 1-\sin x \cos x\left(2 \cos ^{2} 1-1\right)-\sin x \cos x-\sin 1 \cos 1}{8 \sin 1 \cos 1}+\dfrac{x^{2}}{4} \\[4pt] &=-\dfrac{\sin ^{2} x \sin 1+\sin x \cos x \cos 1}{4 \sin 1}+\dfrac{x^{2}}{4} \\[4pt] &=\dfrac{x^{2}}{4}-\dfrac{\sin x \cos (1-x)}{4 \sin 1} . \end{aligned} \nonumber

Problems

8.1. Use the Method of Variation of Parameters to determine the general solution for the following problems.

a. y^{\prime \prime}+y=\tan x.

b. y^{\prime \prime}-4 y^{\prime}+4 y=6 x e^{2 x}

8.2. Instead of assuming that c_{1}^{\prime} y_{1}+c_{2}^{\prime} y_{2}=0 in the derivation of the solution using Variation of Parameters, assume that c_{1}^{\prime} y_{1}+c_{2}^{\prime} y_{2}=h(x) for an arbitrary function h(x) and show that one gets the same particular solution.

8.3. Find the solution of each initial value problem using the appropriate initial value Green’s function.

a. y^{\prime \prime}-3 y^{\prime}+2 y=20 e^{-2 x}, \quad y(0)=0, \quad y^{\prime}(0)=6.

b. y^{\prime \prime}+y=2 \sin 3 x, \quad y(0)=5, \quad y^{\prime}(0)=0.

c. y^{\prime \prime}+y=1+2 \cos x, \quad y(0)=2, \quad y^{\prime}(0)=0.

d. x^{2} y^{\prime \prime}-2 x y^{\prime}+2 y=3 x^{2}-x, \quad y(1)=\pi, \quad y^{\prime}(1)=0.

8.4. Consider the problem y^{\prime \prime}=\sin x, y^{\prime}(0)=0, y(\pi)=0.

a. Solve by direct integration.

b. Determine the Green’s function.

c. Solve the boundary value problem using the Green’s function.

d. Change the boundary conditions to y^{\prime}(0)=5, y(\pi)=-3.

i. Solve by direct integration.

ii. Solve using the Green’s function.

8.5. Consider the problem:

\dfrac{\partial^{2} G}{\partial x^{2}}=\delta\left(x-x_{0}\right), \quad \dfrac{\partial G}{\partial x}\left(0, x_{0}\right)=0, \quad G\left(\pi, x_{0}\right)=0 \nonumber

a. Solve by direct integration.

b. Compare this result to the Green’s function in part b of the last problem.

c. Verify that G is symmetric in its arguments.

8.6. In this problem you will show that the sequence of functions

f_{n}(x)=\dfrac{n}{\pi}\left(\dfrac{1}{1+n^{2} x^{2}}\right) \nonumber

approaches \delta(x) as n \rightarrow \infty. Use the following to support your argument:

a. Show that \lim _{n \rightarrow \infty} f_{n}(x)=0 for x \neq 0.

b. Show that the area under each function is one.

8.7. Verify that the sequence of functions \left\{f_{n}(x)\right\}_{n=1}^{\infty}, defined by f_{n}(x)= \dfrac{n}{2} e^{-n|x|}, approaches a delta function.

8.8. Evaluate the following integrals:
a. \int_{0}^{\pi} \sin x \delta\left(x-\dfrac{\pi}{2}\right) d x.
b. \int_{-\infty}^{\infty} \delta\left(\dfrac{x-5}{3} e^{2 x}\right)\left(3 x^{2}-7 x+2\right) d x
c. \int_{0}^{\pi} x^{2} \delta\left(x+\dfrac{\pi}{2}\right) d x
d. \int_{0}^{\infty} e^{-2 x} \delta\left(x^{2}-5 x+6\right) d x. [See Problem 8.10.]
e. \int_{-\infty}^{\infty}\left(x^{2}-2 x+3\right) \delta\left(x^{2}-9\right) d x. [See Problem 8.10.]

8.9. Find a Fourier series representation of the Dirac delta function, \delta(x), on [-L, L]

8.10. For the case that a function has multiple simple roots, f\left(x_{i}\right)=0, f^{\prime}\left(x_{i}\right) \neq 0, i=1,2, \ldots, it can be shown that

\delta(f(x))=\sum_{i=1}^{n} \dfrac{\delta\left(x-x_{i}\right)}{\left|f^{\prime}\left(x_{i}\right)\right|} \nonumber

Use this result to evaluate \int_{-\infty}^{\infty} \delta\left(x^{2}-5 x+6\right)\left(3 x^{2}-7 x+2\right) d x.

8.11. Consider the boundary value problem: y^{\prime \prime}-y=x, x \in(0,1), with boundary conditions y(0)=y(1)=0.

a. Find a closed form solution without using Green’s functions.

b. Determine the closed form Green’s function using the properties of Green’s functions. Use this Green’s function to obtain a solution of the boundary value problem.

c. Determine a series representation of the Green’s function. Use this Green’s function to obtain a solution of the boundary value problem.

d. Confirm that all of the solutions obtained give the same results.
