
Contents

Preface

1 Introduction
1.1 Review of the First Course
1.1.1 First Order Differential Equations
1.1.2 Second Order Linear Differential Equations
1.1.3 Constant Coefficient Equations
1.1.4 Method of Undetermined Coefficients
1.1.5 Cauchy-Euler Equations
1.2 Overview of the Course
1.3 Appendix: Reduction of Order and Complex Roots
Problems

2 Systems of Differential Equations
2.1 Introduction
2.2 Equilibrium Solutions and Nearby Behaviors
2.2.1 Polar Representation of Spirals
2.3 Matrix Formulation
2.4 Eigenvalue Problems
2.5 Solving Constant Coefficient Systems in 2D
2.6 Examples of the Matrix Method
2.6.1 Planar Systems - Summary
2.7 Theory of Homogeneous Constant Coefficient Systems
2.8 Nonhomogeneous Systems
2.9 Applications
2.9.1 Spring-Mass Systems
2.9.2 Electrical Circuits
2.9.3 Love Affairs
2.9.4 Predator Prey Models
2.9.5 Mixture Problems
2.9.6 Chemical Kinetics
2.9.7 Epidemics
2.10 Appendix: Diagonalization and Linear Systems
Problems

3 Nonlinear Systems
3.1 Introduction
3.2 Autonomous First Order Equations
3.3 Solution of the Logistic Equation
3.4 Bifurcations for First Order Equations
3.5 Nonlinear Pendulum
3.5.1 In Search of Solutions
3.6 The Stability of Fixed Points in Nonlinear Systems
3.7 Nonlinear Population Models
3.8 Limit Cycles
3.9 Nonautonomous Nonlinear Systems
3.9.1 Maple Code for Phase Plane Plots
3.10 Appendix: Period of the Nonlinear Pendulum
Problems

4 Boundary Value Problems
4.1 Introduction
4.2 Partial Differential Equations
4.2.1 Solving the Heat Equation
4.3 Connections to Linear Algebra
4.3.1 Eigenfunction Expansions for PDEs
4.3.2 Eigenfunction Expansions for Nonhomogeneous ODEs
4.3.3 Linear Vector Spaces
Problems

5 Fourier Series
5.1 Introduction
5.2 Fourier Trigonometric Series
5.3 Fourier Series Over Other Intervals
5.3.1 Fourier Series on [a,b]
5.4 Sine and Cosine Series
5.5 Appendix: The Gibbs Phenomenon
Problems

6 Sturm-Liouville Eigenvalue Problems
6.1 Introduction
6.2 Properties of Sturm-Liouville Eigenvalue Problems
6.2.1 Adjoint Operators
6.2.2 Lagrange’s and Green’s Identities
6.2.3 Orthogonality and Reality
6.2.4 The Rayleigh Quotient
6.3 The Eigenfunction Expansion Method
6.4 The Fredholm Alternative Theorem
Problems

7 Special Functions
7.1 Classical Orthogonal Polynomials
7.2 Legendre Polynomials
7.2.1 The Rodrigues Formula
7.2.2 Three Term Recursion Formula
7.2.3 The Generating Function
7.2.4 Eigenfunction Expansions
7.3 Gamma Function
7.4 Bessel Functions
7.5 Hypergeometric Functions
7.6 Appendix: The Binomial Expansion
Problems

8 Green’s Functions
8.1 The Method of Variation of Parameters
8.2 Initial and Boundary Value Green’s Functions
8.2.1 Initial Value Green’s Function
8.2.2 Boundary Value Green’s Function
8.3 Properties of Green’s Functions
8.3.1 The Dirac Delta Function
8.3.2 Green’s Function Differential Equation
8.4 Series Representations of Green’s Functions
Problems

1 Introduction

These are notes for a second course in differential equations originally taught in the Spring semester of 2005 at the University of North Carolina Wilmington to upper level and first year graduate students and later updated in Fall 2007 and Fall 2008. It is assumed that you have had an introductory course in differential equations. However, we will begin this chapter with a review of some of the material from your first course in differential equations and then give an overview of the material we are about to cover.

Typically an introductory course in differential equations introduces students to analytical solutions of first order differential equations which are separable, to first order linear differential equations, and sometimes to some other special types of equations. Students then explore the theory of second order differential equations, generally restricted to the study of exact solutions of constant coefficient linear differential equations or even equations of the Cauchy-Euler type. These are later followed by the study of special techniques, such as power series methods or Laplace transform methods. If time permits, one explores a few special functions, such as Legendre polynomials and Bessel functions, while exploring power series methods for solving differential equations.

More recently, variations on this inventory of topics have been introduced through the early introduction of systems of differential equations, qualitative studies of these systems and a more intense use of technology for understanding the behavior of solutions of differential equations. This is typically done at the expense of not covering power series methods, special functions, or Laplace transforms. In either case, the types of problems solved are initial value problems in which the differential equation to be solved is accompanied by a set of initial conditions.

In this course we will assume some exposure to the overlap of these two approaches. We will first give a quick review of the solution of separable and linear first order equations. Then we will review second order linear differential equations and Cauchy-Euler equations. This will then be followed by an overview of some of the topics covered. As with any course in differential equations, we will emphasize analytical, graphical and (sometimes) approximate solutions of differential equations. Throughout we will present applications from physics, chemistry and biology.

1.1 Review of the First Course

In this section we review a few of the solution techniques encountered in a first course in differential equations. We will not review the basic theory, except for occasional reminders of what we are doing.

We first recall that an n-th order ordinary differential equation is an equation for an unknown function y(x) that expresses a relationship between the unknown function and its first n derivatives. One could write this generally as

F\left(y^{(n)}(x), y^{(n-1)}(x), \ldots, y^{\prime}(x), y(x), x\right)=0 . \nonumber

Here y^{(n)}(x) represents the n-th derivative of y(x).

An initial value problem consists of the differential equation plus the values of the unknown function and its first n-1 derivatives at a particular value of the independent variable, say x_{0}:

y^{(n-1)}\left(x_{0}\right)=y_{n-1}, \quad y^{(n-2)}\left(x_{0}\right)=y_{n-2}, \quad \ldots, \quad y\left(x_{0}\right)=y_{0} . \nonumber

A linear nth order differential equation takes the form

a_{n}(x) y^{(n)}(x)+a_{n-1}(x) y^{(n-1)}(x)+\cdots+a_{1}(x) y^{\prime}(x)+a_{0}(x) y(x)=f(x) . \nonumber

If f(x) \equiv 0, then the equation is said to be homogeneous, otherwise it is nonhomogeneous.

1.1.1 First Order Differential Equations

Typically, the first differential equations encountered are first order equations. A first order differential equation takes the form

F\left(y^{\prime}, y, x\right)=0 . \nonumber

There are two general forms for which one can formally obtain a solution. The first is the separable case and the second is the linear first order equation. We indicate that we can formally obtain solutions, as one can display the needed integration that leads to a solution. However, the resulting integrals are not always reducible to elementary functions, nor does one always obtain an explicit solution even when the integrals are doable.

A first order equation is separable if it can be written in the form

\dfrac{d y}{d x}=f(x) g(y) . \nonumber

Special cases result when either f(x)=1 or g(y)=1. In the first case the equation is said to be autonomous.

The general solution to equation (1.5) is obtained in terms of two integrals:

\int \dfrac{d y}{g(y)}=\int f(x) d x+C, \nonumber

where C is an integration constant. This yields a 1-parameter family of solutions to the differential equation corresponding to different values of C. If one can solve (1.6) for y(x), then one obtains an explicit solution. Otherwise, one has a family of implicit solutions. If an initial condition is given as well, then one might be able to find a member of the family that satisfies this condition, which is often called a particular solution.

Example 1.1. y^{\prime}=2 x y, \quad y(0)=2.

Applying (1.6), one has

\int \dfrac{d y}{y}=\int 2 x d x+C . \nonumber

Integrating yields

\ln |y|=x^{2}+C . \nonumber

Exponentiating, one obtains the general solution,

y(x)=\pm e^{x^{2}+C}=A e^{x^{2}} . \nonumber

Here we have defined A=\pm e^{C}. Since C is an arbitrary constant, A is an arbitrary constant. Several solutions in this 1-parameter family are shown in Figure 1.1.

Next, one seeks a particular solution satisfying the initial condition. For y(0)=2, one finds that A=2. So, the particular solution satisfying the initial condition is y(x)=2 e^{x^{2}}.
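This kind of separable initial value problem is also easy to check with a computer algebra system. The following is a minimal sketch, not part of the original notes, which uses SymPy (assumed to be installed) to confirm the particular solution just found.

```python
# Not in the original notes: a SymPy check of Example 1.1, y' = 2xy, y(0) = 2.
import sympy as sp

x = sp.symbols('x')
y = sp.Function('y')

ode = sp.Eq(y(x).diff(x), 2*x*y(x))
sol = sp.dsolve(ode, y(x), ics={y(0): 2})
print(sol)  # Eq(y(x), 2*exp(x**2)), i.e. y(x) = 2 e^{x^2} as above
```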

Example 1.2. y y^{\prime}=-x.

Following the same procedure as in the last example, one obtains:

\int y d y=-\int x d x+C \Rightarrow y^{2}=-x^{2}+A, \quad \text{where } A=2 C . \nonumber

Thus, we obtain an implicit solution. Writing the solution as x^{2}+y^{2}=A, we see that this is a family of circles for A>0 and the origin for A=0. Plots of some solutions in this family are shown in Figure 1.2.

Figure 1.1. Plots of solutions from the 1-parameter family of solutions of Example 1.1 for several initial conditions.

The second type of first order equation for which we can formally obtain a solution is the linear first order differential equation, which can be written in the form

y^{\prime}(x)+p(x) y(x)=q(x) . \nonumber

In this case one seeks an integrating factor, μ(x), which is a function that one can multiply through the equation, making the left side a perfect derivative. Thus, one obtains

\dfrac{d}{d x}[\mu(x) y(x)]=\mu(x) q(x) . \nonumber

The integrating factor that works is \mu(x)=\exp \left(\int^{x} p(\xi) d \xi\right). One can show this by expanding the derivative in Equation (1.8),

\mu(x) y^{\prime}(x)+\mu^{\prime}(x) y(x)=\mu(x) q(x) \nonumber

and comparing this equation to the one obtained from multiplying (1.7) by μ(x) :

\mu(x) y^{\prime}(x)+\mu(x) p(x) y(x)=\mu(x) q(x) . \nonumber

Note that these last two equations would be the same if

\dfrac{d \mu(x)}{d x}=\mu(x) p(x) . \nonumber

This is a separable first order equation whose solution is the above given form for the integrating factor,

Figure 1.2. Plots of solutions of Example 1.2 for several initial conditions.

\mu(x)=\exp \left(\int^{x} p(\xi) d \xi\right) . \nonumber

Equation (1.8) is easily integrated to obtain

y(x)=\dfrac{1}{\mu(x)}\left[\int^{x} \mu(\xi) q(\xi) d \xi+C\right] . \nonumber

Example 1.3. x y^{\prime}+y=x, \quad x>0, \quad y(1)=0.

One first notes that this is a linear first order differential equation. Solving for y^{\prime}, one can see that the original equation is not separable. However, it is not in the standard form. So, we first rewrite the equation as

\dfrac{d y}{d x}+\dfrac{1}{x} y=1 . \nonumber

Noting that p(x)=\dfrac{1}{x}, we determine the integrating factor

\mu(x)=\exp \left[\int^{x} \dfrac{d \xi}{\xi}\right]=e^{\ln x}=x . \nonumber

Multiplying equation (1.13) by \mu(x)=x, we actually get back the original equation! In this case we have found that x y^{\prime}+y must have been the derivative of something to start. In fact, (x y)^{\prime}=x y^{\prime}+y. Therefore, equation (1.8) becomes

(x y)^{\prime}=x \nonumber

Integrating one obtains

x y=\dfrac{1}{2} x^{2}+C, \nonumber

or

y(x)=\dfrac{1}{2} x+\dfrac{C}{x} . \nonumber

Inserting the initial condition into this solution, we have 0=\dfrac{1}{2}+C. Therefore, C=-\dfrac{1}{2}. Thus, the solution of the initial value problem is y(x)=\dfrac{1}{2}\left(x-\dfrac{1}{x}\right).
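Again, a computer algebra system provides a quick check. The following sketch, not part of the original notes, uses SymPy (assumed installed) to solve the same initial value problem and compare with the answer above.

```python
# Not in the original notes: a SymPy check of Example 1.3, x y' + y = x, y(1) = 0.
import sympy as sp

x = sp.symbols('x', positive=True)
y = sp.Function('y')

ode = sp.Eq(x*y(x).diff(x) + y(x), x)
sol = sp.dsolve(ode, y(x), ics={y(1): 0})
print(sp.simplify(sol.rhs - (x - 1/x)/2))  # 0, so y(x) = (x - 1/x)/2 as above
```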

Example 1.4. (\sin x) y^{\prime}+(\cos x) y=x^{2} \sin x.

Actually, this problem is easy if you realize that

\dfrac{d}{d x}((\sin x) y)=(\sin x) y^{\prime}+(\cos x) y \nonumber

But, we will go through the process of finding the integrating factor for practice.

First, rewrite the original differential equation in standard form:

y^{\prime}+(\cot x) y=x^{2} \nonumber

Then, compute the integrating factor as

\mu(x)=\exp \left(\int^{x} \cot \xi d \xi\right)=e^{\ln (\sin x)}=\sin x \nonumber

Using the integrating factor, the original equation becomes

\dfrac{d}{d x}((\sin x) y)=x^{2} \sin x . \nonumber

Integrating (the right side requires two integrations by parts), we have

y \sin x=-x^{2} \cos x+2 x \sin x+2 \cos x+C . \nonumber

So, the solution is

y=\left(2 x \sin x+\left(2-x^{2}\right) \cos x+C\right) \csc x . \nonumber

There are other first order equations that one can solve for closed form solutions. However, many equations are not solvable, or one is simply interested in the behavior of solutions. In such cases one turns to direction fields. We will return to a discussion of the qualitative behavior of differential equations later in the course.

1.1.2 Second Order Linear Differential Equations

Second order differential equations are typically harder than first order. In most cases students are only exposed to second order linear differential equations. A general form for a second order linear differential equation is given by

a(x) y^{\prime \prime}(x)+b(x) y^{\prime}(x)+c(x) y(x)=f(x) . \nonumber

One can rewrite this equation using operator terminology. Namely, one first defines the differential operator L=a(x) D^{2}+b(x) D+c(x), where D=\dfrac{d}{d x}. Then equation (1.14) becomes

L y=f \nonumber

The solutions of linear differential equations are found by making use of the linearity of L. Namely, we consider the vector space { }^{1} consisting of real-valued functions over some domain. Let f and g be vectors in this function space. L is a linear operator if for two vectors f and g and scalar a, we have that

a. L(f+g)=Lf+Lg

b. L(af)=aLf.

One typically solves (1.14) by finding the general solution of the homogeneous problem,

L y_{h}=0 \nonumber

and a particular solution of the nonhomogeneous problem,

L y_{p}=f \nonumber

Then the general solution of (1.14) is simply given as y=yh+yp. This is true because of the linearity of L. Namely,

L y=L\left(y_{h}+y_{p}\right)=L y_{h}+L y_{p}=0+f=f \nonumber

There are methods for finding a particular solution of a differential equation. These range from pure guessing, to the Method of Undetermined Coefficients, to the Method of Variation of Parameters. We will review some of these methods later.

Determining solutions to the homogeneous problem, L y_{h}=0, is not always easy. However, others have studied a variety of second order linear equations and have saved us the trouble for some of the differential equations that often appear in applications.

{ }^{1} We assume that the reader has been introduced to concepts in linear algebra. Later in the text we will recall the definition of a vector space and see that linear algebra is in the background of the study of many concepts in the solution of differential equations.

Again, linearity is useful in producing the general solution of a homogeneous linear differential equation. If y_{1} and y_{2} are solutions of the homogeneous equation, then the linear combination y=c_{1} y_{1}+c_{2} y_{2} is also a solution of the homogeneous equation. In fact, if y_{1} and y_{2} are linearly independent, { }^{2} then y=c_{1} y_{1}+c_{2} y_{2} is the general solution of the homogeneous problem. As you may recall, linear independence is established if the Wronskian of the solutions is not zero. In this case, we have

W\left(y_{1}, y_{2}\right)=y_{1}(x) y_{2}^{\prime}(x)-y_{1}^{\prime}(x) y_{2}(x) \neq 0 \nonumber

1.1.3 Constant Coefficient Equations

The simplest, and most frequently encountered, second order differential equations are those with constant coefficients. The general form for a homogeneous constant coefficient second order linear differential equation is given as

a y^{\prime \prime}(x)+b y^{\prime}(x)+c y(x)=0, \nonumber

where a, b, and c are constants.

Solutions to (1.18) are obtained by making a guess of y(x)=e^{r x}. Inserting this guess into (1.18) leads to the characteristic equation

a r^{2}+b r+c=0 . \nonumber

The roots of this equation in turn lead to three types of solution depending upon the nature of the roots as shown below.

Example 1.5. y^{\prime \prime}-y^{\prime}-6 y=0, \quad y(0)=2, \quad y^{\prime}(0)=0.

The characteristic equation for this problem is r^{2}-r-6=0. The roots of this equation are found as r=-2,3. Therefore, the general solution can be quickly written down:

y(x)=c_{1} e^{-2 x}+c_{2} e^{3 x} . \nonumber

Note that there are two arbitrary constants in the general solution. Therefore, one needs two pieces of information to find a particular solution. Of course, we have the needed information in the form of the initial conditions.

One also needs to evaluate the first derivative

y^{\prime}(x)=-2 c_{1} e^{-2 x}+3 c_{2} e^{3 x} \nonumber

in order to attempt to satisfy the initial conditions. Evaluating y and y^{\prime} at x=0 yields

\begin{aligned} &2=c_{1}+c_{2} \\[4pt] &0=-2 c_{1}+3 c_{2} \end{aligned} \nonumber

These two equations in two unknowns can readily be solved to give c_{1}=6 / 5 and c_{2}=4 / 5. Therefore, the solution of the initial value problem is obtained as y(x)=\dfrac{6}{5} e^{-2 x}+\dfrac{4}{5} e^{3 x}.

{ }^{2} Recall, a set of functions \left\{y_{i}(x)\right\}_{i=1}^{n} is a linearly independent set if and only if

c_{1} y_{1}(x)+\ldots+c_{n} y_{n}(x)=0 \nonumber

implies c_{i}=0, for i=1, \ldots, n.
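For a constant coefficient initial value problem such as Example 1.5, SymPy can carry out the entire computation. The sketch below is not part of the original notes and assumes SymPy is installed.

```python
# Not in the original notes: a SymPy check of Example 1.5,
# y'' - y' - 6y = 0 with y(0) = 2, y'(0) = 0.
import sympy as sp

x = sp.symbols('x')
y = sp.Function('y')

ode = sp.Eq(y(x).diff(x, 2) - y(x).diff(x) - 6*y(x), 0)
sol = sp.dsolve(ode, y(x), ics={y(0): 2, y(x).diff(x).subs(x, 0): 0})
print(sol)  # y(x) = 6/5 e^{-2x} + 4/5 e^{3x}, as above
```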

Classification of Roots of the Characteristic Equation for Second Order Constant Coefficient ODEs

  1. Real, distinct roots r_{1}, r_{2}. In this case the solutions corresponding to each root are linearly independent. Therefore, the general solution is simply y(x)=c_{1} e^{r_{1} x}+c_{2} e^{r_{2} x}.
  2. Real, equal roots r_{1}=r_{2}=r. In this case the solutions corresponding to each root are linearly dependent. To find a second linearly independent solution, one uses the Method of Reduction of Order. This gives the second solution as x e^{r x}. Therefore, the general solution is found as y(x)=\left(c_{1}+c_{2} x\right) e^{r x}. [This is covered in the appendix to this chapter.]
  3. Complex conjugate roots r_{1}, r_{2}=\alpha \pm i \beta. In this case the solutions corresponding to each root are linearly independent. Making use of Euler’s identity, e^{i \theta}=\cos (\theta)+i \sin (\theta), these complex exponentials can be rewritten in terms of trigonometric functions. Namely, one has that e^{\alpha x} \cos (\beta x) and e^{\alpha x} \sin (\beta x) are two linearly independent solutions. Therefore, the general solution becomes y(x)=e^{\alpha x}\left(c_{1} \cos (\beta x)+c_{2} \sin (\beta x)\right). [This is covered in the appendix to this chapter.]

Example 1.6. y^{\prime \prime}+6 y^{\prime}+9 y=0.

In this example we have r^{2}+6 r+9=0. There is only one root, r=-3. Again, the solution is easily obtained as y(x)=\left(c_{1}+c_{2} x\right) e^{-3 x}.

Example 1.7. y^{\prime \prime}+4 y=0.

The characteristic equation in this case is r^{2}+4=0. The roots are pure imaginary roots, r=\pm 2 i and the general solution consists purely of sinusoidal functions: y(x)=c_{1} \cos (2 x)+c_{2} \sin (2 x)

Example 1.8. y^{\prime \prime}+2 y^{\prime}+4 y=0.

The characteristic equation in this case is r^{2}+2 r+4=0. The roots are complex, r=-1 \pm \sqrt{3} i, and the general solution can be written as y(x)=\left[c_{1} \cos (\sqrt{3} x)+c_{2} \sin (\sqrt{3} x)\right] e^{-x}.

One of the most important applications of the equations in the last two examples is in the study of oscillations. Typical systems are a mass on a spring, or a simple pendulum. For a mass m on a spring with spring constant k>0, one has from Hooke’s law that the position as a function of time, x(t), satisfies the equation

m x^{\prime \prime}+k x=0 . \nonumber

This constant coefficient equation has pure imaginary roots (\alpha=0) and the solutions are pure sines and cosines. Such motion is called simple harmonic motion.

Adding a damping term and periodic forcing complicates the dynamics, but is nonetheless solvable. The next example shows a forced harmonic oscillator.

Example 1.9. y^{\prime \prime}+4 y=\sin x.

This is an example of a nonhomogeneous problem. The homogeneous problem was actually solved in Example 1.7. According to the theory, we need only seek a particular solution to the nonhomogeneous problem and add it to the solution of the last example to get the general solution.

The particular solution can be obtained by purely guessing, making an educated guess, or using the Method of Variation of Parameters. We will not review all of these techniques at this time. Due to the simple form of the driving term, we will make an intelligent guess of y_{p}(x)=A \sin x and determine what A needs to be. Recall, this is the Method of Undetermined Coefficients which we review in the next section. Inserting our guess in the equation gives (-A+4 A) \sin x=\sin x. So, we see that A=1 / 3 works. The general solution of the nonhomogeneous problem is therefore y(x)= c_{1} \cos (2 x)+c_{2} \sin (2 x)+\dfrac{1}{3} \sin x
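A one line computation confirms this choice of A; the following check is not part of the original notes and assumes SymPy is available.

```python
# Not in the original notes: verify that y_p = (1/3) sin x solves y'' + 4y = sin x.
import sympy as sp

x = sp.symbols('x')
yp = sp.sin(x)/3
print(sp.simplify(yp.diff(x, 2) + 4*yp - sp.sin(x)))  # 0
```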

1.1.4 Method of Undetermined Coefficients

To date, we only know how to solve constant coefficient, homogeneous equations. How does one solve a nonhomogeneous equation like that in Equation (1.14)

a(x) y^{\prime \prime}(x)+b(x) y^{\prime}(x)+c(x) y(x)=f(x) \nonumber

Recall, that one solves this equation by finding the general solution of the homogeneous problem,

L y_{h}=0 \nonumber

and a particular solution of the nonhomogeneous problem,

L y_{p}=f \nonumber

Then the general solution of (1.14) is simply given as y=y_{h}+y_{p}. So, how do we find the particular solution? You could guess a solution, but that is not usually possible without a little bit of experience. So we need some other methods. There are two main methods. In the first case, the Method of Undetermined Coefficients, one makes an intelligent guess based on the form of f(x). In the second method, one can systematically develop the particular solution. We will come back to this method, the Method of Variation of Parameters, later in the book.

Let’s solve a simple differential equation highlighting how we can handle nonhomogeneous equations.

Example 1.10. Consider the equation

y^{\prime \prime}+2 y^{\prime}-3 y=4 \nonumber

The first step is to determine the solution of the homogeneous equation. Thus, we solve

y_{h}^{\prime \prime}+2 y_{h}^{\prime}-3 y_{h}=0 . \nonumber

The characteristic equation is r^{2}+2 r-3=0. The roots are r=1,-3. So, we can immediately write the solution

y_{h}(x)=c_{1} e^{x}+c_{2} e^{-3 x} . \nonumber

The second step is to find a particular solution of (1.22). What possible function can we insert into this equation such that only a 4 remains? If we try something proportional to x, then we are left with a linear function after inserting x and its derivatives. Perhaps a constant function you might think. y=4 does not work. But, we could try an arbitrary constant, y=A.

Let’s see. Inserting y=A into (1.22), we obtain

-3 A=4 . \nonumber

Ah ha! We see that we can choose A=-\dfrac{4}{3} and this works. So, we have a particular solution, y_{p}(x)=-\dfrac{4}{3}. This step is done.

Combining our two solutions, we have the general solution to the original nonhomogeneous equation (1.22). Namely,

y(x)=y_{h}(x)+y_{p}(x)=c_{1} e^{x}+c_{2} e^{-3 x}-\dfrac{4}{3} \nonumber

Insert this solution into the equation and verify that it is indeed a solution. If we had been given initial conditions, we could now use them to determine our arbitrary constants.

What if we had a different source term? Consider the equation

y^{\prime \prime}+2 y^{\prime}-3 y=4 x . \nonumber

The only thing that would change is our particular solution. So, we need a guess. We know a constant function does not work by the last example. So, let’s try y_{p}=A x. Inserting this function into Equation (1.24), we obtain

2 A-3 A x=4 x . \nonumber

Picking A=-4 / 3 would get rid of the x terms, but will not cancel everything. We still have a constant left. So, we need something more general.

Let’s try a linear function, y_{p}(x)=A x+B. Then, after substitution into (1.24), we get

2 A-3(A x+B)=4 x . \nonumber

Equating the coefficients of the different powers of x on both sides, we find a system of equations for the undetermined coefficients:

\begin{array}{r} 2 A-3 B=0 \\[4pt] -3 A=4 . \end{array} \nonumber

These are easily solved to obtain

\begin{aligned} &A=-\dfrac{4}{3} \\[4pt] &B=\dfrac{2}{3} A=-\dfrac{8}{9} \end{aligned} \nonumber

So, our particular solution is

y_{p}(x)=-\dfrac{4}{3} x-\dfrac{8}{9} . \nonumber

This gives the general solution to the nonhomogeneous problem as

y(x)=y_{h}(x)+y_{p}(x)=c_{1} e^{x}+c_{2} e^{-3 x}-\dfrac{4}{3} x-\dfrac{8}{9} \nonumber

There are general forms that you can guess based upon the form of the driving term, f(x). Some examples are given in Table 1.1.4. More general applications are covered in a standard text on differential equations. However, the procedure is simple. Given f(x) in a particular form, you make an appropriate guess up to some unknown parameters, or coefficients. Inserting the guess leads to a system of equations for the unknown coefficients. Solve the system and you have your solution. This solution is then added to the general solution of the homogeneous differential equation.

f(x)                                                Guess
a_{n} x^{n}+a_{n-1} x^{n-1}+\cdots+a_{1} x+a_{0}    A_{n} x^{n}+A_{n-1} x^{n-1}+\cdots+A_{1} x+A_{0}
a e^{b x}                                           A e^{b x}
a \cos \omega x+b \sin \omega x                     A \cos \omega x+B \sin \omega x
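The bookkeeping described above can also be automated. The sketch below, not part of the original notes, assumes SymPy and applies the polynomial guess from the table to y^{\prime \prime}+2 y^{\prime}-3 y=4 x, recovering the coefficients found earlier.

```python
# Not in the original notes: Method of Undetermined Coefficients with the
# guess y_p = A x + B for y'' + 2y' - 3y = 4x, using SymPy.
import sympy as sp

x, A, B = sp.symbols('x A B')
yp = A*x + B

residual = yp.diff(x, 2) + 2*yp.diff(x) - 3*yp - 4*x
coeff_eqs = sp.Poly(residual, x).coeffs()   # coefficients of each power of x
print(sp.solve(coeff_eqs, [A, B]))          # {A: -4/3, B: -8/9}, as above
```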

Example 1.11. As a final example, let’s consider the equation

y^{\prime \prime}+2 y^{\prime}-3 y=2 e^{-3 x} \nonumber

According to the above, we would guess a solution of the form y_{p}=A e^{-3 x}. Inserting our guess, we find

0=2 e^{-3 x} \nonumber

Oops! The coefficient, A, disappeared! We cannot solve for it. What went wrong?

The answer lies in the general solution of the homogeneous problem. Note that e^{x} and e^{-3 x} are solutions to the homogeneous problem. So, a multiple of e^{-3 x} will not get us anywhere. It turns out that there is one further modification of the method. If our driving term contains terms that are solutions of the homogeneous problem, then we need to make a guess consisting of the smallest possible power of x times the function which is no longer a solution of the homogeneous problem. Namely, we guess y_{p}(x)=A x e^{-3 x}. We compute the derivative of our guess, y_{p}^{\prime}=A(1-3 x) e^{-3 x} and y_{p}^{\prime \prime}=A(9 x-6) e^{-3 x}. Inserting these into the equation, we obtain

[(9 x-6)+2(1-3 x)-3 x] A e^{-3 x}=2 e^{-3 x} \nonumber

-4 A=2 . \nonumber

So, A=-1 / 2 and y_{p}(x)=-\dfrac{1}{2} x e^{-3 x}.
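A short symbolic check of the modified guess is given below; this sketch is not from the original notes and assumes SymPy is installed.

```python
# Not in the original notes: confirm the modified guess y_p = A x e^{-3x}
# for y'' + 2y' - 3y = 2 e^{-3x}.
import sympy as sp

x, A = sp.symbols('x A')
yp = A*x*sp.exp(-3*x)

residual = sp.simplify(yp.diff(x, 2) + 2*yp.diff(x) - 3*yp)
print(residual)                                        # -4*A*exp(-3*x)
print(sp.solve(sp.Eq(residual, 2*sp.exp(-3*x)), A))    # [-1/2]
```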

Modified Method of Undetermined Coefficients

In general, if any term in the guess y_{p}(x) is a solution of the homogeneous equation, then multiply the guess by x^{k}, where k is the smallest positive integer such that no term in x^{k} y_{p}(x) is a solution of the homogeneous problem.

1.1.5 Cauchy-Euler Equations

Another class of solvable linear differential equations that is of interest is the Cauchy-Euler type of equation. These are given by

a x^{2} y^{\prime \prime}(x)+b x y^{\prime}(x)+c y(x)=0 . \nonumber

Note that in such equations the power of x in each of the coefficients matches the order of the derivative in that term. These equations are solved in a manner similar to the constant coefficient equations.

One begins by making the guess y(x)=x^{r}. Inserting this function and its derivatives,

y^{\prime}(x)=r x^{r-1}, \quad y^{\prime \prime}(x)=r(r-1) x^{r-2}, \nonumber

into Equation (1.28), we have

[a r(r-1)+b r+c] x^{r}=0 \nonumber

Since this has to be true for all x in the problem domain, we obtain the characteristic equation

a r(r-1)+b r+c=0 \nonumber

Just like the constant coefficient differential equation, we have a quadratic equation and the nature of the roots again leads to three classes of solutions. These are shown below. Some of the details are provided in the next section.

Classification of Roots of the Characteristic Equation for Cauchy-Euler Differential Equations

  1. Real, distinct roots r_{1}, r_{2}. In this case the solutions corresponding to each root are linearly independent. Therefore, the general solution is simply y(x)=c_{1} x^{r_{1}}+c_{2} x^{r_{2}}.
  2. Real, equal roots r_{1}=r_{2}=r. In this case the solutions corresponding to each root are linearly dependent. To find a second linearly independent solution, one uses the Method of Reduction of Order. This gives the second solution as x^{r} \ln |x|. Therefore, the general solution is found as y(x)=\left(c_{1}+c_{2} \ln |x|\right) x^{r}.
  3. Complex conjugate roots r_{1}, r_{2}=\alpha \pm i \beta. In this case the solutions corresponding to each root are linearly independent. These complex exponentials can be rewritten in terms of trigonometric functions. Namely, one has that x^{\alpha} \cos (\beta \ln |x|) and x^{\alpha} \sin (\beta \ln |x|) are two linearly independent solutions. Therefore, the general solution becomes y(x)= x^{\alpha}\left(c_{1} \cos (\beta \ln |x|)+c_{2} \sin (\beta \ln |x|)\right).

Example 1.12. x^{2} y^{\prime \prime}+5 x y^{\prime}+12 y=0

As with the constant coefficient equations, we begin by writing down the characteristic equation. Doing a simple computation,

\begin{aligned} 0 &=r(r-1)+5 r+12 \\[4pt] &=r^{2}+4 r+12 \\[4pt] &=(r+2)^{2}+8 \\[4pt] -8 &=(r+2)^{2} \end{aligned} \nonumber

one determines the roots are r=-2 \pm 2 \sqrt{2} i. Therefore, the general solution is y(x)=\left[c_{1} \cos (2 \sqrt{2} \ln |x|)+c_{2} \sin (2 \sqrt{2} \ln |x|)\right] x^{-2}.

Example 1.13. t^{2} y^{\prime \prime}+3 t y^{\prime}+y=0, \quad y(1)=0, y^{\prime}(1)=1.

For this example the characteristic equation takes the form

r(r-1)+3 r+1=0, \nonumber

r^{2}+2 r+1=0 . \nonumber

There is only one real root, r=-1. Therefore, the general solution is

y(t)=\left(c_{1}+c_{2} \ln |t|\right) t^{-1} . \nonumber

However, this problem is an initial value problem. At t=1 we know the values of y and y^{\prime}. Using the general solution, we first have that

0=y(1)=c_{1} \nonumber

Thus, we have so far that y(t)=c_{2} \ln |t| t^{-1}. Now, using the second condition and

y^{\prime}(t)=c_{2}(1-\ln |t|) t^{-2}, \nonumber

we have

1=y^{\prime}(1)=c_{2} . \nonumber

Therefore, the solution of the initial value problem is y(t)=\ln |t| t^{-1}.
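This initial value problem can also be handed to a computer algebra system. The sketch below is not part of the original notes and assumes a recent SymPy.

```python
# Not in the original notes: a SymPy check of Example 1.13,
# t^2 y'' + 3t y' + y = 0 with y(1) = 0, y'(1) = 1.
import sympy as sp

t = sp.symbols('t', positive=True)
y = sp.Function('y')

ode = sp.Eq(t**2*y(t).diff(t, 2) + 3*t*y(t).diff(t) + y(t), 0)
sol = sp.dsolve(ode, y(t), ics={y(1): 0, y(t).diff(t).subs(t, 1): 1})
print(sol)  # Eq(y(t), log(t)/t), i.e. y(t) = ln|t| / t for t > 0
```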

Nonhomogeneous Cauchy-Euler Equations

We can also solve some nonhomogeneous Cauchy-Euler equations using the Method of Undetermined Coefficients. We will demonstrate this with a couple of examples.

Example 1.14. Find the solution of x^{2} y^{\prime \prime}-x y^{\prime}-3 y=2 x^{2}.

First we find the solution of the homogeneous equation. The characteristic equation is r^{2}-2 r-3=0. So, the roots are r=-1,3 and the solution is y_{h}(x)=c_{1} x^{-1}+c_{2} x^{3}

We next need a particular solution. Let’s guess y_{p}(x)=A x^{2}. Inserting the guess into the nonhomogeneous differential equation, we have

\begin{aligned} 2 x^{2} &=x^{2} y^{\prime \prime}-x y^{\prime}-3 y \\[4pt] &=2 A x^{2}-2 A x^{2}-3 A x^{2} \\[4pt] &=-3 A x^{2} \end{aligned} \nonumber

So, A=-2 / 3. Therefore, the general solution of the problem is

y(x)=c_{1} x^{-1}+c_{2} x^{3}-\dfrac{2}{3} x^{2} . \nonumber

Example 1.15. Find the solution of x^{2} y^{\prime \prime}-x y^{\prime}-3 y=2 x^{3}.

In this case the nonhomogeneous term is a solution of the homogeneous problem, which we solved in the last example. So, we will need a modification of the method. We have a problem of the form

a x^{2} y^{\prime \prime}+b x y^{\prime}+c y=d x^{r} \nonumber

where r is a solution of a r(r-1)+b r+c=0. Let’s guess a solution of the form y=A x^{r} \ln x. Then one finds that the differential equation reduces to A x^{r}(2 a r-a+b)=d x^{r}. [You should verify this for yourself.]
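That reduction is straightforward, if tedious, to verify by hand; the sketch below, not part of the original notes, performs the suggested check symbolically with SymPy (assumed installed).

```python
# Not in the original notes: verify that y = A x^r ln x gives
# a x^2 y'' + b x y' + c y = A x^r (2ar - a + b) when a r(r-1) + b r + c = 0.
import sympy as sp

x, A, a, b, c, r = sp.symbols('x A a b c r', positive=True)
y = A*x**r*sp.log(x)

lhs = a*x**2*y.diff(x, 2) + b*x*y.diff(x) + c*y
lhs = lhs.subs(c, -(a*r*(r - 1) + b*r))     # impose the characteristic equation
print(sp.simplify(lhs - A*x**r*(2*a*r - a + b)))  # 0
```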

With this in mind, we can now solve the problem at hand. Let y_{p}= A x^{3} \ln x. Inserting into the equation, we obtain 4 A x^{3}=2 x^{3}, or A=1 / 2. The general solution of the problem can now be written as

y(x)=c_{1} x^{-1}+c_{2} x^{3}+\dfrac{1}{2} x^{3} \ln x . \nonumber

1.2 Overview of the Course

For the most part, your first course in differential equations was about solving initial value problems. When second order equations did not fall into the above cases, then you might have learned how to obtain approximate solutions using power series methods, or even finding new functions from these methods. In this course we will explore two broad topics: systems of differential equations and boundary value problems.

We will see that there are interesting initial value problems when studying systems of differential equations. In fact, many of the second order equations that you have seen in the past can be written as a system of two first order equations. For example, the equation for simple harmonic motion,

x^{\prime \prime}+\omega^{2} x=0 \nonumber

can be written as the system

\begin{gathered} x^{\prime}=y \\[4pt] y^{\prime}=-\omega^{2} x \end{gathered} . \nonumber

Just note that x^{\prime \prime}=y^{\prime}=-\omega^{2} x. Of course, one can generalize this to systems with more complicated right hand sides. The behavior of such systems can be fairly interesting and these systems result from a variety of physical models.
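This rewriting is exactly what a numerical integrator expects as input. The sketch below is not part of the original notes; it uses SciPy (assumed installed) with the illustrative choice \omega=2 and initial data x(0)=1, y(0)=0, for which the exact solution is x(t)=\cos 2 t.

```python
# Not in the original notes: integrate x'' + omega^2 x = 0 as the first order
# system x' = y, y' = -omega^2 x, and compare with the exact solution.
import numpy as np
from scipy.integrate import solve_ivp

omega = 2.0

def rhs(t, u):
    x, y = u                      # u = (x, x')
    return [y, -omega**2 * x]

t_eval = np.linspace(0, 10, 200)
sol = solve_ivp(rhs, (0, 10), [1.0, 0.0], t_eval=t_eval, rtol=1e-8)
print(np.max(np.abs(sol.y[0] - np.cos(omega*t_eval))))  # small numerical error
```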

In the second part of the course we will explore boundary value problems. Often these problems evolve from the study of partial differential equations. Such examples stem from vibrating strings, temperature distributions, bending beams, etc. Boundary conditions are conditions that are imposed at more than one point, while for initial value problems the conditions are specified at one point. For example, we could take the oscillation equation above and ask when solutions of the equation would satisfy the conditions x(0)=0 and x(1)=0. The general solution, as we have determined earlier, is

x(t)=c_{1} \cos \omega t+c_{2} \sin \omega t \nonumber

Requiring x(0)=0, we find that c_{1}=0, leaving x(t)=c_{2} \sin \omega t. Also imposing that 0=x(1)=c_{2} \sin \omega, we are forced to make \omega=n \pi, for n=1,2, \ldots. (Making c_{2}=0 would not give a nonzero solution of the problem.) Thus, there are an infinite number of solutions possible, if we have the freedom to choose our \omega. In the second half of the course we will investigate techniques for solving boundary value problems and look at several applications, including seeing the connections with partial differential equations and Fourier series.

1.3 Appendix: Reduction of Order and Complex Roots

In this section we provide some of the details leading to the general forms for the constant coefficient and Cauchy-Euler differential equations. In the first subsection we review how the Method of Reduction of Order is used to obtain the second linearly independent solutions for the case of one repeated root. In the second subsection we review how the complex solutions can be used to produce two linearly independent real solutions.

Method of Reduction of Order

First we consider constant coefficient equations. In the case when there is a repeated real root, one has only one independent solution, y_{1}(x)=e^{r x}. The question is how does one obtain the second solution? Since the solutions are independent, we must have that the ratio y_{2}(x) / y_{1}(x) is not a constant. So, we guess the form y_{2}(x)=v(x) y_{1}(x)=v(x) e^{r x}. For constant coefficient second order equations, we can write the equation as

(D-r)^{2} y=0 \nonumber

where D=\dfrac{d}{d x}

We now insert y_{2}(x) into this equation. First we compute

(D-r) v e^{r x}=v^{\prime} e^{r x} . \nonumber

Then,

(D-r)^{2} v e^{r x}=(D-r) v^{\prime} e^{r x}=v^{\prime \prime} e^{r x} . \nonumber

So, if y_{2}(x) is to be a solution to the differential equation, (D-r)^{2} y_{2}=0, then v^{\prime \prime}(x) e^{r x}=0 for all x. So, v^{\prime \prime}(x)=0, which implies that

v(x)=a x+b \nonumber

So,

y_{2}(x)=(a x+b) e^{r x} . \nonumber

Without loss of generality, we can take b=0 and a=1 to obtain the second linearly independent solution, y_{2}(x)=x e^{r x}.

Deriving the solution for Case 2 for the Cauchy-Euler equations is messier, but works in the same way. First note that for the real root, r=r_{1}, the characteristic equation has to factor as \left(r-r_{1}\right)^{2}=0. Expanding, we have

r^{2}-2 r_{1} r+r_{1}^{2}=0 \nonumber

The general characteristic equation is

a r(r-1)+b r+c=0 \nonumber

Rewriting this, we have

r^{2}+\left(\dfrac{b}{a}-1\right) r+\dfrac{c}{a}=0 . \nonumber

Comparing equations, we find

\dfrac{b}{a}=1-2 r_{1}, \quad \dfrac{c}{a}=r_{1}^{2} \nonumber

So, the general Cauchy-Euler equation in this case takes the form

x^{2} y^{\prime \prime}+\left(1-2 r_{1}\right) x y^{\prime}+r_{1}^{2} y=0 . \nonumber

Now we seek the second linearly independent solution in the form y_{2}(x)= v(x) x^{r_{1}}. We first list this function and its derivatives,

\begin{aligned} y_{2}(x) &=v x^{r_{1}} \\[4pt] y_{2}^{\prime}(x) &=\left(x v^{\prime}+r_{1} v\right) x^{r_{1}-1}, \\[4pt] y_{2}^{\prime \prime}(x) &=\left(x^{2} v^{\prime \prime}+2 r_{1} x v^{\prime}+r_{1}\left(r_{1}-1\right) v\right) x^{r_{1}-2} . \end{aligned} \nonumber

Inserting these forms into the differential equation, we have

\begin{aligned} 0 &=x^{2} y^{\prime \prime}+\left(1-2 r_{1}\right) x y^{\prime}+r_{1}^{2} y \\[4pt] &=\left(x v^{\prime \prime}+v^{\prime}\right) x^{r_{1}+1} \end{aligned} \nonumber

Thus, we need to solve the equation

x v^{\prime \prime}+v^{\prime}=0, \nonumber

\dfrac{v^{\prime \prime}}{v^{\prime}}=-\dfrac{1}{x} \nonumber

Integrating, we have

\ln \left|v^{\prime}\right|=-\ln |x|+C . \nonumber

Exponentiating, we have one last differential equation to solve,

v^{\prime}=\dfrac{A}{x} . \nonumber

Thus,

v(x)=A \ln |x|+k . \nonumber

So, we have found that the second linearly independent solution can be written as

y_{2}(x)=x^{r_{1}} \ln |x| \nonumber

Complex Roots

When one has complex roots in the solution of constant coefficient equations, one needs to look at the solutions

y_{1,2}(x)=e^{(\alpha \pm i \beta) x} \nonumber

We make use of Euler’s formula

e^{i \beta x}=\cos \beta x+i \sin \beta x . \nonumber

Then the linear combination of y_{1}(x) and y_{2}(x) becomes

\begin{aligned} A e^{(\alpha+i \beta) x}+B e^{(\alpha-i \beta) x} &=e^{\alpha x}\left[A e^{i \beta x}+B e^{-i \beta x}\right] \\[4pt] &=e^{\alpha x}[(A+B) \cos \beta x+i(A-B) \sin \beta x] \\[4pt] & \equiv e^{\alpha x}\left(c_{1} \cos \beta x+c_{2} \sin \beta x\right) . \end{aligned} \nonumber

Thus, we see that we have a linear combination of two real, linearly independent solutions, e^{\alpha x} \cos \beta x and e^{\alpha x} \sin \beta x.

When dealing with the Cauchy-Euler equations, we have solutions of the form y(x)=x^{\alpha+i \beta}. The key to obtaining real solutions is to first recall that

x^{y}=e^{\ln x^{y}}=e^{y \ln x} . \nonumber

Thus, a power can be written as an exponential and the solution can be written as

y(x)=x^{\alpha+i \beta}=x^{\alpha} e^{i \beta \ln x}, \quad x>0 . \nonumber

We can now find two real, linearly independent solutions, x^{\alpha} \cos (\beta \ln |x|) and x^{\alpha} \sin (\beta \ln |x|) following the same steps as above for the constant coefficient case.

Problems

1.1. Find all of the solutions of the first order differential equations. When an initial condition is given, find the particular solution satisfying that condition.
a. \dfrac{d y}{d x}=\dfrac{\sqrt{1-y^{2}}}{x}
b. x y^{\prime}=y(1-2 y), \quad y(1)=2.
c. y^{\prime}-(\sin x) y=\sin x.
d. x y^{\prime}-2 y=x^{2}, y(1)=1.
e. \dfrac{d s}{d t}+2 s=s t^{2}, \quad s(0)=1.
f. x^{\prime}-2 x=t e^{2 t}.

1.2. Find all of the solutions of the second order differential equations. When an initial condition is given, find the particular solution satisfying that condition.
a. y^{\prime \prime}-9 y^{\prime}+20 y=0
b. y^{\prime \prime}-3 y^{\prime}+4 y=0, \quad y(0)=0, \quad y^{\prime}(0)=1.
c. x^{2} y^{\prime \prime}+5 x y^{\prime}+4 y=0, \quad x>0.
d. x^{2} y^{\prime \prime}-2 x y^{\prime}+3 y=0, \quad x>0.

1.3. Consider the differential equation

\dfrac{d y}{d x}=\dfrac{x}{y}-\dfrac{x}{1+y} . \nonumber

a. Find the 1-parameter family of solutions (general solution) of this equation.

b. Find the solution of this equation satisfying the initial condition y(0)=1. Is this a member of the 1-parameter family?

1.4. The initial value problem

\dfrac{d y}{d x}=\dfrac{y^{2}+x y}{x^{2}}, \quad y(1)=1 \nonumber

does not fall into the class of problems considered in our review. However, if one substitutes y(x)=x z(x) into the differential equation, one obtains an equation for z(x) which can be solved. Use this substitution to solve the initial value problem for y(x)

1.5. Consider the nonhomogeneous differential equation x^{\prime \prime}-3 x^{\prime}+2 x=6 e^{3 t}.

a. Find the general solution of the homogenous equation.

b. Find a particular solution using the Method of Undetermined Coefficients by guessing x_{p}(t)=A e^{3 t}.

c. Use your answers in the previous parts to write down the general solution for this problem.

1.6. Find the general solution of each differential equation. When an initial condition is given, find the particular solution satisfying that condition.

a. y^{\prime \prime}-3 y^{\prime}+2 y=20 e^{-2 x}, \quad y(0)=0, \quad y^{\prime}(0)=6.

b. y^{\prime \prime}+y=2 \sin 3 x.

c. y^{\prime \prime}+y=1+2 \cos x.

d. x^{2} y^{\prime \prime}-2 x y^{\prime}+2 y=3 x^{2}-x, \quad x>0

1.7. Verify that the given function is a solution and use Reduction of Order to find a second linearly independent solution.

a. x^{2} y^{\prime \prime}-2 x y^{\prime}-4 y=0, \quad y_{1}(x)=x^{4}.

b. x y^{\prime \prime}-y^{\prime}+4 x^{3} y=0, \quad y_{1}(x)=\sin \left(x^{2}\right).

1.8. A certain model of the motion of a tossed whiffle ball is given by

m x^{\prime \prime}+c x^{\prime}+m g=0, \quad x(0)=0, \quad x^{\prime}(0)=v_{0} . \nonumber

Here m is the mass of the ball, g=9.8 \mathrm{~m} / \mathrm{s}^{2} is the acceleration due to gravity and c is a measure of the damping. Since there is no x term, we can write this as a first order equation for the velocity v(t)=x^{\prime}(t) :

m v^{\prime}+c v+m g=0 . \nonumber

a. Find the general solution for the velocity v(t) of the linear first order differential equation above.

b. Use the solution of part a to find the general solution for the position x(t).

c. Find an expression to determine how long it takes for the ball to reach its maximum height.

d. Assume that c / m=10 \mathrm{~s}^{-1}. For v_{0}=5,10,15,20 \mathrm{~m} / \mathrm{s}, plot the solution, x(t), versus the time.

e. From your plots and the expression in part c, determine the rise time. Do these answers agree?

f. What can you say about the time it takes for the ball to fall as compared to the rise time?

2 Systems of Differential Equations

2.1 Introduction

In this chapter we will begin our study of systems of differential equations. After defining first order systems, we will look at constant coefficient systems and the behavior of solutions for these systems. Also, most of the discussion will focus on planar, or two dimensional, systems. For such systems we will be able to look at a variety of graphical representations of the family of solutions and discuss the qualitative features of systems we can solve in preparation for the study of systems whose solutions cannot be found in an algebraic form.

A general form for first order systems in the plane is given by a system of two equations for unknowns x(t) and y(t) :

\begin{aligned} &x^{\prime}(t)=P(x, y, t) \\[4pt] &y^{\prime}(t)=Q(x, y, t) \end{aligned} \nonumber

An autonomous system is one in which there is no explicit time dependence:

\begin{aligned} &x^{\prime}(t)=P(x, y) \\[4pt] &y^{\prime}(t)=Q(x, y) \end{aligned} \nonumber

Otherwise the system is called nonautonomous.

A linear system takes the form

\begin{aligned} x^{\prime} &=a(t) x+b(t) y+e(t) \\[4pt] y^{\prime} &=c(t) x+d(t) y+f(t) \end{aligned} \nonumber

A homogeneous linear system results when e(t)=0 and f(t)=0.

A linear, constant coefficient system of first order differential equations is given by

\begin{aligned} &x^{\prime}=a x+b y+e \\[4pt] &y^{\prime}=c x+d y+f \end{aligned} \nonumber

We will focus on linear, homogeneous systems of constant coefficient first order differential equations:

\begin{aligned} &x^{\prime}=a x+b y \\[4pt] &y^{\prime}=c x+d y \end{aligned} \nonumber

As we will see later, such systems can result by a simple translation of the unknown functions. These equations are said to be coupled if either b \neq 0 or c \neq 0.

We begin by noting that the system (2.5) can be rewritten as a second order constant coefficient linear differential equation, which we already know how to solve. We differentiate the first equation in system (2.5) and systematically replace occurrences of y and y^{\prime}, since we also know from the first equation that y=\dfrac{1}{b}\left(x^{\prime}-a x\right). Thus, we have

\begin{aligned} x^{\prime \prime} &=a x^{\prime}+b y^{\prime} \\[4pt] &=a x^{\prime}+b(c x+d y) \\[4pt] &=a x^{\prime}+b c x+d\left(x^{\prime}-a x\right) . \end{aligned} \nonumber

Rewriting the last line, we have

x^{\prime \prime}-(a+d) x^{\prime}+(a d-b c) x=0 .

This is a linear, homogeneous, constant coefficient ordinary differential equation. We know that we can solve this by first looking at the roots of the characteristic equation

r^{2}-(a+d) r+a d-b c=0 \nonumber

and writing down the appropriate general solution for x(t). Then we can find y(t) using Equation (2.5):

y=\dfrac{1}{b}\left(x^{\prime}-a x\right) . \nonumber

We now demonstrate this for a specific example.

Example 2.1. Consider the system of differential equations

\begin{aligned} &x^{\prime}=-x+6 y \\[4pt] &y^{\prime}=x-2 y . \end{aligned} \nonumber

Carrying out the above outlined steps, we have that x^{\prime \prime}+3 x^{\prime}-4 x=0. This can be shown as follows:

\begin{aligned} x^{\prime \prime} &=-x^{\prime}+6 y^{\prime} \\[4pt] &=-x^{\prime}+6(x-2 y) \\[4pt] &=-x^{\prime}+6 x-12\left(\dfrac{x^{\prime}+x}{6}\right) \\[4pt] &=-3 x^{\prime}+4 x \end{aligned} \nonumber

The resulting differential equation has a characteristic equation of r^{2}+3 r- 4=0. The roots of this equation are r=1,-4. Therefore, x(t)=c_{1} e^{t}+c_{2} e^{-4 t} . But, we still need y(t). From the first equation of the system we have

y(t)=\dfrac{1}{6}\left(x^{\prime}+x\right)=\dfrac{1}{6}\left(2 c_{1} e^{t}-3 c_{2} e^{-4 t}\right) . \nonumber

Thus, the solution to our system is

\begin{aligned} &x(t)=c_{1} e^{t}+c_{2} e^{-4 t}, \\[4pt] &y(t)=\dfrac{1}{3} c_{1} e^{t}-\dfrac{1}{2} c_{2} e^{-4 t} \end{aligned} \nonumber

Sometimes one needs initial conditions. For these systems we would specify conditions like x(0)=x_{0} and y(0)=y_{0}. These would allow the determination of the arbitrary constants as before.

Example 2.2. Solve

\begin{aligned} &x^{\prime}=-x+6 y \\[4pt] &y^{\prime}=x-2 y \end{aligned} \nonumber

given x(0)=2, y(0)=0.

We already have the general solution of this system in (2.11). Inserting the initial conditions, we have

\begin{aligned} &2=c_{1}+c_{2}, \\[4pt] &0=\dfrac{1}{3} c_{1}-\dfrac{1}{2} c_{2} . \end{aligned} \nonumber

Solving for c_{1} and c_{2} gives c_{1}=6 / 5 and c_{2}=4 / 5. Therefore, the solution of the initial value problem is

\begin{aligned} &x(t)=\dfrac{2}{5}\left(3 e^{t}+2 e^{-4 t}\right) \\[4pt] &y(t)=\dfrac{2}{5}\left(e^{t}-e^{-4 t}\right) \end{aligned} \nonumber
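Recent versions of SymPy can solve such linear systems directly; the sketch below is not part of the original notes and assumes SymPy is installed.

```python
# Not in the original notes: solve the system of Example 2.2 with SymPy,
# x' = -x + 6y, y' = x - 2y, x(0) = 2, y(0) = 0.
import sympy as sp

t = sp.symbols('t')
x = sp.Function('x')
y = sp.Function('y')

eqs = [sp.Eq(x(t).diff(t), -x(t) + 6*y(t)),
       sp.Eq(y(t).diff(t), x(t) - 2*y(t))]
sol = sp.dsolve(eqs, [x(t), y(t)], ics={x(0): 2, y(0): 0})
print(sol)  # x(t) = (2/5)(3 e^t + 2 e^{-4t}), y(t) = (2/5)(e^t - e^{-4t})
```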

2.2 Equilibrium Solutions and Nearby Behaviors

In studying systems of differential equations, it is often useful to study the behavior of solutions without obtaining an algebraic form for the solution. This is done by exploring equilibrium solutions and solutions nearby equilibrium solutions. Such techniques will be seen to be useful later in studying nonlinear systems.

We begin this section by studying equilibrium solutions of system (2.4). For equilibrium solutions the system does not change in time. Therefore, equilibrium solutions satisfy the equations x^{\prime}=0 and y^{\prime}=0. Of course, this can only happen for constant solutions. Let x_{0} and y_{0} be the (constant) equilibrium solutions. Then, x_{0} and y_{0} must satisfy the system

\begin{aligned} &0=a x_{0}+b y_{0}+e \\[4pt] &0=c x_{0}+d y_{0}+f \end{aligned} \nonumber

This is a linear system of nonhomogeneous algebraic equations. One only has a unique solution when the determinant of the system is not zero, i.e., a d-b c \neq 0. Using Cramer’s (determinant) Rule for solving such systems, we have

x_{0}=-\dfrac{\left|\begin{array}{ll} e & b \\[4pt] f & d \end{array}\right|}{\left|\begin{array}{ll} a & b \\[4pt] c & d \end{array}\right|}, \quad y_{0}=-\dfrac{\left|\begin{array}{ll} a & e \\[4pt] c & f \end{array}\right|}{\left|\begin{array}{ll} a & b \\[4pt] c & d \end{array}\right|} . \quad \text { (2.16) } \nonumber

If the system is homogeneous, e=f=0, then we have that the origin is the equilibrium solution; i.e., \left(x_{0}, y_{0}\right)=(0,0). Often we will have this case since one can always make a change of coordinates from (x, y) to (u, v) by u=x-x_{0} and v=y-y_{0}. Then, u_{0}=v_{0}=0.
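Numerically, finding an equilibrium is just a 2 x 2 linear solve. The sketch below is not part of the original notes; the coefficient values are hypothetical and only illustrate the computation (NumPy assumed installed).

```python
# Not in the original notes: equilibrium of x' = ax + by + e, y' = cx + dy + f
# for hypothetical coefficients, via a linear solve (requires ad - bc != 0).
import numpy as np

a, b, c, d, e, f = 1.0, 2.0, 3.0, -1.0, 4.0, -5.0   # illustrative values only

M = np.array([[a, b], [c, d]])
x0, y0 = np.linalg.solve(M, -np.array([e, f]))
print(x0, y0)   # the equilibrium point (x_0, y_0)
```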

Next we are interested in the behavior of solutions near the equilibrium solutions. Later this behavior will be useful in analyzing more complicated nonlinear systems. We will look at some simple systems that are readily solved.

Example 2.3. Stable Node (sink)

Consider the system

\begin{aligned} &x^{\prime}=-2 x \\[4pt] &y^{\prime}=-y \end{aligned} \nonumber

This is a simple uncoupled system. Each equation is simply solved to give

x(t)=c_{1} e^{-2 t} \text { and } y(t)=c_{2} e^{-t} . \nonumber

In this case we see that all solutions tend towards the equilibrium point, (0,0). This will be called a stable node, or a sink.

Before looking at other types of solutions, we will explore the stable node in the above example. There are several methods of looking at the behavior of solutions. We can look at solution plots of the dependent versus the independent variables, or we can look in the x y-plane at the parametric curves (x(t), y(t))

Solution Plots: One can plot each solution as a function of t given a set of initial conditions. Examples are shown in Figure 2.1 for several initial conditions. Note that the solutions decay for large t. Special cases result for various initial conditions. Note that for t=0, x(0)=c_{1} and y(0)=c_{2}. (Of course, one can provide initial conditions at any t=t_{0}. It is generally easier to pick t=0 in our general explanations.) If we pick an initial condition with c_{1}=0, then x(t)=0 for all t. One obtains similar results when setting y(0)=0.

Figure 2.1. Plots of solutions of Example 2.3 for several initial conditions.

Phase Portrait: There are other types of plots which can provide additional information about our solutions even if we cannot find the exact solutions as we can for these simple examples. In particular, one can consider the solutions x(t) and y(t) as the coordinates along a parameterized path, or curve, in the plane: \mathbf{r}=(x(t), y(t)). Such curves are called trajectories or orbits. The x y-plane is called the phase plane and a collection of such orbits gives a phase portrait for the family of solutions of the given system.

One method for determining the equations of the orbits in the phase plane is to eliminate the parameter t between the known solutions to get a relationship between x and y. In the above example we can do this, since the solutions are known. In particular, we have

x=c_{1} e^{-2 t}=c_{1}\left(\dfrac{y}{c_{2}}\right)^{2} \equiv A y^{2} . \nonumber

Another way to obtain information about the orbits comes from noting that the slopes of the orbits in the x y-plane are given by d y / d x. For autonomous systems, we can write this slope just in terms of x and y. This leads to a first order differential equation, which possibly could be solved analytically, solved numerically, or just used to produce a direction field. We will see that direction fields are useful in determining qualitative behaviors of the solutions without actually finding explicit solutions.

First we will obtain the orbits for Example 2.3 by solving the corresponding slope equation. First, recall that for trajectories defined parametrically by x=x(t) and y=y(t), we have from the Chain Rule for y=y(x(t)) that

\dfrac{d y}{d t}=\dfrac{d y}{d x} \dfrac{d x}{d t} \nonumber

Therefore,

\dfrac{d y}{d x}=\dfrac{\dfrac{d y}{d t}}{\dfrac{d x}{d t}} \nonumber

For the system in (2.17) we use Equation (2.18) to obtain the equation for the slope at a point on the orbit:

\dfrac{d y}{d x}=\dfrac{y}{2 x} \nonumber

The general solution of this first order differential equation is found using separation of variables as x=A y^{2} for A an arbitrary constant. Plots of these solutions in the phase plane are given in Figure 2.2. [Note that this is the same form for the orbits that we had obtained above by eliminating t from the solution of the system.]

Figure 2.2. Orbits for Example 2.3.

Once one has solutions to differential equations, we often are interested in the long time behavior of the solutions. Given a particular initial condition \left(x_{0}, y_{0}\right), how does the solution behave as time increases? For orbits near an equilibrium solution, do the solutions tend towards, or away from, the equilibrium point? The answer is obvious when one has the exact solutions x(t) and y(t). However, this is not always the case. Let’s consider the above example for initial conditions in the first quadrant of the phase plane. For a point in the first quadrant we have that

d x / d t=-2 x<0 \nonumber

meaning that as t \rightarrow \infty, x(t) decreases. Similarly,

d y / d t=-y<0, \nonumber

indicates that y(t) is also getting smaller for this problem. Thus, these orbits tend towards the origin as t \rightarrow \infty. This qualitative information was obtained without relying on the known solutions to the problem.

Direction Fields: Another way to determine the behavior of our system is to draw the direction field. Recall that a direction field is a vector field in which one plots arrows in the direction of tangents to the orbits. This is done because the slopes of the tangent lines are given by d y / d x. For our system (2.5), the slope is

\dfrac{d y}{d x}=\dfrac{c x+d y}{a x+b y} . \nonumber

In general, for autonomous systems, we obtain a first order differential equation of the form

\dfrac{d y}{d x}=F(x, y) . \nonumber

This particular equation can be solved by the reader. See homework problem 2.2

Example 2.4. Draw the direction field for Example 2.3.

We can use software to draw direction fields. However, one can sketch these fields by hand. At a given point (x, y) in the plane, the slope of the tangent to the orbit through that point is given by

\dfrac{d y}{d x}=\dfrac{-y}{-2 x}=\dfrac{y}{2 x} \nonumber

For each point in the plane one draws a piece of tangent line with this slope. In Figure 2.3 we show a few of these. For (x, y)=(1,1) the slope is d y / d x=1 / 2. So, we draw an arrow with slope 1 / 2 at this point. From system (2.17), we have that x^{\prime} and y^{\prime} are both negative at this point. Therefore, the vector points down and to the left.

We can do this for several points, as shown in Figure 2.3. Sometimes one can quickly sketch vectors with the same slope. For this example, when y=0, the slope is zero and when x=0 the slope is infinite, so several such vectors can be drawn at once. Vectors with the same slope lie along curves known as isoclines, on which \dfrac{d y}{d x}= constant.

It is often difficult to provide an accurate sketch of a direction field. Computer software can be used to provide a better rendition. For Example 2.3 the direction field is shown in Figure 2.4. Looking at this direction field, one can begin to "see" the orbits by following the tangent vectors.
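A direction field like the one in Figure 2.4 can be produced with a few lines of plotting code. The sketch below is not part of the original notes and assumes NumPy and Matplotlib are installed.

```python
# Not in the original notes: direction field for Example 2.3, x' = -2x, y' = -y.
import numpy as np
import matplotlib.pyplot as plt

x, y = np.meshgrid(np.linspace(-2, 2, 20), np.linspace(-2, 2, 20))
u, v = -2*x, -y                       # right hand sides of the system
norm = np.hypot(u, v) + 1e-12         # normalize so arrows have equal length
plt.quiver(x, y, u/norm, v/norm, angles='xy')
plt.xlabel('x'); plt.ylabel('y')
plt.title('Direction field for Example 2.3')
plt.show()
```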

Figure 2.3. A sketch of several tangent vectors for Example 2.3.
Figure 2.4. Direction field for Example 2.3.

Of course, one can superimpose the orbits on the direction field. This is shown in Figure 2.5. Are these the patterns you saw in Figure 2.4 ?

In this example we see all orbits "flow" towards the origin, or equilibrium point. Again, this is an example of what is called a stable node or a sink. (Imagine what happens to the water in a sink when the drain is unplugged.)

Example 2.5. Saddle

image
Figure 2.5. Phase portrait for Example 2.3.

\begin{aligned} &x^{\prime}=-x \\[4pt] &y^{\prime}=y \end{aligned} \nonumber

This is another uncoupled system. The solutions are again simply gotten by integration. We have that x(t)=c_{1} e^{-t} and y(t)=c_{2} e^{t}. Here we have that x decays as t gets large and y increases as t gets large. In particular, if one picks initial conditions with c_{2}=0, then orbits follow the x-axis towards the origin. For initial points with c_{1}=0, orbits originating on the y-axis will flow away from the origin. Of course, in these cases the origin is an equilibrium point and once at equilibrium, one remains there.

In fact, there is only one line on which to pick initial conditions such that the orbit leads towards the equilibrium point. No matter how small c_{2} is, sooner or later the exponential growth term will dominate the solution. One can see this behavior in Figure 2.6.

Similar to the first example, we can look at a variety of plots. These are given by Figures 2.6-2.7. The orbits can be obtained from the system as

\dfrac{d y}{d x}=\dfrac{d y / d t}{d x / d t}=-\dfrac{y}{x} \nonumber

The solution is y=\dfrac{A}{x}. For different values of A \neq 0 we obtain a family of hyperbolae. These are the same curves one might obtain for the level curves of a surface known as a saddle surface, z=x y. Thus, this type of equilibrium point is classified as a saddle point. From the phase portrait we can verify that there are many orbits that lead away from the origin (the equilibrium point), but there is one line of initial conditions, the x-axis, that leads to the origin.

image
Figure 2.6. Plots of solutions of Example 2.5 for several initial conditions.
image
Figure 2.7. Phase portrait for Example 2.5, a saddle.

Example 2.6. Unstable Node (source)

\begin{aligned} &x^{\prime}=2 x \\[4pt] &y^{\prime}=y \end{aligned} \nonumber

This example is similar to Example 2.3. The solutions are obtained by replacing t with -t. The solutions, orbits and direction fields can be seen in Figures 2.8-2.9. This is once again a node, but all orbits lead away from the equilibrium point. It is called an unstable node or a source.

image
Figure 2.8. Plots of solutions of Example 2.6 for several initial conditions.

Example 2.7. Center

\begin{aligned} &x^{\prime}=y \\[4pt] &y^{\prime}=-x \end{aligned} \nonumber

This system is a simple, coupled system. Neither equation can be solved without some information about the other unknown function. However, we can differentiate the first equation and use the second equation to obtain

x^{\prime \prime}+x=0 . \nonumber

We recognize this equation from the last chapter as one that appears in the study of simple harmonic motion. The solutions are pure sinusoidal oscillations:

x(t)=c_{1} \cos t+c_{2} \sin t, \quad y(t)=-c_{1} \sin t+c_{2} \cos t . \nonumber

In the phase plane the trajectories can be determined either by looking at the direction field, or solving the first order equation

image
Figure 2.9. Phase portrait for Example 2.6, an unstable node or source.

\dfrac{d y}{d x}=-\dfrac{x}{y} \text {. } \nonumber

Performing a separation of variables and integrating, we find that

x^{2}+y^{2}=C . \nonumber

Thus, we have a family of circles for C>0. (Can you prove this using the general solution?) Looking at the results graphically in Figures 2.10-2.11 confirms this result. This type of point is called a center.
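One way to answer the question posed above is to compute x^{2}+y^{2} directly from the general solution:

\begin{aligned} x^{2}(t)+y^{2}(t) &=\left(c_{1} \cos t+c_{2} \sin t\right)^{2}+\left(-c_{1} \sin t+c_{2} \cos t\right)^{2} \\[4pt] &=c_{1}^{2}\left(\cos ^{2} t+\sin ^{2} t\right)+c_{2}^{2}\left(\sin ^{2} t+\cos ^{2} t\right)+2 c_{1} c_{2} \cos t \sin t-2 c_{1} c_{2} \sin t \cos t \\[4pt] &=c_{1}^{2}+c_{2}^{2} . \end{aligned} \nonumber

So each orbit is a circle of radius \sqrt{c_{1}^{2}+c_{2}^{2}} centered at the origin, determined by the initial condition.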

Example 2.8. Focus (spiral)

\begin{aligned} &x^{\prime}=\alpha x+y \\[4pt] &y^{\prime}=-x . \end{aligned} \nonumber

In this example, we will see an additional set of behaviors of equilibrium points in planar systems. We have added one term, \alpha x, to the system in Example 2.7. We will consider the effects for two specific values of the parameter: \alpha=0.1,-0.2. The resulting behaviors are shown in the remaining graphs. We see orbits that look like spirals. These orbits are stable and unstable spirals (or foci, the plural of focus.)

We can understand these behaviors by once again relating the system of first order differential equations to a second order differential equation. Using our usual method for obtaining a second order equation from a system, we find that x(t) satisfies the differential equation

image
Figure 2.10. Plots of solutions of Example 2.7 for several initial conditions.
image
Figure 2.11. Phase portrait for Example 2.7, a center.

x^{\prime \prime}-\alpha x^{\prime}+x=0 \nonumber

We recall from our first course that this is a form of damped simple harmonic motion. We will explore the different types of solutions that will result for various \alpha ’s.

image
Figure 2.12. Plots of solutions of Example 2.8 for several initial conditions with \alpha=0.1
image
Figure 2.13. Plots of solutions of Example 2.8 for several initial conditions with \alpha=-0.2

The characteristic equation is r^{2}-\alpha r+1=0. The solution of this quadratic equation is

r=\dfrac{\alpha \pm \sqrt{\alpha^{2}-4}}{2} \nonumber

There are seven special cases to consider, as shown below.

\text { Classification of Solutions of } x^{\prime \prime}-\alpha x^{\prime}+x=0 \nonumber

  1. \alpha=-2. There is one real solution. This case is called critical damping since the solution r=-1 leads to exponential decay. The solution is x(t)=\left(c_{1}+c_{2} t\right) e^{-t}.
  2. \alpha<-2. There are two real, negative solutions, r=-\mu,-\nu, \mu, \nu>0. The solution is x(t)=c_{1} e^{-\mu t}+c_{2} e^{-\nu t}. In this case we have what is called overdamped motion. There are no oscillations.
  3. -2<\alpha<0. There are two complex conjugate solutions r= \alpha / 2 \pm i \beta with real part less than zero and \beta=\dfrac{\sqrt{4-\alpha^{2}}}{2}. The solution is x(t)=\left(c_{1} \cos \beta t+c_{2} \sin \beta t\right) e^{\alpha t / 2}. Since \alpha<0, this consists of a decaying exponential times oscillations. This is often called an underdamped oscillation.
  4. \alpha=0. This leads to simple harmonic motion.
  5. 0<\alpha<2. This is similar to the underdamped case, except \alpha>0. The solutions are growing oscillations.
  6. \alpha=2. There is one real solution. The solution is x(t)=\left(c_{1}+\right. \left.c_{2} t\right) e^{t}. It leads to unbounded growth in time.
  7. \alpha>2. There are two real, positive solutions, r=\mu, \nu>0. The solution is x(t)=c_{1} e^{\mu t}+c_{2} e^{\nu t}, which grows in time.

For \alpha<0 the solutions lose energy, so they can oscillate with a diminishing amplitude. For \alpha>0, the amplitude grows in time, which is not typical of damped motion. Of course, if the magnitude of \alpha is 2 or larger there are no oscillations at all.
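This classification is easy to reproduce numerically. The short Python sketch below (the sample values of \alpha are arbitrary illustrative choices) computes the roots of r^{2}-\alpha r+1=0 and reports their type.

import numpy as np

def classify(alpha):
    # Roots of the characteristic equation r^2 - alpha r + 1 = 0
    roots = np.roots([1.0, -alpha, 1.0])
    if abs(alpha) == 2:
        kind = "one repeated real root"
    elif abs(alpha) > 2:
        kind = "two real roots"
    else:
        kind = "complex conjugate roots"
    return roots, kind

for alpha in [-3.0, -2.0, -0.2, 0.0, 0.1, 2.0, 3.0]:
    roots, kind = classify(alpha)
    print(f"alpha = {alpha:5.1f}: {kind}, r = {np.round(roots, 3)}")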

Example 2.9. Degenerate Node

\begin{aligned} &x^{\prime}=-x \\[4pt] &y^{\prime}=-2 x-y \end{aligned} \nonumber

For this example, we write out the solutions. While it is a coupled system, only the second equation is coupled. There are two possible approaches.

a. We could solve the first equation to find x(t)=c_{1} e^{-t}. Inserting this solution into the second equation, we have

y^{\prime}+y=-2 c_{1} e^{-t} \nonumber

This is a relatively simple linear first order equation for y=y(t). The integrating factor is \mu=e^{t}. The solution is found as y(t)=\left(c_{2}-2 c_{1} t\right) e^{-t}.

b. Another method would be to proceed to rewrite this as a second order equation. Computing x^{\prime \prime} does not get us very far. So, we look at

image
Figure 2.14. Phase portrait for Example 2.8 with \alpha=0.1. This is an unstable focus, or spiral.

\begin{aligned} y^{\prime \prime} &=-2 x^{\prime}-y^{\prime} \\[4pt] &=2 x-y^{\prime} \\[4pt] &=-2 y^{\prime}-y \end{aligned} \nonumber

Therefore, y satisfies

y^{\prime \prime}+2 y^{\prime}+y=0 . \nonumber

The characteristic equation has one real root, r=-1. So, we write

y(t)=\left(k_{1}+k_{2} t\right) e^{-t} . \nonumber

This is a stable degenerate node. Combining this with the solution x(t)= c_{1} e^{-t}, we can show that y(t)=\left(c_{2}-2 c_{1} t\right) e^{-t} as before.

In Figure 2.16 we see several orbits in this system. It differs from the stable node shown in Figure 2.2 in that there is only one direction along which the orbits approach the origin instead of two. If one picks c_{1}=0, then x(t)=0 and y(t)=c_{2} e^{-t}. This leads to orbits running along the y-axis as seen in the figure.

image
Figure 2.15. Phase portrait for Example 2.8 with \alpha=-0.2. This is a stable focus, or spiral.

Example 2.10. In this last example, we have the coupled set of equations

\begin{aligned} &x^{\prime}=2 x-y \\[4pt] &y^{\prime}=-2 x+y \end{aligned} \nonumber

We rewrite this system as a second order differential equation:

\begin{aligned} x^{\prime \prime} &=2 x^{\prime}-y^{\prime} \\[4pt] &=2 x^{\prime}-(-2 x+y) \\[4pt] &=2 x^{\prime}+2 x+\left(x^{\prime}-2 x\right)=3 x^{\prime} \end{aligned} \nonumber

So, the second order equation is

x^{\prime \prime}-3 x^{\prime}=0 \nonumber

and the characteristic equation is 0=r(r-3). This gives the general solution as

x(t)=c_{1}+c_{2} e^{3 t} \nonumber

and thus

y=2 x-x^{\prime}=2\left(c_{1}+c_{2} e^{3 t}\right)-\left(3 c_{2} e^{3 t}\right)=2 c_{1}-c_{2} e^{3 t} \nonumber

In Figure 2.17 we show the direction field. The constant slope field seen in this example is confirmed by a simple computation:

\dfrac{d y}{d x}=\dfrac{-2 x+y}{2 x-y}=-1 \nonumber

Furthermore, looking at initial conditions with y=2 x, we have at t=0,

image
Figure 2.16. Plots of solutions of Example 2.9 for several initial conditions.

2 c_{1}-c_{2}=2\left(c_{1}+c_{2}\right) \quad \Rightarrow \quad c_{2}=0 \nonumber

Therefore, points on this line remain on this line forever: (x, y)=\left(c_{1}, 2 c_{1}\right). This line of fixed points is called a line of equilibria.
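A quick numerical experiment (a sketch using scipy; the two initial points are arbitrary choices) illustrates the line of equilibria: a point on y=2 x does not move, while any other point runs off along a line of slope -1.

import numpy as np
from scipy.integrate import solve_ivp

# Example 2.10: x' = 2x - y, y' = -2x + y
def rhs(t, u):
    x, y = u
    return [2 * x - y, -2 * x + y]

for u0 in ([1.0, 2.0], [1.0, 0.0]):      # on the line y = 2x, and off it
    sol = solve_ivp(rhs, (0.0, 1.0), u0, t_eval=[0.0, 0.5, 1.0])
    print("start", u0, "->", np.round(sol.y[:, -1], 3))
# (1, 2) stays fixed; (1, 0) moves away along a line of slope -1.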

Polar Representation of Spirals

In the examples with a center or a spiral, one might be able to write the solutions in polar coordinates. Recall that a point in the plane can be described by either Cartesian (x, y) or polar (r, \theta) coordinates. Given the polar form, one can find the Cartesian components using

x=r \cos \theta \text { and } y=r \sin \theta . \nonumber

Given the Cartesian coordinates, one can find the polar coordinates using

r^{2}=x^{2}+y^{2} \text { and } \tan \theta=\dfrac{y}{x} . \nonumber

Since x and y are functions of t, then naturally we can think of r and \theta as functions of t. The equations that they satisfy are obtained by differentiating the above relations with respect to t.

image
Figure 2.17. Plots of direction field of Example 2.10.

Differentiating the first equation in (2.27) gives

r r^{\prime}=x x^{\prime}+y y^{\prime} . \nonumber

Inserting the expressions for x^{\prime} and y^{\prime} from system 2.5, we have

r r^{\prime}=x(a x+b y)+y(c x+d y) . \nonumber

In some cases this may be written entirely in terms of r ’s. Similarly, we have that

\theta^{\prime}=\dfrac{x y^{\prime}-y x^{\prime}}{r^{2}} \nonumber

which the reader can prove for homework.

In summary, when converting first order equations from rectangular to polar form, one needs the relations below.

Time Derivatives of Polar Variables

\begin{aligned} r^{\prime} &=\dfrac{x x^{\prime}+y y^{\prime}}{r} \\[4pt] \theta^{\prime} &=\dfrac{x y^{\prime}-y x^{\prime}}{r^{2}} \end{aligned} \nonumber

Example 2.11. Rewrite the following system in polar form and solve the resulting system.

\begin{aligned} &x^{\prime}=a x+b y \\[4pt] &y^{\prime}=-b x+a y \end{aligned} \nonumber

We first compute r^{\prime} and \theta^{\prime} :

\begin{gathered} r r^{\prime}=x x^{\prime}+y y^{\prime}=x(a x+b y)+y(-b x+a y)=a r^{2} \\[4pt] r^{2} \theta^{\prime}=x y^{\prime}-y x^{\prime}=x(-b x+a y)-y(a x+b y)=-b r^{2} . \end{gathered} \nonumber

This leads to the simpler system

\begin{aligned} &r^{\prime}=a r \\[4pt] &\theta^{\prime}=-b \end{aligned} \nonumber

This system is uncoupled. The second equation in this system indicates that, for b>0, we traverse the orbit at a constant rate in the clockwise direction. Solving these equations, we have that r(t)=r_{0} e^{a t}, \quad \theta(t)=\theta_{0}-b t. Eliminating t between these solutions, we finally find the polar equation of the orbits:

r=r_{0} e^{-a\left(\theta-\theta_{0}\right) / b} \nonumber

If you graph this for a \neq 0, you will get stable or unstable spirals.

Example 2.12. Consider the specific system

\begin{aligned} &x^{\prime}=-y+x \\[4pt] &y^{\prime}=x+y . \end{aligned} \nonumber

In order to convert this system into polar form, we compute

\begin{gathered} r r^{\prime}=x x^{\prime}+y y^{\prime}=x(-y+x)+y(x+y)=r^{2} \\[4pt] r^{2} \theta^{\prime}=x y^{\prime}-y x^{\prime}=x(x+y)-y(-y+x)=r^{2} \end{gathered} \nonumber

This leads to the simpler system

\begin{aligned} &r^{\prime}=r \\[4pt] &\theta^{\prime}=1 \end{aligned} \nonumber

Solving these equations yields

r(t)=r_{0} e^{t}, \quad \theta(t)=t+\theta_{0} . \nonumber
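Before eliminating t, it is worth checking this solution against a direct numerical integration of the Cartesian system. In the Python sketch below (using scipy; the initial point is an arbitrary choice), the numerically computed r(t)=\sqrt{x^{2}+y^{2}} matches r_{0} e^{t}.

import numpy as np
from scipy.integrate import solve_ivp

# Example 2.12: x' = -y + x, y' = x + y
def rhs(t, u):
    x, y = u
    return [-y + x, x + y]

r0 = np.hypot(0.5, 0.0)
sol = solve_ivp(rhs, (0.0, 2.0), [0.5, 0.0], t_eval=np.linspace(0.0, 2.0, 5),
                rtol=1e-9, atol=1e-12)
r_numeric = np.hypot(sol.y[0], sol.y[1])
print(np.round(r_numeric, 5))
print(np.round(r0 * np.exp(sol.t), 5))   # agrees with r(t) = r0 * e^t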

Eliminating t from this solution gives the orbits in the phase plane, r(\theta)=r_{0} e^{\theta-\theta_{0}}.

A more complicated example arises for a nonlinear system of differential equations. Consider the following example.

Example 2.13.

\begin{aligned} &x^{\prime}=-y+x\left(1-x^{2}-y^{2}\right) \\[4pt] &y^{\prime}=x+y\left(1-x^{2}-y^{2}\right) \end{aligned} \nonumber

Transforming to polar coordinates as before, one can show that

r^{\prime}=r\left(1-r^{2}\right), \quad \theta^{\prime}=1 \nonumber

This uncoupled system can be solved and such nonlinear systems will be studied in the next chapter.

Matrix Formulation

We have investigated several linear systems in the plane and in the next chapter we will use some of these ideas to investigate nonlinear systems. We need a deeper insight into the solutions of planar systems. So, in this section we will recast the first order linear systems into matrix form. This will lead to a better understanding of first order systems and allow for extensions to higher dimensions and the solution of nonhomogeneous equations later in this chapter.

We start with the usual homogeneous system in Equation (2.5). Let the unknowns be represented by the vector

\mathbf{x}(t)=\left(\begin{array}{c} x(t) \\[4pt] y(t) \end{array}\right) \nonumber

Then we have that

\mathbf{x}^{\prime}=\left(\begin{array}{l} x^{\prime} \\[4pt] y^{\prime} \end{array}\right)=\left(\begin{array}{c} a x+b y \\[4pt] c x+d y \end{array}\right)=\left(\begin{array}{ll} a & b \\[4pt] c & d \end{array}\right)\left(\begin{array}{l} x \\[4pt] y \end{array}\right) \equiv A \mathbf{x} \nonumber

Here we have introduced the coefficient matrix A. This is a first order vector differential equation,

\mathbf{x}^{\prime}=A \mathbf{x} \nonumber

Formally, we can write the solution as

\mathbf{x}(t)=e^{A t} \mathbf{x}_{0} \nonumber

We would like to investigate the solution of our system. Our investigations will lead to new techniques for solving linear systems using matrix methods.

Note: the exponential of a matrix is defined using the Maclaurin series expansion of the exponential function,

e^{x}=\sum_{k=0}^{\infty} \dfrac{x^{k}}{k !}=1+x+\dfrac{x^{2}}{2 !}+\dfrac{x^{3}}{3 !}+\cdots \nonumber

So, we define

e^{A}=\sum_{k=0}^{\infty} \dfrac{A^{k}}{k !}=I+A+\dfrac{A^{2}}{2 !}+\dfrac{A^{3}}{3 !}+\cdots \nonumber

In general, it is difficult to compute e^{A} unless A is diagonal.
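In practice, numerical libraries compute the matrix exponential reliably. A short Python sketch (using scipy's expm; the matrix and initial vector are arbitrary illustrative choices) checks that \mathbf{x}(t)=e^{A t} \mathbf{x}_{0} agrees with a direct numerical integration of \mathbf{x}^{\prime}=A \mathbf{x}.

import numpy as np
from scipy.linalg import expm
from scipy.integrate import solve_ivp

A = np.array([[1.0, 2.0],
              [3.0, 2.0]])          # an arbitrary 2x2 example
x0 = np.array([1.0, 0.0])
t = 0.7

x_expm = expm(A * t) @ x0           # x(t) = e^{At} x0
sol = solve_ivp(lambda s, u: A @ u, (0.0, t), x0, rtol=1e-10, atol=1e-12)
print(np.round(x_expm, 6), np.round(sol.y[:, -1], 6))   # the two agree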

We begin by recalling the solution to the specific problem (2.12). We obtained the solution to this system as

\begin{gathered} x(t)=c_{1} e^{t}+c_{2} e^{-4 t}, \\[4pt] y(t)=\dfrac{1}{3} c_{1} e^{t}-\dfrac{1}{2} c_{2} e^{-4 t} \end{gathered} \nonumber

This can be rewritten using matrix operations. Namely, we first write the solution in vector form.

\begin{aligned} \mathbf{x} &=\left(\begin{array}{c} x(t) \\[4pt] y(t) \end{array}\right) \\[4pt] &=\left(\begin{array}{c} c_{1} e^{t}+c_{2} e^{-4 t} \\[4pt] \dfrac{1}{3} c_{1} e^{t}-\dfrac{1}{2} c_{2} e^{-4 t} \end{array}\right) \\[4pt] &=\left(\begin{array}{c} c_{1} e^{t} \\[4pt] \dfrac{1}{3} c_{1} e^{t} \end{array}\right)+\left(\begin{array}{c} c_{2} e^{-4 t} \\[4pt] -\dfrac{1}{2} c_{2} e^{-4 t} \end{array}\right) \\[4pt] &=c_{1}\left(\begin{array}{c} 1 \\[4pt] \dfrac{1}{3} \end{array}\right) e^{t}+c_{2}\left(\begin{array}{c} 1 \\[4pt] -\dfrac{1}{2} \end{array}\right) e^{-4 t} \end{aligned} \nonumber

We see that our solution is in the form of a linear combination of vectors of the form

\mathbf{x}=\mathbf{v} e^{\lambda t} \nonumber

with \mathbf{v} a constant vector and \lambda a constant number. This is similar to how we began to find solutions to second order constant coefficient equations. So, for the general problem (2.3) we insert this guess. Thus,

\begin{aligned} \mathbf{x}^{\prime} &=A \mathbf{x} \Rightarrow \\[4pt] \lambda \mathbf{v} e^{\lambda t} &=A \mathbf{v} e^{\lambda t} . \end{aligned} \nonumber

For this to be true for all t, we have that

A \mathbf{v}=\lambda \mathbf{v} . \nonumber

This is an eigenvalue problem. A is a 2 \times 2 matrix for our problem, but could easily be generalized to a system of n first order differential equations. We will confine our remarks for now to planar systems. However, we need to recall how to solve eigenvalue problems and then see how solutions of eigenvalue problems can be used to obtain solutions to our systems of differential equations.


2.4 Eigenvalue Problems

We seek nontrivial solutions to the eigenvalue problem

A \mathbf{v}=\lambda \mathbf{v} . \nonumber

We note that \mathbf{v}=\mathbf{0} is an obvious solution. Furthermore, it does not lead to anything useful. So, it is called a trivial solution. Typically, we are given the matrix A and have to determine the eigenvalues, \lambda, and the associated eigenvectors, \mathbf{v}, satisfying the above eigenvalue problem. Later in the course we will explore other types of eigenvalue problems.

For now we begin to solve the eigenvalue problem for \mathbf{v}=\left(\begin{array}{l}v_{1} \\[4pt] v_{2}\end{array}\right). Inserting this into Equation (2.39), we obtain the homogeneous algebraic system

\begin{aligned} &(a-\lambda) v_{1}+b v_{2}=0 \\[4pt] &c v_{1}+(d-\lambda) v_{2}=0 . \end{aligned} \nonumber

The solution of such a system would be unique if the determinant of the system is not zero. However, this would give the trivial solution v_{1}=0, v_{2}=0. To get a nontrivial solution, we need to force the determinant to be zero. This yields the eigenvalue equation

0=\left|\begin{array}{cc} a-\lambda & b \\[4pt] c & d-\lambda \end{array}\right|=(a-\lambda)(d-\lambda)-b c . \nonumber

This is a quadratic equation for the eigenvalues that would lead to nontrivial solutions. If we expand the right side of the equation, we find that

\lambda^{2}-(a+d) \lambda+a d-b c=0 . \nonumber

This is the same equation as the characteristic equation (2.8) for the general constant coefficient differential equation considered in the first chapter. Thus, the eigenvalues correspond to the solutions of the characteristic polynomial for the system.

Once we find the eigenvalues, then there are possibly an infinite number of solutions to the algebraic system. We will see this in the examples.

So, the process is to

a) Write the coefficient matrix;

b) Find the eigenvalues from the equation \operatorname{det}(A-\lambda I)=0; and,

c) Find the eigenvectors by solving the linear system (A-\lambda I) \mathbf{v}=0 for each \lambda.
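These steps are exactly what numerical linear algebra routines carry out. The Python sketch below (using numpy) applies them to the coefficient matrix of Example 2.14, which is worked by hand in the next section; numpy returns unit-length eigenvectors, which are scalar multiples of the ones found by hand.

import numpy as np

A = np.array([[4.0, 2.0],
              [3.0, 3.0]])            # the matrix of Example 2.14 below

evals, evecs = np.linalg.eig(A)       # steps (b) and (c) together
print(np.round(evals, 6))             # the eigenvalues 1 and 6 (in some order)
for lam, v in zip(evals, evecs.T):
    print(f"lambda = {lam:.1f}, v = {np.round(v, 4)}")
    print("  check A v - lambda v =", np.round(A @ v - lam * v, 10))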

Solving Constant Coefficient Systems in 2 \mathrm{D}

Before proceeding to examples, we first indicate the types of solutions that could result from the solution of a homogeneous, constant coefficient system of first order differential equations. We begin with the linear system of differential equations in matrix form.

\dfrac{d \mathbf{x}}{d t}=\left(\begin{array}{ll} a & b \\[4pt] c & d \end{array}\right) \mathbf{x}=A \mathbf{x} . \nonumber

The type of behavior depends upon the eigenvalues of matrix A. The procedure is to determine the eigenvalues and eigenvectors and use them to construct the general solution.

If we have an initial condition, \mathbf{x}\left(t_{0}\right)=\mathbf{x}_{0}, we can determine the two arbitrary constants in the general solution in order to obtain the particular solution. Thus, if \mathbf{x}_{1}(t) and \mathbf{x}_{2}(t) are two linearly independent solutions { }^{2}, then the general solution is given as

\mathbf{x}(t)=c_{1} \mathbf{x}_{1}(t)+c_{2} \mathbf{x}_{2}(t) \nonumber

Then, setting t=0, we get two linear equations for c_{1} and c_{2} :

c_{1} \mathbf{x}_{1}(0)+c_{2} \mathbf{x}_{2}(0)=\mathbf{x}_{0} \nonumber

The major work is in finding the linearly independent solutions. This depends upon the different types of eigenvalues that one obtains from solving the eigenvalue equation, \operatorname{det}(A-\lambda I)=0. The nature of these roots indicate the form of the general solution. On the next page we summarize the classification of solutions in terms of the eigenvalues of the coefficient matrix. We first make some general remarks about the plausibility of these solutions and then provide examples in the following section to clarify the matrix methods for our two dimensional systems.

The construction of the general solution in Case I is straightforward. However, the other two cases need a little explanation.

{ }^{2} Recall that linear independence means that c_{1} \mathbf{x}_{1}(t)+c_{2} \mathbf{x}_{2}(t)=\mathbf{0} for all t only if c_{1}=c_{2}=0. The reader should derive the condition on the \mathbf{x}_{i} for linear independence.

Classification of the Solutions for Two
Linear First Order Differential Equations
  1. Case I: Two real, distinct roots.

Solve the eigenvalue problem A \mathbf{v}=\lambda \mathbf{v} for each eigenvalue obtaining two eigenvectors \mathbf{v}_{1}, \mathbf{v}_{2}. Then write the general solution as a linear combination \mathbf{x}(t)=c_{1} e^{\lambda_{1} t} \mathbf{v}_{1}+c_{2} e^{\lambda_{2} t} \mathbf{v}_{2}

  2. Case II: One Repeated Root.

Solve the eigenvalue problem A \mathbf{v}=\lambda \mathbf{v} for one eigenvalue \lambda, obtaining the first eigenvector \mathbf{v}_{1}. One then needs a second linearly independent solution. This is obtained by solving the nonhomogeneous problem A \mathbf{v}_{2}-\lambda \mathbf{v}_{2}=\mathbf{v}_{1} for \mathbf{v}_{2}.

The general solution is then given by \mathbf{x}(t)=c_{1} e^{\lambda t} \mathbf{v}_{1}+c_{2} e^{\lambda t}\left(\mathbf{v}_{2}+t \mathbf{v}_{1}\right).

  3. Case III: Two complex conjugate roots.

Solve the eigenvalue problem A \mathbf{x}=\lambda \mathbf{x} for one eigenvalue, \lambda=\alpha+i \beta, obtaining one eigenvector \mathbf{v}. Note that this eigenvector may have complex entries. Thus, one can write the vector \mathbf{y}(t)=e^{\lambda t} \mathbf{v}=e^{\alpha t}(\cos \beta t+ i \sin \beta t) \mathbf{v}. Now, construct two linearly independent solutions to the problem using the real and imaginary parts of \mathbf{y}(t): \mathbf{y}_{1}(t)=\operatorname{Re}(\mathbf{y}(t)) and \mathbf{y}_{2}(t)=\operatorname{Im}(\mathbf{y}(t)). Then the general solution can be written as \mathbf{x}(t)=c_{1} \mathbf{y}_{1}(t)+c_{2} \mathbf{y}_{2}(t)

Let’s consider Case III. Note that since the original system of equations does not have any i ’s, then we would expect real solutions. So, we look at the real and imaginary parts of the complex solution. We have that the complex solution satisfies the equation

\dfrac{d}{d t}[\operatorname{Re}(\mathbf{y}(t))+i \operatorname{Im}(\mathbf{y}(t))]=A[\operatorname{Re}(\mathbf{y}(t))+i \operatorname{Im}(\mathbf{y}(t))] \nonumber

Differentiating the sum and splitting the real and imaginary parts of the equation, gives

\dfrac{d}{d t} \operatorname{Re}(\mathbf{y}(t))+i \dfrac{d}{d t} \operatorname{Im}(\mathbf{y}(t))=A[\operatorname{Re}(\mathbf{y}(t))]+i A[\operatorname{Im}(\mathbf{y}(t))] . \nonumber

Setting the real and imaginary parts equal, we have

\dfrac{d}{d t} \operatorname{Re}(\mathbf{y}(t))=A[\operatorname{Re}(\mathbf{y}(t))] \nonumber

and

\dfrac{d}{d t} \operatorname{Im}(\mathbf{y}(t))=A[\operatorname{Im}(\mathbf{y}(t))] \nonumber

Therefore, the real and imaginary parts each are linearly independent solutions of the system and the general solution can be written as a linear combination of these expressions. We now turn to Case II. Writing the system of first order equations as a second order equation for x(t) with the sole solution of the characteristic equation, \lambda=\dfrac{1}{2}(a+d), we have that the general solution takes the form

x(t)=\left(c_{1}+c_{2} t\right) e^{\lambda t} . \nonumber

This suggests that the second linearly independent solution involves a term of the form v t e^{\lambda t}. It turns out that the guess that works is

\mathbf{x}=t e^{\lambda t} \mathbf{v}_{1}+e^{\lambda t} \mathbf{v}_{2} \nonumber

Inserting this guess into the system \mathbf{x}^{\prime}=A \mathbf{x} yields

\begin{aligned} \left(t e^{\lambda t} \mathbf{v}_{1}+e^{\lambda t} \mathbf{v}_{2}\right)^{\prime} &=A\left[t e^{\lambda t} \mathbf{v}_{1}+e^{\lambda t} \mathbf{v}_{2}\right] . \\[4pt] e^{\lambda t} \mathbf{v}_{1}+\lambda t e^{\lambda t} \mathbf{v}_{1}+\lambda e^{\lambda t} \mathbf{v}_{2} &=\lambda t e^{\lambda t} \mathbf{v}_{1}+e^{\lambda t} A \mathbf{v}_{2} . \\[4pt] e^{\lambda t}\left(\mathbf{v}_{1}+\lambda \mathbf{v}_{2}\right) &=e^{\lambda t} A \mathbf{v}_{2} . \end{aligned} \nonumber

Noting this is true for all t, we find that

\mathbf{v}_{1}+\lambda \mathbf{v}_{2}=A \mathbf{v}_{2} . \nonumber

Therefore,

(A-\lambda I) \mathbf{v}_{2}=\mathbf{v}_{1} \text {. } \nonumber

We know everything except for \mathbf{v}_{2}. So, we just solve for it and obtain the second linearly independent solution.
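For a concrete instance, here is a short Python sketch (a hypothetical numerical check, using the repeated-eigenvalue matrix that appears in Example 2.16 below). Since A-\lambda I is singular, np.linalg.lstsq is used to pick one solution; it differs from the hand-computed \mathbf{v}_{2} only by a multiple of \mathbf{v}_{1}, which can be absorbed into the arbitrary constants.

import numpy as np

A = np.array([[7.0, -1.0],
              [9.0,  1.0]])           # the matrix of Example 2.16 below
lam = 4.0
v1 = np.array([1.0, 3.0])             # eigenvector for lambda = 4

# (A - lam I) is singular, so use least squares to pick one solution v2.
v2, *_ = np.linalg.lstsq(A - lam * np.eye(2), v1, rcond=None)
print(np.round(v2, 4))                                  # one valid choice of v2
print(np.round((A - lam * np.eye(2)) @ v2 - v1, 10))    # residual is zero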

Examples of the Matrix Method

Here we will give some examples for typical systems for the three cases mentioned in the last section.

Example 2.14. A=\left(\begin{array}{ll}4 & 2 \\[4pt] 3 & 3\end{array}\right).

Eigenvalues: We first determine the eigenvalues.

0=\left|\begin{array}{cc} 4-\lambda & 2 \\[4pt] 3 & 3-\lambda \end{array}\right| \nonumber

Therefore,

\begin{aligned} &0=(4-\lambda)(3-\lambda)-6 \\[4pt] &0=\lambda^{2}-7 \lambda+6 \\[4pt] &0=(\lambda-1)(\lambda-6) \end{aligned} \nonumber

The eigenvalues are then \lambda=1,6. This is an example of Case I.

Eigenvectors: Next we determine the eigenvectors associated with each of these eigenvalues. We have to solve the system A \mathbf{v}=\lambda \mathbf{v} in each case. Case \lambda=1

\begin{gathered} \left(\begin{array}{ll} 4 & 2 \\[4pt] 3 & 3 \end{array}\right)\left(\begin{array}{l} v_{1} \\[4pt] v_{2} \end{array}\right)=\left(\begin{array}{l} v_{1} \\[4pt] v_{2} \end{array}\right) \\[4pt] \left(\begin{array}{ll} 3 & 2 \\[4pt] 3 & 2 \end{array}\right)\left(\begin{array}{l} v_{1} \\[4pt] v_{2} \end{array}\right)=\left(\begin{array}{l} 0 \\[4pt] 0 \end{array}\right) \end{gathered} \nonumber

This gives 3 v_{1}+2 v_{2}=0. One possible solution yields an eigenvector of

\left(\begin{array}{l} v_{1} \\[4pt] v_{2} \end{array}\right)=\left(\begin{array}{c} 2 \\[4pt] -3 \end{array}\right) \nonumber

Case \lambda=6

\begin{aligned} &\left(\begin{array}{cc} 4 & 2 \\[4pt] 3 & 3 \end{array}\right)\left(\begin{array}{l} v_{1} \\[4pt] v_{2} \end{array}\right)=6\left(\begin{array}{l} v_{1} \\[4pt] v_{2} \end{array}\right) \\[4pt] &\left(\begin{array}{cc} -2 & 2 \\[4pt] 3 & -3 \end{array}\right)\left(\begin{array}{l} v_{1} \\[4pt] v_{2} \end{array}\right)=\left(\begin{array}{l} 0 \\[4pt] 0 \end{array}\right) \end{aligned} \nonumber

For this case we need to solve -2 v_{1}+2 v_{2}=0. This yields

\left(\begin{array}{l} v_{1} \\[4pt] v_{2} \end{array}\right)=\left(\begin{array}{l} 1 \\[4pt] 1 \end{array}\right) \nonumber

General Solution: We can now construct the general solution.

\begin{aligned} \mathbf{x}(t) &=c_{1} e^{\lambda_{1} t} \mathbf{v}_{1}+c_{2} e^{\lambda_{2} t} \mathbf{v}_{2} \\[4pt] &=c_{1} e^{t}\left(\begin{array}{c} 2 \\[4pt] -3 \end{array}\right)+c_{2} e^{6 t}\left(\begin{array}{l} 1 \\[4pt] 1 \end{array}\right) \\[4pt] &=\left(\begin{array}{c} 2 c_{1} e^{t}+c_{2} e^{6 t} \\[4pt] -3 c_{1} e^{t}+c_{2} e^{6 t} \end{array}\right) . \end{aligned} \nonumber

Example 2.15. A=\left(\begin{array}{ll}3 & -5 \\[4pt] 1 & -1\end{array}\right).

Eigenvalues: Again, one solves the eigenvalue equation.

0=\left|\begin{array}{cc} 3-\lambda & -5 \\[4pt] 1 & -1-\lambda \end{array}\right| \nonumber

Therefore,

\begin{aligned} &0=(3-\lambda)(-1-\lambda)+5 \\[4pt] &0=\lambda^{2}-2 \lambda+2 \\[4pt] &\lambda=\dfrac{-(-2) \pm \sqrt{4-4(1)(2)}}{2}=1 \pm i \end{aligned} \nonumber

The eigenvalues are then \lambda=1+i, 1-i. This is an example of Case III.

Eigenvectors: In order to find the general solution, we need only find the eigenvector associated with 1+i.

\begin{gathered} \left(\begin{array}{cc} 3 & -5 \\[4pt] 1 & -1 \end{array}\right)\left(\begin{array}{l} v_{1} \\[4pt] v_{2} \end{array}\right)=(1+i)\left(\begin{array}{l} v_{1} \\[4pt] v_{2} \end{array}\right) \\[4pt] \left(\begin{array}{cc} 2-i & -5 \\[4pt] 1 & -2-i \end{array}\right)\left(\begin{array}{l} v_{1} \\[4pt] v_{2} \end{array}\right)=\left(\begin{array}{l} 0 \\[4pt] 0 \end{array}\right) \end{gathered} \nonumber

We need to solve (2-i) v_{1}-5 v_{2}=0. Thus,

\left(\begin{array}{l} v_{1} \\[4pt] v_{2} \end{array}\right)=\left(\begin{array}{c} 2+i \\[4pt] 1 \end{array}\right) \nonumber

Complex Solution: In order to get the two real linearly independent solutions, we need to compute the real and imaginary parts of \mathbf{v} e^{\lambda t}.

\begin{aligned} e^{\lambda t}\left(\begin{array}{c} 2+i \\[4pt] 1 \end{array}\right) &=e^{(1+i) t}\left(\begin{array}{c} 2+i \\[4pt] 1 \end{array}\right) \\[4pt] &=e^{t}(\cos t+i \sin t)\left(\begin{array}{c} 2+i \\[4pt] 1 \end{array}\right) \\[4pt] &=e^{t}\left(\begin{array}{c} (2+i)(\cos t+i \sin t) \\[4pt] \cos t+i \sin t \end{array}\right) \\[4pt] &=e^{t}\left(\begin{array}{c} (2 \cos t-\sin t)+i(\cos t+2 \sin t) \\[4pt] \cos t+i \sin t \end{array}\right) \\[4pt] &=e^{t}\left(\begin{array}{c} 2 \cos t-\sin t \\[4pt] \cos t \end{array}\right)+i e^{t}\left(\begin{array}{c} \cos t+2 \sin t \\[4pt] \sin t \end{array}\right) \end{aligned} \nonumber

General Solution: Now we can construct the general solution.

\begin{aligned} \mathbf{x}(t) &=c_{1} e^{t}\left(\begin{array}{l} 2 \cos t-\sin t \\[4pt] \cos t \end{array}\right)+c_{2} e^{t}\left(\begin{array}{c} \cos t+2 \sin t \\[4pt] \sin t \end{array}\right) \\[4pt] &=e^{t}\left(\begin{array}{c} c_{1}(2 \cos t-\sin t)+c_{2}(\cos t+2 \sin t) \\[4pt] c_{1} \cos t+c_{2} \sin t \end{array}\right) \end{aligned} \nonumber

Note: This can be rewritten as

\mathbf{x}(t)=e^{t} \cos t\left(\begin{array}{c} 2 c_{1}+c_{2} \\[4pt] c_{1} \end{array}\right)+e^{t} \sin t\left(\begin{array}{c} 2 c_{2}-c_{1} \\[4pt] c_{2} \end{array}\right) \nonumber

Example 2.16. A=\left(\begin{array}{cc}7 & -1 \\[4pt] 9 & 1\end{array}\right).

Eigenvalues:

0=\left|\begin{array}{cc} 7-\lambda & -1 \\[4pt] 9 & 1-\lambda \end{array}\right| \nonumber

Therefore,

\begin{aligned} &0=(7-\lambda)(1-\lambda)+9 \\[4pt] &0=\lambda^{2}-8 \lambda+16 \\[4pt] &0=(\lambda-4)^{2} \end{aligned} \nonumber

There is only one real eigenvalue, \lambda=4. This is an example of Case II.

Eigenvectors: In this case we first solve for \mathbf{v}_{1} and then get the second linearly independent vector.

\begin{aligned} &\left(\begin{array}{cc} 7 & -1 \\[4pt] 9 & 1 \end{array}\right)\left(\begin{array}{l} v_{1} \\[4pt] v_{2} \end{array}\right)=4\left(\begin{array}{l} v_{1} \\[4pt] v_{2} \end{array}\right) \\[4pt] &\left(\begin{array}{ll} 3 & -1 \\[4pt] 9 & -3 \end{array}\right)\left(\begin{array}{l} v_{1} \\[4pt] v_{2} \end{array}\right)=\left(\begin{array}{l} 0 \\[4pt] 0 \end{array}\right) \end{aligned} \nonumber

Therefore, we have

3 v_{1}-v_{2}=0, \quad \Rightarrow \quad\left(\begin{array}{l} v_{1} \\[4pt] v_{2} \end{array}\right)=\left(\begin{array}{l} 1 \\[4pt] 3 \end{array}\right) \text {. } \nonumber

Second Linearly Independent Solution:

Now we need to solve A \mathbf{v}_{2}-\lambda \mathbf{v}_{2}=\mathbf{v}_{1}. That is,

\begin{aligned} & \left(\begin{array}{cc}7 & -1 \\[4pt]9 & 1\end{array}\right)\left(\begin{array}{l}u_{1} \\[4pt]u_{2}\end{array}\right)-4\left(\begin{array}{l}u_{1} \\[4pt]u_{2}\end{array}\right)=\left(\begin{array}{l}1 \\[4pt]3\end{array}\right) \\[4pt] & \left(\begin{array}{ll}3 & -1 \\[4pt]9 & -3\end{array}\right)\left(\begin{array}{l}u_{1} \\[4pt]u_{2}\end{array}\right)=\left(\begin{array}{l}1 \\[4pt]3\end{array}\right) . \end{aligned} \nonumber

Expanding the matrix product, we obtain the system of equations

\begin{array}{r} 3 u_{1}-u_{2}=1 \\[4pt] 9 u_{1}-3 u_{2}=3 . \end{array} \nonumber

The solution of this system is \left(\begin{array}{l}u_{1} \\[4pt] u_{2}\end{array}\right)=\left(\begin{array}{l}1 \\[4pt] 2\end{array}\right).

General Solution: We construct the general solution as

\begin{aligned} \mathbf{y}(t) &=c_{1} e^{\lambda t} \mathbf{v}_{1}+c_{2} e^{\lambda t}\left(\mathbf{v}_{2}+t \mathbf{v}_{1}\right) \\[4pt] &=c_{1} e^{4 t}\left(\begin{array}{l} 1 \\[4pt] 3 \end{array}\right)+c_{2} e^{4 t}\left[\left(\begin{array}{l} 1 \\[4pt] 2 \end{array}\right)+t\left(\begin{array}{l} 1 \\[4pt] 3 \end{array}\right)\right] \\[4pt] &=e^{4 t}\left(\begin{array}{c} c_{1}+c_{2}(1+t) \\[4pt] 3 c_{1}+c_{2}(2+3 t) \end{array}\right) \end{aligned} \nonumber


Planar Systems - Summary

The reader should have noted by now that there is a connection between the behavior of the solutions obtained in Section 2.2 and the eigenvalues found from the coefficient matrices in the previous examples. Here we summarize some of these cases.

image

Table 2.1. List of typical behaviors in planar systems.

The connection, as we have seen, is that the characteristic equation for the associated second order differential equation is the same as the eigenvalue equation of the coefficient matrix for the linear system. However, one should be a little careful in cases in which the coefficient matrix is not diagonalizable. In Table 2.2 are three examples of systems with repeated roots. The reader should study these systems, noting the commonalities and differences in the systems and their solutions. In these cases one has unstable nodes, though they are degenerate in that there is only one accessible eigenvector.

Theory of Homogeneous Constant Coefficient Systems

There is a general theory for solving homogeneous, constant coefficient systems of first order differential equations. We begin by once again recalling the specific problem (2.12). We obtained the solution to this system as

\begin{gathered} x(t)=c_{1} e^{t}+c_{2} e^{-4 t}, \\[4pt] y(t)=\dfrac{1}{3} c_{1} e^{t}-\dfrac{1}{2} c_{2} e^{-4 t} \end{gathered} \nonumber

image

Table 2.2. Three examples of systems with a repeated root of \lambda=2.

This time we rewrite the solution as

\begin{aligned} \mathbf{x} &=\left(\begin{array}{c} c_{1} e^{t}+c_{2} e^{-4 t} \\[4pt] \dfrac{1}{3} c_{1} e^{t}-\dfrac{1}{2} c_{2} e^{-4 t} \end{array}\right) \\[4pt] &=\left(\begin{array}{cc} e^{t} & e^{-4 t} \\[4pt] \dfrac{1}{3} e^{t} & -\dfrac{1}{2} e^{-4 t} \end{array}\right)\left(\begin{array}{c} c_{1} \\[4pt] c_{2} \end{array}\right) \\[4pt] & \equiv \Phi(t) \mathbf{C} \end{aligned} \nonumber

Thus, we can write the general solution as a 2 \times 2 matrix \Phi times an arbitrary constant vector. The matrix \Phi consists of two columns that are linearly independent solutions of the original system. This matrix is an example of what we will define as the Fundamental Matrix of solutions of the system. So, determining the Fundamental Matrix will allow us to find the general solution of the system upon multiplication by a constant matrix. In fact, we will see that it will also lead to a simple representation of the solution of the initial value problem for our system. We will outline the general theory.

Consider the homogeneous, constant coefficient system of first order differential equations

\begin{aligned} \dfrac{d x_{1}}{d t} &=a_{11} x_{1}+a_{12} x_{2}+\ldots+a_{1 n} x_{n} \\[4pt] \dfrac{d x_{2}}{d t} &=a_{21} x_{1}+a_{22} x_{2}+\ldots+a_{2 n} x_{n} \\[4pt] & \vdots \\[4pt] \dfrac{d x_{n}}{d t} &=a_{n 1} x_{1}+a_{n 2} x_{2}+\ldots+a_{n n} x_{n} \end{aligned} \nonumber

As we have seen, this can be written in the matrix form \mathbf{x}^{\prime}=A \mathbf{x}, where

\mathbf{x}=\left(\begin{array}{c} x_{1} \\[4pt] x_{2} \\[4pt] \vdots \\[4pt] x_{n} \end{array}\right) \nonumber

and

A=\left(\begin{array}{cccc} a_{11} & a_{12} & \cdots & a_{1 n} \\[4pt] a_{21} & a_{22} & \cdots & a_{2 n} \\[4pt] \vdots & \vdots & \ddots & \vdots \\[4pt] a_{n 1} & a_{n 2} & \cdots & a_{n n} \end{array}\right) \nonumber

Now, consider m vector solutions of this system: \phi_{1}(t), \phi_{2}(t), \ldots \phi_{m}(t). These solutions are said to be linearly independent on some domain if

c_{1} \phi_{1}(t)+c_{2} \phi_{2}(t)+\ldots+c_{m} \phi_{m}(t)=0 \nonumber

for all t in the domain implies that c_{1}=c_{2}=\ldots=c_{m}=0.

Let \phi_{1}(t), \phi_{2}(t), \ldots, \phi_{n}(t) be a set of n linearly independent solutions of our system, called a fundamental set of solutions. We construct a matrix from these solutions, using them as the columns of that matrix. We define this matrix to be the fundamental matrix solution. This matrix takes the form

\Phi=\left(\begin{array}{lll} \phi_{1} & \ldots & \phi_{n} \end{array}\right)=\left(\begin{array}{cccc} \phi_{11} & \phi_{12} & \cdots & \phi_{1 n} \\[4pt] \phi_{21} & \phi_{22} & \cdots & \phi_{2 n} \\[4pt] \vdots & \vdots & \ddots & \vdots \\[4pt] \phi_{n 1} & \phi_{n 2} & \cdots & \phi_{n n} \end{array}\right) \nonumber

What do we mean by a "matrix" solution? We have assumed that each \phi_{k} is a solution of our system. Therefore, we have that \phi_{k}^{\prime}=A \phi_{k}, for k=1, \ldots, n. We say that \Phi is a matrix solution because we can show that \Phi also satisfies the matrix formulation of the system of differential equations. We can show this using the properties of matrices.

\begin{aligned} \dfrac{d}{d t} \Phi &=\left(\phi_{1}^{\prime} \ldots \phi_{n}^{\prime}\right) \\[4pt] &=\left(A \phi_{1} \ldots A \phi_{n}\right) \\[4pt] &=A\left(\phi_{1} \ldots \phi_{n}\right) \\[4pt] &=A \Phi \end{aligned} \nonumber

Given a set of vector solutions of the system, when are they linearly independent? We consider a matrix solution \Omega(t) of the system in which we have n vector solutions. Then, we define the Wronskian of \Omega(t) to be

W=\operatorname{det} \Omega(t) \nonumber

If W(t) \neq 0, then \Omega(t) is a fundamental matrix solution. Before continuing, we list the fundamental matrix solutions for the set of examples in the last section. (Refer to the solutions from those examples.) Furthermore, note that the fundamental matrix solutions are not unique as one can multiply any column by a nonzero constant and still have a fundamental matrix solution.

Example 2.14 A=\left(\begin{array}{ll}4 & 2 \\[4pt] 3 & 3\end{array}\right).

\Phi(t)=\left(\begin{array}{cc} 2 e^{t} & e^{6 t} \\[4pt] -3 e^{t} & e^{6 t} \end{array}\right) \nonumber

We should note in this case that the Wronskian is found as

\begin{aligned} W &=\operatorname{det} \Phi(t) \\[4pt] &=\left|\begin{array}{cc} 2 e^{t} & e^{6 t} \\[4pt] -3 e^{t} & e^{6 t} \end{array}\right| \\[4pt] &=5 e^{7 t} \neq 0 . \end{aligned} \nonumber
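A quick numerical check (a Python sketch; the sample time t=0.3 is an arbitrary choice) confirms this Wronskian.

import numpy as np

t = 0.3
Phi = np.array([[2 * np.exp(t),   np.exp(6 * t)],
                [-3 * np.exp(t),  np.exp(6 * t)]])
print(np.linalg.det(Phi), 5 * np.exp(7 * t))   # both equal 5 e^{7t}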

Example 2.15 A=\left(\begin{array}{ll}3 & -5 \\[4pt] 1 & -1\end{array}\right).

\Phi(t)=\left(\begin{array}{cc} e^{t}(2 \cos t-\sin t) & e^{t}(\cos t+2 \sin t) \\[4pt] e^{t} \cos t & e^{t} \sin t \end{array}\right) \nonumber

Example 2.16 A=\left(\begin{array}{cc}7 & -1 \\[4pt] 9 & 1\end{array}\right).

\Phi(t)=\left(\begin{array}{cc} e^{4 t} & e^{4 t}(1+t) \\[4pt] 3 e^{4 t} & e^{4 t}(2+3 t) \end{array}\right) \nonumber

So far we have only determined the general solution. This is done by the following steps:

Procedure for Determining the General Solution

  1. Solve the eigenvalue problem (A-\lambda I) \mathbf{v}=0.
  2. Construct vector solutions from \mathbf{v} e^{\lambda t}. The method depends on whether one has real or complex conjugate eigenvalues.
  3. Form the fundamental solution matrix \Phi(t) from the vector solutions.
  4. The general solution is given by \mathbf{x}(t)=\Phi(t) \mathbf{C} for \mathbf{C} an arbitrary constant vector.

We are now ready to solve the initial value problem:

\mathbf{x}^{\prime}=A \mathbf{x}, \quad \mathbf{x}\left(t_{0}\right)=\mathbf{x}_{0} . \nonumber

Starting with the general solution, we have that

\mathbf{x}_{0}=\mathbf{x}\left(t_{0}\right)=\Phi\left(t_{0}\right) \mathbf{C} . \nonumber

As usual, we need to solve for the c_{k} ’s. Using matrix methods, this is now easy. Since the Wronskian is not zero, then we can invert \Phi at any value of t. So, we have

\mathbf{C}=\Phi^{-1}\left(t_{0}\right) \mathbf{x}_{0} . \nonumber

Putting \mathbf{C} back into the general solution, we obtain the solution to the initial value problem:

\mathbf{x}(t)=\Phi(t) \Phi^{-1}\left(t_{0}\right) \mathbf{x}_{0} \nonumber

You can easily verify that this is a solution of the system and satisfies the initial condition at t=t_{0}.

The matrix combination \Phi(t) \Phi^{-1}\left(t_{0}\right) is useful. So, we will define the resulting product to be the principal matrix solution, denoting it by

\Psi(t)=\Phi(t) \Phi^{-1}\left(t_{0}\right) . \nonumber

Thus, the solution of the initial value problem is \mathbf{x}(t)=\Psi(t) \mathbf{x}_{0}. Furthermore, we note that \Psi(t) is a solution to the matrix initial value problem

\Psi^{\prime}=A \Psi, \quad \Psi\left(t_{0}\right)=I, \nonumber

where I is the n \times n identity matrix.

Matrix Solution of the Homogeneous Problem
In summary, the matrix solution of

\dfrac{d \mathbf{x}}{d t}=A \mathbf{x}, \quad \mathbf{x}\left(t_{0}\right)=\mathbf{x}_{0}, \nonumber

is given by

\mathbf{x}(t)=\Psi(t) \mathbf{x}_{0}=\Phi(t) \Phi^{-1}\left(t_{0}\right) \mathbf{x}_{0} . \nonumber

Example 2.17. Let’s consider the matrix initial value problem

\begin{aligned} &x^{\prime}=5 x+3 y \\[4pt] &y^{\prime}=-6 x-4 y \end{aligned} \nonumber

satisfying x(0)=1, y(0)=2. Find the solution of this problem.

We first note that the coefficient matrix is

A=\left(\begin{array}{cc} 5 & 3 \\[4pt] -6 & -4 \end{array}\right) \nonumber

The eigenvalue equation is easily found from

\begin{aligned} 0 &=-(5-\lambda)(4+\lambda)+18 \\[4pt] &=\lambda^{2}-\lambda-2 \\[4pt] &=(\lambda-2)(\lambda+1) \end{aligned} \nonumber

So, the eigenvalues are \lambda=-1,2. The corresponding eigenvectors are found to be

\mathbf{v}_{1}=\left(\begin{array}{c} 1 \\[4pt] -2 \end{array}\right), \quad \mathbf{v}_{2}=\left(\begin{array}{c} 1 \\[4pt] -1 \end{array}\right) \nonumber

Now we construct the fundamental matrix solution. The columns are obtained using the eigenvectors and the exponentials, e^{\lambda t} :

\phi_{1}(t)=\left(\begin{array}{c} 1 \\[4pt] -2 \end{array}\right) e^{-t}, \quad \phi_{2}(t)=\left(\begin{array}{c} 1 \\[4pt] -1 \end{array}\right) e^{2 t} \nonumber

So, the fundamental matrix solution is

\Phi(t)=\left(\begin{array}{cc} e^{-t} & e^{2 t} \\[4pt] -2 e^{-t} & -e^{2 t} \end{array}\right) \nonumber

The general solution to our problem is then

\mathbf{x}(t)=\left(\begin{array}{cc} e^{-t} & e^{2 t} \\[4pt] -2 e^{-t} & -e^{2 t} \end{array}\right) \mathbf{C} \nonumber

for \mathbf{C} an arbitrary constant vector.

In order to find the particular solution of the initial value problem, we need the principal matrix solution. We first evaluate \Phi(0), then we invert it:

\Phi(0)=\left(\begin{array}{cc} 1 & 1 \\[4pt] -2 & -1 \end{array}\right) \quad \Rightarrow \quad \Phi^{-1}(0)=\left(\begin{array}{cc} -1 & -1 \\[4pt] 2 & 1 \end{array}\right) \nonumber

The particular solution is then

\begin{aligned} \mathbf{x}(t) &=\left(\begin{array}{cc}e^{-t} & e^{2 t} \\[4pt]-2 e^{-t} & -e^{2 t}\end{array}\right)\left(\begin{array}{cc}-1 & -1 \\[4pt]2 & 1\end{array}\right)\left(\begin{array}{l}1 \\[4pt]2\end{array}\right) \\[4pt] &=\left(\begin{array}{cc}e^{-t} & e^{2 t} \\[4pt]-2 e^{-t} & -e^{2 t}\end{array}\right)\left(\begin{array}{c}-3 \\[4pt]4\end{array}\right) \\[4pt] &=\left(\begin{array}{c}-3 e^{-t}+4 e^{2 t} \\[4pt]6 e^{-t}-4 e^{2 t}\end{array}\right) . \end{aligned} \nonumber

Thus, x(t)=-3 e^{-t}+4 e^{2 t} and y(t)=6 e^{-t}-4 e^{2 t}.
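Since A is constant and t_{0}=0 here, the principal matrix solution is the matrix exponential e^{A t}, so \mathbf{x}(t)=e^{A t} \mathbf{x}_{0}. A short Python check (using scipy's expm; t=1 is an arbitrary sample time):

import numpy as np
from scipy.linalg import expm

A = np.array([[5.0, 3.0],
              [-6.0, -4.0]])
x0 = np.array([1.0, 2.0])
t = 1.0

print(np.round(expm(A * t) @ x0, 6))
print(np.round([-3 * np.exp(-t) + 4 * np.exp(2 * t),
                 6 * np.exp(-t) - 4 * np.exp(2 * t)], 6))   # same values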

Nonhomogeneous Systems

Before leaving the theory of systems of linear, constant coefficient systems, we will discuss nonhomogeneous systems. We would like to solve systems of the form


\mathbf{x}^{\prime}=A(t) \mathbf{x}+\mathbf{f}(t) \nonumber

We will assume that we have found the fundamental matrix solution of the homogeneous equation. Furthermore, we will assume that A(t) and \mathbf{f}(t) are continuous on some common domain.

As with second order equations, we can look for solutions that are a sum of the general solution to the homogeneous problem plus a particular solution of the nonhomogeneous problem. Namely, we can write the general solution as

\mathbf{x}(t)=\Phi(t) \mathbf{C}+\mathbf{x}_{p}(t), \nonumber

where \mathbf{C} is an arbitrary constant vector, \Phi(t) is the fundamental matrix solution of \mathbf{x}^{\prime}=A(t) \mathbf{x}, and

\mathbf{x}_{p}^{\prime}=A(t) \mathbf{x}_{p}+\mathbf{f}(t) . \nonumber

Such a representation is easily verified.

We need to find the particular solution, \mathbf{x}_{p}(t). We can do this by applying The Method of Variation of Parameters for Systems. We consider a solution in the form of the solution of the homogeneous problem, but replace the constant vector by unknown parameter functions. Namely, we assume that

\mathbf{x}_{p}(t)=\Phi(t) \mathbf{c}(t) . \nonumber

Differentiating, we have that

\mathbf{x}_{p}^{\prime}=\Phi^{\prime} \mathbf{c}+\Phi \mathbf{c}^{\prime}=A \Phi \mathbf{c}+\Phi \mathbf{c}^{\prime} \nonumber

\mathbf{x}_{p}^{\prime}-A \mathbf{x}_{p}=\Phi \mathbf{c}^{\prime} . \nonumber

But the left side is \mathbf{f}. So, we have that,

\Phi \mathbf{c}^{\prime}=\mathbf{f} \nonumber

or, since \Phi is invertible (why?),

\mathbf{c}^{\prime}=\Phi^{-1} \mathbf{f} \nonumber

In principle, this can be integrated to give c. Therefore, the particular solution can be written as

\mathbf{x}_{p}(t)=\Phi(t) \int^{t} \Phi^{-1}(s) \mathbf{f}(s) d s . \nonumber

This is the variation of parameters formula.

The general solution of Equation (2.70) has been found as

\mathbf{x}(t)=\Phi(t) \mathbf{C}+\Phi(t) \int^{t} \Phi^{-1}(s) \mathbf{f}(s) d s . \nonumber

We can use the general solution to find the particular solution of an initial value problem consisting of Equation (2.70) and the initial condition \mathbf{x}\left(t_{0}\right)= \mathbf{x}_{0}. This condition is satisfied for a solution of the form

\mathbf{x}(t)=\Phi(t) \mathbf{C}+\Phi(t) \int_{t_{0}}^{t} \Phi^{-1}(s) \mathbf{f}(s) d s \nonumber

provided

\mathbf{x}_{0}=\mathbf{x}\left(t_{0}\right)=\Phi\left(t_{0}\right) \mathbf{C} . \nonumber

This can be solved for \mathbf{C} as in the last section. Inserting the solution back into the general solution (2.73), we have

\mathbf{x}(t)=\Phi(t) \Phi^{-1}\left(t_{0}\right) \mathbf{x}_{0}+\Phi(t) \int_{t_{0}}^{t} \Phi^{-1}(s) \mathbf{f}(s) d s \nonumber

This solution can be written a little neater in terms of the principal matrix solution, \Psi(t)=\Phi(t) \Phi^{-1}\left(t_{0}\right) :

\mathbf{x}(t)=\Psi(t) \mathbf{x}_{0}+\Psi(t) \int_{t_{0}}^{t} \Psi^{-1}(s) \mathbf{f}(s) d s \nonumber

Finally, one further simplification occurs when A is a constant matrix, which is the only type of problem we have solved in this chapter. In this case (taking t_{0}=0), we have that \Psi^{-1}(t)=\Psi(-t). So, computing \Psi^{-1}(t) is relatively easy.

Example 2.18. x^{\prime \prime}+x=2 \cos t, x(0)=4, x^{\prime}(0)=0. This example can be solved using the Method of Undetermined Coefficients. However, we will use the matrix method described in this section.

First, we write the problem in matrix form. The system can be written as

\begin{gathered} x^{\prime}=y \\[4pt] y^{\prime}=-x+2 \cos t \end{gathered} \nonumber

Thus, we have a nonhomogeneous system of the form

\mathbf{x}^{\prime}=A \mathbf{x}+\mathbf{f}=\left(\begin{array}{cc} 0 & 1 \\[4pt] -1 & 0 \end{array}\right)\left(\begin{array}{l} x \\[4pt] y \end{array}\right)+\left(\begin{array}{c} 0 \\[4pt] 2 \cos t \end{array}\right) \nonumber

Next we need the fundamental matrix of solutions of the homogeneous problem. We have that

A=\left(\begin{array}{cc} 0 & 1 \\[4pt] -1 & 0 \end{array}\right) \text {. } \nonumber

The eigenvalues of this matrix are \lambda=\pm i. An eigenvector associated with \lambda=i is easily found as \left(\begin{array}{l}1 \\[4pt] i\end{array}\right). This leads to a complex solution

\left(\begin{array}{l} 1 \\[4pt] i \end{array}\right) e^{i t}=\left(\begin{array}{c} \cos t+i \sin t \\[4pt] i \cos t-\sin t \end{array}\right) . \nonumber

From this solution we can construct the fundamental solution matrix

\Phi(t)=\left(\begin{array}{cc} \cos t & \sin t \\[4pt] -\sin t & \cos t \end{array}\right) \nonumber

So, the general solution to the homogeneous problem is

\mathbf{x}_{h}=\Phi(t) \mathbf{C}=\left(\begin{array}{c} c_{1} \cos t+c_{2} \sin t \\[4pt] -c_{1} \sin t+c_{2} \cos t \end{array}\right) \nonumber

Next we seek a particular solution to the nonhomogeneous problem. From Equation (2.73) we see that we need \Phi^{-1}(s) \mathbf{f}(s). Thus, we have

\begin{aligned} \Phi^{-1}(s) \mathbf{f}(s) &=\left(\begin{array}{cc} \cos s & -\sin s \\[4pt] \sin s & \cos s \end{array}\right)\left(\begin{array}{c} 0 \\[4pt] 2 \cos s \end{array}\right) \\[4pt] &=\left(\begin{array}{c} -2 \sin s \cos s \\[4pt] 2 \cos ^{2} s \end{array}\right) \end{aligned} \nonumber

We now compute

\begin{aligned} \Phi(t) \int_{0}^{t} \Phi^{-1}(s) \mathbf{f}(s) d s &=\left(\begin{array}{cc}\cos t & \sin t \\[4pt]-\sin t & \cos t\end{array}\right)\left(\begin{array}{c}-\sin ^{2} t \\[4pt]t+\dfrac{1}{2} \sin (2 t)\end{array}\right) \\[4pt] &=\left(\begin{array}{c}t \sin t \\[4pt]\sin t+t \cos t\end{array}\right) . \end{aligned} \nonumber

Therefore, the general solution is

\mathbf{x}=\left(\begin{array}{c} c_{1} \cos t+c_{2} \sin t \\[4pt] -c_{1} \sin t+c_{2} \cos t \end{array}\right)+\left(\begin{array}{c} t \sin t \\[4pt] \sin t+t \cos t \end{array}\right) \nonumber

The solution to the initial value problem is

\mathbf{x}=\left(\begin{array}{cc} \cos t & \sin t \\[4pt] -\sin t & \cos t \end{array}\right)\left(\begin{array}{l} 4 \\[4pt] 0 \end{array}\right)+\left(\begin{array}{c} t \sin t \\[4pt] \sin t+t \cos t \end{array}\right) \nonumber

\mathbf{x}=\left(\begin{array}{c} 4 \cos t+t \sin t \\[4pt] -3 \sin t+t \cos t \end{array}\right) \nonumber
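As a check, one can integrate the nonhomogeneous system directly. The Python sketch below (using scipy; the sample times are arbitrary) compares the numerical solution with x(t)=4 \cos t+t \sin t.

import numpy as np
from scipy.integrate import solve_ivp

# x' = y, y' = -x + 2 cos t, with x(0) = 4, y(0) = 0
def rhs(t, u):
    x, y = u
    return [y, -x + 2 * np.cos(t)]

ts = np.linspace(0.0, 6.0, 4)
sol = solve_ivp(rhs, (0.0, 6.0), [4.0, 0.0], t_eval=ts, rtol=1e-9, atol=1e-12)
print(np.round(sol.y[0], 5))
print(np.round(4 * np.cos(ts) + ts * np.sin(ts), 5))   # matches the x-component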

2.9 Applications

In this section we will describe several applications leading to systems of differential equations. In keeping with common practice in areas like physics, we will denote differentiation with respect to time as


\dot{x}=\dfrac{d x}{d t} \nonumber

We will look mostly at linear models and later modify some of these models to include nonlinear terms.

Spring-Mass Systems

There are many problems in physics that result in systems of equations. This is because the most basic law of physics is given by Newton’s Second Law, which states that if a body experiences a net force, it will accelerate. In particular, the net force is proportional to the acceleration with a proportionality constant called the mass, m. This is summarized as

\sum \mathbf{F}=m \mathbf{a} \nonumber

Since \mathbf{a}=\ddot{\mathbf{x}}, Newton’s Second Law is mathematically a system of second order differential equations for three dimensional problems, or one second order differential equation for one dimensional problems. If there are several masses, then we would naturally end up with systems no matter how many dimensions are involved.

A standard problem encountered in a first course in differential equations is that of a single block on a spring as shown in Figure 2.18. The net force in this case is the restoring force of the spring given by Hooke’s Law,

F_{s}=-k x \nonumber

where k>0 is the spring constant. Here x is the elongation of the spring, or the displacement of the block from equilibrium. When x is positive, the spring force is negative and when x is negative the spring force is positive. We have depicted a horizontal system sitting on a frictionless surface.

A similar model can be provided for vertically oriented springs. Place the block on a vertically hanging spring. It comes to equilibrium, stretching the spring by \ell_{0}. Newton’s Second Law gives

-m g+k \ell_{0}=0 . \nonumber

Now, pulling the mass further by x_{0}, and releasing it, the mass begins to oscillate. Letting x be the displacement from the new equilibrium, Newton’s Second Law now gives m \ddot{x}=-m g+k\left(\ell_{0}-x\right)=-k x.

In both examples (a horizontally or vertically oscillating mass) Newton’s Second Law of motion results in the differential equation

m \ddot{x}+k x=0 . \nonumber

This is the equation for simple harmonic motion which we have already encountered in Chapter 1 .

image
Figure 2.18. Spring-Mass system.

This second order equation can be written as a system of two first order equations.

\begin{aligned} &x^{\prime}=y \\[4pt] &y^{\prime}=-\dfrac{k}{m} x \end{aligned} \nonumber

The coefficient matrix for this system is

A=\left(\begin{array}{cc} 0 & 1 \\[4pt] -\omega^{2} & 0 \end{array}\right) \nonumber

where \omega^{2}=\dfrac{k}{m}. The eigenvalues of this system are \lambda=\pm i \omega and the solutions are simple sines and cosines,

\begin{aligned} &x(t)=c_{1} \cos \omega t+c_{2} \sin \omega t, \\[4pt] &y(t)=\omega\left(-c_{1} \sin \omega t+c_{2} \cos \omega t\right) . \end{aligned} \nonumber

We further note that \omega is called the angular frequency of oscillation and is given in \mathrm{rad} / \mathrm{s}. The frequency of oscillation is

f=\dfrac{\omega}{2 \pi} \nonumber

It typically has units of \mathrm{s}^{-1}, cps, or Hz. The multiplicative inverse has units of time and is called the period,

T=\dfrac{1}{f} . \nonumber

Thus, the period of oscillation for a mass m on a spring with spring constant k is given by

T=2 \pi \sqrt{\dfrac{m}{k}} \nonumber

Of course, we did not need to convert the last problem into a system. In fact, we had seen this equation in Chapter 1 . However, when one considers

image
Figure 2.19. Spring-Mass system for two masses and two springs.

more complicated spring-mass systems, systems of differential equations occur naturally. Consider two blocks attached with two springs as shown in Figure 2.19. In this case we apply Newton’s second law for each block.

First, consider the forces acting on the first block. The first spring is stretched by x_{1}. This gives a force of F_{1}=-k_{1} x_{1}. The second spring may also exert a force on the block, depending on whether it is stretched or not. If both blocks are displaced by the same amount, then the second spring is not stretched at all. So, the amount by which the second spring is stretched depends on the relative displacement of the two masses. This results in a second force of F_{2}=k_{2}\left(x_{2}-x_{1}\right).

There is only one spring connected to mass two. Again the force depends on the relative displacement of the masses. It is just oppositely directed to the force which mass one feels from this spring.

Combining these forces and using Newton’s Second Law for both masses, we have the system of second order differential equations

\begin{aligned} &m_{1} \ddot{x}_{1}=-k_{1} x_{1}+k_{2}\left(x_{2}-x_{1}\right) \\[4pt] &m_{2} \ddot{x}_{2}=-k_{2}\left(x_{2}-x_{1}\right) \end{aligned} \nonumber

One can rewrite this system of two second order equations as a system of four first order equations. This is done by introducing two new variables x_{3}=\dot{x}_{1} and x_{4}=\dot{x}_{2}. Note that these physically are the velocities of the two blocks.

The resulting system of first order equations is given as

\begin{aligned} \dot{x}_{1} &=x_{3} \\[4pt] \dot{x}_{2} &=x_{4} \\[4pt] \dot{x}_{3} &=-\dfrac{k_{1}}{m_{1}} x_{1}+\dfrac{k_{2}}{m_{1}}\left(x_{2}-x_{1}\right) \\[4pt] \dot{x}_{4} &=-\dfrac{k_{2}}{m_{2}}\left(x_{2}-x_{1}\right) \end{aligned} \nonumber

We can write our new system in matrix form as

\left(\begin{array}{c} \dot{x}_{1} \\[4pt] \dot{x}_{2} \\[4pt] \dot{x}_{3} \\[4pt] \dot{x}_{4} \end{array}\right)=\left(\begin{array}{cccc} 0 & 0 & 1 & 0 \\[4pt] 0 & 0 & 0 & 1 \\[4pt] -\dfrac{k_{1}+k_{2}}{m_{1}} & \dfrac{k_{2}}{m_{1}} & 0 & 0 \\[4pt] \dfrac{k_{2}}{m_{2}} & -\dfrac{k_{2}}{m_{2}} & 0 & 0 \end{array}\right)\left(\begin{array}{l} x_{1} \\[4pt] x_{2} \\[4pt] x_{3} \\[4pt] x_{4} \end{array}\right) \nonumber
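The eigenvalues of this 4 \times 4 coefficient matrix are purely imaginary pairs \pm i \omega, and the \omega's are the normal-mode frequencies of the coupled system. The Python sketch below builds the matrix for sample values m_{1}=m_{2}=1 and k_{1}=k_{2}=1 (arbitrary illustrative choices, not from the text) and computes them.

import numpy as np

m1 = m2 = 1.0        # illustrative values
k1 = k2 = 1.0

A = np.array([[0.0, 0.0, 1.0, 0.0],
              [0.0, 0.0, 0.0, 1.0],
              [-(k1 + k2) / m1, k2 / m1, 0.0, 0.0],
              [k2 / m2, -k2 / m2, 0.0, 0.0]])

evals = np.linalg.eig(A)[0]
print(np.round(evals, 4))                         # pairs +- i*omega
print(np.round(np.sort(np.abs(evals.imag)), 4))   # each mode frequency appears twice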

Electrical Circuits

Another problem often encountered in a first year physics class is that of an LRC series circuit. This circuit is pictured in Figure 2.20. The resistor is a circuit element satisfying Ohm’s Law. The capacitor is a device that stores electrical energy and an inductor, or coil, stores magnetic energy.

The physics for this problem stems from Kirchhoff’s Rules for circuits. Since there is only one loop, we will only need Kirchhoff’s Loop Rule. Namely, the sum of the drops in electric potential is set equal to the rises in electric potential. The potential drops across each circuit element are given by

  1. Resistor: V_{R}=I R.
  2. Capacitor: V_{C}=\dfrac{q}{C}.
  3. Inductor: V_{L}=L \dfrac{d I}{d t}.
image
Figure 2.20. Series LRC Circuit.

Adding these potential drops and setting the sum equal to the voltage supplied by the voltage source, V(t), we obtain

I R+\dfrac{q}{C}+L \dfrac{d I}{d t}=V(t) \nonumber

Furthermore, we recall that the current is defined as I=\dfrac{d q}{d t}, where q is the charge in the circuit. Since both q and I are unknown, we can replace the current by its expression in terms of the charge to obtain

L \ddot{q}+R \dot{q}+\dfrac{1}{C} q=V(t) . \nonumber

This is a second order differential equation for q(t). One can set up a system of equations and proceed to solve them. However, this is a constant coefficient differential equation and can also be solved using the methods in Chapter 1.

In the next examples we will look at special cases that arise for the series LRC circuit equation. These include R C circuits, solvable by first order methods and L C circuits, leading to oscillatory behavior.

Example 2.19. RC Circuits

We first consider the case of an RC circuit in which there is no inductor. Also, we will consider what happens when one charges a capacitor with a DC battery \left(V(t)=V_{0}\right) and when one discharges a charged capacitor (V(t)=0).

For charging a capacitor, we have the initial value problem

R \dfrac{d q}{d t}+\dfrac{q}{C}=V_{0}, \quad q(0)=0 \nonumber

This equation is an example of a linear first order equation for q(t). However, we can also rewrite this equation and solve it as a separable equation, since V_{0} is a constant. We will do the former only as another example of finding the integrating factor.

We first write the equation in standard form:

\dfrac{d q}{d t}+\dfrac{q}{R C}=\dfrac{V_{0}}{R} . \nonumber

The integrating factor is then

\mu(t)=e^{\int \dfrac{d t}{R C}}=e^{t / R C} . \nonumber

Thus,

\dfrac{d}{d t}\left(q e^{t / R C}\right)=\dfrac{V_{0}}{R} e^{t / R C} \nonumber

Integrating, we have

q e^{t / R C}=\dfrac{V_{0}}{R} \int e^{t / R C} d t=C V_{0} e^{t / R C}+K \nonumber

Note that we introduced the integration constant, K. Now divide out the exponential to get the general solution:

q=C V_{0}+K e^{-t / R C} \nonumber

(If we had forgotten the K, we would not have gotten a correct solution for the differential equation.)

Next, we use the initial condition to get our particular solution. Namely, setting t=0, we have that

0=q(0)=C V_{0}+K \nonumber

So, K=-C V_{0}. Inserting this into our solution, we have

q(t)=C V_{0}\left(1-e^{-t / R C}\right) \nonumber

Now we can study the behavior of this solution. For large times the second term goes to zero. Thus, the capacitor charges up, asymptotically, to the final value of q_{0}=C V_{0}. This is what we expect: once the current stops flowing, q_{0}=C V_{0} is just the relation between the charge established on the plates and the potential difference across them.


image
Figure 2.21. The charge as a function of time for a charging capacitor with R=2.00 \mathrm{k} \Omega, C=6.00 \mathrm{mF}, and V_{0}=12 \mathrm{~V}

Let’s put in some values for the parameters. We let R=2.00 \mathrm{k} \Omega, C=6.00 \mathrm{mF}, and V_{0}=12 \mathrm{~V}. A plot of the solution is given in Figure 2.21. We see that the charge builds up to the value of C V_{0}=72 \mathrm{mC}. If we use a smaller resistance, R=200 \Omega, we see in Figure 2.22 that the capacitor charges to the same value, but much faster.

The rate at which a capacitor charges, or discharges, is governed by the time constant, \tau=R C. This is the constant factor in the exponential. The larger it is, the slower the exponential term decays. If we set t=\tau, we find that

q(\tau)=C V_{0}\left(1-e^{-1}\right)=(1-0.3678794412 \ldots) q_{0} \approx 0.63 q_{0} \nonumber

Thus, at time t=\tau, the capacitor has almost charged to two thirds of its final value. For the first set of parameters, \tau=12 \mathrm{~s}. For the second set, \tau=1.2 \mathrm{~s}.
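
The numbers quoted here are easy to reproduce. The short sketch below evaluates the time constant and q(\tau) for the two parameter sets used in Figures 2.21 and 2.22.

```python
# Sketch: time constants and q(tau) for the two charging examples in the text.
import numpy as np

V0 = 12.0          # volts
C = 6.00e-3        # farads (6.00 mF)
q0 = C * V0        # final charge, 72 mC

for R in (2.00e3, 200.0):                  # ohms
    tau = R * C
    q_tau = q0 * (1 - np.exp(-1))          # charge at t = tau
    print(f"R = {R:6.0f} ohm: tau = {tau:4.1f} s, "
          f"q(tau) = {q_tau*1e3:.1f} mC of {q0*1e3:.0f} mC")
```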


image
Figure 2.22. The charge as a function of time for a charging capacitor with R=200 \Omega, C=6.00 \mathrm{mF}, and V_{0}=12 \mathrm{~V}.

Now, let’s assume the capacitor is charged with charge \pm q_{0} on its plates. If we disconnect the battery and reconnect the wires to complete the circuit, the charge will then move off the plates, discharging the capacitor. The relevant form of our initial value problem becomes

R \dfrac{d q}{d t}+\dfrac{q}{C}=0, \quad q(0)=q_{0} \nonumber

This equation is simpler to solve. Rearranging, we have

\dfrac{d q}{d t}=-\dfrac{q}{R C} \nonumber

This is a simple exponential decay problem, which you can solve using separation of variables. However, by now you should know how to immediately write down the solution to such problems of the form y^{\prime}=k y. The solution is

q(t)=q_{0} e^{-t / \tau}, \quad \tau=R C . \nonumber

We see that the charge decays exponentially. In principle, the capacitor never fully discharges. That is why you are often instructed to place a shunt across a discharged capacitor to fully discharge it.

In Figure 2.23 we show the discharging of our two previous RC circuits. Once again, \tau=R C determines the behavior. At t=\tau we have

q(\tau)=q_{0} e^{-1}=(0.3678794412 \ldots) q_{0} \approx 0.37 q_{0} . \nonumber

So, at this time the capacitor only holds about a third of its original charge.


image


Figure 2.23. The charge as a function of time for a discharging capacitor with R=2.00 \mathrm{k} \Omega or R=200 \Omega, and C=6.00 \mathrm{mF}, and q_{0}=72 \mathrm{mC}.

Example 2.20. LC Circuits

Another simple result comes from studying L C circuits. We will now connect a charged capacitor to an inductor. In this case, we consider the initial value problem

L \ddot{q}+\dfrac{1}{C} q=0, \quad q(0)=q_{0}, \dot{q}(0)=I(0)=0 \nonumber

Dividing out the inductance, we have

\ddot{q}+\dfrac{1}{L C} q=0 . \nonumber

This equation is a second order, constant coefficient equation. It is of the same form as the one we saw earlier for simple harmonic motion of a mass on a spring. So, we expect oscillatory behavior. The characteristic equation is

r^{2}+\dfrac{1}{L C}=0 . \nonumber

The solutions are

r_{1,2}=\pm \dfrac{i}{\sqrt{L C}} . \nonumber

Thus, the solution of (2.96) is of the form

q(t)=c_{1} \cos (\omega t)+c_{2} \sin (\omega t), \quad \omega=(L C)^{-1 / 2} \nonumber

Inserting the initial conditions yields

q(t)=q_{0} \cos (\omega t) . \nonumber

The oscillations that result are understandable. As the charge leaves the plates, the changing current induces a changing magnetic field in the inductor. The stored electrical energy in the capacitor changes to stored magnetic energy in the inductor. The process continues until the plates are charged with the opposite polarity, and then it begins in reverse. The capacitor discharges again, eventually returns to its original state, and the whole cycle repeats over and over.

The frequency of this simple harmonic motion is easily found. It is given by

f=\dfrac{\omega}{2 \pi}=\dfrac{1}{2 \pi} \dfrac{1}{\sqrt{L C}} . \nonumber

This is called the tuning frequency because of its role in tuning circuits.

Of course, this is an ideal situation. There is always resistance in the circuit, even if only a small amount from the wires. So, we really need to account for resistance, or even add a resistor. This leads to a slightly more complicated system in which damping will be present. More complicated circuits are possible by looking at parallel connections, or other combinations, of resistors, capacitors and inductors. This will result in several equations for each loop in the circuit, leading to larger systems of differential equations. An example of another circuit setup is shown in Figure 2.24. This is not a problem that can be covered in the first year physics course.

image
Figure 2.24. A circuit with two loops containing several different circuit elements.
image
Figure 2.25. The previous parallel circuit with the directions indicated for traversing the loops in Kirchhoff’s Laws.

We have three unknown functions for the charge. Once we know the charge functions, differentiation will yield the currents. However, we only have two equations. We need a third equation. This is found from Kirchhoff’s Point (Junction) Rule. Consider the points A and B in Figure 2.25. Any charge (current) entering these junctions must be the same as the total charge (current) leaving the junctions. For point A we have

I_{1}=I_{2}+I_{3}, \nonumber

\dot{q}_{1}=\dot{q}_{2}+\dot{q}_{3} . \nonumber

Equations (2.100), (2.101), and (2.103) form a coupled system of differential equations for this problem. There are both first and second order derivatives involved. We can write the whole system in terms of charges as

\begin{array}{r} R_{1} \dot{q}_{1}+\dfrac{q_{2}}{C}=V(t) \\[4pt] R_{2} \dot{q}_{3}+L \ddot{q}_{3}=\dfrac{q_{2}}{C} \\[4pt] \dot{q}_{1}=\dot{q}_{2}+\dot{q}_{3} \end{array} \nonumber

The question is whether, or not, we can write this as a system of first order differential equations. Since there is only one second order derivative, we can introduce the new variable q_{4}=\dot{q}_{3}. The first equation can be solved for \dot{q}_{1}. The third equation can be solved for \dot{q}_{2} with appropriate substitutions for the other terms. \dot{q}_{3} is obtained from the definition of q_{4}, and the second equation can be solved for \ddot{q}_{3} and substitutions made to obtain the system

\begin{aligned} \dot{q}_{1} &=\dfrac{V}{R_{1}}-\dfrac{q_{2}}{R_{1} C} \\[4pt] \dot{q}_{2} &=\dfrac{V}{R_{1}}-\dfrac{q_{2}}{R_{1} C}-q_{4} \\[4pt] \dot{q}_{3} &=q_{4} \\[4pt] \dot{q}_{4} &=\dfrac{q_{2}}{L C}-\dfrac{R_{2}}{L} q_{4} \end{aligned} \nonumber

So, we have a nonhomogeneous first order system of differential equations. In the last section we learned how to solve such systems.
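
Before moving on, it is worth noting that this system is also easy to handle numerically. The sketch below integrates the four first order equations for an assumed constant source V and assumed element values; none of these numbers come from the text.

```python
# Sketch: integrate the two-loop circuit system derived above.
# All element values and the constant source voltage are assumed for illustration.
import numpy as np
from scipy.integrate import solve_ivp

R1, R2 = 100.0, 100.0      # ohms (assumed)
L, C = 1.0, 1.0e-4         # henries, farads (assumed)
V = 12.0                   # volts, constant source (assumed)

def rhs(t, q):
    q1, q2, q3, q4 = q
    dq1 = V / R1 - q2 / (R1 * C)
    dq2 = V / R1 - q2 / (R1 * C) - q4
    dq3 = q4
    dq4 = q2 / (L * C) - (R2 / L) * q4
    return [dq1, dq2, dq3, dq4]

sol = solve_ivp(rhs, (0.0, 0.1), [0.0, 0.0, 0.0, 0.0], max_step=1e-4)
print(sol.y[:, -1])        # q1, q2, q3 and the current q4 = dq3/dt at t = 0.1 s
```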

Love Affairs

The next application is one that has been studied by several authors as a cute system involving relationships. One considers what happens to the affections that two people have for each other over time. Let R denote the affection that Romeo has for Juliet and J be the affection that Juliet has for Romeo. Positive values indicate love and negative values indicate dislike.

One possible model is given by

\begin{aligned} &\dfrac{d R}{d t}=b J \\[4pt] &\dfrac{d J}{d t}=c R \end{aligned} \nonumber

with b>0 and c<0. In this case Romeo loves Juliet the more she likes him. But Juliet backs away when she finds his love for her increasing.

A typical system relating the combined changes in affection can be modeled as

\begin{aligned} &\dfrac{d R}{d t}=a R+b J \\[4pt] &\dfrac{d J}{d t}=c R+d J \end{aligned} \nonumber

Several scenarios are possible for various choices of the constants. For example, if a>0 and b>0, Romeo gets more and more excited by Juliet’s love for him. If c>0 and d<0, Juliet is being cautious about her relationship with Romeo. For specific values of the parameters and initial conditions, one can explore this match of an overly zealous lover with a cautious lover.
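
One convenient way to explore such scenarios is to look at the eigenvalues of the coefficient matrix, since they determine the type of fixed point at the origin. The sketch below uses one illustrative choice of the constants; the parameter values are assumptions, not ones prescribed by the text.

```python
# Sketch: classify the affection system R' = aR + bJ, J' = cR + dJ by eigenvalues.
# The parameter values are assumed, modeling an eager Romeo and a cautious Juliet.
import numpy as np

a, b = 1.0, 1.0       # Romeo responds to his own feelings and to Juliet's (assumed)
c, d = 1.0, -1.0      # Juliet responds to Romeo but damps her own feelings (assumed)

M = np.array([[a, b], [c, d]])
print(np.linalg.eigvals(M))
# Real eigenvalues of opposite sign give a saddle; complex eigenvalues give a
# spiral (or a center if the real part vanishes).
```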

Predator Prey Models

Another common model studied is that of competing species. For example, we could consider a population of rabbits and foxes. Left to themselves, rabbits would tend to multiply, thus

\dfrac{d R}{d t}=a R, \nonumber

with a>0. In such a model the rabbit population would grow exponentially. Similarly, a population of foxes would decay without the rabbits to feed on. So, we have that

\dfrac{d F}{d t}=-b F \nonumber

for b>0.

Now, if we put these populations together on a deserted island, they would interact. The more foxes there are, the more the rabbit population decreases. However, the more rabbits there are, the more food the foxes have, and their population thrives. Thus, we could model the competing populations as

\begin{gathered} \dfrac{d R}{d t}=a R-c F, \\[4pt] \dfrac{d F}{d t}=-b F+d R, \end{gathered} \nonumber

where all of the constants are positive numbers. Studying this coupled system would lead to a study of the dynamics of these populations. We will discuss other (nonlinear) systems in the next chapter.

Mixture Problems

There are many types of mixture problems. Such problems are standard in a first course on differential equations as examples of first order differential equations. Typically these examples consist of a tank of brine, that is, water containing a specific amount of salt, with pure water entering and the mixture leaving, or the flow of a pollutant into, or out of, a lake.

In general one has a rate of flow of some concentration of mixture entering a region and a mixture leaving the region. The goal is to determine how much stuff is in the region at a given time. This is governed by the equation

\text { Rate of change of substance }=\text { Rate In }-\text { Rate Out. } \nonumber

This can be generalized to the case of two interconnected tanks. We provide some examples.

Example 2.21. Single Tank Problem

A 50 gallon tank of pure water has a brine mixture with concentration of 2 pounds per gallon entering at the rate of 5 gallons per minute. [See Figure 2.26.] At the same time the well-mixed contents drain out at the rate of 5 gallons per minute. Find the amount of salt in the tank at time t. In all such problems one assumes that the solution is well mixed at each instant of time.

image
Figure 2.26. A typical mixing problem.

Let x(t) be the amount of salt at time t. Then the rate at which the salt in the tank increases is due to the amount of salt entering the tank less that leaving the tank. To figure out these rates, one notes that d x / d t has units of pounds per minute. The amount of salt entering per minute is given by the product of the entering concentration times the rate at which the brine enters. This gives the correct units:

\left(2 \dfrac{\text { pounds }}{\text { gal }}\right)\left(5 \dfrac{\text { gal }}{\text { min }}\right)=10 \dfrac{\text { pounds }}{\text { min }} . \nonumber

Similarly, one can determine the rate out as

\left(\dfrac{x \text { pounds }}{50 \text { gal }}\right)\left(5 \dfrac{\text { gal }}{\mathrm{min}}\right)=\dfrac{x}{10} \dfrac{\text { pounds }}{\mathrm{min}} . \nonumber

Thus, we have

\dfrac{d x}{d t}=10-\dfrac{x}{10} \nonumber

This equation is easily solved using the methods for first order equations.
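
For reference, solving this equation with the integrating factor e^{t / 10} and x(0)=0 gives x(t)=100\left(1-e^{-t / 10}\right), so the salt content approaches 100 pounds. The short sketch below checks this against a direct numerical integration.

```python
# Sketch: check x(t) = 100*(1 - exp(-t/10)) against a numerical integration
# of dx/dt = 10 - x/10 with x(0) = 0.
import numpy as np
from scipy.integrate import solve_ivp

sol = solve_ivp(lambda t, x: 10 - x[0] / 10, (0.0, 60.0), [0.0], rtol=1e-9)
t_end = sol.t[-1]
print(sol.y[0, -1], 100 * (1 - np.exp(-t_end / 10)))   # both near the limit of 100 lb
```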

Example 2.22. Double Tank Problem

image
Figure 2.27. The two tank problem.

One has two tanks connected together, labelled tank X and tank Y, as shown in Figure 2.27. Let tank X initially have 100 gallons of brine made with 100 pounds of salt. Tank Y initially has 100 gallons of pure water. Now pure water is pumped into tank X at a rate of 2.0 gallons per minute. Some of the mixture of brine and pure water flows into tank Y at 3 gallons per minute. To keep the tank levels the same, the mixture in tank Y flows back into tank X at a rate of one gallon per minute, and 2.0 gallons per minute drains out. Find the amount of salt at any given time in the tanks. What happens over a long period of time?

In this problem we set up two equations. Let x(t) be the amount of salt in tank X and y(t) the amount of salt in tank Y. Again, we carefully look at the rates into and out of each tank in order to set up the system of differential equations. We obtain the system

\begin{aligned} &\dfrac{d x}{d t}=\dfrac{y}{100}-\dfrac{3 x}{100} \\[4pt] &\dfrac{d y}{d t}=\dfrac{3 x}{100}-\dfrac{3 y}{100} \end{aligned} \nonumber

This is a linear, homogeneous constant coefficient system of two first order equations, which we know how to solve.
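
For instance, the eigenvalues of the coefficient matrix are both negative real numbers, so the salt in both tanks eventually washes out. The sketch below confirms this numerically for the stated initial data x(0)=100 and y(0)=0.

```python
# Sketch: integrate the two-tank system numerically with x(0) = 100 lb, y(0) = 0 lb.
import numpy as np
from scipy.integrate import solve_ivp

A = np.array([[-3 / 100, 1 / 100],
              [ 3 / 100, -3 / 100]])

sol = solve_ivp(lambda t, u: A @ u, (0.0, 600.0), [100.0, 0.0], rtol=1e-9)
print("eigenvalues:", np.linalg.eigvals(A))   # both negative -> all the salt drains away
print("x, y at t = 600 min:", sol.y[:, -1])
```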

Chemical Kinetics

There are many problems that come from studying chemical reactions. The simplest reaction is when a chemical A turns into chemical B. This happens at a certain rate, k>0. This can be represented by the chemical formula

A \underset{k}{\longrightarrow} B \nonumber

In this case we have that the rates of change of the concentrations of A,[A], and B,[B], are given by

\begin{aligned} &\dfrac{d[A]}{d t}=-k[A] \\[4pt] &\dfrac{d[B]}{d t}=k[A] \end{aligned} \nonumber

Think about this as it is a key to understanding the next reactions.

A more complicated reaction is given by

A \underset{k_{1}}{\longrightarrow} B \underset{k_{2}}{\longrightarrow} C \text {. } \nonumber

In this case we can add to the above equation the rates of change of concentrations [B] and [C]. The resulting system of equations is

\begin{aligned} \dfrac{d[A]}{d t} &=-k_{1}[A] \\[4pt] \dfrac{d[B]}{d t} &=k_{1}[A]-k_{2}[B] \\[4pt] \dfrac{d[C]}{d t} &=k_{2}[B] \end{aligned} \nonumber

One can further consider reactions in which a reverse reaction is possible. Thus, a further generalization occurs for the reaction

A \underset{k_{3}}{\stackrel{k_{1}}{\rightleftharpoons}} B \underset{k_{2}}{\longrightarrow} C \text {. } \nonumber

The resulting system of equations is

\begin{aligned} &\dfrac{d[A]}{d t}=-k_{1}[A]+k_{3}[B] \\[4pt] &\dfrac{d[B]}{d t}=k_{1}[A]-k_{2}[B]-k_{3}[B] \\[4pt] &\dfrac{d[C]}{d t}=k_{2}[B] \end{aligned} \nonumber

More complicated chemical reactions will be discussed at a later time.
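
Even before more complicated reactions appear, the linear systems above are easy to study numerically. The sketch below integrates the reversible-reaction system; the rate constants and the initial concentration are assumed values chosen for illustration (compare Problem 2.16).

```python
# Sketch: integrate the reversible-reaction system for [A], [B], [C].
# Rate constants and the initial concentration are assumed for illustration.
import numpy as np
from scipy.integrate import solve_ivp

k1, k2, k3 = 0.20, 0.05, 0.10     # 1/ms (assumed)

def rates(t, conc):
    A, B, C = conc
    return [-k1 * A + k3 * B,
             k1 * A - k2 * B - k3 * B,
             k2 * B]

sol = solve_ivp(rates, (0.0, 50.0), [1.0, 0.0, 0.0], rtol=1e-9)
print(sol.y[:, -1], "total:", sol.y[:, -1].sum())   # total concentration is conserved
```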

Epidemics

Another interesting area of application of differential equations is in predicting the spread of disease. Typically, one has a population of susceptible people or animals. Several infected individuals are introduced into the population and one is interested in how the infection spreads and if the number of infected people drastically increases or dies off. Such models are typically nonlinear and we will look at what is called the SIR model in the next chapter. In this section we will consider a simple linear model.

Let’s break the population into three classes. First, S(t) are the healthy people, who are susceptible to infection. Let I(t) be the number of infected people. Of these infected people, some will die from the infection and others recover. Let’s assume that initially there is one infected person and the rest, say N, are healthy. Can we predict how many deaths have occurred by time t?

Let’s try and model this problem using the compartmental analysis we had seen in the mixing problems. The total rate of change of any population would be due to those entering the group less those leaving the group. For example, the number of healthy people decreases due to infection and can increase when some of the infected group recovers. Let’s assume that the rate of infection is proportional to the number of healthy people, a S. Also, we assume that the number who recover is proportional to the number of infected, r I. Thus, the rate of change of the healthy people is found as

\dfrac{d S}{d t}=-a S+r I . \nonumber

Let the number of deaths be D(t). Then, the death rate could be taken to be proportional to the number of infected people. So,

\dfrac{d D}{d t}=d I \nonumber

Finally, the rate of change of infectives is due to healthy people getting infected and the infectives who either recover or die. Using the corresponding terms in the other equations, we can write

\dfrac{d I}{d t}=a S-r I-d I . \nonumber

This linear system can be written in matrix form.

\dfrac{d}{d t}\left(\begin{array}{c} S \\[4pt] I \\[4pt] D \end{array}\right)=\left(\begin{array}{ccc} -a & r & 0 \\[4pt] a & -d-r & 0 \\[4pt] 0 & d & 0 \end{array}\right)\left(\begin{array}{c} S \\[4pt] I \\[4pt] D \end{array}\right) \nonumber

The eigenvalue equation for this system is

\lambda\left[\lambda^{2}+(a+r+d) \lambda+a d\right]=0 . \nonumber

The reader can find the solutions of this system and determine if this is a realistic model.
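
As a starting point, one can compute the eigenvalues numerically for sample rate constants; the values below are assumed (they match those suggested later in Problem 2.17).

```python
# Sketch: eigenvalues of the epidemic coefficient matrix for sample rates.
import numpy as np

a, r, d = 2.0, 1.0, 3.0     # infection, recovery, and death rates in 1/days (assumed)
M = np.array([[-a,      r, 0.0],
              [ a, -d - r, 0.0],
              [0.0,     d, 0.0]])
print(np.linalg.eigvals(M))
# One eigenvalue is zero; the other two come from lambda^2 + (a + r + d)*lambda + a*d = 0.
```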

Appendix: Diagonalization and Linear Systems

As we have seen, the matrix formulation for linear systems can be powerful, especially for n differential equations involving n unknown functions. Our ability to proceed towards solutions depended upon the solution of eigenvalue problems. However, in the case of repeated eigenvalues we saw some additional complications. This all depends deeply on the background linear algebra. Namely, we relied on being able to diagonalize the given coefficient matrix. In this section we will discuss the limitations of diagonalization and introduce the Jordan canonical form.

We begin with the notion of similarity. Matrix A is similar to matrix B if and only if there exists a nonsingular matrix P such that

B=P^{-1} A P . \nonumber

Recall that a nonsingular matrix has a nonzero determinant and is invertible.

We note that the similarity relation is an equivalence relation. Namely, it satisfies the following

  1. A is similar to itself.
  2. If A is similar to B, then B is similar to A.
  3. If A is similar to B and B is similar to C, then A is similar to C.

Also, if A is similar to B, then they have the same eigenvalues. This follows from a simple computation of the eigenvalue equation. Namely,

\begin{aligned} 0 &=\operatorname{det}(B-\lambda I) \\[4pt] &=\operatorname{det}\left(P^{-1} A P-\lambda P^{-1} I P\right) \\[4pt] &=\operatorname{det}(P)^{-1} \operatorname{det}(A-\lambda I) \operatorname{det}(P) \\[4pt] &=\operatorname{det}(A-\lambda I) \end{aligned} \nonumber

Therefore, \operatorname{det}(A-\lambda I)=0 and \lambda is an eigenvalue of both A and B.

An n \times n matrix A is diagonalizable if and only if A is similar to a diagonal matrix D; i.e., there exists a nonsingular matrix P such that

D=P^{-1} A P . \nonumber

One of the most important theorems in linear algebra is the Spectral Theorem. This theorem tells us when a matrix can be diagonalized. In fact, it goes beyond matrices to the diagonalization of linear operators. We learn in linear algebra that linear operators can be represented by matrices once we pick a particular representation basis. Diagonalization is simplest for finite dimensional vector spaces and requires some generalization for infinite dimensional vector spaces. Examples of operators to which the spectral theorem applies are self-adjoint operators (more generally, normal operators on Hilbert spaces). We will explore some of these ideas later in the course. The spectral theorem provides a canonical decomposition, called the spectral decomposition, or eigendecomposition, of the underlying vector space on which the operator acts.

The next theorem tells us how to diagonalize a matrix:

Theorem 2.23. Let A be an n \times n matrix. Then A is diagonalizable if and only if A has n linearly independent eigenvectors. If so, then

D=P^{-1} A P . \nonumber

If \left\{v_{1}, \ldots, v_{n}\right\} are the eigenvectors of A and \left\{\lambda_{1}, \ldots, \lambda_{n}\right\} are the corresponding eigenvalues, then v_{j} is the jth column of P and D_{j j}=\lambda_{j}.

A simpler determination results by noting

Theorem 2.24. Let A be an n \times n matrix with n real and distinct eigenvalues. Then A is diagonalizable.

Therefore, we need only look at the eigenvalues and determine diagonalizability. In fact, one also has from linear algebra the following result.

Theorem 2.25. Let A be an n \times n real symmetric matrix. Then A is diagonalizable.

Recall that a symmetric matrix is one whose transpose is the same as the matrix, or A_{i j}=A_{j i}.

Example 2.26. Consider the matrix

A=\left(\begin{array}{lll} 1 & 2 & 2 \\[4pt] 2 & 3 & 0 \\[4pt] 2 & 0 & 3 \end{array}\right) \nonumber

This is a real symmetric matrix. The characteristic polynomial is found to be

\operatorname{det}(A-\lambda I)=-(\lambda-5)(\lambda-3)(\lambda+1)=0 \nonumber

As before, we can determine the corresponding eigenvectors (for \lambda=-1,3,5, respectively) as

\left(\begin{array}{c} -2 \\[4pt] 1 \\[4pt] 1 \end{array}\right), \quad\left(\begin{array}{c} 0 \\[4pt] -1 \\[4pt] 1 \end{array}\right), \quad\left(\begin{array}{l} 1 \\[4pt] 1 \\[4pt] 1 \end{array}\right) \text {. } \nonumber

We can use these to construct the diagonalizing matrix P. Namely, we have

P^{-1} A P=\left(\begin{array}{ccc} -2 & 0 & 1 \\[4pt] 1 & -1 & 1 \\[4pt] 1 & 1 & 1 \end{array}\right)^{-1}\left(\begin{array}{lll} 1 & 2 & 2 \\[4pt] 2 & 3 & 0 \\[4pt] 2 & 0 & 3 \end{array}\right)\left(\begin{array}{ccc} -2 & 0 & 1 \\[4pt] 1 & -1 & 1 \\[4pt] 1 & 1 & 1 \end{array}\right)=\left(\begin{array}{ccc} -1 & 0 & 0 \\[4pt] 0 & 3 & 0 \\[4pt] 0 & 0 & 5 \end{array}\right) \nonumber
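
This computation is easy to verify numerically. The short sketch below forms P from the eigenvectors above and checks that P^{-1} A P is indeed \operatorname{diag}(-1,3,5).

```python
# Sketch: numerical check of the diagonalization in Example 2.26.
import numpy as np

A = np.array([[1.0, 2.0, 2.0],
              [2.0, 3.0, 0.0],
              [2.0, 0.0, 3.0]])
P = np.array([[-2.0,  0.0, 1.0],
              [ 1.0, -1.0, 1.0],
              [ 1.0,  1.0, 1.0]])

D = np.linalg.inv(P) @ A @ P
print(np.round(D, 10))          # diag(-1, 3, 5) up to round-off
print(np.linalg.eigvalsh(A))    # eigenvalues of the symmetric matrix A
```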

Now diagonalization is an important idea in solving linear systems of first order equations, as we have seen for simple systems. If our system is originally diagonal, that means our equations are completely uncoupled. Let our system take the form

\dfrac{d \mathbf{y}}{d t}=D \mathbf{y} \nonumber

where D is diagonal with entries \lambda_{i}, i=1, \ldots, n. The system of equations, y_{i}^{\prime}=\lambda_{i} y_{i}, has solutions

y_{i}(t)=c_{i} e^{\lambda_{i} t} \nonumber

Thus, it is easy to solve a diagonal system.

Let A be similar to this diagonal matrix. Then

\dfrac{d \mathbf{y}}{d t}=P^{-1} A P \mathbf{y} \nonumber

This can be rewritten as

\dfrac{d P \mathbf{y}}{d t}=A P \mathbf{y} \nonumber

Defining \mathbf{x}=P \mathbf{y}, we have

\dfrac{d \mathbf{x}}{d t}=A \mathbf{x} \nonumber

This simple derivation shows that if A is diagonalizable, then a transformation of the original system in \mathbf{x} to new coordinates, or a new basis, results in a simpler system in \mathbf{y}.

However, it is not always possible to diagonalize a given square matrix. This is because some matrices do not have enough linearly independent vectors, or we have repeated eigenvalues. However, we have the following theorem:

Theorem 2.27. Every n \times n matrix A is similar to a matrix of the form

J=\operatorname{diag}\left[J_{1}, J_{2}, \ldots, J_{n}\right] \nonumber

where

J_{i}=\left(\begin{array}{ccccc} \lambda_{i} & 1 & 0 & \cdots & 0 \\[4pt] 0 & \lambda_{i} & 1 & \cdots & 0 \\[4pt] \vdots & \ddots & \ddots & \ddots & \vdots \\[4pt] 0 & \cdots & 0 & \lambda_{i} & 1 \\[4pt] 0 & 0 & \cdots & 0 & \lambda_{i} \end{array}\right) \nonumber

We will not go into the details of how one finds this Jordan Canonical Form or prove the theorem. In practice you can use a computer algebra system to determine this and the similarity matrix. However, we would still need to know how to use it to solve our system of differential equations.

Example 2.28. Let’s consider a simple system with the 3 \times 3 Jordan block

A=\left(\begin{array}{lll} 2 & 1 & 0 \\[4pt] 0 & 2 & 1 \\[4pt] 0 & 0 & 2 \end{array}\right) \nonumber

The corresponding system of coupled first order differential equations takes the form

\begin{aligned} &\dfrac{d x_{1}}{d t}=2 x_{1}+x_{2}, \\[4pt] &\dfrac{d x_{2}}{d t}=2 x_{2}+x_{3}, \\[4pt] &\dfrac{d x_{3}}{d t}=2 x_{3} . \end{aligned} \nonumber

The last equation is simple to solve, giving x_{3}(t)=c_{3} e^{2 t}. Inserting into the second equation, you have

\dfrac{d x_{2}}{d t}=2 x_{2}+c_{3} e^{2 t} \nonumber

Using the integrating factor, e^{-2 t}, one can solve this equation to get x_{2}(t)=\left(c_{2}+c_{3} t\right) e^{2 t}. Similarly, one can solve the first equation to obtain x_{1}(t)=\left(c_{1}+c_{2} t+\dfrac{1}{2} c_{3} t^{2}\right) e^{2 t}.

This should remind you of a problem we had solved earlier leading to the generalized eigenvalue problem in (2.43). This suggests that there is a more general theory when there are multiple eigenvalues and relating to Jordan canonical forms.

Let’s write the solution we just obtained in vector form. We have

\mathbf{x}(t)=\left[c_{1}\left(\begin{array}{l} 1 \\[4pt] 0 \\[4pt] 0 \end{array}\right)+c_{2}\left(\begin{array}{l} t \\[4pt] 1 \\[4pt] 0 \end{array}\right)+c_{3}\left(\begin{array}{c} \dfrac{1}{2} t^{2} \\[4pt] t \\[4pt] 1 \end{array}\right)\right] e^{2 t} \nonumber

It looks like this solution is a linear combination of three linearly independent solutions,

\begin{aligned} &\mathbf{x}=\mathbf{v}_{1} e^{\lambda t} \\[4pt] &\mathbf{x}=\left(t \mathbf{v}_{1}+\mathbf{v}_{2}\right) e^{\lambda t} \\[4pt] &\mathbf{x}=\left(\dfrac{1}{2} t^{2} \mathbf{v}_{1}+t \mathbf{v}_{2}+\mathbf{v}_{3}\right) e^{\lambda t} \end{aligned} \nonumber

where \lambda=2 and the vectors satisfy the equations

\begin{aligned} &(A-\lambda I) \mathbf{v}_{1}=0 \\[4pt] &(A-\lambda I) \mathbf{v}_{2}=\mathbf{v}_{1} \\[4pt] &(A-\lambda I) \mathbf{v}_{3}=\mathbf{v}_{2} \end{aligned} \nonumber

and

\begin{aligned} (A-\lambda I) \mathbf{v}_{1} &=0 \\[4pt] (A-\lambda I)^{2} \mathbf{v}_{2} &=0 \\[4pt] (A-\lambda I)^{3} \mathbf{v}_{3} &=0 \end{aligned} \nonumber

It is easy to generalize this result to build linearly independent solutions corresponding to multiple roots (eigenvalues) of the characteristic equation.
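
One can also check the Jordan-block solution above against the matrix exponential, since x(t)=e^{A t} x(0) for any constant coefficient linear system. The sketch below uses arbitrary constants c_{1}, c_{2}, c_{3} as the initial values.

```python
# Sketch: compare the Jordan-block solution with the matrix exponential e^{At} x(0).
import numpy as np
from scipy.linalg import expm

A = np.array([[2.0, 1.0, 0.0],
              [0.0, 2.0, 1.0],
              [0.0, 0.0, 2.0]])

c1, c2, c3 = 1.0, -2.0, 3.0                 # arbitrary constants = initial values
x0 = np.array([c1, c2, c3])
t = 0.7

x_expm = expm(A * t) @ x0
x_formula = np.exp(2 * t) * np.array([c1 + c2 * t + 0.5 * c3 * t**2,
                                      c2 + c3 * t,
                                      c3])
print(np.allclose(x_expm, x_formula))       # True
```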

Problems

2.1. Consider the system

\begin{array}{r} x^{\prime}=-4 x-y \\[4pt] y^{\prime}=x-2 y \end{array} \nonumber

a. Determine the second order differential equation satisfied by x(t).

b. Solve the differential equation for x(t).

c. Using this solution, find y(t).

d. Verify your solutions for x(t) and y(t).

e. Find a particular solution to the system given the initial conditions x(0)= 1 and y(0)=0.

2.2. Consider the following systems. Determine the families of orbits for each system and sketch several orbits in the phase plane and classify them by their type (stable node, etc.)

a.

\begin{aligned} &x^{\prime}=3 x \\[4pt] &y^{\prime}=-2 y \end{aligned} \nonumber

b.

\begin{aligned} &x^{\prime}=-y \\[4pt] &y^{\prime}=-5 x \end{aligned} \nonumber

c.

\begin{aligned} &x^{\prime}=2 y \\[4pt] &y^{\prime}=-3 x \end{aligned} \nonumber

d.

\begin{aligned} &x^{\prime}=x-y \\[4pt] &y^{\prime}=y \end{aligned} \nonumber

e.

\begin{aligned} &x^{\prime}=2 x+3 y \\[4pt] &y^{\prime}=-3 x+2 y \end{aligned} \nonumber

2.3. Use the transformations relating polar and Cartesian coordinates to prove that

\dfrac{d \theta}{d t}=\dfrac{1}{r^{2}}\left[x \dfrac{d y}{d t}-y \dfrac{d x}{d t}\right] \nonumber

2.4. In Equation (2.34) the exponential of a matrix was defined.

a. Let

A=\left(\begin{array}{ll} 2 & 0 \\[4pt] 0 & 0 \end{array}\right) \nonumber

Compute e^{A}.

b. Give a definition of \cos A and compute \cos \left(\begin{array}{ll}1 & 0 \\[4pt] 0 & 2\end{array}\right) in simplest form.

c. Prove e^{P A P^{-1}}=P e^{A} P^{-1}.

2.5. Consider the general system

\begin{aligned} &x^{\prime}=a x+b y \\[4pt] &y^{\prime}=c x+d y . \end{aligned} \nonumber

Can one determine the family of trajectories for the general case? Recall, this means we have to solve the first order equation

\dfrac{d y}{d x}=\dfrac{c x+d y}{a x+b y} . \nonumber

[Actually, this equation is homogeneous of degree 0.] It can be written in the form \dfrac{d y}{d x}=F\left(\dfrac{y}{x}\right). For such equations, one can make the substitution z=\dfrac{y}{x}, or y(x)=x z(x), and obtain a separable equation for z.

a. Using the general system, show that z=z(x) satisfies an equation of the form

x \dfrac{d z}{d x}=F(z)-z . \nonumber

Identify the function F(z).

b. Use the equation for z(x) in part a to find the family of trajectories of the system

\begin{aligned} x^{\prime} &=x-y \\[4pt] y^{\prime} &=x+y . \end{aligned} \nonumber

First determine the appropriate F(z) and then solve the resulting separable equation as a relation between z and x. Then write the solution of the original equation in terms of x and y.

c. Use polar coordinates to describe the family of solutions obtained. You can rewrite the solution in polar coordinates and/or solve the system rewritten in polar coordinates.

2.6. Find the eigenvalue(s) and eigenvector(s) for the following:
a. \left(\begin{array}{ll}4 & 2 \\[4pt] 3 & 3\end{array}\right)
b. \left(\begin{array}{ll}3 & -5 \\[4pt] 1 & -1\end{array}\right)
c. \left(\begin{array}{ll}4 & 1 \\[4pt] 0 & 4\end{array}\right)
d. \left(\begin{array}{ccc}1 & -1 & 4 \\[4pt] 3 & 2 & -1 \\[4pt] 2 & 1 & -1\end{array}\right)

2.7. Consider the following systems. For each system determine the coefficient matrix. When possible, solve the eigenvalue problem for each matrix and use the eigenvalues and eigenvectors to provide solutions to the given systems. Finally, in the common cases which you investigated in Problem 2.2, make comparisons with your previous answers, such as what type of eigenvalues correspond to stable nodes.

a.

\begin{aligned} &x^{\prime}=3 x-y \\[4pt] &y^{\prime}=2 x-2 y \end{aligned} \nonumber

b.

\begin{aligned} &x^{\prime}=-y \\[4pt] &y^{\prime}=-5 x \end{aligned} \nonumber

c.

\begin{aligned} &x^{\prime}=x-y \\[4pt] &y^{\prime}=y \end{aligned} \nonumber

d.

\begin{aligned} &x^{\prime}=2 x+3 y \\[4pt] &y^{\prime}=-3 x+2 y \end{aligned} \nonumber

e.

\begin{aligned} &x^{\prime}=-4 x-y \\[4pt] &y^{\prime}=x-2 y . \end{aligned} \nonumber

f.

\begin{aligned} &x^{\prime}=x-y \\[4pt] &y^{\prime}=x+y \end{aligned} \nonumber

2.8. For each of the following matrices consider the system \mathbf{x}^{\prime}=A \mathbf{x} and

a. Find the fundamental solution matrix.

b. Find the principal solution matrix.

a.

A=\left(\begin{array}{ll} 1 & 1 \\[4pt] 4 & 1 \end{array}\right) \nonumber

b.

A=\left(\begin{array}{ll} 2 & 5 \\[4pt] 0 & 2 \end{array}\right) \nonumber

c.

A=\left(\begin{array}{cc} 4 & -13 \\[4pt] 2 & -6 \end{array}\right) \nonumber

d.

A=\left(\begin{array}{ccc} 1 & -1 & 4 \\[4pt] 3 & 2 & -1 \\[4pt] 2 & 1 & -1 \end{array}\right) \nonumber

2.9. For the following problems

  1. Rewrite the problem in matrix form.
  2. Find the fundamental matrix solution.
  3. Determine the general solution of the nonhomogeneous system.
  4. Find the principal matrix solution.
  5. Determine the particular solution of the initial value problem.

a. y^{\prime \prime}+y=2 \sin 3 x, \quad y(0)=2, \quad y^{\prime}(0)=0.

b. y^{\prime \prime}-3 y^{\prime}+2 y=20 e^{-2 x}, \quad y(0)=0, \quad y^{\prime}(0)=6.

2.10. Prove Equation (2.75)

\mathbf{x}(t)=\Psi(t) \mathbf{x}_{0}+\Psi(t) \int_{t_{0}}^{t} \Psi^{-1}(s) \mathbf{f}(s) d s \nonumber

starting with Equation (2.73).

2.11. Add a third spring connected to mass two in the coupled system shown in Figure 2.19 to a wall on the far right. Assume that the masses are the same and the springs are the same.

a. Model this system with a set of first order differential equations.

b. If the masses are all 2.0 \mathrm{~kg} and the spring constants are all 10.0 \mathrm{~N} / \mathrm{m}, then find the general solution for the system.

c. Move mass one to the left (of equilibrium) 10.0 \mathrm{~cm} and mass two to the right 5.0 \mathrm{~cm}. Let them go. Find the solution and plot it as a function of time. Where is each mass at 5.0 seconds?

2.12. Consider the series circuit in Figure 2.20 with L=1.00 \mathrm{H}, R=1.00 \times 10^{2} \Omega, C=1.00 \times 10^{-4} \mathrm{~F}, and V_{0}=1.00 \times 10^{3} \mathrm{~V} .

a. Set up the problem as a system of two first order differential equations for the charge and the current.

b. Suppose that no charge is present and no current is flowing at time t=0 when V_{0} is applied. Find the current and the charge on the capacitor as functions of time.

c. Plot your solutions and describe how the system behaves over time.

2.13. You live in a cabin in the mountains and you would like to provide yourself with water from a water tank that is 25 feet above the level of the pipe going into the cabin. [See Figure 2.28.] The tank is filled from an aquifer 125 \mathrm{ft} below the surface, with the water being pumped at a maximum rate of 7 gallons per minute. As this flow rate is not sufficient to meet your daily needs, you would like to store water in the tank and have gravity supply the needed pressure. So, you design a cylindrical tank that is 35 \mathrm{ft} high and has a 10 \mathrm{ft} diameter. The water then flows through a pipe at the bottom of the tank. You are interested in the height h of the water at time t. This in turn will allow you to figure out the water pressure.

image
Figure 2.28. A water tank problem in the mountains.

First, the differential equation governing the flow of water from a tank through an orifice is given as

\dfrac{d h}{d t}=\dfrac{K-\alpha a \sqrt{2 g h}}{A} \nonumber

Here K is the rate at which water is being pumped into the top of the tank. A is the cross sectional area of the tank. \alpha is called the contraction coefficient, which measures the flow through the orifice of cross section a. We will assume that \alpha=0.63 and that the water enters a 6 in diameter PVC pipe.

a. Assuming that the water tank is initially full, find the minimum flow rate in the system during the first two hours.

b. What is the minimum water pressure during the first two hours? Namely, what is the gauge pressure at the house? Note that \Delta P=\rho g H, where \rho is the water density and H is the total height of the fluid (tank plus vertical pipe). Note that \rho g=0.434 psi (pounds per square inch).

c. How long will it take for the tank to drain to 10 \mathrm{ft} above the base of the tank?

Other information you may need is 1 gallon =231 \mathrm{in}^{3} and g=32.2 \mathrm{ft} / \mathrm{s}^{2}.

2.14. Initially a 200 gallon tank is filled with pure water. At time t=0 a salt concentration with 3 pounds of salt per gallon is added to the container at the rate of 4 gallons per minute, and the well-stirred mixture is drained from the container at the same rate.

a. Find the number of pounds of salt in the container as a function of time.

b. How many minutes does it take for the concentration to reach 2 pounds per gallon?

c. What does the concentration in the container approach for large values of time? Does this agree with your intuition?

d. Assuming that the tank holds much more than 200 gallons, and everything is the same except that the mixture is drained at 3 gallons per minute, what would the answers to parts a and b become?

2.15. You make two gallons of chili for a party. The recipe calls for two teaspoons of hot sauce per gallon, but you had accidentally put in two tablespoons per gallon. You decide to feed your guests the chili anyway. Assume that the guests take 1 \mathrm{cup} / \mathrm{min} of chili and you replace what was taken with beans and tomatoes without any hot sauce. [1 gal =16 cups and 1 \mathrm{~Tb}=3 \mathrm{tsp} .]

a. Write down the differential equation and initial condition for the amount of hot sauce as a function of time in this mixture-type problem.

b. Solve this initial value problem.

c. How long will it take to get the chili back to the recipe’s suggested concentration?

2.16. Consider the chemical reaction leading to the system in (2.111). Let the rate constants be k_{1}=0.20 \mathrm{~ms}^{-1}, k_{2}=0.05 \mathrm{~ms}^{-1}, and k_{3}=0.10 \mathrm{~ms}^{-1}. What do the eigenvalues of the coefficient matrix say about the behavior of the system? Find the solution of the system assuming [A](0)=A_{0}=1.0 \mu \mathrm{mol},[B](0)=0, and [C](0)=0. Plot the solutions for t=0.0 to 50.0 \mathrm{~ms} and describe what is happening over this time.

2.17. Consider the epidemic model leading to the system in (2.112). Choose the constants as a=2.0 days ^{-1}, d=3.0 days ^{-1}, and r=1.0 days ^{-1}. What are the eigenvalues of the coefficient matrix? Find the solution of the system assuming an initial population of 1,000 and one infected individual. Plot the solutions for t=0.0 to 5.0 days and describe what is happening over this time. Is this model realistic?

Nonlinear Systems

Introduction

Most of your studies of differential equations to date have been the study of linear differential equations and common methods for solving them. However, the real world is very nonlinear. So, why study linear equations? Because they are more readily solved. As you may recall, we can use the property of linear superposition of solutions of linear differential equations to obtain general solutions. We will see that we can sometimes approximate the solutions of nonlinear systems with linear systems in small regions of phase space.

In general, nonlinear equations cannot be solved to obtain general solutions. However, we can often investigate the behavior of the solutions without actually being able to find simple expressions in terms of elementary functions. When we want to follow the evolution of these solutions, we resort to numerically solving our differential equations. Such numerical methods need to be executed with care and there are many techniques that can be used. We will not go into these techniques in this course. However, we can make use of computer algebra systems, or computer programs, already developed for obtaining such solutions.

Nonlinear problems occur naturally. We will see problems from many of the same fields we explored in Section 2.9. One example is that of population dynamics. Typically, we have a certain population, y(t), and the differential equation governing the growth behavior of this population is developed in a manner similar to that used previously for mixing problems. We note that the rate of change of the population is given by the Rate In minus the Rate Out. The Rate In is given by the number of the species born per unit time. The Rate Out is given by the number that die per unit time.

A simple population model can be obtained if one assumes that these rates are linear in the population. Thus, we assume that the Rate In =b y and the Rate Out =m y. Here we have denoted the birth rate as b and the mortality rate as m. This gives the rate of change of population as

\dfrac{d y}{d t}=b y-m y \nonumber

Generally, these rates could depend upon time. In the case that they are both constant rates, we can define k=b-m and we obtain the familiar exponential model:

\dfrac{d y}{d t}=k y . \nonumber

This is easily solved and one obtains exponential growth (k>0) or decay (k<0). This model has been named after Malthus^{1}, a clergyman who used this model to warn of the impending doom of the human race if its reproductive practices continued.

However, when populations get large enough, there is competition for resources, such as space and food, which can lead to a higher mortality rate. Thus, the mortality rate may be a function of the population size, m=m(y). The simplest model would be a linear dependence, m=\tilde{m}+c y. Then, with k=b-\tilde{m}, the previous exponential model takes the form

\dfrac{d y}{d t}=k y-c y^{2} \nonumber

This is known as the logistic model of population growth. Typically, c is small and the added nonlinear term does not really kick in until the population gets large enough.

While one can solve this particular equation, it is instructive to study the qualitative behavior of the solutions without actually writing down the explicit solutions. Such methods are useful for more difficult nonlinear equations. We will investigate some simple first order equations in the next section. In the following section we present the analytic solution for completeness.

We will resume our studies of systems of equations and various applications throughout the rest of this chapter. We will see that we can get quite a bit of information about the behavior of solutions by using some of our earlier methods for linear systems.

Autonomous First Order Equations

In this section we will review the techniques for studying the stability of nonlinear first order autonomous equations. We will then extend this study to looking at families of first order equations which are connected through a parameter.

Recall that a first order autonomous equation is given in the form

{ }^{1} Malthus, Thomas Robert. An Essay on the Principle of Population. Library of Economics and Liberty. Retrieved August 2, 2007 from the World Wide Web: http://www.econlib.org/library/Malthus/malPop1.html

\dfrac{d y}{d t}=f(y) . \nonumber

We will assume that f and \dfrac{\partial f}{\partial y} are continuous functions of y, so that we know that solutions of initial value problems exist and are unique.

We will recall the qualitative methods for studying autonomous equations by considering the example

\dfrac{d y}{d t}=y-y^{2} . \nonumber

This is just an example of a logistic equation.

First, one determines the equilibrium, or constant, solutions given by y^{\prime}= 0 . For this case, we have y-y^{2}=0. So, the equilibrium solutions are y=0 and y=1. Sketching these solutions, we divide the t y-plane into three regions. Solutions that originate in one of these regions at t=t_{0} will remain in that region for all t>t_{0} since solutions cannot intersect. [Note that if two solutions intersect then they have common values y_{1} at time t_{1}. Using this information, we could set up an initial value problem for which the initial condition is y\left(t_{1}\right)=y_{1}. Since the two different solutions intersect at this point in the phase plane, we would have an initial value problem with two different solutions corresponding to the same initial condition. This contradicts the uniqueness assumption stated above. We will leave the reader to explore this further in the homework.]

Next, we determine the behavior of solutions in the three regions. Noting that d y / d t gives the slope of any solution in the plane, then we find that the solutions are monotonic in each region. Namely, in regions where d y / d t>0, we have monotonically increasing functions. We determine this from the right side of our equation.

For example, in this problem y-y^{2}>0 only for the middle region and y-y^{2}<0 for the other two regions. Thus, the slope is positive in the middle region, giving a rising solution as shown in Figure 3.1. Note that this solution does not cross the equilibrium solutions. Similar statements can be made about the solutions in the other regions.

We further note that the solutions on either side of y=1 tend to approach this equilibrium solution for large values of t. In fact, no matter how close one is to y=1, eventually one will approach this solution as t \rightarrow \infty. So, the equilibrium solution is a stable solution. Similarly, we see that y=0 is an unstable equilibrium solution.

If we are only interested in the behavior of the equilibrium solutions, we could just construct a phase line. In Figure 3.2 we place a vertical line to the right of the t y-plane plot. On this line one first places dots at the corresponding equilibrium solutions and labels the solutions. These points at the equilibrium solutions are end points for three intervals. In each interval one then places arrows pointing upward (downward) indicating solutions with positive (negative) slopes. Looking at the phase line one can now determine if a given equilibrium is stable (arrows pointing towards the point) or unstable

image
Figure 3.1. Representative solution behavior for y^{\prime}=y-y^{2}.

(arrows pointing away from the point). In Figure 3.3 we draw the final phase line by itself.

image
Figure 3.2. Representative solution behavior and phase line for y^{\prime}=y-y^{2}.

Solution of the Logistic Equation

We have seen that one does not need an explicit solution of the logistic equation (3.2) in order to study the behavior of its solutions. However, the logistic equation is an example of a nonlinear first order equation that is solvable. It is an example of a Riccati equation.

The general form of the Riccati equation is

image
Figure 3.3. Phase line for y^{\prime}=y-y^{2}.

\dfrac{d y}{d t}=a(t)+b(t) y+c(t) y^{2} \nonumber

As long as c(t) \neq 0, this equation can be reduced to a second order linear differential equation through the transformation

y(t)=-\dfrac{1}{c(t)} \dfrac{\dot{x}(t)}{x(t)} . \nonumber

We will demonstrate this using the simple case of the logistic equation,

\dfrac{d y}{d t}=k y-c y^{2} . \nonumber

We let

y(t)=\dfrac{1}{c} \dfrac{\dot{x}}{x} \nonumber

Then

\begin{aligned} \dfrac{d y}{d t} &=\dfrac{1}{c}\left[\dfrac{\ddot{x}}{x}-\left(\dfrac{\dot{x}}{x}\right)^{2}\right] \\[4pt] &=\dfrac{1}{c}\left[\dfrac{\ddot{x}}{x}-(c y)^{2}\right] \\[4pt] &=\dfrac{1}{c} \dfrac{\ddot{x}}{x}-c y^{2} \end{aligned} \nonumber

Inserting this into the logistic equation (3.5), we have

\dfrac{1}{c} \dfrac{\ddot{x}}{x}-c y^{2}=k \dfrac{1}{c}\left(\dfrac{\dot{x}}{x}\right)-c y^{2}, \nonumber

or

\ddot{x}=k \dot{x} . \nonumber

This equation is readily solved to give

x(t)=A+B e^{k t} . \nonumber

Therefore, the solution of the logistic equation is

y(t)=\dfrac{1}{c} \dfrac{\dot{x}}{x}=\dfrac{k B e^{k t}}{c\left(A+B e^{k t}\right)} \nonumber

It appears that we have two arbitrary constants. But, we started out with a first order differential equation and expect only one arbitrary constant. However, we can resolve this by dividing the numerator and denominator by k B e^{k t} and defining C=\dfrac{A}{B}. Then we have

y(t)=\dfrac{k / c}{1+C e^{-k t}}, \nonumber

showing that there really is only one arbitrary constant in the solution.

We should note that this is not the only way to obtain the solution to the logistic equation, though it does provide an introduction to Riccati equations. A more direct approach would be to use separation of variables on the logistic equation. The reader should verify this.
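
As a consistency check, the closed-form solution can be compared with a direct numerical integration of the logistic equation. The parameter values and the initial condition in the sketch below are assumed for illustration.

```python
# Sketch: compare y(t) = (k/c)/(1 + C*exp(-k*t)) with a numerical solution of
# y' = k*y - c*y^2. Parameters and the initial condition are assumed.
import numpy as np
from scipy.integrate import solve_ivp

k, c = 1.0, 0.5
y0 = 0.1
Cconst = (k / c) / y0 - 1.0        # fix C from the initial condition y(0) = y0

exact = lambda t: (k / c) / (1 + Cconst * np.exp(-k * t))
sol = solve_ivp(lambda t, y: k * y[0] - c * y[0] ** 2, (0.0, 10.0), [y0], rtol=1e-9)
print(abs(sol.y[0, -1] - exact(sol.t[-1])))   # small; both approach k/c = 2
```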

Bifurcations for First Order Equations

In this section we introduce families of first order differential equations of the form

\dfrac{d y}{d t}=f(y ; \mu) . \nonumber

Here \mu is a parameter that we can change and then observe the resulting effects on the behaviors of the solutions of the differential equation. When a small change in the parameter leads to large changes in the behavior of the solution, then the system is said to undergo a bifurcation. We will turn to some generic examples, leading to special bifurcations of first order autonomous differential equations.

Example 3.1. y^{\prime}=y^{2}-\mu.

First note that equilibrium solutions occur for y^{2}=\mu. In this problem, there are three cases to consider.

  1. \mu>0.

In this case there are two real solutions, y=\pm \sqrt{\mu}. Note that y^{2}-\mu<0 for |y|<\sqrt{\mu}. So, we have the left phase line in Figure 3.4.

  2. \mu=0.

There is only one equilibrium point at y=0. The equation becomes y^{\prime}=y^{2}. It is obvious that the right side of this equation is never negative. So, the phase line is shown as the middle line in Figure 3.4.

  3. \mu<0.

In this case there are no equilibrium solutions. Since y^{2}-\mu>0, the slopes for all solutions are positive as indicated by the last phase line in Figure 3.4

image
Figure 3.4. Phase lines for y^{\prime}=y^{2}-\mu. On the left \mu>0 and on the right \mu<0.

We can combine these results into one diagram known as a bifurcation diagram. We plot the equilibrium solutions y vs \mu. We begin by lining up the phase lines for various \mu ’s. We display these in Figure 3.5. Note the pattern of equilibrium points satisfies \mu=y^{2}, as it should. This is easily seen to be a parabolic curve. The upper branch of this curve is a collection of unstable equilibria and the bottom is a stable branch. So, we can dispose of the phase lines and just keep the equilibria. However, we will draw the unstable branch as a dashed line and the stable branch as a solid line.

The bifurcation diagram is displayed in Figure 3.6. This type of bifurcation is called a saddle-node bifurcation. The point \mu=0 at which the behavior changes is called the bifurcation point. As \mu goes from negative to positive, we go from having no equilibria to having one stable and one unstable equilibrium point.

Example 3.2. y^{\prime}=y^{2}-\mu y.

In this example we have two equilibrium points, y=0 and y=\mu. The behavior of the solutions depends upon the sign of y^{2}-\mu y=y(y-\mu). This leads to four cases with the indicated signs of the derivative.

  1. y>0, y-\mu>0 \Rightarrow y^{\prime}>0.
  2. y<0, y-\mu>0 \Rightarrow y^{\prime}<0
  3. y>0, y-\mu<0 \Rightarrow y^{\prime}<0.
  4. y<0, y-\mu<0 \Rightarrow y^{\prime}>0.

The corresponding phase lines and superimposed bifurcation diagram are shown in 3.7. The bifurcation diagram is in Figure 3.8 and this is called a transcritical bifurcation.

image
Figure 3.5. The typical phase lines for y^{\prime}=y^{2}-\mu.
image
Figure 3.6. Bifurcation diagram for y^{\prime}=y^{2}-\mu. This is an example of a saddle-node bifurcation.
image
Figure 3.7. Collection of phase lines for y^{\prime}=y^{2}-\mu y.

Example 3.3. y^{\prime}=y^{3}-\mu y.

For this last example, we find from y^{3}-\mu y=y\left(y^{2}-\mu\right)=0 that there are two cases.

  1. \mu<0 In this case there is only one equilibrium point at y=0. For positive values of y we have that y^{\prime}>0 and for negative values of y we have that y^{\prime}<0. Therefore, this is an unstable equilibrium point.
image
Figure 3.8. Bifurcation diagram for y^{\prime}=y^{2}-\mu y. This is an example of a transcritical bifurcation.
  2. \mu>0 Here we have three equilibria, y=0, \pm \sqrt{\mu}. A careful investigation shows that y=0 is a stable equilibrium point and that the other two equilibria are unstable.

In Figure 3.9 we show the phase lines for these two cases. The corresponding bifurcation diagram is then sketched in Figure 3.10. For obvious reasons this has been labeled a pitchfork bifurcation.

image
Figure 3.9. The phase lines for y^{\prime}=y^{3}-\mu y. The left one corresponds to \mu<0 and the right phase line is for \mu>0.
image
Figure 3.10. Bifurcation diagram for y^{\prime}=y^{3}-\mu y. This is an example of a pitchfork bifurcation.

Nonlinear Pendulum

In this section we will introduce the nonlinear pendulum as our first example of periodic motion in a nonlinear system. Oscillations are important in many areas of physics. We have already seen the motion of a mass on a spring, leading to simple, damped, and forced harmonic motions. Later we will explore these effects on a simple nonlinear system. In this section we will introduce the nonlinear pendulum and determine its period of oscillation.

We begin by deriving the pendulum equation. The simple pendulum consists of a point mass m hanging on a string of length L from some support. [See Figure 3.11.] One pulls the mass back to some starting angle, \theta_{0}, and releases it. The goal is to find the angular position as a function of time, \theta(t).

image
Figure 3.11. A simple pendulum consists of a point mass m attached to a string of length L. It is released from an angle \theta_{0}.

There are a couple of derivations possible. We could either use Newton’s Second Law of Motion, F=m a, or its rotational analogue in terms of torque. We will use the former only to limit the amount of physics background needed.

There are two forces acting on the point mass, the weight and the tension in the string. The weight points downward and has a magnitude of m g, where g is the standard symbol for the acceleration due to gravity. At the surface of the earth we can take this to be 9.8 \mathrm{~m} / \mathrm{s}^{2} or 32.2 \mathrm{ft} / \mathrm{s}^{2}. In Figure 3.12 we show both the weight and the tension acting on the mass. The net force is also shown.

The tension balances the projection of the weight vector, leaving an unbalanced component of the weight in the direction of the motion. Thus, the magnitude of the sum of the forces is easily found from this unbalanced component as F=m g \sin \theta.

Newton’s Second Law of Motion tells us that the net force is the mass times the acceleration. So, we can write

m \ddot{x}=-m g \sin \theta . \nonumber

Next, we need to relate x and \theta. Here x is the distance traveled, which is the length of the arc traced out by our point mass. The arclength is related to the angle, provided the angle is measured in radians. Namely, x=r \theta for r=L. Thus, we can write

image
Figure 3.12. There are two forces acting on the mass, the weight m g and the tension T. The magnitude of the net force is found to be F=m g \sin \theta.

m L \ddot{\theta}=-m g \sin \theta \nonumber

Canceling the masses leads to the nonlinear pendulum equation

L \ddot{\theta}+g \sin \theta=0 . \nonumber

There are several variations of Equation (3.8) which will be used in this text. The first one is the linear pendulum. This is obtained by making a small angle approximation. For small angles we know that \sin \theta \approx \theta. Under this approximation (3.8) becomes

L \ddot{\theta}+g \theta=0 . \nonumber

We can also make the system more realistic by adding damping. This could be due to energy loss in the way the string is attached to the support or due to the drag on the mass, etc. Assuming that the damping is proportional to the angular velocity, we have equations for the damped nonlinear and damped linear pendula:

\begin{gathered} L \ddot{\theta}+b \dot{\theta}+g \sin \theta=0 . \\[4pt] L \ddot{\theta}+b \dot{\theta}+g \theta=0 . \end{gathered} \nonumber

Finally, we can add forcing. Imagine that the support is attached to a device to make the system oscillate horizontally at some frequency. Then we could have equations such as

L \ddot{\theta}+b \dot{\theta}+g \sin \theta=F \cos \omega t . \nonumber

We will look at these and other oscillation problems later in the exercises. These are summarized in the table below.

image

In Search of Solutions

Before returning to the study of the equilibrium solutions of the nonlinear pendulum, we will look at how far we can get in obtaining analytical solutions. First, we investigate the simple linear pendulum.

The linear pendulum equation (3.9) is a constant coefficient second order linear differential equation. The roots of the characteristic equation are r= \pm \sqrt{\dfrac{g}{L}} i. Thus, the general solution takes the form

\theta(t)=c_{1} \cos \left(\sqrt{\dfrac{g}{L}} t\right)+c_{2} \sin \left(\sqrt{\dfrac{g}{L}} t\right) \nonumber

We note that this is usually simplified by introducing the angular frequency

\omega \equiv \sqrt{\dfrac{g}{L}} . \nonumber

One consequence of this solution, which is used often in introductory physics, is an expression for the period of oscillation of a simple pendulum. Recall that the period is the time it takes to complete one cycle of the oscillation. The period is found to be

T=\dfrac{2 \pi}{\omega}=2 \pi \sqrt{\dfrac{L}{g}} \nonumber

This value for the period of a simple pendulum is based on the linear pendulum equation, which was derived assuming a small angle approximation. How good is this approximation? What is meant by a small angle? We recall the Taylor series approximation of \sin \theta about \theta=0 :

\sin \theta=\theta-\dfrac{\theta^{3}}{3 !}+\dfrac{\theta^{5}}{5 !}-\ldots \nonumber

One learns how to obtain a bound on the error made in truncating this series to one term in a numerical analysis course. Here we can simply plot the relative error, which is defined as Relative Error =\left|\dfrac{\sin \theta-\theta}{\sin \theta}\right| \times 100 \%.

A plot of the relative error is given in Figure 3.13. We note that a one percent relative error corresponds to about 0.24 radians, which is less than fourteen degrees. Further discussion on this is provided at the end of this section.

image
Figure 3.13. The relative error in percent when approximating \sin \theta by \theta.
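
For readers who wish to reproduce this estimate, a short Python sketch along the following lines (the one percent threshold and the angle range are the only inputs) locates the angle at which the relative error reaches one percent:

import numpy as np

theta = np.linspace(0.01, 0.5, 1000)   # angles in radians, avoiding theta = 0
rel_error = np.abs((np.sin(theta) - theta) / np.sin(theta)) * 100

# largest angle at which the relative error is still at most one percent
theta_1pct = theta[rel_error <= 1.0].max()
print(f"{theta_1pct:.2f} rad = {np.degrees(theta_1pct):.1f} degrees")

This gives roughly 0.24 radians, in agreement with the discussion above.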

We now turn to the nonlinear pendulum. We first rewrite Equation (3.8) in the simpler form

\ddot{\theta}+\omega^{2} \sin \theta=0 . \nonumber

We next employ a technique that is useful for equations of the form

\ddot{\theta}+F(\theta)=0 \nonumber

when it is easy to integrate the function F(\theta). Namely, we note that

\dfrac{d}{d t}\left[\dfrac{1}{2} \dot{\theta}^{2}+\int^{\theta(t)} F(\phi) d \phi\right]=[\ddot{\theta}+F(\theta)] \dot{\theta} \nonumber

For our problem, we multiply Equation (3.17) by \dot{\theta},

\ddot{\theta} \dot{\theta}+\omega^{2} \sin \theta \dot{\theta}=0 \nonumber

and note that the left side of this equation is a perfect derivative. Thus,

\dfrac{d}{d t}\left[\dfrac{1}{2} \dot{\theta}^{2}-\omega^{2} \cos \theta\right]=0 \nonumber

Therefore, the quantity in the brackets is a constant. So, we can write

\dfrac{1}{2} \dot{\theta}^{2}-\omega^{2} \cos \theta=c . \nonumber

Solving for \dot{\theta}, we obtain

\dfrac{d \theta}{d t}=\sqrt{2\left(c+\omega^{2} \cos \theta\right)} \nonumber

This equation is a separable first order equation and we can rearrange and integrate the terms to find that

t=\int d t=\int \dfrac{d \theta}{\sqrt{2\left(c+\omega^{2} \cos \theta\right)}} . \nonumber

Of course, one needs to be able to do the integral. When one gets a solution in this implicit form, one says that the problem has been solved by quadratures. Namely, the solution is given in terms of some integral. In the appendix to this chapter we show that this solution can be written in terms of elliptic integrals and derive corrections to the formula for the period of a pendulum.
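
Although the quadrature above is implicit, the conservation law behind it is easy to check numerically. The following Python sketch (with an assumed value \omega = 2 and initial angle \theta(0)=1, \dot{\theta}(0)=0) integrates \ddot{\theta}+\omega^{2} \sin \theta=0 and verifies that \dfrac{1}{2} \dot{\theta}^{2}-\omega^{2} \cos \theta stays essentially constant along the solution:

import numpy as np
from scipy.integrate import solve_ivp

omega = 2.0                       # assumed value of sqrt(g/L)

def pendulum(t, u):
    theta, v = u                  # v = d(theta)/dt
    return [v, -omega**2 * np.sin(theta)]

sol = solve_ivp(pendulum, (0, 20), [1.0, 0.0], rtol=1e-10, atol=1e-12, dense_output=True)
t = np.linspace(0, 20, 500)
theta, v = sol.sol(t)
c = 0.5 * v**2 - omega**2 * np.cos(theta)
print("variation in c over the run:", c.max() - c.min())   # should be tiny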

The Stability of Fixed Points in Nonlinear Systems

We are now interested in studying the stability of the equilibrium solutions of the nonlinear pendulum. Along the way we will develop some basic methods for studying the stability of equilibria in nonlinear systems.

We begin with the linear differential equation for damped oscillations as given earlier in Equation (3.9). In this case, we have a second order equation of the form

x^{\prime \prime}+b x^{\prime}+\omega^{2} x=0 . \nonumber

Using the methods of Chapter 2, this second order equation can be written as a system of two first order equations:

\begin{aligned} &x^{\prime}=y \\[4pt] &y^{\prime}=-b y-\omega^{2} x . \end{aligned} \nonumber

This system has only one equilibrium solution, x=0, y=0.

Turning to the damped nonlinear pendulum, we have the system

\begin{aligned} x^{\prime} &=y \\[4pt] y^{\prime} &=-b y-\omega^{2} \sin x . \end{aligned} \nonumber

This system also has the equilibrium solution x=0, y=0. However, there are actually an infinite number of equilibrium solutions. The equilibria are determined from y=0 and -b y-\omega^{2} \sin x=0. This implies that \sin x=0. There are an infinite number of solutions: x=n \pi, n=0, \pm 1, \pm 2, \ldots So, we have an infinite number of equilibria, (n \pi, 0), n=0, \pm 1, \pm 2, \ldots

Next, we need to determine their stability. To do this we need a more general theory for nonlinear systems. We begin with the n-dimensional system

\mathbf{x}^{\prime}=\mathbf{f}(\mathbf{x}), \quad \mathrm{x} \in \mathrm{R}^{n} \nonumber

Here \mathbf{f}: \mathrm{R}^{n} \rightarrow \mathrm{R}^{n}. We define fixed points, or equilibrium solutions, of this system as points \mathrm{x}^{*} satisfying \mathbf{f}\left(\mathrm{x}^{*}\right)=\mathbf{0}.

The stability in the neighborhood of fixed points can now be determined. We are interested in what happens to solutions of our system with initial conditions starting near a fixed point. We can represent a point near a fixed point in the form \mathbf{x}=\mathbf{x}^{*}+\boldsymbol{\xi}, where the length of \boldsymbol{\xi} gives an indication of how close we are to the fixed point. So, we consider that initially, |\boldsymbol{\xi}| \ll 1.

As the system evolves, \boldsymbol{\xi} will change. The change of \boldsymbol{\xi} in time is in turn governed by a system of equations. We can approximate this evolution as follows. First, we note that

\mathbf{x}^{\prime}=\boldsymbol{\xi}^{\prime} \nonumber

Next, we have that

\mathbf{f}(\mathbf{x})=\mathbf{f}\left(\mathbf{x}^{*}+\boldsymbol{\xi}\right) \nonumber

We can expand the right side about the fixed point using a multidimensional version of Taylor’s Theorem. Thus, we have that

\mathbf{f}\left(\mathbf{x}^{*}+\boldsymbol{\xi}\right)=\mathbf{f}\left(\mathbf{x}^{*}\right)+D \mathbf{f}\left(\mathbf{x}^{*}\right) \boldsymbol{\xi}+O\left(|\boldsymbol{\xi}|^{2}\right) \nonumber

Here Df is the Jacobian matrix, defined as

D \mathbf{f}=\left(\begin{array}{cccc} \dfrac{\partial f_{1}}{\partial x_{1}} & \dfrac{\partial f_{1}}{\partial x_{2}} & \cdots & \dfrac{\partial f_{1}}{\partial x_{n}} \\[4pt] \dfrac{\partial f_{2}}{\partial x_{1}} & \ddots & \ddots & \vdots \\[4pt] \vdots & \ddots & \ddots & \vdots \\[4pt] \dfrac{\partial f_{n}}{\partial x_{1}} & \cdots & \cdots & \dfrac{\partial f_{n}}{\partial x_{n}} \end{array}\right) \nonumber

Noting that \mathbf{f}\left(\mathbf{x}^{*}\right)=\mathbf{0}, we then have that system (3.22) becomes

\xi^{\prime} \approx D \mathbf{f}\left(\mathbf{x}^{*}\right) \boldsymbol{\xi} \nonumber

It is this equation which describes the behavior of the system near the fixed point. We say that system (3.22) has been linearized or that Equation (3.23) is the linearization of system (3.22).

Example 3.4. As an example of the application of this linearization, we look at the system

\begin{aligned} &x^{\prime}=-2 x-3 x y \\[4pt] &y^{\prime}=3 y-y^{2} \end{aligned} \nonumber

We first determine the fixed points:

\begin{aligned} &0=-2 x-3 x y=-x(2+3 y) \\[4pt] &0=3 y-y^{2}=y(3-y) \end{aligned} \nonumber

From the second equation, we have that either y=0 or y=3. The first equation then gives x=0 in either case. So, there are two fixed points: (0,0) and (0,3).

Next, we linearize about each fixed point separately. First, we write down the Jacobian matrix.

D \mathbf{f}(x, y)=\left(\begin{array}{cc} -2-3 y & -3 x \\[4pt] 0 & 3-2 y \end{array}\right) \nonumber

  1. Case I (0,0).

In this case we find that

D \mathbf{f}(0,0)=\left(\begin{array}{cc} -2 & 0 \\[4pt] 0 & 3 \end{array}\right) \nonumber

Therefore, the linearized equation becomes

\xi^{\prime}=\left(\begin{array}{cc} -2 & 0 \\[4pt] 0 & 3 \end{array}\right) \boldsymbol{\xi} \nonumber

This is equivalently written out as the system

\begin{aligned} &\xi_{1}^{\prime}=-2 \xi_{1} \\[4pt] &\xi_{2}^{\prime}=3 \xi_{2} \end{aligned} \nonumber

This is the linearized system about the origin. Note the similarity with the original system. We emphasize that the linearized equations are constant coefficient equations and we can use earlier matrix methods to determine the nature of the equilibrium point. The eigenvalues of the system are obviously \lambda=-2,3. Therefore, we have that the origin is a saddle point.

  2. Case II (0,3).

In this case we proceed as before. We write down the Jacobian matrix and look at its eigenvalues to determine the type of fixed point. So, we have that the Jacobian matrix is

D \mathbf{f}(0,3)=\left(\begin{array}{cc} -11 & 0 \\[4pt] 0 & -3 \end{array}\right) \nonumber

Here, we have the eigenvalues \lambda=-11,-3. So, this fixed point is a stable node. This analysis has given us a saddle and a stable node. We know what the behavior is like near each fixed point, but we have to resort to other means to say anything about the behavior far from these points. The phase portrait for this system is given in Figure 3.14. You should be able to find the saddle point and the node. Notice how solutions behave in regions far from these points.

image
Figure 3.14. Phase plane for the system x^{\prime}=-2 x-3 x y, y^{\prime}=3 y-y^{2}.
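
The computations in Example 3.4 can be reproduced symbolically. The following SymPy sketch builds the Jacobian matrix of this system and evaluates its eigenvalues at the two fixed points:

import sympy as sp

x, y = sp.symbols('x y')
f = sp.Matrix([-2*x - 3*x*y, 3*y - y**2])
J = f.jacobian([x, y])

for pt in [(0, 0), (0, 3)]:
    print(pt, J.subs({x: pt[0], y: pt[1]}).eigenvals())
# (0, 0): eigenvalues -2 and 3 (a saddle); (0, 3): eigenvalues -11 and -3 (a stable node)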

We can expect to be able to perform a linearization under general conditions. These are given in the Hartman-Grobman Theorem:

Theorem 3.5. A continuous map exists between the linear and nonlinear systems when D \mathbf{f}\left(\mathbf{x}^{*}\right) does not have any eigenvalues with zero real part.

Generally, there are several types of behavior that one can see in nonlinear systems. One can see sinks or sources, hyperbolic (saddle) points, elliptic points (centers) or foci. We have defined some of these for planar systems. In general, if at least two eigenvalues have real parts with opposite signs, then the fixed point is a hyperbolic point. If the real part of a nonzero eigenvalue is zero, then we have a center, or elliptic point.

Example 3.6. Return to the Nonlinear Pendulum

We are now ready to establish the behavior of the fixed points of the damped nonlinear pendulum in Equation (3.21). The system was

\begin{aligned} x^{\prime} &=y \\[4pt] y^{\prime} &=-b y-\omega^{2} \sin x \end{aligned} \nonumber

We found that there are an infinite number of fixed points at (n \pi, 0), n= 0, \pm 1, \pm 2, \ldots

We note that the Jacobian matrix is

D \mathbf{f}(x, y)=\left(\begin{array}{cc} 0 & 1 \\[4pt] -\omega^{2} \cos x & -b \end{array}\right) \text {. } \nonumber

Evaluating this at the fixed points, we find that

D \mathbf{f}(n \pi, 0)=\left(\begin{array}{cc} 0 & 1 \\[4pt] -\omega^{2} \cos n \pi & -b \end{array}\right)=\left(\begin{array}{cc} 0 & 1 \\[4pt] \omega^{2}(-1)^{n+1} & -b \end{array}\right) \text {. } \nonumber

There are two cases to consider: n even and n odd. For the first case, we find the eigenvalue equation

\lambda^{2}+b \lambda+\omega^{2}=0 \nonumber

This has the roots

\lambda=\dfrac{-b \pm \sqrt{b^{2}-4 \omega^{2}}}{2} \nonumber

For b^{2}<4 \omega^{2}, we have two complex conjugate roots with a negative real part. Thus, we have stable foci for even n values. If there is no damping, then we obtain centers.

In the second case, n odd, we have that

\lambda^{2}+b \lambda-\omega^{2}=0 \nonumber

In this case we find

\lambda=\dfrac{-b \pm \sqrt{b^{2}+4 \omega^{2}}}{2} . \nonumber

Since b^{2}+4 \omega^{2}>b^{2}, these roots will be real with opposite signs. Thus, we have hyperbolic points, or saddles.
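
A quick numerical check of this classification, with assumed sample values b = 0.5 and \omega = 2, is given by the following Python sketch, which evaluates the Jacobian at representative even and odd fixed points:

import numpy as np

b, omega = 0.5, 2.0               # assumed sample values
for n in (0, 1):                  # even and odd representatives
    J = np.array([[0.0, 1.0],
                  [-omega**2 * np.cos(n * np.pi), -b]])
    print("n =", n, "eigenvalues:", np.linalg.eigvals(J))
# n = 0: complex pair with negative real part (stable focus)
# n = 1: real eigenvalues of opposite sign (saddle)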

In Figure 3.15 we show the phase plane for the undamped nonlinear pendulum. We see that we have a mixture of centers and saddles. There are orbits for which there is periodic motion. At \theta=\pi the behavior is unstable, since it is difficult to keep the mass balanced vertically; this equilibrium makes more physical sense if we replace the string by a massless rod. There are also unbounded orbits, passing through all of the angles. These correspond to the mass spinning around the pivot in one direction forever. We have indicated in the figure solution curves with the initial conditions \left(x_{0}, y_{0}\right)=(0,3),(0,2),(0,1),(5,1).

When there is damping, we see that we can have a variety of other behaviors, as seen in Figure 3.16. In particular, energy loss leads to the mass settling around one of the stable fixed points. This helps explain why there are an infinite number of equilibria, even though physically the mass traces out a bounded set of Cartesian points. We have indicated in Figure 3.16 solution curves with the initial conditions \left(x_{0}, y_{0}\right)=(0,3),(0,2),(0,1),(5,1).

image
Figure 3.15. Phase plane for the undamped nonlinear pendulum. Solution curves are shown for initial conditions \left(x_{0}, y_{0}\right)=(0,3),(0,2),(0,1),(5,1).

Nonlinear Population Models

We have already encountered several models of population dynamics. Of course, one could dream up several other examples. There are two standard types of models: predator-prey and competing species. In the predator-prey model, one typically has one species, the predator, feeding on the other, the prey. We will look at the standard Lotka-Volterra model in this section. The competing species model looks similar, except there are a few sign changes, since one species is not feeding on the other. Also, we can build logistic terms into our model. We will save this latter type of model for the homework.

The Lotka-Volterra model takes the form

\begin{aligned} &\dot{x}=a x-b x y, \\[4pt] &\dot{y}=-d y+c x y \end{aligned} \nonumber

In this case, we can think of x as the population of rabbits (prey) and y as the population of foxes (predators). Choosing all constants to be positive, we can describe the terms.

  • ax: When left alone, the rabbit population will grow. Thus a is the natural growth rate without predators.
  • -d y : When there are no rabbits, the fox population should decay. Thus, the coefficient needs to be negative.
  • -b x y : We add a nonlinear term corresponding to the depletion of the rabbits when the foxes are around.
  • cxy: The more rabbits there are, the more food for the foxes. So, we add a nonlinear term giving rise to an increase in fox population.

image
Figure 3.16. Phase plane for the damped nonlinear pendulum. Solution curves are shown for initial conditions \left(x_{0}, y_{0}\right)=(0,3),(0,2),(0,1),(5,1).

The analysis of the Lotka-Volterra model begins with determining the fixed points. So, we have from Equation (3.34)

\begin{gathered} x(a-b y)=0, \\[4pt] y(-d+c x)=0 . \end{gathered} \nonumber

Therefore, the origin and \left(\dfrac{d}{c}, \dfrac{a}{b}\right) are the fixed points.

Next, we determine their stability, by linearization about the fixed points. We can use the Jacobian matrix, or we could just expand the right hand side of each equation in (3.34). The Jacobian matrix is D f(x, y)=\left(\begin{array}{cc}a-b y & -b x \\[4pt] c y & -d+c x\end{array}\right). Evaluating at each fixed point, we have

\begin{gathered} D f(0,0)=\left(\begin{array}{cc} a & 0 \\[4pt] 0 & -d \end{array}\right), \\[4pt] D f\left(\dfrac{d}{c}, \dfrac{a}{b}\right)=\left(\begin{array}{cc} 0 & -\dfrac{b d}{c} \\[4pt] \dfrac{a c}{b} & 0 \end{array}\right) . \end{gathered} \nonumber

The eigenvalues of (3.36) are \lambda=a,-d. So, the origin is a saddle point. The eigenvalues of (3.37) satisfy \lambda^{2}+a d=0. So, the other point is a center. In Figure 3.17 we show a sample direction field for the Lotka-Volterra system.

Another way to linearize is to expand the equations about the fixed points. Even though this is equivalent to computing the Jacobian matrix, it sometimes might be faster.

image
Figure 3.17. Phase plane for the Lotka-Volterra system given by \dot{x}=x-0.2 x y, \dot{y}= -y+0.2 x y. Solution curves are shown for initial conditions \left(x_{0}, y_{0}\right)=(8,3),(1,5).
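
The closed orbits about the center can be seen by integrating the system numerically. The following Python sketch uses the parameters of Figure 3.17 (a = d = 1, b = c = 0.2) and the same initial conditions:

import numpy as np
from scipy.integrate import solve_ivp
import matplotlib.pyplot as plt

def lv(t, u):
    x, y = u
    return [x - 0.2*x*y, -y + 0.2*x*y]

for x0, y0 in [(8, 3), (1, 5)]:          # initial conditions from Figure 3.17
    sol = solve_ivp(lv, (0, 30), [x0, y0], max_step=0.01)
    plt.plot(sol.y[0], sol.y[1])
plt.xlabel('x (prey)')
plt.ylabel('y (predators)')
plt.show()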

Limit Cycles

So far we have just been concerned with equilibrium solutions and their behavior. However, asymptotically stable fixed points are not the only attractors. There are other types of solutions, known as limit cycles, towards which a solution may tend. In this section we will look at some examples of these periodic solutions.

Such solutions are common in nature. Rayleigh investigated the problem

x^{\prime \prime}+c\left(\dfrac{1}{3}\left(x^{\prime}\right)^{2}-1\right) x^{\prime}+x=0 \nonumber

in the study of the vibrations of a violin string. Van der Pol studied an electrical circuit that models this behavior. Others have looked at biological systems, such as neural systems, and at chemical reactions, such as Michaelis-Menten kinetics and systems leading to chemical oscillations. One of the most important models in the historical study of dynamical systems is that of planetary motion and investigating the stability of planetary orbits. As is well known, these orbits are periodic.

Limit cycles are isolated periodic solutions towards which neighboring states might tend when stable. A key example exhibiting a limit cycle is given by the system

\begin{aligned} &x^{\prime}=\mu x-y-x\left(x^{2}+y^{2}\right) \\[4pt] &y^{\prime}=x+\mu y-y\left(x^{2}+y^{2}\right) \end{aligned} \nonumber

It is clear that the origin is a fixed point. The Jacobian matrix is given as

\operatorname{Df}(0,0)=\left(\begin{array}{cc} \mu & -1 \\[4pt] 1 & \mu \end{array}\right) \nonumber

The eigenvalues are found to be \lambda=\mu \pm i. For \mu=0 we have a center. For \mu<0 we have a stable spiral and for \mu>0 we have an unstable spiral. However, this spiral does not wander off to infinity. We see in Figure 3.18 that the equilibrium point is a spiral. However, in Figure 3.19 it is clear that the solution does not spiral out to infinity. It is bounded by a circle.

image
Figure 3.18. Phase plane for system (3.39) with \mu=0.4.
image
Figure 3.19. Phase plane for system (3.39) with \mu=0.4 showing that the inner spiral is bounded by a limit cycle.

One can actually find the radius of this circle. This requires rewriting the system in polar form. Recall from Chapter 2 that this is done using

\begin{gathered} r r^{\prime}=x x^{\prime}+y y^{\prime}, \\[4pt] r^{2} \theta^{\prime}=x y^{\prime}-y x^{\prime} . \end{gathered} \nonumber

Inserting the system (3.39) into these expressions, we have

r r^{\prime}=\mu r^{2}-r^{4}, \quad r^{2} \theta^{\prime}=r^{2}, \nonumber

r^{\prime}=\mu r-r^{3}, \theta^{\prime}=1 . \nonumber

Of course, for a circle r is constant, so we need to look at the equilibrium solutions of Equation (3.43). This amounts to solving \mu r-r^{3}=0 for r. The solutions of this equation are r=0, \pm \sqrt{\mu}. We need only keep the positive radius solution, r=\sqrt{\mu}. In Figures 3.18-3.19, \mu=0.4, so we expect a circle with r=\sqrt{0.4} \approx 0.63. The \theta equation just tells us that we follow the limit cycle in a counterclockwise direction.
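
This prediction is easy to test numerically. The sketch below integrates system (3.39) with \mu=0.4 from a point near the origin and reports the final radius, which should approach \sqrt{0.4} \approx 0.63:

import numpy as np
from scipy.integrate import solve_ivp

mu = 0.4

def rhs(t, u):
    x, y = u
    r2 = x**2 + y**2
    return [mu*x - y - x*r2, x + mu*y - y*r2]

sol = solve_ivp(rhs, (0, 100), [0.05, 0.0], rtol=1e-9)
print(np.hypot(sol.y[0, -1], sol.y[1, -1]), np.sqrt(mu))   # both about 0.632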

Limit cycles are not always circles. In Figures 3.20-3.21 we show the behavior of the Rayleigh system (3.38) for c=0.4 and c=2.0. In this case we see that solutions tend towards a noncircular limit cycle.

image
Figure 3.20. Phase plane for the Rayleigh system (3.38) with c=0.4.

The limit cycle for c=2.0 is shown in Figure 3.22.

image
Figure 3.21. Phase plane for the Rayleigh system (3.38) with c=2.0.
image
Figure 3.22. Phase plane for the Rayleigh system (3.44) with c=0.4.

Can one determine ahead of time if a given nonlinear system will have a limit cycle? In order to answer this question, we will introduce some definitions.

image
Figure 3.23. A sketch depicting the idea of trajectory, or orbit, passing through x.

We first describe different trajectories and families of trajectories. A flow on R^{2} is a function \phi that satisfies the following

  1. \phi(\mathbf{x}, t) is continuous in both arguments.
  2. \phi(\mathbf{x}, 0)=\mathbf{x} for all \mathbf{x} \in R^{2}
  3. \phi\left(\phi\left(\mathbf{x}, t_{1}\right), t_{2}\right)=\phi\left(\mathbf{x}, t_{1}+t_{2}\right).

The orbit, or trajectory, through \mathbf{x} is defined as \gamma=\{\phi(\mathbf{x}, t) \mid t \in I\}. In Figure 3.23 we demonstrate these properties. For t=0, \phi(\mathbf{x}, 0)=\mathbf{x}. Increasing t, one follows the trajectory until one reaches the point \phi\left(\mathbf{x}, t_{1}\right). Continuing for an additional time t_{2}, one is then at \phi\left(\phi\left(\mathbf{x}, t_{1}\right), t_{2}\right). By the third property, this is the same as going from \mathbf{x} to \phi\left(\mathbf{x}, t_{1}+t_{2}\right) for t=t_{1}+t_{2}.

Having defined the orbits, we need to define the asymptotic behavior of the orbit for both positive and negative large times. We define the positive semiorbit through \mathbf{x} as \gamma^{+}=\{\phi(\mathbf{x}, t) \mid t>0\}. The negative semiorbit through \mathbf{x} is defined as \gamma^{-}=\{\phi(\mathbf{x}, t) \mid t<0\}. Thus, we have \gamma=\gamma^{+} \cup \gamma^{-}.

The positive limit set, or \omega-limit set, of point \mathbf{x} is defined as

\Lambda^{+}=\left\{\mathbf{y} \mid \text { there exists a sequence of } t_{n} \rightarrow \infty \text { such that } \phi\left(\mathbf{x}, t_{n}\right) \rightarrow \mathbf{y}\right\} \nonumber

The \mathbf{y} ’s are referred to as \omega-limit points. This is shown in Figure 3.24.

image
Figure 3.24. A sketch depicting an \omega-limit set. Note that the orbit tends towards the set as t increases.
image
Figure 3.25. A sketch depicting an \alpha-limit set. Note that the orbit tends away from the set as t increases.

Similarly, we define the negative limit set, or \alpha-limit set, of the point \mathbf{x} as \Lambda^{-}=\left\{\mathbf{y} \mid\right. there exists a sequence of t_{n} \rightarrow-\infty such that \left.\phi\left(\mathbf{x}, t_{n}\right) \rightarrow \mathbf{y}\right\}

and the corresponding \mathbf{y} ’s are \alpha-limit points. This is shown in Figure 3.25.

There are several types of orbits that a system might possess. A cycle or periodic orbit is any closed orbit which is not an equilibrium point. A periodic orbit is stable if, for every neighborhood of the orbit, all nearby orbits stay inside the neighborhood. Otherwise, it is unstable. The orbit is asymptotically stable if all nearby orbits converge to the periodic orbit.

A limit cycle is a cycle which is the \alpha- or \omega-limit set of some trajectory other than the limit cycle. A limit cycle \Gamma is stable if \Lambda^{+}=\Gamma for all \mathbf{x} in some neighborhood of \Gamma. A limit cycle \Gamma is unstable if \Lambda^{-}=\Gamma for all \mathbf{x} in some neighborhood of \Gamma. Finally, a limit cycle is semistable if it is attracting on one side and repelling on the other. In the previous examples, we saw limit cycles that were stable. Figures 3.24 and 3.25 depict stable and unstable limit cycles, respectively.

We now state a theorem which describes the type of orbits we might find in our system.

Theorem 3.7. Poincaré-Bendixson Theorem. Let \gamma^{+} be contained in a bounded region in which there are finitely many critical points. Then \Lambda^{+} is either

  1. a single critical point;
  2. a single closed orbit;
  3. a set of critical points joined by heteroclinic orbits. [Compare Figures 3.26 and 3.27.]
image
Figure 3.26. A heteroclinic orbit connecting two critical points.

We are interested in determining when limit cycles may, or may not, exist. A consequence of the Poincaré-Bendixson Theorem is given by the following corollary.

Corollary Let D be a bounded closed set containing no critical points and suppose that \gamma^{+} \subset D. Then there exists a limit cycle contained in D.

More specific criteria allow us to determine if there is a limit cycle in a given region. These are given by Dulac’s Criteria and Bendixson’s Criteria.

image
Figure 3.27. A homoclinic orbit returning to the point it left.

Dulac’s Criteria Consider the autonomous planar system

x^{\prime}=f(x, y), \quad y^{\prime}=g(x, y) \nonumber

and a continuously differentiable function \psi defined on an annular region D contained in some open set. If

\dfrac{\partial}{\partial x}(\psi f)+\dfrac{\partial}{\partial y}(\psi g) \nonumber

does not change sign in D, then there is at most one limit cycle contained entirely in D.

Bendixson’s Criteria Consider the autonomous planar system

x^{\prime}=f(x, y), \quad y^{\prime}=g(x, y) \nonumber

defined on a simply connected domain D such that

\dfrac{\partial}{\partial x}(\psi f)+\dfrac{\partial}{\partial y}(\psi g) \neq 0 \nonumber

in D. Then there are no limit cycles entirely in D.

These are easily proved using Green’s Theorem in the plane. We prove Bendixson’s Criteria. Let \mathbf{f}=(f, g). Assume that \Gamma is a closed orbit lying in D. Let S be the interior of \Gamma. Then

\begin{aligned} \int_{S} \nabla \cdot \mathbf{f} d x d y &=\oint_{\Gamma}(f d y-g d x) \\[4pt] &=\int_{0}^{T}(f \dot{y}-g \dot{x}) d t \\[4pt] &=\int_{0}^{T}(f g-g f) d t=0 \end{aligned} \nonumber

So, if \nabla \cdot \mathbf{f} is not identically zero and does not change sign in S, then from the continuity of \nabla \cdot \mathbf{f} in S we have that the left side above is either positive or negative. Thus, we have a contradiction and there is no closed orbit lying in D.

Example 3.8. Consider the earlier example in (3.39) with \mu=1.

\begin{aligned} &x^{\prime}=x-y-x\left(x^{2}+y^{2}\right) \\[4pt] &y^{\prime}=x+y-y\left(x^{2}+y^{2}\right) . \end{aligned} \nonumber

We already know that a limit cycle exists at x^{2}+y^{2}=1. A simple computation gives that

\nabla \cdot \mathbf{f}=2-4 x^{2}-4 y^{2} \nonumber

For an arbitrary annulus a<x^{2}+y^{2}<b, we have

2-4 b<\nabla \cdot \mathbf{f}<2-4 a . \nonumber

For a=3 / 4 and b=5 / 4,-3<\nabla \cdot \mathbf{f}<-1. Thus, \nabla \cdot \mathbf{f}<0 in the annulus 3 / 4<x^{2}+y^{2}<5 / 4. Therefore, by Dulac’s Criteria there is at most one limit cycle in this annulus.

Example 3.9. Consider the system

\begin{aligned} &x^{\prime}=y \\[4pt] &y^{\prime}=-a x-b y+c x^{2}+d y^{2} . \end{aligned} \nonumber

Let \psi(x, y)=e^{-2 d x}. Then,

\dfrac{\partial}{\partial x}(\psi y)+\dfrac{\partial}{\partial y}\left(\psi\left(-a x-b y+c x^{2}+d y^{2}\right)\right)=-b e^{-2 d x} \neq 0 \nonumber

Assuming b \neq 0, we conclude by Bendixson’s Criteria that there are no limit cycles for this system.
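
Both divergence computations in Examples 3.8 and 3.9 can be verified symbolically, as in the following SymPy sketch:

import sympy as sp

x, y, a, b, c, d = sp.symbols('x y a b c d')

# Example 3.8: divergence of f for x' = x - y - x(x^2 + y^2), y' = x + y - y(x^2 + y^2)
f1 = x - y - x*(x**2 + y**2)
g1 = x + y - y*(x**2 + y**2)
print(sp.simplify(sp.diff(f1, x) + sp.diff(g1, y)))           # equals 2 - 4x^2 - 4y^2

# Example 3.9: divergence of psi*f with psi = exp(-2*d*x)
psi = sp.exp(-2*d*x)
f2 = y
g2 = -a*x - b*y + c*x**2 + d*y**2
print(sp.simplify(sp.diff(psi*f2, x) + sp.diff(psi*g2, y)))   # equals -b*exp(-2*d*x)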

Nonautonomous Nonlinear Systems

In this section we discuss nonautonomous systems. Recall that an autonomous system is one in which there is no explicit time dependence. A simple example of a nonautonomous system is the forced nonlinear pendulum given by the nonhomogeneous equation

\ddot{x}+\omega^{2} \sin x=f(t) . \nonumber

We can set this up as a system of two first order equations:

\begin{aligned} &\dot{x}=y \\[4pt] &\dot{y}=-\omega^{2} \sin x+f(t) \end{aligned} \nonumber

This system is not in a form for which we could use the earlier methods. Namely, it is a nonautonomous system. However, we introduce a new variable z(t)=t and turn it into an autonomous system in one more dimension. The new system takes the form

\begin{aligned} &\dot{x}=y \\[4pt] &\dot{y}=-\omega^{2} \sin x+f(z) \\[4pt] &\dot{z}=1 \end{aligned} \nonumber

This system is a three dimensional autonomous, possibly nonlinear, system and can be explored using our earlier methods.
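
For a concrete forcing, say f(t)=F \cos \omega t (an assumed choice, just for illustration), the augmented system can be integrated directly, as in this Python sketch:

import numpy as np
from scipy.integrate import solve_ivp

omega, F, w = 1.0, 0.5, 1.25      # assumed sample parameters

def rhs(t, u):
    x, y, z = u
    return [y, -omega**2 * np.sin(x) + F*np.cos(w*z), 1.0]

sol = solve_ivp(rhs, (0, 50), [0.0, 0.0, 0.0], max_step=0.01)
# sol.y[2] recovers t itself, confirming that z(t) = t simply plays the role of time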

A more interesting model is provided by the Duffing Equation. This equation models hard spring and soft spring oscillations. It also models a periodically forced beam as shown in Figure 3.28. It is of interest because it is a simple system which exhibits chaotic dynamics and will motivate us towards using new visualization methods for nonautonomous systems.

image
Figure 3.28. One model of the Duffing equation describes a periodically forced beam which interacts with two magnets.

The most general form of Duffing’s equation is given by

\ddot{x}+k \dot{x}+\left(\beta x^{3} \pm \omega_{0}^{2} x\right)=\Gamma \cos (\omega t+\phi) . \nonumber

This equation models hard spring (\beta>0) and soft spring (\beta<0) oscillations. However, we will use a simpler version of the Duffing equation:

\ddot{x}+k \dot{x}+x^{3}-x=\Gamma \cos \omega t . \nonumber

Let’s first look at the behavior of some of the orbits of the system as we vary the parameters. In Figures 3.29-3.31 we show some typical solution plots superimposed on the direction field. We start with the undamped (k=0) and unforced (\Gamma=0) Duffing equation,

\ddot{x}+x^{3}-x=0 . \nonumber

We can write this second order equation as the autonomous system

\begin{aligned} &\dot{x}=y \\[4pt] &\dot{y}=x\left(1-x^{2}\right) \end{aligned} \nonumber

We see there are three equilibrium points at (0,0),(\pm 1,0). In Figure 3.29 we plot several orbits for this system. We see that the three equilibrium points consist of two centers and a saddle.

image
Figure 3.29. Phase plane for the undamped, unforced Duffing equation (k=0, \Gamma=0).

We now turn on the damping. The system becomes

\begin{aligned} &\dot{x}=y \\[4pt] &\dot{y}=-k y+x\left(1-x^{2}\right) . \end{aligned} \nonumber

In Figure 3.30 we show what happens when k=0.1. These plots are reminiscent of the plots for the nonlinear pendulum; however, there are fewer equilibria. The centers become stable spirals.

Next we turn on the forcing to obtain a damped, forced Duffing equation. The system is now nonautonomous.

image
Figure 3.30. Phase plane for the unforced Duffing equation with k=0.1 and \Gamma=0.

\begin{aligned} &\dot{x}=y \\[4pt] &\dot{y}=-k y+x\left(1-x^{2}\right)+\Gamma \cos \omega t \end{aligned} \nonumber

In Figure 3.31 we show only one orbit with k=0.1, \Gamma=0.5, and \omega=1.25. The solution intersects itself and looks a bit messy. We can imagine what we would get if we added any more orbits. For completeness, we show in Figure 3.32 an example with four different orbits.

In cases for which one has periodic orbits, such as the Duffing equation, Poincaré introduced the notion of surfaces of section. One embeds the orbit in a higher dimensional space so that there are no self intersections, like those we saw in Figures 3.31 and 3.32. In Figure 3.33 we show an example in which a simple orbit periodically pierces a given surface.

In order to simplify the resulting pictures, one only plots the points at which the orbit pierces the surface as sketched in Figure 3.34. In practice, there is a natural frequency, such as \omega in the forced Duffing equation. Then, one plots points at times that are multiples of the period, T=\dfrac{2 \pi}{\omega}. In Figure 3.35 we show what the plot for one orbit would look like for the damped, unforced Duffing equation.
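
A surface of section plot of this kind can be produced with a few lines of Python. The sketch below (using the parameter values quoted in the figure captions, k=0.1, \Gamma=0.5, \omega=1.25) integrates the damped, forced Duffing system and records the orbit once every forcing period T=2\pi/\omega:

import numpy as np
from scipy.integrate import solve_ivp
import matplotlib.pyplot as plt

k, Gamma, w = 0.1, 0.5, 1.25

def duffing(t, u):
    x, y = u
    return [y, -k*y + x*(1 - x**2) + Gamma*np.cos(w*t)]

T = 2*np.pi/w
t_samples = np.arange(20, 520) * T          # skip the first twenty periods (transients)
sol = solve_ivp(duffing, (0, t_samples[-1]), [1.0, 0.5],
                t_eval=t_samples, rtol=1e-8, atol=1e-10)
plt.plot(sol.y[0], sol.y[1], '.', markersize=2)
plt.xlabel('x')
plt.ylabel('y')
plt.show()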

The more interesting case is when there is forcing and damping. In this case the surface of section plot is given in Figure 3.36. While this is not as busy as the solution plot in Figure 3.31, it still provides some interesting behavior. What one finds is what is called a strange attractor. Plotting many orbits, we find that after a long time, all of the orbits are attracted to a small region in the plane, much like a stable node attracts nearby orbits. However, this

image
Figure 3.31. Phase plane for the Duffing equation with k=0.1, \Gamma=0.5, and \omega=1.25. In this case we show only one orbit which was generated from the initial condition \left(x_{0}=1.0, \quad y_{0}=0.5\right).
image
Figure 3.32. Phase plane for the Duffing equation with k=0.1, \Gamma=0.5, and \omega=1.25. In this case four initial conditions were used to generate four orbits.
image
Figure 3.33. Poincaré’s surface of section. One notes each time the orbit pierces the surface.
image
Figure 3.34. As an orbit pierces the surface of section, one plots the point of intersection in that plane to produce the surface of section plot.

set consists of more than one point. Also, the flow on the attractor is chaotic in nature. Thus, points wander in an irregular way throughout the attractor. This is one of the interesting topics in chaos theory, and the whole theory of dynamical systems has only been touched upon in this text, leaving the reader to wander off into this fascinating field in further depth.

Maple Code for Phase Plane Plots

For reference, the plots in Figures 3.29 and 3.30 were generated in Maple using the following commands:

image
Figure 3.35. Poincaré’s surface of section plot for the damped, unforced Duffing equation.
image
Figure 3.36. Poincaré’s surface of section plot for the damped, forced Duffing equation.

This leads to what is known as a strange attractor.

image

The surface of section plots at the end of the last section were obtained using code from S. Lynch’s book Dynamical Systems with Applications Using Maple. The Maple code is given by

image
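
For readers without access to Maple, a rough Python/Matplotlib stand-in for the phase plane plots of Figures 3.29 and 3.30 is sketched below; the initial conditions are assumed, chosen only for illustration, and this is not the code used to generate the figures:

import numpy as np
from scipy.integrate import solve_ivp
import matplotlib.pyplot as plt

k = 0.1                                    # set k = 0 for the undamped case of Figure 3.29

def rhs(t, u):
    x, y = u
    return [y, -k*y + x*(1 - x**2)]

X, Y = np.meshgrid(np.linspace(-2, 2, 25), np.linspace(-2, 2, 25))
plt.streamplot(X, Y, Y, -k*Y + X*(1 - X**2), density=1.2)
for x0, y0 in [(0.5, 0.0), (1.5, 0.0), (0.0, 1.0)]:    # assumed initial conditions
    sol = solve_ivp(rhs, (0, 40), [x0, y0], max_step=0.01)
    plt.plot(sol.y[0], sol.y[1])
plt.xlabel('x')
plt.ylabel('y')
plt.show()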

Appendix: Period of the Nonlinear Pendulum

In Section 3.5.1 we saw that the solution of the nonlinear pendulum problem can be found up to quadrature. In fact, the integral in Equation (3.19) can be transformed into what is known as an elliptic integral of the first kind. We will rewrite our result and then use it to obtain an approximation to the period of oscillation of the nonlinear pendulum, leading to corrections to the linear result found earlier.

We will first rewrite the constant found in (3.18). This requires a little physics. The swinging of a mass on a string, assuming no energy loss at the pivot point, is a conservative process. Namely, the total mechanical energy is conserved. Thus, the total of the kinetic and gravitational potential energies is a constant. Noting that v=L \dot{\theta}, the kinetic energy of the mass on the string is given as

T=\dfrac{1}{2} m v^{2}=\dfrac{1}{2} m L^{2} \dot{\theta}^{2} . \nonumber

The potential energy is the gravitational potential energy. If we set the potential energy to zero at the bottom of the swing, then the potential energy is U=m g h, where h is the height that the mass is from the bottom of the swing. A little trigonometry gives that h=L(1-\cos \theta). This gives the potential energy as

U=m g L(1-\cos \theta) . \nonumber

So, the total mechanical energy is

E=\dfrac{1}{2} m L^{2} \theta^{\prime 2}+m g L(1-\cos \theta) . \nonumber

We note that a little rearranging shows that we can relate this to Equation (3.18)

\dfrac{1}{2}\left(\theta^{\prime}\right)^{2}-\omega^{2} \cos \theta=\dfrac{1}{m L^{2}} E-\omega^{2}=c . \nonumber

We can use Equation (3.56) to get a value for the total energy. At the top of the swing the mass is not moving, if only for a moment. Thus, the kinetic energy is zero and the total energy is pure potential energy. Letting \theta_{0} denote the angle at the highest position, we have that

E=m g L\left(1-\cos \theta_{0}\right)=m L^{2} \omega^{2}\left(1-\cos \theta_{0}\right) . \nonumber

Here we have used the relation g=L \omega^{2}.

Therefore, we have found that

\dfrac{1}{2} \dot{\theta}^{2}-\omega^{2} \cos \theta=-\omega^{2} \cos \theta_{0} . \nonumber

Using the half angle formula,

\sin ^{2} \dfrac{\theta}{2}=\dfrac{1}{2}(1-\cos \theta) \nonumber

we can rewrite Equation (3.57) as

\dfrac{1}{2} \dot{\theta}^{2}=2 \omega^{2}\left[\sin ^{2} \dfrac{\theta_{0}}{2}-\sin ^{2} \dfrac{\theta}{2}\right] \nonumber

Solving for \theta^{\prime}, we have

\dfrac{d \theta}{d t}=2 \omega\left[\sin ^{2} \dfrac{\theta_{0}}{2}-\sin ^{2} \dfrac{\theta}{2}\right]^{1 / 2} \nonumber

One can now apply separation of variables and obtain an integral similar to the solution we had obtained previously. Noting that a motion from \theta=0 to \theta=\theta_{0} is a quarter of a cycle, we have that

T=\dfrac{2}{\omega} \int_{0}^{\theta_{0}} \dfrac{d \phi}{\sqrt{\sin ^{2} \dfrac{\theta_{0}}{2}-\sin ^{2} \dfrac{\phi}{2}}} \nonumber

This result is not much different than our previous result, but we can now easily transform the integral into an elliptic integral. We define

z=\dfrac{\sin \dfrac{\theta}{2}}{\sin \dfrac{\theta_{0}}{2}} \nonumber

and

k=\sin \dfrac{\theta_{0}}{2} \nonumber

Then Equation (3.60) becomes

T=\dfrac{4}{\omega} \int_{0}^{1} \dfrac{d z}{\sqrt{\left(1-z^{2}\right)\left(1-k^{2} z^{2}\right)}} . \nonumber

This is done by noting that d z=\dfrac{1}{2 k} \cos \dfrac{\theta}{2} d \theta=\dfrac{1}{2 k}\left(1-k^{2} z^{2}\right)^{1 / 2} d \theta and that \sin ^{2} \dfrac{\theta_{0}}{2}-\sin ^{2} \dfrac{\theta}{2}=k^{2}\left(1-z^{2}\right). The integral in this result is an elliptic integral of the first kind. In particular, the elliptic integral of the first kind is defined

F(\phi, k) \equiv \int_{0}^{\phi} \dfrac{d \theta}{\sqrt{1-k^{2} \sin ^{2} \theta}}=\int_{0}^{\sin \phi} \dfrac{d z}{\sqrt{\left(1-z^{2}\right)\left(1-k^{2} z^{2}\right)}} . \nonumber

In some contexts, this is known as the incomplete elliptic integral of the first kind and K(k)=F\left(\dfrac{\pi}{2}, k\right) is called the complete elliptic integral of the first kind.

There are tables of values for elliptic integrals. Historically, that is how one found values of elliptic integrals. However, we now have access to computer algebra systems which can be used to compute values of such integrals. For small angles, we have that k is small. So, we can develop a series expansion for the period, T, for small k. This is done by first expanding

\left(1-k^{2} z^{2}\right)^{-1 / 2}=1+\dfrac{1}{2} k^{2} z^{2}+\dfrac{3}{8} k^{4} z^{4}+O\left((k z)^{6}\right) \nonumber

Substituting this in the integrand and integrating term by term, one finds that

T=2 \pi \sqrt{\dfrac{L}{g}}\left[1+\dfrac{1}{4} k^{2}+\dfrac{9}{64} k^{4}+\ldots\right] \nonumber

This expression gives further corrections to the linear result, which only provides the first term. In Figure 3.37 we show the relative errors incurred when keeping the k^{2} and k^{4} terms versus not keeping them. The reader is asked to explore this further in Problem 3.8.

image
Figure 3.37. The relative error in percent when approximating the exact period of a nonlinear pendulum with one, two, or three terms in Equation (3.62).
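
The comparison in Figure 3.37 is straightforward to reproduce. The Python sketch below evaluates the exact period T=\dfrac{4}{\omega} K(k) using SciPy (whose ellipk routine takes the parameter m=k^{2}) and compares it with the series above, using L=1.0 m and g=9.8 m/s²:

import numpy as np
from scipy.special import ellipk

L, g = 1.0, 9.8
omega = np.sqrt(g / L)
T0 = 2*np.pi*np.sqrt(L/g)                     # linear (small angle) period

for theta0 in np.radians([10, 45, 90, 150]):
    k = np.sin(theta0 / 2)
    T_exact = 4.0/omega * ellipk(k**2)        # note: ellipk expects m = k^2
    T_series = T0 * (1 + k**2/4 + 9*k**4/64)
    print(f"theta0 = {np.degrees(theta0):5.1f} deg, exact {T_exact:.4f} s, series {T_series:.4f} s")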

Problems

3.1. Find the equilibrium solutions and determine their stability for the following systems. For each case draw representative solutions and phase lines.
a. y^{\prime}=y^{2}-6 y-16.
b. y^{\prime}=\cos y.
c. y^{\prime}=y(y-2)(y+3).
d. y^{\prime}=y^{2}(y+1)(y-4).

3.2. For y^{\prime}=y-y^{2}, find the general solution corresponding to y(0)=y_{0}. Provide specific solutions for the following initial conditions and sketch them: a. y(0)=0.25, b. y(0)=1.5, and c. y(0)=-0.5.

3.3. For each problem determine equilibrium points, bifurcation points and construct a bifurcation diagram. Discuss the different behaviors in each system.
a. y^{\prime}=y-\mu y^{2}
b. y^{\prime}=y(\mu-y)(\mu-2 y)
c. x^{\prime}=\mu-x^{3}
d. x^{\prime}=x-\dfrac{\mu x}{1+x^{2}}

3.4. Consider the family of differential equations x^{\prime}=x^{3}+\delta x^{2}-\mu x.

a. Sketch a bifurcation diagram in the x \mu-plane for \delta=0.

b. Sketch a bifurcation diagram in the x \mu-plane for \delta>0. Hint: Pick a few values of \delta and \mu in order to get a feel for how this system behaves.

3.5. Consider the system

\begin{aligned} &x^{\prime}=-y+x\left[\mu-x^{2}-y^{2}\right], \\[4pt] &y^{\prime}=x+y\left[\mu-x^{2}-y^{2}\right] \end{aligned} \nonumber

Rewrite this system in polar form. Look at the behavior of the r equation and construct a bifurcation diagram in \mu r space. What might this diagram look like in the three dimensional \mu x y space? (Think about the symmetry in this problem.) This leads to what is called a Hopf bifurcation.

3.6. Find the fixed points of the following systems. Linearize the system about each fixed point and determine the nature and stability in the neighborhood of each fixed point, when possible. Verify your findings by plotting phase portraits using a computer.

a.

\begin{aligned} &x^{\prime}=x(100-x-2 y), \\[4pt] &y^{\prime}=y(150-x-6 y) \end{aligned} \nonumber

b.

\begin{aligned} &x^{\prime}=x+x^{3}, \\[4pt] &y^{\prime}=y+y^{3} \end{aligned} \nonumber

c.

\begin{aligned} &x^{\prime}=x-x^{2}+x y \\[4pt] &y^{\prime}=2 y-x y-6 y^{2} \end{aligned} \nonumber

d.

\begin{aligned} &x^{\prime}=-2 x y \\[4pt] &y^{\prime}=-x+y+x y-y^{3} . \end{aligned} \nonumber

3.7. Plot phase portraits for the Lienard system

\begin{aligned} &x^{\prime}=y-\mu\left(x^{3}-x\right) \\[4pt] &y^{\prime}=-x . \end{aligned} \nonumber

for a small and a not so small value of \mu. Describe what happens as one varies \mu.

3.8. Consider the period of a nonlinear pendulum. Let the length be L=1.0 \mathrm{m} and g=9.8 \mathrm{~m} / \mathrm{s}^{2}. Sketch T vs the initial angle \theta_{0} and compare the linear and nonlinear values for the period. For what angles can you use the linear approximation confidently?

3.9. Another population model is one in which species compete for resources, such as a limited food supply. Such a model is given by

\begin{aligned} &x^{\prime}=a x-b x^{2}-c x y \\[4pt] &y^{\prime}=d y-e y^{2}-f x y . \end{aligned} \nonumber

In this case, assume that all constants are positive.

a. Describe the effect/purpose of each term.

b. Find the fixed points of the model.

c. Linearize the system about each fixed point and determine the stability.

d. From the above, describe the types of solution behavior you might expect, in terms of the model.

3.10. Consider a model of a food chain of three species. Assume that each population on its own can be modeled by logistic growth. Let the species be labeled by x(t), y(t), and z(t). Assume that population x is at the bottom of the chain. That population will be depleted by population y. Population y is sustained by x ’s, but eaten by z ’s. A simple, but scaled, model for this system can be given by the system

\begin{aligned} &x^{\prime}=x(1-x)-x y \\[4pt] &y^{\prime}=y(1-y)+x y-y z \\[4pt] &z^{\prime}=z(1-z)+y z \end{aligned} \nonumber

a. Find the equilibrium points of the system.

b. Find the Jacobian matrix for the system and evaluate it at the equilibrium points.

c. Find the eigenvalues and eigenvectors.

d. Describe the solution behavior near each equilibrium point.

e. Which of these equilibria are important in the study of the population model? Describe the interactions of the species in the neighborhood of these point(s).

3.11. Show that the system x^{\prime}=x-y-x^{3}, y^{\prime}=x+y-y^{3}, has a unique limit cycle by picking an appropriate \psi(x, y) in Dulac’s Criteria.

Boundary Value Problems

Introduction

Until this point we have solved initial value problems. For an initial value problem one has to solve a differential equation subject to conditions on the unknown function and its derivatives at one value of the independent variable. For example, for x=x(t) we could have the initial value problem

x^{\prime \prime}+x=2, \quad x(0)=1, \quad x^{\prime}(0)=0 \nonumber

In the next chapters we will study boundary value problems and various tools for solving such problems. In this chapter we will motivate our interest in boundary value problems by looking into solving the one-dimensional heat equation, which is a partial differential equation. For the rest of the section, we will use this solution to show that in the background of our solution of boundary value problems is a structure based upon linear algebra and analysis, leading to the study of inner product spaces. Technically, we should be led to Hilbert spaces, which are complete inner product spaces.

For a boundary value problem, on the other hand, one has to solve a differential equation subject to conditions on the unknown function or its derivatives at more than one value of the independent variable. As an example, we have a slight modification of the above problem: Find the solution x=x(t) for 0 \leq t \leq 1 that satisfies the problem

x^{\prime \prime}+x=2, \quad x(0)=0, \quad x(1)=0 . \nonumber

Typically, initial value problems involve time dependent functions and boundary value problems are spatial. So, with an initial value problem one knows how a system evolves in terms of the differential equation and the state of the system at some fixed time. Then one seeks to determine the state of the system at a later time.

For boundary values problems, one knows how each point responds to its neighbors, but there are conditions that have to be satisfied at the endpoints. An example would be a horizontal beam supported at the ends, like a bridge. The shape of the beam under the influence of gravity, or other forces, would lead to a differential equation and the boundary conditions at the beam ends would affect the solution of the problem. There are also a variety of other types of boundary conditions. In the case of a beam, one end could be fixed and the other end could be free to move. We will explore the effects of different boundary value conditions in our discussions and exercises.

Let’s solve the above boundary value problem. As with initial value problems, we need to find the general solution and then apply any conditions that we may have. This is a nonhomogeneous differential equation, so we have that the solution is a sum of a solution of the homogeneous equation and a particular solution of the nonhomogeneous equation, x(t)=x_{h}(t)+x_{p}(t). The solution of x^{\prime \prime}+x=0 is easily found as

x_{h}(t)=c_{1} \cos t+c_{2} \sin t \nonumber

The particular solution is easily found using the Method of Undetermined Coefficients,

x_{p}(t)=2 \nonumber

Thus, the general solution is

x(t)=2+c_{1} \cos t+c_{2} \sin t . \nonumber

We now apply the boundary conditions and see if there are values of c_{1} and c_{2} that yield a solution to our problem. The first condition, x(0)=0, gives

0=2+c_{1} \nonumber

Thus, c_{1}=-2. Using this value for c_{1}, the second condition, x(1)=0, gives

0=2-2 \cos 1+c_{2} \sin 1 \nonumber

This yields

c_{2}=\dfrac{2(\cos 1-1)}{\sin 1} . \nonumber

We have found that there is a solution to the boundary value problem and it is given by

x(t)=2\left(1-\cos t+\dfrac{\cos 1-1}{\sin 1} \sin t\right) \nonumber
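
As a quick check, the following Python sketch solves the same boundary value problem numerically with SciPy’s solve_bvp and compares the result with the solution found above:

import numpy as np
from scipy.integrate import solve_bvp

def rhs(t, u):                    # u[0] = x, u[1] = x'
    return np.vstack([u[1], 2 - u[0]])

def bc(ua, ub):
    return np.array([ua[0], ub[0]])          # x(0) = 0, x(1) = 0

t = np.linspace(0, 1, 11)
sol = solve_bvp(rhs, bc, t, np.zeros((2, t.size)))

exact = 2*(1 - np.cos(t) + (np.cos(1) - 1)/np.sin(1)*np.sin(t))
print(np.max(np.abs(sol.sol(t)[0] - exact)))  # should be very small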

Boundary value problems arise in many physical systems, just as do many of the initial value problems we have seen. We will see in the next section that boundary value problems for ordinary differential equations often appear in the solution of partial differential equations.

Partial Differential Equations

In this section we will introduce some generic partial differential equations and see how the discussion of such equations leads naturally to the study of boundary value problems for ordinary differential equations. However, we will not derive the particular equations, leaving that to courses in differential equations, mathematical physics, etc.

For ordinary differential equations, the unknown functions are functions of a single variable, e.g., y=y(x). Partial differential equations are equations involving an unknown function of several variables, such as u=u(x, y), u=u(x, y, t), u=u(x, y, z, t), and its (partial) derivatives. Therefore, the derivatives are partial derivatives. We will use the standard notations u_{x}=\dfrac{\partial u}{\partial x}, u_{x x}=\dfrac{\partial^{2} u}{\partial x^{2}}, etc.

There are a few standard equations that one encounters. These can be studied in one to three dimensions and are all linear differential equations. A list is provided in Table 4.1. Here we have introduced the Laplacian operator, \nabla^{2} u=u_{x x}+u_{y y}+u_{z z}. Depending on the types of boundary conditions imposed and on the geometry of the system (rectangular, cylindrical, spherical, etc.), one encounters many interesting boundary value problems for ordinary differential equations.

Name | 2 Vars | 3D
Heat Equation | u_{t}=k u_{x x} | u_{t}=k \nabla^{2} u
Wave Equation | u_{t t}=c^{2} u_{x x} | u_{t t}=c^{2} \nabla^{2} u
Laplace’s Equation | u_{x x}+u_{y y}=0 | \nabla^{2} u=0
Poisson’s Equation | u_{x x}+u_{y y}=F(x, y) | \nabla^{2} u=F(x, y, z)
Schrödinger’s Equation | i u_{t}=u_{x x}+F(x, t) u | i u_{t}=\nabla^{2} u+F(x, y, z, t) u

Table 4.1. List of generic partial differential equations.

Let’s look at the heat equation in one dimension. This could describe the heat conduction in a thin insulated rod of length L. It could also describe the diffusion of pollutant in a long narrow stream, or the flow of traffic down a road. In problems involving diffusion processes, one instead calls this equation the diffusion equation.

A typical initial-boundary value problem for the heat equation would be that initially one has a temperature distribution u(x, 0)=f(x). Placing the bar in an ice bath and assuming the heat flow is only through the ends of the bar, one has the boundary conditions u(0, t)=0 and u(L, t)=0. Of course, we are dealing with Celsius temperatures and we assume there is plenty of ice to keep that temperature fixed at each end for all time. So, the problem one would need to solve is given as

\begin{aligned} &u_{t}=k u_{x x}, \quad 0<x<L, \quad t>0, \\[4pt] &u(0, t)=0, \quad u(L, t)=0, \quad t>0, \\[4pt] &u(x, 0)=f(x), \quad 0<x<L . \end{aligned} \nonumber

Another problem that will come up in later discussions is that of the vibrating string. A string of length L is stretched out horizontally with both ends fixed. Think of a violin string or a guitar string. Then the string is plucked, giving the string an initial profile. Let u(x, t) be the vertical displacement of the string at position x and time t. The motion of the string is governed by the one dimensional wave equation. The initial-boundary value problem for this problem is given as

image

Solving the Heat Equation

We would like to see how the solution of such problems involving partial differential equations will lead naturally to studying boundary value problems for ordinary differential equations. We will see this as we attempt the solution of the heat equation problem 4.3. We will employ a method typically used in studying linear partial differential equations, called the method of separation of variables.

We assume that u can be written as a product of single variable functions of each independent variable,

u(x, t)=X(x) T(t) \nonumber

Substituting this guess into the heat equation, we find that

X T^{\prime}=k X^{\prime \prime} T \nonumber

Dividing both sides by k and by u=X T, we then get

\dfrac{1}{k} \dfrac{T^{\prime}}{T}=\dfrac{X^{\prime \prime}}{X} \nonumber

We have separated the functions of time on one side and space on the other side. The only way that a function of t can equal a function of x for all values of t and x is if both functions are constant. Therefore, we set each function equal to a constant, \lambda :

\dfrac{1}{k} \dfrac{T^{\prime}}{T}=\dfrac{X^{\prime \prime}}{X}=\lambda . \nonumber

This leads to two equations:

\begin{aligned} &T^{\prime}=k \lambda T \\[4pt] &X^{\prime \prime}=\lambda X \end{aligned} \nonumber

These are ordinary differential equations. The general solutions to these equations are readily found as

\begin{gathered} T(t)=A e^{k \lambda t} \\[4pt] X(x)=c_{1} e^{\sqrt{\lambda} x}+c_{2} e^{-\sqrt{\lambda} x} \end{gathered} \nonumber

We need to be a little careful at this point. The aim is to force our product solutions to satisfy both the boundary conditions and initial conditions. Also, we should note that \lambda is arbitrary and may be positive, zero, or negative. We first look at how the boundary conditions on u lead to conditions on X.

The first condition is u(0, t)=0. This implies that

X(0) T(t)=0 \nonumber

for all t. The only way that this is true is if X(0)=0. Similarly, u(L, t)=0 implies that X(L)=0. So, we have to solve the boundary value problem

X^{\prime \prime}-\lambda X=0, \quad X(0)=0=X(L) \nonumber

We are seeking nonzero solutions, as X \equiv 0 is an obvious and uninteresting solution. We call such solutions trivial solutions.

There are three cases to consider, depending on the sign of \lambda.

I. \underline{\lambda>0}

In this case we have the exponential solutions

X(x)=c_{1} e^{\sqrt{\lambda} x}+c_{2} e^{-\sqrt{\lambda} x} . \nonumber

For X(0)=0, we have

0=c_{1}+c_{2} \nonumber

We will take c_{2}=-c_{1}. Then, X(x)=c_{1}\left(e^{\sqrt{\lambda} x}-e^{-\sqrt{\lambda} x}\right)=2 c_{1} \sinh \sqrt{\lambda} x.

Applying the second condition, X(L)=0 yields

c_{1} \sinh \sqrt{\lambda} L=0 . \nonumber

This will be true only if c_{1}=0, since \lambda>0. Thus, the only solution in this case is X(x)=0. This leads to a trivial solution, u(x, t)=0.

II. \underline{\lambda=0}

For this case it is easier to set \lambda to zero in the differential equation. So, X^{\prime \prime}=0. Integrating twice, one finds

X(x)=c_{1} x+c_{2} . \nonumber

Setting x=0, we have c_{2}=0, leaving X(x)=c_{1} x. Setting x=L, we find c_{1} L=0. So, c_{1}=0 and we are once again left with a trivial solution.

III. \underline{\lambda<0}

In this case it would be simpler to write \lambda=-\mu^{2}. Then the differential equation is

X^{\prime \prime}+\mu^{2} X=0 \nonumber

The general solution is

X(x)=c_{1} \cos \mu x+c_{2} \sin \mu x . \nonumber

At x=0 we get 0=c_{1}. This leaves X(x)=c_{2} \sin \mu x. At x=L, we find

0=c_{2} \sin \mu L . \nonumber

So, either c_{2}=0 or \sin \mu L=0. Taking c_{2}=0 leads to a trivial solution again. But, there are cases when the sine is zero. Namely,

\mu L=n \pi, \quad n=1,2, \ldots \nonumber

Note that n=0 is not included since this leads to a trivial solution. Also, negative values of n are redundant, since the sine function is an odd function.

In summary, we can find solutions to the boundary value problem (4.9) for particular values of \lambda. The solutions are

X_{n}(x)=\sin \dfrac{n \pi x}{L}, \quad n=1,2,3, \ldots \nonumber

for

\lambda_{n}=-\mu_{n}^{2}=-\left(\dfrac{n \pi}{L}\right)^{2}, \quad n=1,2,3, \ldots \nonumber

Product solutions of the heat equation (4.3) satisfying the boundary conditions are therefore

u_{n}(x, t)=b_{n} e^{k \lambda_{n} t} \sin \dfrac{n \pi x}{L}, \quad n=1,2,3, \ldots, \nonumber

where b_{n} is an arbitrary constant. However, these do not necessarily satisfy the initial condition u(x, 0)=f(x). What we do get is

u_{n}(x, 0)=b_{n} \sin \dfrac{n \pi x}{L}, \quad n=1,2,3, \ldots \nonumber

So, if our initial condition is in one of these forms, we can pick out the right n and we are done.

For other initial conditions, we have to do more work. Note, since the heat equation is linear, we can write a linear combination of our product solutions and obtain the general solution satisfying the given boundary conditions as

u(x, t)=\sum_{n=1}^{\infty} b_{n} e^{k \lambda_{n} t} \sin \dfrac{n \pi x}{L} . \nonumber

The only thing to impose is the initial condition:

f(x)=u(x, 0)=\sum_{n=1}^{\infty} b_{n} \sin \dfrac{n \pi x}{L} . \nonumber

So, if we are given f(x), can we find the constants b_{n} ? If we can, then we will have the solution to the full initial-boundary value problem. This will be the subject of the next chapter. However, first we will look at the general form of our boundary value problem and relate what we have done to the theory of infinite dimensional vector spaces.
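
Even before we know how to compute the b_{n} for a general f(x), we can evaluate such a series for assumed coefficients. The following Python sketch (with k=1, L=1 and two nonzero coefficients chosen only for illustration) shows how the higher modes decay faster in time, since \lambda_{n} \propto -n^{2}:

import numpy as np

k, L = 1.0, 1.0
b = {1: 1.0, 3: 0.5}                 # assumed coefficients, for illustration only

def u(x, t):
    total = np.zeros_like(x)
    for n, bn in b.items():
        lam = -(n*np.pi/L)**2
        total += bn * np.exp(k*lam*t) * np.sin(n*np.pi*x/L)
    return total

x = np.linspace(0, L, 200)
for t in (0.0, 0.01, 0.1):
    print(f"t = {t:5.2f}, max |u| = {np.max(np.abs(u(x, t))):.4f}")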

Connections to Linear Algebra

We have already seen in earlier chapters that ideas from linear algebra crop up in our studies of differential equations. Namely, we solved eigenvalue problems associated with our systems of differential equations in order to determine the local behavior of dynamical systems near fixed points. In our study of boundary value problems we will find more connections with the theory of vector spaces. However, we will find that our problems lie in the realm of infinite dimensional vector spaces. In this section we will begin to see these connections.

Eigenfunction Expansions for PDEs

In the last section we sought solutions of the heat equation. Let’s formally write the heat equation in the form

\dfrac{1}{k} u_{t}=L[u] \nonumber

where

L=\dfrac{\partial^{2}}{\partial x^{2}} \nonumber

L is another example of a linear differential operator. [See Section 1.1.2.] It is a differential operator because it involves derivative operators. We sometimes define D_{x}=\dfrac{\partial}{\partial x}, so that L=D_{x}^{2}. It is linear, because for functions f(x) and g(x) and constants \alpha, \beta we have

L[\alpha f+\beta g]=\alpha L[f]+\beta L[g] \nonumber

When solving the heat equation, using the method of separation of variables, we found an infinite number of product solutions u_{n}(x, t)=T_{n}(t) X_{n}(x). We did this by solving the boundary value problem

L[X]=\lambda X, \quad X(0)=0=X(L) \nonumber

Here we see that an operator acts on an unknown function and spits out an unknown constant times that unknown. Where have we done this before? This is the same form as A \mathbf{v}=\lambda \mathbf{v}. So, we see that Equation (4.14) is really an eigenvalue problem for the operator L and given boundary conditions. When we solved the heat equation in the last section, we found the eigenvalues

\lambda_{n}=-\left(\dfrac{n \pi}{L}\right)^{2} \nonumber

and the eigenfunctions

X_{n}(x)=\sin \dfrac{n \pi x}{L} . \nonumber

We used these to construct the general solution that is essentially a linear combination over the eigenfunctions,

u(x, t)=\sum_{n=1}^{\infty} T_{n}(t) X_{n}(x) \nonumber

Note that these eigenfunctions live in an infinite dimensional function space.

We would like to generalize this method to problems in which L comes from an assortment of linear differential operators. So, we consider the more general partial differential equation

u_{t}=L[u], \quad a \leq x \leq b, \quad t>0 \nonumber

satisfying the boundary conditions

B[u](a, t)=0, \quad B[u](b, t)=0, \quad t>0 \nonumber

and initial condition

u(x, 0)=f(x), \quad a \leq x \leq b \nonumber

The form of the allowed boundary conditions B[u] will be taken up later. Also, we will later see specific examples and properties of linear differential operators that will allow for this procedure to work.

We assume product solutions of the form u_{n}(x, t)=b_{n}(t) \phi_{n}(x), where the \phi_{n} ’s are the eigenfunctions of the operator L,

L \phi_{n}=\lambda_{n} \phi_{n}, \quad n=1,2, \ldots, \nonumber

satisfying the boundary conditions

B\left[\phi_{n}\right](a)=0, \quad B\left[\phi_{n}\right](b)=0 . \nonumber

Inserting the general solution

u(x, t)=\sum_{n=1}^{\infty} b_{n}(t) \phi_{n}(x) \nonumber

into the partial differential equation, we have

\begin{aligned} u_{t} &=L[u] \\[4pt] \dfrac{\partial}{\partial t} \sum_{n=1}^{\infty} b_{n}(t) \phi_{n}(x) &=L\left[\sum_{n=1}^{\infty} b_{n}(t) \phi_{n}(x)\right] \end{aligned} \nonumber

On the left we differentiate term by term { }^{1} and on the right side we use the linearity of L :

\sum_{n=1}^{\infty} \dfrac{d b_{n}(t)}{d t} \phi_{n}(x)=\sum_{n=1}^{\infty} b_{n}(t) L\left[\phi_{n}(x)\right] \nonumber

Now, we make use of the result of applying L to the eigenfunction \phi_{n} :

\sum_{n=1}^{\infty} \dfrac{d b_{n}(t)}{d t} \phi_{n}(x)=\sum_{n=1}^{\infty} b_{n}(t) \lambda_{n} \phi_{n}(x) \nonumber

Comparing both sides, or using the linear independence of the eigenfunctions, we see that

\dfrac{d b_{n}(t)}{d t}=\lambda_{n} b_{n}(t) \nonumber

whose solution is

b_{n}(t)=b_{n}(0) e^{\lambda_{n} t} \nonumber

So, the general solution becomes

{ }^{1} Infinite series cannot always be differentiated, so one must be careful. When we ignore such details for the time being, we say that we formally differentiate the series and formally apply the differential operator to the series. Such operations need to be justified later.

u(x, t)=\sum_{n=1}^{\infty} b_{n}(0) e^{\lambda_{n} t} \phi_{n}(x) \nonumber

This solution satisfies, at least formally, the partial differential equation and satisfies the boundary conditions.

Finally, we need to determine the b_{n}(0) ’s, which are so far arbitrary. We use the initial condition u(x, 0)=f(x) to find that

f(x)=\sum_{n=1}^{\infty} b_{n}(0) \phi_{n}(x) . \nonumber

So, given f(x), we are left with the problem of extracting the coefficients b_{n}(0) in an expansion of f in the eigenfunctions \phi_{n}. We will see that this is related to Fourier series expansions, which we will take up in the next chapter.

Eigenfunction Expansions for Nonhomogeneous ODEs

Partial differential equations, as seen in the last section, are not the only application of the method of eigenfunction expansions. We can apply this method to nonhomogeneous two point boundary value problems for ordinary differential equations, assuming that we can solve the associated eigenvalue problem.

Let’s begin with the nonhomogeneous boundary value problem:

\begin{gathered} L[u]=f(x), \quad a \leq x \leq b \\[4pt] B[u](a)=0, \quad B[u](b)=0 . \end{gathered} \nonumber

We first solve the eigenvalue problem,

\begin{array}{r} L[\phi]=\lambda \phi, \quad a \leq x \leq b \\[4pt] B[\phi](a)=0, \quad B[\phi](b)=0, \end{array} \nonumber

and obtain a family of eigenfunctions, \left\{\phi_{n}(x)\right\}_{n=1}^{\infty}. Then we assume that u(x) can be represented as a linear combination of these eigenfunctions:

u(x)=\sum_{n=1}^{\infty} b_{n} \phi_{n}(x) \nonumber

Inserting this into the differential equation, we have

\begin{aligned} f(x) &=L[u] \\[4pt] &=L\left[\sum_{n=1}^{\infty} b_{n} \phi_{n}(x)\right] \\[4pt] &=\sum_{n=1}^{\infty} b_{n} L\left[\phi_{n}(x)\right] \end{aligned} \nonumber

\begin{aligned} &=\sum_{n=1}^{\infty} \lambda_{n} b_{n} \phi_{n}(x) \\[4pt] &\equiv \sum_{n=1}^{\infty} c_{n} \phi_{n}(x) \end{aligned} \nonumber

Therefore, we have to find the expansion coefficients c_{n}=\lambda_{n} b_{n} of the given f(x) in a series expansion over the eigenfunctions. This is similar to what we had found for the heat equation problem and its generalization in the last section.
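As a small illustration (a sketch, using the sine series that we formally develop in the next chapter), take L=\dfrac{d^{2}}{d x^{2}} on [0, \pi] with u(0)=0=u(\pi). Then \phi_{n}(x)=\sin n x, \lambda_{n}=-n^{2}, and if f(x)=\sum_{n=1}^{\infty} c_{n} \sin n x, matching coefficients gives b_{n}=\dfrac{c_{n}}{\lambda_{n}}=-\dfrac{c_{n}}{n^{2}}, so that

u(x)=-\sum_{n=1}^{\infty} \dfrac{c_{n}}{n^{2}} \sin n x . \nonumber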

There are a lot of questions and details that have been glossed over in our formal derivations. Can we always find such eigenfunctions for a given operator? Do the infinite series expansions converge? Can we differentiate our expansions term by term? Can one find expansions that converge to given functions like the f(x) above? We will begin to explore these questions in the case that the eigenfunctions are simple trigonometric functions like the \phi_{n}(x)=\sin \dfrac{n \pi x}{L} in the solution of the heat equation.

Linear Vector Spaces

Much of the discussion and terminology that we will use comes from the theory of vector spaces. Until now you may only have dealt with finite dimensional vector spaces in your classes. Even then, you might only be comfortable with two and three dimensions. We will review a little of what we know about finite dimensional spaces so that we can deal with the more general function spaces, which is where our eigenfunctions live.

The notion of a vector space is a generalization of our three dimensional vector spaces. In three dimensions, we have things called vectors, which are arrows of a specific length and pointing in a given direction. To each vector, we can associate a point in a three dimensional Cartesian system. We just attach the tail of the vector \mathbf{v} to the origin and the head lands at (x, y, z). We then use unit vectors \mathbf{i}, \mathbf{j} and \mathbf{k} along the coordinate axes to write

\mathbf{v}=x \mathbf{i}+y \mathbf{j}+z \mathbf{k} \nonumber

Having defined vectors, we then learned how to add vectors and multiply vectors by numbers, or scalars. Under these operations, we expected to get back new vectors. Then we learned that there were two types of multiplication of vectors. We could multiply them to get a scalar or a vector. This led to the dot and cross products, respectively. The dot product was useful for determining the length of a vector, the angle between two vectors, or whether the vectors were orthogonal.

These notions were later generalized to spaces of more than three dimensions in your linear algebra class. The properties outlined roughly above need to be preserved. So, we have to start with a space of vectors and the operations between them. We also need a set of scalars, which generally come from some field. However, in our applications the field will either be the set of real numbers or the set of complex numbers.

Definition 4.1. A vector space V over a field F is a set that is closed under addition and scalar multiplication and satisfies the following conditions: For any u, v, w \in V and a, b \in F

  1. u+v=v+u.
  2. (u+v)+w=u+(v+w).
  3. There exists a 0 such that 0+v=v.
  4. There exists a -v such that v+(-v)=0.
  5. a(b v)=(a b) v.
  6. (a+b) v=a v+b v.
  7. a(u+v)=a u+a v.
  8. 1(v)=v.

Now, for an n-dimensional vector space, we have the idea that any vector in the space can be represented as the sum over n linearly independent vectors. Recall that a linearly independent set of vectors \left\{\mathbf{v}_{j}\right\}_{j=1}^{n} satisfies

\sum_{j=1}^{n} c_{j} \mathbf{v}_{j}=\mathbf{0} \quad \Leftrightarrow \quad c_{j}=0 \nonumber

This leads to the idea of a basis set. The standard basis in an n-dimensional vector space is a generalization of the standard basis in three dimensions (\mathbf{i}, \mathbf{j} and \mathbf{k}). We define

\mathbf{e}_{k}=(0, \ldots, 0, \underbrace{1}_{k \text { th space }}, 0, \ldots, 0), \quad k=1, \ldots, n . \nonumber

Then, we can expand any \mathbf{v} \in V as

\mathbf{v}=\sum_{k=1}^{n} v_{k} \mathbf{e}_{k} \nonumber

where the v_{k} ’s are called the components of the vector in this basis and one can write \mathbf{v} as an n-tuple \left(v_{1}, v_{2}, \ldots, v_{n}\right).

The only other thing we will need at this point is to generalize the dot product, or scalar product. Recall that there are two forms for the dot product in three dimensions. First, one has that

\mathbf{u} \cdot \mathbf{v}=u v \cos \theta \nonumber

where u and v denote the lengths of the vectors. The other form is the component form:

\mathbf{u} \cdot \mathbf{v}=u_{1} v_{1}+u_{2} v_{2}+u_{3} v_{3}=\sum_{k=1}^{3} u_{k} v_{k} \nonumber

Of course, this form is easier to generalize. So, we define the scalar product between two n-dimensional vectors as

<\mathbf{u}, \mathbf{v}>=\sum_{k=1}^{n} u_{k} v_{k} . \nonumber

Actually, there are a number of notations that are used in other texts. One can write the scalar product as (\mathbf{u}, \mathbf{v}) or even use the Dirac notation <\mathbf{u} \mid \mathbf{v}> for applications in quantum mechanics.

While it does not always make sense to talk about angles between general vectors in higher dimensional vector spaces, there is one concept that is useful. It is that of orthogonality, which in three dimensions is another way of saying that two vectors are perpendicular to each other. So, we also say that vectors \mathbf{u} and \mathbf{v} are orthogonal if and only if \langle\mathbf{u}, \mathbf{v}\rangle=0. If \left\{\mathbf{a}_{k}\right\}_{k=1}^{n} is a set of basis vectors such that

<\mathbf{a}_{j}, \mathbf{a}_{k}>=0, \quad k \neq j, \nonumber

then it is called an orthogonal basis. If in addition each basis vector is a unit vector, then one has an orthonormal basis.

Let \left\{\mathbf{a}_{k}\right\}_{k=1}^{n}, be a set of basis vectors for vector space V. We know that any vector \mathbf{v} can be represented in terms of this basis, \mathbf{v}=\sum_{k=1}^{n} v_{k} \mathbf{a}_{k}. If we know the basis and vector, can we find the components? The answer is, yes. We can use the scalar product of \mathbf{v} with each basis element \mathbf{a}_{j}. So, we have for j=1, \ldots, n

\begin{aligned} <\mathbf{a}_{j}, \mathbf{v}>&=<\mathbf{a}_{j}, \sum_{k=1}^{n} v_{k} \mathbf{a}_{k}>\\[4pt] &=\sum_{k=1}^{n} v_{k}<\mathbf{a}_{j}, \mathbf{a}_{k}> \end{aligned} \nonumber

Since we know the basis elements, we can easily compute the numbers

A_{j k} \equiv<\mathbf{a}_{j}, \mathbf{a}_{k}> \nonumber

and

b_{j} \equiv<\mathbf{a}_{j}, \mathbf{v}>. \nonumber

Therefore, the system (4.28) for the v_{k} ’s is a linear algebraic system, which takes the form A \mathbf{v}=\mathbf{b}. However, if the basis is orthogonal, then the matrix A is diagonal and the system is easily solvable. We have that

<\mathbf{a}_{j}, \mathbf{v}>=v_{j}<\mathbf{a}_{j}, \mathbf{a}_{j}>, \nonumber

v_{j}=\dfrac{<\mathbf{a}_{j}, \mathbf{v}>}{<\mathbf{a}_{j}, \mathbf{a}_{j}>} . \nonumber

In fact, if the basis is orthonormal, A is the identity matrix and the solution is simpler:

v_{j}=<\mathbf{a}_{j}, \mathbf{v}>. \nonumber
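Here is a small numerical illustration of this recipe in Python, using an arbitrarily chosen orthogonal (but not orthonormal) basis of R^3:

import numpy as np

# An orthogonal (not orthonormal) basis of R^3, chosen only for illustration.
a1 = np.array([1.0, 1.0, 0.0])
a2 = np.array([1.0, -1.0, 0.0])
a3 = np.array([0.0, 0.0, 2.0])
v = np.array([2.0, 3.0, 4.0])

# Since the basis is orthogonal, the matrix A is diagonal and
# v_j = <a_j, v> / <a_j, a_j>.
comps = [np.dot(a, v) / np.dot(a, a) for a in (a1, a2, a3)]
print(comps)                                      # [2.5, -0.5, 2.0]
print(comps[0]*a1 + comps[1]*a2 + comps[2]*a3)    # recovers [2. 3. 4.]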

We spent some time looking at this simple case of extracting the components of a vector in a finite dimensional space. The keys to doing this simply were to have a scalar product and an orthogonal basis set. These are the key ingredients that we will need in the infinite dimensional case. Recall that when we solved the heat equation, we had a function (vector) that we wanted to expand in a set of eigenfunctions (basis) and we needed to find the expansion coefficients (components). As you can see, we need to extend the concepts for finite dimensional spaces to their analogs in infinite dimensional spaces. Linear algebra will provide some of the backdrop for what is to follow: The study of many boundary value problems amounts to the solution of eigenvalue problems over infinite dimensional vector spaces (complete inner product spaces, the space of square integrable functions, or Hilbert spaces).

We will consider the space of functions of a certain type. They could be the space of continuous functions on [0,1], or the space of continuously differentiable functions, or the set of functions integrable from a to b. Later, we will specify the types of functions needed. We will further need to be able to add functions and multiply them by scalars. So, we can easily obtain a vector space of functions.

We will also need a scalar product defined on this space of functions. There are several types of scalar products, or inner products, that we can define. For a real vector space, we define

Definition 4.2. An inner product <,> on a real vector space V is a mapping from V \times V into R such that for u, v, w \in V and \alpha \in R one has

  1. <u+v, w>=<u, w>+<v, w>.
  2. <\alpha v, w>=\alpha<v, w>.
  3. <v, w>=<w, v>.
  4. <v, v>\geq 0 and <v, v>=0 iff v=0.

A real vector space equipped with the above inner product leads to a real inner product space. A more general definition, with the third item replaced with \langle v, w\rangle=\overline{\langle w, v\rangle}, is needed for complex inner product spaces.

For the time being, we are dealing just with real valued functions. We need an inner product appropriate for such spaces. One such definition is the following. Let f(x) and g(x) be functions defined on [a, b]. Then, we define the inner product, if the integral exists, as

<f, g>=\int_{a}^{b} f(x) g(x) d x . \nonumber

So far, we have function spaces equipped with an inner product. Can we find a basis for the space? For an n-dimensional space we need n basis vectors. For an infinite dimensional space, how many will we need? How do we know when we have enough? We will think about those things later.

Let’s assume that we have a basis of functions \left\{\phi_{n}(x)\right\}_{n=1}^{\infty}. Given a function f(x), how can we go about finding the components of f in this basis? In other words, let

f(x)=\sum_{n=1}^{\infty} c_{n} \phi_{n}(x) \nonumber

How do we find the c_{n} ’s? Does this remind you of the problem we had earlier?

Formally, we take the inner product of f with each \phi_{j}, to find

\begin{aligned} <\phi_{j}, f>&=<\phi_{j}, \sum_{n=1}^{\infty} c_{n} \phi_{n}>\\[4pt] &=\sum_{n=1}^{\infty} c_{n}<\phi_{j}, \phi_{n}> \end{aligned} \nonumber

If our basis is an orthogonal basis, then we have

<\phi_{j}, \phi_{n}>=N_{j} \delta_{j n}, \nonumber

where \delta_{i j} is the Kronecker delta defined as

\delta_{i j}= \begin{cases}0, & i \neq j \\[4pt] 1, & i=j\end{cases} \nonumber

Thus, we have

\begin{aligned} <\phi_{j}, f>&=\sum_{n=1}^{\infty} c_{n}<\phi_{j}, \phi_{n}>\\[4pt] &=\sum_{n=1}^{\infty} c_{n} N_{j} \delta_{j n} \\[4pt] &=c_{1} N_{j} \delta_{j 1}+c_{2} N_{j} \delta_{j 2}+\ldots+c_{j} N_{j} \delta_{j j}+\ldots \\[4pt] &=c_{j} N_{j} \end{aligned} \nonumber

So, the expansion coefficient is

c_{j}=\dfrac{<\phi_{j}, f>}{N_{j}}=\dfrac{<\phi_{j}, f>}{<\phi_{j}, \phi_{j}>} \nonumber

We summarize this important result:

Generalized Basis Expansion

Let f(x) be represented by an expansion over a basis of orthogonal functions, \left\{\phi_{n}(x)\right\}_{n=1}^{\infty},

f(x)=\sum_{n=1}^{\infty} c_{n} \phi_{n}(x) . \nonumber

Then, the expansion coefficients are formally determined as

c_{n}=\dfrac{<\phi_{n}, f>}{<\phi_{n}, \phi_{n}>} . \nonumber

In our preparation for later sections, let’s determine if the set of functions \phi_{n}(x)=\sin n x for n=1,2, \ldots is orthogonal on the interval [-\pi, \pi]. We need to show that <\phi_{n}, \phi_{m}>=0 for n \neq m. Thus, we have for n \neq m

\begin{aligned} <\phi_{n}, \phi_{m}>&=\int_{-\pi}^{\pi} \sin n x \sin m x d x \\[4pt] &=\dfrac{1}{2} \int_{-\pi}^{\pi}[\cos (n-m) x-\cos (n+m) x] d x \\[4pt] &=\dfrac{1}{2}\left[\dfrac{\sin (n-m) x}{n-m}-\dfrac{\sin (n+m) x}{n+m}\right]_{-\pi}^{\pi}=0 \end{aligned} \nonumber

Here we have made use of a trigonometric identity for the product of two sines. We recall how this identity is derived. Recall the addition formulae for cosines:

\begin{aligned} &\cos (A+B)=\cos A \cos B-\sin A \sin B \\[4pt] &\cos (A-B)=\cos A \cos B+\sin A \sin B \end{aligned} \nonumber

Adding, or subtracting, these equations gives

\begin{aligned} &2 \cos A \cos B=\cos (A+B)+\cos (A-B), \\[4pt] &2 \sin A \sin B=\cos (A-B)-\cos (A+B) . \end{aligned} \nonumber

So, we have determined that the set \phi_{n}(x)=\sin n x for n=1,2, \ldots is an orthogonal set of functions on the interval [-\pi, \pi]. Just as with vectors in three dimensions, we can normalize our basis functions to arrive at an orthonormal basis, <\phi_{n}, \phi_{m}>=\delta_{n m}, m, n=1,2, \ldots This is simply done by dividing by the length of the vector. Recall that the length of a vector is obtained as v=\sqrt{\mathbf{v} \cdot \mathbf{v}}. In the same way, we define the norm of our functions by

\|f\|=\sqrt{<f, f>} . \nonumber

Note, there are many types of norms, but this will be sufficient for us. For the above basis of sine functions, we want to first compute the norm of each function. Then we would like to find a new basis from this one such that each basis function has unit length, giving an orthonormal basis. We first compute

\begin{aligned} \left\|\phi_{n}\right\|^{2} &=\int_{-\pi}^{\pi} \sin ^{2} n x d x \\[4pt] &=\dfrac{1}{2} \int_{-\pi}^{\pi}[1-\cos 2 n x] d x \\[4pt] &=\dfrac{1}{2}\left[x-\dfrac{\sin 2 n x}{2 n}\right]_{-\pi}^{\pi}=\pi \end{aligned} \nonumber

We have found for our example that

<\phi_{n}, \phi_{m}>=\pi \delta_{n m} \nonumber

and that \left\|\phi_{n}\right\|=\sqrt{\pi}. Defining \psi_{n}(x)=\dfrac{1}{\sqrt{\pi}} \phi_{n}(x), we have normalized the \phi_{n} ’s and have obtained an orthonormal basis of functions on [-\pi, \pi].
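These relations are easy to confirm symbolically. A brief check in Python with SymPy, for the first few integer values of n and m, is:

import sympy as sp

x = sp.symbols('x')
# <phi_n, phi_m> = int_{-pi}^{pi} sin(nx) sin(mx) dx should equal pi*delta_{nm}.
for n in range(1, 4):
    for m in range(1, 4):
        val = sp.integrate(sp.sin(n * x) * sp.sin(m * x), (x, -sp.pi, sp.pi))
        print(n, m, val)   # pi when n == m, 0 otherwise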

Expansions of functions in trigonometric bases occur often and originally resulted from the study of partial differential equations. They have been named Fourier series and will be the topic of the next chapter.

Problems

4.1. Solve the following problem:

x^{\prime \prime}+x=2, \quad x(0)=0, \quad x^{\prime}(1)=0 . \nonumber

4.2. Find product solutions, u(x, t)=b(t) \phi(x), to the heat equation satisfying the boundary conditions u_{x}(0, t)=0 and u(L, t)=0. Use these solutions to find a general solution of the heat equation satisfying these boundary conditions.

4.3. Consider the following boundary value problems. Determine the eigenvalues, \lambda, and eigenfunctions, y(x) for each problem. { }^{2}
a. y^{\prime \prime}+\lambda y=0, \quad y(0)=0, \quad y^{\prime}(1)=0.
b. y^{\prime \prime}-\lambda y=0, \quad y(-\pi)=0, \quad y^{\prime}(\pi)=0.
c. x^{2} y^{\prime \prime}+x y^{\prime}+\lambda y=0, \quad y(1)=0, \quad y(2)=0.
d. \left(x^{2} y^{\prime}\right)^{\prime}+\lambda y=0, \quad y(1)=0, \quad y^{\prime}(e)=0 .

{ }^{2} In problem d you will not get exact eigenvalues. Show that you obtain a transcendental equation for the eigenvalues in the form \tan z=2 z. Find the first three eigenvalues numerically.

4.4. For the following sets of functions: i) show that each is orthogonal on the given interval, and ii) determine the corresponding orthonormal set.
a. \{\sin 2 n x\}, \quad n=1,2,3, \ldots, \quad 0 \leq x \leq \pi.
b. \{\cos n \pi x\}, \quad n=0,1,2, \ldots, \quad 0 \leq x \leq 2.
c. \left\{\sin \dfrac{n \pi x}{L}\right\}, \quad n=1,2,3, \ldots, \quad x \in[-L, L].

4.5. Consider the boundary value problem for the deflection of a horizontal beam fixed at one end,

\dfrac{d^{4} y}{d x^{4}}=C, \quad y(0)=0, \quad y^{\prime}(0)=0, \quad y^{\prime \prime}(L)=0, \quad y^{\prime \prime \prime}(L)=0 \nonumber

Solve this problem assuming that C is a constant.

Fourier Series

Introduction

In this chapter we will look at trigonometric series. Previously, we saw that such series expansions occurred naturally in the solution of the heat equation and other boundary value problems. In the last chapter we saw that such functions could be viewed as a basis in an infinite dimensional vector space of functions. Given a function in that space, when will it have a representation as a trigonometric series? For what values of x will it converge? Finding such series is at the heart of Fourier, or spectral, analysis.

There are many applications using spectral analysis. At the root of these studies is the belief that many continuous waveforms are comprised of a number of harmonics. Such ideas stretch back to the Pythagorean study of the vibrations of strings, which led to their view of a world of harmony. This idea was carried further by Johannes Kepler in his harmony of the spheres approach to planetary orbits. In the 1700s others worked on the superposition theory for vibrating waves on a stretched string, starting with the wave equation and leading to the superposition of right and left traveling waves. This work was carried out by people such as John Wallis, Brook Taylor and Jean le Rond d’Alembert.

In 1742 d’Alembert solved the wave equation

c^{2} \dfrac{\partial^{2} y}{\partial x^{2}}-\dfrac{\partial^{2} y}{\partial t^{2}}=0 \nonumber

where y is the string height and c is the wave speed. However, his solution led him and others, like Leonhard Euler and Daniel Bernoulli, to investigate what "functions" could be the solutions of this equation. In fact, this led to a more rigorous approach to the study of analysis by first coming to grips with the concept of a function. For example, in 1749 Euler sought the solution for a plucked string in which case the initial condition y(x, 0)=h(x) has a discontinuous derivative! In 1753 Daniel Bernoulli viewed the solutions as a superposition of simple vibrations, or harmonics. Such superpositions amounted to looking at solutions of the form

y(x, t)=\sum_{k} a_{k} \sin \dfrac{k \pi x}{L} \cos \dfrac{k \pi c t}{L}, \nonumber

where the string extends over the interval [0, L] with fixed ends at x=0 and x=L. However, the initial conditions for such superpositions are

y(x, 0)=\sum_{k} a_{k} \sin \dfrac{k \pi x}{L} . \nonumber

It was determined that many functions could not be represented by a finite number of harmonics, even for the simply plucked string given by an initial condition of the form

y(x, 0)=\left\{\begin{array}{cc} c x, & 0 \leq x \leq L / 2 \\[4pt] c(L-x), & L / 2 \leq x \leq L \end{array}\right. \nonumber

Thus, the solution consists generally of an infinite series of trigonometric functions.

Such series expansions were also of importance in Joseph Fourier’s solution of the heat equation. The use of such Fourier expansions became an important tool in the solution of linear partial differential equations, such as the wave equation and the heat equation. As seen in the last chapter, using the Method of Separation of Variables allows higher dimensional problems to be reduced to several one dimensional boundary value problems. However, these studies led to very important questions, which in turn opened the doors to whole fields of analysis. Some of the problems raised were

  1. What functions can be represented as the sum of trigonometric functions?
  2. How can a function with discontinuous derivatives be represented by a sum of smooth functions, such as the above sums?
  3. Do such infinite sums of trigonometric functions actually converge to the functions they represent?

Sums over sinusoidal functions naturally occur in music and in studying sound waves. A pure note can be represented as

y(t)=A \sin (2 \pi f t), \nonumber

where A is the amplitude, f is the frequency in hertz (\mathrm{Hz}), and t is time in seconds. The amplitude is related to the volume, or intensity, of the sound. The larger the amplitude, the louder the sound. In Figure 5.1 we show plots of two such tones with f=2 \mathrm{~Hz} in the top plot and f=5 \mathrm{~Hz} in the bottom one.

Figure 5.1. Plots of y(t)=\sin (2 \pi f t) on [0,5] for f=2 \mathrm{~Hz} and f=5 \mathrm{~Hz}.

We can also add sinusoids of different amplitudes and frequencies. In Figure 5.2 we see what happens when we add several sinusoids. Note that as one adds more and more tones with different characteristics, the resulting signal gets more complicated. However, we still have a function of time. In this chapter we will ask, "Given a function f(t), can we find a set of sinusoidal functions whose sum converges to f(t) ?"

Looking at the superpositions in Figure 5.2, we see that the sums yield functions that appear to be periodic. This is not to be unexpected. We recall that a periodic function is one in which the function values repeat over the domain of the function. The length of the smallest part of the domain which repeats is called the period. We can define this more precisely.

Definition 5.1. A function is said to be periodic with period T if f(t+T)= f(t) for all t and the smallest such positive number T is called the period.

For example, we consider the functions used in Figure 5.2. We began with y(t)=2 \sin (4 \pi t). Recall from your first studies of trigonometric functions that one can determine the period by dividing the coefficient of t into 2 \pi to get the period. In this case we have

T=\dfrac{2 \pi}{4 \pi}=\dfrac{1}{2} . \nonumber

Looking at the top plot in Figure 5.1 we can verify this result. (You can count the full number of cycles in the graph and divide this into the total time to get a more accurate value of the period.)

Figure 5.2. Superposition of several sinusoids. Top: Sum of signals with f=2 \mathrm{~Hz} and f=5 \mathrm{~Hz}. Bottom: Sum of signals with f=2 \mathrm{~Hz}, f=5 \mathrm{~Hz}, and f=8 \mathrm{~Hz}.

Of course, this result makes sense, as the unit of frequency, the hertz, is also defined as s^{-1}, or cycles per second.

Returning to the superpositions in Figure 5.2, we have that y(t)=\sin (10 \pi t) has a period of 0.2 \mathrm{~s} and y(t)=\sin (16 \pi t) has a period of 0.125 \mathrm{~s}. The two superpositions retain the largest period of the signals added, which is 0.5 \mathrm{~s}.

Our goal will be to start with a function and then determine the amplitudes of the simple sinusoids needed to sum to that function. First of all, we will see that this might involve an infinite number of such terms. Thus, we will be studying an infinite series of sinusoidal functions.

Secondly, we will find that using just sine functions will not be enough either. This is because we can add sinusoidal functions that do not necessarily peak at the same time. We will consider two signals that originate at different times. This is similar to when your music teacher would make sections of the class sing a song like "Row, Row, Row your Boat" starting at slightly different times.

We can easily add shifted sine functions. In Figure 5.3 we show the functions y(t)=2 \sin (4 \pi t) and y(t)=2 \sin (4 \pi t+7 \pi / 8) and their sum. Note that this shifted sine function can be written as y(t)=2 \sin (4 \pi(t+7 / 32)). Thus, this corresponds to a time shift of -7 / 32.

Figure 5.3. Plot of the functions y(t)=2 \sin (4 \pi t) and y(t)=2 \sin (4 \pi t+7 \pi / 8) and their sum.

We are now in a position to state our goal in this chapter.

Goal

Given a signal f(t), we would like to determine its frequency content by finding out what combinations of sines and cosines of varying frequencies and amplitudes will sum to the given function. This is called Fourier Analysis.

Fourier Trigonometric Series

As we have seen in the last section, we are interested in finding representations of functions in terms of sines and cosines. Given a function f(x) we seek a representation in the form

f(x) \sim \dfrac{a_{0}}{2}+\sum_{n=1}^{\infty}\left[a_{n} \cos n x+b_{n} \sin n x\right] \nonumber

Notice that we have opted to drop reference to the frequency form of the phase. This will lead to a simpler discussion for now and one can always make the transformation n x=2 \pi f_{n} t when applying these ideas to applications.

The series representation in Equation (5.1) is called a Fourier trigonometric series. We will simply refer to this as a Fourier series for now. The set of constants a_{0}, a_{n}, b_{n}, n=1,2, \ldots are called the Fourier coefficients. The constant term is chosen in this form to make later computations simpler, though some other authors choose to write the constant term as a_{0}. Our goal is to find the Fourier series representation given f(x). Having found the Fourier series representation, we will be interested in determining when the Fourier series converges and to what function it converges.

From our discussion in the last section, we see that the infinite series is periodic. The largest period of the terms comes from the n=1 terms. The periods of \cos x and \sin x are T=2 \pi. Thus, the Fourier series has period 2 \pi. This means that the series should be able to represent functions that are periodic of period 2 \pi.

While this appears restrictive, we could also consider functions that are defined over one period. In Figure 5.4 we show a function defined on [0,2 \pi]. In the same figure, we show its periodic extension. These are just copies of the original function shifted by the period and glued together. The extension can now be represented by a Fourier series and restricting the Fourier series to [0,2 \pi] will give a representation of the original function. Therefore, we will first consider Fourier series representations of functions defined on this interval. Note that we could just as easily have considered functions defined on [-\pi, \pi] or any interval of length 2 \pi.

Fourier Coefficients

Figure 5.4. Plot of a function f(t) defined on [0,2 \pi] and its periodic extension.

Theorem 5.2. The Fourier series (5.1) of a function f(x) defined on [0,2 \pi], when it exists, has Fourier coefficients given by

a_{n}=\dfrac{1}{\pi} \int_{0}^{2 \pi} f(x) \cos n x d x, \quad n=0,1,2, \ldots, \nonumber

b_{n}=\dfrac{1}{\pi} \int_{0}^{2 \pi} f(x) \sin n x d x, \quad n=1,2, \ldots \nonumber

These expressions for the Fourier coefficients are obtained by considering special integrations of the Fourier series. We will look at the derivations of the a_{n}’s. First we obtain a_{0}.

We begin by integrating the Fourier series term by term in Equation (5.1).

\int_{0}^{2 \pi} f(x) d x=\int_{0}^{2 \pi} \dfrac{a_{0}}{2} d x+\int_{0}^{2 \pi} \sum_{n=1}^{\infty}\left[a_{n} \cos n x+b_{n} \sin n x\right] d x \nonumber

We assume that we can integrate the infinite sum term by term. Then we need to compute

\begin{gathered} \int_{0}^{2 \pi} \dfrac{a_{0}}{2} d x=\dfrac{a_{0}}{2}(2 \pi)=\pi a_{0}, \\[4pt] \int_{0}^{2 \pi} \cos n x d x=\left[\dfrac{\sin n x}{n}\right]_{0}^{2 \pi}=0 \\[4pt] \int_{0}^{2 \pi} \sin n x d x=\left[\dfrac{-\cos n x}{n}\right]_{0}^{2 \pi}=0 \end{gathered} \nonumber

From these results we see that only one term in the integrated sum does not vanish leaving

\int_{0}^{2 \pi} f(x) d x=\pi a_{0} \nonumber

This confirms the value for a_{0}.

Next, we need to find a_{n}. We will multiply the Fourier series (5.1) by \cos m x for some positive integer m. This is like multiplying by \cos 2 x, \cos 5 x, etc. We are multiplying by all possible \cos m x functions for different integers m all at the same time. We will see that this will allow us to solve for the a_{n} ’s.

We find the integrated sum of the series times \cos m x is given by

\begin{aligned} \int_{0}^{2 \pi} f(x) \cos m x d x &=\int_{0}^{2 \pi} \dfrac{a_{0}}{2} \cos m x d x \\[4pt] &+\int_{0}^{2 \pi} \sum_{n=1}^{\infty}\left[a_{n} \cos n x+b_{n} \sin n x\right] \cos m x d x \end{aligned} \nonumber

Integrating term by term, the right side becomes

\dfrac{a_{0}}{2} \int_{0}^{2 \pi} \cos m x d x+\sum_{n=1}^{\infty}\left[a_{n} \int_{0}^{2 \pi} \cos n x \cos m x d x+b_{n} \int_{0}^{2 \pi} \sin n x \cos m x d x\right].

We have already established that \int_{0}^{2 \pi} \cos m x d x=0, which implies that the first term vanishes.

Next we need to compute integrals of products of sines and cosines. This requires that we make use of some trigonometric identities. While you have seen such integrals before in your calculus class, we will review how to carry out such integrals. For future reference, we list several useful identities, some of which we will prove along the way.

Useful Trigonometric Identities

\begin{aligned} \sin (x \pm y) &=\sin x \cos y \pm \sin y \cos x \\[4pt] \cos (x \pm y) &=\cos x \cos y \mp \sin x \sin y \\[4pt] \sin ^{2} x &=\dfrac{1}{2}(1-\cos 2 x) \\[4pt] \cos ^{2} x &=\dfrac{1}{2}(1+\cos 2 x) \\[4pt] \sin x \sin y &=\dfrac{1}{2}(\cos (x-y)-\cos (x+y)) \\[4pt] \cos x \cos y &=\dfrac{1}{2}(\cos (x+y)+\cos (x-y)) \\[4pt] \sin x \cos y &=\dfrac{1}{2}(\sin (x+y)+\sin (x-y)) \end{aligned} \nonumber

We first want to evaluate \int_{0}^{2 \pi} \cos n x \cos m x d x. We do this by using the product identity. We had done this in the last chapter, but will repeat the derivation for the reader’s benefit. Recall the addition formulae for cosines:

\cos (A+B)=\cos A \cos B-\sin A \sin B \nonumber

\cos (A-B)=\cos A \cos B+\sin A \sin B \nonumber

Adding these equations gives

2 \cos A \cos B=\cos (A+B)+\cos (A-B) . \nonumber

We can use this identity with A=m x and B=n x to complete the integration. We have

\begin{aligned} \int_{0}^{2 \pi} \cos n x \cos m x d x &=\dfrac{1}{2} \int_{0}^{2 \pi}[\cos (m+n) x+\cos (m-n) x] d x \\[4pt] &=\dfrac{1}{2}\left[\dfrac{\sin (m+n) x}{m+n}+\dfrac{\sin (m-n) x}{m-n}\right]_{0}^{2 \pi} \\[4pt] &=0 . \end{aligned} \nonumber

There is one caveat when doing such integrals. What if one of the denominators m \pm n vanishes? For our problem m+n \neq 0, since both m and n are positive integers. However, it is possible for m=n. This means that the vanishing of the integral can only happen when m \neq n. So, what can we do about the m=n case? One way is to start from scratch with our integration. (Another way is to compute the limit as n approaches m in our result and use L’Hopital’s Rule. Try it!)

So, for n=m we have to compute \int_{0}^{2 \pi} \cos ^{2} m x d x. This can also be handled using a trigonometric identity. Recall that

\cos ^{2} \theta=\dfrac{1}{2}(1+\cos 2 \theta) . \nonumber

Inserting this into the integral, we find

\begin{aligned} \int_{0}^{2 \pi} \cos ^{2} m x d x &=\dfrac{1}{2} \int_{0}^{2 \pi}(1+\cos 2 m x) d x \\[4pt] &=\dfrac{1}{2}\left[x+\dfrac{1}{2 m} \sin 2 m x\right]_{0}^{2 \pi} \\[4pt] &=\dfrac{1}{2}(2 \pi)=\pi \end{aligned} \nonumber

To summarize, we have shown that

\int_{0}^{2 \pi} \cos n x \cos m x d x=\left\{\begin{array}{l} 0, m \neq n \\[4pt] \pi, m=n \end{array}\right. \nonumber

This holds true for m, n=0,1, \ldots [Why did we include m, n=0 ?] When we have such a set of functions, they are said to be an orthogonal set over the integration interval.

Definition 5.3. A set of (real) functions \left\{\phi_{n}(x)\right\} is said to be orthogonal on [a, b] if \int_{a}^{b} \phi_{n}(x) \phi_{m}(x) d x=0 when n \neq m. Furthermore, if we also have that \int_{a}^{b} \phi_{n}^{2}(x) d x=1, these functions are called orthonormal.

The set of functions \{\cos n x\}_{n=0}^{\infty} is orthogonal on [0,2 \pi]. Actually, these functions are orthogonal on any interval of length 2 \pi. We can make them orthonormal by dividing each function by \sqrt{\pi} as indicated by Equation (5.15).

The notion of orthogonality is actually a generalization of the orthogonality of vectors in finite dimensional vector spaces. The integral \int_{a}^{b} f(x) g(x) d x is the generalization of the dot product, and is called the scalar product of f(x) and g(x), which are thought of as vectors in an infinite dimensional vector space spanned by a set of orthogonal functions. But that is another topic for later.

Returning to the evaluation of the integrals in equation (5.6), we still have to evaluate \int_{0}^{2 \pi} \sin n x \cos m x d x. This can also be evaluated using trigonometric identities. In this case, we need an identity involving products of sines and cosines. Such products occur in the addition formulae for sine functions:

\begin{aligned} &\sin (A+B)=\sin A \cos B+\sin B \cos A \\[4pt] &\sin (A-B)=\sin A \cos B-\sin B \cos A \end{aligned} \nonumber

Adding these equations, we find that

\sin (A+B)+\sin (A-B)=2 \sin A \cos B \nonumber

Setting A=n x and B=m x, we find that

\begin{aligned} \int_{0}^{2 \pi} \sin n x \cos m x d x &=\dfrac{1}{2} \int_{0}^{2 \pi}[\sin (n+m) x+\sin (n-m) x] d x \\[4pt] &=\dfrac{1}{2}\left[\dfrac{-\cos (n+m) x}{n+m}+\dfrac{-\cos (n-m) x}{n-m}\right]_{0}^{2 \pi} \\[4pt] &=\dfrac{1}{2}\left[\left(-\dfrac{1}{n+m}+\dfrac{1}{n+m}\right)+\left(-\dfrac{1}{n-m}+\dfrac{1}{n-m}\right)\right]=0 \end{aligned} \nonumber

For these integrals we also should be careful about setting n=m. In this special case, we have the integrals

\int_{0}^{2 \pi} \sin m x \cos m x d x=\dfrac{1}{2} \int_{0}^{2 \pi} \sin 2 m x d x=\dfrac{1}{2}\left[\dfrac{-\cos 2 m x}{2 m}\right]_{0}^{2 \pi}=0 \nonumber

Finally, we can finish our evaluation of (5.6). We have determined that all but one integral vanishes. In that case, n=m. This leaves us with

\int_{0}^{2 \pi} f(x) \cos m x d x=a_{m} \pi \nonumber

Solving for a_{m} gives

a_{m}=\dfrac{1}{\pi} \int_{0}^{2 \pi} f(x) \cos m x d x \nonumber

Since this is true for all m=1,2, \ldots, we have proven this part of the theorem. The only part left is finding the b_{n}’s. This will be left as an exercise for the reader.

We now consider examples of finding Fourier coefficients for given functions. In all of these cases we define f(x) on [0,2 \pi].

Example 5.4. f(x)=3 \cos 2 x, x \in[0,2 \pi].

We first compute the integrals for the Fourier coefficients.

\begin{aligned} &a_{0}=\dfrac{1}{\pi} \int_{0}^{2 \pi} 3 \cos 2 x d x=0 . \\[4pt] &a_{n}=\dfrac{1}{\pi} \int_{0}^{2 \pi} 3 \cos 2 x \cos n x d x=0, \quad n \neq 2 . \\[4pt] &a_{2}=\dfrac{1}{\pi} \int_{0}^{2 \pi} 3 \cos ^{2} 2 x d x=3, \\[4pt] &b_{n}=\dfrac{1}{\pi} \int_{0}^{2 \pi} 3 \cos 2 x \sin n x d x=0, \forall n . \end{aligned} \nonumber

Therefore, we have that the only nonvanishing coefficient is a_{2}=3. So there is one term and f(x)=3 \cos 2 x. Well, we should have known this before doing all of these integrals. So, if we have a function expressed simply in terms of sums of simple sines and cosines, then it should be easy to write down the Fourier coefficients without much work.
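Such coefficients are also easy to check numerically. A minimal sketch in Python, using SciPy quadrature to reproduce the integrals of this example, is:

import numpy as np
from scipy.integrate import quad

f = lambda x: 3 * np.cos(2 * x)

a0 = quad(f, 0, 2 * np.pi)[0] / np.pi
an = [quad(lambda x, n=n: f(x) * np.cos(n * x), 0, 2 * np.pi)[0] / np.pi for n in range(1, 5)]
bn = [quad(lambda x, n=n: f(x) * np.sin(n * x), 0, 2 * np.pi)[0] / np.pi for n in range(1, 5)]

print(a0)   # approximately 0
print(an)   # approximately [0, 3, 0, 0]; only a_2 = 3 survives
print(bn)   # approximately all 0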

Example 5.5. f(x)=\sin ^{2} x, x \in[0,2 \pi].

We could determine the Fourier coefficients by integrating as in the last example. However, it is easier to use trigonometric identities. We know that

\sin ^{2} x=\dfrac{1}{2}(1-\cos 2 x)=\dfrac{1}{2}-\dfrac{1}{2} \cos 2 x . \nonumber

There are no sine terms, so b_{n}=0, n=1,2, \ldots There is a constant term, implying a_{0} / 2=1 / 2. So, a_{0}=1. There is a \cos 2 x term, corresponding to n=2, so a_{2}=-\dfrac{1}{2}. That leaves a_{n}=0 for n \neq 0,2.

Example 5.6. f(x)=\left\{\begin{array}{c}1, \quad 0<x<\pi \\[4pt] -1, \pi<x<2 \pi\end{array}\right..

This example will take a little more work. We cannot bypass evaluating any integrals at this time. This function is discontinuous, so we will have to compute each integral by breaking up the integration into two integrals, one over [0, \pi] and the other over [\pi, 2 \pi].

\begin{aligned} a_{0} &=\dfrac{1}{\pi} \int_{0}^{2 \pi} f(x) d x \\[4pt] &=\dfrac{1}{\pi} \int_{0}^{\pi} d x+\dfrac{1}{\pi} \int_{\pi}^{2 \pi}(-1) d x \\[4pt] &=\dfrac{1}{\pi}(\pi)+\dfrac{1}{\pi}(-2 \pi+\pi)=0 . \end{aligned} \nonumber

\begin{aligned} a_{n} &=\dfrac{1}{\pi} \int_{0}^{2 \pi} f(x) \cos n x d x \\[4pt] &=\dfrac{1}{\pi}\left[\int_{0}^{\pi} \cos n x d x-\int_{\pi}^{2 \pi} \cos n x d x\right] \\[4pt] &=\dfrac{1}{\pi}\left[\left(\dfrac{1}{n} \sin n x\right)_{0}^{\pi}-\left(\dfrac{1}{n} \sin n x\right)_{\pi}^{2 \pi}\right] \\[4pt] &=0 . \end{aligned} \nonumber

\begin{aligned} b_{n} &=\dfrac{1}{\pi} \int_{0}^{2 \pi} f(x) \sin n x d x \\[4pt] &=\dfrac{1}{\pi}\left[\int_{0}^{\pi} \sin n x d x-\int_{\pi}^{2 \pi} \sin n x d x\right] \\[4pt] &=\dfrac{1}{\pi}\left[\left(-\dfrac{1}{n} \cos n x\right)_{0}^{\pi}+\left(\dfrac{1}{n} \cos n x\right)_{\pi}^{2 \pi}\right] \\[4pt] &=\dfrac{1}{\pi}\left[-\dfrac{1}{n} \cos n \pi+\dfrac{1}{n}+\dfrac{1}{n}-\dfrac{1}{n} \cos n \pi\right] \\[4pt] &=\dfrac{2}{n \pi}(1-\cos n \pi) . \end{aligned} \nonumber

We have found the Fourier coefficients for this function. Before inserting them into the Fourier series (5.1), we note that \cos n \pi=(-1)^{n}. Therefore,

1-\cos n \pi=\left\{\begin{array}{l} 0, n \text { even } \\[4pt] 2, n \text { odd. } \end{array}\right. \nonumber

So, half of the b_{n}’s are zero. While we could write the Fourier series representation as

f(x) \sim \dfrac{4}{\pi} \sum_{n=1, \text { odd }}^{\infty} \dfrac{1}{n} \sin n x \nonumber

we could let n=2 k-1 and write

f(x)=\dfrac{4}{\pi} \sum_{k=1}^{\infty} \dfrac{\sin (2 k-1) x}{2 k-1} \nonumber

But does this series converge? Does it converge to f(x) ? We will discuss this question later in the chapter.
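As a numerical preview of this question, the following sketch in Python evaluates partial sums of the series at a few points; away from the jumps the values tend toward \pm 1:

import numpy as np

def partial_sum(x, K):
    # Sum of the first K terms of (4/pi) * sum_k sin((2k-1)x)/(2k-1).
    k = np.arange(1, K + 1)
    return (4 / np.pi) * np.sum(np.sin(np.outer(x, 2 * k - 1)) / (2 * k - 1), axis=1)

x = np.array([0.5, 1.0, np.pi / 2, 4.0, 5.0])
for K in (5, 50, 500):
    print(K, partial_sum(x, K))   # approaches 1 on (0, pi) and -1 on (pi, 2*pi)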

Fourier Series Over Other Intervals

In many applications we are interested in determining Fourier series representations of functions defined on intervals other than [0,2 \pi]. In this section we will determine the form of the series expansion and the Fourier coefficients in these cases.

The most general type of interval is given as [a, b]. However, this often is too general. More common intervals are of the form [-\pi, \pi],[0, L], or [-L / 2, L / 2]. The simplest generalization is to the interval [0, L]. Such intervals arise often in applications. For example, one can study vibrations of a one dimensional string of length L and set up the axes with the left end at x=0 and the right end at x=L. Another problem would be to study the temperature distribution along a one dimensional rod of length L. Such problems lead to the original studies of Fourier series. As we will see later, symmetric intervals, [-a, a], are also useful.

Given an interval [0, L], we could apply a transformation to an interval of length 2 \pi by simply rescaling our interval. Then we could apply this transformation to our Fourier series representation to obtain an equivalent one useful for functions defined on [0, L].

We define x \in[0,2 \pi] and t \in[0, L]. A linear transformation relating these intervals is simply x=\dfrac{2 \pi t}{L} as shown in Figure 5.5. So, t=0 maps to x=0 and t=L maps to x=2 \pi. Furthermore, this transformation maps f(x) to a new function g(t)=f(x(t)), which is defined on [0, L]. We will determine the Fourier series representation of this function using the representation for f(x).

Figure 5.5. A sketch of the transformation between intervals x \in[0,2 \pi] and t \in[0, L].

Recall the form of the Fourier representation for f(x) in Equation (5.1):

f(x) \sim \dfrac{a_{0}}{2}+\sum_{n=1}^{\infty}\left[a_{n} \cos n x+b_{n} \sin n x\right] \nonumber

Inserting the transformation relating x and t, we have

g(t) \sim \dfrac{a_{0}}{2}+\sum_{n=1}^{\infty}\left[a_{n} \cos \dfrac{2 n \pi t}{L}+b_{n} \sin \dfrac{2 n \pi t}{L}\right] . \nonumber

This gives the form of the series expansion for g(t) with t \in[0, L]. But, we still need to determine the Fourier coefficients. Recall that

a_{n}=\dfrac{1}{\pi} \int_{0}^{2 \pi} f(x) \cos n x d x \nonumber

We need to make a substitution in the integral of x=\dfrac{2 \pi t}{L}. We also will need to transform the differential, d x=\dfrac{2 \pi}{L} d t. Thus, the resulting form for our coefficient is

a_{n}=\dfrac{2}{L} \int_{0}^{L} g(t) \cos \dfrac{2 n \pi t}{L} d t \nonumber

Similarly, we find that

b_{n}=\dfrac{2}{L} \int_{0}^{L} g(t) \sin \dfrac{2 n \pi t}{L} d t \nonumber

We note first that when L=2 \pi we get back the series representation that we first studied. Also, the period of \cos \dfrac{2 n \pi t}{L} is L / n, which means that the representation for g(t) has a period of L.
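As a quick worked illustration of these formulas, take g(t)=t on [0, L]. Then a_{0}=\dfrac{2}{L} \int_{0}^{L} t d t=L, while integration by parts gives a_{n}=0 and b_{n}=-\dfrac{L}{n \pi} for n \geq 1, so that

g(t) \sim \dfrac{L}{2}-\dfrac{L}{\pi} \sum_{n=1}^{\infty} \dfrac{1}{n} \sin \dfrac{2 n \pi t}{L} . \nonumber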

At the end of this section we present the derivation of the Fourier series representation for a general interval for the interested reader. In Table 5.1 we summarize some commonly used Fourier series representations.

We will end our discussion for now with some special cases and an example for a function defined on [-\pi, \pi].

Example 5.7. Let f(x)=|x| on [-\pi, \pi]. We compute the coefficients, beginning as usual with a_{0}. We have

\begin{aligned} a_{0} &=\dfrac{1}{\pi} \int_{-\pi}^{\pi}|x| d x \\[4pt] &=\dfrac{2}{\pi} \int_{0}^{\pi}|x| d x=\pi \end{aligned} \nonumber

At this point we need to remind the reader about the integration of even and odd functions.

  1. Even Functions: In this evaluation we made use of the fact that the integrand is an even function. Recall that f(x) is an even function if f(-x)=f(x) for all x. One can recognize even functions as they are symmetric with respect to the y-axis as shown in Figure 5.6(A). If one integrates an even function over a symmetric interval, then one has that

\int_{-a}^{a} f(x) d x=2 \int_{0}^{a} f(x) d x \nonumber

One can prove this by splitting off the integration over negative values of x, using the substitution x=-y, and employing the evenness of f(x). Thus,

\begin{aligned} \int_{-a}^{a} f(x) d x &=\int_{-a}^{0} f(x) d x+\int_{0}^{a} f(x) d x \\[4pt] &=-\int_{a}^{0} f(-y) d y+\int_{0}^{a} f(x) d x \\[4pt] &=\int_{0}^{a} f(y) d y+\int_{0}^{a} f(x) d x \\[4pt] &=2 \int_{0}^{a} f(x) d x \end{aligned} \nonumber

This can be visually verified by looking at Figure 5.6(A).

Table 5.1. Special Fourier Series Representations on Different Intervals

Fourier Series on [0, L]

\begin{aligned} f(x) & \sim \dfrac{a_{0}}{2}+\sum_{n=1}^{\infty}\left[a_{n} \cos \dfrac{2 n \pi x}{L}+b_{n} \sin \dfrac{2 n \pi x}{L}\right] \\[4pt] a_{n} &=\dfrac{2}{L} \int_{0}^{L} f(x) \cos \dfrac{2 n \pi x}{L} d x . \quad n=0,1,2, \ldots, \\[4pt] b_{n} &=\dfrac{2}{L} \int_{0}^{L} f(x) \sin \dfrac{2 n \pi x}{L} d x . \quad n=1,2, \ldots \end{aligned} \nonumber

Fourier Series on \left[-\dfrac{L}{2}, \dfrac{L}{2}\right]

\begin{aligned} f(x) & \sim \dfrac{a_{0}}{2}+\sum_{n=1}^{\infty}\left[a_{n} \cos \dfrac{2 n \pi x}{L}+b_{n} \sin \dfrac{2 n \pi x}{L}\right] . \\[4pt] a_{n} &=\dfrac{2}{L} \int_{-\dfrac{L}{2}}^{\dfrac{L}{2}} f(x) \cos \dfrac{2 n \pi x}{L} d x . \quad n=0,1,2, \ldots, \\[4pt] b_{n} &=\dfrac{2}{L} \int_{-\dfrac{L}{2}}^{\dfrac{L}{2}} f(x) \sin \dfrac{2 n \pi x}{L} d x . \quad n=1,2, \ldots \end{aligned} \nonumber

Fourier Series on [-\pi, \pi]

\begin{gathered} f(x) \sim \dfrac{a_{0}}{2}+\sum_{n=1}^{\infty}\left[a_{n} \cos n x+b_{n} \sin n x\right] . \\[4pt] a_{n}=\dfrac{1}{\pi} \int_{-\pi}^{\pi} f(x) \cos n x d x . \quad n=0,1,2, \ldots, \\[4pt] b_{n}=\dfrac{1}{\pi} \int_{-\pi}^{\pi} f(x) \sin n x d x . \quad n=1,2, \ldots \end{gathered} \nonumber

  2. Odd Functions: A similar computation could be done for odd functions. f(x) is an odd function if f(-x)=-f(x) for all x. The graphs of such functions are symmetric with respect to the origin as shown in Figure 5.6(B). If one integrates an odd function over a symmetric interval, then one has that

\int_{-a}^{a} f(x) d x=0 . \nonumber

Figure 5.6. Examples of the areas under (A) even and (B) odd functions on symmetric intervals, [-a, a].

We now continue with our computation of the Fourier coefficients for f(x)=|x| on [-\pi, \pi]. We have

a_{n}=\dfrac{1}{\pi} \int_{-\pi}^{\pi}|x| \cos n x d x=\dfrac{2}{\pi} \int_{0}^{\pi} x \cos n x d x . \nonumber

Here we have made use of the fact that |x| \cos n x is an even function. In order to compute the resulting integral, we need to use integration by parts,

\int_{a}^{b} u d v=\left.u v\right|_{a} ^{b}-\int_{a}^{b} v d u \nonumber

by letting u=x and d v=\cos n x d x. Thus, d u=d x and v=\int d v=\dfrac{1}{n} \sin n x. Continuing with the computation, we have

\begin{aligned} a_{n} &=\dfrac{2}{\pi} \int_{0}^{\pi} x \cos n x d x \\[4pt] &=\dfrac{2}{\pi}\left[\left.\dfrac{1}{n} x \sin n x\right|_{0} ^{\pi}-\dfrac{1}{n} \int_{0}^{\pi} \sin n x d x\right] \\[4pt] &=-\dfrac{2}{n \pi}\left[-\dfrac{1}{n} \cos n x\right]_{0}^{\pi} \\[4pt] &=-\dfrac{2}{\pi n^{2}}\left(1-(-1)^{n}\right) \end{aligned} \nonumber

Here we have used the fact that \cos n \pi=(-1)^{n} for any integer n. This led to a factor \left(1-(-1)^{n}\right). This factor can be simplified as

1-(-1)^{n}=\left\{\begin{array}{l} 2, n \text { odd } \\[4pt] 0, n \text { even } \end{array}\right. \nonumber

So, a_{n}=0 for n even and a_{n}=-\dfrac{4}{\pi n^{2}} for n odd.

Computing the b_{n} ’s is simpler. We note that we have to integrate |x| \sin n x from x=-\pi to \pi. The integrand is an odd function and this is a symmetric interval. So, the result is that b_{n}=0 for all n.

Putting this all together, the Fourier series representation of f(x)=|x| on [-\pi, \pi] is given as

f(x) \sim \dfrac{\pi}{2}-\dfrac{4}{\pi} \sum_{n=1, \text { odd }}^{\infty} \dfrac{\cos n x}{n^{2}} \nonumber

While this is correct, we can rewrite the sum over only odd n by reindexing. We let n=2 k-1 for k=1,2,3, \ldots Then we only get the odd integers. The series can then be written as

f(x) \sim \dfrac{\pi}{2}-\dfrac{4}{\pi} \sum_{k=1}^{\infty} \dfrac{\cos (2 k-1) x}{(2 k-1)^{2}} . \nonumber

Throughout our discussion we have referred to such results as Fourier representations. We have not looked at the convergence of these series. Here is an example of an infinite series of functions. What does this series sum to? We show in Figure 5.7 the first few partial sums. They appear to be converging to f(x)=|x| fairly quickly.

Even though f(x) was defined on [-\pi, \pi] we can still evaluate the Fourier series at values of x outside this interval. In Figure 5.8, we see that the representation agrees with f(x) on the interval [-\pi, \pi]. Outside this interval we have a periodic extension of f(x) with period 2 \pi.

Another example is the Fourier series representation of f(x)=x on [-\pi, \pi], which is left as Problem 5.1. This is determined to be

f(x) \sim 2 \sum_{n=1}^{\infty} \dfrac{(-1)^{n+1}}{n} \sin n x . \nonumber

As seen in Figure 5.9 we again obtain the periodic extension of our function. In this case we needed many more terms. Also, the vertical parts of the first plot are nonexistent. In the second plot we only plot the points and not the typical connected points that most software packages plot as the default style.

Figure 5.7. Plot of the first partial sums of the Fourier series representation for f(x)= |x|
Figure 5.8. Plot of the first 10 terms of the Fourier series representation for f(x)=|x| on the interval [-2 \pi, 4 \pi].
Figure 5.9. Plot of the first 10 terms and 200 terms of the Fourier series representation for f(x)=x on the interval [-2 \pi, 4 \pi]

As a side note, evaluating this last series at x=\dfrac{\pi}{2}, where it converges to f\left(\dfrac{\pi}{2}\right)=\dfrac{\pi}{2}, gives a series representation of \pi:

\pi=4\left[1-\dfrac{1}{3}+\dfrac{1}{5}-\dfrac{1}{7}+\ldots\right] \nonumber
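A quick numerical check in Python shows how slowly these partial sums approach \pi:

import math

# Partial sums of 4*(1 - 1/3 + 1/5 - 1/7 + ...) approach pi, but slowly.
for N in (10, 1000, 100000):
    approx = 4 * sum((-1) ** k / (2 * k + 1) for k in range(N))
    print(N, approx, abs(approx - math.pi))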

Fourier Series on [a, b]

A Fourier series representation is also possible for a general interval, t \in[a, b]. As before, we just need to transform this interval to [0,2 \pi]. Let

x=2 \pi \dfrac{t-a}{b-a} . \nonumber

Inserting this into the Fourier series (5.1) representation for f(x) we obtain

g(t) \sim \dfrac{a_{0}}{2}+\sum_{n=1}^{\infty}\left[a_{n} \cos \dfrac{2 n \pi(t-a)}{b-a}+b_{n} \sin \dfrac{2 n \pi(t-a)}{b-a}\right] \nonumber

Well, this expansion is ugly. It is not like the last example, where the transformation was straightforward. If one were to apply the theory to applications, it might seem to make sense to just shift the data so that a=0 and be done with any complicated expressions. However, mathematics students enjoy the challenge of developing such generalized expressions. So, let’s see what is involved.

First, we apply the addition identities for trigonometric functions and rearrange the terms.

\begin{aligned} g(t) \sim & \dfrac{a_{0}}{2}+\sum_{n=1}^{\infty}\left[a_{n} \cos \dfrac{2 n \pi(t-a)}{b-a}+b_{n} \sin \dfrac{2 n \pi(t-a)}{b-a}\right] \\[4pt] =& \dfrac{a_{0}}{2}+\sum_{n=1}^{\infty}\left[a_{n}\left(\cos \dfrac{2 n \pi t}{b-a} \cos \dfrac{2 n \pi a}{b-a}+\sin \dfrac{2 n \pi t}{b-a} \sin \dfrac{2 n \pi a}{b-a}\right)\right.\\[4pt] &\left.+b_{n}\left(\sin \dfrac{2 n \pi t}{b-a} \cos \dfrac{2 n \pi a}{b-a}-\cos \dfrac{2 n \pi t}{b-a} \sin \dfrac{2 n \pi a}{b-a}\right)\right] \\[4pt] =& \dfrac{a_{0}}{2}+\sum_{n=1}^{\infty}\left[\cos \dfrac{2 n \pi t}{b-a}\left(a_{n} \cos \dfrac{2 n \pi a}{b-a}-b_{n} \sin \dfrac{2 n \pi a}{b-a}\right)\right.\\[4pt] &\left.+\sin \dfrac{2 n \pi t}{b-a}\left(a_{n} \sin \dfrac{2 n \pi a}{b-a}+b_{n} \cos \dfrac{2 n \pi a}{b-a}\right)\right] \end{aligned} \nonumber

Defining A_{0}=a_{0} and

\begin{aligned} A_{n} & \equiv a_{n} \cos \dfrac{2 n \pi a}{b-a}-b_{n} \sin \dfrac{2 n \pi a}{b-a} \\[4pt] B_{n} & \equiv a_{n} \sin \dfrac{2 n \pi a}{b-a}+b_{n} \cos \dfrac{2 n \pi a}{b-a} \end{aligned} \nonumber

we arrive at the more desirable form for the Fourier series representation of a function defined on the interval [a, b].

g(t) \sim \dfrac{A_{0}}{2}+\sum_{n=1}^{\infty}\left[A_{n} \cos \dfrac{2 n \pi t}{b-a}+B_{n} \sin \dfrac{2 n \pi t}{b-a}\right] . \nonumber

We next need to find expressions for the Fourier coefficients. We insert the known expressions for a_{n} and b_{n} and rearrange. First, we note that under the transformation x=2 \pi \dfrac{t-a}{b-a} we have

\begin{aligned} a_{n} &=\dfrac{1}{\pi} \int_{0}^{2 \pi} f(x) \cos n x d x \\[4pt] &=\dfrac{2}{b-a} \int_{a}^{b} g(t) \cos \dfrac{2 n \pi(t-a)}{b-a} d t \end{aligned} \nonumber

and

\begin{aligned} b_{n} &=\dfrac{1}{\pi} \int_{0}^{2 \pi} f(x) \sin n x d x \\[4pt] &=\dfrac{2}{b-a} \int_{a}^{b} g(t) \sin \dfrac{2 n \pi(t-a)}{b-a} d t \end{aligned} \nonumber

Then, inserting these integrals in A_{n}, combining integrals and making use of the addition formula for the cosine of the sum of two angles, we obtain

\begin{aligned} A_{n} & \equiv a_{n} \cos \dfrac{2 n \pi a}{b-a}-b_{n} \sin \dfrac{2 n \pi a}{b-a} \\[4pt] &=\dfrac{2}{b-a} \int_{a}^{b} g(t)\left[\cos \dfrac{2 n \pi(t-a)}{b-a} \cos \dfrac{2 n \pi a}{b-a}-\sin \dfrac{2 n \pi(t-a)}{b-a} \sin \dfrac{2 n \pi a}{b-a}\right] d t \\[4pt] &=\dfrac{2}{b-a} \int_{a}^{b} g(t) \cos \dfrac{2 n \pi t}{b-a} d t \end{aligned} \nonumber

A similar computation gives

B_{n}=\dfrac{2}{b-a} \int_{a}^{b} g(t) \sin \dfrac{2 n \pi t}{b-a} d t \nonumber

Summarizing, we have shown that:

Theorem 5.9. The Fourier series representation of f(x) defined on [a, b], when it exists, is given by

f(x) \sim \dfrac{a_{0}}{2}+\sum_{n=1}^{\infty}\left[a_{n} \cos \dfrac{2 n \pi x}{b-a}+b_{n} \sin \dfrac{2 n \pi x}{b-a}\right] . \nonumber

with Fourier coefficients

\begin{aligned} a_{n} &=\dfrac{2}{b-a} \int_{a}^{b} f(x) \cos \dfrac{2 n \pi x}{b-a} d x . \quad n=0,1,2, \ldots, \\[4pt] b_{n} &=\dfrac{2}{b-a} \int_{a}^{b} f(x) \sin \dfrac{2 n \pi x}{b-a} d x . \quad n=1,2, \ldots \end{aligned} \nonumber

Sine and Cosine Series

In the last two examples (f(x)=|x| and f(x)=x on [-\pi, \pi]) we have seen Fourier series representations that contain only sine or cosine terms. As we know, the sine functions are odd functions and thus sum to odd functions. Similarly, cosine functions sum to even functions. Such occurrences happen often in practice. Fourier representations involving just sines are called sine series and those involving just cosines (and the constant term) are called cosine series.

Another interesting result, based upon these examples, is that the original functions, |x| and x agree on the interval [0, \pi]. Note from Figures 5.7-5.9 that their Fourier series representations do as well. Thus, more than one series can be used to represent functions defined on finite intervals. All they need to do is to agree with the function over that particular interval. Sometimes one of these series is more useful because it has additional properties needed in the given application.

We have made the following observations from the previous examples:

  1. There are several trigonometric series representations for a function defined on a finite interval.
  2. Odd functions on a symmetric interval are represented by sine series and even functions on a symmetric interval are represented by cosine series.

These two observations are related and are the subject of this section. We begin by defining a function f(x) on interval [0, L]. We have seen that the Fourier series representation of this function appears to converge to a periodic extension of the function.

Figure 5.10. This is a sketch of a function and its various extensions. The original function f(x) is defined on [0,1] and graphed in the upper left corner. To its right is the periodic extension, obtained by adding replicas. The two lower plots are obtained by first making the original function even or odd and then creating the periodic extensions of the new function.

In general, we obtain three different periodic representations. In order to distinguish these we will refer to them simply as the periodic, even and odd extensions. Now, starting with f(x) defined on [0, L], we would like to determine the Fourier series representations leading to these extensions. [For easy reference, the results are summarized in Table 5.2] We have already seen that the periodic extension of f(x) is obtained through the Fourier series representation in Equation (5.53).

Table 5.2. Fourier Cosine and Sine Series Representations on [0, L]

Fourier Series on [0, L]

\begin{aligned} f(x) & \sim \dfrac{a_{0}}{2}+\sum_{n=1}^{\infty}\left[a_{n} \cos \dfrac{2 n \pi x}{L}+b_{n} \sin \dfrac{2 n \pi x}{L}\right] . \\[4pt] a_{n} &=\dfrac{2}{L} \int_{0}^{L} f(x) \cos \dfrac{2 n \pi x}{L} d x . \quad n=0,1,2, \ldots, \\[4pt] b_{n} &=\dfrac{2}{L} \int_{0}^{L} f(x) \sin \dfrac{2 n \pi x}{L} d x . \quad n=1,2, \ldots \end{aligned} \nonumber

Fourier Cosine Series on [0, L]

f(x) \sim a_{0} / 2+\sum_{n=1}^{\infty} a_{n} \cos \dfrac{n \pi x}{L} . \nonumber

where

a_{n}=\dfrac{2}{L} \int_{0}^{L} f(x) \cos \dfrac{n \pi x}{L} d x . \quad n=0,1,2, \ldots \nonumber

Fourier Sine Series on [0, L]

f(x) \sim \sum_{n=1}^{\infty} b_{n} \sin \dfrac{n \pi x}{L} . \nonumber

where

b_{n}=\dfrac{2}{L} \int_{0}^{L} f(x) \sin \dfrac{n \pi x}{L} d x . \quad n=1,2, \ldots \nonumber

Given f(x) defined on [0, L], the even periodic extension is obtained by simply computing the Fourier series representation for the even function

f_{e}(x) \equiv\left\{\begin{array}{cc} f(x), & 0<x<L, \\[4pt] f(-x), & -L<x<0 \end{array}\right. \nonumber

Since f_{e}(x) is an even function on a symmetric interval [-L, L], we expect that the resulting Fourier series will not contain sine terms. Therefore, the series expansion will be given by [Use the general case in (5.51) with a=-L and b=L .] :

f_{e}(x) \sim \dfrac{a_{0}}{2}+\sum_{n=1}^{\infty} a_{n} \cos \dfrac{n \pi x}{L} . \nonumber

with Fourier coefficients

a_{n}=\dfrac{1}{L} \int_{-L}^{L} f_{e}(x) \cos \dfrac{n \pi x}{L} d x . \quad n=0,1,2, \ldots \nonumber

However, we can simplify this by noting that the integrand is even and the interval of integration can be replaced by [0, L]. On this interval f_{e}(x)=f(x). So, the Cosine Series Representation of f(x) for x \in[0, L] is given as

f(x) \sim \dfrac{a_{0}}{2}+\sum_{n=1}^{\infty} a_{n} \cos \dfrac{n \pi x}{L} . \nonumber

where

a_{n}=\dfrac{2}{L} \int_{0}^{L} f(x) \cos \dfrac{n \pi x}{L} d x . \quad n=0,1,2, \ldots \nonumber

Similarly, given f(x) defined on [0, L], the odd periodic extension is obtained by simply computing the Fourier series representation for the odd function

f_{o}(x) \equiv\left\{\begin{array}{cc} f(x), & 0<x<L, \\[4pt] -f(-x), & -L<x<0 \end{array}\right. \nonumber

The resulting series expansion leads to defining the Sine Series Representation of f(x) for x \in[0, L] as

f(x) \sim \sum_{n=1}^{\infty} b_{n} \sin \dfrac{n \pi x}{L} . \nonumber

where

b_{n}=\dfrac{2}{L} \int_{0}^{L} f(x) \sin \dfrac{n \pi x}{L} d x . \quad n=1,2, \ldots \nonumber

Example 5.10. In Figure 5.10 we actually provided plots of the various extensions of the function f(x)=x^{2} for x \in[0,1]. Let’s determine the representations of the periodic, even and odd extensions of this function.

For a change, we will use a CAS (Computer Algebra System) package to do the integrals. In this case we can use Maple. A general code for doing this for the periodic extension is shown in Table 5.3.

Example 5.11. Periodic Extension - Trigonometric Fourier Series

Using the above code, we have that a_{0}=\dfrac{2}{3}, a_{n}=\dfrac{1}{n^{2} \pi^{2}}, and b_{n}=-\dfrac{1}{n \pi}. Thus, the resulting series is given as

f(x) \sim \dfrac{1}{3}+\sum_{n=1}^{\infty}\left[\dfrac{1}{n^{2} \pi^{2}} \cos 2 n \pi x-\dfrac{1}{n \pi} \sin 2 n \pi x\right] \nonumber

Table 5.3. Maple code for computing Fourier coefficients and plotting partial sums of the Fourier series.

image
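The Maple code of Table 5.3 is reproduced only as an image above. As a rough stand-in (not the author's code), the following SymPy sketch computes the same coefficients for the periodic extension of Example 5.10; only the function f(x)=x^{2} and the interval [0,1] are taken from the text.

import sympy as sp

x = sp.symbols('x')
f = x**2                     # the function of Example 5.10
a, b = 0, 1                  # the interval [a, b]
L = b - a                    # length of the interval; the series has period L here

# a_0 and the first few Fourier coefficients from Theorem 5.9
a0 = 2*sp.integrate(f, (x, a, b))/L
print('a0 =', a0)                                   # expect 2/3
for k in range(1, 4):
    ak = 2*sp.integrate(f*sp.cos(2*k*sp.pi*x/L), (x, a, b))/L
    bk = 2*sp.integrate(f*sp.sin(2*k*sp.pi*x/L), (x, a, b))/L
    print(k, sp.simplify(ak), sp.simplify(bk))      # expect 1/(k**2*pi**2), -1/(k*pi)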

In Figure 5.11 we see the sum of the first 50 terms of this series. Generally, we see that the series seems to be converging to the periodic extension of f. There appear to be some problems with the convergence around integer values of x. We will later see that this is because of the discontinuities in the periodic extension; the resulting overshoot is referred to as the Gibbs phenomenon, which is discussed in the appendix.

Example 5.12. Even Periodic Extension - Cosine Series

In this case we compute a_{0}=\dfrac{2}{3} and a_{n}=\dfrac{4(-1)^{n}}{n^{2} \pi^{2}}. Therefore, we have

f(x) \sim \dfrac{1}{3}+\dfrac{4}{\pi^{2}} \sum_{n=1}^{\infty} \dfrac{(-1)^{n}}{n^{2}} \cos n \pi x . \nonumber

In Figure 5.12 we see the sum of the first 50 terms of this series. In this case the convergence seems to be much better than in the periodic extension case. We also see that it is converging to the even extension.

Example 5.13. Odd Periodic Extension - Sine Series

Finally, we look at the sine series for this function. We find that b_{n}=-\dfrac{2}{n^{3} \pi^{3}}\left(n^{2} \pi^{2}(-1)^{n}-2(-1)^{n}+2\right). Therefore, the sine series is

f(x) \sim-\dfrac{2}{\pi^{3}} \sum_{n=1}^{\infty} \dfrac{n^{2} \pi^{2}(-1)^{n}-2(-1)^{n}+2}{n^{3}} \sin n \pi x . \nonumber

image
Figure 5.11. The periodic extension of f(x)=x^{2} on [0,1].
image
Figure 5.12. The even periodic extension of f(x)=x^{2} on [0,1].

Once again we see discontinuities in the extension, as seen in Figure 5.13. However, we have verified that our sine series appears to be converging to the odd extension as we first sketched in Figure 5.10.
image
Figure 5.13. The odd periodic extension of f(x)=x^{2} on [0,1].

5.5 Appendix: The Gibbs Phenomenon

We have seen that jump discontinuities arise in the periodic extensions of our functions, whether the function originally had a discontinuity or developed one due to a mismatch in the values at the endpoints. This can be seen in Figures 5.9, 5.11 and 5.13. The Fourier series has a difficult time converging at the point of discontinuity, and these graphs of the Fourier series show a distinct overshoot which does not go away. This is called the Gibbs phenomenon and the amount of overshoot can be computed.

In one of our first examples, Example 5.6, we found the Fourier series representation of the piecewise defined function

f(x)=\left\{\begin{array}{c} 1, \quad 0<x<\pi \\[4pt] -1, \quad \pi<x<2 \pi \end{array}\right. \nonumber

to be

f(x) \sim \dfrac{4}{\pi} \sum_{k=1}^{\infty} \dfrac{\sin (2 k-1) x}{2 k-1} . \nonumber

In Figure 5.14 we display the sum of the first ten terms. Note the wiggles, overshoots and undershoots near x=0, \pm \pi. These are seen more clearly when we plot the representation for x \in[-3 \pi, 3 \pi], as shown in Figure 5.15. We note that the overshoots and undershoots occur at discontinuities in the periodic extension of f(x). These occur whenever f(x) has a discontinuity or if the values of f(x) at the endpoints of the domain do not agree.

One might expect that we only need to add more terms. In Figure 5.16 we show the sum for twenty terms. Note that the sum appears to converge better for points far from the discontinuities, but the overshoots and undershoots are still present. Figures 5.17 and 5.18 show magnified plots of the overshoot at x=0 for N=100 and N=500, respectively. We see that the overshoot persists. The peak is at about the same height, but its location seems to be getting closer to the origin. We will show how one can estimate the size of the overshoot.

image
Figure 5.14. The Fourier series representation of a step function on [-\pi, \pi] for N=10.

We can study the Gibbs phenomenon by looking at the partial sums of general Fourier trigonometric series for functions f(x) defined on the interval [-L, L]. Writing out the partial sums, inserting the Fourier coefficients and rearranging, we have

image
Figure 5.15. The Fourier series representation of a step function on [-\pi, \pi] for N=10 plotted on [-3 \pi, 3 \pi] displaying the periodicity.
image
Figure 5.16. The Fourier series representation of a step function on [-\pi, \pi] for N=20.
image
Figure 5.17. The Fourier series representation of a step function on [-\pi, \pi] for N= 100 .

\begin{aligned} S_{N}(x)=& \dfrac{a_{0}}{2}+\sum_{n=1}^{N}\left[a_{n} \cos \dfrac{n \pi x}{L}+b_{n} \sin \dfrac{n \pi x}{L}\right] \\[4pt] =& \dfrac{1}{2 L} \int_{-L}^{L} f(y) d y+\sum_{n=1}^{N}\left[\left(\dfrac{1}{L} \int_{-L}^{L} f(y) \cos \dfrac{n \pi y}{L} d y\right) \cos \dfrac{n \pi x}{L}\right.\\[4pt] &\left.+\left(\dfrac{1}{L} \int_{-L}^{L} f(y) \sin \dfrac{n \pi y}{L} d y\right) \sin \dfrac{n \pi x}{L}\right] \\[4pt] =& \dfrac{1}{L} \int_{-L}^{L}\left\{\dfrac{1}{2}+\sum_{n=1}^{N}\left(\cos \dfrac{n \pi y}{L} \cos \dfrac{n \pi x}{L}+\sin \dfrac{n \pi y}{L} \sin \dfrac{n \pi x}{L}\right)\right\} f(y) d y \\[4pt] =& \dfrac{1}{L} \int_{-L}^{L}\left\{\dfrac{1}{2}+\sum_{n=1}^{N} \cos \dfrac{n \pi(y-x)}{L}\right\} f(y) d y \\[4pt] \equiv & \dfrac{1}{L} \int_{-L}^{L} D_{N}(y-x) f(y) d y . \end{aligned} \nonumber

We have defined

D_{N}(x)=\dfrac{1}{2}+\sum_{n=1}^{N} \cos \dfrac{n \pi x}{L}, \nonumber

which is called the N-th Dirichlet Kernel. We now prove

image
Figure 5.18. The Fourier series representation of a step function on [-\pi, \pi] for N= 500 .

Proposition:

D_{n}(x)= \begin{cases}\dfrac{\sin \left(\left(n+\dfrac{1}{2}\right) \dfrac{\pi x}{L}\right)}{2 \sin \dfrac{\pi x}{2 L}}, & \sin \dfrac{\pi x}{2 L} \neq 0 \\[4pt] n+\dfrac{1}{2}, & \sin \dfrac{\pi x}{2 L}=0\end{cases} \nonumber

Proof: Let \theta=\dfrac{\pi x}{L} and multiply D_{n}(x) by 2 \sin \dfrac{\theta}{2} to obtain:

\begin{aligned} 2 \sin \dfrac{\theta}{2} D_{n}(x)=& 2 \sin \dfrac{\theta}{2}\left[\dfrac{1}{2}+\cos \theta+\cdots+\cos n \theta\right] \\[4pt] =& \sin \dfrac{\theta}{2}+2 \cos \theta \sin \dfrac{\theta}{2}+2 \cos 2 \theta \sin \dfrac{\theta}{2}+\cdots+2 \cos n \theta \sin \dfrac{\theta}{2} \\[4pt] =& \sin \dfrac{\theta}{2}+\left(\sin \dfrac{3 \theta}{2}-\sin \dfrac{\theta}{2}\right)+\left(\sin \dfrac{5 \theta}{2}-\sin \dfrac{3 \theta}{2}\right)+\cdots \\[4pt] &+\left[\sin \left(n+\dfrac{1}{2}\right) \theta-\sin \left(n-\dfrac{1}{2}\right) \theta\right] \\[4pt] =& \sin \left(n+\dfrac{1}{2}\right) \theta \end{aligned} \nonumber

Thus,

2 \sin \dfrac{\theta}{2} D_{n}(x)=\sin \left(n+\dfrac{1}{2}\right) \theta \nonumber

or if \sin \dfrac{\theta}{2} \neq 0,

D_{n}(x)=\dfrac{\sin \left(n+\dfrac{1}{2}\right) \theta}{2 \sin \dfrac{\theta}{2}}, \quad \theta=\dfrac{\pi x}{L} \nonumber

If \sin \dfrac{\theta}{2}=0, then one needs to apply L’Hospital’s Rule:

\begin{aligned} \lim _{\theta \rightarrow 2 m \pi} \dfrac{\sin \left(n+\dfrac{1}{2}\right) \theta}{2 \sin \dfrac{\theta}{2}} &=\lim _{\theta \rightarrow 2 m \pi} \dfrac{\left(n+\dfrac{1}{2}\right) \cos \left(n+\dfrac{1}{2}\right) \theta}{\cos \dfrac{\theta}{2}} \\[4pt] &=\dfrac{\left(n+\dfrac{1}{2}\right) \cos (2 m n \pi+m \pi)}{\cos m \pi} \\[4pt] &=n+\dfrac{1}{2} . \end{aligned} \nonumber
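As a quick numerical sanity check on the proposition (added here, not part of the text), one can compare the defining sum for D_{N} with the closed form at a few points; the values of N, L and the sample points below are arbitrary.

import numpy as np

def dirichlet_sum(x, N, L):
    n = np.arange(1, N + 1)
    return 0.5 + np.cos(np.outer(x, n)*np.pi/L).sum(axis=1)

def dirichlet_closed(x, N, L):
    return np.sin((N + 0.5)*np.pi*x/L)/(2*np.sin(np.pi*x/(2*L)))

x = np.linspace(0.1, 1.9, 7)   # stay away from x = 0, 2L, where sin(pi x/(2L)) = 0
print(np.allclose(dirichlet_sum(x, 10, 1.0), dirichlet_closed(x, 10, 1.0)))   # True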

We further note that D_{N}(x) is periodic with period 2 L and is an even function. So far, we have found that

S_{N}(x)=\dfrac{1}{L} \int_{-L}^{L} D_{N}(y-x) f(y) d y \nonumber

Now, make the substitution \xi=y-x. Then,

\begin{aligned} S_{N}(x) &=\dfrac{1}{L} \int_{-L-x}^{L-x} D_{N}(\xi) f(\xi+x) d \xi \\[4pt] &=\dfrac{1}{L} \int_{-L}^{L} D_{N}(\xi) f(\xi+x) d \xi \end{aligned} \nonumber

In the second integral we have made use of the fact that f(x) and D_{N}(x) are periodic with period 2 L and shifted the interval back to [-L, L].

Now split the integration and use the fact that D_{N}(x) is an even function. Then,

\begin{aligned} S_{N}(x) &=\dfrac{1}{L} \int_{-L}^{0} D_{N}(\xi) f(\xi+x) d \xi+\dfrac{1}{L} \int_{0}^{L} D_{N}(\xi) f(\xi+x) d \xi \\[4pt] &=\dfrac{1}{L} \int_{0}^{L}[f(x-\xi)+f(\xi+x)] D_{N}(\xi) d \xi \end{aligned} \nonumber

We can use this result to study the Gibbs phenomenon whenever it occurs. In particular, we will only concentrate on our earlier example. Namely,

f(x)=\left\{\begin{array}{c} 1, \quad 0<x<\pi \\[4pt] -1, \pi<x<2 \pi \end{array}\right. \nonumber

For this case, we have

S_{N}(x)=\dfrac{1}{\pi} \int_{0}^{\pi}[f(x-\xi)+f(\xi+x)] D_{N}(\xi) d \xi \nonumber

for

D_{N}(x)=\dfrac{1}{2}+\sum_{n=1}^{N} \cos n x \nonumber

Also, one can show that, for 0<x<\pi / 2,

f(x-\xi)+f(\xi+x)=\left\{\begin{array}{c} 2, \quad 0 \leq \xi<x \\[4pt] 0, \quad x \leq \xi<\pi-x \\[4pt] -2, \pi-x \leq \xi<\pi \end{array}\right. \nonumber

Thus, we have

\begin{aligned} S_{N}(x) &=\dfrac{2}{\pi} \int_{0}^{x} D_{N}(\xi) d \xi-\dfrac{2}{\pi} \int_{\pi-x}^{\pi} D_{N}(\xi) d \xi \\[4pt] &=\dfrac{2}{\pi} \int_{0}^{x} D_{N}(z) d z-\dfrac{2}{\pi} \int_{0}^{x} D_{N}(\pi-z) d z . \end{aligned} \nonumber

Here we made the substitution z=\pi-\xi in the second integral. The Dirichlet kernel in the proposition for L=\pi is given by

D_{N}(x)=\dfrac{\sin \left(N+\dfrac{1}{2}\right) x}{2 \sin \dfrac{x}{2}} . \nonumber

For N large, we have N+\dfrac{1}{2} \approx N, and for small x, we have \sin \dfrac{x}{2} \approx \dfrac{x}{2}. So, under these assumptions,

D_{N}(x) \approx \dfrac{\sin N x}{x} . \nonumber

Therefore,

S_{N}(x) \rightarrow \dfrac{2}{\pi} \int_{0}^{x} \dfrac{\sin N \xi}{\xi} d \xi . \nonumber

If we want to determine the locations of the minima and maxima, where the undershoot and overshoot occur, then we apply the first derivative test for extrema to S_{N}(x). Thus,

\dfrac{d}{d x} S_{N}(x)=\dfrac{2}{\pi} \dfrac{\sin N x}{x}=0 . \nonumber

The extrema occur for N x=m \pi, m=\pm 1, \pm 2, \ldots One can show that there is a maximum at x=\pi / N and a minimum for x=2 \pi / N. The value for the overshoot can be computed as

\begin{aligned} S_{N}(\pi / N) &=\dfrac{2}{\pi} \int_{0}^{\pi / N} \dfrac{\sin N \xi}{\xi} d \xi \\[4pt] &=\dfrac{2}{\pi} \int_{0}^{\pi} \dfrac{\sin t}{t} d t \\[4pt] &=\dfrac{2}{\pi} \operatorname{Si}(\pi) \\[4pt] &=1.178979744 \ldots . \end{aligned} \nonumber

Note that this value is independent of N and is given in terms of the sine integral,

\operatorname{Si}(x) \equiv \int_{0}^{x} \dfrac{\sin t}{t} d t \nonumber
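For a numerical check (added here for illustration), SciPy's sine integral reproduces the overshoot value quoted above.

import numpy as np
from scipy.special import sici

si_pi, _ = sici(np.pi)        # sici returns (Si(x), Ci(x))
print(2/np.pi*si_pi)          # approximately 1.178979744...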

Problems

5.1. Find the Fourier Series of each function f(x) of period 2 \pi. For each series, plot the N th partial sum,

S_{N}=\dfrac{a_{0}}{2}+\sum_{n=1}^{N}\left[a_{n} \cos n x+b_{n} \sin n x\right], \nonumber

for N=5,10,50 and describe the convergence (is it fast? what is it converging to, etc.) [Some simple Maple code for computing partial sums is shown below.]
a. f(x)=x,|x|<\pi.
b. f(x)=\dfrac{x^{2}}{4},|x|<\pi.
c. f(x)=\pi-|x|,|x|<\pi.
d. f(x)= \begin{cases}\dfrac{\pi}{2}, & 0<x<\pi \\[4pt] -\dfrac{\pi}{2}, & \pi<x<2 \pi\end{cases}
e. f(x)=\left\{\begin{array}{l}0,-\pi<x<0 \\[4pt] 1,0<x<\pi\end{array}\right.

A simple set of commands in Maple is shown below, where you fill in the Fourier coefficients that you have computed by hand and f(x) so that you can compare your results. Of course, other modifications may be needed.

image
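The Maple commands referred to above appear only as an image in this version. The Python skeleton below is an illustrative substitute (not the original code): fill in f(x), a_{0}, a_{n} and b_{n} with the values you computed by hand and compare the partial sum with the function.

import numpy as np
import matplotlib.pyplot as plt

def f(x):
    return x                  # fill in the function from the problem

a0 = 0.0                      # fill in your a_0
def a(n):
    return 0.0                # fill in your a_n
def b(n):
    return 0.0                # fill in your b_n

N = 10
x = np.linspace(-np.pi, np.pi, 400)
S = a0/2 + sum(a(n)*np.cos(n*x) + b(n)*np.sin(n*x) for n in range(1, N + 1))

plt.plot(x, f(x), label='f(x)')
plt.plot(x, S, label='S_N(x)')
plt.legend()
plt.show()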

5.2. Consider the function f(x)=4 \sin ^{3} 2 x

a. Derive an identity expressing \sin ^{3} \theta in terms of \sin \theta and \sin 3 \theta, and express f(x) in terms of simple sine functions.

b. Determine the Fourier coefficients of f(x) in a Fourier series expansion on [0,2 \pi] without computing any integrals!

5.3. Find the Fourier series of f(x)=x on the given interval with the given period T. Plot the N th partial sums and describe what you see.

a. 0<x<2, T=2.

b. -2<x<2, T=4.

5.4. The result in Problem 5.1b above gives a Fourier series representation of \dfrac{x^{2}}{4}. By picking the right value for x and a little rearrangement of the series, show that [See Example 5.8.]

a.

\dfrac{\pi^{2}}{6}=1+\dfrac{1}{2^{2}}+\dfrac{1}{3^{2}}+\dfrac{1}{4^{2}}+\cdots \nonumber

b.

\dfrac{\pi^{2}}{8}=1+\dfrac{1}{3^{2}}+\dfrac{1}{5^{2}}+\dfrac{1}{7^{2}}+\cdots \nonumber

5.5. Sketch (by hand) the graphs of each of the following functions over four periods. Then sketch the extensions of each of the functions as both an even and odd periodic function. Determine the corresponding Fourier sine and cosine series and verify the convergence to the desired function using Maple.
a. f(x)=x^{2}, 0<x<1.
b. f(x)=x(2-x), 0<x<2.
c. f(x)=\left\{\begin{array}{l}0,0<x<1 \text {, } \\[4pt] 1,1<x<2 \text {. }\end{array}\right.
d. f(x)=\left\{\begin{array}{c}\pi, \quad 0<x<\pi \\[4pt] 2 \pi-x, \pi<x<2 \pi\end{array}\right.

6 Sturm-Liouville Eigenvalue Problems

Introduction

In the last chapters we have explored the solution of boundary value problems that led to trigonometric eigenfunctions. Such functions can be used to represent functions in Fourier series expansions. We would like to generalize some of those techniques in order to solve other boundary value problems. A class of problems to which our previous examples belong, and which have eigenfunctions with similar properties, is the class of Sturm-Liouville eigenvalue problems. These problems involve self-adjoint (differential) operators which play an important role in the spectral theory of linear operators and the existence of the eigenfunctions we described in Section 4.3.2. These ideas will be introduced in this chapter.

In physics many problems arise in the form of boundary value problems involving second order ordinary differential equations. For example, we might want to solve the equation

a_{2}(x) y^{\prime \prime}+a_{1}(x) y^{\prime}+a_{0}(x) y=f(x) \nonumber

subject to boundary conditions. We can write such an equation in operator form by defining the differential operator

L=a_{2}(x) \dfrac{d^{2}}{d x^{2}}+a_{1}(x) \dfrac{d}{d x}+a_{0}(x) . \nonumber

Then, Equation (6.1) takes the form

L y=f \text {. } \nonumber

As we saw in the general boundary value problem (4.20) in Section 4.3.2, we can solve some equations using eigenvalue expansions. Namely, we seek solutions to the eigenvalue problem

L \phi=\lambda \phi \nonumber

with homogeneous boundary conditions and then seek a solution as an expansion of the eigenfunctions. Formally, we let

y=\sum_{n=1}^{\infty} c_{n} \phi_{n} \nonumber

However, we are not guaranteed a nice set of eigenfunctions. We need an appropriate set to form a basis in the function space. Also, it would be nice to have orthogonality so that we can easily solve for the expansion coefficients as was done in Section 4.3.2. [Otherwise, we would have to solve an infinite coupled system of algebraic equations instead of an uncoupled and diagonal system.]

It turns out that any linear second order operator can be turned into an operator that possesses just the right properties (self-adjointness) to carry out this procedure. The resulting operator is referred to as a Sturm-Liouville operator. We will highlight some of the properties of such operators and prove a few key theorems, though this will not be an extensive review of Sturm-Liouville theory. The interested reader can review the literature and more advanced texts for a more in-depth analysis.

We define the Sturm-Liouville operator as

\mathcal{L}=\dfrac{d}{d x} p(x) \dfrac{d}{d x}+q(x) \nonumber

The Sturm-Liouville eigenvalue problem is given by the differential equation

\mathcal{L} u=-\lambda \sigma(x) u, \nonumber

or, equivalently,

\dfrac{d}{d x}\left(p(x) \dfrac{d u}{d x}\right)+q(x) u+\lambda \sigma(x) u=0 \nonumber

for x \in(a, b). The functions p(x), p^{\prime}(x), q(x) and \sigma(x) are assumed to be continuous on (a, b) and p(x)>0, \sigma(x)>0 on [a, b]. If the interval is finite and these assumptions on the coefficients are true on [a, b], then the problem is said to be regular. Otherwise, it is called singular.

We also need to impose the set of homogeneous boundary conditions

\begin{array}{r} \alpha_{1} u(a)+\beta_{1} u^{\prime}(a)=0 \\[4pt] \alpha_{2} u(b)+\beta_{2} u^{\prime}(b)=0 \end{array} \nonumber

The \alpha ’s and \beta ’s are constants. For different values, one has special types of boundary conditions. For \beta_{i}=0, we have what are called Dirichlet boundary conditions. Namely, u(a)=0 and u(b)=0. For \alpha_{i}=0, we have Neumann boundary conditions. In this case, u^{\prime}(a)=0 and u^{\prime}(b)=0. In terms of the heat equation example, Dirichlet conditions correspond to maintaining a fixed temperature at the ends of the rod. The Neumann boundary conditions would correspond to no heat flow across the ends, or insulating conditions, as there would be no temperature gradient at those points. The more general boundary conditions allow for partially insulated boundaries.

Another type of boundary condition that is often encountered is the periodic boundary condition. Consider the heated rod that has been bent to form a circle. Then the two end points are physically the same. So, we would expect that the temperature and the temperature gradient should agree at those points. For this case we write u(a)=u(b) and u^{\prime}(a)=u^{\prime}(b). Boundary value problems using these conditions have to be handled differently than the above homogeneous conditions. These conditions lead to different types of eigenfunctions and eigenvalues.

As previously mentioned, equations of the form (6.1) occur often. We now show that Equation (6.1) can be turned into a differential equation of Sturm-Liouville form:

\dfrac{d}{d x}\left(p(x) \dfrac{d y}{d x}\right)+q(x) y=F(x) \nonumber

Another way to phrase this is provided in the theorem:

Theorem 6.1. Any second order linear operator can be put into the form of the Sturm-Liouville operator (6.2).

The proof of this is straightforward, as we shall soon show. Consider Equation (6.1). If a_{1}(x)=a_{2}^{\prime}(x), then we can write the equation in the form

\begin{aligned} f(x) &=a_{2}(x) y^{\prime \prime}+a_{1}(x) y^{\prime}+a_{0}(x) y \\[4pt] &=\left(a_{2}(x) y^{\prime}\right)^{\prime}+a_{0}(x) y \end{aligned} \nonumber

This is in the correct form. We just identify p(x)=a_{2}(x) and q(x)=a_{0}(x).

However, consider the differential equation

x^{2} y^{\prime \prime}+x y^{\prime}+2 y=0 . \nonumber

In this case a_{2}(x)=x^{2} and a_{2}^{\prime}(x)=2 x \neq a_{1}(x). The linear differential operator in this equation is not of Sturm-Liouville type. But, we can change it to a Sturm-Liouville operator.

In the Sturm-Liouville operator the derivative terms are gathered together into one perfect derivative. This is similar to what we saw in the first chapter when we solved linear first order equations. In that case we sought an integrating factor. We can do the same thing here. We seek a multiplicative function \mu(x) by which we can multiply (6.1) so that it can be written in Sturm-Liouville form. We first divide out the a_{2}(x), giving

y^{\prime \prime}+\dfrac{a_{1}(x)}{a_{2}(x)} y^{\prime}+\dfrac{a_{0}(x)}{a_{2}(x)} y=\dfrac{f(x)}{a_{2}(x)} . \nonumber

Now, we multiply the differential equation by \mu :

\mu(x) y^{\prime \prime}+\mu(x) \dfrac{a_{1}(x)}{a_{2}(x)} y^{\prime}+\mu(x) \dfrac{a_{0}(x)}{a_{2}(x)} y=\mu(x) \dfrac{f(x)}{a_{2}(x)} \nonumber

The first two terms can now be combined into an exact derivative \left(\mu y^{\prime}\right)^{\prime} if \mu(x) satisfies

\dfrac{d \mu}{d x}=\mu(x) \dfrac{a_{1}(x)}{a_{2}(x)} . \nonumber

This is formally solved to give

\mu(x)=e^{\int \dfrac{a_{1}(x)}{a_{2}(x)} d x} . \nonumber

Thus, the original equation can be multiplied by the factor

\dfrac{\mu(x)}{a_{2}(x)}=\dfrac{1}{a_{2}(x)} e^{\int \dfrac{a_{1}(x)}{a_{2}(x)} d x} \nonumber

to turn it into Sturm-Liouville form.

In summary,

\begin{aligned} &\text { Equation (6.1), } \\[4pt] &\qquad a_{2}(x) y^{\prime \prime}+a_{1}(x) y^{\prime}+a_{0}(x) y=f(x) \end{aligned} \nonumber

can be put into the Sturm-Liouville form

\dfrac{d}{d x}\left(p(x) \dfrac{d y}{d x}\right)+q(x) y=F(x) \nonumber

where

\begin{aligned} p(x) &=e^{\int \dfrac{a_{1}(x)}{a_{2}(x)} d x} \\[4pt] q(x) &=p(x) \dfrac{a_{0}(x)}{a_{2}(x)} \\[4pt] F(x) &=p(x) \dfrac{f(x)}{a_{2}(x)} \end{aligned} \nonumber

Example 6.2. For the example above,

x^{2} y^{\prime \prime}+x y^{\prime}+2 y=0 . \nonumber

We need only multiply this equation by

\dfrac{1}{x^{2}} e^{\int \dfrac{d x}{x}}=\dfrac{1}{x}, \nonumber

to put the equation in Sturm-Liouville form:

\begin{aligned} 0 &=x y^{\prime \prime}+y^{\prime}+\dfrac{2}{x} y \\[4pt] &=\left(x y^{\prime}\right)^{\prime}+\dfrac{2}{x} y \end{aligned} \nonumber
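The same conversion can be done symbolically. The SymPy sketch below (illustrative, not from the text) computes the integrating factor and the Sturm-Liouville coefficients p and q for the equation of Example 6.2.

import sympy as sp

x = sp.symbols('x', positive=True)
a2, a1, a0 = x**2, x, 2                           # x^2 y'' + x y' + 2 y = 0

p = sp.simplify(sp.exp(sp.integrate(a1/a2, x)))   # p = exp(int a1/a2 dx)
q = sp.simplify(p*a0/a2)                          # q = p a0/a2
factor = sp.simplify(p/a2)                        # multiply the equation by mu/a2 = p/a2
print(p, q, factor)                               # expect x, 2/x, 1/x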

Properties of Sturm-Liouville Eigenvalue Problems

There are several properties that can be proven for the (regular) Sturm-Liouville eigenvalue problem. However, we will not prove them all here. We will merely list some of the important facts and focus on a few of the properties.

  1. The eigenvalues are real, countable, ordered and there is a smallest eigenvalue. Thus, we can write them as \lambda_{1}<\lambda_{2}<\ldots. However, there is no largest eigenvalue and, as n \rightarrow \infty, \lambda_{n} \rightarrow \infty.
  2. For each eigenvalue \lambda_{n} there exists an eigenfunction \phi_{n} with n-1 zeros on (a, b).
  3. Eigenfunctions corresponding to different eigenvalues are orthogonal with respect to the weight function, \sigma(x). Defining the inner product of f(x) and g(x) as

<f, g>=\int_{a}^{b} f(x) g(x) \sigma(x) d x \nonumber

then the orthogonality of the eigenfunctions can be written in the form

<\phi_{n}, \phi_{m}>=<\phi_{n}, \phi_{n}>\delta_{n m}, \quad n, m=1,2, \ldots \nonumber

  4. The set of eigenfunctions is complete; i.e., any piecewise smooth function can be represented by a generalized Fourier series expansion of the eigenfunctions,

f(x) \sim \sum_{n=1}^{\infty} c_{n} \phi_{n}(x) \nonumber

where

c_{n}=\dfrac{<f, \phi_{n}>}{<\phi_{n}, \phi_{n}>} \nonumber

Actually, one needs f(x) \in L_{\sigma}^{2}[a, b], the set of square integrable functions over [a, b] with weight function \sigma(x). By square integrable, we mean that <f, f><\infty. One can show that such a space is isomorphic to a Hilbert space, a complete inner product space.

  5. Multiply the eigenvalue problem

\mathcal{L} \phi_{n}=-\lambda_{n} \sigma(x) \phi_{n} \nonumber

by \phi_{n} and integrate. Solve this result for \lambda_{n}, to find the Rayleigh Quotient

\lambda_{n}=\dfrac{-\left.p \phi_{n} \dfrac{d \phi_{n}}{d x}\right|_{a} ^{b}+\int_{a}^{b}\left[p\left(\dfrac{d \phi_{n}}{d x}\right)^{2}-q \phi_{n}^{2}\right] d x}{<\phi_{n}, \phi_{n}>} \nonumber

The Rayleigh quotient is useful for getting estimates of eigenvalues and proving some of the other properties.

Example 6.3. We seek the eigenfunctions of the operator found in Example 6.2. Namely, we want to solve the eigenvalue problem

\mathcal{L} y=\left(x y^{\prime}\right)^{\prime}+\dfrac{2}{x} y=-\lambda \sigma y \nonumber

subject to a set of boundary conditions. Let’s use the boundary conditions

y^{\prime}(1)=0, \quad y^{\prime}(2)=0 \nonumber

[Note that we do not know \sigma(x) yet, but will choose an appropriate function to obtain solutions.]

Expanding the derivative, we have

x y^{\prime \prime}+y^{\prime}+\dfrac{2}{x} y=-\lambda \sigma y . \nonumber

Multiply through by x to obtain

x^{2} y^{\prime \prime}+x y^{\prime}+(2+\lambda x \sigma) y=0 \nonumber

Notice that if we choose \sigma(x)=x^{-1}, then this equation can be made a Cauchy-Euler type equation. Thus, we have

x^{2} y^{\prime \prime}+x y^{\prime}+(\lambda+2) y=0 . \nonumber

The characteristic equation is

r^{2}+\lambda+2=0 . \nonumber

For oscillatory solutions, we need \lambda+2>0. Thus, the general solution is

y(x)=c_{1} \cos (\sqrt{\lambda+2} \ln |x|)+c_{2} \sin (\sqrt{\lambda+2} \ln |x|) . \nonumber

Next we apply the boundary conditions. y^{\prime}(1)=0 forces c_{2}=0. This leaves

y(x)=c_{1} \cos (\sqrt{\lambda+2} \ln x) . \nonumber

The second condition, y^{\prime}(2)=0, yields

\sin (\sqrt{\lambda+2} \ln 2)=0 \nonumber

This will give nontrivial solutions when

\sqrt{\lambda+2} \ln 2=n \pi, \quad n=0,1,2,3 \ldots \nonumber

In summary, the eigenfunctions for this eigenvalue problem are

y_{n}(x)=\cos \left(\dfrac{n \pi}{\ln 2} \ln x\right), \quad 1 \leq x \leq 2 \nonumber

and the eigenvalues are \lambda_{n}=\left(\dfrac{n \pi}{\ln 2}\right)^{2}-2 for n=0,1,2, \ldots

Note: We include the n=0 case because y(x)= constant is a solution of the \lambda=-2 case. More specifically, in this case the characteristic equation reduces to r^{2}=0. Thus, the general solution of this Cauchy-Euler equation is

y(x)=c_{1}+c_{2} \ln |x| \nonumber

Setting y^{\prime}(1)=0 forces c_{2}=0, and y^{\prime}(2)=0 is then automatically satisfied, leaving the solution in this case as y(x)=c_{1}.

We note that some of the properties listed in the beginning of the section hold for this example. The eigenvalues are seen to be real, countable and ordered. There is a smallest one, \lambda_{0}=-2. Next, one can count the zeros of each eigenfunction on (1,2). The argument of the cosine, \dfrac{n \pi}{\ln 2} \ln x, takes values 0 to n \pi for x \in[1,2], so the cosine function has n zeros on this interval. Counting from the smallest eigenvalue, the kth eigenfunction therefore has k-1 zeros, in agreement with the second property.

Orthogonality can be checked as well. We set up the integral and use the substitution y=\pi \ln x / \ln 2. This gives

\begin{aligned} <y_{n}, y_{m}>&=\int_{1}^{2} \cos \left(\dfrac{n \pi}{\ln 2} \ln x\right) \cos \left(\dfrac{m \pi}{\ln 2} \ln x\right) \dfrac{d x}{x} \\[4pt] &=\dfrac{\ln 2}{\pi} \int_{0}^{\pi} \cos n y \cos m y d y \\[4pt] &=\dfrac{\ln 2}{2} \delta_{n, m} \end{aligned} \nonumber
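This orthogonality relation is easy to confirm numerically; the following check (added here for illustration) uses SciPy quadrature.

import numpy as np
from scipy.integrate import quad

def y(n, x):
    return np.cos(n*np.pi*np.log(x)/np.log(2))

def inner(n, m):
    # inner product with weight sigma(x) = 1/x on [1, 2]
    val, _ = quad(lambda x: y(n, x)*y(m, x)/x, 1, 2)
    return val

print(inner(2, 3))                 # approximately 0
print(inner(2, 2), np.log(2)/2)    # both approximately 0.3466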

Adjoint Operators

In the study of the spectral theory of matrices, one learns about the adjoint of a matrix, A^{\dagger}, and the role that self-adjoint, or Hermitian, matrices play in diagonalization. Also, one needs the concept of the adjoint to discuss the existence of solutions to the matrix problem \mathbf{y}=A \mathbf{x}. In the same spirit, one is interested in the existence of solutions of the operator equation L u=f and solutions of the corresponding eigenvalue problem. The study of linear operators on Hilbert spaces is a generalization of what the reader has seen in a linear algebra course.

Just as one can find a basis of eigenvectors and diagonalize Hermitian, or self-adjoint, matrices (or, real symmetric matrices in the real case), we will see that the Sturm-Liouville operator is self-adjoint. In this section we will define the domain of an operator and introduce the notion of adjoint operators. In the last section we discuss the role the adjoint plays in the existence of solutions to the operator equation L u=f.

We first introduce some definitions.

Definition 6.4. The domain of a differential operator L is the set of all u \in L_{\sigma}^{2}[a, b] satisfying a given set of homogeneous boundary conditions.

Definition 6.5. The adjoint, L^{\dagger}, of operator L satisfies

<u, L v>=<L^{\dagger} u, v> \nonumber

for all v in the domain of L and u in the domain of L^{\dagger}.

Example 6.6. As an example, we find the adjoint of the second order linear differential operator L=a_{2}(x) \dfrac{d^{2}}{d x^{2}}+a_{1}(x) \dfrac{d}{d x}+a_{0}(x).

In order to find the adjoint, we place the operator under an integral. So, we consider the inner product

<u, L v>=\int_{a}^{b} u\left(a_{2} v^{\prime \prime}+a_{1} v^{\prime}+a_{0} v\right) d x \nonumber

We have to move the operator L from v and determine what operator is acting on u in order to formally preserve the inner product. For a simple operator like L=\dfrac{d}{d x}, this is easily done using integration by parts. For the given operator, we will need to apply several integrations by parts, so we consider the individual terms.

First we consider the a_{1} v^{\prime} term. Integration by parts yields

\int_{a}^{b} u(x) a_{1}(x) v^{\prime}(x) d x=\left.a_{1}(x) u(x) v(x)\right|_{a} ^{b}-\int_{a}^{b}\left(u(x) a_{1}(x)\right)^{\prime} v(x) d x \nonumber

Now, we consider the a_{2} v^{\prime \prime} term. In this case it will take two integrations by parts:

\begin{aligned} \int_{a}^{b} u(x) a_{2}(x) v^{\prime \prime}(x) d x &=\left.a_{2}(x) u(x) v^{\prime}(x)\right|_{a} ^{b}-\int_{a}^{b}\left(u(x) a_{2}(x)\right)^{\prime} v^{\prime}(x) d x \\[4pt] &=\left.\left[a_{2}(x) u(x) v^{\prime}(x)-\left(a_{2}(x) u(x)\right)^{\prime} v(x)\right]\right|_{a} ^{b}+\int_{a}^{b}\left(u(x) a_{2}(x)\right)^{\prime \prime} v(x) d x \end{aligned} \nonumber

Combining these results, we obtain

\begin{aligned} <u, L v>&=\int_{a}^{b} u\left(a_{2} v^{\prime \prime}+a_{1} v^{\prime}+a_{0} v\right) d x \\[4pt] =& {\left.\left[a_{1}(x) u(x) v(x)+a_{2}(x) u(x) v^{\prime}(x)-\left(a_{2}(x) u(x)\right)^{\prime} v(x)\right]\right|_{a} ^{b} } \\[4pt] &+\int_{a}^{b}\left[\left(a_{2} u\right)^{\prime \prime}-\left(a_{1} u\right)^{\prime}+a_{0} u\right] v d x \end{aligned} \nonumber

Inserting the boundary conditions for v, one has to determine boundary conditions for u such that

\left.\left[a_{1}(x) u(x) v(x)+a_{2}(x) u(x) v^{\prime}(x)-\left(a_{2}(x) u(x)\right)^{\prime} v(x)\right]\right|_{a} ^{b}=0 \nonumber

This leaves

<u, L v>=\int_{a}^{b}\left[\left(a_{2} u\right)^{\prime \prime}-\left(a_{1} u\right)^{\prime}+a_{0} u\right] v d x \equiv<L^{\dagger} u, v>. \nonumber

Therefore,

L^{\dagger}=\dfrac{d^{2}}{d x^{2}} a_{2}(x)-\dfrac{d}{d x} a_{1}(x)+a_{0}(x) \nonumber

When L^{\dagger}=L, the operator is called formally self-adjoint. When the domain of L is the same as the domain of L^{\dagger}, the term self-adjoint is used. As the domain is important in establishing self-adjointness, we need to do a complete example in which the domain of the adjoint is found.

Example 6.7. Determine L^{\dagger} and its domain for operator L u=\dfrac{d u}{d x} where u satisfies the boundary conditions u(0)=2 u(1) on [0,1].

We need to find the adjoint operator satisfying <v, L u>=<L^{\dagger} v, u>. Therefore, we rewrite the integral

<v, L u>=\int_{0}^{1} v \dfrac{d u}{d x} d x=\left.u v\right|_{0} ^{1}-\int_{0}^{1} u \dfrac{d v}{d x} d x=<L^{\dagger} v, u>\text {. } \nonumber

From this we have the adjoint problem consisting of an adjoint operator and the associated boundary condition:

  1. L^{\dagger}=-\dfrac{d}{d x}
  2. \left.u v\right|_{0} ^{1}=0 \Rightarrow 0=u(1)[v(1)-2 v(0)] \Rightarrow v(1)=2 v(0)

Lagrange’s and Green’s Identities

Before turning to the proofs that the eigenvalues of a Sturm-Liouville problem are real and the associated eigenfunctions orthogonal, we will first need to introduce two important identities. For the Sturm-Liouville operator,

\mathcal{L}=\dfrac{d}{d x}\left(p \dfrac{d}{d x}\right)+q \nonumber

we have the two identities:

Lagrange’s Identity u \mathcal{L} v-v \mathcal{L} u=\left[p\left(u v^{\prime}-v u^{\prime}\right)\right]^{\prime}.

Green’s Identity \int_{a}^{b}(u \mathcal{L} v-v \mathcal{L} u) d x=\left.\left[p\left(u v^{\prime}-v u^{\prime}\right)\right]\right|_{a} ^{b}.

Proof. The proof of Lagrange’s identity follows by a simple manipulation of the operator:

\begin{aligned} u \mathcal{L} v-v \mathcal{L} u &=u\left[\dfrac{d}{d x}\left(p \dfrac{d v}{d x}\right)+q v\right]-v\left[\dfrac{d}{d x}\left(p \dfrac{d u}{d x}\right)+q u\right] \\[4pt] &=u \dfrac{d}{d x}\left(p \dfrac{d v}{d x}\right)-v \dfrac{d}{d x}\left(p \dfrac{d u}{d x}\right) \\[4pt] &=u \dfrac{d}{d x}\left(p \dfrac{d v}{d x}\right)+p \dfrac{d u}{d x} \dfrac{d v}{d x}-v \dfrac{d}{d x}\left(p \dfrac{d u}{d x}\right)-p \dfrac{d u}{d x} \dfrac{d v}{d x} \\[4pt] &=\dfrac{d}{d x}\left[p u \dfrac{d v}{d x}-p v \dfrac{d u}{d x}\right] \end{aligned} \nonumber

Green’s identity is simply proven by integrating Lagrange’s identity.
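Lagrange's identity can also be checked symbolically. The short SymPy verification below (not part of the original text) shows that the difference of the two sides vanishes for arbitrary p, q, u and v.

import sympy as sp

x = sp.symbols('x')
p, q, u, v = (sp.Function(s)(x) for s in ('p', 'q', 'u', 'v'))

Lu = sp.diff(p*sp.diff(u, x), x) + q*u    # Sturm-Liouville operator applied to u
Lv = sp.diff(p*sp.diff(v, x), x) + q*v    # ... and to v

lhs = u*Lv - v*Lu
rhs = sp.diff(p*(u*sp.diff(v, x) - v*sp.diff(u, x)), x)
print(sp.simplify(lhs - rhs))             # 0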

Orthogonality and Reality

We are now ready to prove that the eigenvalues of a Sturm-Liouville problem are real and the corresponding eigenfunctions are orthogonal. These are easily established using Green’s identity, which in turn is a statement about the Sturm-Liouville operator being self-adjoint.

Theorem 6.8. The eigenvalues of the Sturm-Liouville problem are real.

Proof. Let \phi_{n}(x) be a solution of the eigenvalue problem associated with \lambda_{n} :

\mathcal{L} \phi_{n}=-\lambda_{n} \sigma \phi_{n} . \nonumber

The complex conjugate of this equation is

\mathcal{L} \bar{\phi}_{n}=-\bar{\lambda}_{n} \sigma \bar{\phi}_{n} . \nonumber

Now, multiply the first equation by \bar{\phi}_{n} and the second equation by \phi_{n} and then subtract the results. We obtain

\bar{\phi}_{n} \mathcal{L} \phi_{n}-\phi_{n} \mathcal{L} \bar{\phi}_{n}=\left(\bar{\lambda}_{n}-\lambda_{n}\right) \sigma \phi_{n} \bar{\phi}_{n} \nonumber

Integrate both sides of this equation:

\int_{a}^{b}\left(\bar{\phi}_{n} \mathcal{L} \phi_{n}-\phi_{n} \mathcal{L} \bar{\phi}_{n}\right) d x=\left(\bar{\lambda}_{n}-\lambda_{n}\right) \int_{a}^{b} \sigma \phi_{n} \bar{\phi}_{n} d x \nonumber

Apply Green’s identity to the left hand side to find

\left.\left[p\left(\bar{\phi}_{n} \phi_{n}^{\prime}-\phi_{n} \bar{\phi}_{n}^{\prime}\right)\right]\right|_{a} ^{b}=\left(\bar{\lambda}_{n}-\lambda_{n}\right) \int_{a}^{b} \sigma \phi_{n} \bar{\phi}_{n} d x \nonumber

Using the homogeneous boundary conditions for a self-adjoint operator, the left side vanishes to give

0=\left(\bar{\lambda}_{n}-\lambda_{n}\right) \int_{a}^{b} \sigma\left|\phi_{n}\right|^{2} d x \nonumber

The integral is positive (since \sigma>0 and \phi_{n} is nontrivial), so we must have \bar{\lambda}_{n}=\lambda_{n}. Therefore, the eigenvalues are real.

Theorem 6.9. The eigenfunctions corresponding to different eigenvalues of the Sturm-Liouville problem are orthogonal.

Proof. This is proven similar to the last theorem. Let \phi_{n}(x) be a solution of the eigenvalue problem associated with \lambda_{n}

\mathcal{L} \phi_{n}=-\lambda_{n} \sigma \phi_{n}, \nonumber

and let \phi_{m}(x) be a solution of the eigenvalue problem associated with \lambda_{m} \neq \lambda_{n}

\mathcal{L} \phi_{m}=-\lambda_{m} \sigma \phi_{m}, \nonumber

Now, multiply the first equation by \phi_{m} and the second equation by \phi_{n}. Subtracting the results, we obtain

\phi_{m} \mathcal{L} \phi_{n}-\phi_{n} \mathcal{L} \phi_{m}=\left(\lambda_{m}-\lambda_{n}\right) \sigma \phi_{n} \phi_{m} \nonumber

Similar to the previous proof, we integrate both sides of the equation and use Green’s identity and the boundary conditions for a self-adjoint operator. This leaves

0=\left(\lambda_{m}-\lambda_{n}\right) \int_{a}^{b} \sigma \phi_{n} \phi_{m} d x . \nonumber

Since the eigenvalues are distinct, we can divide by \lambda_{m}-\lambda_{n}, leaving the desired result,

\int_{a}^{b} \sigma \phi_{n} \phi_{m} d x=0 . \nonumber

Therefore, the eigenfunctions are orthogonal with respect to the weight function \sigma(x).

The Rayleigh Quotient

The Rayleigh quotient is useful for getting estimates of eigenvalues and proving some of the other properties associated with Sturm-Liouville eigenvalue problems. We begin by multiplying the eigenvalue problem

\mathcal{L} \phi_{n}=-\lambda_{n} \sigma(x) \phi_{n} \nonumber

by \phi_{n} and integrating. This gives

\int_{a}^{b}\left[\phi_{n} \dfrac{d}{d x}\left(p \dfrac{d \phi_{n}}{d x}\right)+q \phi_{n}^{2}\right] d x=-\lambda \int_{a}^{b} \phi_{n}^{2} \sigma d x \nonumber

One can solve the last equation for \lambda to find

\lambda=\dfrac{-\int_{a}^{b}\left[\phi_{n} \dfrac{d}{d x}\left(p \dfrac{d \phi_{n}}{d x}\right)+q \phi_{n}^{2}\right] d x}{\int_{a}^{b} \phi_{n}^{2} \sigma d x} \nonumber

It appears that we have solved for the eigenvalue and have not needed the machinery we had developed in Chapter 4 for studying boundary value problems. However, we really cannot evaluate this expression because we do not know the eigenfunctions, \phi_{n}(x) yet. Nevertheless, we will see what we can determine.

One can rewrite this result by performing an integration by parts on the first term in the numerator. Namely, pick u=\phi_{n} and d v=\dfrac{d}{d x}\left(p \dfrac{d \phi_{n}}{d x}\right) d x for the standard integration by parts formula. Then, we have

\int_{a}^{b} \phi_{n} \dfrac{d}{d x}\left(p \dfrac{d \phi_{n}}{d x}\right) d x=\left.p \phi_{n} \dfrac{d \phi_{n}}{d x}\right|_{a} ^{b}-\int_{a}^{b} p\left(\dfrac{d \phi_{n}}{d x}\right)^{2} d x \nonumber

Inserting the new formula into the expression for \lambda leads to the Rayleigh Quotient

\lambda_{n}=\dfrac{-\left.p \phi_{n} \dfrac{d \phi_{n}}{d x}\right|_{a} ^{b}+\int_{a}^{b}\left[p\left(\dfrac{d \phi_{n}}{d x}\right)^{2}-q \phi_{n}^{2}\right] d x}{\int_{a}^{b} \phi_{n}^{2} \sigma d x} . \nonumber

In many applications the sign of the eigenvalue is important. As we saw in the solution of the heat equation, the time dependence satisfies T^{\prime}+k \lambda T=0. Since we expect the heat energy to diffuse, the solutions should decay in time. Thus, we would expect \lambda>0. In studying the wave equation, one expects vibrations, and these are only possible with the correct sign of the eigenvalue (positive again). Thus, in order to have nonnegative eigenvalues, we see from (6.21) that

a. q(x) \leq 0, and

b. -\left.p \phi_{n} \dfrac{d \phi_{n}}{d x}\right|_{a} ^{b} \geq 0

Furthermore, if \lambda is a zero eigenvalue, then q(x) \equiv 0 and \alpha_{1}=\alpha_{2}=0 in the homogeneous boundary conditions. This can be seen by setting the numerator equal to zero. Then, q(x)=0 and \phi_{n}^{\prime}(x)=0. The second of these conditions inserted into the boundary conditions forces the restriction on the type of boundary conditions.

One of the (unproven here) properties of Sturm-Liouville eigenvalue problems with homogeneous boundary conditions is that the eigenvalues are ordered, \lambda_{1}<\lambda_{2}<\ldots. Thus, there is a smallest eigenvalue. It turns out that for any continuous function, y(x)

\lambda_{1}=\min _{y(x)} \dfrac{-\left.p y \dfrac{d y}{d x}\right|_{a} ^{b}+\int_{a}^{b}\left[p\left(\dfrac{d y}{d x}\right)^{2}-q y^{2}\right] d x}{\int_{a}^{b} y^{2} \sigma d x} \nonumber

and this minimum is obtained when y(x)=\phi_{1}(x). This result can be used to get estimates of the minimum eigenvalue by using trial functions which are continuous and satisfy the boundary conditions, but do not necessarily satisfy the differential equation.

Example 6.10. We have already solved the eigenvalue problem \phi^{\prime \prime}+\lambda \phi=0, \phi(0)=0, \phi(1)=0. In this case, the lowest eigenvalue is \lambda_{1}=\pi^{2}. We can pick a nice function satisfying the boundary conditions, say y(x)=x-x^{2}. Inserting this into Equation (6.22), we find

\lambda_{1} \leq \dfrac{\int_{0}^{1}(1-2 x)^{2} d x}{\int_{0}^{1}\left(x-x^{2}\right)^{2} d x}=10 \nonumber

Indeed, 10 \geq \pi^{2} \approx 9.87.
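The estimate of Example 6.10 is easy to reproduce; the SymPy snippet below (illustrative) evaluates the Rayleigh quotient for the trial function y = x - x^{2}, for which p = 1, q = 0, sigma = 1 and the boundary term vanishes.

import sympy as sp

x = sp.symbols('x')
y = x - x**2                                      # trial function with y(0) = y(1) = 0
num = sp.integrate(sp.diff(y, x)**2, (x, 0, 1))   # int (y')^2 dx
den = sp.integrate(y**2, (x, 0, 1))               # int y^2 dx
print(num/den, float(sp.pi**2))                   # 10 versus pi^2 = 9.87...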

The Eigenfunction Expansion Method

In Section 4.3.2 we saw generally how one can use the eigenfunctions of a differential operator to solve a nonhomogeneous boundary value problem. In this chapter we have seen that Sturm-Liouville eigenvalue problems have the requisite set of orthogonal eigenfunctions. In this section we will apply the eigenfunction expansion method to solve a particular nonhomogeneous boundary value problem.

Recall that one starts with a nonhomogeneous differential equation

\mathcal{L} y=f, \nonumber

where y(x) is to satisfy given homogeneous boundary conditions. The method makes use of the eigenfunctions satisfying the eigenvalue problem

\mathcal{L} \phi_{n}=-\lambda_{n} \sigma \phi_{n} \nonumber

subject to the given boundary conditions. Then, one assumes that y(x) can be written as an expansion in the eigenfunctions,

y(x)=\sum_{n=1}^{\infty} c_{n} \phi_{n}(x), \nonumber

and inserts the expansion into the nonhomogeneous equation. This gives

f(x)=\mathcal{L}\left(\sum_{n=1}^{\infty} c_{n} \phi_{n}(x)\right)=-\sum_{n=1}^{\infty} c_{n} \lambda_{n} \sigma(x) \phi_{n}(x) \nonumber

The expansion coefficients are then found by making use of the orthogonality of the eigenfunctions. Namely, we multiply the last equation by \phi_{m}(x) and integrate. We obtain

\int_{a}^{b} f(x) \phi_{m}(x) d x=-\sum_{n=1}^{\infty} c_{n} \lambda_{n} \int_{a}^{b} \phi_{n}(x) \phi_{m}(x) \sigma(x) d x \nonumber

Orthogonality yields

\int_{a}^{b} f(x) \phi_{m}(x) d x=-c_{m} \lambda_{m} \int_{a}^{b} \phi_{m}^{2}(x) \sigma(x) d x \nonumber

Solving for c_{m}, we have

c_{m}=-\dfrac{\int_{a}^{b} f(x) \phi_{m}(x) d x}{\lambda_{m} \int_{a}^{b} \phi_{m}^{2}(x) \sigma(x) d x} . \nonumber

Example 6.11. As an example, we consider the solution of the boundary value problem

\begin{aligned} \left(x y^{\prime}\right)^{\prime}+\dfrac{y}{x} &=\dfrac{1}{x}, \quad x \in[1, e], \\[4pt] y(1) &=0=y(e) . \end{aligned} \nonumber

This equation is already in self-adjoint form. So, we know that the associated Sturm-Liouville eigenvalue problem has an orthogonal set of eigenfunctions. We first determine this set. Namely, we need to solve

\left(x \phi^{\prime}\right)^{\prime}+\dfrac{\phi}{x}=-\lambda \sigma \phi, \quad \phi(1)=0=\phi(e) . \nonumber

Rearranging the terms and multiplying by x, we have that

x^{2} \phi^{\prime \prime}+x \phi^{\prime}+(1+\lambda \sigma x) \phi=0 . \nonumber

This is almost an equation of Cauchy-Euler type. Picking the weight function \sigma(x)=\dfrac{1}{x}, we have

x^{2} \phi^{\prime \prime}+x \phi^{\prime}+(1+\lambda) \phi=0 . \nonumber

This is easily solved. The characteristic equation is

r^{2}+(1+\lambda)=0 . \nonumber

One obtains nontrivial solutions of the eigenvalue problem satisfying the boundary conditions when \lambda>-1. The solutions are

\phi_{n}(x)=A \sin (n \pi \ln x), \quad n=1,2, \ldots \nonumber

where \lambda_{n}=n^{2} \pi^{2}-1

It is often useful to normalize the eigenfunctions. This means that one chooses A so that the norm of each eigenfunction is one. Thus, we have

\begin{aligned} 1 &=\int_{1}^{e} \phi_{n}(x)^{2} \sigma(x) d x \\[4pt] &=A^{2} \int_{1}^{e} \sin ^{2}(n \pi \ln x) \dfrac{1}{x} d x \\[4pt] &=A^{2} \int_{0}^{1} \sin ^{2}(n \pi y) d y=\dfrac{1}{2} A^{2} \end{aligned} \nonumber

Thus, A=\sqrt{2}

We now turn towards solving the nonhomogeneous problem, \mathcal{L} y=\dfrac{1}{x}. We first expand the unknown solution in terms of the eigenfunctions,

y(x)=\sum_{n=1}^{\infty} c_{n} \sqrt{2} \sin (n \pi \ln x) . \nonumber

Inserting this solution into the differential equation, we have

\dfrac{1}{x}=\mathcal{L} y=-\sum_{n=1}^{\infty} c_{n} \lambda_{n} \sqrt{2} \sin (n \pi \ln x) \dfrac{1}{x} \nonumber

Next, we make use of orthogonality. Multiplying both sides by \phi_{m}(x)= \sqrt{2} \sin (m \pi \ln x) and integrating, gives

\lambda_{m} c_{m}=-\int_{1}^{e} \sqrt{2} \sin (m \pi \ln x) \dfrac{1}{x} d x=\dfrac{\sqrt{2}}{m \pi}\left[(-1)^{m}-1\right] . \nonumber

Solving for c_{m}, we have

c_{m}=\dfrac{\sqrt{2}}{m \pi} \dfrac{\left[(-1)^{m}-1\right]}{m^{2} \pi^{2}-1} . \nonumber

Finally, we insert our coefficients into the expansion for y(x). The solution is then

y(x)=\sum_{n=1}^{\infty} \dfrac{2}{n \pi} \dfrac{\left[(-1)^{n}-1\right]}{n^{2} \pi^{2}-1} \sin (n \pi \ln (x)) . \nonumber
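One can check this expansion numerically. The sketch below (not from the text) sums the series and compares it with a closed-form solution of the boundary value problem obtained by solving x^{2} y^{\prime \prime}+x y^{\prime}+y=1 directly; that closed form is my own computation and should be verified independently.

import numpy as np

def series(x, N=200):
    n = np.arange(1, N + 1)
    c = 2/(n*np.pi)*((-1.0)**n - 1)/(n**2*np.pi**2 - 1)
    return np.sin(np.outer(np.log(x), n)*np.pi) @ c

def exact(x):
    # y = 1 - cos(ln x) + (cos 1 - 1) sin(ln x)/sin 1 (derived by hand, not from the text)
    return 1 - np.cos(np.log(x)) + (np.cos(1) - 1)*np.sin(np.log(x))/np.sin(1)

x = np.linspace(1, np.e, 9)
print(np.max(np.abs(series(x) - exact(x))))   # small (roughly 1e-6)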

The Fredholm Alternative Theorem

Given that L y=f, when can one expect to find a solution? Is it unique? These questions are answered by the Fredholm Alternative Theorem. This theorem occurs in many forms, from a statement about solutions to systems of algebraic equations to statements about solutions of boundary value problems and integral equations. The theorem comes in two parts, thus the term "alternative". Either the equation has exactly one solution for all f, or the equation has many solutions for some f 's and none for the rest.

The reader is familiar with the statements of the Fredholm Alternative for the solution of systems of algebraic equations. One seeks solutions of the system A x=b for A an n \times m matrix. Defining the matrix adjoint, A^{*}, through <A x, y>=<x, A^{*} y> for all x \in \mathcal{C}^{m} and y \in \mathcal{C}^{n}, then either

Theorem 6.12. First Alternative

The equation A x=b has a solution if and only if <b, v>=0 for all v such that A^{*} v=0.

Theorem 6.13. Second Alternative

A solution of A x=b, if it exists, is unique if and only if x=0 is the only solution of A x=0.

The second alternative is more familiar when given in the form: The solution of a nonhomogeneous system of n equations and n unknowns is unique if the only solution to the homogeneous problem is the zero solution. Or, equivalently, A is invertible, or has nonzero determinant.

Proof. We prove the second theorem first. Assume that A x=0 for x \neq 0 and A x_{0}=b. Then A\left(x_{0}+\alpha x\right)=b for all \alpha. Therefore, the solution is not unique. Conversely, if there are two different solutions, x_{1} and x_{2}, satisfying A x_{1}=b and A x_{2}=b, then one has a nonzero solution x=x_{1}-x_{2} such that A x=A\left(x_{1}-x_{2}\right)=0.

The proof of the first part of the first theorem is simple. Let A^{*} v=0 and A x_{0}=b. Then we have

<b, v>=<A x_{0}, v>=<x_{0}, A^{*} v>=0 . \nonumber

For the second part we assume that \langle b, v\rangle=0 for all v such that A^{*} v=0. Write b as the sum of a part that is in the range of A and a part that is in the space orthogonal to the range of A, b=b_{R}+b_{O}. Then, 0=<b_{O}, A x>=<A^{*} b_{O}, x> for all x. Thus, A^{*} b_{O}=0. Since \langle b, v\rangle=0 for all v in the nullspace of A^{*}, we have <b, b_{O}>=0. Therefore, 0=<b, b_{O}>=<b_{R}+b_{O}, b_{O}>=<b_{O}, b_{O}>. This means that b_{O}=0, giving b=b_{R}, which is in the range of A. So, A x=b has a solution.

Example 6.14. Determine the allowed forms of \mathbf{b} for a solution of A \mathbf{x}=\mathbf{b} to exist, where

A=\left(\begin{array}{ll} 1 & 2 \\[4pt] 3 & 6 \end{array}\right) \nonumber

First note that A^{*}=\bar{A}^{T}. This is seen by looking at

\begin{aligned} <A \mathbf{x}, \mathbf{y}>&=\sum_{i=1}^{n} \sum_{j=1}^{n} a_{i j} x_{j} \bar{y}_{i} \\[4pt] &=\sum_{j=1}^{n} x_{j} \sum_{i=1}^{n} a_{i j} \bar{y}_{i}=\sum_{j=1}^{n} x_{j} \overline{\sum_{i=1}^{n}\left(\bar{a}^{T}\right)_{j i} y_{i}} \\[4pt] &=<\mathbf{x}, \bar{A}^{T} \mathbf{y}> \end{aligned} \nonumber

For this example,

A^{*}=\left(\begin{array}{ll} 1 & 3 \\[4pt] 2 & 6 \end{array}\right) \nonumber

We next solve A^{*} \mathbf{v}=0. This means, v_{1}+3 v_{2}=0. So, the nullspace of A^{*} is spanned by \mathbf{v}=(3,-1)^{T}. For a solution of A \mathbf{x}=\mathbf{b} to exist, \mathbf{b} would have to be orthogonal to \mathbf{v}. Therefore, a solution exists when

\mathbf{b}=\alpha\left(\begin{array}{l} 1 \\[4pt] 3 \end{array}\right) \nonumber
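A quick numerical illustration (added here) computes the null space of A^{*} with NumPy and tests two candidate right-hand sides.

import numpy as np

A = np.array([[1.0, 2.0], [3.0, 6.0]])
# For a real matrix A* = A^T; its null space is the last right singular vector of A^T.
_, s, Vt = np.linalg.svd(A.T)
v = Vt[-1]                       # proportional to (3, -1)

b_good = np.array([1.0, 3.0])    # orthogonal to v: A x = b is solvable
b_bad = np.array([1.0, 0.0])     # not orthogonal to v: no solution
print(np.isclose(v @ b_good, 0), np.isclose(v @ b_bad, 0))   # True False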

So, what does this say about solutions of boundary value problems? There is a more general theory for linear operators. The matrix formulation follows, since matrices are simply representations of linear transformations. A more general statement would be

Theorem 6.15. If L is a bounded linear operator on a Hilbert space, then L y=f has a solution if and only if <f, v>=0 for every v such that L^{\dagger} v=0

The statement for boundary value problems is similar. However, we need to be careful to treat the boundary conditions in our statement. As we have seen, after several integrations by parts we have that

<\mathcal{L} u, v>=S(u, v)+<u, \mathcal{L}^{\dagger} v> \nonumber

where S(u, v) involves the boundary conditions on u and v. Note that for nonhomogeneous boundary conditions, this term may no longer vanish.

Theorem 6.16. The solution of the boundary value problem \mathcal{L} u=f with boundary conditions B u=g exists if and only if

<f, v>-S(u, v)=0 \nonumber

for all v satisfying \mathcal{L}^{\dagger} v=0 and B^{\dagger} v=0.

Example 6.17. Consider the problem

u^{\prime \prime}+u=f(x), \quad u(0)-u(2 \pi)=\alpha, u^{\prime}(0)-u^{\prime}(2 \pi)=\beta \nonumber

Only certain values of \alpha and \beta will lead to solutions. We first note that

L=L^{\dagger}=\dfrac{d^{2}}{d x^{2}}+1 . \nonumber

Solutions of

L^{\dagger} v=0, \quad v(0)-v(2 \pi)=0, v^{\prime}(0)-v^{\prime}(2 \pi)=0 \nonumber

are easily found to be linear combinations of v=\sin x and v=\cos x. Next one computes

\begin{aligned} S(u, v) &=\left[u^{\prime} v-u v^{\prime}\right]_{0}^{2 \pi} \\[4pt] &=u^{\prime}(2 \pi) v(2 \pi)-u(2 \pi) v^{\prime}(2 \pi)-u^{\prime}(0) v(0)+u(0) v^{\prime}(0) \end{aligned} \nonumber

For v(x)=\sin x, this yields

S(u, \sin x)=-u(2 \pi)+u(0)=\alpha \nonumber

Similarly,

S(u, \cos x)=u^{\prime}(2 \pi)-u^{\prime}(0)=-\beta \nonumber

Using <f, v>-S(u, v)=0, this leads to the conditions

\begin{aligned} &\int_{0}^{2 \pi} f(x) \sin x d x=\alpha \\[4pt] &\int_{0}^{2 \pi} f(x) \cos x d x=-\beta \end{aligned} \nonumber

Problems

6.1. Find the adjoint operator and its domain for L u=u^{\prime \prime}+4 u^{\prime}-3 u, u^{\prime}(0)+ 4 u(0)=0, u^{\prime}(1)+4 u(1)=0.

6.2. Show that a Sturm-Liouville operator with periodic boundary conditions on [a, b] is self-adjoint if and only if p(a)=p(b). [Recall, periodic boundary conditions are given as u(a)=u(b) and u^{\prime}(a)=u^{\prime}(b).]

6.3. The Hermite differential equation is given by y^{\prime \prime}-2 x y^{\prime}+\lambda y=0. Rewrite this equation in self-adjoint form. From the Sturm-Liouville form obtained, verify that the differential operator is self adjoint on (-\infty, \infty). Give the integral form for the orthogonality of the eigenfunctions.

6.4. Find the eigenvalues and eigenfunctions of the given Sturm-Liouville problems.

a. y^{\prime \prime}+\lambda y=0, y^{\prime}(0)=0=y^{\prime}(\pi).

b. \left(x y^{\prime}\right)^{\prime}+\dfrac{\lambda}{x} y=0, y(1)=y\left(e^{2}\right)=0.

6.5. The eigenvalue problem x^{2} y^{\prime \prime}-\lambda x y^{\prime}+\lambda y=0 with y(1)=y(2)=0 is not a Sturm-Liouville eigenvalue problem. Show that none of the eigenvalues are real by solving this eigenvalue problem.

6.6. In Example 6.10 we found a bound on the lowest eigenvalue for the given eigenvalue problem.

a. Verify the computation in the example.

b. Apply the method using

y(x)=\left\{\begin{array}{cc} x, & 0<x<\dfrac{1}{2} \\[4pt] 1-x, & \dfrac{1}{2}<x<1 \end{array}\right. \nonumber

Is this an upper bound on \lambda_{1}?

c. Use the Rayleigh quotient to obtain a good upper bound for the lowest eigenvalue of the eigenvalue problem: \phi^{\prime \prime}+\left(\lambda-x^{2}\right) \phi=0, \phi(0)=0, \phi^{\prime}(1)=0.

6.7. Use the method of eigenfunction expansions to solve the problem:

y^{\prime \prime}+4 y=x^{2}, \quad y(0)=y(1)=0 \nonumber

6.8. Determine the solvability conditions for the nonhomogeneous boundary value problem: u^{\prime \prime}+4 u=f(x), u(0)=\alpha, u^{\prime}(1)=\beta.

7 Special Functions

In this chapter we will look at some additional functions which arise often in physical applications and are eigenfunctions for some Sturm-Liouville boundary value problem. We begin with a collection of special functions, called the classical orthogonal polynomials. These include such polynomial functions as the Legendre polynomials, the Hermite polynomials, the Tchebychef and the Gegenbauer polynomials. Also, Bessel functions occur quite often. We will spend more time exploring the Legendre and Bessel functions. These functions are typically found as solutions of differential equations using power series methods in a first course in differential equations.

Classical Orthogonal Polynomials

We begin by noting that the sequence of functions \left\{1, x, x^{2}, \ldots\right\} is a linearly independent set of functions. In fact, by the Stone-Weierstrass Approximation Theorem this set is a basis of L_{\sigma}^{2}(a, b), the space of square integrable functions over the interval [a, b] relative to the weight \sigma(x). We are familiar with being able to expand functions over this basis, since the expansions are just power series representations of the functions,

f(x) \sim \sum_{n=0}^{\infty} c_{n} x^{n} . \nonumber

However, this basis is not an orthogonal set of basis functions. One can easily see this by integrating the product of two even, or two odd, basis functions with \sigma(x)=1 and (a, b)=(-1,1). For example,

<1, x^{2}>=\int_{-1}^{1} x^{0} x^{2} d x=\dfrac{2}{3} \nonumber

Since we have found that orthogonal bases have been useful in determining the coefficients for expansions of given functions, we might ask if it is possible to obtain an orthogonal basis involving these powers of x. Of course, finite combinations of these basis elements are just polynomials!

OK, we will ask: "Given a set of linearly independent basis vectors, can one find an orthogonal basis of the given space?" The answer is yes. We recall from introductory linear algebra, which mostly covers finite dimensional vector spaces, that there is a method for carrying this out called the Gram-Schmidt Orthogonalization Process. We will recall this process for finite dimensional vectors and then generalize to function spaces.

image
Figure 7.1. The basis \mathbf{a}_{1}, \mathbf{a}_{2}, and \mathbf{a}_{3}, of \mathbf{R}^{3} considered in the text.

Let’s assume that we have three vectors that span \mathbf{R}^{3}, given by \mathbf{a}_{1}, \mathbf{a}_{2}, and \mathbf{a}_{3} and shown in Figure 7.1. We seek an orthogonal basis \mathbf{e}_{1}, \mathbf{e}_{2}, and \mathbf{e}_{3}, beginning one vector at a time.

First we take one of the original basis vectors, say \mathbf{a}_{1}, and define

\mathbf{e}_{1}=\mathbf{a}_{1} \nonumber

Of course, we might want to normalize our new basis vectors, so we would denote such a normalized vector with a "hat":

\hat{\mathbf{e}}_{1}=\dfrac{\mathbf{e}_{1}}{e_{1}}, \nonumber

where e_{1}=\sqrt{\mathbf{e}_{1} \cdot \mathbf{e}_{1}}.

image
Figure 7.2. A plot of the vectors \mathbf{e}_{1}, \mathbf{a}_{2}, and \mathbf{e}_{2} needed to find the projection of \mathbf{a}_{2} on \mathbf{e}_{1}.

To obtain a second vector orthogonal to \mathbf{e}_{1}, we subtract from \mathbf{a}_{2} its vector projection along \mathbf{e}_{1}, as suggested by Figure 7.2: \mathbf{e}_{2}=\mathbf{a}_{2}-\operatorname{pr}_{1} \mathbf{a}_{2}, where \operatorname{pr}_{1} \mathbf{a}_{2}=\dfrac{\mathbf{a}_{2} \cdot \mathbf{e}_{1}}{e_{1}^{2}} \mathbf{e}_{1}. Note that the projection formula is easily proven by writing the projection as a vector of length a_{2} \cos \theta in direction \hat{\mathbf{e}}_{1}, where \theta is the angle between \mathbf{e}_{1} and \mathbf{a}_{2}. Using the definition of the dot product, \mathbf{a} \cdot \mathbf{b}=a b \cos \theta, the projection formula follows.

Combining Equations (7.1)-(7.2), we find that

\mathbf{e}_{2}=\mathbf{a}_{2}-\dfrac{\mathbf{a}_{2} \cdot \mathbf{e}_{1}}{e_{1}^{2}} \mathbf{e}_{1} \nonumber

It is a simple matter to verify that \mathbf{e}_{2} is orthogonal to \mathbf{e}_{1} :

\begin{aligned} \mathbf{e}_{2} \cdot \mathbf{e}_{1} &=\mathbf{a}_{2} \cdot \mathbf{e}_{1}-\dfrac{\mathbf{a}_{2} \cdot \mathbf{e}_{1}}{e_{1}^{2}} \mathbf{e}_{1} \cdot \mathbf{e}_{1} \\[4pt] &=\mathbf{a}_{2} \cdot \mathbf{e}_{1}-\mathbf{a}_{2} \cdot \mathbf{e}_{1}=0 \end{aligned} \nonumber

Now, we seek a third vector \mathbf{e}_{3} that is orthogonal to both \mathbf{e}_{1} and \mathbf{e}_{2}. Pictorially, we can write the given vector \mathbf{a}_{3} as a combination of vector projections along \mathbf{e}_{1} and \mathbf{e}_{2} and the new vector. This is shown in Figure 7.3. Then we have,

\mathbf{e}_{3}=\mathbf{a}_{3}-\dfrac{\mathbf{a}_{3} \cdot \mathbf{e}_{1}}{e_{1}^{2}} \mathbf{e}_{1}-\dfrac{\mathbf{a}_{3} \cdot \mathbf{e}_{2}}{e_{2}^{2}} \mathbf{e}_{2} . \nonumber

Again, it is a simple matter to compute the scalar products with \mathbf{e}_{1} and \mathbf{e}_{2} to verify orthogonality.
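As a quick illustration (not part of the original text), the three steps above can be carried out numerically. The following Python sketch uses NumPy and applies the process to the vectors of Problem 7.1; all variable names are our own.

import numpy as np

a1 = np.array([-1.0, 1.0, 1.0])
a2 = np.array([1.0, -1.0, 1.0])
a3 = np.array([1.0, 1.0, -1.0])

# Gram-Schmidt: subtract projections along the previously found vectors.
e1 = a1
e2 = a2 - (a2 @ e1) / (e1 @ e1) * e1
e3 = a3 - (a3 @ e1) / (e1 @ e1) * e1 - (a3 @ e2) / (e2 @ e2) * e2

# The mutual dot products should vanish (up to roundoff).
print(e1 @ e2, e1 @ e3, e2 @ e3)

# Normalized ("hatted") basis vectors.
e1_hat, e2_hat, e3_hat = [e / np.linalg.norm(e) for e in (e1, e2, e3)]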

We can easily generalize the procedure to the N-dimensional case.

Gram-Schmidt Orthogonalization in N-Dimensions

Given a set of N linearly independent vectors \mathbf{a}_{1}, \ldots, \mathbf{a}_{N}, one sets \mathbf{e}_{1}=\mathbf{a}_{1} and, for n=2,3, \ldots, N, subtracts from \mathbf{a}_{n} its projections along the previously constructed vectors:

\mathbf{e}_{n}=\mathbf{a}_{n}-\sum_{j=1}^{n-1} \dfrac{\mathbf{a}_{n} \cdot \mathbf{e}_{j}}{e_{j}^{2}} \mathbf{e}_{j} . \nonumber

Figure 7.3. A plot of the vectors and their projections for determining \mathbf{e}_{3}.

Now, we can generalize this idea to (real) function spaces.

Gram-Schmidt Orthogonalization for Function Spaces

Let f_{n}(x), n \in N_{0}=\{0,1,2, \ldots\}, be a linearly independent sequence of continuous functions defined for x \in[a, b]. Then, an orthogonal basis of functions, \phi_{n}(x), n \in N_{0} can be found and is given by

\phi_{0}(x)=f_{0}(x) \nonumber

and

\phi_{n}(x)=f_{n}(x)-\sum_{j=0}^{n-1} \dfrac{<f_{n}, \phi_{j}>}{\left\|\phi_{j}\right\|^{2}} \phi_{j}(x), \quad n=1,2, \ldots \nonumber

Here we are using inner products relative to weight \sigma(x),

<f, g>=\int_{a}^{b} f(x) g(x) \sigma(x) d x \nonumber

Note the similarity between the orthogonal basis in (7.7) and the expression for the finite dimensional case in Equation (7.6).

Example 7.1. Apply the Gram-Schmidt Orthogonalization process to the set f_{n}(x)=x^{n}, n \in N_{0}, when x \in(-1,1) and \sigma(x)=1.

First, we have \phi_{0}(x)=f_{0}(x)=1. Note that

\int_{-1}^{1} \phi_{0}^{2}(x) d x=2 . \nonumber

We could use this result to fix the normalization of our new basis, but we will hold off on doing that for now.

Now, we compute the second basis element:

\begin{aligned} \phi_{1}(x) &=f_{1}(x)-\dfrac{<f_{1}, \phi_{0}>}{\left\|\phi_{0}\right\|^{2}} \phi_{0}(x) \\[4pt] &=x-\dfrac{<x, 1>}{\|1\|^{2}} 1=x \end{aligned} \nonumber

since <x, 1> is the integral of an odd function over a symmetric interval.

For \phi_{2}(x), we have

\begin{aligned} \phi_{2}(x) &=f_{2}(x)-\dfrac{<f_{2}, \phi_{0}>}{\left\|\phi_{0}\right\|^{2}} \phi_{0}(x)-\dfrac{<f_{2}, \phi_{1}>}{\left\|\phi_{1}\right\|^{2}} \phi_{1} \\[4pt] &=x^{2}-\dfrac{<x^{2}, 1>}{\|1\|^{2}} 1-\dfrac{<x^{2}, x>}{\|x\|^{2}} x \\[4pt] &=x^{2}-\dfrac{\int_{-1}^{1} x^{2} d x}{\int_{-1}^{1} d x} \\[4pt] &=x^{2}-\dfrac{1}{3} \end{aligned} \nonumber

So far, we have the orthogonal set \left\{1, x, x^{2}-\dfrac{1}{3}\right\}. If one chooses to normalize these by forcing \phi_{n}(1)=1, then one obtains the classical Legendre polynomials, P_{n}(x). Thus,

P_{2}(x)=\dfrac{1}{2}\left(3 x^{2}-1\right) . \nonumber

Note that this normalization is different than the usual one. In fact, we see that P_{2}(x) does not have a unit norm,

\left\|P_{2}\right\|^{2}=\int_{-1}^{1} P_{2}^{2}(x) d x=\dfrac{2}{5} \nonumber
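For readers who want to experiment, the orthogonalization in Example 7.1 can be carried out symbolically. The following Python sketch (our illustration, not part of the text) uses sympy; the helper inner is our own name for the inner product with weight \sigma(x)=1 on (-1,1).

import sympy as sp

x = sp.symbols('x')

def inner(f, g):
    # <f, g> = int_{-1}^{1} f(x) g(x) dx, i.e. weight sigma(x) = 1
    return sp.integrate(f * g, (x, -1, 1))

fs = [sp.Integer(1), x, x**2, x**3]
phis = []
for f in fs:
    phi = f - sum(inner(f, p) / inner(p, p) * p for p in phis)
    phis.append(sp.expand(phi))

# Rescale so that phi_n(1) = 1; this reproduces P_0, ..., P_3.
legendre = [sp.simplify(phi / phi.subs(x, 1)) for phi in phis]
print(legendre)   # [1, x, 3*x**2/2 - 1/2, 5*x**3/2 - 3*x/2]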

The set of Legendre polynomials is just one set of classical orthogonal polynomials that can be obtained in this way. Many of these originally appeared as solutions of important boundary value problems in physics. They all have similar properties and we will just elaborate on some of these for the Legendre functions in the next section. Other orthogonal polynomials in this group are shown in Table 7.1.

For reference, we also note the differential equations satisfied by these functions.

7.2 Legendre Polynomials

In the last section we saw the Legendre polynomials in the context of orthogonal bases for a set of square integrable functions in L^{2}(-1,1). In your first course in differential equations, you saw these polynomials as one of the solutions of the differential equation

Polynomial Symbol Interval \sigma(x)
Hermite H_{n}(x) (-\infty, \infty) e^{-x^{2}}
Laguerre L_{n}^{\alpha}(x) [0, \infty) x^{\alpha} e^{-x}
Legendre P_{n}(x) (-1,1) 1
Gegenbauer C_{n}^{\lambda}(x) (-1,1) \left(1-x^{2}\right)^{\lambda-1 / 2}
Tchebychef of the 1st kind T_{n}(x) (-1,1) \left(1-x^{2}\right)^{-1 / 2}
Tchebychef of the 2nd kind U_{n}(x) (-1,1) \left(1-x^{2}\right)^{1 / 2}
Jacobi P_{n}^{(\nu, \mu)}(x) (-1,1) (1-x)^{\nu}(1+x)^{\mu}

Table 7.1. Common classical orthogonal polynomials with the interval and weight function used to define them.

Polynomial Differential Equation
Hermite y^{\prime \prime}-2 x y^{\prime}+2 n y=0
Laguerre x y^{\prime \prime}+(\alpha+1-x) y^{\prime}+n y=0
Legendre \left(1-x^{2}\right) y^{\prime \prime}-2 x y^{\prime}+n(n+1) y=0
Gegenbauer \left(1-x^{2}\right) y^{\prime \prime}-(2 \lambda+1) x y^{\prime}+n(n+2 \lambda) y=0
Tchebychef of the 1st kind \left(1-x^{2}\right) y^{\prime \prime}-x y^{\prime}+n^{2} y=0
Jacobi \left(1-x^{2}\right) y^{\prime \prime}-(\nu-\mu+(\mu+\nu+2) x) y^{\prime}+n(n+1+\mu+\nu) y=0

Table 7.2. Differential equations satisfied by some of the common classical orthogonal polynomials.

\left(1-x^{2}\right) y^{\prime \prime}-2 x y^{\prime}+n(n+1) y=0, \quad n \in N_{0} . \nonumber

Recall that these were obtained by using power series expansion methods. In this section we will explore a few of the properties of these functions.

For completeness, we recall the solution of Equation (7.11) using the power series method. We assume that the solution takes the form

y(x)=\sum_{k=0}^{\infty} a_{k} x^{k} . \nonumber

The goal is to determine the coefficients, a_{k}. Inserting this series into Equation (7.11), we have

\left(1-x^{2}\right) \sum_{k=0}^{\infty} k(k-1) a_{k} x^{k-2}-\sum_{k=0}^{\infty} 2 a_{k} k x^{k}+\sum_{k=0}^{\infty} n(n+1) a_{k} x^{k}=0 \nonumber

\sum_{k=2}^{\infty} k(k-1) a_{k} x^{k-2}-\sum_{k=2}^{\infty} k(k-1) a_{k} x^{k}+\sum_{k=0}^{\infty}[-2 k+n(n+1)] a_{k} x^{k}=0 \nonumber

We can combine some of these terms:

\sum_{k=2}^{\infty} k(k-1) a_{k} x^{k-2}+\sum_{k=0}^{\infty}[-k(k-1)-2 k+n(n+1)] a_{k} x^{k}=0 . \nonumber

Further simplification yields

\sum_{k=2}^{\infty} k(k-1) a_{k} x^{k-2}+\sum_{k=0}^{\infty}[n(n+1)-k(k+1)] a_{k} x^{k}=0 \nonumber

We need to collect like powers of x. This can be done by reindexing each sum. In the first sum, we let m=k-2, or k=m+2. In the second sum we independently let k=m. Then all powers of x are of the form x^{m}. This gives

\sum_{m=0}^{\infty}(m+2)(m+1) a_{m+2} x^{m}+\sum_{m=0}^{\infty}[n(n+1)-m(m+1)] a_{m} x^{m}=0 \nonumber

Combining these sums, we have

\sum_{m=0}^{\infty}\left[(m+2)(m+1) a_{m+2}+(n(n+1)-m(m+1)) a_{m}\right] x^{m}=0 \nonumber

This has to hold for all x. So, the coefficients of x^{m} must vanish:

(m+2)(m+1) a_{m+2}+(n(n+1)-m(m+1)) a_{m}=0 . \nonumber

Solving for a_{m+2}, we obtain the recursion relation

a_{m+2}=-\dfrac{n(n+1)-m(m+1)}{(m+2)(m+1)} a_{m}, \quad m \geq 0 . \nonumber

Thus, a_{m+2} is proportional to a_{m}. We can iterate and show that each coefficient is either proportional to a_{0} or a_{1}. However, for n an integer we eventually reach m=n and the series truncates: a_{m}=0 for m>n. Thus, we obtain polynomial solutions. These polynomial solutions are the Legendre polynomials, which we designate as y(x)=P_{n}(x). Furthermore, for n an even integer, P_{n}(x) is an even function, and for n an odd integer, P_{n}(x) is an odd function.

Actually, this is a trimmed down version of the method. We would need to find a second linearly independent solution. We will not discuss these solutions and leave that for the interested reader to investigate.
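As a small illustration of the truncation just described (ours, not from the text), the following Python sketch builds the coefficients of P_n(x) from the recursion relation and then rescales so that P_n(1)=1; the function name legendre_coefficients is our own.

from fractions import Fraction

def legendre_coefficients(n):
    # Coefficients a_0, ..., a_n of P_n(x), lowest degree first, as exact fractions.
    a = [Fraction(0)] * (n + 1)
    a[n % 2] = Fraction(1)               # start the even or odd series
    for m in range(n % 2, n - 1, 2):     # apply the recursion up to m = n - 2
        a[m + 2] = -Fraction(n*(n+1) - m*(m+1), (m+2)*(m+1)) * a[m]
    scale = sum(a)                       # value of the unnormalized polynomial at x = 1
    return [c / scale for c in a]

# P_3(x) = (5x^3 - 3x)/2, so the coefficients are 0, -3/2, 0, 5/2.
print(legendre_coefficients(3))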

The Rodrigues Formula

The first property that the Legendre polynomials have is the Rodrigues formula:

P_{n}(x)=\dfrac{1}{2^{n} n !} \dfrac{d^{n}}{d x^{n}}\left(x^{2}-1\right)^{n}, \quad n \in N_{0} . \nonumber

From the Rodrigues formula, one can show that P_{n}(x) is an nth degree polynomial. Also, for n odd, the polynomial is an odd function and for n even, the polynomial is an even function.

As an example, we determine P_{2}(x) from Rodrigues formula:

\begin{aligned} P_{2}(x) &=\dfrac{1}{2^{2} 2 !} \dfrac{d^{2}}{d x^{2}}\left(x^{2}-1\right)^{2} \\[4pt] &=\dfrac{1}{8} \dfrac{d^{2}}{d x^{2}}\left(x^{4}-2 x^{2}+1\right) \\[4pt] &=\dfrac{1}{8} \dfrac{d}{d x}\left(4 x^{3}-4 x\right) \\[4pt] &=\dfrac{1}{8}\left(12 x^{2}-4\right) \\[4pt] &=\dfrac{1}{2}\left(3 x^{2}-1\right) \end{aligned} \nonumber

Note that we get the same result as we found in the last section using orthogonalization.

One can systematically generate the Legendre polynomials in tabular form, as shown in Table 7.3. In Figure 7.4 we show a few Legendre polynomials.

n \left(x^{2}-1\right)^{n} \dfrac{d^{n}}{d x^{n}}\left(x^{2}-1\right)^{n} \dfrac{1}{2^{n} n !} P_{n}(x)
0 1 1 1 1
1 x^{2}-1 2 x \dfrac{1}{2} x
2 x^{4}-2 x^{2}+1 12 x^{2}-4 \dfrac{1}{8} \dfrac{1}{2}\left(3 x^{2}-1\right)
3 x^{6}-3 x^{4}+3 x^{2}-1 120 x^{3}-72 x \dfrac{1}{48} \dfrac{1}{2}\left(5 x^{3}-3 x\right)

Table 7.3. Tabular computation of the Legendre polynomials using the Rodrigues formula.
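The entries of Table 7.3 can also be generated symbolically. The short Python sketch below (ours, not from the text) applies the Rodrigues formula directly using sympy.

import sympy as sp

x = sp.symbols('x')

for n in range(4):
    # P_n(x) = (1/(2^n n!)) d^n/dx^n (x^2 - 1)^n
    Pn = sp.expand(sp.diff((x**2 - 1)**n, x, n) / (2**n * sp.factorial(n)))
    print(n, Pn)
# 0 1
# 1 x
# 2 3*x**2/2 - 1/2
# 3 5*x**3/2 - 3*x/2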

Figure 7.4. Plots of the Legendre polynomials P_{2}(x), P_{3}(x), P_{4}(x), and P_{5}(x).

Three Term Recursion Formula

The classical orthogonal polynomials also satisfy three term recursion formulae. In the case of the Legendre polynomials, we have

(2 n+1) x P_{n}(x)=(n+1) P_{n+1}(x)+n P_{n-1}(x), \quad n=1,2, \ldots \nonumber

This can also be rewritten by replacing n with n-1 as

(2 n-1) x P_{n-1}(x)=n P_{n}(x)+(n-1) P_{n-2}(x), \quad n=1,2, \ldots \nonumber

We will prove this recursion formula in two ways. First we use the orthogonality properties of Legendre polynomials and the following lemma.

Lemma 7.2. The leading coefficient of x^{n} in P_{n}(x) is \dfrac{1}{2^{n} n !} \dfrac{(2 n) !}{n !}.

Proof. We can prove this using the Rodrigues formula. First, we focus on the leading coefficient of \left(x^{2}-1\right)^{n}, which is x^{2 n}. The first derivative of x^{2 n} is 2 n x^{2 n-1}. The second derivative is 2 n(2 n-1) x^{2 n-2}. The j th derivative is

\dfrac{d^{j} x^{2 n}}{d x^{j}}=[2 n(2 n-1) \ldots(2 n-j+1)] x^{2 n-j} \nonumber

Thus, the nth derivative is given by

\dfrac{d^{n} x^{2 n}}{d x^{n}}=[2 n(2 n-1) \ldots(n+1)] x^{n} \nonumber

This proves that P_{n}(x) has degree n. The leading coefficient of P_{n}(x) can now be written as

\begin{aligned} \dfrac{1}{2^{n} n !}[2 n(2 n-1) \ldots(n+1)] &=\dfrac{1}{2^{n} n !}[2 n(2 n-1) \ldots(n+1)] \dfrac{n(n-1) \ldots 1}{n(n-1) \ldots 1} \\[4pt] &=\dfrac{1}{2^{n} n !} \dfrac{(2 n) !}{n !} \end{aligned} \nonumber

In order to prove the three term recursion formula we consider the expression (2 n-1) x P_{n-1}(x)-n P_{n}(x). While each term is a polynomial of degree n, the leading order terms cancel. We need only look at the coefficient of the leading order term in the first expression. It is

(2 n-1) \dfrac{1}{2^{n-1}(n-1) !} \dfrac{(2 n-2) !}{(n-1) !}=\dfrac{1}{2^{n-1}(n-1) !} \dfrac{(2 n-1) !}{(n-1) !}=\dfrac{(2 n-1) !}{2^{n-1}[(n-1) !]^{2}} . \nonumber

The coefficient of the leading term for n P_{n}(x) can be written as

n \dfrac{1}{2^{n} n !} \dfrac{(2 n) !}{n !}=n\left(\dfrac{2 n}{2 n^{2}}\right)\left(\dfrac{1}{2^{n-1}(n-1) !}\right) \dfrac{(2 n-1) !}{(n-1) !}=\dfrac{(2 n-1) !}{2^{n-1}[(n-1) !]^{2}} . \nonumber

It is easy to see that the leading order terms in (2 n-1) x P_{n-1}(x)-n P_{n}(x) cancel.

The next terms will be of degree n-2. This is because the P_{n} ’s are either even or odd functions, thus only containing even, or odd, powers of x. We conclude that

(2 n-1) x P_{n-1}(x)-n P_{n}(x)=\text { polynomial of degree } n-2 . \nonumber

Therefore, since the Legendre polynomials form a basis, we can write this polynomial as a linear combination of Legendre polynomials:

(2 n-1) x P_{n-1}(x)-n P_{n}(x)=c_{0} P_{0}(x)+c_{1} P_{1}(x)+\ldots+c_{n-2} P_{n-2}(x) . \nonumber

Multiplying Equation (7.17) by P_{m}(x) for m=0,1, \ldots, n-3, integrating from -1 to 1 , and using orthogonality, we obtain

0=c_{m}\left\|P_{m}\right\|^{2}, \quad m=0,1, \ldots, n-3 . \nonumber

[Note: \int_{-1}^{1} x^{k} P_{n}(x) d x=0 for k \leq n-1. Thus, \int_{-1}^{1} x P_{n-1}(x) P_{m}(x) d x=0 for m \leq n-3 .]

Thus, all of these c_{m} ’s are zero, leaving Equation (7.17) as

(2 n-1) x P_{n-1}(x)-n P_{n}(x)=c_{n-2} P_{n-2}(x) . \nonumber

The final coefficient can be found by using the normalization condition, P_{n}(1)=1. Thus, c_{n-2}=(2 n-1)-n=n-1.
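A quick numerical spot check of the three term recursion formula (our illustration, not part of the text) can be made with scipy.special.eval_legendre.

import numpy as np
from scipy.special import eval_legendre

x = np.linspace(-1, 1, 5)
for n in range(1, 6):
    lhs = (2*n + 1) * x * eval_legendre(n, x)
    rhs = (n + 1) * eval_legendre(n + 1, x) + n * eval_legendre(n - 1, x)
    assert np.allclose(lhs, rhs)   # (2n+1) x P_n = (n+1) P_{n+1} + n P_{n-1}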

The Generating Function

Figure 7.5. The position vectors used to describe the tidal force on the Earth due to the moon.

Many special functions possess a generating function. For the Legendre polynomials the generating function is

g(x, t)=\dfrac{1}{\sqrt{1-2 x t+t^{2}}}=\sum_{n=0}^{\infty} P_{n}(x) t^{n}, \quad|x| \leq 1,|t|<1 . \nonumber

This function arises naturally in potential theory. For example, the gravitational potential between two masses located at positions \mathbf{r}_{1} and \mathbf{r}_{2} (see Figure 7.5) is proportional to the reciprocal of the distance between them,

\Phi \propto \dfrac{1}{\left|\mathbf{r}_{1}-\mathbf{r}_{2}\right|}=\dfrac{1}{\sqrt{r_{1}^{2}-2 r_{1} r_{2} \cos \theta+r_{2}^{2}}}, \nonumber

where \theta is the angle between \mathbf{r}_{1} and \mathbf{r}_{2}.

Typically, one of the position vectors is much larger than the other. Let’s assume that r_{1} \ll r_{2}. Then, one can write

\Phi \propto \dfrac{1}{\sqrt{r_{1}^{2}-2 r_{1} r_{2} \cos \theta+r_{2}^{2}}}=\dfrac{1}{r_{2}} \dfrac{1}{\sqrt{1-2 \dfrac{r_{1}}{r_{2}} \cos \theta+\left(\dfrac{r_{1}}{r_{2}}\right)^{2}}} \nonumber

Now, define x=\cos \theta and t=\dfrac{r_{1}}{r_{2}}. We then see that the tidal potential is proportional to the generating function for the Legendre polynomials! So, we can write the tidal potential as

\Phi \propto \dfrac{1}{r_{2}} \sum_{n=0}^{\infty} P_{n}(\cos \theta)\left(\dfrac{r_{1}}{r_{2}}\right)^{n} . \nonumber

The first term in the expansion is the gravitational potential that gives the usual force between the Earth and the moon. [Recall that the force is the gradient of the potential, \mathbf{F}=\nabla\left(\dfrac{1}{r}\right).] The next terms will give expressions for the tidal effects.

Now that we have some idea as to where this generating function might have originated, we can proceed to use it. First of all, the generating function can be used to obtain special values of the Legendre polynomials.

Example 7.3. Evaluate P_{n}(0). P_{n}(0) is found by considering g(0, t). Setting x=0 in Equation (7.18), we have

g(0, t)=\dfrac{1}{\sqrt{1+t^{2}}}=\sum_{n=0}^{\infty} P_{n}(0) t^{n} \nonumber

We can use the binomial expansion to find our final answer. [See the last section of this chapter for a review.] Namely, we have

\dfrac{1}{\sqrt{1+t^{2}}}=1-\dfrac{1}{2} t^{2}+\dfrac{3}{8} t^{4}+\ldots \nonumber

Comparing these expansions, we see that P_{n}(0)=0 for n odd, and for even integers one can show (see Problem 7.10) that

P_{2 n}(0)=(-1)^{n} \dfrac{(2 n-1) ! !}{(2 n) ! !} \nonumber

where n ! ! is the double factorial,

n ! !=\left\{\begin{array}{c} n(n-2) \ldots(3) 1, n>0, \text { odd } \\[4pt] n(n-2) \ldots(4) 2, n>0, \text { even } \\[4pt] 1 \quad n=0,-1 \end{array}\right. \nonumber

Example 7.4. Evaluate P_{n}(-1). This is a simpler problem. In this case we have

g(-1, t)=\dfrac{1}{\sqrt{1+2 t+t^{2}}}=\dfrac{1}{1+t}=1-t+t^{2}-t^{3}+\ldots \nonumber

Therefore, P_{n}(-1)=(-1)^{n}.

We can also use the generating function to find recursion relations. To prove the three term recursion (7.14) that we introduced above, we need only differentiate the generating function in Equation (7.18) with respect to t and rearrange the result. First note that

\dfrac{\partial g}{\partial t}=\dfrac{x-t}{\left(1-2 x t+t^{2}\right)^{3 / 2}}=\dfrac{x-t}{1-2 x t+t^{2}} g(x, t) \nonumber

Combining this with

\dfrac{\partial g}{\partial t}=\sum_{n=0}^{\infty} n P_{n}(x) t^{n-1} \nonumber

we have

(x-t) g(x, t)=\left(1-2 x t+t^{2}\right) \sum_{n=0}^{\infty} n P_{n}(x) t^{n-1} \nonumber

Inserting the series expression for g(x, t) and distributing the sum on the right side, we obtain

(x-t) \sum_{n=0}^{\infty} P_{n}(x) t^{n}=\sum_{n=0}^{\infty} n P_{n}(x) t^{n-1}-\sum_{n=0}^{\infty} 2 n x P_{n}(x) t^{n}+\sum_{n=0}^{\infty} n P_{n}(x) t^{n+1} \nonumber

Rearranging leads to three separate sums:

\sum_{n=0}^{\infty} n P_{n}(x) t^{n-1}-\sum_{n=0}^{\infty}(2 n+1) x P_{n}(x) t^{n}+\sum_{n=0}^{\infty}(n+1) P_{n}(x) t^{n+1}=0 \nonumber

Each term contains powers of t that we would like to combine into a single sum. This is done by reindexing. For the first sum, we could use the new index k=n-1. Then, the first sum can be written

\sum_{n=0}^{\infty} n P_{n}(x) t^{n-1}=\sum_{k=-1}^{\infty}(k+1) P_{k+1}(x) t^{k} \nonumber

Using different indices is just another way of writing out the terms. Note that

\sum_{n=0}^{\infty} n P_{n}(x) t^{n-1}=0+P_{1}(x)+2 P_{2}(x) t+3 P_{3}(x) t^{2}+\ldots \nonumber

and

\sum_{k=-1}^{\infty}(k+1) P_{k+1}(x) t^{k}=0+P_{1}(x)+2 P_{2}(x) t+3 P_{3}(x) t^{2}+\ldots \nonumber

actually give the same sum. The indices are sometimes referred to as dummy indices because they do not show up in the expanded expression and can be replaced with another letter.

If we want to do so, we could now replace all of the k ’s with n ’s. However, we will leave the k ’s in the first term and now reindex the next sums in Equation (7.21). The second sum just needs the replacement n=k and the last sum we reindex using k=n+1. Therefore, Equation (7.21) becomes

\sum_{k=-1}^{\infty}(k+1) P_{k+1}(x) t^{k}-\sum_{k=0}^{\infty}(2 k+1) x P_{k}(x) t^{k}+\sum_{k=1}^{\infty} k P_{k-1}(x) t^{k}=0 . \nonumber

We can now combine all of the terms, noting the k=-1 term is automatically zero and the k=0 terms give

P_{1}(x)-x P_{0}(x)=0 . \nonumber

Of course, we know this already. So, that leaves the k>0 terms:

\sum_{k=1}^{\infty}\left[(k+1) P_{k+1}(x)-(2 k+1) x P_{k}(x)+k P_{k-1}(x)\right] t^{k}=0 \nonumber

Since this is true for all t, the coefficients of the t^{k} ’s are zero, or

(k+1) P_{k+1}(x)-(2 k+1) x P_{k}(x)+k P_{k-1}(x)=0, \quad k=1,2, \ldots \nonumber

There are other recursion relations. For example,

P_{n+1}^{\prime}(x)-P_{n-1}^{\prime}(x)=(2 n+1) P_{n}(x) . \nonumber

This can be proven using the generating function by differentiating g(x, t) with respect to x and rearranging the resulting infinite series just as in this last manipulation. This will be left as Problem 7.4.

Another use of the generating function is to obtain the normalization constant, \left\|P_{n}\right\|^{2}. Squaring the generating function, we have

\dfrac{1}{1-2 x t+t^{2}}=\left[\sum_{n=0}^{\infty} P_{n}(x) t^{n}\right]^{2}=\sum_{n=0}^{\infty} \sum_{m=0}^{\infty} P_{n}(x) P_{m}(x) t^{n+m} \nonumber

Integrating from -1 to 1 and using the orthogonality of the Legendre polynomials, we have

\begin{aligned} \int_{-1}^{1} \dfrac{d x}{1-2 x t+t^{2}} &=\sum_{n=0}^{\infty} \sum_{m=0}^{\infty} t^{n+m} \int_{-1}^{1} P_{n}(x) P_{m}(x) d x \\[4pt] &=\sum_{n=0}^{\infty} t^{2 n} \int_{-1}^{1} P_{n}^{2}(x) d x \end{aligned} \nonumber

However, one can show that

\int_{-1}^{1} \dfrac{d x}{1-2 x t+t^{2}}=\dfrac{1}{t} \ln \left(\dfrac{1+t}{1-t}\right) \nonumber

Expanding this expression about t=0, we obtain

\dfrac{1}{t} \ln \left(\dfrac{1+t}{1-t}\right)=\sum_{n=0}^{\infty} \dfrac{2}{2 n+1} t^{2 n} \nonumber

Comparing this result with Equation (7.27), we find that

\left\|P_{n}\right\|^{2}=\int_{-1}^{1} P_{n}^{2}(x) d x=\dfrac{2}{2 n+1} . \nonumber

Eigenfunction Expansions

Finally, we can expand other functions in this orthogonal basis. This is just a generalized Fourier series. A Fourier-Legendre series expansion for f(x) on [-1,1] takes the form

f(x) \sim \sum_{n=0}^{\infty} c_{n} P_{n}(x) . \nonumber

As before, we can determine the coefficients by multiplying both sides by P_{m}(x) and integrating. Orthogonality gives the usual form for the generalized Fourier coefficients. In this case, we have

c_{n}=\dfrac{<f, P_{n}>}{\left\|P_{n}\right\|^{2}}, \nonumber

where

<f, P_{n}>=\int_{-1}^{1} f(x) P_{n}(x) d x \nonumber

We have just found \left\|P_{n}\right\|^{2}=\dfrac{2}{2 n+1}. Therefore, the Fourier-Legendre coefficients are

c_{n}=\dfrac{2 n+1}{2} \int_{-1}^{1} f(x) P_{n}(x) d x . \nonumber

Example 7.5. Expand f(x)=x^{3} in a Fourier-Legendre series.

We simply need to compute

c_{n}=\dfrac{2 n+1}{2} \int_{-1}^{1} x^{3} P_{n}(x) d x . \nonumber

We first note that

\int_{-1}^{1} x^{m} P_{n}(x) d x=0 \quad \text { for } m<n \nonumber

This is simply proven using Rodrigues formula. Inserting Equation (7.12), we have

\int_{-1}^{1} x^{m} P_{n}(x) d x=\dfrac{1}{2^{n} n !} \int_{-1}^{1} x^{m} \dfrac{d^{n}}{d x^{n}}\left(x^{2}-1\right)^{n} d x \nonumber

Since m<n, we can integrate by parts m-times to show the result, using P_{n}(1)=1 and P_{n}(-1)=(-1)^{n}. As a result, we will have for this example that c_{n}=0 for n>3.

We could just compute \int_{-1}^{1} x^{3} P_{m}(x) d x for m=0,1,2, \ldots outright. But, noting that x^{3} is an odd function, we easily confirm that c_{0}=0 and c_{2}=0. This leaves us with only two coefficients to compute. These are

c_{1}=\dfrac{3}{2} \int_{-1}^{1} x^{4} d x=\dfrac{3}{5} \nonumber

and

c_{3}=\dfrac{7}{2} \int_{-1}^{1} x^{3}\left[\dfrac{1}{2}\left(5 x^{3}-3 x\right)\right] d x=\dfrac{2}{5} \nonumber

Thus,

x^{3}=\dfrac{3}{5} P_{1}(x)+\dfrac{2}{5} P_{3}(x) . \nonumber

Of course, this is simple to check using Table 7.3:

\dfrac{3}{5} P_{1}(x)+\dfrac{2}{5} P_{3}(x)=\dfrac{3}{5} x+\dfrac{2}{5}\left[\dfrac{1}{2}\left(5 x^{3}-3 x\right)\right]=x^{3} \nonumber

Well, maybe we could have guessed this without doing any integration. Let’s see,

\begin{aligned} x^{3} &=c_{1} x+\dfrac{1}{2} c_{3}\left(5 x^{3}-3 x\right) \\[4pt] &=\left(c_{1}-\dfrac{3}{2} c_{3}\right) x+\dfrac{5}{2} c_{3} x^{3} \end{aligned} \nonumber

Equating coefficients of like terms, we have that c_{3}=\dfrac{2}{5} and c_{1}=\dfrac{3}{2} c_{3}=\dfrac{3}{5}.

Example 7.6. Expand the Heaviside function in a Fourier-Legendre series.

The Heaviside function is defined as

H(x)=\left\{\begin{array}{l} 1, x>0 \\[4pt] 0, x<0 \end{array}\right. \nonumber

In this case, we cannot find the expansion coefficients without some integration. We have to compute

\begin{aligned} c_{n} &=\dfrac{2 n+1}{2} \int_{-1}^{1} f(x) P_{n}(x) d x \\[4pt] &=\dfrac{2 n+1}{2} \int_{0}^{1} P_{n}(x) d x, \quad n=0,1,2, \ldots \end{aligned} \nonumber

For n=0, we have

c_{0}=\dfrac{1}{2} \int_{0}^{1} d x=\dfrac{1}{2} . \nonumber

For n \geq 1, we make use of the identity (7.25) to find

c_{n}=\dfrac{1}{2} \int_{0}^{1}\left[P_{n+1}^{\prime}(x)-P_{n-1}^{\prime}(x)\right] d x=\dfrac{1}{2}\left[P_{n-1}(0)-P_{n+1}(0)\right] . \nonumber

Thus, the Fourier-Legendre series for the Heaviside function is

f(x) \sim \dfrac{1}{2}+\dfrac{1}{2} \sum_{n=1}^{\infty}\left[P_{n-1}(0)-P_{n+1}(0)\right] P_{n}(x) . \nonumber

We need to evaluate P_{n-1}(0)-P_{n+1}(0). Since P_{n}(0)=0 for n odd, the c_{n} ’s vanish for n even. Letting n=2 k-1, we have

f(x) \sim \dfrac{1}{2}+\dfrac{1}{2} \sum_{k=1}^{\infty}\left[P_{2 k-2}(0)-P_{2 k}(0)\right] P_{2 k-1}(x) . \nonumber

We can use Equation (7.20)

P_{2 k}(0)=(-1)^{k} \dfrac{(2 k-1) ! !}{(2 k) ! !}, \nonumber

to compute the coefficients:

\begin{aligned} f(x) & \sim \dfrac{1}{2}+\dfrac{1}{2} \sum_{k=1}^{\infty}\left[P_{2 k-2}(0)-P_{2 k}(0)\right] P_{2 k-1}(x) \\[4pt] &=\dfrac{1}{2}+\dfrac{1}{2} \sum_{k=1}^{\infty}\left[(-1)^{k-1} \dfrac{(2 k-3) ! !}{(2 k-2) ! !}-(-1)^{k} \dfrac{(2 k-1) ! !}{(2 k) ! !}\right] P_{2 k-1}(x) \\[4pt] &=\dfrac{1}{2}-\dfrac{1}{2} \sum_{k=1}^{\infty}(-1)^{k} \dfrac{(2 k-3) ! !}{(2 k-2) ! !}\left[1+\dfrac{2 k-1}{2 k}\right] P_{2 k-1}(x) \\[4pt] &=\dfrac{1}{2}-\dfrac{1}{2} \sum_{k=1}^{\infty}(-1)^{k} \dfrac{(2 k-3) ! !}{(2 k-2) ! !} \dfrac{4 k-1}{2 k} P_{2 k-1}(x) \end{aligned} \nonumber

The sum of the first 21 terms is shown in Figure 7.6. We note the slow convergence to the Heaviside function. Also, we see that the Gibbs phenomenon is present due to the jump discontinuity at x=0.

Figure 7.6. Sum of the first 21 terms of the Fourier-Legendre series expansion of the Heaviside function.
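A plot like Figure 7.6 can be reproduced with the following Python sketch (ours, not from the text), which uses the coefficients c_{n}=\dfrac{1}{2}\left[P_{n-1}(0)-P_{n+1}(0)\right] derived above.

import numpy as np
import matplotlib.pyplot as plt
from scipy.special import eval_legendre

x = np.linspace(-1, 1, 400)
series = 0.5 * np.ones_like(x)
for n in range(1, 22):            # partial sum through P_21 (cf. Figure 7.6)
    cn = 0.5 * (eval_legendre(n - 1, 0.0) - eval_legendre(n + 1, 0.0))
    series += cn * eval_legendre(n, x)

plt.plot(x, series, label='Fourier-Legendre partial sum')
plt.step(x, (x > 0).astype(float), label='H(x)')
plt.legend()
plt.show()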

Gamma Function

Another function that often occurs in the study of special functions is the Gamma function. We will need the Gamma function in the next section on Bessel functions.

For x>0 we define the Gamma function as

\Gamma(x)=\int_{0}^{\infty} t^{x-1} e^{-t} d t, \quad x>0 . \nonumber

The Gamma function is a generalization of the factorial function. In fact, we have

\Gamma(1)=1 \nonumber

and

\Gamma(x+1)=x \Gamma(x) . \nonumber

The reader can prove this identity by simply performing an integration by parts. (See Problem 7.7.) In particular, for integers n \in Z^{+}, we then have

\Gamma(n+1)=n \Gamma(n)=n(n-1) \Gamma(n-1)=\cdots=n(n-1) \cdots 2 \Gamma(1)=n ! \nonumber

We can also define the Gamma function for negative, non-integer values of x. We first note that by iteration on n \in Z^{+}, we have

\Gamma(x+n)=(x+n-1) \cdots(x+1) x \Gamma(x), \quad x<0, \quad x+n>0 \nonumber

Solving for \Gamma(x), we then find

\Gamma(x)=\dfrac{\Gamma(x+n)}{(x+n-1) \cdots(x+1) x}, \quad-n<x<0 \nonumber

Note that the Gamma function is undefined at zero and the negative integers.

Example 7.7. We now prove that

\Gamma\left(\dfrac{1}{2}\right)=\sqrt{\pi} . \nonumber

This is done by direct computation of the integral:

\Gamma\left(\dfrac{1}{2}\right)=\int_{0}^{\infty} t^{-\dfrac{1}{2}} e^{-t} d t \nonumber

Letting t=z^{2}, we have

\Gamma\left(\dfrac{1}{2}\right)=2 \int_{0}^{\infty} e^{-z^{2}} d z \nonumber

Due to the symmetry of the integrand, we obtain the classic integral

\Gamma\left(\dfrac{1}{2}\right)=\int_{-\infty}^{\infty} e^{-z^{2}} d z \nonumber

which can be performed using a standard trick. Consider the integral

I=\int_{-\infty}^{\infty} e^{-x^{2}} d x \nonumber

Then,

I^{2}=\int_{-\infty}^{\infty} e^{-x^{2}} d x \int_{-\infty}^{\infty} e^{-y^{2}} d y . \nonumber

Note that we changed the integration variable. This will allow us to write this product of integrals as a double integral:

I^{2}=\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} e^{-\left(x^{2}+y^{2}\right)} d x d y . \nonumber

This is an integral over the entire x y-plane. We can transform this Cartesian integration to an integration over polar coordinates. The integral becomes

I^{2}=\int_{0}^{2 \pi} \int_{0}^{\infty} e^{-r^{2}} r d r d \theta \nonumber

This is simple to integrate and we have I^{2}=\pi. So, the final result is found by taking the square root of both sides:

\Gamma\left(\dfrac{1}{2}\right)=I=\sqrt{\pi} . \nonumber

We have seen that the factorial function can be written in terms of Gamma functions. One can write the even and odd double factorials as

(2 n) ! !=2^{n} n !, \quad(2 n+1) ! !=\dfrac{(2 n+1) !}{2^{n} n !} \nonumber

In particular, one can write

\Gamma\left(n+\dfrac{1}{2}\right)=\dfrac{(2 n-1) ! !}{2^{n}} \sqrt{\pi} \nonumber

Another useful relation, which we only state, is

\Gamma(x) \Gamma(1-x)=\dfrac{\pi}{\sin \pi x} \nonumber
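The identities above are easy to check numerically. The following short Python sketch (ours, not from the text) verifies a few of them with scipy.special.

import numpy as np
from scipy.special import gamma, factorial, factorial2

assert np.isclose(gamma(0.5), np.sqrt(np.pi))            # Gamma(1/2) = sqrt(pi)
for n in range(1, 6):
    assert np.isclose(gamma(n + 1), factorial(n))         # Gamma(n+1) = n!
    assert np.isclose(gamma(n + 0.5),                     # Gamma(n+1/2) = (2n-1)!! sqrt(pi)/2^n
                      factorial2(2*n - 1) / 2**n * np.sqrt(np.pi))
x = 0.3
assert np.isclose(gamma(x) * gamma(1 - x), np.pi / np.sin(np.pi * x))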

7.4 Bessel Functions

Another important differential equation that arises in many physics applications is

x^{2} y^{\prime \prime}+x y^{\prime}+\left(x^{2}-p^{2}\right) y=0 . \nonumber

This equation is readily put into self-adjoint form as

\left(x y^{\prime}\right)^{\prime}+\left(x-\dfrac{p^{2}}{x}\right) y=0 . \nonumber

This equation was solved in the first course on differential equations using power series methods, namely by using the Frobenius Method. One assumes a series solution of the form

y(x)=\sum_{n=0}^{\infty} a_{n} x^{n+s} \nonumber

and one seeks allowed values of the constant s and a recursion relation for the coefficients, a_{n}. One finds that s=\pm p and

a_{n}=-\dfrac{a_{n-2}}{(n+s)^{2}-p^{2}}, \quad n \geq 2 . \nonumber

One solution of the differential equation is the Bessel function of the first kind of order p, given as

y(x)=J_{p}(x)=\sum_{n=0}^{\infty} \dfrac{(-1)^{n}}{\Gamma(n+1) \Gamma(n+p+1)}\left(\dfrac{x}{2}\right)^{2 n+p} . \nonumber

In Figure 7.7 we display the first few Bessel functions of the first kind of integer order. Note that these functions can be described as decaying oscillatory functions.

Figure 7.7. Plots of the Bessel functions J_{0}(x), J_{1}(x), J_{2}(x), and J_{3}(x).
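As an illustration (not from the text), a truncation of the series definition of J_{p}(x) can be compared with the library routine scipy.special.jv; the helper J_series below is our own.

import numpy as np
from scipy.special import jv, gamma

def J_series(p, x, terms=30):
    n = np.arange(terms)
    return np.sum((-1)**n / (gamma(n + 1) * gamma(n + p + 1)) * (x / 2)**(2*n + p))

for p in range(4):
    assert np.isclose(J_series(p, 2.5), jv(p, 2.5))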

A second linearly independent solution is obtained, for p not an integer, as J_{-p}(x). However, for p an integer, replacing p by -p in the series makes the \Gamma(n-p+1) factor involve evaluations of the Gamma function at zero or at negative integers. Thus, the above series is not defined in these cases.

Another method for obtaining a second linearly independent solution is through a linear combination of J_{p}(x) and J_{-p}(x) as

N_{p}(x)=Y_{p}(x)=\dfrac{\cos \pi p J_{p}(x)-J_{-p}(x)}{\sin \pi p} . \nonumber

These functions are called the Neumann functions, or Bessel functions of the second kind of order p.

In Figure 7.8 we display the first few Bessel functions of the second kind of integer order. Note that these functions are also decaying oscillatory functions. However, they are singular at x=0.

In many applications one desires a solution which is bounded at x=0, a boundary condition that these functions do not satisfy. For example, one standard problem is to describe the oscillations of a circular drumhead. For this problem one solves the wave equation using separation of variables in cylindrical coordinates. The r equation leads to a Bessel equation. The Bessel function solutions describe the radial part of the solution and one does not expect a singular solution at the center of the drum. The amplitude of the oscillation must remain finite. Thus, only Bessel functions of the first kind can be used.

Bessel functions satisfy a variety of properties, which we will only list at this time for Bessel functions of the first kind.

Derivative Identities

Figure 7.8. Plots of the Neumann functions N_{0}(x), N_{1}(x), N_{2}(x), and N_{3}(x).

\begin{aligned} \dfrac{d}{d x}\left[x^{p} J_{p}(x)\right] &=x^{p} J_{p-1}(x) \\[4pt] \dfrac{d}{d x}\left[x^{-p} J_{p}(x)\right] &=-x^{-p} J_{p+1}(x) \end{aligned} \nonumber

Recursion Formulae

\begin{aligned} &J_{p-1}(x)+J_{p+1}(x)=\dfrac{2 p}{x} J_{p}(x) \\[4pt] &J_{p-1}(x)-J_{p+1}(x)=2 J_{p}^{\prime}(x) \end{aligned} \nonumber

Orthogonality

\int_{0}^{a} x J_{p}\left(j_{p n} \dfrac{x}{a}\right) J_{p}\left(j_{p m} \dfrac{x}{a}\right) d x=\dfrac{a^{2}}{2}\left[J_{p+1}\left(j_{p n}\right)\right]^{2} \delta_{n, m} \nonumber

where j_{p n} is the nth root of J_{p}(x), J_{p}\left(j_{p n}\right)=0, n=1,2, \ldots. A list of some of these roots is provided in Table 7.4.

n p=0 p=1 p=2 p=3 p=4 p=5
1 2.405 3.832 5.135 6.379 7.586 8.780
2 5.520 7.016 8.147 9.760 11.064 12.339
3 8.654 10.173 11.620 13.017 14.373 15.700
4 11.792 13.323 14.796 16.224 17.616 18.982
5 14.931 16.470 17.960 19.410 20.827 22.220
6 18.071 19.616 21.117 22.583 24.018 25.431
7 21.212 22.760 24.270 25.749 27.200 28.628
8 24.353 25.903 27.421 28.909 30.371 31.813
9 27.494 29.047 30.571 32.050 33.512 34.983

Table 7.4. The zeros of Bessel Functions
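The zeros in Table 7.4 can be regenerated (an illustrative sketch, not part of the text) with scipy.special.jn_zeros, which returns the first positive zeros of J_p.

from scipy.special import jn_zeros

for p in range(6):
    print('p =', p, jn_zeros(p, 9).round(3))   # nine zeros per order, as in Table 7.4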

Generating Function

e^{x\left(t-\dfrac{1}{t}\right) / 2}=\sum_{n=-\infty}^{\infty} J_{n}(x) t^{n}, \quad x>0, t \neq 0 \nonumber

Integral Representation

J_{n}(x)=\dfrac{1}{\pi} \int_{0}^{\pi} \cos (x \sin \theta-n \theta) d \theta, \quad x>0, n \in \mathrm{Z} . \nonumber

Fourier-Bessel Series

Since the Bessel functions are an orthogonal set of eigenfunctions of a Sturm-Liouville problem, we can expand square integrable functions in this basis. In fact, the eigenvalue problem is given in the form

x^{2} y^{\prime \prime}+x y^{\prime}+\left(\lambda x^{2}-p^{2}\right) y=0 . \nonumber

The solutions are then of the form J_{p}(\sqrt{\lambda} x), as can be shown by making the substitution t=\sqrt{\lambda} x in the differential equation.

Furthermore, one can solve the differential equation on a finite domain, [0, a], with the boundary conditions: y(x) is bounded at x=0 and y(a)= 0 . One can show that J_{p}\left(j_{p n} \dfrac{x}{a}\right) is a basis of eigenfunctions and the resulting Fourier-Bessel series expansion of f(x) defined on x \in[0, a] is

f(x)=\sum_{n=1}^{\infty} c_{n} J_{p}\left(j_{p n} \dfrac{x}{a}\right), \nonumber

where the Fourier-Bessel coefficients are found using the orthogonality relation as

c_{n}=\dfrac{2}{a^{2}\left[J_{p+1}\left(j_{p n}\right)\right]^{2}} \int_{0}^{a} x f(x) J_{p}\left(j_{p n} \dfrac{x}{a}\right) d x . \nonumber

Example 7.8. Expand f(x)=1 for 0 \leq x \leq 1 in a Fourier-Bessel series of the form

f(x)=\sum_{n=1}^{\infty} c_{n} J_{0}\left(j_{0 n} x\right) \nonumber

We need only compute the Fourier-Bessel coefficients in Equation (7.50):

c_{n}=\dfrac{2}{\left[J_{1}\left(j_{0 n}\right)\right]^{2}} \int_{0}^{1} x J_{0}\left(j_{0 n} x\right) d x . \nonumber

From Equation (7.41) we have

\begin{aligned} \int_{0}^{1} x J_{0}\left(j_{0 n} x\right) d x &=\dfrac{1}{j_{0 n}^{2}} \int_{0}^{j_{0 n}} y J_{0}(y) d y \\[4pt] &=\dfrac{1}{j_{0 n}^{2}} \int_{0}^{j_{0 n}} \dfrac{d}{d y}\left[y J_{1}(y)\right] d y \\[4pt] &=\dfrac{1}{j_{0 n}^{2}}\left[y J_{1}(y)\right]_{0}^{j_{0 n}} \\[4pt] &=\dfrac{1}{j_{0 n}} J_{1}\left(j_{0 n}\right) \end{aligned} \nonumber

As a result, we have found that the desired Fourier-Bessel expansion is

1=2 \sum_{n=1}^{\infty} \dfrac{J_{0}\left(j_{0 n} x\right)}{j_{0 n} J_{1}\left(j_{0 n}\right)}, \quad 0<x<1 \nonumber

In Figure 7.9 we show the partial sum for the first fifty terms of this series. We see that there is slow convergence due to the Gibbs’ phenomenon.

Figure 7.9. Plot of the first 50 terms of the Fourier-Bessel series in Equation (7.53) for f(x)=1 on 0<x<1.
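A plot like Figure 7.9 can be produced with the following Python sketch (ours, not from the text), which sums the first fifty terms of the expansion in Equation (7.53) using scipy's Bessel routines.

import numpy as np
import matplotlib.pyplot as plt
from scipy.special import j0, j1, jn_zeros

x = np.linspace(0, 1, 400)
zeros = jn_zeros(0, 50)                              # the first fifty zeros j_{0n}
partial = sum(2 * j0(j * x) / (j * j1(j)) for j in zeros)

plt.plot(x, partial)
plt.axhline(1, linestyle='--')
plt.show()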

Hypergeometric Functions

Hypergeometric functions are probably the most useful, but least understood, class of functions. They typically do not make it into the undergraduate curriculum and seldom into the graduate curriculum. Most functions that you know can be expressed using hypergeometric functions. There are many approaches to these functions and the literature can fill books. { }^{1}

In 1812 Gauss published a study of the hypergeometric series

\begin{aligned} y(x)=1 &+\dfrac{\alpha \beta}{\gamma} x+\dfrac{\alpha(1+\alpha) \beta(1+\beta)}{2 ! \gamma(1+\gamma)} x^{2} \\[4pt] &+\dfrac{\alpha(1+\alpha)(2+\alpha) \beta(1+\beta)(2+\beta)}{3 ! \gamma(1+\gamma)(2+\gamma)} x^{3}+\ldots \end{aligned} \nonumber

Here \alpha, \beta, \gamma, and x are real numbers. If one sets \alpha=1 and \beta=\gamma, this series reduces to the familiar geometric series

y(x)=1+x+x^{2}+x^{3}+\ldots . \nonumber

The hypergeometric series is actually a solution of the differential equation

x(1-x) y^{\prime \prime}+[\gamma-(\alpha+\beta+1) x] y^{\prime}-\alpha \beta y=0 \nonumber

This equation was first introduced by Euler and later studied extensively by Gauss, Kummer and Riemann. It is sometimes called Gauss’ equation. Note that there is a symmetry in that \alpha and \beta may be interchanged without changing the equation. The points x=0 and x=1 are regular singular points. Series solutions may be sought using the Frobenius method. It can be confirmed that the above hypergeometric series results.

A more compact form for the hypergeometric series may be obtained by introducing new notation. One typically introduces the Pochhammer symbol, (\alpha)_{n}, satisfying (i) (\alpha)_{0}=1 if \alpha \neq 0. and (ii) (\alpha)_{k}=\alpha(1+\alpha) \ldots(k-1+\alpha), for k=1,2, \ldots

Consider (1)_{n}. For n=0,(1)_{0}=1. For n>0,

(1)_{n}=1(1+1)(2+1) \ldots[(n-1)+1] \nonumber

This reduces to (1)_{n}=n !. In fact, one can show that

(k)_{n}=\dfrac{(n+k-1) !}{(k-1) !} \nonumber

for k and n positive integers. In fact, one can extend this result to noninteger values for k by introducing the gamma function:

(\alpha)_{n}=\dfrac{\Gamma(\alpha+n)}{\Gamma(\alpha)} \nonumber

We can now write the hypergeometric series in standard notation as

{ }^{1} See for example Special Functions by G. E. Andrews, R. Askey, and R. Roy, 1999, Cambridge University Press.

{ }_{2} F_{1}(\alpha, \beta ; \gamma ; x)=\sum_{n=0}^{\infty} \dfrac{(\alpha)_{n}(\beta)_{n}}{n !(\gamma)_{n}} x^{n} \nonumber

Using this one can show that the general solution of Gauss’ equation is

y(x)=A \,{ }_{2} F_{1}(\alpha, \beta ; \gamma ; x)+B \, x^{1-\gamma}{ }_{2} F_{1}(1-\gamma+\alpha, 1-\gamma+\beta ; 2-\gamma ; x) . \nonumber

By carefully letting \beta approach \infty, one obtains what is called the confluent hypergeometric function. This in effect changes the nature of the differential equation. Gauss’ equation has three regular singular points at x=0,1, \infty. One can transform Gauss’ equation by letting x=u / \beta. This changes the regular singular points to u=0, \beta, \infty. Letting \beta \rightarrow \infty, two of the singular points merge.

The new confluent hypergeometric function is then given as

{ }_{1} F_{1}(\alpha ; \gamma ; u)=\lim _{\beta \rightarrow \infty}{ }_{2} F_{1}\left(\alpha, \beta ; \gamma ; \dfrac{u}{\beta}\right) . \nonumber

This function satisfies the differential equation

x y^{\prime \prime}+(\gamma-x) y^{\prime}-\alpha y=0 . \nonumber

The purpose of this section is only to introduce the hypergeometric function. Many other special functions are related to the hypergeometric function after making some variable transformations. For example, the Legendre polynomials are given by

P_{n}(x)={ }_{2} F_{1}\left(-n, n+1 ; 1 ; \dfrac{1-x}{2}\right) . \nonumber

In fact, one can also show that

\sin ^{-1} x=x \,{ }_{2} F_{1}\left(\dfrac{1}{2}, \dfrac{1}{2} ; \dfrac{3}{2} ; x^{2}\right) . \nonumber

The Bessel function J_{p}(x) can be written in terms of confluent hypergeometric functions as

J_{p}(x)=\dfrac{1}{\Gamma(p+1)}\left(\dfrac{x}{2}\right)^{p} e^{-i x}{ }_{1} F_{1}\left(\dfrac{1}{2}+p ; 1+2 p ; 2 i x\right) . \nonumber

These are just a few connections of the powerful hypergeometric functions to some of the elementary functions that you know.
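The connections to the Legendre polynomials and to the inverse sine are easy to test numerically. The following Python sketch (ours, not from the text) uses scipy.special.hyp2f1.

import numpy as np
from scipy.special import hyp2f1, eval_legendre

x = np.linspace(-0.9, 0.9, 7)
for n in range(5):
    # P_n(x) = 2F1(-n, n+1; 1; (1-x)/2)
    assert np.allclose(hyp2f1(-n, n + 1, 1, (1 - x) / 2), eval_legendre(n, x))

# arcsin(x) = x 2F1(1/2, 1/2; 3/2; x^2)
assert np.allclose(x * hyp2f1(0.5, 0.5, 1.5, x**2), np.arcsin(x))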

Appendix: The Binomial Expansion

In this section we had to recall the binomial expansion. This is simply the expansion of the expression (a+b)^{p}. We will investigate this expansion first for nonnegative integer powers p and then derive the expansion for other values of p.

Let's list some of the common expansions for nonnegative integer powers.

\begin{aligned} &(a+b)^{0}=1 \\[4pt] &(a+b)^{1}=a+b \\[4pt] &(a+b)^{2}=a^{2}+2 a b+b^{2} \\[4pt] &(a+b)^{3}=a^{3}+3 a^{2} b+3 a b^{2}+b^{3} \\[4pt] &(a+b)^{4}=a^{4}+4 a^{3} b+6 a^{2} b^{2}+4 a b^{3}+b^{4} \end{aligned} \nonumber

We now look at the patterns of the terms in the expansions. First, we note that each term consists of a product of a power of a and a power of b. The powers of a are decreasing from n to 0 in the expansion of (a+b)^{n}. Similarly, the powers of b increase from 0 to n. The sum of the exponents in each term is n. So, we can write the (k+1) st term in the expansion as a^{n-k} b^{k}. For example, in the expansion of (a+b)^{51} the 6 th term is a^{51-5} b^{5}=a^{46} b^{5}. However, we do not know the numerical coefficient in the expansion.

We now list the coefficients for the above expansions:

\begin{aligned} & n=0: \quad 1 \\[4pt] & n=1: \quad 1 \quad 1 \\[4pt] & n=2: \quad 1 \quad 2 \quad 1 \\[4pt] & n=3: \quad 1 \quad 3 \quad 3 \quad 1 \\[4pt] & n=4: \quad 1 \quad 4 \quad 6 \quad 4 \quad 1 \end{aligned} \nonumber

This pattern is the famous Pascal’s triangle. There are many interesting features of this triangle. But we will first ask how each row can be generated.

We see that each row begins and ends with a one. Next, the second term and the next to last term have a coefficient of n. We also note that consecutive pairs in each row can be added to obtain entries in the next row. For example, adding the consecutive pair 1 and 2 in the n=2 row gives the entry 3 in the n=3 row.

With this in mind, we can generate the next several rows of our triangle.

Of course, it would take a while to compute each row up to the desired n. We need a simple expression for computing a specific coefficient. Consider the k th term in the expansion of (a+b)^{n}. Let r=k-1. Then this term is of the form C_{r}^{n} a^{n-r} b^{r}. We have seen that the coefficients satisfy

C_{r}^{n}=C_{r}^{n-1}+C_{r-1}^{n-1} \nonumber

Actually, the coefficients have been found to take a simple form.

C_{r}^{n}=\dfrac{n !}{(n-r) ! r !}=\left(\begin{array}{c} n \\[4pt] r \end{array}\right) \nonumber

This is nothing other than the combinatoric symbol for determining how to choose n things r at a time. In our case, this makes sense. We have to count the number of ways that we can arrange the products of r b’s with n-r a ’s. There are n slots to place the b ’s. For example, the r=2 case for n=4 involves the six products: a a b b, a b a b, a b b a, b a a b, b a b a, and bbaa. Thus, it is natural to use this notation. The original problem that concerned Pascal was in gambling.

So, we have found that

(a+b)^{n}=\sum_{r=0}^{n}\left(\begin{array}{c} n \\[4pt] r \end{array}\right) a^{n-r} b^{r} \nonumber

What if a \gg b ? Can we use this to get an approximation to (a+b)^{n} ? If we neglect b then (a+b)^{n} \simeq a^{n}. How good of an approximation is this? This is where it would be nice to know the order of the next term in the expansion, which we could state using big O notation. In order to do this we first divide out a as

(a+b)^{n}=a^{n}\left(1+\dfrac{b}{a}\right)^{n} . \nonumber

Now we have a small parameter, \dfrac{b}{a}. According to what we have seen above, we can use the binomial expansion to write

\left(1+\dfrac{b}{a}\right)^{n}=\sum_{r=0}^{n}\left(\begin{array}{l} n \\[4pt] r \end{array}\right)\left(\dfrac{b}{a}\right)^{r} \nonumber

Thus, we have a finite sum of terms involving powers of \dfrac{b}{a}. Since a \gg b, most of these terms can be neglected. So, we can write

\left(1+\dfrac{b}{a}\right)^{n}=1+n \dfrac{b}{a}+O\left(\left(\dfrac{b}{a}\right)^{2}\right) \nonumber

Note that we have used the observation that the second coefficient in the nth row of Pascal’s triangle is n.

Summarizing, this then gives

\begin{aligned} (a+b)^{n} &=a^{n}\left(1+\dfrac{b}{a}\right)^{n} \\[4pt] &=a^{n}\left(1+n \dfrac{b}{a}+O\left(\left(\dfrac{b}{a}\right)^{2}\right)\right) \\[4pt] &=a^{n}+n a^{n} \dfrac{b}{a}+a^{n} O\left(\left(\dfrac{b}{a}\right)^{2}\right) \end{aligned} \nonumber

Therefore, we can approximate (a+b)^{n} \simeq a^{n}+n b a^{n-1}, with an error on the order of b^{2} a^{n-2}. Note that the order of the error does not include the constant factor from the expansion. We could also use the approximation that (a+b)^{n} \simeq a^{n}, but it is not as good because the error in this case is of the order b a^{n-1}.

We have seen that

\dfrac{1}{1-x}=1+x+x^{2}+\ldots \nonumber

But, \dfrac{1}{1-x}=(1-x)^{-1}. This is again a binomial to a power, but the power is not a nonnegative integer. It turns out that the coefficients of such a binomial expansion can be written in a form similar to that in Equation (7.60).

This example suggests that our sum may no longer be finite. So, for p a real number, we write

(1+x)^{p}=\sum_{r=0}^{\infty}\left(\begin{array}{l} p \\[4pt] r \end{array}\right) x^{r} . \nonumber

However, we quickly run into problems with this form. Consider the coefficient for r=1 in an expansion of (1+x)^{-1}. This is given by

\left(\begin{array}{c} -1 \\[4pt] 1 \end{array}\right)=\dfrac{(-1) !}{(-1-1) ! 1 !}=\dfrac{(-1) !}{(-2) ! 1 !} \text {. } \nonumber

But what is (-1) ! ? By definition, it is

(-1) !=(-1)(-2)(-3) \cdots . \nonumber

This product does not seem to exist! But with a little care, we note that

\dfrac{(-1) !}{(-2) !}=\dfrac{(-1)(-2) !}{(-2) !}=-1 \text {. } \nonumber

So, we need to be careful not to interpret the combinatorial coefficient literally. There are better ways to write the general binomial expansion. We can write the general coefficient as

\begin{aligned} \left(\begin{array}{l} p \\[4pt] r \end{array}\right) &=\dfrac{p !}{(p-r) ! r !} \\[4pt] &=\dfrac{p(p-1) \cdots(p-r+1)(p-r) !}{(p-r) ! r !} \\[4pt] &=\dfrac{p(p-1) \cdots(p-r+1)}{r !} . \end{aligned} \nonumber

With this in mind we now state the theorem:

General Binomial Expansion The general binomial expansion for (1+ x)^{p} is a simple generalization of Equation (7.60). For p real, we have that

\begin{aligned} (1+x)^{p} &=\sum_{r=0}^{\infty} \dfrac{p(p-1) \cdots(p-r+1)}{r !} x^{r} \\[4pt] &=\sum_{r=0}^{\infty} \dfrac{\Gamma(p+1)}{r ! \Gamma(p-r+1)} x^{r} \end{aligned} \nonumber

Often we need the first few terms for the case that x \ll 1 :

(1+x)^{p}=1+p x+\dfrac{p(p-1)}{2} x^{2}+O\left(x^{3}\right) . \nonumber
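As a small numerical illustration (ours, not from the text), the coefficients p(p-1) \cdots(p-r+1) / r ! can be computed directly and the truncated series compared with (1+x)^{p}; the function name gen_binom is our own.

import math

def gen_binom(p, r):
    # generalized binomial coefficient for real p and nonnegative integer r
    num = 1.0
    for k in range(r):
        num *= (p - k)
    return num / math.factorial(r)

p, x = -0.5, 0.1
approx = sum(gen_binom(p, r) * x**r for r in range(10))
print(approx, (1 + x)**p)    # the truncated series closely matches (1 + x)**(-1/2)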

Problems

7.1. Consider the set of vectors (-1,1,1),(1,-1,1),(1,1,-1).

a. Use the Gram-Schmidt process to find an orthonormal basis for R^{3} using this set in the given order.

b. What do you get if you reverse the order of these vectors?

7.2. Use the Gram-Schmidt process to find the first four orthogonal polynomials satisfying the following:

a. Interval: (-\infty, \infty) Weight Function: e^{-x^{2}} .

b. Interval: (0, \infty) Weight Function: e^{-x}.

7.3. Find P_{4}(x) using

a. The Rodrigues Formula in Equation (7.12)

b. The three term recursion formula in Equation (7.14).

7.4. Use the generating function for Legendre polynomials to derive the recursion formula P_{n+1}^{\prime}(x)-P_{n-1}^{\prime}(x)=(2 n+1) P_{n}(x). Namely, consider \dfrac{\partial g(x, t)}{\partial x} using Equation (7.18) to derive a three term derivative formula. Then use three term recursion formula (7.14) to obtain the above result.

7.5. Use the recursion relation (7.14) to evaluate \int_{-1}^{1} x P_{n}(x) P_{m}(x) d x, n \leq m.

7.6. Expand the following in a Fourier-Legendre series for x \in(-1,1).
a. f(x)=x^{2}.
b. f(x)=5 x^{4}+2 x^{3}-x+3.
c. f(x)=\left\{\begin{array}{c}-1,-1<x<0, \\[4pt] 1, \quad 0<x<1 .\end{array}\right.
d. f(x)=\left\{\begin{array}{l}x,-1<x<0 \\[4pt] 0,0<x<1\end{array}\right.

7.7. Use integration by parts to show \Gamma(x+1)=x \Gamma(x).

7.8. Express the following as Gamma functions. Namely, noting the form \Gamma(x+1)=\int_{0}^{\infty} t^{x} e^{-t} d t and using an appropriate substitution, each expression can be written in terms of a Gamma function.

a. \int_{0}^{\infty} x^{2 / 3} e^{-x} d x

b. \int_{0}^{\infty} x^{5} e^{-x^{2}} d x

c. \int_{0}^{1}\left[\ln \left(\dfrac{1}{x}\right)\right]^{n} d x

7.9. The Hermite polynomials, H_{n}(x), satisfy the following:

i. <H_{n}, H_{m}>=\int_{-\infty}^{\infty} e^{-x^{2}} H_{n}(x) H_{m}(x) d x=\sqrt{\pi} 2^{n} n ! \delta_{n, m}.

ii. H_{n}^{\prime}(x)=2 n H_{n-1}(x).

iii. H_{n+1}(x)=2 x H_{n}(x)-2 n H_{n-1}(x).

iv. H_{n}(x)=(-1)^{n} e^{x^{2}} \dfrac{d^{n}}{d x^{n}}\left(e^{-x^{2}}\right)

Using these, show that

a. H_{n}^{\prime \prime}-2 x H_{n}^{\prime}+2 n H_{n}=0. [Use properties ii. and iii.]

b. \int_{-\infty}^{\infty} x e^{-x^{2}} H_{n}(x) H_{m}(x) d x=\sqrt{\pi} 2^{n-1} n !\left[\delta_{m, n-1}+2(n+1) \delta_{m, n+1}\right]. [Use properties i. and iii.]

c. H_{n}(0)=\left\{\begin{array}{cc}0, & n \text { odd, } \\[4pt] (-1)^{m} \dfrac{(2 m) !}{m !}, & n=2 m\end{array}\right. [Let x=0 in iii. and iterate. Note from iv. that H_{0}(x)=1 and H_{1}(x)=2 x.]

7.10. In Maple one can type simplify(LegendreP(2*n-2,0) - LegendreP(2*n,0)); to find a value for P_{2 n-2}(0)-P_{2 n}(0). It gives the result in terms of Gamma functions. However, in Example 7.6 for Fourier-Legendre series, the value is given in terms of double factorials! So, we have

P_{2 n-2}(0)-P_{2 n}(0)=\dfrac{\sqrt{\pi}(4 n-1)}{2 \Gamma(n+1) \Gamma\left(\dfrac{3}{2}-n\right)}=(-1)^{n-1} \dfrac{(2 n-3) ! !}{(2 n-2) ! !} \dfrac{4 n-1}{2 n} . \nonumber

You will verify that both results are the same by doing the following:

a. Prove that P_{2 n}(0)=(-1)^{n} \dfrac{(2 n-1) ! !}{(2 n) ! !} using the generating function and a binomial expansion.

b. Prove that \Gamma\left(n+\dfrac{1}{2}\right)=\dfrac{(2 n-1) ! !}{2^{n}} \sqrt{\pi} using \Gamma(x)=(x-1) \Gamma(x-1) and iteration.

c. Verify the result from Maple that P_{2 n-2}(0)-P_{2 n}(0)=\dfrac{\sqrt{\pi}(4 n-1)}{2 \Gamma(n+1) \Gamma\left(\dfrac{3}{2}-n\right)}.

d. Can either expression for P_{2 n-2}(0)-P_{2 n}(0) be simplified further?

7.11. A solution of Bessel’s equation, x^{2} y^{\prime \prime}+x y^{\prime}+\left(x^{2}-n^{2}\right) y=0, can be found using the guess y(x)=\sum_{j=0}^{\infty} a_{j} x^{j+n}. One obtains the recurrence relation a_{j}=\dfrac{-1}{j(2 n+j)} a_{j-2}. Show that for a_{0}=\left(n ! 2^{n}\right)^{-1} we get the Bessel function of the first kind of order n from the even values j=2 k :

J_{n}(x)=\sum_{k=0}^{\infty} \dfrac{(-1)^{k}}{k !(n+k) !}\left(\dfrac{x}{2}\right)^{n+2 k} \nonumber

7.12. Use the infinite series in the last problem to derive the derivative identities (7.41) and (7.42):

a. \dfrac{d}{d x}\left[x^{n} J_{n}(x)\right]=x^{n} J_{n-1}(x).

b. \dfrac{d}{d x}\left[x^{-n} J_{n}(x)\right]=-x^{-n} J_{n+1}(x)

7.13. Bessel functions J_{p}(\lambda x) are solutions of x^{2} y^{\prime \prime}+x y^{\prime}+\left(\lambda^{2} x^{2}-p^{2}\right) y=0. Assume that x \in(0,1) and that J_{p}(\lambda)=0 and J_{p}(0) is finite.

a. Put this differential equation into Sturm-Liouville form.

b. Prove that solutions corresponding to different eigenvalues are orthogonal by first writing the corresponding Green’s identity using these Bessel functions.

c. Prove that

\int_{0}^{1} x J_{p}^{2}(\lambda x) d x=\dfrac{1}{2} J_{p+1}^{2}(\lambda)=\dfrac{1}{2} J_{p}^{\prime 2}(\lambda) \nonumber

Note that \lambda is a zero of J_{p}(x)

7.14. We can rewrite our Bessel function in a form which will allow the order to be non-integer by using the Gamma function. You will need the result from Problem 7.10b for \Gamma\left(k+\dfrac{1}{2}\right).

a. Extend the series definition of the Bessel function of the first kind of order \nu, J_{\nu}(x), for \nu \geq 0 by writing the series solution for y(x) in Problem 7.11 using the gamma function.

b. Extend the series to J_{-\nu}(x), for \nu \geq 0. Discuss the resulting series and what happens when \nu is a positive integer.

c. Use these results to obtain closed form expressions for J_{1 / 2}(x) and J_{-1 / 2}(x). Use the recursion formula for Bessel functions to obtain a closed form for J_{3 / 2}(x).

7.15. In this problem you will derive the expansion

x^{2}=\dfrac{c^{2}}{2}+4 \sum_{j=2}^{\infty} \dfrac{J_{0}\left(\alpha_{j} x\right)}{\alpha_{j}^{2} J_{0}\left(\alpha_{j} c\right)}, \quad 0<x<c \nonumber

where the \alpha_{j}'s are the positive roots of J_{1}(\alpha c)=0, by following the steps below.

a. List the first five values of \alpha for J_{1}(\alpha c)=0 using Table 7.4 and Figure 7.7. [Note: Be careful determining \alpha_{1}.]

b. Show that \left\|J_{0}\left(\alpha_{1} x\right)\right\|^{2}=\dfrac{c^{2}}{2}. Recall,

\left\|J_{0}\left(\alpha_{j} x\right)\right\|^{2}=\int_{0}^{c} x J_{0}^{2}\left(\alpha_{j} x\right) d x \nonumber

c. Show that \left\|J_{0}\left(\alpha_{j} x\right)\right\|^{2}=\dfrac{c^{2}}{2}\left[J_{0}\left(\alpha_{j} c\right)\right]^{2}, j=2,3, \ldots (This is the most involved step.) First note from Problem 7.13 that y(x)=J_{0}\left(\alpha_{j} x\right) is a solution of

x^{2} y^{\prime \prime}+x y^{\prime}+\alpha_{j}^{2} x^{2} y=0 . \nonumber

i. Show that the Sturm-Liouville form of this differential equation is \left(x y^{\prime}\right)^{\prime}=-\alpha_{j}^{2} x y

ii. Multiply the equation in part i. by y(x) and integrate from x=0 to x=c to obtain

\begin{aligned} \int_{0}^{c}\left(x y^{\prime}\right)^{\prime} y d x &=-\alpha_{j}^{2} \int_{0}^{c} x y^{2} d x \\[4pt] &=-\alpha_{j}^{2} \int_{0}^{c} x J_{0}^{2}\left(\alpha_{j} x\right) d x \end{aligned} \nonumber

iii. Noting that y(x)=J_{0}\left(\alpha_{j} x\right), integrate the left hand side by parts and use the following to simplify the resulting equation.

  1. J_{0}^{\prime}(x)=-J_{1}(x) from Equation (7.42).
  2. Equation (7.45)
  3. J_{2}\left(\alpha_{j} c\right)+J_{0}\left(\alpha_{j} c\right)=0 from Equation (7.43).

iv. Now you should have enough information to complete this part.

d. Use the results from parts b and c to derive the expansion coefficients for

x^{2}=\sum_{j=1}^{\infty} c_{j} J_{0}\left(\alpha_{j} x\right) \nonumber

in order to obtain the desired expansion.

7.16. Use the derivative identities of Bessel functions, (7.41)-(7.42), and integration by parts to show that

\int x^{3} J_{0}(x) d x=x^{3} J_{1}(x)-2 x^{2} J_{2}(x) \nonumber

Green’s Functions

In this chapter we will investigate the solution of nonhomogeneous differential equations using Green’s functions. Our goal is to solve the nonhomogeneous differential equation

L[u]=f, \nonumber

where L is a differential operator. The solution is formally given by

u=L^{-1}[f] \nonumber

The inverse of a differential operator is an integral operator, which we seek to write in the form

u=\int G(x, \xi) f(\xi) d \xi \nonumber

The function G(x, \xi) is referred to as the kernel of the integral operator and is called the Green’s function.

The history of the Green’s function dates back to 1828 , when George Green published work in which he sought solutions of Poisson’s equation \nabla^{2} u=f for the electric potential u defined inside a bounded volume with specified boundary conditions on the surface of the volume. He introduced a function now identified as what Riemann later coined the "Green’s function".

We will restrict our discussion to Green’s functions for ordinary differential equations. Extensions to partial differential equations are typically one of the subjects of a PDE course. We will begin our investigations by examining solutions of nonhomogeneous second order linear differential equations using the Method of Variation of Parameters, which is typically seen in a first course on differential equations. We will identify the Green’s function for both initial value and boundary value problems. We will then focus on boundary value Green’s functions and their properties. Determination of Green’s functions is also possible using Sturm-Liouville theory. This leads to series representation of Green’s functions, which we will study in the last section of this chapter.

The Method of Variation of Parameters

We are interested in solving nonhomogeneous second order linear differential equations of the form

a_{2}(x) y^{\prime \prime}(x)+a_{1}(x) y^{\prime}(x)+a_{0}(x) y(x)=f(x) . \nonumber

The general solution of this nonhomogeneous second order linear differential equation is found as a sum of the general solution of the homogeneous equation,

a_{2}(x) y^{\prime \prime}(x)+a_{1}(x) y^{\prime}(x)+a_{0}(x) y(x)=0, \nonumber

and a particular solution of the nonhomogeneous equation. Recall from Chapter 1 that there are several approaches to finding particular solutions of nonhomogeneous equations. Any guess that works would be sufficient. An intelligent guess, based upon the Method of Undetermined Coefficients, was reviewed previously in Chapter 1. However, a more methodical approach, which is first seen in a first course in differential equations, is the Method of Variation of Parameters. Also, we explored the matrix version of this method in Section 2.8. We will review this method in this section and extend it to the solution of boundary value problems.

While it is sufficient to derive the method for the general differential equation above, we will instead consider solving equations that are in SturmLiouville, or self-adjoint, form. Therefore, we will apply the Method of Variation of Parameters to the equation

\dfrac{d}{d x}\left(p(x) \dfrac{d y(x)}{d x}\right)+q(x) y(x)=f(x) \nonumber

Note that f(x) in this equation is not the same function as in the general equation posed at the beginning of this section.

We begin by assuming that we have determined two linearly independent solutions of the homogeneous equation. The general solution is then given by

y(x)=c_{1} y_{1}(x)+c_{2} y_{2}(x) . \nonumber

In order to determine a particular solution of the nonhomogeneous equation, we vary the parameters c_{1} and c_{2} in the solution of the homogeneous problem by making them functions of the independent variable. Thus, we seek a particular solution of the nonhomogeneous equation in the form

y_{p}(x)=c_{1}(x) y_{1}(x)+c_{2}(x) y_{2}(x) . \nonumber

In order for this to be a solution, we need to show that it satisfies the differential equation. We first compute the derivatives of y_{p}(x). The first derivative is

y_{p}^{\prime}(x)=c_{1}(x) y_{1}^{\prime}(x)+c_{2}(x) y_{2}^{\prime}(x)+c_{1}^{\prime}(x) y_{1}(x)+c_{2}^{\prime}(x) y_{2}(x) . \nonumber

Without loss of generality, we will set the sum of the last two terms to zero. (One can show that the same results would be obtained if we did not. See Problem 8.2.) Then, we have

c_{1}^{\prime}(x) y_{1}(x)+c_{2}^{\prime}(x) y_{2}(x)=0 . \nonumber

Now, we take the second derivative of the remaining terms to obtain

y_{p}^{\prime \prime}(x)=c_{1}(x) y_{1}^{\prime \prime}(x)+c_{2}(x) y_{2}^{\prime \prime}(x)+c_{1}^{\prime}(x) y_{1}^{\prime}(x)+c_{2}^{\prime}(x) y_{2}^{\prime}(x) \nonumber

Expanding the derivative term in Equation (8.3),

p(x) y_{p}^{\prime \prime}(x)+p^{\prime}(x) y_{p}^{\prime}(x)+q(x) y_{p}(x)=f(x), \nonumber

and inserting the expressions for y_{p}, y_{p}^{\prime}(x), and y_{p}^{\prime \prime}(x), we have

\begin{aligned} f(x)=& p(x)\left[c_{1}(x) y_{1}^{\prime \prime}(x)+c_{2}(x) y_{2}^{\prime \prime}(x)+c_{1}^{\prime}(x) y_{1}^{\prime}(x)+c_{2}^{\prime}(x) y_{2}^{\prime}(x)\right] \\[4pt] &+p^{\prime}(x)\left[c_{1}(x) y_{1}^{\prime}(x)+c_{2}(x) y_{2}^{\prime}(x)\right]+q(x)\left[c_{1}(x) y_{1}(x)+c_{2}(x) y_{2}(x)\right] . \end{aligned} \nonumber

Rearranging terms, we find

\begin{aligned} f(x)=& c_{1}(x)\left[p(x) y_{1}^{\prime \prime}(x)+p^{\prime}(x) y_{1}^{\prime}(x)+q(x) y_{1}(x)\right] \\[4pt] &+c_{2}(x)\left[p(x) y_{2}^{\prime \prime}(x)+p^{\prime}(x) y_{2}^{\prime}(x)+q(x) y_{2}(x)\right] \\[4pt] &+p(x)\left[c_{1}^{\prime}(x) y_{1}^{\prime}(x)+c_{2}^{\prime}(x) y_{2}^{\prime}(x)\right] . \end{aligned} \nonumber

Since y_{1}(x) and y_{2}(x) are both solutions of the homogeneous equation, the first two bracketed expressions vanish. Dividing by p(x), we have that

c_{1}^{\prime}(x) y_{1}^{\prime}(x)+c_{2}^{\prime}(x) y_{2}^{\prime}(x)=\dfrac{f(x)}{p(x)} . \nonumber

Our goal is to determine c_{1}(x) and c_{2}(x). In this analysis, we have found that the derivatives of these functions satisfy a linear system of equations (in the c_{i} ’s):

Linear System for Variation of Parameters
c_{1}^{\prime}(x) y_{1}(x)+c_{2}^{\prime}(x) y_{2}(x)=0 .
c_{1}^{\prime}(x) y_{1}^{\prime}(x)+c_{2}^{\prime}(x) y_{2}^{\prime}(x)=\dfrac{f(x)}{p(x)}

This system is easily solved to give

\begin{aligned} c_{1}^{\prime}(x) &=-\dfrac{f(x) y_{2}(x)}{p(x)\left[y_{1}(x) y_{2}^{\prime}(x)-y_{1}^{\prime}(x) y_{2}(x)\right]} \\[4pt] c_{2}^{\prime}(x) &=\dfrac{f(x) y_{1}(x)}{p(x)\left[y_{1}(x) y_{2}^{\prime}(x)-y_{1}^{\prime}(x) y_{2}(x)\right]} \end{aligned} \nonumber

We note that the denominator in these expressions involves the Wronskian of the solutions to the homogeneous problem. Recall that

W\left(y_{1}, y_{2}\right)(x)=\left|\begin{array}{ll} y_{1}(x) & y_{2}(x) \\[4pt] y_{1}^{\prime}(x) & y_{2}^{\prime}(x) \end{array}\right| \nonumber
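
As a quick check on this algebra, one can let a computer algebra system solve the linear system for c_{1}^{\prime}(x) and c_{2}^{\prime}(x). The following sympy sketch (an addition for verification, not part of the original derivation) reproduces the expressions above, with the Wronskian appearing in the denominator.

import sympy as sp

x = sp.symbols('x')
y1, y2, f, p = (sp.Function(name)(x) for name in ('y1', 'y2', 'f', 'p'))
c1p, c2p = sp.symbols('c1p c2p')      # stand-ins for c_1'(x) and c_2'(x)

# the linear system for Variation of Parameters
eqs = [sp.Eq(c1p*y1 + c2p*y2, 0),
       sp.Eq(c1p*sp.diff(y1, x) + c2p*sp.diff(y2, x), f/p)]
sol = sp.solve(eqs, [c1p, c2p])

W = y1*sp.diff(y2, x) - sp.diff(y1, x)*y2     # Wronskian of y1 and y2
print(sp.simplify(sol[c1p] + f*y2/(p*W)))     # 0, so c1' = -f y2 / (p W)
print(sp.simplify(sol[c2p] - f*y1/(p*W)))     # 0, so c2' =  f y1 / (p W)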

Furthermore, we can show that the denominator, p(x) W(x), is constant. Differentiating this expression and using the homogeneous form of the differential equation proves this assertion.

\begin{aligned} \dfrac{d}{d x}(p(x) W(x)) &= \dfrac{d}{d x}\left[p(x)\left(y_{1}(x) y_{2}^{\prime}(x)-y_{1}^{\prime}(x) y_{2}(x)\right)\right] \\[4pt] &= y_{1}(x) \dfrac{d}{d x}\left(p(x) y_{2}^{\prime}(x)\right)+p(x) y_{2}^{\prime}(x) y_{1}^{\prime}(x) \\[4pt] &\quad -y_{2}(x) \dfrac{d}{d x}\left(p(x) y_{1}^{\prime}(x)\right)-p(x) y_{1}^{\prime}(x) y_{2}^{\prime}(x) \\[4pt] &= -y_{1}(x) q(x) y_{2}(x)+y_{2}(x) q(x) y_{1}(x)=0 . \end{aligned} \nonumber

Therefore,

p(x) W(x)=\text { constant. } \nonumber

So, after an integration, we find the parameters as

\begin{aligned} &c_{1}(x)=-\int_{x_{0}}^{x} \dfrac{f(\xi) y_{2}(\xi)}{p(\xi) W(\xi)} d \xi \\[4pt] &c_{2}(x)=\int_{x_{1}}^{x} \dfrac{f(\xi) y_{1}(\xi)}{p(\xi) W(\xi)} d \xi \end{aligned} \nonumber

where x_{0} and x_{1} are arbitrary constants to be determined later.

Therefore, the particular solution of (8.3) can be written as

y_{p}(x)=y_{2}(x) \int_{x_{1}}^{x} \dfrac{f(\xi) y_{1}(\xi)}{p(\xi) W(\xi)} d \xi-y_{1}(x) \int_{x_{0}}^{x} \dfrac{f(\xi) y_{2}(\xi)}{p(\xi) W(\xi)} d \xi \nonumber

As a further note, we usually do not rewrite our initial value problems in self-adjoint form. Recall that for an equation of the form

a_{2}(x) y^{\prime \prime}(x)+a_{1}(x) y^{\prime}(x)+a_{0}(x) y(x)=g(x) . \nonumber

we obtained the self-adjoint form by multiplying the equation by

\dfrac{1}{a_{2}(x)} e^{\int \dfrac{a_{1}(x)}{a_{2}(x)} d x}=\dfrac{1}{a_{2}(x)} p(x) . \nonumber

This gives the standard form

\left(p(x) y^{\prime}(x)\right)^{\prime}+q(x) y(x)=f(x) \nonumber

where

f(x)=\dfrac{1}{a_{2}(x)} p(x) g(x) . \nonumber

With this in mind, Equation (8.13) becomes

y_{p}(x)=y_{2}(x) \int_{x_{1}}^{x} \dfrac{g(\xi) y_{1}(\xi)}{a_{2}(\xi) W(\xi)} d \xi-y_{1}(x) \int_{x_{0}}^{x} \dfrac{g(\xi) y_{2}(\xi)}{a_{2}(\xi) W(\xi)} d \xi . \nonumber

Example 8.1. Consider the nonhomogeneous differential equation

y^{\prime \prime}-y^{\prime}-6 y=20 e^{-2 x} . \nonumber

We seek a particular solution to this equation. First, we note that two linearly independent solutions of the corresponding homogeneous equation are

y_{1}(x)=e^{3 x}, \quad y_{2}(x)=e^{-2 x} . \nonumber

So, the particular solution takes the form

y_{p}(x)=c_{1}(x) e^{3 x}+c_{2}(x) e^{-2 x} \nonumber

We just need to determine the c_{i} ’s. Since this problem is not in self-adjoint form, we will use

\dfrac{f(x)}{p(x)}=\dfrac{g(x)}{a_{2}(x)}=20 e^{-2 x} \nonumber

as seen above. Then the linear system we have to solve is

\begin{aligned} c_{1}^{\prime}(x) e^{3 x}+c_{2}^{\prime}(x) e^{-2 x} &=0 \\[4pt] 3 c_{1}^{\prime}(x) e^{3 x}-2 c_{2}^{\prime}(x) e^{-2 x} &=20 e^{-2 x} \end{aligned} \nonumber

Multiplying the first equation by 2 and adding the equations yields

5 c_{1}^{\prime}(x) e^{3 x}=20 e^{-2 x} \nonumber

or

c_{1}^{\prime}(x)=4 e^{-5 x} \nonumber

Inserting this back into the first equation in the system, we have

4 e^{-2 x}+c_{2}^{\prime}(x) e^{-2 x}=0, \nonumber

leading to

c_{2}^{\prime}(x)=-4 . \nonumber

These equations are easily integrated to give

c_{1}(x)=-\dfrac{4}{5} e^{-5 x}, \quad c_{2}(x)=-4 x . \nonumber

Therefore, the particular solution has been found as

\begin{aligned} y_{p}(x) &=c_{1}(x) e^{3 x}+c_{2}(x) e^{-2 x} \\[4pt] &=-\dfrac{4}{5} e^{-5 x} e^{3 x}-4 x e^{-2 x} \\[4pt] &=-\dfrac{4}{5} e^{-2 x}-4 x e^{-2 x} \end{aligned} \nonumber

Note that the first term can be absorbed into the solution of the homogeneous problem, so the particular solution can simply be written as

y_{p}(x)=-4 x e^{-2 x} \nonumber

This is the answer you would have found had you used the Modified Method of Undetermined Coefficients.
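
As a sanity check (a short sympy sketch added here, not part of the text), one can confirm that y_{p}(x)=-4 x e^{-2 x} satisfies the nonhomogeneous equation.

import sympy as sp

x = sp.symbols('x')
yp = -4*x*sp.exp(-2*x)
# residual of y'' - y' - 6y - 20 e^{-2x}; it should vanish identically
residual = sp.diff(yp, x, 2) - sp.diff(yp, x) - 6*yp - 20*sp.exp(-2*x)
print(sp.simplify(residual))   # 0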

Example 8.2. Revisiting the last example, y^{\prime \prime}-y^{\prime}-6 y=20 e^{-2 x}.

The formal solution in Equation (8.13) was not used in the last example. Instead, we proceeded from the Linear System for Variation of Parameters earlier in this section. This is the more natural approach towards finding the particular solution of the nonhomogeneous equation. Since we will be using Equation (8.13) to obtain solutions to initial value and boundary value problems, it might be useful to use it to solve this problem.

From the last example we have

y_{1}(x)=e^{3 x}, \quad y_{2}(x)=e^{-2 x} \nonumber

We need to compute the Wronskian:

W(x)=W\left(y_{1}, y_{2}\right)(x)=\left|\begin{array}{cc} e^{3 x} & e^{-2 x} \\[4pt] 3 e^{3 x} & -2 e^{-2 x} \end{array}\right|=-5 e^{x} \nonumber

Also, we need p(x), which is given by

p(x)=\exp \left(-\int d x\right)=e^{-x} . \nonumber

So, we see that p(x) W(x)=-5. It is indeed constant, just as we had proven earlier.

Finally, we need f(x). Here is where one needs to be careful as the original problem was not in self-adjoint form. We have from the original equation that g(x)=20 e^{-2 x} and a_{2}(x)=1. So,

f(x)=\dfrac{p(x)}{a_{2}(x)} g(x)=20 e^{-3 x} \nonumber

Now we are ready to construct the solution.

\begin{aligned} y_{p}(x) &=y_{2}(x) \int_{x_{1}}^{x} \dfrac{f(\xi) y_{1}(\xi)}{p(\xi) W(\xi)} d \xi-y_{1}(x) \int_{x_{0}}^{x} \dfrac{f(\xi) y_{2}(\xi)}{p(\xi) W(\xi)} d \xi \\[4pt] &=e^{-2 x} \int_{x_{1}}^{x} \dfrac{20 e^{-3 \xi} e^{3 \xi}}{-5} d \xi-e^{3 x} \int_{x_{0}}^{x} \dfrac{20 e^{-3 \xi} e^{-2 \xi}}{-5} d \xi \\[4pt] &=-4 e^{-2 x} \int_{x_{1}}^{x} d \xi+4 e^{3 x} \int_{x_{0}}^{x} e^{-5 \xi} d \xi \\[4pt] &=-\left.4 \xi e^{-2 x}\right|_{x_{1}} ^{x}-\left.\dfrac{4}{5} e^{3 x} e^{-5 \xi}\right|_{x_{0}} ^{x} \\[4pt] &=-4 x e^{-2 x}-\dfrac{4}{5} e^{-2 x}+4 x_{1} e^{-2 x}+\dfrac{4}{5} e^{-5 x_{0}} e^{3 x} \end{aligned} \nonumber

Note that the first two terms are the particular solution found in the last example. The remaining two terms are simply a linear combination of y_{1} and y_{2}. Thus, the solution of the homogeneous problem is really contained within the solution when we use arbitrary constants as the lower limits of the integrals. In the next section we will make use of these constants when solving initial value and boundary value problems.

In the next section we will determine the unknown constants subject to either initial conditions or boundary conditions. This will allow us to combine the two integrals and then determine the appropriate Green’s functions.

Initial and Boundary Value Green’s Functions

We begin with the particular solution (8.13) of our nonhomogeneous differential equation (8.3). This can be combined with the general solution of the homogeneous problem to give the general solution of the nonhomogeneous differential equation:

y(x)=c_{1} y_{1}(x)+c_{2} y_{2}(x)+y_{2}(x) \int_{x_{1}}^{x} \dfrac{f(\xi) y_{1}(\xi)}{p(\xi) W(\xi)} d \xi-y_{1}(x) \int_{x_{0}}^{x} \dfrac{f(\xi) y_{2}(\xi)}{p(\xi) W(\xi)} d \xi \nonumber

As seen in the last section, an appropriate choice of x_{0} and x_{1} can be found so that we need not explicitly write out the solution to the homogeneous problem, c_{1} y_{1}(x)+c_{2} y_{2}(x). However, setting up the solution in this form will allow us to use x_{0} and x_{1} to determine particular solutions that satisfy certain homogeneous conditions.

We will now consider initial value and boundary value problems. Each type of problem will lead to a solution of the form

y(x)=c_{1} y_{1}(x)+c_{2} y_{2}(x)+\int_{a}^{b} G(x, \xi) f(\xi) d \xi \nonumber

where the function G(x, \xi) will be identified as the Green’s function and the appropriate integration limits will be determined. Having identified the Green’s function, we will look at other methods for determining it in the last section.

Initial Value Green’s Function

We begin by considering the solution of the initial value problem

\begin{array}{r} \dfrac{d}{d x}\left(p(x) \dfrac{d y(x)}{d x}\right)+q(x) y(x)=f(x) . \\[4pt] y(0)=y_{0}, \quad y^{\prime}(0)=v_{0} . \end{array} \nonumber

Of course, we could have studied the original form of our differential equation without writing it in self-adjoint form. However, this form is useful when studying boundary value problems. We will return to this point later.

We first note that we can solve this initial value problem by solving two separate initial value problems. We assume that the solution of the homogeneous problem satisfies the original initial conditions:

\begin{aligned} \dfrac{d}{d x}\left(p(x) \dfrac{d y_{h}(x)}{d x}\right)+q(x) y_{h}(x) &=0 \\[4pt] y_{h}(0)=y_{0}, \quad y_{h}^{\prime}(0) &=v_{0} \end{aligned} \nonumber

We then assume that the particular solution satisfies the problem

\begin{array}{r} \dfrac{d}{d x}\left(p(x) \dfrac{d y_{p}(x)}{d x}\right)+q(x) y_{p}(x)=f(x) \\[4pt] y_{p}(0)=0, \quad y_{p}^{\prime}(0)=0 \end{array} \nonumber

Since the differential equation is linear, we know that y(x)=y_{h}(x)+y_{p}(x) is a solution of the nonhomogeneous equation. Moreover, this solution satisfies the initial conditions:

\begin{gathered} y(0)=y_{h}(0)+y_{p}(0)=y_{0}+0=y_{0}, \\[4pt] y^{\prime}(0)=y_{h}^{\prime}(0)+y_{p}^{\prime}(0)=v_{0}+0=v_{0} . \end{gathered} \nonumber

Therefore, we need only focus on solving for the particular solution that satisfies homogeneous initial conditions.

Recall Equation (8.13) from the last section,

y_{p}(x)=y_{2}(x) \int_{x_{1}}^{x} \dfrac{f(\xi) y_{1}(\xi)}{p(\xi) W(\xi)} d \xi-y_{1}(x) \int_{x_{0}}^{x} \dfrac{f(\xi) y_{2}(\xi)}{p(\xi) W(\xi)} d \xi \nonumber

We now seek values for x_{0} and x_{1} which satisfy the homogeneous initial conditions, y_{p}(0)=0 and y_{p}^{\prime}(0)=0.

First, we consider y_{p}(0)=0. We have

y_{p}(0)=y_{2}(0) \int_{x_{1}}^{0} \dfrac{f(\xi) y_{1}(\xi)}{p(\xi) W(\xi)} d \xi-y_{1}(0) \int_{x_{0}}^{0} \dfrac{f(\xi) y_{2}(\xi)}{p(\xi) W(\xi)} d \xi \nonumber

Here, y_{1}(x) and y_{2}(x) are taken to be any solutions of the homogeneous differential equation. Let’s assume that y_{1}(0)=0 and y_{2}(0) \neq 0. Then we have

y_{p}(0)=y_{2}(0) \int_{x_{1}}^{0} \dfrac{f(\xi) y_{1}(\xi)}{p(\xi) W(\xi)} d \xi \nonumber

We can force y_{p}(0)=0 if we set x_{1}=0.

Now, we consider y_{p}^{\prime}(0)=0. First we differentiate the solution and find that

y_{p}^{\prime}(x)=y_{2}^{\prime}(x) \int_{0}^{x} \dfrac{f(\xi) y_{1}(\xi)}{p(\xi) W(\xi)} d \xi-y_{1}^{\prime}(x) \int_{x_{0}}^{x} \dfrac{f(\xi) y_{2}(\xi)}{p(\xi) W(\xi)} d \xi \nonumber

since the contributions from differentiating the integrals will cancel. Evaluating this result at x=0, we have

y_{p}^{\prime}(0)=-y_{1}^{\prime}(0) \int_{x_{0}}^{0} \dfrac{f(\xi) y_{2}(\xi)}{p(\xi) W(\xi)} d \xi \nonumber

Assuming that y_{1}^{\prime}(0) \neq 0, we can set x_{0}=0.

Thus, we have found that

\begin{aligned} y_{p}(x) &=y_{2}(x) \int_{0}^{x} \dfrac{f(\xi) y_{1}(\xi)}{p(\xi) W(\xi)} d \xi-y_{1}(x) \int_{0}^{x} \dfrac{f(\xi) y_{2}(\xi)}{p(\xi) W(\xi)} d \xi \\[4pt] &=\int_{0}^{x}\left[\dfrac{y_{1}(\xi) y_{2}(x)-y_{1}(x) y_{2}(\xi)}{p(\xi) W(\xi)}\right] f(\xi) d \xi \end{aligned} \nonumber

This result is in the correct form and we can identify the temporal, or initial value, Green’s function. So, the particular solution is given as

y_{p}(x)=\int_{0}^{x} G(x, \xi) f(\xi) d \xi, \nonumber

where the initial value Green’s function is defined as

G(x, \xi)=\dfrac{y_{1}(\xi) y_{2}(x)-y_{1}(x) y_{2}(\xi)}{p(\xi) W(\xi)} \nonumber

We summarize this result as follows.

Solution of Initial Value Problem (8.21)

The solution of the initial value problem (8.21) takes the form

y(x)=y_{h}(x)+\int_{0}^{x} G(x, \xi) f(\xi) d \xi, \nonumber

where

G(x, \xi)=\dfrac{y_{1}(\xi) y_{2}(x)-y_{1}(x) y_{2}(\xi)}{p(\xi) W(\xi)} \nonumber

and the solution of the homogeneous problem satisfies the initial conditions,

y_{h}(0)=y_{0}, \quad y_{h}^{\prime}(0)=v_{0} \nonumber

Example 8.3. Solve the forced oscillator problem

x^{\prime \prime}+x=2 \cos t, \quad x(0)=4, \quad x^{\prime}(0)=0 . \nonumber

This problem was solved in Chapter 2 using the theory of nonhomogeneous systems. We first solve the homogeneous problem with nonhomogeneous initial conditions:

x_{h}^{\prime \prime}+x_{h}=0, \quad x_{h}(0)=4, \quad x_{h}^{\prime}(0)=0 . \nonumber

The solution is easily seen to be x_{h}(t)=4 \cos t.

Next, we construct the Green’s function. We need two linearly independent solutions, y_{1}(t), y_{2}(t), of the homogeneous differential equation satisfying y_{1}(0)=0 and y_{2}^{\prime}(0)=0. So, we pick y_{1}(t)=\sin t and y_{2}(t)=\cos t. The Wronskian is found as

W(t)=y_{1}(t) y_{2}^{\prime}(t)-y_{1}^{\prime}(t) y_{2}(t)=-\sin ^{2} t-\cos ^{2} t=-1 . \nonumber

Since p(t)=1 in this problem, we have

\begin{aligned} G(t, \tau) &=\dfrac{y_{1}(\tau) y_{2}(t)-y_{1}(t) y_{2}(\tau)}{p(\tau) W(\tau)} \\[4pt] &=\sin t \cos \tau-\sin \tau \cos t \\[4pt] &=\sin (t-\tau) \end{aligned} \nonumber

Note that the Green’s function depends on t-\tau. While this is useful in some contexts, we will use the expanded form.

We can now determine the particular solution of the nonhomogeneous differential equation. We have

\begin{aligned} x_{p}(t) &=\int_{0}^{t} G(t, \tau) f(\tau) d \tau \\[4pt] &=\int_{0}^{t}(\sin t \cos \tau-\sin \tau \cos t)(2 \cos \tau) d \tau \\[4pt] &=2 \sin t \int_{0}^{t} \cos ^{2} \tau d \tau-2 \cos t \int_{0}^{t} \sin \tau \cos \tau d \tau \\[4pt] &=2 \sin t\left[\dfrac{\tau}{2}+\dfrac{1}{4} \sin 2 \tau\right]_{0}^{t}-2 \cos t\left[\dfrac{1}{2} \sin ^{2} \tau\right]_{0}^{t} \\[4pt] &=t \sin t \end{aligned} \nonumber

Therefore, the solution of the initial value problem is x(t)=4 \cos t+t \sin t. This is the same solution we found earlier in Chapter 2.
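
As an aside, the Green’s function integral above can be evaluated symbolically and the full solution checked against the initial value problem. The following sympy lines are an added verification, not part of the original text.

import sympy as sp

t, tau = sp.symbols('t tau')
xp = sp.integrate(sp.sin(t - tau)*2*sp.cos(tau), (tau, 0, t))   # int_0^t G(t, tau) f(tau) dtau
print(sp.simplify(xp - t*sp.sin(t)))                            # 0, so x_p(t) = t sin t

sol = 4*sp.cos(t) + xp
print(sp.simplify(sp.diff(sol, t, 2) + sol - 2*sp.cos(t)))      # 0: the equation is satisfied
print(sp.simplify(sol.subs(t, 0)), sp.simplify(sp.diff(sol, t).subs(t, 0)))  # 4 and 0: the initial conditions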

As noted in the last section, we usually are not given the differential equation in self-adjoint form. Generally, it takes the form

a_{2}(x) y^{\prime \prime}(x)+a_{1}(x) y^{\prime}(x)+a_{0}(x) y(x)=g(x) . \nonumber

The driving term becomes

f(x)=\dfrac{1}{a_{2}(x)} p(x) g(x) . \nonumber

Inserting this into the Green’s function form of the particular solution, we obtain the following:

Solution Using the Green’s Function

The solution of the initial value problem,

a_{2}(x) y^{\prime \prime}(x)+a_{1}(x) y^{\prime}(x)+a_{0}(x) y(x)=g(x) \nonumber

takes the form

y(x)=c_{1} y_{1}(x)+c_{2} y_{2}(x)+\int_{0}^{x} G(x, \xi) g(\xi) d \xi, \nonumber

where the Green’s function is given by

G(x, \xi)=\dfrac{y_{1}(\xi) y_{2}(x)-y_{1}(x) y_{2}(\xi)}{a_{2}(\xi) W(\xi)} \nonumber

and y_{1}(x) and y_{2}(x) are solutions of the homogeneous equation satisfying

y_{1}(0)=0, y_{2}(0) \neq 0, y_{1}^{\prime}(0) \neq 0, y_{2}^{\prime}(0)=0 . \nonumber
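
To make this recipe concrete, here is a short numerical sketch (an illustration added here, not from the text). It applies the boxed formula to the equation of Example 8.1, y''-y'-6y=20e^{-2x}, with homogeneous initial conditions y(0)=y'(0)=0, and compares the Green’s function quadrature with a standard ODE solver. The pair y_{1}, y_{2} below is just one possible choice satisfying the stated conditions at x=0.

import numpy as np
from scipy.integrate import quad, solve_ivp

# homogeneous solutions with y1(0) = 0, y1'(0) != 0, y2(0) != 0, y2'(0) = 0
y1 = lambda x: np.exp(3*x) - np.exp(-2*x)
y2 = lambda x: 2*np.exp(3*x) + 3*np.exp(-2*x)
dy1 = lambda x: 3*np.exp(3*x) + 2*np.exp(-2*x)
dy2 = lambda x: 6*np.exp(3*x) - 6*np.exp(-2*x)
a2 = lambda x: 1.0
g = lambda x: 20*np.exp(-2*x)
W = lambda x: y1(x)*dy2(x) - dy1(x)*y2(x)

def G(x, xi):            # initial value Green's function from the box above
    return (y1(xi)*y2(x) - y1(x)*y2(xi))/(a2(xi)*W(xi))

def yp(x):               # y_p(x) = int_0^x G(x, xi) g(xi) dxi by quadrature
    return quad(lambda xi: G(x, xi)*g(xi), 0, x)[0]

# direct numerical solution of the same initial value problem
sol = solve_ivp(lambda x, Y: [Y[1], Y[1] + 6*Y[0] + g(x)], (0, 1), [0, 0],
                dense_output=True, rtol=1e-10, atol=1e-12)
for xv in (0.25, 0.5, 1.0):
    print(xv, yp(xv), sol.sol(xv)[0])   # the last two columns should agree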

Boundary Value Green’s Function

We now turn to boundary value problems. We will focus on the problem

\begin{array}{r} \dfrac{d}{d x}\left(p(x) \dfrac{d y(x)}{d x}\right)+q(x) y(x)=f(x), \quad a<x<b \\[4pt] y(a)=0, \quad y(b)=0 \end{array} \nonumber

However, the general theory works for other forms of homogeneous boundary conditions.

Once again, we seek values of x_{0} and x_{1} in the solution written in the form

y(x)=y_{2}(x) \int_{x_{1}}^{x} \dfrac{f(\xi) y_{1}(\xi)}{p(\xi) W(\xi)} d \xi-y_{1}(x) \int_{x_{0}}^{x} \dfrac{f(\xi) y_{2}(\xi)}{p(\xi) W(\xi)} d \xi \nonumber

so that the solution to the boundary value problem can be written as a single integral involving a Green’s function. Here we absorb y_{h}(x) into the integrals with an appropriate choice of lower limits on the integrals.

We first pick solutions of the homogeneous differential equation such that y_{1}(a)=0, y_{2}(b)=0 and y_{1}(b) \neq 0, y_{2}(a) \neq 0. So, we have

\begin{aligned} y(a) &=y_{2}(a) \int_{x_{1}}^{a} \dfrac{f(\xi) y_{1}(\xi)}{p(\xi) W(\xi)} d \xi-y_{1}(a) \int_{x_{0}}^{a} \dfrac{f(\xi) y_{2}(\xi)}{p(\xi) W(\xi)} d \xi \\[4pt] &=y_{2}(a) \int_{x_{1}}^{a} \dfrac{f(\xi) y_{1}(\xi)}{p(\xi) W(\xi)} d \xi . \end{aligned} \nonumber

This expression is zero if x_{1}=a.

At x=b we find that

\begin{aligned} y(b) &=y_{2}(b) \int_{x_{1}}^{b} \dfrac{f(\xi) y_{1}(\xi)}{p(\xi) W(\xi)} d \xi-y_{1}(b) \int_{x_{0}}^{b} \dfrac{f(\xi) y_{2}(\xi)}{p(\xi) W(\xi)} d \xi \\[4pt] &=-y_{1}(b) \int_{x_{0}}^{b} \dfrac{f(\xi) y_{2}(\xi)}{p(\xi) W(\xi)} d \xi \end{aligned} \nonumber

This vanishes for x_{0}=b.

So, we have found that

y(x)=y_{2}(x) \int_{a}^{x} \dfrac{f(\xi) y_{1}(\xi)}{p(\xi) W(\xi)} d \xi-y_{1}(x) \int_{b}^{x} \dfrac{f(\xi) y_{2}(\xi)}{p(\xi) W(\xi)} d \xi . \nonumber

We are seeking a Green’s function so that the solution can be written as one integral. We can move the functions of x under the integral. Also, since a<x<b, we can flip the limits in the second integral. This gives

y(x)=\int_{a}^{x} \dfrac{f(\xi) y_{1}(\xi) y_{2}(x)}{p(\xi) W(\xi)} d \xi+\int_{x}^{b} \dfrac{f(\xi) y_{1}(x) y_{2}(\xi)}{p(\xi) W(\xi)} d \xi \nonumber

This result can be written in a compact form:

Boundary Value Green’s Function

The solution of the boundary value problem takes the form

y(x)=\int_{a}^{b} G(x, \xi) f(\xi) d \xi, \nonumber

where the Green’s function is the piecewise defined function

G(x, \xi)= \begin{cases}\dfrac{y_{1}(\xi) y_{2}(x)}{p W}, & a \leq \xi \leq x, \\[4pt] \dfrac{y_{1}(x) y_{2}(\xi)}{p W}, & x \leq \xi \leq b .\end{cases} \nonumber

The Green’s function satisfies several properties, which we will explore further in the next section. For example, the Green’s function satisfies the boundary conditions at x=a and x=b. Thus,

\begin{aligned} G(a, \xi) &=\dfrac{y_{1}(a) y_{2}(\xi)}{p W}=0 \\[4pt] G(b, \xi) &=\dfrac{y_{1}(\xi) y_{2}(b)}{p W}=0 \end{aligned} \nonumber

Also, the Green’s function is symmetric in its arguments. Interchanging the arguments gives

G(\xi, x)=\left\{\begin{array}{ll} \dfrac{y_{1}(x) y_{2}(\xi)}{p W}, & a \leq x \leq \xi, \\[4pt] \dfrac{y_{1}(\xi) y_{2}(x)}{p W}, & \xi \leq x \leq b . \end{array}\right. \nonumber

But a careful look at the original form shows that

G(x, \xi)=G(\xi, x) \nonumber

We will make use of these properties in the next section to quickly determine the Green’s functions for other boundary value problems.

Example 8.4. Solve the boundary value problem y^{\prime \prime}=x^{2}, \quad y(0)=0=y(1) using the boundary value Green’s function.

We first solve the homogeneous equation, y^{\prime \prime}=0. After two integrations, we have y(x)=A x+B, for A and B constants to be determined.

We need one solution satisfying y_{1}(0)=0. Thus, 0=y_{1}(0)=B. So, we can pick y_{1}(x)=x, since A is arbitrary.

The other solution has to satisfy y_{2}(1)=0. So, 0=y_{2}(1)=A+B. This can be solved for B=-A. Again, A is arbitrary and we will choose A=-1. Thus, y_{2}(x)=1-x.

For this problem p(x)=1. Thus, for y_{1}(x)=x and y_{2}(x)=1-x,

p(x) W(x)=y_{1}(x) y_{2}^{\prime}(x)-y_{1}^{\prime}(x) y_{2}(x)=x(-1)-1(1-x)=-1 . \nonumber

Note that p(x) W(x) is a constant, as it should be. Now we construct the Green’s function. We have

G(x, \xi)=\left\{\begin{array}{l} -\xi(1-x), 0 \leq \xi \leq x \\[4pt] -x(1-\xi), x \leq \xi \leq 1 \end{array}\right. \nonumber

Notice the symmetry between the two branches of the Green’s function. Also, the Green’s function satisfies homogeneous boundary conditions: G(0, \xi)=0, from the lower branch, and G(1, \xi)=0, from the upper branch.

Finally, we insert the Green’s function into the integral form of the solution:

\begin{aligned} y(x) &=\int_{0}^{1} G(x, \xi) f(\xi) d \xi \\[4pt] &=\int_{0}^{1} G(x, \xi) \xi^{2} d \xi \\[4pt] &=-\int_{0}^{x} \xi(1-x) \xi^{2} d \xi-\int_{x}^{1} x(1-\xi) \xi^{2} d \xi \\[4pt] &=-(1-x) \int_{0}^{x} \xi^{3} d \xi-x \int_{x}^{1}\left(\xi^{2}-\xi^{3}\right) d \xi \\[4pt] &=-(1-x)\left[\dfrac{\xi^{4}}{4}\right]_{0}^{x}-x\left[\dfrac{\xi^{3}}{3}-\dfrac{\xi^{4}}{4}\right]_{x}^{1} \\[4pt] &=-\dfrac{1}{4}(1-x) x^{4}-\dfrac{1}{12} x(4-3)+\dfrac{1}{12} x\left(4 x^{3}-3 x^{4}\right) \\[4pt] &=\dfrac{1}{12}\left(x^{4}-x\right) \end{aligned} \nonumber
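
A quick sympy check (added here for verification) confirms both the value of this integral and the boundary value problem.

import sympy as sp

x, xi = sp.symbols('x xi')
# integrate f(xi) = xi^2 against the two branches of G(x, xi) found above
y = (sp.integrate(-xi*(1 - x)*xi**2, (xi, 0, x))
     + sp.integrate(-x*(1 - xi)*xi**2, (xi, x, 1)))
print(sp.simplify(y - (x**4 - x)/12))        # 0: matches the result above
print(sp.simplify(sp.diff(y, x, 2) - x**2))  # 0: y'' = x^2
print(y.subs(x, 0), y.subs(x, 1))            # 0 0: the boundary conditions hold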

Properties of Green’s Functions

We have noted some properties of Green’s functions in the last section. In this section we will elaborate on some of these properties as a tool for quickly constructing Green’s functions for boundary value problems. Here is a list of the properties based upon our previous solution.

Properties of the Green’s Function

1. Differential Equation:

\dfrac{\partial}{\partial x}\left(p(x) \dfrac{\partial G(x, \xi)}{\partial x}\right)+q(x) G(x, \xi)=0, x \neq \xi

For x<\xi we are on the second branch and G(x, \xi) is proportional to y_{1}(x). Thus, since y_{1}(x) is a solution of the homogeneous equation, then so is G(x, \xi). For x>\xi we are on the first branch and G(x, \xi) is proportional to y_{2}(x). So, once again G(x, \xi) is a solution of the homogeneous problem.

2. Boundary Conditions:

For x=a we are on the second branch and G(x, \xi) is proportional to y_{1}(x). Thus, whatever condition y_{1}(x) satisfies, G(x, \xi) will satisfy. A similar statement can be made for x=b.

  3. Symmetry or Reciprocity: G(x, \xi)=G(\xi, x). We showed this in the last section.
  4. Continuity of \mathbf{G} at x=\xi: G\left(\xi^{+}, \xi\right)=G\left(\xi^{-}, \xi\right). Here we have defined

\begin{aligned} & G\left(\xi^{+}, \xi\right)=\lim _{x \downarrow \xi} G(x, \xi), \quad x>\xi, \\[4pt] & G\left(\xi^{-}, \xi\right)=\lim _{x \uparrow \xi} G(x, \xi), \quad x<\xi . \end{aligned} \nonumber

Setting x=\xi in both branches, we have

\dfrac{y_{1}(\xi) y_{2}(\xi)}{p W}=\dfrac{y_{1}(\xi) y_{2}(\xi)}{p W} \nonumber

  5. Jump Discontinuity of \dfrac{\partial G}{\partial x} at x=\xi :

\dfrac{\partial G\left(\xi^{+}, \xi\right)}{\partial x}-\dfrac{\partial G\left(\xi^{-}, \xi\right)}{\partial x}=\dfrac{1}{p(\xi)} \nonumber

This case is not as obvious. We first compute the derivatives by noting which branch is involved and then evaluate the derivatives and subtract them. Thus, we have

\begin{aligned} \dfrac{\partial G\left(\xi^{+}, \xi\right)}{\partial x}-\dfrac{\partial G\left(\xi^{-}, \xi\right)}{\partial x} &=\dfrac{1}{p W} y_{1}(\xi) y_{2}^{\prime}(\xi)-\dfrac{1}{p W} y_{1}^{\prime}(\xi) y_{2}(\xi) \\[4pt] &=\dfrac{y_{1}(\xi) y_{2}^{\prime}(\xi)-y_{1}^{\prime}(\xi) y_{2}(\xi)}{p(\xi)\left(y_{1}(\xi) y_{2}^{\prime}(\xi)-y_{1}^{\prime}(\xi) y_{2}(\xi)\right)} \\[4pt] &=\dfrac{1}{p(\xi)} \end{aligned} \nonumber
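
For instance, the Green’s function found in Example 8.4, where p(x)=1, exhibits exactly this unit jump. The following sympy lines (an added illustration) verify the continuity and jump conditions.

import sympy as sp

x, xi = sp.symbols('x xi')
G_plus = -xi*(1 - x)    # branch valid for x >= xi
G_minus = -x*(1 - xi)   # branch valid for x <= xi
print(sp.simplify(G_plus.subs(x, xi) - G_minus.subs(x, xi)))                # 0: continuity at x = xi
print(sp.simplify((sp.diff(G_plus, x) - sp.diff(G_minus, x)).subs(x, xi)))  # 1: the jump equals 1/p(xi)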

We now show how a knowledge of these properties allows one to quickly construct a Green’s function.


Example 8.5. Construct the Green’s function for the problem

\begin{gathered} y^{\prime \prime}+\omega^{2} y=f(x), \quad 0<x<1, \\[4pt] y(0)=0=y(1), \end{gathered} \nonumber

with \omega \neq 0.

I. Find solutions to the homogeneous equation.

A general solution to the homogeneous equation is given as

y_{h}(x)=c_{1} \sin \omega x+c_{2} \cos \omega x . \nonumber

Thus, for x \neq \xi

G(x, \xi)=c_{1}(\xi) \sin \omega x+c_{2}(\xi) \cos \omega x \nonumber

II. Boundary Conditions.

First, we have G(0, \xi)=0 for 0 \leq x \leq \xi. So,

0=G(0, \xi)=c_{1}(\xi) \sin 0+c_{2}(\xi) \cos 0=c_{2}(\xi) . \nonumber

So,

G(x, \xi)=c_{1}(\xi) \sin \omega x, \quad 0 \leq x \leq \xi \nonumber

Second, we have G(1, \xi)=0 for \xi \leq x \leq 1. So,

G(1, \xi)=c_{1}(\xi) \sin \omega+c_{2}(\xi) \cos \omega=0 \nonumber

A solution can be chosen with

c_{2}(\xi)=-c_{1}(\xi) \tan \omega . \nonumber

This gives

G(x, \xi)=c_{1}(\xi) \sin \omega x-c_{1}(\xi) \tan \omega \cos \omega x . \nonumber

This can be simplified by factoring out the c_{1}(\xi) and placing the remaining terms over a common denominator. The result is

\begin{aligned} G(x, \xi) &=\dfrac{c_{1}(\xi)}{\cos \omega}[\sin \omega x \cos \omega-\sin \omega \cos \omega x] \\[4pt] &=-\dfrac{c_{1}(\xi)}{\cos \omega} \sin \omega(1-x) \end{aligned} \nonumber

Since the coefficient is arbitrary at this point, we can write the result as

G(x, \xi)=d_{1}(\xi) \sin \omega(1-x), \quad \xi \leq x \leq 1 \nonumber

We note that we could have started with y_{2}(x)=\sin \omega(1-x) as one of our linearly independent solutions of the homogeneous problem in anticipation that y_{2}(x) satisfies the second boundary condition.

III. Symmetry or Reciprocity

We now impose that G(x, \xi)=G(\xi, x). To this point we have that

G(x, \xi)=\left\{\begin{array}{cl} c_{1}(\xi) \sin \omega x, & 0 \leq x \leq \xi \\[4pt] d_{1}(\xi) \sin \omega(1-x), & \xi \leq x \leq 1 \end{array}\right. \nonumber

We can make the branches symmetric by picking the right forms for c_{1}(\xi) and d_{1}(\xi). We choose c_{1}(\xi)=C \sin \omega(1-\xi) and d_{1}(\xi)=C \sin \omega \xi. Then,

G(x, \xi)=\left\{\begin{array}{l} C \sin \omega(1-\xi) \sin \omega x, 0 \leq x \leq \xi \\[4pt] C \sin \omega(1-x) \sin \omega \xi, \xi \leq x \leq 1 \end{array} .\right. \nonumber

Now the Green’s function is symmetric and we still have to determine the constant C. We note that we could have gotten to this point using the Method of Variation of Parameters result where C=\dfrac{1}{p W}.

IV. Continuity of G(x, \xi)

We note that we already have continuity by virtue of the symmetry imposed in the last step.

V. Jump Discontinuity in \dfrac{\partial}{\partial x} G(x, \xi).

We still need to determine C. We can do this using the jump discontinuity of the derivative:

\dfrac{\partial G\left(\xi^{+}, \xi\right)}{\partial x}-\dfrac{\partial G\left(\xi^{-}, \xi\right)}{\partial x}=\dfrac{1}{p(\xi)} \nonumber

For our problem p(x)=1. So, inserting our Green’s function, we have

\begin{aligned} 1 &=\dfrac{\partial G\left(\xi^{+}, \xi\right)}{\partial x}-\dfrac{\partial G\left(\xi^{-}, \xi\right)}{\partial x} \\[4pt] &=\dfrac{\partial}{\partial x}[C \sin \omega(1-x) \sin \omega \xi]_{x=\xi}-\dfrac{\partial}{\partial x}[C \sin \omega(1-\xi) \sin \omega x]_{x=\xi} \\[4pt] &=-\omega C \cos \omega(1-\xi) \sin \omega \xi-\omega C \sin \omega(1-\xi) \cos \omega \xi \\[4pt] &=-\omega C \sin \omega(\xi+1-\xi) \\[4pt] &=-\omega C \sin \omega . \end{aligned} \nonumber

Therefore,

C=-\dfrac{1}{\omega \sin \omega} . \nonumber

Finally, we have our Green’s function:

G(x, \xi)=\left\{\begin{array}{l} -\dfrac{\sin \omega(1-\xi) \sin \omega x}{\omega \sin \omega}, 0 \leq x \leq \xi \\[4pt] -\dfrac{\sin \omega(1-x) \sin \omega \xi}{\omega \sin \omega}, \xi \leq x \leq 1 \end{array} .\right. \nonumber

It is instructive to compare this result to the Variation of Parameters result. We have the functions y_{1}(x)=\sin \omega x and y_{2}(x)=\sin \omega(1-x) as the solutions of the homogeneous equation satisfying y_{1}(0)=0 and y_{2}(1)=0. We need to compute p W :

\begin{aligned} p(x) W(x) &=y_{1}(x) y_{2}^{\prime}(x)-y_{1}^{\prime}(x) y_{2}(x) \\[4pt] &=-\omega \sin \omega x \cos \omega(1-x)-\omega \cos \omega x \sin \omega(1-x) \\[4pt] &=-\omega \sin \omega \end{aligned} \nonumber

Inserting this result into the Variation of Parameters result for the Green’s function leads to the same Green’s function as above.
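
One can also have sympy confirm the defining properties of this Green’s function for symbolic \omega (a verification sketch added here; it assumes \sin \omega \neq 0, as in the construction above).

import sympy as sp

x, xi, w = sp.symbols('x xi omega', positive=True)
G_minus = -sp.sin(w*(1 - xi))*sp.sin(w*x)/(w*sp.sin(w))   # branch for x <= xi
G_plus = -sp.sin(w*(1 - x))*sp.sin(w*xi)/(w*sp.sin(w))    # branch for xi <= x

print(sp.simplify(sp.diff(G_minus, x, 2) + w**2*G_minus))  # 0: each branch solves y'' + w^2 y = 0
print(G_minus.subs(x, 0), G_plus.subs(x, 1))               # 0 0: the boundary conditions
print(sp.simplify((G_plus - G_minus).subs(x, xi)))         # 0: continuity at x = xi
jump = (sp.diff(G_plus, x) - sp.diff(G_minus, x)).subs(x, xi)
print(float(jump.subs({w: 1.3, xi: 0.4})))                 # approximately 1.0, the jump 1/p with p = 1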

The Dirac Delta Function

We will develop a more general theory of Green’s functions for ordinary differential equations which encompasses some of the listed properties. The Green’s function satisfies a homogeneous differential equation for x \neq \xi,

\dfrac{\partial}{\partial x}\left(p(x) \dfrac{\partial G(x, \xi)}{\partial x}\right)+q(x) G(x, \xi)=0, \quad x \neq \xi \nonumber

When x=\xi, we saw that the derivative has a jump in its value. This is similar to the step, or Heaviside, function,

H(x)=\left\{\begin{array}{l} 1, x>0 \\[4pt] 0, x<0 \end{array}\right. \nonumber

In the case of the step function, the derivative is zero everywhere except at the jump. At the jump, there is an infinite slope, though technically, we have learned that there is no derivative at this point. We will try to remedy this by introducing the Dirac delta function,

\delta(x)=\dfrac{d}{d x} H(x) . \nonumber

We will then show that the Green’s function satisfies the differential equation

\dfrac{\partial}{\partial x}\left(p(x) \dfrac{\partial G(x, \xi)}{\partial x}\right)+q(x) G(x, \xi)=\delta(x-\xi) \nonumber

The Dirac delta function, \delta(x), is one example of what is known as a generalized function, or a distribution. Dirac had introduced this function in the 1930’s in his study of quantum mechanics as a useful tool. It was later studied in a general theory of distributions and found to be more than a simple tool used by physicists. The Dirac delta function, as any distribution, only makes sense under an integral.

Before defining the Dirac delta function and introducing some of its properties, we will look at some representations that lead to the definition. We will consider the limits of two sequences of functions.

First we define the sequence of functions

f_{n}(x)=\left\{\begin{array}{l} 0,|x|>\dfrac{1}{n} \\[4pt] \dfrac{n}{2},|x|<\dfrac{1}{n} \end{array} .\right. \nonumber

This is a sequence of functions as shown in Figure 8.1. As n \rightarrow \infty, we find the limit is zero for x \neq 0 and is infinite for x=0. However, the area under each member of the sequence is one, since each box has height \dfrac{n}{2} and width \dfrac{2}{n}. Thus, the limiting function is zero at most points but has area one. (At this point the reader who is new to this should be doing some head scratching!)

Figure 8.1. A plot of the functions f_{n}(x) for n=2,4,8.

The limit is not really a function. It is a generalized function. It is called the Dirac delta function, which is defined by

  1. \delta(x)=0 for x \neq 0
  2. \int_{-\infty}^{\infty} \delta(x) d x=1

Another example is the sequence defined by

D_{n}(x)=\dfrac{2 \sin n x}{x} \nonumber

We can graph this function. We first rewrite this function as

D_{n}(x)=2 n \dfrac{\sin n x}{n x} . \nonumber

Now it is easy to see that as x \rightarrow 0, D_{n}(x) \rightarrow 2 n. For large x, the function tends to zero. A plot of this function is in Figure 8.2. For large n the peak grows and the values of D_{n}(x) for x \neq 0 tend to zero, as shown in Figure 8.3.

We note that in the limit n \rightarrow \infty, D_{n}(x)=0 for x \neq 0 and it is infinite at x=0. However, using complex analysis one can show that the area is

\int_{-\infty}^{\infty} D_{n}(x) d x=2 \pi \nonumber

Thus, the area is constant for each n.

Figure 8.2. A plot of the function D_{n}(x) for n=4.
Figure 8.3. A plot of the function D_{n}(x) for n=40.

There are two main properties that define a Dirac delta function. First one has that the area under the delta function is one,

\int_{-\infty}^{\infty} \delta(x) d x=1 \nonumber

Integration over more general intervals gives

\int_{a}^{b} \delta(x) d x=1, \quad 0 \in[a, b] \nonumber

and

\int_{a}^{b} \delta(x) d x=0, \quad 0 \notin[a, b] . \nonumber

Another common property is what is sometimes called the sifting property. Namely, integrating the product of a function and the delta function "sifts" out a specific value of the function. It is given by

\int_{-\infty}^{\infty} \delta(x-a) f(x) d x=f(a) \nonumber

This can be seen by noting that the delta function is zero everywhere except at x=a. Therefore, the integrand is zero everywhere and the only contribution from f(x) will be from x=a. So, we can replace f(x) with f(a) under the integral. Since f(a) is a constant, we have that

\int_{-\infty}^{\infty} \delta(x-a) f(x) d x=\int_{-\infty}^{\infty} \delta(x-a) f(a) d x=f(a) \int_{-\infty}^{\infty} \delta(x-a) d x=f(a) \nonumber

Another property results from using a scaled argument, ax. In this case we show that

\delta(a x)=|a|^{-1} \delta(x) \nonumber

As usual, this only has meaning under an integral sign. So, we place \delta(a x) inside an integral and make a substitution y=a x :

\begin{aligned} \int_{-\infty}^{\infty} \delta(a x) d x &=\lim _{L \rightarrow \infty} \int_{-L}^{L} \delta(a x) d x \\[4pt] &=\lim _{L \rightarrow \infty} \dfrac{1}{a} \int_{-a L}^{a L} \delta(y) d y \end{aligned} \nonumber

If a>0 then

\int_{-\infty}^{\infty} \delta(a x) d x=\dfrac{1}{a} \int_{-\infty}^{\infty} \delta(y) d y \nonumber

However, if a<0 then

\int_{-\infty}^{\infty} \delta(a x) d x=\dfrac{1}{a} \int_{\infty}^{-\infty} \delta(y) d y=-\dfrac{1}{a} \int_{-\infty}^{\infty} \delta(y) d y \nonumber

The overall difference in a multiplicative minus sign can be absorbed into one expression by changing the factor 1 / a to 1 /|a|. Thus,

\int_{-\infty}^{\infty} \delta(a x) d x=\dfrac{1}{|a|} \int_{-\infty}^{\infty} \delta(y) d y . \nonumber

Example 8.6. Evaluate \int_{-\infty}^{\infty}(5 x+1) \delta(4(x-2)) d x. This is a straightforward integration:

\int_{-\infty}^{\infty}(5 x+1) \delta(4(x-2)) d x=\dfrac{1}{4} \int_{-\infty}^{\infty}(5 x+1) \delta(x-2) d x=\dfrac{11}{4} \nonumber

A more general scaling of the argument takes the form \delta(f(x)). The integral of \delta(f(x)) can be evaluated depending upon the number of zeros of f(x). If there is only one zero, f\left(x_{1}\right)=0, then one has that

\int_{-\infty}^{\infty} \delta(f(x)) d x=\int_{-\infty}^{\infty} \dfrac{1}{\left|f^{\prime}\left(x_{1}\right)\right|} \delta\left(x-x_{1}\right) d x \nonumber

This can be proven using the substitution y=f(x) and is left as an exercise for the reader. This result is often written as

\delta(f(x))=\dfrac{1}{\left|f^{\prime}\left(x_{1}\right)\right|} \delta\left(x-x_{1}\right) . \nonumber

Example 8.7. Evaluate \int_{-\infty}^{\infty} \delta(3 x-2) x^{2} d x.

This is not a simple \delta(x-a). So, we need to find the zeros of f(x)=3 x-2. There is only one, x=\dfrac{2}{3}. Also, \left|f^{\prime}(x)\right|=3. Therefore, we have

\int_{-\infty}^{\infty} \delta(3 x-2) x^{2} d x=\int_{-\infty}^{\infty} \dfrac{1}{3} \delta\left(x-\dfrac{2}{3}\right) x^{2} d x=\dfrac{1}{3}\left(\dfrac{2}{3}\right)^{2}=\dfrac{4}{27} . \nonumber

Note that this integral can be evaluated the long way by using the substitution y=3 x-2. Then, d y=3 d x and x=(y+2) / 3. This gives

\int_{-\infty}^{\infty} \delta(3 x-2) x^{2} d x=\dfrac{1}{3} \int_{-\infty}^{\infty} \delta(y)\left(\dfrac{y+2}{3}\right)^{2} d y=\dfrac{1}{3}\left(\dfrac{4}{9}\right)=\dfrac{4}{27} \nonumber
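
Both of these examples can be reproduced with sympy’s DiracDelta, which handles delta functions with linear arguments directly (a check added here, not part of the text).

import sympy as sp

x = sp.symbols('x')
I1 = sp.integrate((5*x + 1)*sp.DiracDelta(4*(x - 2)), (x, -sp.oo, sp.oo))
I2 = sp.integrate(x**2*sp.DiracDelta(3*x - 2), (x, -sp.oo, sp.oo))
print(I1, I2)   # 11/4 and 4/27, as computed above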

More generally, one can show that when f\left(x_{j}\right)=0 and f^{\prime}\left(x_{j}\right) \neq 0 for x_{j}, j=1,2, \ldots, n, (i.e., when one has n simple zeros), then

\delta(f(x))=\sum_{j=1}^{n} \dfrac{1}{\left|f^{\prime}\left(x_{j}\right)\right|} \delta\left(x-x_{j}\right) . \nonumber

Example 8.8. Evaluate \int_{0}^{2 \pi} \cos x \delta\left(x^{2}-\pi^{2}\right) d x

In this case the argument of the delta function has two simple roots. Namely, f(x)=x^{2}-\pi^{2}=0 when x=\pm \pi. Furthermore, f^{\prime}(x)=2 x. Therefore, \left|f^{\prime}(\pm \pi)\right|=2 \pi. This gives

\delta\left(x^{2}-\pi^{2}\right)=\dfrac{1}{2 \pi}[\delta(x-\pi)+\delta(x+\pi)] . \nonumber

Inserting this expression into the integral and noting that x=-\pi is not in the integration interval, we have

\begin{aligned} \int_{0}^{2 \pi} \cos x \delta\left(x^{2}-\pi^{2}\right) d x &=\dfrac{1}{2 \pi} \int_{0}^{2 \pi} \cos x[\delta(x-\pi)+\delta(x+\pi)] d x \\[4pt] &=\dfrac{1}{2 \pi} \cos \pi=-\dfrac{1}{2 \pi} \end{aligned} \nonumber
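
The nascent-delta idea from earlier in this section also provides a purely numerical check of this result (an added scipy sketch; the Gaussian width eps below is an arbitrary small parameter, and the root x=\pi is passed to the integrator as a hint).

import numpy as np
from scipy.integrate import quad

def delta_eps(u, eps=0.1):
    # a narrow Gaussian used as an approximation to the delta function
    return np.exp(-(u/eps)**2/2)/(eps*np.sqrt(2*np.pi))

val, _ = quad(lambda x: np.cos(x)*delta_eps(x**2 - np.pi**2), 0, 2*np.pi, points=[np.pi])
print(val, -1/(2*np.pi))   # the two numbers should be close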

Finally, we previously noted there is a relationship between the Heaviside, or step, function and the Dirac delta function. We defined the Heaviside function as

H(x)=\left\{\begin{array}{l} 0, x<0 \\[4pt] 1, x>0 \end{array}\right. \nonumber

Then, it is easy to see that H^{\prime}(x)=\delta(x).

Green’s Function Differential Equation

As noted, the Green’s function satisfies the differential equation

\dfrac{\partial}{\partial x}\left(p(x) \dfrac{\partial G(x, \xi)}{\partial x}\right)+q(x) G(x, \xi)=\delta(x-\xi) \nonumber

and satisfies homogeneous conditions. We have used the Green’s function to solve the nonhomogeneous equation

\dfrac{d}{d x}\left(p(x) \dfrac{d y(x)}{d x}\right)+q(x) y(x)=f(x) \nonumber

These equations can be written in the more compact forms

\begin{gathered} \mathcal{L}[y]=f(x) \\[4pt] \mathcal{L}[G]=\delta(x-\xi) \end{gathered} \nonumber

Multiplying the first equation by G(x, \xi), the second equation by y(x), and then subtracting, we have

G \mathcal{L}[y]-y \mathcal{L}[G]=f(x) G(x, \xi)-\delta(x-\xi) y(x) . \nonumber

Now, integrate both sides from x=a to x=b. The right-hand side becomes

\int_{a}^{b}[f(x) G(x, \xi)-\delta(x-\xi) y(x)] d x=\int_{a}^{b} f(x) G(x, \xi) d x-y(\xi) \nonumber

and, using Green’s Identity, the left-hand side becomes

\int_{a}^{b}(G \mathcal{L}[y]-y \mathcal{L}[G]) d x=\left[p(x)\left(G(x, \xi) y^{\prime}(x)-y(x) \dfrac{\partial G}{\partial x}(x, \xi)\right)\right]_{x=a}^{x=b} \nonumber

Combining these results and rearranging, we obtain

y(\xi)=\int_{a}^{b} f(x) G(x, \xi) d x+\left[p(x)\left(y(x) \dfrac{\partial G}{\partial x}(x, \xi)-G(x, \xi) y^{\prime}(x)\right)\right]_{x=a}^{x=b} \nonumber

Next, one uses the boundary conditions in the problem in order to determine which conditions the Green’s function needs to satisfy. For example, if we have the boundary conditions y(a)=0 and y(b)=0, then the boundary terms yield

\begin{aligned} y(\xi)=& \int_{a}^{b} f(x) G(x, \xi) d x+\left[p(b)\left(y(b) \dfrac{\partial G}{\partial x}(b, \xi)-G(b, \xi) y^{\prime}(b)\right)\right] \\[4pt] &-\left[p(a)\left(y(a) \dfrac{\partial G}{\partial x}(a, \xi)-G(a, \xi) y^{\prime}(a)\right)\right] \\[4pt] =& \int_{a}^{b} f(x) G(x, \xi) d x-p(b) G(b, \xi) y^{\prime}(b)+p(a) G(a, \xi) y^{\prime}(a) . \end{aligned} \nonumber

These boundary terms will vanish only if G(x, \xi) also satisfies the homogeneous boundary conditions. This then leaves us with the solution

y(\xi)=\int_{a}^{b} f(x) G(x, \xi) d x . \nonumber

We should rewrite this as a function of x. So, we replace \xi with x and x with \xi. This gives

y(x)=\int_{a}^{b} f(\xi) G(\xi, x) d \xi . \nonumber

However, this is not yet in the desirable form. The arguments of the Green’s function are reversed. But, G(x, \xi) is symmetric in its arguments. So, we can simply switch the arguments getting the desired result.

We can now see that the theory works for other boundary conditions. If we had y^{\prime}(a)=0, then the y(a) \dfrac{\partial G}{\partial x}(a, \xi) term in the boundary terms could be made to vanish if we set \dfrac{\partial G}{\partial x}(a, \xi)=0. So, this confirms that other boundary value problems can be posed besides the one elaborated upon in the chapter so far.

We can even adapt this theory to nonhomogeneous boundary conditions. We first rewrite Equation (8.62) as

y(x)=\int_{a}^{b} G(x, \xi) f(\xi) d \xi+\left[p(\xi)\left(y(\xi) \dfrac{\partial G}{\partial \xi}(x, \xi)-G(x, \xi) y^{\prime}(\xi)\right)\right]_{\xi=a}^{\xi=b} \nonumber

Let’s consider the boundary conditions y(a)=\alpha and y^{\prime}(b)=\beta. We also assume that G(x, \xi) satisfies homogeneous boundary conditions,

G(a, \xi)=0, \quad \dfrac{\partial G}{\partial \xi}(b, \xi)=0, \nonumber

in both x and \xi since the Green’s function is symmetric in its variables. Then, we need only focus on the boundary terms to examine the effect on the solution. We have

\begin{aligned} {\left[p(\xi)\left(y(\xi) \dfrac{\partial G}{\partial \xi}(x, \xi)-G(x, \xi) y^{\prime}(\xi)\right)\right]_{\xi=a}^{\xi=b} } &=\left[p(b)\left(y(b) \dfrac{\partial G}{\partial \xi}(x, b)-G(x, b) y^{\prime}(b)\right)\right] \\[4pt] &-\left[p(a)\left(y(a) \dfrac{\partial G}{\partial \xi}(x, a)-G(x, a) y^{\prime}(a)\right)\right] \\[4pt] &=-\beta p(b) G(x, b)-\alpha p(a) \dfrac{\partial G}{\partial \xi}(x, a) . \end{aligned} \nonumber

Therefore, we have the solution

y(x)=\int_{a}^{b} G(x, \xi) f(\xi) d \xi-\beta p(b) G(x, b)-\alpha p(a) \dfrac{\partial G}{\partial \xi}(x, a) . \nonumber

This solution satisfies the nonhomogeneous boundary conditions. Let’s see how it works.

Example 8.9. Modify Example 8.4 to solve the boundary value problem y^{\prime \prime}=x^{2}, \quad y(0)=1, y(1)=2 using the boundary value Green’s function that we found:

G(x, \xi)=\left\{\begin{array}{l} -\xi(1-x), 0 \leq \xi \leq x \\[4pt] -x(1-\xi), x \leq \xi \leq 1 \end{array}\right. \nonumber

We insert the Green’s function into the solution and use the given conditions to obtain

\begin{aligned} y(x) &=\int_{0}^{1} G(x, \xi) \xi^{2} d \xi+\left[y(\xi) \dfrac{\partial G}{\partial \xi}(x, \xi)-G(x, \xi) y^{\prime}(\xi)\right]_{\xi=0}^{\xi=1} \\[4pt] &=\int_{0}^{x}(x-1) \xi^{3} d \xi+\int_{x}^{1} x(\xi-1) \xi^{2} d \xi+y(1) \dfrac{\partial G}{\partial \xi}(x, 1)-y(0) \dfrac{\partial G}{\partial \xi}(x, 0) \\[4pt] &=\dfrac{(x-1) x^{4}}{4}+\dfrac{x\left(1-x^{4}\right)}{4}-\dfrac{x\left(1-x^{3}\right)}{3}+2 x-(x-1) \\[4pt] &=\dfrac{x^{4}}{12}+\dfrac{11}{12} x+1 \end{aligned} \nonumber

Of course, this problem can also be solved by direct integration. The general solution is

y(x)=\dfrac{x^{4}}{12}+c_{1} x+c_{2} . \nonumber

Inserting this solution into each boundary condition yields the same result.
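
Since the boundary terms are easy to get wrong, it is worth checking this result with sympy (a verification sketch, not part of the original text).

import sympy as sp

x = sp.symbols('x')
y = x**4/12 + sp.Rational(11, 12)*x + 1
print(sp.simplify(sp.diff(y, x, 2) - x**2))   # 0: y'' = x^2
print(y.subs(x, 0), y.subs(x, 1))             # 1 and 2: the required boundary values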

We have seen how the introduction of the Dirac delta function in the differential equation satisfied by the Green’s function, Equation (8.59), can lead to the solution of boundary value problems. The Dirac delta function also aids in our interpretation of the Green’s function. We note that the Green’s function is a solution of an equation in which the nonhomogeneous function is \delta(x-\xi). Note that if we multiply the delta function by f(\xi) and integrate we obtain

\int_{-\infty}^{\infty} \delta(x-\xi) f(\xi) d \xi=f(x) \nonumber

We can view the delta function as a unit impulse at x=\xi which can be used to build f(x) as a sum of impulses of different strengths, f(\xi). Thus, the Green’s function is the response to the impulse as governed by the differential equation and given boundary conditions.

In particular, the delta function forced equation can be used to derive the jump condition. We begin with the equation in the form

\dfrac{\partial}{\partial x}\left(p(x) \dfrac{\partial G(x, \xi)}{\partial x}\right)+q(x) G(x, \xi)=\delta(x-\xi) \nonumber

Now, integrate both sides from \xi-\epsilon to \xi+\epsilon and take the limit as \epsilon \rightarrow 0. Then,

\begin{aligned} \lim _{\epsilon \rightarrow 0} \int_{\xi-\epsilon}^{\xi+\epsilon}\left[\dfrac{\partial}{\partial x}\left(p(x) \dfrac{\partial G(x, \xi)}{\partial x}\right)+q(x) G(x, \xi)\right] d x &=\lim _{\epsilon \rightarrow 0} \int_{\xi-\epsilon}^{\xi+\epsilon} \delta(x-\xi) d x \\[4pt] &=1 \end{aligned} \nonumber

Since the q(x) term is continuous, the limit of that term vanishes. Using the Fundamental Theorem of Calculus, we then have

\lim _{\epsilon \rightarrow 0}\left[p(x) \dfrac{\partial G(x, \xi)}{\partial x}\right]_{\xi-\epsilon}^{\xi+\epsilon}=1 . \nonumber

This is the jump condition that we have been using!

Series Representations of Green’s Functions

There are times when it might not be so simple to find the Green’s function in the simple closed form that we have seen so far. However, there is a method for determining the Green’s functions of Sturm-Liouville boundary value problems in the form of an eigenfunction expansion. We will finish our discussion of Green’s functions for ordinary differential equations by showing how one obtains such series representations. (Note that we are really just repeating the steps toward developing the eigenfunction expansions which we had seen in Chapter 6.)

We will make use of the complete set of eigenfunctions of the differential operator, \mathcal{L}, satisfying the homogeneous boundary conditions:

\mathcal{L}\left[\phi_{n}\right]=-\lambda_{n} \sigma \phi_{n}, \quad n=1,2, \ldots \nonumber

We want to find the particular solution y satisfying \mathcal{L}[y]=f and homogeneous boundary conditions. We assume that

y(x)=\sum_{n=1}^{\infty} a_{n} \phi_{n}(x) . \nonumber

Inserting this into the differential equation, we obtain

\mathcal{L}[y]=\sum_{n=1}^{\infty} a_{n} \mathcal{L}\left[\phi_{n}\right]=-\sum_{n=1}^{\infty} \lambda_{n} a_{n} \sigma \phi_{n}=f \nonumber

This has resulted in the generalized Fourier expansion

f(x)=\sum_{n=1}^{\infty} c_{n} \sigma \phi_{n}(x) \nonumber

with coefficients

c_{n}=-\lambda_{n} a_{n} \nonumber

We have seen how to compute these coefficients earlier in the text. We multiply both sides by \phi_{k}(x) and integrate. Using the orthogonality of the eigenfunctions,

\int_{a}^{b} \phi_{n}(x) \phi_{k}(x) \sigma(x) d x=N_{k} \delta_{n k} \nonumber

one obtains the expansion coefficients (if \lambda_{k} \neq 0 )

a_{k}=-\dfrac{\left(f, \phi_{k}\right)}{N_{k} \lambda_{k}}, \nonumber

where \left(f, \phi_{k}\right) \equiv \int_{a}^{b} f(x) \phi_{k}(x) d x.

As before, we can rearrange the solution to obtain the Green’s function. Namely, we have

y(x)=\sum_{n=1}^{\infty} \dfrac{\left(f, \phi_{n}\right)}{-N_{n} \lambda_{n}} \phi_{n}(x)=\int_{a}^{b} \underbrace{\sum_{n=1}^{\infty} \dfrac{\phi_{n}(x) \phi_{n}(\xi)}{-N_{n} \lambda_{n}}}_{G(x, \xi)} f(\xi) d \xi \nonumber

Therefore, we have found the Green’s function as an expansion in the eigenfunctions:

G(x, \xi)=\sum_{n=1}^{\infty} \dfrac{\phi_{n}(x) \phi_{n}(\xi)}{-\lambda_{n} N_{n}} . \nonumber

Example 8.10. Eigenfunction Expansion Example

We will conclude this discussion with an example. Consider the boundary value problem

y^{\prime \prime}+4 y=x^{2}, \quad x \in(0,1), \quad y(0)=y(1)=0 . \nonumber

The Green’s function can be constructed fairly quickly once the eigenvalue problem is solved. We will solve this problem three different ways in order to summarize the methods we have used in the text.

The eigenvalue problem is

\phi^{\prime \prime}(x)+4 \phi(x)=-\lambda \phi(x) \nonumber

where \phi(0)=0 and \phi(1)=0. The general solution is obtained by rewriting the equation as

\phi^{\prime \prime}(x)+k^{2} \phi(x)=0 \nonumber

where

k^{2}=4+\lambda \nonumber

Solutions satisfying the boundary condition at x=0 are of the form

\phi(x)=A \sin k x . \nonumber

Forcing \phi(1)=0 gives

0=A \sin k \Rightarrow k=n \pi, \quad n=1,2,3, \ldots \nonumber

So, the eigenvalues are

\lambda_{n}=n^{2} \pi^{2}-4, \quad n=1,2, \ldots \nonumber

and the eigenfunctions are

\phi_{n}=\sin n \pi x, \quad n=1,2, \ldots \nonumber

We need the normalization constant, N_{n}. We have that

N_{n}=\left\|\phi_{n}\right\|^{2}=\int_{0}^{1} \sin ^{2} n \pi x \, d x=\dfrac{1}{2} . \nonumber

We can now construct the Green’s function for this problem using Equation (8.72)

G(x, \xi)=2 \sum_{n=1}^{\infty} \dfrac{\sin n \pi x \sin n \pi \xi}{\left(4-n^{2} \pi^{2}\right)} . \nonumber

We can use this Green’s function to determine the solution of the boundary value problem. Thus, we have

\begin{aligned} y(x) &=\int_{0}^{1} G(x, \xi) f(\xi) d \xi \\[4pt] &=\int_{0}^{1}\left(2 \sum_{n=1}^{\infty} \dfrac{\sin n \pi x \sin n \pi \xi}{\left(4-n^{2} \pi^{2}\right)}\right) \xi^{2} d \xi \\[4pt] &=2 \sum_{n=1}^{\infty} \dfrac{\sin n \pi x}{\left(4-n^{2} \pi^{2}\right)} \int_{0}^{1} \xi^{2} \sin n \pi \xi d \xi \\[4pt] &=2 \sum_{n=1}^{\infty} \dfrac{\sin n \pi x}{\left(4-n^{2} \pi^{2}\right)}\left[\dfrac{\left(2-n^{2} \pi^{2}\right)(-1)^{n}-2}{n^{3} \pi^{3}}\right] \end{aligned} \nonumber

We can compare this solution to the one we would obtain if we did not employ Green’s functions directly. The eigenfunction expansion method for solving boundary value problems, which we saw earlier, proceeds as follows. We assume that our solution is in the form

y(x)=\sum_{n=1}^{\infty} c_{n} \phi_{n}(x) . \nonumber

Inserting this into the differential equation \mathcal{L}[y]=x^{2} gives

\begin{aligned} x^{2} &=\mathcal{L}\left[\sum_{n=1}^{\infty} c_{n} \sin n \pi x\right] \\[4pt] &=\sum_{n=1}^{\infty} c_{n}\left[\dfrac{d^{2}}{d x^{2}} \sin n \pi x+4 \sin n \pi x\right] \\[4pt] &=\sum_{n=1}^{\infty} c_{n}\left[4-n^{2} \pi^{2}\right] \sin n \pi x \end{aligned} \nonumber

We need the Fourier sine series expansion of x^{2} on [0,1] in order to determine the c_{n} ’s. Thus, we need

\begin{aligned} b_{n} &=\dfrac{2}{1} \int_{0}^{1} x^{2} \sin n \pi x \, d x \\[4pt] &=2\left[\dfrac{\left(2-n^{2} \pi^{2}\right)(-1)^{n}-2}{n^{3} \pi^{3}}\right], \quad n=1,2, \ldots \end{aligned} \nonumber

Thus,

x^{2}=2 \sum_{n=1}^{\infty}\left[\dfrac{\left(2-n^{2} \pi^{2}\right)(-1)^{n}-2}{n^{3} \pi^{3}}\right] \sin n \pi x . \nonumber

Inserting this in Equation (8.75), we find

2 \sum_{n=1}^{\infty}\left[\dfrac{\left(2-n^{2} \pi^{2}\right)(-1)^{n}-2}{n^{3} \pi^{3}}\right] \sin n \pi x=\sum_{n=1}^{\infty} c_{n}\left[4-n^{2} \pi^{2}\right] \sin n \pi x . \nonumber

Due to the linear independence of the eigenfunctions, we can solve for the unknown coefficients to obtain

c_{n}=2 \dfrac{\left(2-n^{2} \pi^{2}\right)(-1)^{n}-2}{\left(4-n^{2} \pi^{2}\right) n^{3} \pi^{3}} \nonumber

Therefore, the solution using the eigenfunction expansion method is

\begin{aligned} y(x) &=\sum_{n=1}^{\infty} c_{n} \phi_{n}(x) \\[4pt] &=2 \sum_{n=1}^{\infty} \dfrac{\sin n \pi x}{\left(4-n^{2} \pi^{2}\right)}\left[\dfrac{\left(2-n^{2} \pi^{2}\right)(-1)^{n}-2}{n^{3} \pi^{3}}\right] \end{aligned} \nonumber

We note that this is the same solution as we had obtained using the Green’s function obtained in series form.

One remaining question is the following: Is there a closed form for the Green’s function and the solution to this problem? The answer is yes! We note that the differential operator is a special case of the example done in Section 8.2.2. Namely, we pick \omega=2. The Green’s function was already found in that section. For this special case, we have

G(x, \xi)=\left\{\begin{array}{l} -\dfrac{\sin 2(1-\xi) \sin 2 x}{2 \sin 2}, 0 \leq x \leq \xi \\[4pt] -\dfrac{\sin 2(1-x) \sin 2 \xi}{2 \sin 2}, \xi \leq x \leq 1 \end{array}\right. \nonumber

What about the solution to the boundary value problem? This solution is given by

\begin{aligned} y(x) &=\int_{0}^{1} G(x, \xi) f(\xi) d \xi \\[4pt] &=-\int_{0}^{x} \dfrac{\sin 2(1-x) \sin 2 \xi}{2 \sin 2} \xi^{2} d \xi+\int_{x}^{1} \dfrac{\sin 2(\xi-1) \sin 2 x}{2 \sin 2} \xi^{2} d \xi \\[4pt] &=-\dfrac{1}{4 \sin 2}\left[-x^{2} \sin 2-\sin 2 \cos ^{2} x+\sin 2+\cos 2 \sin x \cos x+\sin x \cos x\right] \\[4pt] &=-\dfrac{1}{4 \sin 2}\left[-x^{2} \sin 2+\left(1-\cos ^{2} x\right) \sin 2+\sin x \cos x(1+\cos 2)\right] \\[4pt] &=-\dfrac{1}{4 \sin 2}\left[-x^{2} \sin 2+2 \sin ^{2} x \sin 1 \cos 1+2 \sin x \cos x \cos ^{2} 1\right] \\[4pt] &=-\dfrac{1}{8 \sin 1 \cos 1}\left[-x^{2} \sin 2+2 \sin x \cos 1(\sin x \sin 1+\cos x \cos 1)\right] \\[4pt] &=\dfrac{x^{2}}{4}-\dfrac{\sin x \cos (1-x)}{4 \sin 1} . \end{aligned} \nonumber

In Figure 8.4 we show a plot of this solution along with the first five terms of the series solution. The series solution converges quickly.

Figure 8.4. Plots of the exact solution to Example 8.10 with the first five terms of the series solution.
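
A comparison like the one in Figure 8.4 can be reproduced with a few lines of numpy (an illustrative sketch added here; the five-term truncation mirrors the figure, and the printed maximum difference measures how close the truncated series is to the exact solution).

import numpy as np

xs = np.linspace(0, 1, 201)
exact = xs**2/4 - np.sin(xs)*np.cos(1 - xs)/(4*np.sin(1))

series = np.zeros_like(xs)
for n in range(1, 6):     # first five terms of the series solution
    bn = 2*((2 - n**2*np.pi**2)*(-1)**n - 2)/(n**3*np.pi**3)
    series += bn*np.sin(n*np.pi*xs)/(4 - n**2*np.pi**2)

print(np.max(np.abs(series - exact)))   # small, confirming the rapid convergence of the series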

As one last check, we solve the boundary value problem directly, as we had done in Chapter 4. Again, the problem is

y^{\prime \prime}+4 y=x^{2}, \quad x \in(0,1), \quad y(0)=y(1)=0 . \nonumber

The problem has the general solution

y(x)=c_{1} \cos 2 x+c_{2} \sin 2 x+y_{p}(x), \nonumber

where y_{p} is a particular solution of the nonhomogeneous differential equation. Using the Method of Undetermined Coefficients, we assume a solution of the form

y_{p}(x)=A x^{2}+B x+C . \nonumber

Inserting this in the nonhomogeneous equation, we have

2 A+4\left(A x^{2}+B x+C\right)=x^{2}, \nonumber

Thus, 4 A=1, B=0, and 2 A+4 C=0. The solution of this system is

A=\dfrac{1}{4}, \quad B=0, \quad C=-\dfrac{1}{8} \nonumber

So, the general solution of the nonhomogeneous differential equation is

y(x)=c_{1} \cos 2 x+c_{2} \sin 2 x+\dfrac{x^{2}}{4}-\dfrac{1}{8} . \nonumber

We now determine the arbitrary constants using the boundary conditions. We have

\begin{aligned} 0 &=y(0) \\[4pt] &=c_{1}-\dfrac{1}{8} \\[4pt] 0 &=y(1) \\[4pt] &=c_{1} \cos 2+c_{2} \sin 2+\dfrac{1}{8} \end{aligned} \nonumber

Thus, c_{1}=\dfrac{1}{8} and

c_{2}=-\dfrac{\dfrac{1}{8}+\dfrac{1}{8} \cos 2}{\sin 2} \nonumber

Inserting these constants into the solution, we find the same solution as before:

\begin{aligned} y(x) &=\dfrac{1}{8} \cos 2 x-\left[\dfrac{\dfrac{1}{8}+\dfrac{1}{8} \cos 2}{\sin 2}\right] \sin 2 x+\dfrac{x^{2}}{4}-\dfrac{1}{8} \\[4pt] &=\dfrac{\cos 2 x \sin 2-\sin 2 x \cos 2-\sin 2 x}{8 \sin 2}+\dfrac{x^{2}}{4}-\dfrac{1}{8} \\[4pt] &=\dfrac{\left(1-2 \sin ^{2} x\right) \sin 1 \cos 1-\sin x \cos x\left(2 \cos ^{2} 1-1\right)-\sin x \cos x-\sin 1 \cos 1}{8 \sin 1 \cos 1}+\dfrac{x^{2}}{4} \\[4pt] &=-\dfrac{\sin ^{2} x \sin 1+\sin x \cos x \cos 1}{4 \sin 1}+\dfrac{x^{2}}{4} \\[4pt] &=\dfrac{x^{2}}{4}-\dfrac{\sin x \cos (1-x)}{4 \sin 1} . \end{aligned} \nonumber

Problems

8.1. Use the Method of Variation of Parameters to determine the general solution for the following problems.

a. y^{\prime \prime}+y=\tan x.

b. y^{\prime \prime}-4 y^{\prime}+4 y=6 x e^{2 x}

8.2. Instead of assuming that c_{1}^{\prime} y_{1}+c_{2}^{\prime} y_{2}=0 in the derivation of the solution using Variation of Parameters, assume that c_{1}^{\prime} y_{1}+c_{2}^{\prime} y_{2}=h(x) for an arbitrary function h(x) and show that one gets the same particular solution.

8.3. Find the solution of each initial value problem using the appropriate initial value Green’s function.

a. y^{\prime \prime}-3 y^{\prime}+2 y=20 e^{-2 x}, \quad y(0)=0, \quad y^{\prime}(0)=6.

b. y^{\prime \prime}+y=2 \sin 3 x, \quad y(0)=5, \quad y^{\prime}(0)=0.

c. y^{\prime \prime}+y=1+2 \cos x, \quad y(0)=2, \quad y^{\prime}(0)=0.

d. x^{2} y^{\prime \prime}-2 x y^{\prime}+2 y=3 x^{2}-x, \quad y(1)=\pi, \quad y^{\prime}(1)=0.

8.4. Consider the problem y^{\prime \prime}=\sin x, y^{\prime}(0)=0, y(\pi)=0.

a. Solve by direct integration.

b. Determine the Green’s function.

c. Solve the boundary value problem using the Green’s function.

d. Change the boundary conditions to y^{\prime}(0)=5, y(\pi)=-3.

i. Solve by direct integration.

ii. Solve using the Green’s function.

8.5. Consider the problem:

\dfrac{\partial^{2} G}{\partial x^{2}}=\delta\left(x-x_{0}\right), \quad \dfrac{\partial G}{\partial x}\left(0, x_{0}\right)=0, \quad G\left(\pi, x_{0}\right)=0 \nonumber

a. Solve by direct integration.

b. Compare this result to the Green’s function in part b of the last problem.

c. Verify that G is symmetric in its arguments.

8.6. In this problem you will show that the sequence of functions

f_{n}(x)=\dfrac{n}{\pi}\left(\dfrac{1}{1+n^{2} x^{2}}\right) \nonumber

approaches \delta(x) as n \rightarrow \infty. Use the following to support your argument:

a. Show that \lim _{n \rightarrow \infty} f_{n}(x)=0 for x \neq 0.

b. Show that the area under each function is one.

8.7. Verify that the sequence of functions \left\{f_{n}(x)\right\}_{n=1}^{\infty}, defined by f_{n}(x)= \dfrac{n}{2} e^{-n|x|}, approaches a delta function.

8.8. Evaluate the following integrals:
a. \int_{0}^{\pi} \sin x \delta\left(x-\dfrac{\pi}{2}\right) d x.
b. \int_{-\infty}^{\infty} \delta\left(\dfrac{x-5}{3} e^{2 x}\right)\left(3 x^{2}-7 x+2\right) d x
c. \int_{0}^{\pi} x^{2} \delta\left(x+\dfrac{\pi}{2}\right) d x
d. \int_{0}^{\infty} e^{-2 x} \delta\left(x^{2}-5 x+6\right) d x. [See Problem 8.10.]
e. \int_{-\infty}^{\infty}\left(x^{2}-2 x+3\right) \delta\left(x^{2}-9\right) d x. [See Problem 8.10.]

8.9. Find a Fourier series representation of the Dirac delta function, \delta(x), on [-L, L]

8.10. For the case that a function has multiple simple roots, f\left(x_{i}\right)=0, f^{\prime}\left(x_{i}\right) \neq 0, i=1,2, \ldots, it can be shown that

\delta(f(x))=\sum_{i=1}^{n} \dfrac{\delta\left(x-x_{i}\right)}{\left|f^{\prime}\left(x_{i}\right)\right|} \nonumber

Use this result to evaluate \int_{-\infty}^{\infty} \delta\left(x^{2}-5 x+6\right)\left(3 x^{2}-7 x+2\right) d x.

8.11. Consider the boundary value problem: y^{\prime \prime}-y=x, x \in(0,1), with boundary conditions y(0)=y(1)=0.

a. Find a closed form solution without using Green’s functions.

b. Determine the closed form Green’s function using the properties of Green’s functions. Use this Green’s function to obtain a solution of the boundary value problem.

c. Determine a series representation of the Green’s function. Use this Green’s function to obtain a solution of the boundary value problem.

d. Confirm that all of the solutions obtained give the same results.
