Mathematics LibreTexts

3.4: Second-Order Approximations


In one-variable calculus, Taylor polynomials provide a natural way to extend best affine approximations to higher-order polynomial approximations. It is possible to generalize these ideas to scalar-valued functions of two or more variables, but the theory rapidly becomes involved and technical. In this section we will be content merely to point the way with a discussion of second-degree Taylor polynomials. Even at this level, it is best to leave full explanations for a course in advanced calculus.

Higher-order derivatives

The first step is to introduce higher-order derivatives. If f: \mathbb{R}^{n} \rightarrow \mathbb{R} has partial derivatives which exist on an open set U, then, for any i=1,2,3, \ldots, n, \frac{\partial f}{\partial x_{i}} is itself a function from \mathbb{R}^{n} to \mathbb{R}. The partial derivatives of \frac{\partial f}{\partial x_{i}}, if they exist, are called second-order partial derivatives of f. We may denote the partial derivative of \frac{\partial f}{\partial x_{i}} with respect to x_{j}, j=1,2,3, \ldots, n, evaluated at a point \mathbf{x}, by either \frac{\partial^{2}}{\partial x_{j} \partial x_{i}} f(\mathbf{x}), or f_{x_{i} x_{j}}(\mathbf{x}), or D_{x_{i} x_{j}} f(\mathbf{x}). Note the order in which the variables are written; it is possible that differentiating first with respect to x_{i} and second with respect to x_{j} will yield a different result than if the order were reversed.

If j=i, we will write \frac{\partial^{2}}{\partial x_{i}^{2}} f(\mathbf{x}) for \frac{\partial^{2}}{\partial x_{i} \partial x_{i}} f(\mathbf{x}). It is, of course, possible to extend this notation to third, fourth, and higher-order derivatives.

Example 3.4.1

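Suppose f(x, y)=x^{2} y-3 x \sin (2 y). Then

f_{x}(x, y)=2 x y-3 \sin (2 y) \nonumber

and

f_{y}(x, y)=x^{2}-6 x \cos (2 y) , \nonumber

so

f_{x x}(x, y)=2 y , \quad f_{x y}(x, y)=2 x-6 \cos (2 y) , \quad f_{y y}(x, y)=12 x \sin (2 y) , \nonumber

and

f_{y x}(x, y)=2 x-6 \cos (2 y) . \nonumber

Note that, in this example, f_{x y}(x, y)=f_{y x}(x, y). For an example of a third-order derivative,

f_{y x y}(x, y)=12 \sin (2 y) . \nonumber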
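As a quick numerical sanity check of this example, the following Python sketch differentiates the first-order partials f_{x} and f_{y} (entered by hand from the example) with a central difference and compares both mixed partials with the closed form 2 x-6 \cos (2 y). The helper names, the step size h, and the sample point are illustrative assumptions, not part of the text.

```python
import math

# First-order partials of f(x, y) = x^2 y - 3 x sin(2y), copied from the example.
def f_x(x, y):
    return 2 * x * y - 3 * math.sin(2 * y)

def f_y(x, y):
    return x**2 - 6 * x * math.cos(2 * y)

def d_dx(g, x, y, h=1e-5):
    """Central-difference estimate of the partial derivative of g in x."""
    return (g(x + h, y) - g(x - h, y)) / (2 * h)

def d_dy(g, x, y, h=1e-5):
    """Central-difference estimate of the partial derivative of g in y."""
    return (g(x, y + h) - g(x, y - h)) / (2 * h)

x0, y0 = 1.5, 0.7                       # arbitrary sample point (assumption)
f_xy = d_dy(f_x, x0, y0)                # differentiate in x first, then in y
f_yx = d_dx(f_y, x0, y0)                # differentiate in y first, then in x
exact = 2 * x0 - 6 * math.cos(2 * y0)   # closed form from the example

print(f_xy, f_yx, exact)                # the three values should nearly agree
```

Agreement at one sample point is only evidence, not a proof, but it is a cheap way to catch sign or coefficient slips in hand computations.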

Example 3.4.2

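Suppose w=x y^{2} z^{3}-4 x y \log (z). Then, for example,

\frac{\partial^{2} w}{\partial y \partial x}=\frac{\partial}{\partial y}\left(\frac{\partial w}{\partial x}\right)=\frac{\partial}{\partial y}\left(y^{2} z^{3}-4 y \log (z)\right)=2 y z^{3}-4 \log (z) \nonumber

and

\frac{\partial^{2} w}{\partial z^{2}}=\frac{\partial}{\partial z}\left(\frac{\partial w}{\partial z}\right)=\frac{\partial}{\partial z}\left(3 x y^{2} z^{2}-\frac{4 x y}{z}\right)=6 x y^{2} z+\frac{4 x y}{z^{2}} . \nonumber

Also,

\frac{\partial^{2} w}{\partial x \partial y}=\frac{\partial}{\partial x}\left(\frac{\partial w}{\partial y}\right)=\frac{\partial}{\partial x}\left(2 x y z^{3}-4 x \log (z)\right)=2 y z^{3}-4 \log (z) , \nonumber

and so

\frac{\partial^{2} w}{\partial y \partial x}=\frac{\partial^{2} w}{\partial x \partial y} . \nonumber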

In both of our examples we have seen instances where mixed second partial derivatives, that is, second-order partial derivatives with respect to two different variables, taken in different orders are equal. This is not always the case, but does follow if we assume that both of the mixed partial derivatives in question are continuous.
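To see that the order of differentiation can matter, consider a standard counterexample that is not part of this text: f(x, y)=x y\left(x^{2}-y^{2}\right) /\left(x^{2}+y^{2}\right) for (x, y) \neq(0,0), with f(0,0)=0. Its mixed partials exist at the origin but are not continuous there, and one finds f_{x y}(0,0)=-1 while f_{y x}(0,0)=1. The Python sketch below estimates both values with nested central differences; the step sizes are assumptions, chosen so that the inner step is much smaller than the outer one.

```python
def f(x, y):
    # Standard counterexample (not from the text); defined to be 0 at the origin.
    if x == 0.0 and y == 0.0:
        return 0.0
    return x * y * (x**2 - y**2) / (x**2 + y**2)

INNER = 1e-8   # step for the first derivative (assumed; must be much smaller than OUTER)
OUTER = 1e-3   # step for the second derivative (assumed)

def f_x(x, y):
    """Central-difference estimate of the partial derivative of f in x."""
    return (f(x + INNER, y) - f(x - INNER, y)) / (2 * INNER)

def f_y(x, y):
    """Central-difference estimate of the partial derivative of f in y."""
    return (f(x, y + INNER) - f(x, y - INNER)) / (2 * INNER)

# f_xy: differentiate in x first, then in y, at the origin.
f_xy = (f_x(0.0, OUTER) - f_x(0.0, -OUTER)) / (2 * OUTER)
# f_yx: differentiate in y first, then in x, at the origin.
f_yx = (f_y(OUTER, 0.0) - f_y(-OUTER, 0.0)) / (2 * OUTER)

print(f_xy, f_yx)   # close to -1 and 1, respectively
```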

Definition 3.4.1

We say a function f: \mathbb{R}^{n} \rightarrow \mathbb{R} is C^{2} on an open set U if f_{x_{j} x_{i}} is continuous on U for each i=1,2, \ldots, n and j=1,2, \ldots, n.

Theorem 3.4.1

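If f is C^{2} on an open ball containing a point \mathbf{c}, then

\frac{\partial^{2}}{\partial x_{j} \partial x_{i}} f(\mathbf{c})=\frac{\partial^{2}}{\partial x_{i} \partial x_{j}} f(\mathbf{c}) \nonumber

for i=1,2, \ldots, n and j=1,2, \ldots, n.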

Although we have the tools to verify this result, we will leave the justification for a more advanced course.

We shall see that it is convenient to use a matrix to arrange the second partial derivatives of a function f. If f: \mathbb{R}^{n} \rightarrow \mathbb{R}, there are n^{2} second partial derivatives and this matrix will be n \times n.

Definition 3.4.2

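Suppose the second-order partial derivatives of f: \mathbb{R}^{n} \rightarrow \mathbb{R} all exist at the point \mathbf{c}. We call the n \times n matrix

H f(\mathbf{c})=\left[\begin{array}{ccccc} \frac{\partial^{2}}{\partial x_{1}^{2}} f(\mathbf{c}) & \frac{\partial^{2}}{\partial x_{2} \partial x_{1}} f(\mathbf{c}) & \frac{\partial^{2}}{\partial x_{3} \partial x_{1}} f(\mathbf{c}) & \cdots & \frac{\partial^{2}}{\partial x_{n} \partial x_{1}} f(\mathbf{c}) \\ \frac{\partial^{2}}{\partial x_{1} \partial x_{2}} f(\mathbf{c}) & \frac{\partial^{2}}{\partial x_{2}^{2}} f(\mathbf{c}) & \frac{\partial^{2}}{\partial x_{3} \partial x_{2}} f(\mathbf{c}) & \cdots & \frac{\partial^{2}}{\partial x_{n} \partial x_{2}} f(\mathbf{c}) \\ \frac{\partial^{2}}{\partial x_{1} \partial x_{3}} f(\mathbf{c}) & \frac{\partial^{2}}{\partial x_{2} \partial x_{3}} f(\mathbf{c}) & \frac{\partial^{2}}{\partial x_{3}^{2}} f(\mathbf{c}) & \cdots & \frac{\partial^{2}}{\partial x_{n} \partial x_{3}} f(\mathbf{c}) \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ \frac{\partial^{2}}{\partial x_{1} \partial x_{n}} f(\mathbf{c}) & \frac{\partial^{2}}{\partial x_{2} \partial x_{n}} f(\mathbf{c}) & \frac{\partial^{2}}{\partial x_{3} \partial x_{n}} f(\mathbf{c}) & \cdots & \frac{\partial^{2}}{\partial x_{n}^{2}} f(\mathbf{c}) \end{array}\right] \nonumber

the Hessian of f at \mathbf{c}.

Put another way, the Hessian of f at \mathbf{c} is the n \times n matrix whose ith row is \nabla f_{x_{i}}(\mathbf{c}).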

Example 3.4.3

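Suppose f(x, y)=x^{2} y-3 x \sin (2 y). Then, using our results from above,

H f(x, y)=\left[\begin{array}{ll} f_{x x}(x, y) & f_{x y}(x, y) \\ f_{y x}(x, y) & f_{y y}(x, y) \end{array}\right]=\left[\begin{array}{cc} 2 y & 2 x-6 \cos (2 y) \\ 2 x-6 \cos (2 y) & 12 x \sin (2 y) \end{array}\right] . \nonumber

Thus, for example,

H f(2,0)=\left[\begin{array}{rr} 0 & -2 \\ -2 & 0 \end{array}\right] . \nonumber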
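The value of H f(2,0) can be cross-checked numerically. The Python sketch below approximates each second-order partial with a central-difference stencil applied directly to f; the function name `hessian`, the step size, and the list-of-lists representation are illustrative choices rather than anything prescribed by the text.

```python
import math

def f(p):
    # f(x, y) = x^2 y - 3 x sin(2y), taking the point as a sequence (x, y).
    x, y = p
    return x**2 * y - 3 * x * math.sin(2 * y)

def hessian(func, c, h=1e-4):
    """Approximate the Hessian of func at the point c by central differences."""
    n = len(c)
    H = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            def shifted(si, sj):
                # Move coordinate i by si*h and coordinate j by sj*h.
                p = list(c)
                p[i] += si * h
                p[j] += sj * h
                return func(p)
            # Four-point stencil; for i == j it reduces to a second
            # difference with step 2h.
            H[i][j] = (shifted(1, 1) - shifted(1, -1)
                       - shifted(-1, 1) + shifted(-1, -1)) / (4 * h * h)
    return H

H = hessian(f, (2.0, 0.0))
for row in H:
    print(row)   # rows should be close to [0, -2] and [-2, 0]
```

Note that this stencil is symmetric in i and j by construction, so it cannot be used to test whether the two mixed partials agree; it only checks the entries against the hand computation.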

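Suppose f: \mathbb{R}^{2} \rightarrow \mathbb{R} is C^{2} on an open ball B^{2}(\mathbf{c}, r) and let \mathbf{h}=\left(h_{1}, h_{2}\right) be a point with \|\mathbf{h}\|<r. If we define \varphi: \mathbb{R} \rightarrow \mathbb{R} by \varphi(t)=f(\mathbf{c}+t \mathbf{h}), then \varphi(0)=f(\mathbf{c}) and \varphi(1)=f(\mathbf{c}+\mathbf{h}). From the one-variable calculus version of Taylor’s theorem, we know that

\varphi(1)=\varphi(0)+\varphi^{\prime}(0)+\frac{1}{2} \varphi^{\prime \prime}(s) , \nonumber

where s is a real number between 0 and 1. Using the chain rule, we have

\varphi^{\prime}(t)=\nabla f(\mathbf{c}+t \mathbf{h}) \cdot \frac{d}{d t}(\mathbf{c}+t \mathbf{h})=\nabla f(\mathbf{c}+t \mathbf{h}) \cdot \mathbf{h}=f_{x}(\mathbf{c}+t \mathbf{h}) h_{1}+f_{y}(\mathbf{c}+t \mathbf{h}) h_{2} \nonumber

and

\begin{align} \varphi^{\prime \prime}(t) &=h_{1} \nabla f_{x}(\mathbf{c}+t \mathbf{h}) \cdot \mathbf{h}+h_{2} \nabla f_{y}(\mathbf{c}+t \mathbf{h}) \cdot \mathbf{h} \nonumber \\ &=\left(h_{1} \nabla f_{x}(\mathbf{c}+t \mathbf{h})+h_{2} \nabla f_{y}(\mathbf{c}+t \mathbf{h})\right) \cdot \mathbf{h} \nonumber \\ &=\left[\begin{array}{ll} h_{1} & h_{2} \end{array}\right]\left[\begin{array}{ll} f_{x x}(\mathbf{c}+t \mathbf{h}) & f_{x y}(\mathbf{c}+t \mathbf{h}) \\ f_{y x}(\mathbf{c}+t \mathbf{h}) & f_{y y}(\mathbf{c}+t \mathbf{h}) \end{array}\right]\left[\begin{array}{l} h_{1} \\ h_{2} \end{array}\right] \nonumber \\ &=\mathbf{h}^{T} H f(\mathbf{c}+t \mathbf{h}) \mathbf{h} , \nonumber \end{align}

where we have used the notation

\mathbf{h}=\left[\begin{array}{l} h_{1} \\ h_{2} \end{array}\right] \nonumber

and

\mathbf{h}^{T}=\left[\begin{array}{ll} h_{1} & h_{2} \end{array}\right] , \nonumber

the latter being called the transpose of \mathbf{h} (see Problem 12 of Section 1.6). Hence

\varphi^{\prime}(0)=\nabla f(\mathbf{c}) \cdot \mathbf{h} \nonumber

and

\varphi^{\prime \prime}(s)=\mathbf{h}^{T} H f(\mathbf{c}+s \mathbf{h}) \mathbf{h} , \nonumber

so, substituting into the expansion of \varphi(1) given by Taylor’s theorem above, we have

f(\mathbf{c}+\mathbf{h})=\varphi(1)=f(\mathbf{c})+\nabla f(\mathbf{c}) \cdot \mathbf{h}+\frac{1}{2} \mathbf{h}^{T} H f(\mathbf{c}+s \mathbf{h}) \mathbf{h} . \nonumber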

This result, a version of Taylor’s theorem, is easily generalized to higher dimensions.

Theorem 3.4.2

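Suppose f: \mathbb{R}^{n} \rightarrow \mathbb{R} is C^{2} on an open ball B^{n}(\mathbf{c}, r) and let \mathbf{h} be a point with \|\mathbf{h}\|<r. Then there exists a real number s between 0 and 1 such that

f(\mathbf{c}+\mathbf{h})=f(\mathbf{c})+\nabla f(\mathbf{c}) \cdot \mathbf{h}+\frac{1}{2} \mathbf{h}^{T} H f(\mathbf{c}+s \mathbf{h}) \mathbf{h} . \nonumber

If we let \mathbf{x}=\mathbf{c}+\mathbf{h} and evaluate the Hessian at \mathbf{c}, the right-hand side of the equation in Theorem 3.4.2 becomes a polynomial approximation for f.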
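For a concrete feel for the number s in this theorem, take the illustrative choice f(x, y)=e^{-2 x+y} and \mathbf{c}=(0,0) (an assumption made here for convenience). Writing u=-2 h_{1}+h_{2}, one has \nabla f(\mathbf{c}) \cdot \mathbf{h}=u and \mathbf{h}^{T} H f(s \mathbf{h}) \mathbf{h}=u^{2} e^{s u}, so the formula reads e^{u}=1+u+\frac{1}{2} u^{2} e^{s u} and can be solved for s in closed form. The Python sketch below does this; the function name `solve_s` is hypothetical.

```python
import math

def f(x, y):
    return math.exp(-2 * x + y)

def solve_s(h1, h2):
    """Solve Taylor's formula for s, for f(x, y) = exp(-2x + y) and c = (0, 0).

    With u = -2*h1 + h2 the formula is exp(u) = 1 + u + 0.5 * u**2 * exp(s*u).
    Requires u != 0; for very small |u| the subtraction below loses accuracy.
    """
    u = -2 * h1 + h2
    return math.log(2 * (math.exp(u) - 1 - u) / u**2) / u

h1, h2 = 0.1, -0.1          # arbitrary displacement (assumption)
s = solve_s(h1, h2)
u = -2 * h1 + h2
rhs = 1 + u + 0.5 * u**2 * math.exp(s * u)

print(s)                    # lies strictly between 0 and 1
print(f(h1, h2), rhs)       # both sides of the formula agree
```

The theorem only asserts that some such s exists; an explicit formula like this one is available here because the example function is so simple.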

Definition 3.4.3

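If f: \mathbb{R}^{n} \rightarrow \mathbb{R} is C^{2} on an open ball about the point \mathbf{c}, then we call

P_{2}(\mathbf{x})=f(\mathbf{c})+\nabla f(\mathbf{c}) \cdot(\mathbf{x}-\mathbf{c})+\frac{1}{2}(\mathbf{x}-\mathbf{c})^{T} H f(\mathbf{c})(\mathbf{x}-\mathbf{c}) \nonumber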

the second-order Taylor polynomial for f at c.

Example 3.4.4

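To find the second-order Taylor polynomial for f(x, y)=e^{-2 x+y} at (0,0), we compute

\nabla f(x, y)=\left(-2 e^{-2 x+y}, e^{-2 x+y}\right) \nonumber

and

H f(x, y)=\left[\begin{array}{cc} 4 e^{-2 x+y} & -2 e^{-2 x+y} \\ -2 e^{-2 x+y} & e^{-2 x+y} \end{array}\right] , \nonumber

from which it follows that

\nabla f(0,0)=(-2,1) \nonumber

and

H f(0,0)=\left[\begin{array}{rr} 4 & -2 \\ -2 & 1 \end{array}\right] . \nonumber

Then

\begin{align} P_{2}(x, y) &=f(0,0)+\nabla f(0,0) \cdot(x, y)+\frac{1}{2}\left[\begin{array}{ll} x & y \end{array}\right] H f(0,0)\left[\begin{array}{l} x \\ y \end{array}\right] \nonumber \\ &=1+(-2,1) \cdot(x, y)+\frac{1}{2}\left[\begin{array}{ll} x & y \end{array}\right]\left[\begin{array}{rr} 4 & -2 \\ -2 & 1 \end{array}\right]\left[\begin{array}{l} x \\ y \end{array}\right] \nonumber \\ &=1-2 x+y+\frac{1}{2}\left[\begin{array}{ll} x & y \end{array}\right]\left[\begin{array}{c} 4 x-2 y \\ -2 x+y \end{array}\right] \nonumber \\ &=1-2 x+y+\frac{1}{2}\left(4 x^{2}-2 x y-2 x y+y^{2}\right) \nonumber \\ &=1-2 x+y+2 x^{2}-2 x y+\frac{1}{2} y^{2} . \nonumber \end{align}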
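To see how well this polynomial tracks f near the origin, the Python sketch below evaluates the error \left|f(x, y)-P_{2}(x, y)\right| at a few points approaching (0,0), with P_{2}(x, y)=1-2 x+y+2 x^{2}-2 x y+\frac{1}{2} y^{2}. The sampling direction (1,-1) and the helper name `taylor_error` are arbitrary illustrative choices.

```python
import math

def f(x, y):
    return math.exp(-2 * x + y)

def p2(x, y):
    # Second-order Taylor polynomial of f at (0, 0), as computed in the example.
    return 1 - 2 * x + y + 2 * x**2 - 2 * x * y + 0.5 * y**2

def taylor_error(x, y):
    """Absolute difference between f and its second-order Taylor polynomial."""
    return abs(f(x, y) - p2(x, y))

for t in (0.1, 0.01, 0.001):
    # Sample along the (arbitrarily chosen) direction (1, -1).
    print(t, taylor_error(t, -t))
```

Because the neglected terms are of third order, shrinking the distance to the origin by a factor of 10 should shrink the error by roughly a factor of 1000.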

Symmetric matrices

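Note that if f: \mathbb{R}^{2} \rightarrow \mathbb{R} is C^{2} on an open ball about the point \mathbf{c}, then the entry in the ith row and jth column of H f(\mathbf{c}) is equal to the entry in the jth row and ith column of H f(\mathbf{c}) since

\frac{\partial^{2}}{\partial x_{j} \partial x_{i}} f(\mathbf{c})=\frac{\partial^{2}}{\partial x_{i} \partial x_{j}} f(\mathbf{c}) . \nonumber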

Definition 3.4.4

We call a matrix M=\left[a_{i j}\right] with the property that a_{i j}=a_{j i} for all i \neq j a symmetric matrix.

Example 3.4.5

The matrices

\left[\begin{array}{ll} 2 & 1 \\ 1 & 5 \end{array}\right] \nonumber

and

\left[\begin{array}{rrr} 1 & 2 & 3 \\ 2 & 4 & 5 \\ 3 & 5 & -7 \end{array}\right] \nonumber

are both symmetric, while the matrices

\left[\begin{array}{rr} 2 & -1 \\ 3 & 4 \end{array}\right] \nonumber

and

\left[\begin{array}{rrr} 2 & 1 & 3 \\ 2 & 3 & 4 \\ -2 & 4 & -6 \end{array}\right] \nonumber

are not symmetric.
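The definition translates directly into a short check. The Python sketch below tests the four matrices of this example; the function name `is_symmetric` and the list-of-rows representation are illustrative assumptions.

```python
def is_symmetric(m):
    """Return True if the square matrix m (a list of rows) equals its transpose."""
    n = len(m)
    if any(len(row) != n for row in m):
        return False  # not square, so it cannot be symmetric
    # Only entries above the diagonal need to be compared with their mirrors.
    return all(m[i][j] == m[j][i] for i in range(n) for j in range(i + 1, n))

A = [[2, 1], [1, 5]]
B = [[1, 2, 3], [2, 4, 5], [3, 5, -7]]
C = [[2, -1], [3, 4]]
D = [[2, 1, 3], [2, 3, 4], [-2, 4, -6]]

print([is_symmetric(m) for m in (A, B, C, D)])
```

The first two matrices pass the check and the last two fail it, in agreement with the example.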

Example 3.4.6

The Hessian of any C^2 scalar valued function is a symmetric matrix. For example, the Hessian of f(x, y)=e^{-2 x+y}, namely,

H f(x, y)=\left[\begin{array}{cc} 4 e^{-2 x+y} & -2 e^{-2 x+y} \\ -2 e^{-2 x+y} & e^{-2 x+y} \end{array}\right] , \nonumber

is symmetric for any value of (x,y).

Given an n \times n symmetric matrix M, the function q: \mathbb{R}^{n} \rightarrow \mathbb{R} defined by

q(\mathbf{x})=\mathbf{x}^{T} M \mathbf{x} \nonumber

is a quadratic polynomial. When M is the Hessian of some function f, this is the form of the quadratic term in the second-order Taylor polynomial for f. In the next section it will be important to be able to determine when this term is positive for all \mathbf{x} \neq \mathbf{0} or negative for all \mathbf{x} \neq \mathbf{0}.

Definition 3.4.5

Let M be an n \times n symmetric matrix and define q: \mathbb{R}^{n} \rightarrow \mathbb{R} by

q(\mathbf{x})=\mathbf{x}^{T} M \mathbf{x} . \nonumber

We say M is positive definite if q(\mathbf{x})>0 for all \mathbf{x} \neq \mathbf{0} in \mathbb{R}^n, negative definite if q(\mathbf{x})<0 for all \mathbf{x} \neq \mathbf{0} in \mathbb{R}^n, and indefinite if there exists an \mathbf{x} \neq \mathbf{0} for which q(\mathbf{x})>0 and an \mathbf{x} \neq \mathbf{0} for which q(\mathbf{x})<0. Otherwise, we say M is nondefinite.
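As a small illustration of this definition, the Python sketch below evaluates q(\mathbf{x})=\mathbf{x}^{T} M \mathbf{x} for the Hessian of f(x, y)=e^{-2 x+y} at (0,0), which was computed earlier in this section. The function name `quadratic_form` and the test vectors are illustrative assumptions.

```python
def quadratic_form(m, x):
    """Evaluate q(x) = x^T M x for a square matrix m (list of rows) and vector x."""
    n = len(x)
    return sum(x[i] * m[i][j] * x[j] for i in range(n) for j in range(n))

# Hessian of f(x, y) = e^(-2x + y) at (0, 0), from earlier in this section.
M = [[4, -2], [-2, 1]]

print(quadratic_form(M, (1, 0)))   # q is positive at this vector
print(quadratic_form(M, (1, 2)))   # q vanishes at this nonzero vector
```

Here q(x, y)=4 x^{2}-4 x y+y^{2}=(2 x-y)^{2}, which is never negative but vanishes along the line y=2 x, so this M is neither positive definite, negative definite, nor indefinite; it is nondefinite.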

In general it is not easy to determine to which of these categories a given symmetric matrix belongs. However, the important special case of 2 \times 2 matrices is straightforward. Consider

M=\left[\begin{array}{ll} a & b \\ b & c \end{array}\right] \nonumber

and let

q(x, y)=\left[\begin{array}{ll} x & y \end{array}\right] M\left[\begin{array}{l} x \\ y \end{array}\right]=a x^{2}+2 b x y+c y^{2} . \label{3.4.10}

If a \neq 0, then we may complete the square in (\ref{3.4.10}) to obtain

\begin{align} q(x, y) &=a\left(x^{2}+\frac{2 b}{a} x y\right)+c y^{2} \nonumber \\ &=a\left(\left(x+\frac{b}{a} y\right)^{2}-\frac{b^{2}}{a^{2}} y^{2}\right)+c y^{2} \nonumber \\ &=a\left(x+\frac{b}{a} y\right)^{2}+\left(c-\frac{b^{2}}{a}\right) y^{2} \nonumber \\ &=a\left(x+\frac{b}{a} y\right)^{2}+\frac{a c-b^{2}}{a} y^{2} \nonumber \\ &=a\left(x+\frac{b}{a} y\right)^{2}+\frac{\operatorname{det}(M)}{a} y^{2} . \label{3.4.11} \end{align}

Now suppose \operatorname{det}(M)>0. Then from (\ref{3.4.11}) we see that q(x, y)>0 for all (x, y) \neq(0,0) if a>0 and q(x, y)<0 for all (x, y) \neq(0,0) if a<0. That is, M is positive definite if a>0 and negative definite if a<0. If \operatorname{det}(M)<0, then q(1,0) and q\left(-\frac{b}{a}, 1\right) will have opposite signs, and so M is indefinite. Finally, suppose \operatorname{det}(M)=0. Then

q(x, y)=a\left(x+\frac{b}{a} y\right)^{2} , \nonumber

so q(x,y) = 0 when x=-\frac{b}{a} y. Moreover, q(x,y) has the same sign as a for all other values of (x,y). Hence in this case M is nondefinite.

Similar analyses for the case a=0 give us the following result.

Theorem 3.4.3

Suppose

M=\left[\begin{array}{ll} a & b \\ b & c \end{array}\right] . \nonumber

If \operatorname{det}(M)>0, then M is positive definite if a>0 and negative definite if a<0. If \operatorname{det}(M)<0, then M is indefinite. If \operatorname{det}(M)=0, then M is nondefinite.

Example 3.4.7

The matrix

M=\left[\begin{array}{ll} 2 & 1 \\ 1 & 3 \end{array}\right] \nonumber

is positive definite since \operatorname{det}(M)=5>0 and 2>0.

Example 3.4.8

The matrix

M=\left[\begin{array}{rr} -2 & 1 \\ 1 & -4 \end{array}\right] \nonumber

is negative definite since \operatorname{det}(M)=7>0 and -2<0.

Example 3.4.9

The matrix

M=\left[\begin{array}{rr} -3 & 1 \\ 1 & 2 \end{array}\right] \nonumber

is indefinite since \operatorname{det}(M)=-7<0.

Example 3.4.10

The matrix

M=\left[\begin{array}{ll} 4 & 2 \\ 2 & 1 \end{array}\right] \nonumber

is nondefinite since \operatorname{det}(M)=0.
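The determinant test for 2 \times 2 symmetric matrices is easy to mechanize. The Python sketch below applies it to the four matrices from the preceding examples; the function name `classify_2x2` and its string return values are illustrative assumptions. The arguments a, b, c are the entries of M as in the theorem.

```python
def classify_2x2(a, b, c):
    """Classify the symmetric matrix [[a, b], [b, c]] using its determinant."""
    det = a * c - b * b
    if det > 0:
        # det > 0 forces a != 0, so the sign of a decides the case.
        return "positive definite" if a > 0 else "negative definite"
    if det < 0:
        return "indefinite"
    return "nondefinite"

print(classify_2x2(2, 1, 3))     # matrix [[2, 1], [1, 3]]
print(classify_2x2(-2, 1, -4))   # matrix [[-2, 1], [1, -4]]
print(classify_2x2(-3, 1, 2))    # matrix [[-3, 1], [1, 2]]
print(classify_2x2(4, 2, 1))     # matrix [[4, 2], [2, 1]]
```

With integer entries the determinant is computed exactly. With floating-point entries, a determinant that should be zero may come out slightly positive or negative, so a tolerance would be needed for the nondefinite case.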

In the next section we will see how these ideas help us identify local extreme values for scalar valued functions of two variables.


This page titled 3.4: Second-Order Approximations is shared under a CC BY-NC-SA 1.0 license and was authored, remixed, and/or curated by Dan Sloughter via source content that was edited to the style and standards of the LibreTexts platform.
