2.5: Maxima and Minima

Last updated: Jan 16, 2023

$\newcommand{\gt}{>}$


The gradient can be used to find extreme points of real-valued functions of several variables, that is, points where the function has a local maximum or local minimum. We will consider only functions of two variables; functions of three or more variables require methods using linear algebra.

Definition 2.7

Let $f (x, y)$ be a real-valued function, and let $(a,b)$ be a point in the domain of $f$ . We say that $f$ has a local maximum at $(a,b)$ if $f (x, y) \le f (a,b)$ for all $(x, y)$ inside some disk of positive radius centered at $(a,b)$ , i.e. there is some sufficiently small $r > 0$ such that $f (x, y) \le f (a,b)\text{ for all }(x, y)$ for which $(x− a)^2 +(y− b)^2 < r^2$ .

Likewise, we say that $f$ has a local minimum at $(a,b)\text{ if }f (x, y) \ge f (a,b)\text{ for all }(x, y)$ inside some disk of positive radius centered at $(a,b)$ .

If $f (x, y) \le f (a,b)\text{ for all }(x, y)$ in the domain of $f$ , then $f$ has a global maximum at $(a,b)$ . If $f (x, y) \ge f (a,b)\text{ for all }(x, y)\text{ in the domain of }f \text{, then }f$ has a global minimum at $(a,b)$ .

Suppose that $(a,b)$ is a local maximum point for $f (x, y)$ , and that the first-order partial derivatives of $f$ exist at $(a,b)$ . We know that $f (a,b)$ is the largest value of $f (x, y)\text{ as }(x, y)$ goes in all directions from the point $(a,b)$ , in some sufficiently small disk centered at $(a,b)$ . In particular, $f (a,b)$ is the largest value of $f$ in the $x$ direction (around the point $(a,b)$ ), that is, the single-variable function $g(x) = f (x,b)$ has a local maximum at $x = a$ . So we know that $g ′ (a) = 0$ . Since $g ′ (x) = \dfrac{∂f}{∂x} (x,b)\text{, then }\dfrac{∂f}{∂x} (a,b) = 0$ . Similarly, $f (a,b)$ is the largest value of $f$ near $(a,b)$ in the $y$ direction and so $\dfrac{∂f}{∂y} (a,b) = 0$ . We thus have the following theorem:

Theorem 2.5

Let $f (x, y)$ be a real-valued function such that both $\dfrac{∂f}{∂x} (a,b)$ and $\dfrac{∂f}{∂y} (a,b)$ exist. Then a necessary condition for $f (x, y)$ to have a local maximum or minimum at $(a,b)$ is that $\nabla f (a,b) = \textbf{0}$ .

Note: Theorem 2.5 can be extended to apply to functions of three or more variables.

A point $(a,b)$ where $\nabla f (a,b) = \textbf{0}$ is called a critical point for the function $f (x, y)$ . So given a function $f (x, y)$ , to find the critical points of $f$ you have to solve the equations $\dfrac{∂f}{∂x} (x, y) = 0\text{ and }\dfrac{∂f}{∂y} (x, y) = 0$ simultaneously for $(x, y)$ . Similar to the single-variable case, the necessary condition that $\nabla f (a,b) = \textbf{0}$ is not always sufficient to guarantee that a critical point is a local maximum or minimum.
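
As a rough numerical companion to the definition of a critical point, the following Python sketch approximates the two partial derivatives by central differences and checks whether both are close to zero. The helper names `grad` and `is_critical`, the step size `h`, the tolerance `tol`, and the sample function `paraboloid` are illustrative assumptions and do not come from the text. A check of this kind can suggest, but not prove, that a point is critical.

```python
def grad(f, x, y, h=1e-6):
    """Approximate (df/dx, df/dy) at (x, y) with central differences."""
    dfdx = (f(x + h, y) - f(x - h, y)) / (2 * h)
    dfdy = (f(x, y + h) - f(x, y - h)) / (2 * h)
    return dfdx, dfdy


def is_critical(f, x, y, tol=1e-6):
    """True if both approximate partial derivatives are within tol of zero."""
    dfdx, dfdy = grad(f, x, y)
    return abs(dfdx) < tol and abs(dfdy) < tol


# Hypothetical sample function, chosen only for illustration:
# its gradient is (2(x - 1), 2(y + 2)), which vanishes only at (1, -2).
def paraboloid(x, y):
    return (x - 1) ** 2 + (y + 2) ** 2


print(is_critical(paraboloid, 1.0, -2.0))  # the gradient vanishes here
print(is_critical(paraboloid, 0.0, 0.0))   # the gradient is (-2, 4) here
```

Passing this check says nothing about whether the point is a maximum, a minimum, or neither; it only tests the necessary condition $\nabla f = \textbf{0}$ approximately.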

Example 2.18

The function $f (x, y) = x y$ has a critical point at $(0,0)$ : $\dfrac{∂f}{∂x} = y = 0 \Rightarrow y = 0$ , and $\dfrac{∂f}{∂y} = x = 0 \Rightarrow x = 0$ , so $(0,0)$ is the only critical point. But clearly $f$ does not have a local maximum or minimum at $(0,0)$ since any disk around $(0,0)$ contains points $(x, y)$ where the values of $x$ and $y$ have the same sign (so that $f (x, y) = x y \gt 0 = f (0,0))$ and different signs (so that $f (x, y) = x y < 0 = f (0,0))$ . In fact, along the path $y = x$ in $\mathbb{R}^2$ , $f (x, y) = x^2$ , which has a local minimum at $(0,0)$ , while along the path $y = −x$ we have $f (x, y) = −x^2$ , which has a local maximum at $(0,0)$ . So $(0,0)$ is an example of a saddle point, i.e. it is a local maximum in one direction and a local minimum in another direction. The graph of $f (x, y)$ is shown in Figure 2.5.1, which is a hyperbolic paraboloid.

Figure 2.5.1: $f (x, y) = x y$ , saddle point at $(0,0)$
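
The sign argument in Example 2.18 can be illustrated numerically. The sketch below samples $f (x, y) = x y$ at four points on a small circle around $(0,0)$ , along the diagonal directions, and reports whether values both above and below $f (0,0) = 0$ occur. The radius `r` and the choice of four sample angles are arbitrary assumptions made for illustration.

```python
import math


def f(x, y):
    return x * y


r = 1e-3  # radius of a small circle centered at (0, 0); any r > 0 works
# Sample f in the four diagonal directions: 45, 135, 225 and 315 degrees.
angles = [math.pi / 4 + k * math.pi / 2 for k in range(4)]
values = [f(r * math.cos(t), r * math.sin(t)) for t in angles]

for t, v in zip(angles, values):
    print(f"angle {math.degrees(t):5.1f} deg: f = {v:+.3e}")

# A local maximum or minimum would rule out one of these two cases.
has_larger = any(v > f(0, 0) for v in values)
has_smaller = any(v < f(0, 0) for v in values)
print("values on both sides of f(0,0):", has_larger and has_smaller)
```

The directions 45 and 225 degrees lie on the path $y = x$ , and 135 and 315 degrees lie on $y = −x$ , matching the two paths discussed in the example.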

The following theorem, which we will not prove here, gives sufficient conditions for a critical point to be a local maximum or minimum of a smooth function (i.e. a function whose partial derivatives of all orders exist and are continuous).

Theorem 2.6

Let $f (x, y)$ be a smooth real-valued function, with a critical point at $(a,b)$ (i.e. $\nabla f (a,b) = \textbf{0}$ ). Define

$\nonumber D = \dfrac{∂^2 f}{∂x^2} (a,b) \dfrac{∂^2 f}{∂y^2}(a,b)− \left (\dfrac{∂^2 f}{∂y∂x} (a,b)\right )^2$

Then

(a) if $D > 0 \text{ and }\dfrac{∂^2 f}{∂x^2} (a,b) > 0$ , then $f$ has a local minimum at $(a,b)$
(b) if $D > 0 \text{ and }\dfrac{∂^2 f}{∂x^2} (a,b) < 0$ , then $f$ has a local maximum at $(a,b)$
(c) if $D < 0 \text{, then }f$ has neither a local minimum nor a local maximum at $(a,b)$
(d) if $D = 0$ , then the test fails

If condition (c) holds, then $(a,b)$ is a saddle point. Note that the assumption that $f (x, y)$ is smooth means that

$D = \begin{vmatrix} \dfrac{∂^2 f}{∂x^2} (a,b) & \dfrac{∂^2 f}{∂y∂x} (a,b) \\[4pt] \dfrac{∂^2 f}{∂x∂y} (a,b) & \dfrac{∂^2 f}{∂y^2} (a,b) \\[4pt] \end{vmatrix}$

since $\dfrac{∂^2 f}{∂y∂x} = \dfrac{∂^2 f}{∂x∂y}$ . Also, if $D > 0$ then $\dfrac{∂^2 f}{∂x^2} (a,b) \dfrac{∂^2 f}{∂y^2} (a,b) = D + \left ( \dfrac{∂^2 f}{∂y∂x} (a,b)\right )^2 > 0$ , and so $\dfrac{∂^2 f}{∂x^2} (a,b)\text{ and }\dfrac{∂^2 f}{∂y^2} (a,b)$ have the same sign. This means that in parts (a) and (b) of the theorem one can replace $\dfrac{∂^2 f}{∂x^2} (a,b)$ by $\dfrac{∂^2 f}{∂y^2} (a,b)$ if desired.
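
The case analysis of Theorem 2.6 can be sketched as a small Python function. It assumes the caller has already found a critical point and computed the three second-order partial derivatives there; the function name `classify` and the returned strings are illustrative choices, not notation from the text.

```python
def classify(fxx, fyy, fxy):
    """Apply the test of Theorem 2.6 at a critical point.

    fxx, fyy, fxy are the second-order partial derivatives evaluated
    at the critical point (assumed to be computed beforehand).
    """
    D = fxx * fyy - fxy ** 2
    if D > 0:
        # D > 0 forces fxx and fyy to be nonzero with the same sign.
        return "local minimum" if fxx > 0 else "local maximum"
    if D < 0:
        return "saddle point"
    return "test fails"


# Sample inputs chosen for illustration:
print(classify(2, 2, 1))    # D = 3 > 0 with fxx > 0
print(classify(-1, -2, 1))  # D = 1 > 0 with fxx < 0
print(classify(0, -2, 1))   # D = -1 < 0
print(classify(2, 8, -4))   # D = 0
```

With floating-point inputs the comparison with zero is delicate, so a value of `D` that is merely very small should be treated with the same caution as the case $D = 0$ .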

Example 2.19

Find all local maxima and minima of $f (x, y) = x^2 + x y+ y^2 −3x$ .

Solution

First find the critical points, i.e. where $\nabla f = \textbf{0}$ . Since

$\nonumber \dfrac{∂f}{∂x} = 2x+ y−3 \text{ and }\dfrac{∂f}{∂y} = x+2y$

then the critical points $(x, y)$ are the common solutions of the equations

$\nonumber \begin{align} 2x+ y−3 &= 0 \\[4pt] \nonumber x +2y &= 0 \end{align}$

which have the unique common solution $(x, y) = (2,−1)$ . So $(2,−1)$ is the only critical point.

To use Theorem 2.6, we need the second-order partial derivatives:

$\nonumber \dfrac{∂^2 f}{∂x^2} = 2 ,\quad \dfrac{∂^2 f}{∂y^2} = 2 ,\quad \dfrac{∂^2 f}{∂y∂x} = 1$

and so

$\nonumber D = \dfrac{∂^2 f}{∂x^2} (2,−1) \dfrac{∂^2 f}{∂y^2} (2,−1)− \left ( \dfrac{∂^2 f}{∂y∂x} (2,−1)\right )^2 = (2)(2)−1^2 = 3 > 0$

and $\dfrac{∂^2 f}{∂x^2} (2,−1) = 2 > 0$ . Thus, $(2,−1)$ is a local minimum.
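
The computations in Example 2.19 can be checked with a short Python sketch. It solves the two linear gradient equations by Cramer's rule (one possible method, chosen here for brevity), then compares the value of $f$ at the critical point with its values at four nearby points. The variable names and the offset `0.1` are illustrative assumptions.

```python
def f(x, y):
    return x ** 2 + x * y + y ** 2 - 3 * x


# Gradient equations written as a linear system:
#   2x + 1y = 3
#   1x + 2y = 0
a11, a12, b1 = 2.0, 1.0, 3.0
a21, a22, b2 = 1.0, 2.0, 0.0

det = a11 * a22 - a12 * a21       # determinant of the coefficient matrix
x0 = (b1 * a22 - a12 * b2) / det  # Cramer's rule for x
y0 = (a11 * b2 - b1 * a21) / det  # Cramer's rule for y
print("critical point:", (x0, y0))
print("f at critical point:", f(x0, y0))

# Compare with four neighboring points (offset 0.1 is an arbitrary choice).
neighbors = [(x0 + 0.1, y0), (x0 - 0.1, y0), (x0, y0 + 0.1), (x0, y0 - 0.1)]
print("f is larger at all neighbors:", all(f(x, y) > f(x0, y0) for x, y in neighbors))
```

Because $f$ is quadratic, its second-order partial derivatives are constant, so the determinant of the coefficient matrix above coincides with $D = 3$ .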

Example 2.20

Find all local maxima and minima of $f (x, y) = x y− x^3 − y^2$ .

Solution

First find the critical points, i.e. where $\nabla f = \textbf{0}$ . Since

$\nonumber \dfrac{∂f}{∂x} = y−3x^2 \text{ and }\dfrac{∂f}{∂y} = x−2y$

then the critical points $(x, y)$ are the common solutions of the equations

$\nonumber \begin{align} y−3x^2 &= 0 \\[4pt] \nonumber x−2y &= 0 \end{align}$

The first equation yields $y = 3x^2$ . Substituting that into the second equation yields $x−6x^2 = 0$ , which has the solutions $x = 0 \text{ and }x = \dfrac{1}{6}$ . So $x = 0 \Rightarrow y = 3(0)^2 = 0 \text{ and }x = \dfrac{1}{6} \Rightarrow y = 3 \left (\dfrac{1}{6} \right )^2 = \dfrac{1}{12}$ . Thus, the critical points are $(x, y) = (0,0)$ and $(x, y) = \left ( \dfrac{1}{6} , \dfrac{1}{12} \right )$ .

To use Theorem 2.6, we need the second-order partial derivatives:

$\nonumber \dfrac{∂^2 f}{∂x^2} = −6x ,\quad \dfrac{∂^2 f}{∂y^2} = −2 ,\quad \dfrac{∂^2 f}{∂y∂x} = 1$

$\nonumber D = \dfrac{∂^2 f}{∂x^2} (0,0) \dfrac{∂^2 f}{∂y^2} (0,0)− \left (\dfrac{∂^2 f}{∂y∂x} (0,0)\right )^2 = (−6(0))(−2)−1^2 = −1 < 0$

and thus $(0,0)$ is a saddle point. Also,

$\nonumber D = \dfrac{∂^2 f}{∂x^2} \left ( \dfrac{1}{6} ,\dfrac{1}{12 } \right ) \dfrac{∂^2 f}{∂y^2} \left (\dfrac{1}{6} , \dfrac{1}{12}\right ) − \left (\dfrac{∂^2 f}{∂y∂x} \left ( \dfrac{1}{6} , \dfrac{1}{12}\right ) \right)^2 = (−6 \left (\dfrac{1}{6}\right )) (−2)−1^2 = 1 > 0$

and $\dfrac{∂^2 f}{∂x^2}\left (\dfrac{1}{6} , \dfrac{1}{12} \right ) = −1 < 0$ . Thus, $\left (\dfrac{1}{6} , \dfrac{1}{12}\right )$ is a local maximum.
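
The two critical points of Example 2.20 can be verified in exact arithmetic with Python's `fractions` module, which avoids rounding when working with $\dfrac{1}{6}$ and $\dfrac{1}{12}$ . The function names below are illustrative; the formulas for the partial derivatives are the ones computed in the example.

```python
from fractions import Fraction


def f(x, y):
    return x * y - x ** 3 - y ** 2


def fx(x, y):
    return y - 3 * x ** 2  # first partial derivative in x


def fy(x, y):
    return x - 2 * y       # first partial derivative in y


def discriminant(x, y):
    fxx, fyy, fxy = -6 * x, -2, 1  # second-order partials from the example
    return fxx * fyy - fxy ** 2


points = [(Fraction(0), Fraction(0)), (Fraction(1, 6), Fraction(1, 12))]
for x, y in points:
    print(f"({x}, {y}): fx = {fx(x, y)}, fy = {fy(x, y)}, "
          f"D = {discriminant(x, y)}, fxx = {-6 * x}, f = {f(x, y)}")
```

Both first partial derivatives evaluate to exactly zero at each point, and the signs of `D` and `fxx` reproduce the classification obtained above.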

Example 2.21

Find all local maxima and minima of $f (x, y) = (x−2)^4 +(x−2y)^2$ .

Solution

First find the critical points, i.e. where $\nabla f = \textbf{0}$ . Since

$\nonumber \dfrac{∂f}{∂x} = 4(x−2)^3 +2(x−2y) \text{ and }\dfrac{∂f}{∂y} = −4(x−2y)$

then the critical points $(x, y)$ are the common solutions of the equations

$\nonumber \begin{align} 4(x−2)^3 +2(x−2y) &= 0 \\[4pt] \nonumber −4(x−2y) &= 0 \end{align}$

The second equation yields $x = 2y$ . Substituting that into the first equation yields $4(2y−2)^3 = 0$ , which has the solution $y = 1$ , and so $x = 2(1) = 2$ . Thus, $(2,1)$ is the only critical point.

To use Theorem 2.6, we need the second-order partial derivatives:

$\nonumber \dfrac{∂^2 f}{∂x^2} = 12(x−2)^2 +2,\quad \dfrac{∂^2 f}{∂y^2} = 8 ,\quad \dfrac{∂^2 f}{∂y∂x} = −4$

$\nonumber D = \dfrac{∂^2 f}{∂x^2} (2,1) \dfrac{∂^2 f}{∂y^2} (2,1)− \left (\dfrac{∂^2 f}{∂y∂x} (2,1)\right )^2 = (2)(8)−(−4)^2 = 0$

and so the test fails. What can be done in this situation? Sometimes it is possible to examine the function to see directly the nature of a critical point. In our case, we see that $f (x, y) \ge 0$ for all $(x, y)$ , since $f (x, y)$ is the sum of fourth and second powers of numbers and hence must be nonnegative. But we also see that $f (2,1) = 0$ . Thus $f (x, y) \ge 0 = f (2,1)$ for all $(x, y)$ , and hence $(2,1)$ is in fact a global minimum for $f$ .
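
When the test fails, a quick numerical experiment can at least support the direct argument. The sketch below evaluates $f (x, y) = (x−2)^4 +(x−2y)^2$ on a small grid centered at $(2,1)$ and reports the smallest sampled value. The grid spacing `0.1` and its extent are arbitrary assumptions, and sampling finitely many points is only a sanity check, not a substitute for the inequality $f (x, y) \ge 0$ established above.

```python
def f(x, y):
    return (x - 2) ** 4 + (x - 2 * y) ** 2


# Grid of points (2 + 0.1*i, 1 + 0.1*j) for i, j in -5..5, centered at (2, 1).
grid = [(2 + 0.1 * i, 1 + 0.1 * j) for i in range(-5, 6) for j in range(-5, 6)]

smallest = min(grid, key=lambda p: f(*p))
print("smallest sampled value:", f(*smallest), "at", smallest)
print("all sampled values nonnegative:", all(f(x, y) >= 0 for x, y in grid))
```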

Example 2.22

Find all local maxima and minima of $f (x, y) = (x^2 + y^2 )e^{−(x^2+y^2)}$ .

Solution

First find the critical points, i.e. where $\nabla f = \textbf{0}$ . Since

$\nonumber \begin{align} \dfrac{∂f}{∂x} &= 2x(1−(x^2 + y^2 ))e^{−(x^2+y^2)} \\[4pt] \nonumber \dfrac{∂f}{∂y} &= 2y(1−(x^2 + y^2))e^{−(x^2+y^2)} \end{align}$

then the critical points are $(0,0)$ and all points $(x, y)$ on the unit circle $x^2 + y^2 = 1$ .

To use Theorem 2.6, we need the second-order partial derivatives:

$\nonumber \begin{align} \dfrac{∂^2 f}{∂x^2} &= 2[1−(x^2 + y^2 )−2x^2 −2x^2 (1−(x^2 + y^2))]e^{−(x^2+y^2)} \\[4pt] \nonumber \dfrac{∂^2 f}{∂y^2} &= 2[1−(x^2 + y^2 )−2y^2 −2y^2 (1−(x^2 + y^2))]e^{−(x^2+y^2)} \\[4pt] \nonumber \dfrac{∂^2 f}{∂y∂x} &= −4x y[2−(x^2 + y^2 )]e^{−(x^2+y^2)} \end{align}$

At $(0,0)$ , we have $D = 4 > 0$ and $\dfrac{∂^2 f}{∂x^2} (0,0) = 2 > 0$ , so $(0,0)$ is a local minimum. However, for points $(x, y)$ on the unit circle $x^2 + y^2 = 1$ , we have

$\nonumber D = (−4x^2 e^{−1} )(−4y^2 e^{−1} )−(−4x ye^{−1} )^2 = 0$

and so the test fails. If we look at the graph of $f (x, y)$ , as shown in Figure 2.5.2, it looks like we might have a local maximum for $(x, y)$ on the unit circle $x^2 + y^2 = 1$ . If we switch to using polar coordinates $(r,\theta )$ instead of $(x, y)$ in $\mathbb{R}^2$ , where $r^2 = x^2+y^2$ , then we see that we can write $f (x, y)$ as a function $g(r)$ of the variable $r$ alone: $g(r) = r^2 e^{−r^2}$ . Then $g ′ (r) = 2r(1 − r^2 )e^{−r^2}$ , so it has a critical point at $r = 1$ , and we can check that $g′′(1) = −4e^{−1} < 0$ , so the Second Derivative Test from single-variable calculus says that $r = 1$ is a local maximum. But $r = 1$ corresponds to the unit circle $x^2 + y^2 = 1$ . Thus, the points $(x, y)$ on the unit circle $x^2 + y^2 = 1$ are local maximum points for $f$ .

Figure 2.5.2: $f (x, y) = (x^2 + y^2 )e^{−(x^2+y^2)}$
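
The polar-coordinate argument in Example 2.22 can be checked numerically. The sketch below uses $g(r) = r^2 e^{−r^2}$ and the formula for $g ′ (r)$ from the example, then estimates $g′′(1)$ by a central difference of $g ′$ and compares it with the stated value $−4e^{−1}$ . The step size `h` and the sample radii `0.9` and `1.1` are illustrative assumptions.

```python
import math


def g(r):
    return r ** 2 * math.exp(-r ** 2)


def g_prime(r):
    # Formula for g'(r) taken from the example.
    return 2 * r * (1 - r ** 2) * math.exp(-r ** 2)


h = 1e-5  # step size for the finite-difference estimate (arbitrary choice)
g_second_at_1 = (g_prime(1 + h) - g_prime(1 - h)) / (2 * h)

print("g'(1)            =", g_prime(1.0))
print("g''(1) estimate  =", g_second_at_1)
print("-4/e             =", -4 / math.e)
print("g(0.9), g(1), g(1.1) =", g(0.9), g(1.0), g(1.1))
```

The sampled values of $g$ on either side of $r = 1$ are smaller than $g(1) = e^{−1}$ , in line with the conclusion that points on the unit circle are local maximum points for $f$ .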
