
# 6.9: Local Extrema. Maxima and Minima


We say that $$f : E^{\prime} \rightarrow E^{1}$$ has a local maximum (minimum) at $$\vec{p} \in E^{\prime}$$ iff $$f(\vec{p})$$ is the largest (least) value of $$f$$ on some globe $$G$$ about $$\vec{p};$$ more precisely, iff

$(\forall \vec{x} \in G | \vec{x} \neq \vec{p}) \quad \Delta f=f(\vec{x})-f(\vec{p})<0 \ (>0).$

We speak of an improper extremum if we only have $$\Delta f \leq 0( \geq 0)$$ on $$G.$$ In any case, all depends on the sign of $$\Delta f.$$

From Problem 6 in §1, recall the following necessary condition.

Theorem $$\PageIndex{1}$$

If $$f : E^{\prime} \rightarrow E^{1}$$ has a local extremum at $$\vec{p}$$ then $$D_{\vec{u}} f(\vec{p})=0$$ for all $$\vec{u} \neq \overrightarrow{0}$$ in $$E^{\prime}.$$

In the case $$E^{\prime}=E^{n}\left(C^{n}\right),$$ this means that $$d^{1} f(\vec{p} ; \cdot)=0$$ on $$E^{\prime}$$.

(Recall that $$d^{1} f(\vec{p} ; \vec{t})=\sum_{k=1}^{n} D_{k} f(\vec{p}) t_{k};$$ it vanishes if the $$D_{k} f(\vec{p})$$ do.)

Note 1. This condition is only necessary, not sufficient. For example, if $$f(x, y)=x y,$$ then $$d^{1} f(\overrightarrow{0} ; \cdot)=0 ;$$ yet $$f$$ has no extremum at $$\overrightarrow{0}.$$ (Verify!)
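
A quick numeric check of this example (a sketch in Python, our own): on every globe about $$\overrightarrow{0},$$ $$f=xy$$ takes both signs, so no extremum is possible there.

```python
# f(x, y) = x*y has d'f(0; .) = 0, yet it changes sign on every
# globe about the origin, so it has no local extremum there.
def f(x, y):
    return x * y

for r in (1.0, 0.1, 0.001):
    # two points inside the globe of radius r about the origin
    plus = f(r / 2, r / 2)     # on the line y = x:  f > 0
    minus = f(r / 2, -r / 2)   # on the line y = -x: f < 0
    assert plus > 0 > minus
print("f = x*y takes both signs arbitrarily close to the origin")
```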

Sufficient conditions were given in Theorem 2 of §5, for $$E^{\prime}=E^{1}.$$ We now take up $$E^{\prime}=E^{2}.$$

Theorem $$\PageIndex{2}$$

Let $$f : E^{2} \rightarrow E^{1}$$ be of class $$C D^{2}$$ on a globe $$G=G_{\vec{p}}(\delta).$$ Suppose $$d^{1} f(\vec{p} ; \cdot)=0$$ on $$E^{2}.$$ Set $$A=D_{11} f(\vec{p}), B=D_{12} f(\vec{p}),$$ and $$C=D_{22} f(\vec{p})$$.

Then the following statements are true.

(i) If $$A C>B^{2},$$ $$f$$ has a maximum or minimum at $$\vec{p},$$ according to whether

$$A<0$$ or $$A>0.$$

(ii) If $$A C<B^{2}, f$$ has no extremum at $$\vec{p}$$.

The case $$A C=B^{2}$$ is unresolved.

Proof

Let $$\vec{x} \in G$$ and $$\vec{u}=\vec{x}-\vec{p} \neq \overrightarrow{0}$$.

As $$d^{1} f(\vec{p} ; \cdot)=0,$$ Theorem 2 in §5 yields

$\Delta f=f(\vec{x})-f(\vec{p})=R_{1}=\frac{1}{2} d^{2} f(\vec{s} ; \vec{u}),$

with $$\vec{s} \in L(\vec{p}, \vec{x}) \subseteq G$$ (see Corollary 1 of §5). As $$f \in C D^{2},$$ we have $$D_{12} f=D_{21} f$$ on $$G$$ (Theorem 1 in §5). Thus by formula (4) in §5,

$\Delta f=\frac{1}{2} d^{2} f(\vec{s} ; \vec{u})=\frac{1}{2}\left[D_{11} f(\vec{s}) u_{1}^{2}+2 D_{12} f(\vec{s}) u_{1} u_{2}+D_{22} f(\vec{s}) u_{2}^{2}\right]. \tag{1}$

Now, as the partials involved are continuous, we can choose $$G=G_{\vec{p}}(\delta)$$ so small that the sign of expression (1) will not change if $$\vec{s}$$ is replaced by $$\vec{p}$$. Then the crucial sign of $$\Delta f$$ on $$G$$ coincides with that of

$D=A u_{1}^{2}+2 B u_{1} u_{2}+C u_{2}^{2} \tag{2}$

(with $$A, B,$$ and $$C$$ as stated in the theorem).

From (2) we obtain, by elementary algebra,

$A D=\left(A u_{1}+B u_{2}\right)^{2}+\left(A C-B^{2}\right) u_{2}^{2}, \tag{3}$

$C D=\left(C u_{1}+B u_{2}\right)^{2}+\left(A C-B^{2}\right) u_{2}^{2}. \tag{3'}$
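
In detail, the first identity is obtained by multiplying (2) by $$A$$ and completing the square (the second follows by symmetry in $$A$$ and $$C$$):

$A D=A^{2} u_{1}^{2}+2 A B u_{1} u_{2}+A C u_{2}^{2}=\left(A u_{1}+B u_{2}\right)^{2}-B^{2} u_{2}^{2}+A C u_{2}^{2}=\left(A u_{1}+B u_{2}\right)^{2}+\left(A C-B^{2}\right) u_{2}^{2}.$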

Clearly, if $$A C>B^{2},$$ the right-hand side of (3) is $$>0$$ for $$\vec{u} \neq \overrightarrow{0}$$ (note that $$A \neq 0,$$ since $$A C>B^{2} \geq 0);$$ so $$A D>0,$$ i.e., $$D$$ has the same sign as $$A.$$

Hence if $$A<0,$$ we also have $$\Delta f<0$$ on $$G,$$ and $$f$$ has a maximum at $$\vec{p}.$$ If $$A>0,$$ then $$\Delta f>0,$$ and $$f$$ has a minimum at $$\vec{p}$$.

Now let $$A C<B^{2}$$. We claim that no matter how small $$G=G_{\vec{p}}(\delta), \Delta f$$ changes sign as $$\vec{x}$$ varies in $$G,$$ and so $$f$$ has no extremum at $$\vec{p}$$.

Indeed, we have $$\vec{x}=\vec{p}+\vec{u},$$ $$\vec{u}=\left(u_{1}, u_{2}\right) \neq \overrightarrow{0}.$$ If $$u_{2}=0,$$ (3) shows that $$D$$ and $$\Delta f$$ have the same sign as $$A$$ (assuming $$A \neq 0$$).

But if $$u_{2} \neq 0$$ and $$u_{1}=-B u_{2} / A$$ (assuming $$A \neq 0),$$ then $$D$$ and $$\Delta f$$ have the sign opposite to that of $$A;$$ and $$\vec{x}$$ is still in $$G$$ if $$u_{2}$$ is small enough (how small?).

One proceeds similarly if $$C \neq 0$$ (interchange $$A$$ and $$C,$$ and use (3')).

Finally, if $$A=C=0,$$ then by (2), $$D=2 B u_{1} u_{2}$$ and $$B \neq 0$$ (since $$A C<B^{2})$$. Again $$D$$ and $$\Delta f$$ change sign as $$u_{1} u_{2}$$ does; so $$f$$ has no extremum at $$\vec{p}.$$ Thus all is proved.$$\quad \square$$

Briefly, the proof utilizes the fact that the trinomial (2) is sign-changing iff its discriminant $$B^{2}-A C$$ is positive, i.e., $$\left|\begin{array}{cc}{A} & {B} \\ {B} & {C}\end{array}\right|<0$$.
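
The test in Theorem 2 is easy to mechanize. The sketch below (all names are ours, and the second partials are merely approximated by central difference quotients, assuming $$f \in C D^{2}$$ near $$\vec{p}$$ with $$d^{1} f(\vec{p} ; \cdot)=0$$) classifies a critical point:

```python
def classify_critical_point(f, p, h=1e-4):
    """Classify a critical point p = (x, y) of f via Theorem 2.

    A = D11 f(p), B = D12 f(p), C = D22 f(p), estimated by central
    differences; assumes the first differential vanishes at p.
    """
    x, y = p
    A = (f(x + h, y) - 2 * f(x, y) + f(x - h, y)) / h**2
    C = (f(x, y + h) - 2 * f(x, y) + f(x, y - h)) / h**2
    B = (f(x + h, y + h) - f(x + h, y - h)
         - f(x - h, y + h) + f(x - h, y - h)) / (4 * h**2)
    if A * C > B**2:
        return "minimum" if A > 0 else "maximum"
    if A * C < B**2:
        return "no extremum"
    return "unresolved"   # AC = B^2: the test gives no answer

# x^2 + y^2 has a minimum at 0; x*y (Note 1) has no extremum.
print(classify_critical_point(lambda x, y: x * x + y * y, (0, 0)))
print(classify_critical_point(lambda x, y: x * y, (0, 0)))
```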

Note 2. Functions $$f : C \rightarrow E^{1}$$ (of one complex variable) are likewise covered by Theorem 2 if one treats them as functions on $$E^{2}$$ (of two real variables).

Functions of n variables. Here we must rely on the algebraic theory of so-called symmetric quadratic forms, i.e., polynomials $$P : E^{n} \rightarrow E^{1}$$ of the form

$P(\vec{u})=\sum_{j=1}^{n} \sum_{i=1}^{n} a_{i j} u_{i} u_{j},$

where $$\vec{u}=\left(u_{1}, \ldots, u_{n}\right) \in E^{n}$$ and $$a_{i j}=a_{j i} \in E^{1}$$.

We take for granted a theorem due to J. J. Sylvester (see S. Perlis, Theory of Matrices, 1952, p. 197), which may be stated as follows.

Let $$P : E^{n} \rightarrow E^{1}$$ be a symmetric quadratic form,

$P(\vec{u})=\sum_{j=1}^{n} \sum_{i=1}^{n} a_{i j} u_{i} u_{j}.$

(i) $$P>0$$ on all of $$E^{n}-\{\overrightarrow{0}\}$$ iff the following $$n$$ determinants $$A_{k}$$ are positive:

$A_{k}=\left|\begin{array}{cccc}{a_{11}} & {a_{12}} & {\cdots} & {a_{1 k}} \\ {a_{21}} & {a_{22}} & {\cdots} & {a_{2 k}} \\ {\vdots} & {\vdots} & {\ddots} & {\vdots} \\ {a_{k 1}} & {a_{k 2}} & {\cdots} & {a_{k k}}\end{array}\right|, \quad k=1,2, \ldots, n. \tag{4}$

(ii) We have $$P<0$$ on $$E^{n}-\{\overrightarrow{0}\}$$ iff $$(-1)^{k} A_{k}>0$$ for $$k=1,2, \ldots, n$$.
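
Sylvester's criterion is straightforward to check by machine. A minimal sketch (helper names are ours), computing the leading determinants $$A_{k}$$ of a small symmetric matrix:

```python
# Compute the leading principal minors A_1, ..., A_n of a symmetric
# matrix (a_ij) and read off whether P(u) = sum a_ij u_i u_j > 0
# for all u != 0 (Sylvester's criterion, clause (i)).
def det(M):
    # Laplace expansion along the first row; fine for tiny matrices.
    n = len(M)
    if n == 1:
        return M[0][0]
    total = 0.0
    for j in range(n):
        minor = [row[:j] + row[j + 1:] for row in M[1:]]
        total += (-1) ** j * M[0][j] * det(minor)
    return total

def leading_minors(A):
    return [det([row[:k] for row in A[:k]]) for k in range(1, len(A) + 1)]

A = [[2.0, -1.0], [-1.0, 2.0]]     # P(u) = 2u1^2 - 2u1u2 + 2u2^2
minors = leading_minors(A)
assert all(m > 0 for m in minors)  # A_1 = 2 > 0, A_2 = 3 > 0: P > 0
print(minors)
```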

Now we can extend Theorem 2 to the case $$f : E^{n} \rightarrow E^{1}$$. (This will also cover $$f : C^{n} \rightarrow E^{1},$$ treated as $$f : E^{2 n} \rightarrow E^{1}.)$$ The proof resembles that of Theorem 2.

Theorem $$\PageIndex{3}$$

Let $$f : E^{n} \rightarrow E^{1}$$ be of class $$C D^{2}$$ on some $$G=G_{\vec{p}}(\delta).$$ Suppose $$d^{1} f(\vec{p} ; \cdot)=0$$ on $$E^{n}.$$ Define the $$A_{k}$$ as in (4), with $$a_{i j}=D_{i j} f(\vec{p}),$$ $$i, j, k \leq n.$$ Then the following statements hold.

(i) $$f$$ has a local minimum at $$\vec{p}$$ if $$A_{k}>0$$ for $$k=1,2, \ldots, n$$.

(ii) $$f$$ has a local maximum at $$\vec{p}$$ if $$(-1)^{k} A_{k}>0$$ for $$k=1, \ldots, n$$.

(iii) $$f$$ has no extremum at $$\vec{p}$$ if the expression

$P(\vec{u})=\sum_{j=1}^{n} \sum_{i=1}^{n} a_{i j} u_{i} u_{j}$

is $$>0$$ for some $$\vec{u} \in E^{n}$$ and $$<0$$ for others (i.e., $$P$$ changes sign on $$E^{n})$$.

Proof

Let again $$\vec{x} \in G, \vec{u}=\vec{x}-\vec{p} \neq \overrightarrow{0},$$ and use Taylor's theorem to obtain

$\Delta f=f(\vec{x})-f(\vec{p})=R_{1}=\frac{1}{2} d^{2} f(\vec{s} ; \vec{u})=\frac{1}{2} \sum_{j=1}^{n} \sum_{i=1}^{n} D_{i j} f(\vec{s}) u_{i} u_{j}, \tag{5}$

with $$\vec{s} \in L(\vec{x}, \vec{p})$$.

As $$f \in C D^{2},$$ the partials $$D_{i j} f$$ are continuous on $$G.$$ Thus we can make $$G$$ so small that the sign of the last double sum does not change if $$\vec{s}$$ is replaced by $$\vec{p}$$. Hence the sign of $$\Delta f$$ on $$G$$ is the same as that of $$P(\vec{u})=\sum_{j=1}^{n} \sum_{i=1}^{n} a_{i j} u_{i} u_{j}$$, with the $$a_{i j}$$ as stated in the theorem.

The quadratic form $$P$$ is symmetric since $$a_{i j}=a_{j i}$$ by Theorem 1 in §5. Thus by Sylvester's theorem stated above, one easily obtains our assertions (i) and (ii). Indeed, they are immediate from clauses (i) and (ii) of that theorem.

Now, for (iii), suppose $$P(\vec{u})>0>P(\vec{v}),$$ i.e.,

$\sum_{j=1}^{n} \sum_{i=1}^{n} a_{i j} u_{i} u_{j}>0>\sum_{j=1}^{n} \sum_{i=1}^{n} a_{i j} v_{i} v_{j} \quad \text { for some } \vec{u}, \vec{v} \in E^{n}-\{\overrightarrow{0}\}.$

If here $$\vec{u}$$ and $$\vec{v}$$ are replaced by $$t \vec{u}$$ and $$t \vec{v}(t \neq 0),$$ then $$u_{i} u_{j}$$ and $$v_{i} v_{j}$$ turn into $$t^{2} u_{i} u_{j}$$ and $$t^{2} v_{i} v_{j},$$ respectively. Hence

$P(t \vec{u})=t^{2} P(\vec{u})>0>t^{2} P(\vec{v})=P(t \vec{v}).$

Now, for any $$t \in(0, \delta /|\vec{u}|),$$ the point $$\vec{x}=\vec{p}+t \vec{u}$$ lies on the $$\vec{u}$$-directed line through $$\vec{p},$$ inside $$G=G_{\vec{p}}(\delta).$$ (Why?) Similarly for the point $$\vec{x}^{\prime}=\vec{p}+t \vec{v}.$$

Hence for such $$\vec{x}$$ and $$\vec{x}^{\prime},$$ Taylor's theorem again yields formulas analogous to (5) for some $$\vec{s} \in L(\vec{p}, \vec{x})$$ and $$\vec{s}^{\prime} \in L\left(\vec{p}, \vec{x}^{\prime}\right)$$ lying on the same two lines. It again follows that for small $$\delta$$,

$f(\vec{x})-f(\vec{p})>0>f\left(\vec{x}^{\prime}\right)-f(\vec{p}),$

just as $$P(\vec{u})>0>P(\vec{v})$$.

Thus $$\Delta f$$ changes sign on $$G_{\vec{p}}(\delta),$$ and (iii) is proved.$$\quad \square$$

Note 3. Still unresolved are cases in which $$P(\vec{u})$$ vanishes for some $$\vec{u} \neq \overrightarrow{0}$$ without changing its sign; e.g., $$P(\vec{u})=\left(u_{1}+u_{2}+u_{3}\right)^{2},$$ which $$=0$$ for $$\vec{u}=(1,1,-2)$$. Then the answer depends on the higher-order terms of the Taylor formula. In particular, if $$d^{1} f(\vec{p} ; \cdot)=d^{2} f(\vec{p} ; \cdot)=0$$ on $$E^{n},$$ then $$\Delta f=R_{2}=\frac{1}{6} d^{3} f(\vec{s} ; \vec{u})$$ for some $$\vec{s} \in L(\vec{p}, \vec{x}),$$ etc.
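
A minimal illustration of such an unresolved case (our examples, not the text's): for $$f=x^{4}+y^{4}$$ and $$g=x^{4}-y^{4},$$ all second-order partials vanish at $$\overrightarrow{0}$$ (so $$P \equiv 0$$), yet $$f$$ has a minimum there while $$g$$ has no extremum.

```python
# Second-order test is silent for both of these, since every second
# partial vanishes at the origin; the behavior nevertheless differs.
f = lambda x, y: x**4 + y**4
g = lambda x, y: x**4 - y**4

eps = 1e-3
assert f(eps, eps) > f(0, 0) and f(-eps, eps) > f(0, 0)  # f: minimum at 0
assert g(eps, 0) > 0 > g(0, eps)                         # g: changes sign
print("P = 0 identically in both cases, yet only f has an extremum")
```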

Note 4. The largest or least value of $$f$$ on a set $$A$$ (sometimes called the absolute maximum or minimum) may occur at some noninterior (e.g., boundary) point $$\vec{p} \in A,$$ and then it fails to be among the local extrema (where, by definition, a globe $$G_{\vec{p}} \subseteq A$$ is presupposed). Thus to find absolute extrema, one must also explore the behaviour of $$f$$ at noninterior points of $$A.$$

By Theorem 1, local extrema can occur only at so-called critical points $$\vec{p}$$, i.e., those at which all directional derivatives vanish (or fail to exist, in which case $$D_{\vec{u}} f(\vec{p})=0$$ by convention).

In practice, to find such points in $$E^{n}\left(C^{n}\right),$$ one equates the partials $$D_{k} f$$ $$(k \leq n)$$ to $$0.$$ Then one uses Theorems 2 and 3 or other considerations to determine whether an extremum really exists.

Examples

(A) Find the largest value of

$f(x, y)=\sin x+\sin y-\sin (x+y)$

on the set $$A \subseteq E^{2}$$ bounded by the lines $$x=0, y=0$$ and $$x+y=2 \pi$$.

We have

$D_{1} f(x, y)=\cos x-\cos (x+y) \text { and } D_{2} f(x, y)=\cos y-\cos (x+y).$

Inside the triangle $$A,$$ both partials vanish only at the point $$\left(\frac{2 \pi}{3}, \frac{2 \pi}{3}\right),$$ at which $$f=\frac{3}{2} \sqrt{3}.$$ On the boundary of $$A$$ (i.e., on the lines $$x=0,$$ $$y=0,$$ and $$x+y=2 \pi$$), $$f=0.$$ Thus even without using Theorem 2, it is evident that $$f$$ attains its largest value,

$f\left(\frac{2 \pi}{3}, \frac{2 \pi}{3}\right)=\frac{3}{2} \sqrt{3},$

at this unique critical point.
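
A quick numeric spot-check of Example (A) (the function and points come from the text; the tolerance is ours):

```python
import math

# At the interior critical point (2*pi/3, 2*pi/3),
# f = sin x + sin y - sin(x + y) equals (3/2)*sqrt(3),
# and f vanishes on all three boundary lines of the triangle.
def f(x, y):
    return math.sin(x) + math.sin(y) - math.sin(x + y)

p = (2 * math.pi / 3, 2 * math.pi / 3)
assert abs(f(*p) - 1.5 * math.sqrt(3)) < 1e-12

for t in (0.3, 1.0, 2.5):
    assert abs(f(0, t)) < 1e-12                 # on x = 0
    assert abs(f(t, 0)) < 1e-12                 # on y = 0
    assert abs(f(t, 2 * math.pi - t)) < 1e-12   # on x + y = 2*pi
print("Example (A) checks out")
```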

(B) Find the largest and the least value of

$f(x, y, z)=a^{2} x^{2}+b^{2} y^{2}+c^{2} z^{2}-\left(a x^{2}+b y^{2}+c z^{2}\right)^{2},$

on the condition that $$x^{2}+y^{2}+z^{2}=1$$ and $$a>b>c>0$$.

As $$z^{2}=1-x^{2}-y^{2},$$ we can eliminate $$z$$ from $$f(x, y, z)$$ and replace $$f$$ by $$F : E^{2} \rightarrow E^{1}:$$

$F(x, y)=\left(a^{2}-c^{2}\right) x^{2}+\left(b^{2}-c^{2}\right) y^{2}+c^{2}-\left[(a-c) x^{2}+(b-c) y^{2}+c\right]^{2}.$

(Explain!) For $$F,$$ we seek the extrema on the disc $$\overline{G}=\overline{G}_{0}(1) \subset E^{2},$$ where $$x^{2}+y^{2} \leq 1$$ (so as not to violate the condition $$x^{2}+y^{2}+z^{2}=1)$$.

Equating to 0 the two partials

$\begin{array}{l}{D_{1} F(x, y)=2 x(a-c)\left\{(a+c)-2\left[(a-c) x^{2}+(b-c) y^{2}+c\right]\right\}=0}, \\ {D_{2} F(x, y)=2 y(b-c)\left\{(b+c)-2\left[(a-c) x^{2}+(b-c) y^{2}+c\right]\right\}=0}\end{array}$

and solving this system of equations, we find these critical points inside $$G:$$

(1) $$x=y=0$$ ($$F=0)$$;

(2) $$x=0, y=\pm 2^{-\frac{1}{2}}\left(F=\frac{1}{4}(b-c)^{2}\right);$$ and

(3) $$x=\pm 2^{-\frac{1}{2}}, y=0\left(F=\frac{1}{4}(a-c)^{2}\right)$$.

(Verify!)

Now, for the boundary of $$\overline{G},$$ i.e., the circle $$x^{2}+y^{2}=1,$$ repeat this process: substitute $$y^{2}=1-x^{2}$$ in the formula for $$F(x, y),$$ thus reducing it to

$h(x)=\left(a^{2}-b^{2}\right) x^{2}+b^{2}-\left[(a-b) x^{2}+b\right]^{2}, \quad h : E^{1} \rightarrow E^{1},$

on the interval $$[-1,1] \subset E^{1}.$$ In $$(-1,1)$$ the derivative

$h^{\prime}(x)=2(a-b)^{2} x\left(1-2 x^{2}\right)$

vanishes only when

(4) $$x=0$$ ($$h=0),$$ and

(5) $$x=\pm 2^{-\frac{1}{2}}\left(h=\frac{1}{4}(a-b)^{2}\right)$$.

Finally, at the endpoints of $$[-1,1],$$ we have

(6) $$x=\pm 1$$ ($$h=0)$$.

Comparing the resulting function values in all six cases, we conclude that the least of them is $$0,$$ while the largest is $$\frac{1}{4}(a-c)^{2}.$$ These are the desired least and largest values of $$f,$$ subject to the conditions stated. They are attained, respectively, at the points

$(0,0, \pm 1),(0, \pm 1,0),( \pm 1,0,0), \text { and }\left( \pm 2^{-\frac{1}{2}}, 0, \pm 2^{-\frac{1}{2}}\right).$

Again, the use of Theorems 2 and 3 was redundant. However, we suggest as an exercise that the reader test the critical points of $$F$$ by using Theorem 2.
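
As a further check, one can test Example (B) numerically for concrete values, say $$a=3, b=2, c=1$$ (our choice, purely illustrative): the constrained extrema should be $$0$$ and $$\frac{1}{4}(a-c)^{2}=1,$$ attained at the points listed above.

```python
import math

a, b, c = 3.0, 2.0, 1.0

def f(x, y, z):
    return (a*a*x*x + b*b*y*y + c*c*z*z
            - (a*x*x + b*y*y + c*z*z) ** 2)

r = 2 ** -0.5
# the claimed extremal points on the sphere x^2 + y^2 + z^2 = 1
assert abs(f(0, 0, 1)) < 1e-12 and abs(f(1, 0, 0)) < 1e-12
assert abs(f(r, 0, r) - (a - c) ** 2 / 4) < 1e-12

# A coarse grid over the sphere never beats these two values.
vals = [f(math.sin(u) * math.cos(v), math.sin(u) * math.sin(v), math.cos(u))
        for u in [k * 0.05 for k in range(63)]
        for v in [k * 0.05 for k in range(126)]]
assert min(vals) >= -1e-9
assert max(vals) <= (a - c) ** 2 / 4 + 1e-9
print("Example (B) checks out for a=3, b=2, c=1")
```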

Caution. Theorems 1 to 3 apply to functions of independent variables only. In Example (B), $$x, y, z$$ were made interdependent by the imposed equation

$x^{2}+y^{2}+z^{2}=1$

(which geometrically confines $$(x, y, z)$$ to the surface of $$G_{\overrightarrow{0}}(1)$$ in $$E^{3}$$), so that one of them, $$z,$$ could be eliminated. Only then can Theorems 1 to 3 be used.