$$\newcommand{\id}{\mathrm{id}}$$ $$\newcommand{\Span}{\mathrm{span}}$$ $$\newcommand{\kernel}{\mathrm{null}\,}$$ $$\newcommand{\range}{\mathrm{range}\,}$$ $$\newcommand{\RealPart}{\mathrm{Re}}$$ $$\newcommand{\ImaginaryPart}{\mathrm{Im}}$$ $$\newcommand{\Argument}{\mathrm{Arg}}$$ $$\newcommand{\norm}[1]{\| #1 \|}$$ $$\newcommand{\inner}[2]{\langle #1, #2 \rangle}$$ $$\newcommand{\Span}{\mathrm{span}}$$

# 2: Extrema Subject to One Constraint


Here is Theorem [theorem:1] with $$m=1$$.

Theorem $$\PageIndex1$$

Suppose that $$n>1.$$ If  $${\bf X}_0$$ is a local extreme point of $$f$$ subject to $$g({\bf X})=0$$ and $$g_{x_{r}}({\bf X}_0)\ne0$$ for some $$r\in\{1,2,\dots,n\},$$ then there is a constant $$\lambda$$ such that

$\label{eq:7} f_{x_i}({\bf X}_0)-\lambda g_{x_i}({\bf X}_0)=0,\quad$ $$1\le i\le n;$$ thus, $${\bf X}_0$$ is a critical point of $$f-\lambda g$$.
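
Equation \ref{eq:7} can equivalently be read as a statement about gradients. Writing $$\nabla f$$ for the vector of first partial derivatives of $$f$$ (notation introduced here only for illustration, not used elsewhere in this section), the $$n$$ scalar equations say that

$\nabla f({\bf X}_0)=\lambda\,\nabla g({\bf X}_0), \nonumber$

that is, at a constrained local extreme point the gradient of $$f$$ is a scalar multiple of the gradient of $$g$$.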

For notational convenience, let $$r=1$$ and denote

${\bf U}=(x_2,x_{3},\dots, x_{n})\text{ and } {\bf U}_0=(x_{20},x_{30},\dots, x_{n0}).$

Since $$g_{x_1}({\bf X}_0)\ne0$$, the Implicit Function Theorem (Corollary 6.4.2, p. 423) implies that there is a unique continuously differentiable function $$h=h({\bf U}),$$ defined on a neighborhood $$N \subset{\mathbb R}^{n-1}$$ of $${\bf U}_0,$$ such that $$(h({\bf U}),{\bf U})\in D$$ for all $${\bf U}\in N$$, $$h({\bf U}_0)=x_{10}$$, and

$\label{eq:8} g(h({\bf U}),{\bf U})=0,\quad {\bf U}\in N.$

Now define

$\label{eq:9} \lambda=\frac{f_{x_1}({\bf X}_0)}{g_{x_1}({\bf X}_0)},$

which is permissible, since $$g_{x_1}({\bf X}_0)\ne0$$. This implies Equation \ref{eq:7} with $$i=1$$. If $$i> 1$$, differentiating Equation \ref{eq:8} with respect to $$x_i$$ yields

$\label{eq:10} g_{x_i}(h({\bf U}),{\bf U}) + g_{x_1}(h({\bf U}),{\bf U}) \frac{\partial h({\bf U})}{\partial x_i}=0,\quad {\bf U}\in N.$

Also,

$\label{eq:11} \frac{\partial}{\partial x_i}f(h({\bf U}),{\bf U})= f_{x_i}(h({\bf U}),{\bf U})+ f_{x_1}(h({\bf U}),{\bf U}) \frac{\partial h({\bf U})}{\partial x_i},\quad {\bf U}\in N.$

Since $$(h({\bf U}_0),{\bf U}_0)={\bf X}_0$$, Equation \ref{eq:10} implies that

$\label{eq:12} \frac{\partial g({\bf X}_0)}{\partial x_i}+ \frac{\partial g({\bf X}_0)}{\partial x_1} \frac{\partial h({\bf U}_0)}{\partial x_i}=0.$

If  $${\bf X}_0$$ is a local extreme point of $$f$$ subject to $$g({\bf X})=0$$, then $${\bf U}_0$$ is an unconstrained local extreme point of $$f(h({\bf U}),{\bf U})$$; therefore, Equation \ref{eq:11} implies that

$\label{eq:13} \frac{\partial f({\bf X}_0)}{\partial x_i}+ \frac{\partial f({\bf X}_0)}{\partial x_1} \frac{\partial h({\bf U}_0)}{\partial x_i}=0.$
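
It may help to view Equations \ref{eq:12} and \ref{eq:13} together as one matrix equation (a restatement for illustration, with Equation \ref{eq:13} as the first row and Equation \ref{eq:12} as the second):

$\left[\begin{array}{cc} \displaystyle{\frac{\partial f({\bf X}_0)}{\partial x_i}}& \displaystyle{\frac{\partial f({\bf X}_0)}{\partial x_1}}\\\\ \displaystyle{\frac{\partial g({\bf X}_0)}{\partial x_i}}& \displaystyle{\frac{\partial g({\bf X}_0)}{\partial x_1}} \end{array}\right] \left[\begin{array}{c} 1\\\\ \displaystyle{\frac{\partial h({\bf U}_0)}{\partial x_i}} \end{array}\right]= \left[\begin{array}{c} 0\\\\0 \end{array}\right]. \nonumber$

Because its first component is $$1$$, the vector on the left is a nontrivial solution of a homogeneous system with this coefficient matrix.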

Since a linear homogeneous system

$\left[\begin{array}{ccccccc} a&b\\c&d \end{array}\right] \left[\begin{array}{ccccccc} u\\v \end{array}\right]= \left[\begin{array}{ccccccc} 0\\0 \end{array}\right]$

has a nontrivial solution if and only if

$\left|\begin{array}{ccccccc} a&b\\c&d \end{array}\right|=0,$ (Theorem 6.1.15), Equations \ref{eq:12} and \ref{eq:13} imply that

$\left|\begin{array}{ccccccc} \displaystyle{\frac{\partial f({\bf X}_0)}{\partial x_i}}& \displaystyle{\frac{\partial f({\bf X}_0)}{\partial x_1}}\\\\ \displaystyle{\frac{\partial g({\bf X}_0)}{\partial x_i}}& \displaystyle{\frac{\partial g({\bf X}_0)}{\partial x_1}} \end{array}\right|=0,\quad\text{so}\quad \left|\begin{array}{ccccccc} \displaystyle{\frac{\partial f({\bf X}_0)}{\partial x_i}}& \displaystyle{\frac{\partial g({\bf X}_0)}{\partial x_i}}\\\\ \displaystyle{\frac{\partial f({\bf X}_0)}{\partial x_1}}& \displaystyle{\frac{\partial g({\bf X}_0)}{\partial x_1}} \end{array}\right|=0,$

since the determinants of a matrix and its transpose are equal. Therefore, the system

$\left[\begin{array}{ccccccc} \displaystyle{\frac{\partial f({\bf X}_0)}{\partial x_i}}& \displaystyle{\frac{\partial g({\bf X}_0)}{\partial x_i}}\\\\ \displaystyle{\frac{\partial f({\bf X}_0)}{\partial x_1}}& \displaystyle{\frac{\partial g({\bf X}_0)}{\partial x_1}} \end{array}\right] \left[\begin{array}{ccccccc} u\\v \end{array}\right]= \left[\begin{array}{ccccccc} 0\\0 \end{array}\right]$

has a nontrivial solution (Theorem 6.1.15). Since $$g_{x_1}({\bf X}_0)\ne0$$, $$u$$ must be nonzero in a nontrivial solution: if $$u=0$$, the second equation reduces to $$vg_{x_1}({\bf X}_0)=0$$, which forces $$v=0$$. Hence, we may assume that $$u=1$$, so

$\label{eq:14} \left[\begin{array}{ccccccc} \displaystyle{\frac{\partial f({\bf X}_0)}{\partial x_i}}& \displaystyle{\frac{\partial g({\bf X}_0)}{\partial x_i}}\\\\ \displaystyle{\frac{\partial f({\bf X}_0)}{\partial x_1}}& \displaystyle{\frac{\partial g({\bf X}_0)}{\partial x_1}} \end{array}\right] \left[\begin{array}{ccccccc} 1\\ v \end{array}\right]= \left[\begin{array}{ccccccc} 0\\0 \end{array}\right].$

In particular,

$\frac{\partial f({\bf X}_0)}{\partial x_1}+ v\frac{\partial g({\bf X}_0)}{\partial x_1}=0, \quad\text{so}\quad -v=\frac{f_{x_1}({\bf X}_0)}{g_{x_1}({\bf X}_0)}.$

Now Equation \ref{eq:9} implies that $$-v=\lambda$$, and Equation \ref{eq:14} becomes

$\left[\begin{array}{ccccccc} \displaystyle{\frac{\partial f({\bf X}_0)}{\partial x_i}}& \displaystyle{\frac{\partial g({\bf X}_0)}{\partial x_i}}\\\\ \displaystyle{\frac{\partial f({\bf X}_0)}{\partial x_1}}& \displaystyle{\frac{\partial g({\bf X}_0)}{\partial x_1}} \end{array}\right] \left[\begin{array}{rcccccc} 1\\ -\lambda \end{array}\right]= \left[\begin{array}{ccccccc} 0\\0 \end{array}\right].$

Computing the topmost entry of the vector on the left yields Equation \ref{eq:7}.
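
Written out, the first row of the matrix product above is

$\frac{\partial f({\bf X}_0)}{\partial x_i}-\lambda\frac{\partial g({\bf X}_0)}{\partial x_i}=0, \nonumber$

which is Equation \ref{eq:7} for the index $$i>1$$ under consideration; the second row reproduces the case $$i=1$$ that follows from Equation \ref{eq:9}.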

Example $$\PageIndex1$$

Find the point $$(x_0,y_0)$$ on the line

$ax+by=d$

closest to a given point $$(x_1,y_1)$$.

Solution

We must minimize $$\sqrt{(x-x_1)^2+(y-y_1)^2}$$ subject to the constraint. This is equivalent to minimizing $$(x-x_1)^2+(y-y_1)^2$$ subject to the constraint, which is simpler. For this, we could let

$L=(x-x_1)^2+(y-y_1)^2-\lambda (ax+by-d); \nonumber$

however,

$L=\frac{(x-x_1)^2+(y-y_1)^2}2-\lambda (ax+by) \nonumber$

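is better, because the factor $$1/2$$ cancels the $$2$$ produced by differentiation and the constant $$\lambda d$$ does not affect the partial derivatives. Since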

$L_{x}=x-x_1-\lambda a\quad\text{and}\quad L_{y}=y-y_1-\lambda b, \nonumber$

$$(x_0,y_0)=(x_1+\lambda a, y_1+\lambda b)$$, where we must choose $$\lambda$$ so that $$ax_0+by_0=d$$. Therefore,

$ax_0+by_0=ax_1+by_1+\lambda(a^2+b^2)=d, \nonumber$

so

$\lambda= \frac{d-ax_1-by_1}{a^2+b^2}, \nonumber$

$x_0=x_1+\frac{(d-ax_1-by_1)a}{a^2+b^2}, \text{ and } y_0=y_1+\frac{(d-ax_1-by_1)b}{a^2+b^2}. \nonumber$

The distance from $$(x_1,y_1)$$ to the line is

$\sqrt{(x_0-x_1)^2+(y_0-y_1)^2}= \frac{|d-ax_1-by_1|}{\sqrt{a^2+b^2}}. \nonumber$
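
As a sanity check on these formulas, consider a numerical instance with values chosen here purely for illustration (they do not come from the text): the line $$3x+4y=10$$ and the point $$(x_1,y_1)=(0,0)$$, so that $$a=3$$, $$b=4$$, and $$d=10$$. Then

$\lambda=\frac{10-0-0}{3^2+4^2}=\frac25,\quad (x_0,y_0)=\left(\frac65,\frac85\right),\quad 3\cdot\frac65+4\cdot\frac85=10, \nonumber$

and the distance is

$\sqrt{\left(\frac65\right)^2+\left(\frac85\right)^2}=\sqrt{\frac{100}{25}}=2=\frac{|10|}{\sqrt{3^2+4^2}}, \nonumber$

in agreement with the general formula.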

Example $$\PageIndex2$$

Find the extreme values of $$f(x,y)=2x+y$$ subject to

$x^2+y^2=4.$

Solution

Let

$L=2x+y-\frac{\lambda}2(x^2+y^2);$

then

$L_{x}=2-\lambda x\text{ and } L_{y}=1-\lambda y,$

so $$(x_0,y_0)=(2/\lambda,1/\lambda)$$. Since $$x_0^2+y_0^2=4$$, $$\lambda=\pm \sqrt{5}/2$$. Hence, the constrained maximum is $$2\sqrt{5}$$, attained at $$(4/\sqrt{5},2/\sqrt{5})$$, and the constrained minimum is $$-2\sqrt{5}$$, attained at $$(-4/\sqrt{5},-2/\sqrt{5})$$.
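
The arithmetic behind these values can be spelled out as follows. Substituting $$(x_0,y_0)=(2/\lambda,1/\lambda)$$ into the constraint and into $$f$$ gives

$x_0^2+y_0^2=\frac{4}{\lambda^2}+\frac{1}{\lambda^2}=\frac{5}{\lambda^2}=4,\quad\text{so}\quad \lambda^2=\frac54, \nonumber$

$f(x_0,y_0)=2\cdot\frac{2}{\lambda}+\frac{1}{\lambda}=\frac{5}{\lambda}=\pm\frac{5}{\sqrt{5}/2}=\pm2\sqrt{5}. \nonumber$

The sign $$\lambda=\sqrt5/2$$ gives the maximum and $$\lambda=-\sqrt5/2$$ gives the minimum; both are attained because $$f$$ is continuous on the circle, which is closed and bounded.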

Example $$\PageIndex{3}$$

Find the point in the plane

$\label{eq:15} 3x+4y+z=1$ closest to $$(-1,1,1)$$.

Solution

We must minimize

$f(x,y,z)=(x+1)^2+(y-1)^2+(z-1)^2$

subject to Equation \ref{eq:15}. Let

$L=\frac{(x+1)^2+(y-1)^2+(z-1)^2}2-\lambda(3x+4y+z); \nonumber$

then

$L_{x}= x+1-3\lambda,\quad L_{y}= y-1-4\lambda,\text{ and } L_{z}= z-1-\lambda,$

so

$x_0=-1+3\lambda,\quad y_0=1+4\lambda,\quad z_0=1+\lambda.$

From Equation \ref{eq:15},

$3(-1+3\lambda)+4(1+4\lambda)+(1+\lambda)-1=1+26\lambda=0,$ so $$\lambda=-1/26$$ and

$(x_0,y_0,z_0)= \left(-\frac{29}{26},\frac{22}{26},\frac{25}{26}\right).$

The distance from $$(x_0,y_0,z_0)$$ to $$(-1,1,1)$$ is

$\sqrt{(x_0+1)^2+(y_0-1)^2+(z_0-1)^2}=\frac1{\sqrt{26}}.$
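
Both the constraint and the distance can be checked directly from these values. For the constraint,

$3\left(-\frac{29}{26}\right)+4\left(\frac{22}{26}\right)+\frac{25}{26}=\frac{-87+88+25}{26}=1, \nonumber$

and, since $$(x_0+1,y_0-1,z_0-1)=\lambda(3,4,1)$$ with $$\lambda=-1/26$$,

$\sqrt{(x_0+1)^2+(y_0-1)^2+(z_0-1)^2}=|\lambda|\sqrt{3^2+4^2+1^2}=\frac{\sqrt{26}}{26}=\frac1{\sqrt{26}}. \nonumber$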

Example $$\PageIndex{4}$$

Assume that $$n\ge 2$$.

(a) Find the extreme values of $$\displaystyle{\sum_{i=1}^{n}x_i}$$ subject to $$\displaystyle{\sum_{i=1}^{n}x_i^2=1}$$.

(b) Find the minimum value of $$\displaystyle{\sum_{i=1}^{n}x_i^2}$$ subject to $$\displaystyle{\sum_{i=1}^{n}x_i=1}$$.

Solution

(a) Let

$L= \sum_{i=1}^{n}x_i-\frac{\lambda}2\sum_{i=1}^{n}x_i^2;$ then

$L_{x_i}=1-\lambda x_i, \quad\text{so}\quad x_{i0}=\frac1{\lambda}, \quad 1\le i\le n.$

Hence, $$\displaystyle{\sum_{i=1}^{n}x_{i0}^2}=n/\lambda^2$$, so $$\lambda=\pm\sqrt{n}$$  and

$(x_{10},x_{20},\dots,x_{n0})= \pm\left(\frac1{\sqrt{n}},\frac1{\sqrt{n}}, \dots, \frac1{\sqrt{n}}\right).$

Therefore, the constrained maximum is $$\sqrt{n}$$ and the constrained minimum is $$-\sqrt{n}$$.

(b) Let

$L=\frac12 \sum_{i=1}^{n}x_i^2-\lambda\sum_{i=1}^{n}x_i;$ then

$L_{x_i}=x_i-\lambda, \quad\text{so}\quad x_{i0}=\lambda,\quad 1\le i\le n.$

Hence, $$\displaystyle{\sum_{i=1}^{n}x_{i0}}=n\lambda=1$$, so $$x_{i0}=\lambda=1/n$$ and the constrained minimum is

$\displaystyle{\sum_{i=1}^{n}x_{i0}^2}=\frac1{n}.$ There is no constrained maximum. (Why?)
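
The value $$1/n$$ can also be confirmed without Lagrange multipliers. The following identity is an independent check added here as a sketch, not part of the original solution: if $$\sum_{i=1}^{n}x_i=1$$, then

$\sum_{i=1}^{n}\left(x_i-\frac1n\right)^2=\sum_{i=1}^{n}x_i^2-\frac2n\sum_{i=1}^{n}x_i+\frac1n=\sum_{i=1}^{n}x_i^2-\frac1n, \nonumber$

so $$\sum_{i=1}^{n}x_i^2\ge 1/n$$, with equality if and only if $$x_i=1/n$$ for $$1\le i\le n$$.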

Example $$\PageIndex{5}$$

Show that

$x^{1/p}y^{1/q} \le \frac{x}{p}+\frac{y}{q}, \quad x,y \ge 0, \nonumber$

if

$\label{eq:16} \frac1{p} +\frac1{q} = 1, \quad p > 0, \text{ and } q > 0.$

Solution

We first find the maximum of

$f(x,y) = x^{1/p}y^{1/q} \nonumber$

subject to

$\label{eq:17} \frac{x}{p}+\frac{y}{q} = \sigma, \quad x \ge 0, \quad y \ge 0,$

where $$\sigma$$ is a fixed but arbitrary positive number. Since $$f$$ is continuous and the line segment defined by Equation \ref{eq:17} is closed and bounded, $$f$$ must assume a maximum at some point $$(x_0,y_0)$$ on the segment. This point cannot be an endpoint of the segment, since $$f(p\sigma,0) = f(0,q\sigma)=0$$ while $$f>0$$ at every other point of the segment. Therefore, $$(x_0,y_0)$$ is in the open first quadrant.

Let

$L = x^{1/p}y^{1/q} -\lambda \left(\frac{x}{p}+\frac{y}{q}\right). \nonumber$

Then

$L_x = \frac1{px} f(x,y) - \frac{\lambda}{p} \quad\text{and}\quad L_y = \frac1{qy} f(x,y) - \frac{\lambda}{q}. \nonumber$

Setting $$L_x=L_y=0$$ at $$(x_0,y_0)$$ yields $$x_0 = y_0=f(x_0,y_0)/\lambda$$. Now Equations \ref{eq:16} and \ref{eq:17} imply that $$x_0 =y_0 = \sigma$$. Therefore,

$f(x,y) \le f(\sigma,\sigma) = \sigma^{1/p}\sigma^{1/q} = \sigma=\frac{x}{p}+\frac{y}{q}. \nonumber$

This can be generalized (Exercise [exer:53]). It can also be used to generalize Schwarz’s inequality (Exercise [exer:54]).
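
For a familiar special case (chosen here only as an illustration), take $$p=q=2$$, which satisfies Equation \ref{eq:16}. The inequality then reduces to the arithmetic-geometric mean inequality,

$\sqrt{xy}\le\frac{x+y}2,\quad x,y\ge0, \nonumber$

which can be checked independently from $$(\sqrt{x}-\sqrt{y})^2\ge0$$.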