2: Extrema Subject to One Constraint

Last updated

Oct 27, 2024


Here is Theorem [theorem:1] with $m=1$.

Theorem $\PageIndex1$


[theorem:2]

Suppose that $n>1.$ If ${\bf X}_0$ is a local extreme point of $f$ subject to $g({\bf X})=0$ and $g_{x_{r}}({\bf X}_0)\ne0$ for some $r\in\{1,2,\dots,n\},$ then there is a constant $\lambda$ such that

$\label{eq:7} f_{x_i}({\bf X}_0)-\lambda g_{x_i}({\bf X}_0)=0,\quad$ $1\le i\le n;$ thus, ${\bf X}_0$ is a critical point of $f-\lambda g$ .

For notational convenience, let $r=1$ and denote

${\bf U}=(x_2,x_{3},\dots, x_{n})\text{ and } {\bf U}_0=(x_{20},x_{30},\dots, x_{n0}). \nonumber$

Since $g_{x_1}({\bf X}_0)\ne0$ , the Implicit Function Theorem (Corollary 6.4.2, p. 423) implies that there is a unique continuously differentiable function $h=h({\bf U}),$ defined on a neighborhood $N \subset{\mathbb R}^{n-1}$ of ${\bf U}_0,$ such that $(h({\bf U}),{\bf U})\in D$ for all ${\bf U}\in N$ , $h({\bf U}_0)=x_{10}$ , and

$\label{eq:8} g(h({\bf U}),{\bf U})=0,\quad {\bf U}\in N.$

Now define

$\label{eq:9} \lambda=\frac{f_{x_1}({\bf X}_0)}{g_{x_1}({\bf X}_0)},$

which is permissible, since $g_{x_1}({\bf X}_0)\ne0$ . This implies Equation $\ref{eq:7}$ with $i=1$ . If $i> 1$ , differentiating Equation $\ref{eq:8}$ with respect to $x_i$ yields

$\label{eq:10} \frac{\partial g(h({\bf U}),{\bf U})}{\partial x_i} + \frac{\partial g(h({\bf U}),{\bf U})}{\partial x_1} \frac{\partial h({\bf U})}{\partial x_i}=0,\quad {\bf U}\in N.$

Also, by the chain rule, the composite function $f(h({\bf U}),{\bf U})$ satisfies

$\label{eq:11} \frac{\partial}{\partial x_i}\Bigl[f(h({\bf U}),{\bf U})\Bigr]= \frac{\partial f(h({\bf U}),{\bf U})}{\partial x_i}+ \frac{\partial f(h({\bf U}),{\bf U})}{\partial x_1} \frac{\partial h({\bf U})}{\partial x_i},\quad {\bf U}\in N.$

Since $(h({\bf U}_0),{\bf U}_0)={\bf X}_0$ , Equation $\ref{eq:10}$ implies that

$\label{eq:12} \frac{\partial g({\bf X}_0)}{\partial x_i}+ \frac{\partial g({\bf X}_0)}{\partial x_1} \frac{\partial h({\bf U}_0)}{\partial x_i}=0.$

If ${\bf X}_0$ is a local extreme point of $f$ subject to $g({\bf X})=0$, then ${\bf U}_0$ is an unconstrained local extreme point of $f(h({\bf U}),{\bf U})$; therefore, the first partial derivatives of this composite function vanish at ${\bf U}_0$, and Equation $\ref{eq:11}$ implies that

$\label{eq:13} \frac{\partial f({\bf X}_0)}{\partial x_i}+ \frac{\partial f({\bf X}_0)}{\partial x_1} \frac{\partial h({\bf U}_0)}{\partial x_i}=0.$

Since a linear homogeneous system

$\left[\begin{array}{ccccccc} a&b\\c&d \end{array}\right] \left[\begin{array}{ccccccc} u\\v \end{array}\right]= \left[\begin{array}{ccccccc} 0\\0 \end{array}\right] \nonumber$

has a nontrivial solution if and only if

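$\left|\begin{array}{ccccccc} a&b\\c&d \end{array}\right|=0 \nonumber$ (Theorem 6.1.15), and $(u,v)=(1,\partial h({\bf U}_0)/\partial x_i)$ is a nontrivial solution of the system formed by Equations $\ref{eq:13}$ and $\ref{eq:12}$, it follows that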

$\left|\begin{array}{ccccccc} \displaystyle{\frac{\partial f({\bf X}_0)}{\partial x_i}}& \displaystyle{\frac{\partial f({\bf X}_0)}{\partial x_1}}\\\\ \displaystyle{\frac{\partial g({\bf X}_0)}{\partial x_i}}& \displaystyle{\frac{\partial g({\bf X}_0)}{\partial x_1}} \end{array}\right|=0,\quad\text{so}\quad \left|\begin{array}{ccccccc} \displaystyle{\frac{\partial f({\bf X}_0)}{\partial x_i}}& \displaystyle{\frac{\partial g({\bf X}_0)}{\partial x_i}}\\\\ \displaystyle{\frac{\partial f({\bf X}_0)}{\partial x_1}}& \displaystyle{\frac{\partial g({\bf X}_0)}{\partial x_1}} \end{array}\right|=0, \nonumber$

since the determinants of a matrix and its transpose are equal. Therefore, the system

$\left[\begin{array}{ccccccc} \displaystyle{\frac{\partial f({\bf X}_0)}{\partial x_i}}& \displaystyle{\frac{\partial g({\bf X}_0)}{\partial x_i}}\\\\ \displaystyle{\frac{\partial f({\bf X}_0)}{\partial x_1}}& \displaystyle{\frac{\partial g({\bf X}_0)}{\partial x_1}} \end{array}\right] \left[\begin{array}{ccccccc} u\\v \end{array}\right]= \left[\begin{array}{ccccccc} 0\\0 \end{array}\right] \nonumber$

has a nontrivial solution (Theorem 6.1.15). Since $g_{x_1}({\bf X}_0)\ne0$ , $u$ must be nonzero in a nontrivial solution. Hence, we may assume that $u=1$ , so

$\label{eq:14} \left[\begin{array}{ccccccc} \displaystyle{\frac{\partial f({\bf X}_0)}{\partial x_i}}& \displaystyle{\frac{\partial g({\bf X}_0)}{\partial x_i}}\\\\ \displaystyle{\frac{\partial f({\bf X}_0)}{\partial x_1}}& \displaystyle{\frac{\partial g({\bf X}_0)}{\partial x_1}} \end{array}\right] \left[\begin{array}{ccccccc} 1\\ v \end{array}\right]= \left[\begin{array}{ccccccc} 0\\0 \end{array}\right].$

In particular,

$\frac{\partial f({\bf X}_0)}{\partial x_1}+ v\frac{\partial g({\bf X}_0)}{\partial x_1}=0, \quad\text{so}\quad -v=\frac{f_{x_1}({\bf X}_0)}{g_{x_1}({\bf X}_0)}. \nonumber$

Now Equation $\ref{eq:9}$ implies that $-v=\lambda$ , and Equation $\ref{eq:14}$ becomes

$\left[\begin{array}{ccccccc} \displaystyle{\frac{\partial f({\bf X}_0)}{\partial x_i}}& \displaystyle{\frac{\partial g({\bf X}_0)}{\partial x_i}}\\\\ \displaystyle{\frac{\partial f({\bf X}_0)}{\partial x_1}}& \displaystyle{\frac{\partial g({\bf X}_0)}{\partial x_1}} \end{array}\right] \left[\begin{array}{rcccccc} 1\\ -\lambda \end{array}\right]= \left[\begin{array}{ccccccc} 0\\0 \end{array}\right]. \nonumber$

Computing the topmost entry of the vector on the left yields Equation $\ref{eq:7}$ .
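The role of the multiplier can be checked numerically on a small instance. The sketch below is only an illustration under stated assumptions: the functions $f(x,y)=2x+y$ and $g(x,y)=x^2+y^2-4$ and the point ${\bf X}_0=(4/\sqrt5,2/\sqrt5)$ are sample choices (with ${\bf X}_0$ taken as a known constrained maximum point), and the names `grad_f`, `grad_g`, and `residuals` are hypothetical. It defines $\lambda$ as in Equation $\ref{eq:9}$ and then evaluates the left side of Equation $\ref{eq:7}$ for each $i$.

```python
import math

# Hypothetical sample problem: f(x, y) = 2x + y, g(x, y) = x^2 + y^2 - 4.
def grad_f(x, y):
    return (2.0, 1.0)

def grad_g(x, y):
    return (2.0 * x, 2.0 * y)

# Assumed constrained maximum point of f on the circle g = 0.
X0 = (4 / math.sqrt(5), 2 / math.sqrt(5))

fx, fy = grad_f(*X0)
gx, gy = grad_g(*X0)

# Equation (9) with r = 1: lambda = f_{x_1}(X0) / g_{x_1}(X0),
# which is permissible because g_{x_1}(X0) is nonzero here.
lam = fx / gx

# Equation (7): each partial derivative of f - lambda*g should vanish at X0.
residuals = (fx - lam * gx, fy - lam * gy)
print("lambda =", lam)
print("residuals =", residuals)
```

The first residual is zero by the definition of $\lambda$; the content of the theorem is that the remaining residuals vanish as well.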

Example $\PageIndex1$


[example:1]

Find the point $(x_0,y_0)$ on the line

$ax+by=d \nonumber$

closest to a given point $(x_1,y_1)$ .

Solution

We must minimize $\sqrt{(x-x_1)^2+(y-y_1)^2}$ subject to the constraint. This is equivalent to minimizing $(x-x_1)^2+(y-y_1)^2$ subject to the constraint, which is simpler. For this, we could let

$L=(x-x_1)^2+(y-y_1)^2-\lambda (ax+by-d); \nonumber$

however,

$L=\frac{(x-x_1)^2+(y-y_1)^2}2-\lambda (ax+by) \nonumber$

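is better, because the factor $\frac12$ cancels the 2 produced by differentiation and the constant term $\lambda d$ does not affect the partial derivatives. Since

$L_{x}=x-x_1-\lambda a\quad\text{and}\quad L_{y}=y-y_1-\lambda b, \nonumber$

setting $L_x=L_y=0$ yields $(x_0,y_0)=(x_1+\lambda a, y_1+\lambda b)$, where we must choose $\lambda$ so that $ax_0+by_0=d$. Therefore,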

$ax_0+by_0=ax_1+by_1+\lambda(a^2+b^2)=d, \nonumber$

$\lambda= \frac{d-ax_1-by_1}{a^2+b^2}, \nonumber$

$x_0=x_1+\frac{(d-ax_1-by_1)a}{a^2+b^2}, \text{ and } y_0=y_1+\frac{(d-ax_1-by_1)b}{a^2+b^2}. \nonumber$

The distance from $(x_1,y_1)$ to the line is

$\sqrt{(x_0-x_1)^2+(y_0-y_1)^2}= \frac{|d-ax_1-by_1|}{\sqrt{a^2+b^2}}. \nonumber$
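The formulas for $\lambda$, $(x_0,y_0)$, and the distance can be exercised on concrete numbers. The following is a minimal sketch, assuming a hypothetical helper `closest_point_on_line` and sample data (the line $3x+4y=10$ and the point $(0,0)$) that are not part of the example itself.

```python
import math

def closest_point_on_line(a, b, d, x1, y1):
    """Closest point on ax + by = d to (x1, y1), using the multiplier formula."""
    if a == 0 and b == 0:
        raise ValueError("a and b must not both be zero")
    lam = (d - a * x1 - b * y1) / (a * a + b * b)
    return x1 + lam * a, y1 + lam * b

# Hypothetical sample data: the line 3x + 4y = 10 and the point (0, 0).
a, b, d, x1, y1 = 3.0, 4.0, 10.0, 0.0, 0.0
x0, y0 = closest_point_on_line(a, b, d, x1, y1)

# Distance computed directly and from the closed-form expression.
dist = math.hypot(x0 - x1, y0 - y1)
formula = abs(d - a * x1 - b * y1) / math.hypot(a, b)

# Other points on the line have the form (x0 + t*b, y0 - t*a);
# none of them should be closer to (x1, y1) than (x0, y0).
others = [math.hypot(x0 + t * b - x1, y0 - t * a - y1)
          for t in (-2.0, -0.5, 0.5, 2.0)]

print((x0, y0), dist, formula, min(others))
```

Comparing against a few other points on the line is only a spot check, not a proof of minimality.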

Example $\PageIndex2$


[example:2]

Find the extreme values of $f(x,y)=2x+y$ subject to

$x^2+y^2=4. \nonumber$

Solution

Let

$L=2x+y-\frac{\lambda}2(x^2+y^2); \nonumber$

then

$L_{x}=2-\lambda x\text{ and } L_{y}=1-\lambda y, \nonumber$

so $(x_0,y_0)=(2/\lambda,1/\lambda)$ . Since $x_0^2+y_0^2=4$ , $\lambda=\pm \sqrt{5}/2$ . Hence, the constrained maximum is $2\sqrt{5}$ , attained at $(4/\sqrt{5},2/\sqrt{5})$ , and the constrained minimum is $-2\sqrt{5}$ , attained at $(-4/\sqrt{5},-2/\sqrt{5})$ .
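As a sanity check on these values, one can sample the constraint circle directly. This is a rough numerical sketch, not part of the method: the parametrization $(2\cos t, 2\sin t)$, the sample count `N`, and the variable names are assumptions made for illustration.

```python
import math

def f(x, y):
    return 2 * x + y

# Sample the circle x^2 + y^2 = 4 as (2 cos t, 2 sin t) on a uniform grid.
N = 100_000
values = [
    f(2 * math.cos(2 * math.pi * k / N), 2 * math.sin(2 * math.pi * k / N))
    for k in range(N)
]

sampled_max = max(values)
sampled_min = min(values)
predicted = 2 * math.sqrt(5)  # value obtained from the multiplier computation

print(sampled_max, sampled_min, predicted)
```

The sampled extremes should agree with $\pm2\sqrt5$ up to the resolution of the grid.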

Example $\PageIndex{3}$


[example:3]

Find the point in the plane

$\label{eq:15} 3x+4y+z=1$ closest to $(-1,1,1)$ .

Solution

We must minimize

$f(x,y,z)=(x+1)^2+(y-1)^2+(z-1)^2 \nonumber$

subject to Equation $\ref{eq:15}$ . Let

$L=\frac{(x+1)^2+(y-1)^2+(z-1)^2}2-\lambda(3x+4y+z); \nonumber$

then

$L_{x}= x+1-3\lambda,\quad L_{y}= y-1-4\lambda,\text{ and } L_{z}= z-1-\lambda, \nonumber$

so setting $L_x=L_y=L_z=0$ yields

$x_0=-1+3\lambda,\quad y_0=1+4\lambda,\quad z_0=1+\lambda. \nonumber$

From Equation $\ref{eq:15}$ ,

$3(-1+3\lambda)+4(1+4\lambda)+(1+\lambda)-1=1+26\lambda=0, \nonumber$ so $\lambda=-1/26$ and

$(x_0,y_0,z_0)= \left(-\frac{29}{26},\frac{22}{26},\frac{25}{26}\right). \nonumber$

The distance from $(x_0,y_0,z_0)$ to $(-1,1,1)$ is

$\sqrt{(x_0+1)^2+(y_0-1)^2+(z_0-1)^2}=\frac1{\sqrt{26}}. \nonumber$
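The same steps work for any plane and any point, which makes the arithmetic easy to verify exactly. The sketch below assumes a hypothetical helper `closest_point_on_plane` that mirrors the computation above for a plane ${\bf n}\cdot{\bf x}=c$ with integer data; it uses exact rational arithmetic so that the fractions can be compared directly.

```python
from fractions import Fraction

def closest_point_on_plane(normal, c, p):
    """Closest point to p on the plane normal . x = c, via x = p + lambda * normal."""
    nn = sum(a * a for a in normal)
    if nn == 0:
        raise ValueError("normal vector must be nonzero")
    # Substituting x = p + lambda * normal into the plane equation gives lambda.
    lam = Fraction(c - sum(a * x for a, x in zip(normal, p)), nn)
    point = [x + lam * a for a, x in zip(normal, p)]
    return point, lam

# Data of the example: the plane 3x + 4y + z = 1 and the point (-1, 1, 1).
normal, c, p = [3, 4, 1], 1, [-1, 1, 1]
point, lam = closest_point_on_plane(normal, c, p)

# Squared distance from the closest point to p, kept exact.
dist_sq = sum((x - q) ** 2 for x, q in zip(point, p))

print(lam, point, dist_sq)
```

Note that `Fraction` reduces $22/26$ to $11/13$, so the middle coordinate prints in lowest terms.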

Example $\PageIndex{4}$


[example:4]

Assume that $n\ge 2$.

(a) Find the extreme values of $\displaystyle{\sum_{i=1}^{n}x_i}$ subject to $\displaystyle{\sum_{i=1}^{n}x_i^2=1}$ .
(b) Find the minimum value of $\displaystyle{\sum_{i=1}^{n}x_i^2}$ subject to $\displaystyle{\sum_{i=1}^{n}x_i=1}$ .

(a) Let

$L= \sum_{i=1}^{n}x_i-\frac{\lambda}2\sum_{i=1}^{n}x_i^2; \nonumber$ then

$L_{x_i}=1-\lambda x_i, \quad\text{so}\quad x_{i0}=\frac1{\lambda}, \quad 1\le i\le n. \nonumber$

Hence, $\displaystyle{\sum_{i=1}^{n}x_{i0}^2}=n/\lambda^2$ , so $\lambda=\pm\sqrt{n}$ and

$(x_{10},x_{20},\dots,x_{n0})= \pm\left(\frac1{\sqrt{n}},\frac1{\sqrt{n}}, \dots, \frac1{\sqrt{n}}\right). \nonumber$

Therefore, the constrained maximum is $\sqrt{n}$ and the constrained minimum is $-\sqrt{n}$ .

(b) Let

$L=\frac12 \sum_{i=1}^{n}x_i^2-\lambda\sum_{i=1}^{n}x_i; \nonumber$ then

$L_{x_i}=x_i-\lambda, \quad\text{so}\quad x_{i0}=\lambda,\quad 1\le i\le n. \nonumber$

Hence, $\displaystyle{\sum_{i=1}^{n}x_{i0}}=n\lambda=1$ , so $x_{i0}=\lambda=1/n$ and the constrained minimum is

$\displaystyle{\sum_{i=1}^{n}x_{i0}^2}=\frac1{n}. \nonumber$ There is no constrained maximum. (Why?)
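The bounds $\sqrt{n}$ in (a) and $1/n$ in (b) can be spot-checked by random sampling. The sketch below is an illustration under assumptions: the dimension `n = 5`, the seed, the sample count, and the helper `random_positive` are arbitrary choices, and only points with positive coordinates are sampled, so it tests the upper bound in (a) and the lower bound in (b) but says nothing about the other extremes.

```python
import math
import random

random.seed(0)
n = 5  # hypothetical dimension for the experiment

def random_positive(n):
    # 1 - random() lies in (0, 1], so no coordinate is zero.
    return [1.0 - random.random() for _ in range(n)]

worst_a = 0.0        # largest value of sum(x) seen on the unit sphere
best_b = math.inf    # smallest value of sum(x^2) seen on the hyperplane sum(x) = 1
for _ in range(10_000):
    v = random_positive(n)
    norm = math.sqrt(sum(t * t for t in v))
    x = [t / norm for t in v]          # now sum(x_i^2) = 1
    worst_a = max(worst_a, sum(x))
    total = sum(v)
    y = [t / total for t in v]         # now sum(y_i) = 1
    best_b = min(best_b, sum(t * t for t in y))

# Values at the critical points found with the multiplier.
crit_a = sum([1 / math.sqrt(n)] * n)   # should equal sqrt(n)
crit_b = sum([(1 / n) ** 2] * n)       # should equal 1/n

print(worst_a, crit_a, best_b, crit_b)
```

No sampled point should beat the critical values, and the samples approach them as the coordinates become nearly equal.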

Example $\PageIndex{5}$


[example:5]

Show that

$x^{1/p}y^{1/q} \le \frac{x}{p}+\frac{y}{q}, \quad x,y \ge 0, \nonumber$

if

$\label{eq:16} \frac1{p} +\frac1{q} = 1, \quad p > 0, \text{ and } q > 0.$

Solution

We first find the maximum of

$f(x,y) = x^{1/p}y^{1/q} \nonumber$

subject to

$\label{eq:17} \frac{x}{p}+\frac{y}{q} = \sigma, \quad x \ge 0, \quad y \ge 0,$

where $\sigma$ is a fixed but arbitrary positive number. Since $f$ is continuous and the line segment defined by Equation $\ref{eq:17}$ is closed and bounded, $f$ must assume a maximum at some point $(x_0,y_0)$ on the segment. Moreover, $(x_0,y_0)$ cannot be an endpoint of the segment, since $f(p\sigma,0) = f(0,q\sigma)=0$ while $f$ is positive at every other point of the segment. Therefore, $(x_0,y_0)$ is in the open first quadrant.

Let

$L = x^{1/p}y^{1/q} -\lambda \left(\frac{x}{p}+\frac{y}{q}\right). \nonumber$

Then

$L_x = \frac1{px} f(x,y) - \frac{\lambda}{p} \quad\text{and}\quad L_y = \frac1{qy} f(x,y) - \frac{\lambda}{q}, \nonumber$

so setting $L_x=L_y=0$ yields $x_0 = y_0=f(x_0,y_0)/\lambda$. Now Equations $\ref{eq:16}$ and $\ref{eq:17}$ imply that $x_0 =y_0 = \sigma$. Therefore, for every $(x,y)$ satisfying Equation $\ref{eq:17}$,

$f(x,y) \le f(\sigma,\sigma) = \sigma^{1/p}\sigma^{1/q} = \sigma=\frac{x}{p}+\frac{y}{q}. \nonumber$

This can be generalized (Exercise [exer:53]). It can also be used to generalize Schwarz’s inequality (Exercise [exer:54]).
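The inequality of this example is easy to probe numerically. The following is a minimal sketch, assuming a hypothetical helper `young_gap` and arbitrarily chosen sampling ranges; it uses the fact that Equation $\ref{eq:16}$ with $p,q>0$ forces $p>1$ and $q=p/(p-1)$. It evaluates the difference between the two sides at random points and at a point with $x=y$, where the argument above shows that equality holds.

```python
import random

random.seed(1)

def young_gap(x, y, p):
    """Return x/p + y/q - x**(1/p) * y**(1/q), where 1/p + 1/q = 1 and p > 1."""
    q = p / (p - 1)
    return x / p + y / q - x ** (1 / p) * y ** (1 / q)

# Random trials with x, y in [0, 10] and p in [1.01, 10] (arbitrary ranges).
min_gap = min(
    young_gap(random.uniform(0, 10), random.uniform(0, 10), random.uniform(1.01, 10))
    for _ in range(10_000)
)

# Equality case from the solution: x = y (= sigma).
gap_at_equal = young_gap(3.0, 3.0, 2.5)

print(min_gap, gap_at_equal)
```

The smallest gap should be nonnegative up to rounding error, and the gap at $x=y$ should be zero up to rounding error.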
