4: Extrema Subject to Two Constraints
Here is Theorem [theorem:1] with \(m=2\).
Suppose that \(n>2.\) If \({\bf X}_{0}\) is a local extreme point of \(f\) subject to \(g_1({\bf X})=g_2({\bf X})=0\) and
\[\label{eq:19} \left|\begin{array}{ccccccc} \displaystyle{\frac{\partial g_1({\bf X}_{0})}{\partial x_{r}}}& \displaystyle{\frac{\partial g_1({\bf X}_{0})}{\partial x_{s}}}\\\\ \displaystyle{\frac{\partial g_2({\bf X}_{0})}{\partial x_{r}}}& \displaystyle{\frac{\partial g_2({\bf X}_{0})}{\partial x_{s}}}\\ \end{array}\right|\ne0 \]
for some \(r\) and \(s\) in \(\{1,2,\dots,n\},\) then there are constants \(\lambda\) and \(\mu\) such that
\[\label{eq:20} \frac{\partial f({\bf X}_{0})}{\partial x_{i}}- \lambda\frac{\partial g_1({\bf X}_{0})}{\partial x_{i}}- \mu\frac{\partial g_2({\bf X}_{0})}{\partial x_{i}}=0, \]
\(1\le i\le n\).
For notational convenience, let \(r=1\) and \(s=2\). Denote
\[{\bf U}=(x_{3},x_{4},\dots,x_{n})\text{ and } {\bf U}_{0}=(x_{30},x_{40},\dots,x_{n0}). \nonumber \]
Since
\[\label{eq:21} \left|\begin{array}{ccccccc} \displaystyle{\frac{\partial g_1({\bf X}_{0})}{\partial x_1}}& \displaystyle{\frac{\partial g_1({\bf X}_{0})}{\partial x_2}}\\\\ \displaystyle{\frac{\partial g_2({\bf X}_{0})}{\partial x_1}}& \displaystyle{\frac{\partial g_2({\bf X}_{0})}{\partial x_2}}\\ \end{array}\right|\ne0, \]
the Implicit Function Theorem (Theorem 6.4.1) implies that there are unique continuously differentiable functions
\[h_1=h_1(x_{3},x_{4},\dots,x_{n})\text{ and } h_2=h_2(x_{3},x_{4},\dots,x_{n}), \nonumber \]
defined on a neighborhood \(N\subset{\mathbb R}^{n-2}\) of \({\bf U}_{0},\) such that \((h_1({\bf U}),h_2({\bf U}),{\bf U})\in D\) for all \({\bf U}\in N\), \(h_1({\bf U}_{0})=x_{10}\), \(h_2({\bf U}_{0})=x_{20}\), and
\[\label{eq:22} g_1(h_1({\bf U}),h_2({\bf U}),{\bf U})= g_2(h_1({\bf U}),h_2({\bf U}),{\bf U})=0,\quad {\bf U}\in N. \]
From Equation \ref{eq:21} and the fact that a matrix and its transpose have the same determinant, the system
\[\label{eq:23} \left[\begin{array}{cc} \displaystyle{\frac{\partial g_1({\bf X}_{0})}{\partial x_1}}& \displaystyle{\frac{\partial g_2({\bf X}_{0})}{\partial x_1}}\\\\ \displaystyle{\frac{\partial g_1({\bf X}_{0})}{\partial x_2}}& \displaystyle{\frac{\partial g_2({\bf X}_{0})}{\partial x_2}}\\ \end{array}\right] \left[\begin{array}{c} \lambda\\\mu \end{array}\right]= \left[\begin{array}{c} f_{x_1}({\bf X}_{0})\\f_{x_2}({\bf X}_{0})\\ \end{array}\right] \]
has a unique solution (Theorem 6.1.13). This implies Equation \ref{eq:20} with \(i=1\) and \(i=2\). If \(3\le i\le n\), then differentiating Equation \ref{eq:22} with respect to \(x_{i}\) and recalling that \((h_1({\bf U}_{0}),h_2({\bf U}_{0}),{\bf U}_{0})={\bf X}_{0}\) yields
\[\frac{\partial g_1({\bf X}_{0})}{\partial x_{i}}+ \frac{\partial g_1({\bf X}_{0})}{\partial x_1} \frac{\partial h_1({\bf U}_{0})}{\partial x_{i}}+ \frac{\partial g_1({\bf X}_{0})}{\partial x_2} \frac{\partial h_2({\bf U}_{0})}{\partial x_{i}}=0 \nonumber \]
and
\[\frac{\partial g_2({\bf X}_{0})}{\partial x_{i}}+ \frac{\partial g_2({\bf X}_{0})}{\partial x_1} \frac{\partial h_1({\bf U}_{0})}{\partial x_{i}}+ \frac{\partial g_2({\bf X}_{0})}{\partial x_2} \frac{\partial h_2({\bf U}_{0})}{\partial x_{i}}=0. \nonumber \]
If \({\bf X}_{0}\) is a local extreme point of \(f\) subject to \(g_1({\bf X})=g_2({\bf X})=0\), then \({\bf U}_{0}\) is an unconstrained local extreme point of \(f(h_1({\bf U}),h_2({\bf U}),{\bf U})\); therefore,
\[\frac{\partial f({\bf X}_{0})}{\partial x_{i}}+ \frac{\partial f({\bf X}_{0})}{\partial x_1} \frac{\partial h_1({\bf U}_{0})}{\partial x_{i}}+ \frac{\partial f({\bf X}_{0})}{\partial x_2} \frac{\partial h_2({\bf U}_{0})}{\partial x_{i}}=0. \nonumber \]
The last three equations imply that
\[\left|\begin{array}{ccccccc} \displaystyle{\frac{\partial f({\bf X}_{0})}{\partial x_{i}}} & \displaystyle{\frac{\partial f({\bf X}_{0})}{\partial x_1}}& \displaystyle{\frac{\partial f({\bf X}_{0})}{\partial x_2}}\\\\ \displaystyle{\frac{\partial g_1({\bf X}_{0})}{\partial x_{i}}}& \displaystyle{\frac{\partial g_1({\bf X}_{0})}{\partial x_1}} & \displaystyle{\frac{\partial g_1({\bf X}_{0})}{\partial x_2}} \\\\ \displaystyle{\frac{\partial g_2({\bf X}_{0})}{\partial x_{i}}}& \displaystyle{\frac{\partial g_2({\bf X}_{0})}{\partial x_1}}& \displaystyle{\frac{\partial g_2({\bf X}_{0})}{\partial x_2}}\\ \end{array}\right|=0, \nonumber \]
and, since a matrix and its transpose have the same determinant,
\[\left|\begin{array}{ccccccc} \displaystyle{\frac{\partial f({\bf X}_{0})}{\partial x_{i}}}& \displaystyle{\frac{\partial g_1({\bf X}_{0})}{\partial x_{i}}} & \displaystyle{\frac{\partial g_2({\bf X}_{0})}{\partial x_{i}}} \\\\ \displaystyle{\frac{\partial f({\bf X}_{0})}{\partial x_1}} & \displaystyle{\frac{\partial g_1({\bf X}_{0})}{\partial x_1}} & \displaystyle{\frac{\partial g_2({\bf X}_{0})}{\partial x_1}} \\\\ \displaystyle{\frac{\partial f({\bf X}_{0})}{\partial x_2}}& \displaystyle{\frac{\partial g_1({\bf X}_{0})}{\partial x_2}}& \displaystyle{\frac{\partial g_2({\bf X}_{0})}{\partial x_2}} \\\\ \end{array}\right|=0. \nonumber \]
Therefore, there are constants \(c_1\), \(c_2\), \(c_{3}\), not all zero, such that
\[\label{eq:24} \left[\begin{array}{ccccccc} \displaystyle{\frac{\partial f({\bf X}_{0})}{\partial x_{i}}} & \displaystyle{\frac{\partial g_1({\bf X}_{0})}{\partial x_{i}}} & \displaystyle{\frac{\partial g_2({\bf X}_{0})}{\partial x_{i}}} \\\\ \displaystyle{\frac{\partial f({\bf X}_{0})}{\partial x_1}}& \displaystyle{\frac{\partial g_1({\bf X}_{0})}{\partial x_1}}& \displaystyle{\frac{\partial g_2({\bf X}_{0})}{\partial x_1}}\\\\ \displaystyle{\frac{\partial f({\bf X}_{0})}{\partial x_2}}& \displaystyle{\frac{\partial g_1({\bf X}_{0})}{\partial x_2}} & \displaystyle{\frac{\partial g_2({\bf X}_{0})}{\partial x_2}}\\\\ \end{array}\right] \left[\begin{array}{ccccccc} c_1\\c_2\\c_{3} \end{array}\right]= \left[\begin{array}{ccccccc} 0\\0\\0 \end{array}\right]. \]
If \(c_1=0\), then
\[\left[\begin{array}{cc} \displaystyle{\frac{\partial g_1({\bf X}_{0})}{\partial x_1}} & \displaystyle{\frac{\partial g_2({\bf X}_{0})}{\partial x_1}} \\\\ \displaystyle{\frac{\partial g_1({\bf X}_{0})}{\partial x_2}} & \displaystyle{\frac{\partial g_2({\bf X}_{0})}{\partial x_2}} \\ \end{array}\right] \left[\begin{array}{c} c_2\\c_{3} \end{array}\right]= \left[\begin{array}{c} 0\\0 \end{array}\right], \nonumber \]
so Equation \ref{eq:21} (together with the fact that a matrix and its transpose have the same determinant) implies that \(c_2=c_{3}=0\), contradicting the assumption that \(c_1\), \(c_2\), \(c_{3}\) are not all zero. Hence \(c_1\ne0\), and dividing by \(c_1\) shows that we may assume that \(c_1=1\) in a nontrivial solution of Equation \ref{eq:24}. Therefore,
\[\label{eq:25} \left[\begin{array}{ccccccc} \displaystyle{\frac{\partial f({\bf X}_{0})}{\partial x_{i}}}& \displaystyle{\frac{\partial g_1({\bf X}_{0})}{\partial x_{i}}}& \displaystyle{\frac{\partial g_2({\bf X}_{0})}{\partial x_{i}}}\\\\ \displaystyle{\frac{\partial f({\bf X}_{0})}{\partial x_1}}& \displaystyle{\frac{\partial g_1({\bf X}_{0})}{\partial x_1}} & \displaystyle{\frac{\partial g_2({\bf X}_{0})}{\partial x_1}} \\\\ \displaystyle{\frac{\partial f({\bf X}_{0})}{\partial x_2}} & \displaystyle{\frac{\partial g_1({\bf X}_{0})}{\partial x_2}} & \displaystyle{\frac{\partial g_2({\bf X}_{0})}{\partial x_2}} \\\\ \end{array}\right] \left[\begin{array}{ccccccc} 1\\c_2\\c_{3} \end{array}\right]= \left[\begin{array}{ccccccc} 0\\0\\0 \end{array}\right], \]
which implies that
\[\left[\begin{array}{cc} \displaystyle{\frac{\partial g_1({\bf X}_{0})}{\partial x_1}} & \displaystyle{\frac{\partial g_2({\bf X}_{0})}{\partial x_1}} \\\\ \displaystyle{\frac{\partial g_1({\bf X}_{0})}{\partial x_2}} & \displaystyle{\frac{\partial g_2({\bf X}_{0})}{\partial x_2}} \\ \end{array}\right] \left[\begin{array}{c} -c_2\\-c_{3} \end{array}\right]= \left[\begin{array}{c} f_{x_1}({\bf X}_{0})\\f_{x_2}({\bf X}_{0})\\ \end{array}\right]. \nonumber \]
Since Equation \ref{eq:23} has only one solution, this implies that \(c_2=-\lambda\) and \(c_{3}=-\mu\), so Equation \ref{eq:25} becomes
\[\left[\begin{array}{ccccccc} \displaystyle{\frac{\partial f({\bf X}_{0})}{\partial x_{i}}} & \displaystyle{\frac{\partial g_1({\bf X}_{0})}{\partial x_{i}}}& \displaystyle{\frac{\partial g_2({\bf X}_{0})}{\partial x_{i}}}\\\\ \displaystyle{\frac{\partial f({\bf X}_{0})}{\partial x_1}}& \displaystyle{\frac{\partial g_1({\bf X}_{0})}{\partial x_1}} & \displaystyle{\frac{\partial g_2({\bf X}_{0})}{\partial x_1}} \\\\ \displaystyle{\frac{\partial f({\bf X}_{0})}{\partial x_2}} & \displaystyle{\frac{\partial g_1({\bf X}_{0})}{\partial x_2}}& \displaystyle{\frac{\partial g_2({\bf X}_{0})}{\partial x_2}}\\\\ \end{array}\right] \left[\begin{array}{rcccccc} 1\\-\lambda\\-\mu \end{array}\right]= \left[\begin{array}{ccccccc} 0\\0\\0 \end{array}\right]. \nonumber \]
Computing the topmost entry of the vector on the left yields Equation \ref{eq:20}.
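The argument above can be mirrored numerically. The following is a minimal Python sketch under stated assumptions: the helper names `multipliers` and `stationarity_residuals` and the sample problem (minimize \(x^2+y^2+z^2\) subject to \(x+y+z=3\) and \(x-y=0\), whose constrained minimum is at \((1,1,1)\)) are illustrative and do not come from the text. The code solves the \(2\times2\) system for \(\lambda\) and \(\mu\) from the components \(r\) and \(s\), then evaluates the left side of Equation \ref{eq:20} for every \(i\).

```python
from fractions import Fraction


def multipliers(grad_f, grad_g1, grad_g2, r=0, s=1):
    """Solve for (lam, mu) from components r and s (0-based) by Cramer's rule.

    The system solved is
        lam*grad_g1[r] + mu*grad_g2[r] = grad_f[r]
        lam*grad_g1[s] + mu*grad_g2[s] = grad_f[s]
    """
    a, b = Fraction(grad_g1[r]), Fraction(grad_g2[r])
    c, d = Fraction(grad_g1[s]), Fraction(grad_g2[s])
    det = a * d - b * c
    if det == 0:
        raise ValueError("determinant vanishes for this choice of r and s")
    lam = (grad_f[r] * d - b * grad_f[s]) / det
    mu = (a * grad_f[s] - grad_f[r] * c) / det
    return lam, mu


def stationarity_residuals(grad_f, grad_g1, grad_g2, lam, mu):
    """Left side of the multiplier condition for every coordinate i."""
    return [fi - lam * g1i - mu * g2i
            for fi, g1i, g2i in zip(grad_f, grad_g1, grad_g2)]


# Hypothetical sample problem (an assumption, not from the text):
# minimize x^2 + y^2 + z^2 subject to x + y + z = 3 and x - y = 0.
# Gradients are evaluated at the constrained minimum (1, 1, 1).
grad_f = [2, 2, 2]     # gradient of x^2 + y^2 + z^2
grad_g1 = [1, 1, 1]    # gradient of x + y + z - 3
grad_g2 = [1, -1, 0]   # gradient of x - y

lam, mu = multipliers(grad_f, grad_g1, grad_g2)
res = stationarity_residuals(grad_f, grad_g1, grad_g2, lam, mu)
print(lam, mu, res)
```

Exact rational arithmetic is used so that the residuals can be compared with zero without a tolerance; with floating-point gradients a small tolerance would be needed instead.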
[example:7]
Minimize
\[f(x,y,z,w) = x^2+y^2+z^2+w^2 \nonumber \]
subject to
\[\label{eq:26} x+y+z+w = 10 \text{ and } x-y+z+3w = 6. \]
Solution
Let
\[L = \frac{x^2+y^2+z^2+w^2}2-\lambda(x+y+z+w)-\mu(x-y+z+3w); \nonumber \]
then
\[\begin{aligned} L_x &= x-\lambda-\mu \\ L_y &= y-\lambda+\mu \\ L_z &= z-\lambda-\mu \\ L_w &= w-\lambda-3\mu,\end{aligned} \nonumber \]
so
\[\label{eq:27} x_{0} = \lambda+\mu, \quad y_{0} = \lambda-\mu, \quad z_{0} = \lambda+\mu, \quad w_{0} = \lambda+3\mu. \]
This and Equation \ref{eq:26} imply that
\[\begin{aligned} (\lambda+\mu)+(\lambda-\mu)+(\lambda+\mu) + (\lambda+3\mu) &= 10 \\ (\lambda+\mu)-(\lambda-\mu)+(\lambda+\mu)+ (3\lambda+9\mu) &= \phantom{1}6.\end{aligned} \nonumber \]
Therefore,
\[\begin{aligned} 4\lambda + \phantom{1}4\mu &= 10 \\ 4\lambda + 12\mu &= \phantom{1}6,\end{aligned} \nonumber \]
so \(\lambda=3\) and \(\mu = -1/2\). Now Equation \ref{eq:27} implies that
\[(x_{0},y_{0},z_{0},w_{0}) = \left(\frac{5}2,\frac{7}2,\frac{5}2, \frac{3}2\right). \nonumber \]
Since \(f(x,y,z,w)\) is the square of the distance from \((x,y,z,w)\) to the origin, it attains a minimum value (but not a maximum value) subject to the constraints; hence the constrained minimum value is
\[f\left(\frac{5}2,\frac{7}2,\frac{5}2, \frac{3}2\right)=27. \nonumber \]
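The arithmetic in this example can be checked with a short Python sketch. The variable names and the two feasible directions `d1` and `d2` are assumptions introduced here for illustration; each direction was chosen so that it satisfies both homogeneous constraint equations. The script recomputes \(\lambda\), \(\mu\), the critical point, and the value of \(f\), and then confirms that moving along a feasible direction does not decrease \(f\).

```python
from fractions import Fraction

# Reduced system from the text: 4*lam + 4*mu = 10 and 4*lam + 12*mu = 6.
# Subtracting the first equation from the second gives 8*mu = -4.
mu = Fraction(6 - 10, 12 - 4)
lam = (Fraction(10) - 4 * mu) / 4

# Critical point from x = lam+mu, y = lam-mu, z = lam+mu, w = lam+3*mu.
point = (lam + mu, lam - mu, lam + mu, lam + 3 * mu)
x, y, z, w = point


def f(p):
    """Objective: squared distance from p to the origin."""
    return sum(c * c for c in p)


# Both constraints must hold at the critical point.
assert x + y + z + w == 10
assert x - y + z + 3 * w == 6

f_min = f(point)

# Illustrative feasible directions (assumptions): each satisfies
# a + b + c + d = 0 and a - b + c + 3d = 0, so point + t*d stays feasible.
d1 = (1, 0, -1, 0)
d2 = (-2, 1, 0, 1)
for d in (d1, d2):
    for t in (Fraction(-1), Fraction(1, 10), Fraction(3)):
        moved = tuple(c + t * dc for c, dc in zip(point, d))
        assert f(moved) >= f_min

print("lambda =", lam, " mu =", mu, " f_min =", f_min)
```

The perturbation loop is only a spot check along two directions, not a proof; the text's geometric argument is what establishes that the critical point is the constrained minimum.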
[example:8]
The distance between two curves in \(\mathbb{R}^2\) is the minimum value of
\[\sqrt{(x_1-x_2)^2+(y_1-y_2)^2}, \nonumber \]
where \((x_1,y_1)\) is on one curve and \((x_2,y_2)\) is on the other. Find the distance between the ellipse
\[x^2+2y^2=1 \nonumber \]
and the line
\[\label{eq:28} x+y=4. \]
Solution
We must minimize
\[d^2=(x_1-x_2)^2 + (y_1-y_2)^2 \nonumber \]
subject to
\[x_1^2 + 2y_1^2 =1 \text{ and } x_2+y_2 = 4. \nonumber \]
Let
\[L = \frac{(x_1-x_2)^2 + (y_1-y_2)^2 - \lambda(x_1^2 + 2y_1^2)}2 -\mu(x_2+y_2); \nonumber \]
then
\[\begin{aligned} L_{x_1}&=x_1-x_2-\lambda x_1\\ L_{y_1}&=y_1-y_2-2\lambda y_1\\ L_{x_2}&=x_2-x_1-\mu\\ L_{y_2}&=y_2-y_1-\mu,\end{aligned} \nonumber \]
so
\[\begin{aligned} x_{10}-x_{20}&=\lambda x_{10} &&\text{(i)}\\ y_{10}-y_{20}&=2\lambda y_{10} &&\text{(ii)}\\ x_{20}-x_{10}&=\mu &&\text{(iii)} \\ y_{20}-y_{10}&=\mu. &&\text{(iv)}\end{aligned} \nonumber \]
From (i) and (iii), \(\mu=-\lambda x_{10}\); from (ii) and (iv), \(\mu=-2\lambda y_{10}\). If \(\lambda=0\), then (i) and (ii) would give \((x_{10},y_{10})=(x_{20},y_{20})\), which is impossible because the curves do not intersect; hence \(\lambda\ne0\), so \(x_{10}=2y_{10}\). Since \(x_{10}^2+2y_{10}^2=1\) and \((x_{10},y_{10})\) is in the first quadrant,
\[\label{eq:29} (x_{10},y_{10})=\left(\frac2{\sqrt{6}},\frac1{\sqrt{6}}\right). \]
Now (iii), (iv), and Equation \ref{eq:28} yield the simultaneous system
\[x_{20}-y_{20}=x_{10}-y_{10}=\frac1{\sqrt{6}},\quad x_{20}+y_{20}=4, \nonumber \]
so
\[(x_{20},y_{20}) = \left(2+\frac1{2\sqrt{6}}, 2-\frac1{2\sqrt{6}}\right). \nonumber \]
From this and Equation \ref{eq:29}, the distance between the curves is
\[\left[\left(2+\frac1{2\sqrt{6}} -\frac2{\sqrt{6}} \right)^2 + \left(2- \frac1{2\sqrt{6}} - \frac1{ \sqrt{6}}\right)^2\right]^{1/2} = \sqrt2 \left(2-\frac{3}{2\sqrt{6}}\right). \nonumber \]
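As a sanity check on this value, the following Python sketch compares the closed form with a brute-force scan. It rests on two assumptions not stated in the text: the ellipse is parametrized as \((\cos t,\sin t/\sqrt2)\), and the distance from a point \((x,y)\) to the line \(x+y=4\) is taken from the standard point-to-line formula \(|x+y-4|/\sqrt2\). The names `closed_form`, `dist_to_line`, and `brute_force` are illustrative.

```python
import math

# Closed-form distance obtained from the Lagrange multiplier computation.
closed_form = math.sqrt(2) * (2 - 3 / (2 * math.sqrt(6)))


def dist_to_line(x, y):
    """Distance from (x, y) to the line x + y = 4 (point-to-line formula)."""
    return abs(x + y - 4) / math.sqrt(2)


# Scan the ellipse x^2 + 2*y^2 = 1 using the parametrization
# (cos t, sin t / sqrt(2)) on a uniform grid of t values.
n = 100_000
brute_force = min(
    dist_to_line(math.cos(2 * math.pi * k / n),
                 math.sin(2 * math.pi * k / n) / math.sqrt(2))
    for k in range(n)
)

# The closest ellipse point found in the text should lie on the ellipse.
x10, y10 = 2 / math.sqrt(6), 1 / math.sqrt(6)
on_ellipse = abs(x10 ** 2 + 2 * y10 ** 2 - 1) < 1e-12

print(closed_form, brute_force, on_ellipse)
```

The grid scan only approximates the minimum from above, so agreement to within a small tolerance supports the closed form but does not replace the derivation.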
[theorem:3]