
4: Extrema subject to Two Constraints


    Here is Theorem [theorem:1] with \(m=2\).

    Theorem \(\PageIndex1\)

    Suppose that \(n>2.\) If  \({\bf X}_{0}\) is a local extreme point of \(f\) subject to \(g_1({\bf X})=g_2({\bf X})=0\) and

    \[\label{eq:19} \left|\begin{array}{ccccccc} \displaystyle{\frac{\partial g_1({\bf X}_{0})}{\partial x_{r}}}& \displaystyle{\frac{\partial g_1({\bf X}_{0})}{\partial x_{s}}}\\\\ \displaystyle{\frac{\partial g_2({\bf X}_{0})}{\partial x_{r}}}& \displaystyle{\frac{\partial g_2({\bf X}_{0})}{\partial x_{s}}}\\ \end{array}\right|\ne0\]

    for some \(r\) and \(s\) in \(\{1,2,\dots,n\},\) then there are constants \(\lambda\) and \(\mu\) such that

    \[\label{eq:20} \frac{\partial f({\bf X}_{0})}{\partial x_{i}}- \lambda\frac{\partial g_1({\bf X}_{0})}{\partial x_{i}}- \mu\frac{\partial g_2({\bf X}_{0})}{\partial x_{i}}=0,\]

    \(1\le i\le n\).

    Proof

    For notational convenience, let \(r=1\) and \(s=2\). Denote

    \[{\bf U}=(x_{3},x_{4},\dots,x_{n})\text{ and } {\bf U}_{0}=(x_{30},x_{40},\dots,x_{n0}).\]

    Since

    \[\label{eq:21} \left|\begin{array}{ccccccc} \displaystyle{\frac{\partial g_1({\bf X}_{0})}{\partial x_1}}& \displaystyle{\frac{\partial g_1({\bf X}_{0})}{\partial x_2}}\\\\ \displaystyle{\frac{\partial g_2({\bf X}_{0})}{\partial x_1}}& \displaystyle{\frac{\partial g_2({\bf X}_{0})}{\partial x_2}}\\ \end{array}\right|\ne0,\]

    the Implicit Function Theorem (Theorem 6.4.1) implies that there are unique continuously differentiable functions

    \[h_1=h_1(x_{3},x_{4},\dots,x_{n})\text{ and } h_2=h_2(x_{3},x_{4},\dots,x_{n}),\]

    defined on a neighborhood \(N\subset{\mathbb R}^{n-2}\) of \({\bf U}_{0},\) such that \((h_1({\bf U}),h_2({\bf U}),{\bf U})\in D\) for all \({\bf U}\in N\), \(h_1({\bf U}_{0})=x_{10}\), \(h_2({\bf U}_{0})=x_{20}\), and

    \[\label{eq:22} g_1(h_1({\bf U}),h_2({\bf U}),{\bf U})= g_2(h_1({\bf U}),h_2({\bf U}),{\bf U})=0,\quad {\bf U}\in N.\]

    From Equation \ref{eq:21}, and since a matrix and its transpose have the same determinant, the system

    \[\label{eq:23} \left[\begin{array}{ccccccc} \displaystyle{\frac{\partial g_1({\bf X}_{0})}{\partial x_1}}& \displaystyle{\frac{\partial g_2({\bf X}_{0})}{\partial x_1}}\\\\ \displaystyle{\frac{\partial g_1({\bf X}_{0})}{\partial x_2}}& \displaystyle{\frac{\partial g_2({\bf X}_{0})}{\partial x_2}}\\ \end{array}\right] \left[\begin{array}{ccccccc} \lambda\\\mu \end{array}\right]= \left[\begin{array}{ccccccc} f_{x_1}({\bf X}_{0})\\f_{x_2}({\bf X}_{0})\\ \end{array}\right]\]

    has a unique solution (Theorem 6.1.13). This implies Equation \ref{eq:20} with \(i=1\) and \(i=2\). If \(3\le i\le n\), then differentiating Equation \ref{eq:22} with respect to \(x_{i}\) and recalling that \((h_1({\bf U}_{0}),h_2({\bf U}_{0}),{\bf U}_{0})={\bf X}_{0}\) yields

    \[\frac{\partial g_1({\bf X}_{0})}{\partial x_{i}}+ \frac{\partial g_1({\bf X}_{0})}{\partial x_1} \frac{\partial h_1({\bf U}_{0})}{\partial x_{i}}+ \frac{\partial g_1({\bf X}_{0})}{\partial x_2} \frac{\partial h_2({\bf U}_{0})}{\partial x_{i}}=0\]

    and

    \[\frac{\partial g_2({\bf X}_{0})}{\partial x_{i}}+ \frac{\partial g_2({\bf X}_{0})}{\partial x_1} \frac{\partial h_1({\bf U}_{0})}{\partial x_{i}}+ \frac{\partial g_2({\bf X}_{0})}{\partial x_2} \frac{\partial h_2({\bf U}_{0})}{\partial x_{i}}=0.\]

    If \({\bf X}_{0}\) is a local extreme point of \(f\) subject to \(g_1({\bf X})=g_2({\bf X})=0\), then \({\bf U}_{0}\) is an unconstrained local extreme point of \(f(h_1({\bf U}),h_2({\bf U}),{\bf U})\); therefore,

    \[\frac{\partial f({\bf X}_{0})}{\partial x_{i}}+ \frac{\partial f({\bf X}_{0})}{\partial x_1} \frac{\partial h_1({\bf U}_{0})}{\partial x_{i}}+ \frac{\partial f({\bf X}_{0})}{\partial x_2} \frac{\partial h_2({\bf U}_{0})}{\partial x_{i}}=0.\]

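    The last three equations state that \(\left(1,\partial h_1({\bf U}_{0})/\partial x_{i},\partial h_2({\bf U}_{0})/\partial x_{i}\right)\) is a nontrivial solution of a homogeneous linear system of three equations in three unknowns, so the determinant of the coefficient matrix must vanish; that is,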

    \[\left|\begin{array}{ccccccc} \displaystyle{\frac{\partial f({\bf X}_{0})}{\partial x_{i}}} & \displaystyle{\frac{\partial f({\bf X}_{0})}{\partial x_1}}& \displaystyle{\frac{\partial f({\bf X}_{0})}{\partial x_2}}\\\\ \displaystyle{\frac{\partial g_1({\bf X}_{0})}{\partial x_{i}}}& \displaystyle{\frac{\partial g_1({\bf X}_{0})}{\partial x_1}} & \displaystyle{\frac{\partial g_1({\bf X}_{0})}{\partial x_2}} \\\\ \displaystyle{\frac{\partial g_2({\bf X}_{0})}{\partial x_{i}}}& \displaystyle{\frac{\partial g_2({\bf X}_{0})}{\partial x_1}}& \displaystyle{\frac{\partial g_2({\bf X}_{0})}{\partial x_2}}\\ \end{array}\right|=0,\]

    so, because a matrix and its transpose have the same determinant,

    \[\left|\begin{array}{ccccccc} \displaystyle{\frac{\partial f({\bf X}_{0})}{\partial x_{i}}}& \displaystyle{\frac{\partial g_1({\bf X}_{0})}{\partial x_{i}}} & \displaystyle{\frac{\partial g_2({\bf X}_{0})}{\partial x_{i}}} \\\\ \displaystyle{\frac{\partial f({\bf X}_{0})}{\partial x_1}} & \displaystyle{\frac{\partial g_1({\bf X}_{0})}{\partial x_1}} & \displaystyle{\frac{\partial g_2({\bf X}_{0})}{\partial x_1}} \\\\ \displaystyle{\frac{\partial f({\bf X}_{0})}{\partial x_2}}& \displaystyle{\frac{\partial g_1({\bf X}_{0})}{\partial x_2}}& \displaystyle{\frac{\partial g_2({\bf X}_{0})}{\partial x_2}} \\\\ \end{array}\right|=0.\]

    Therefore, there are constants \(c_1\), \(c_2\), \(c_{3}\), not all zero, such that

    \[\label{eq:24} \left[\begin{array}{ccccccc} \displaystyle{\frac{\partial f({\bf X}_{0})}{\partial x_{i}}} & \displaystyle{\frac{\partial g_1({\bf X}_{0})}{\partial x_{i}}} & \displaystyle{\frac{\partial g_2({\bf X}_{0})}{\partial x_{i}}} \\\\ \displaystyle{\frac{\partial f({\bf X}_{0})}{\partial x_1}}& \displaystyle{\frac{\partial g_1({\bf X}_{0})}{\partial x_1}}& \displaystyle{\frac{\partial g_2({\bf X}_{0})}{\partial x_1}}\\\\ \displaystyle{\frac{\partial f({\bf X}_{0})}{\partial x_2}}& \displaystyle{\frac{\partial g_1({\bf X}_{0})}{\partial x_2}} & \displaystyle{\frac{\partial g_2({\bf X}_{0})}{\partial x_2}}\\\\ \end{array}\right] \left[\begin{array}{ccccccc} c_1\\c_2\\c_{3} \end{array}\right]= \left[\begin{array}{ccccccc} 0\\0\\0 \end{array}\right].\]

    If \(c_1=0\), then

    \[\left[\begin{array}{ccccccc} \displaystyle{\frac{\partial g_1({\bf X}_{0})}{\partial x_1}} & \displaystyle{\frac{\partial g_2({\bf X}_{0})}{\partial x_1}} \\\\ \displaystyle{\frac{\partial g_1({\bf X}_{0})}{\partial x_2}} & \displaystyle{\frac{\partial g_2({\bf X}_{0})}{\partial x_2}} \\ \end{array}\right] \left[\begin{array}{ccccccc} c_2\\c_{3} \end{array}\right]= \left[\begin{array}{ccccccc} 0\\0 \end{array}\right],\]

    so Equation \ref{eq:19} implies that \(c_2=c_{3}=0\); hence, we may assume that \(c_1=1\) in a nontrivial solution of Equation \ref{eq:24}. Therefore,

    \[\label{eq:25} \left[\begin{array}{ccccccc} \displaystyle{\frac{\partial f({\bf X}_{0})}{\partial x_{i}}}& \displaystyle{\frac{\partial g_1({\bf X}_{0})}{\partial x_{i}}}& \displaystyle{\frac{\partial g_2({\bf X}_{0})}{\partial x_{i}}}\\\\ \displaystyle{\frac{\partial f({\bf X}_{0})}{\partial x_1}}& \displaystyle{\frac{\partial g_1({\bf X}_{0})}{\partial x_1}} & \displaystyle{\frac{\partial g_2({\bf X}_{0})}{\partial x_1}} \\\\ \displaystyle{\frac{\partial f({\bf X}_{0})}{\partial x_2}} & \displaystyle{\frac{\partial g_1({\bf X}_{0})}{\partial x_2}} & \displaystyle{\frac{\partial g_2({\bf X}_{0})}{\partial x_2}} \\\\ \end{array}\right] \left[\begin{array}{ccccccc} 1\\c_2\\c_{3} \end{array}\right]= \left[\begin{array}{ccccccc} 0\\0\\0 \end{array}\right],\]

    which implies that

    \[\left[\begin{array}{ccccccc} \displaystyle{\frac{\partial g_1({\bf X}_{0})}{\partial x_1}} & \displaystyle{\frac{\partial g_2({\bf X}_{0})}{\partial x_1}} \\\\ \displaystyle{\frac{\partial g_1({\bf X}_{0})}{\partial x_2}} & \displaystyle{\frac{\partial g_2({\bf X}_{0})}{\partial x_2}} \\ \end{array}\right] \left[\begin{array}{ccccccc} -c_2\\-c_{3} \end{array}\right]= \left[\begin{array}{ccccccc} f_{x_1}({\bf X}_{0})\\f_{x_2}({\bf X}_{0})\\ \end{array}\right].\]

    Since Equation \ref{eq:23} has only one solution, this implies that \(c_2=-\lambda\) and \(c_{3}=-\mu\), so Equation \ref{eq:25} becomes

    \[\left[\begin{array}{ccccccc} \displaystyle{\frac{\partial f({\bf X}_{0})}{\partial x_{i}}} & \displaystyle{\frac{\partial g_1({\bf X}_{0})}{\partial x_{i}}}& \displaystyle{\frac{\partial g_2({\bf X}_{0})}{\partial x_{i}}}\\\\ \displaystyle{\frac{\partial f({\bf X}_{0})}{\partial x_1}}& \displaystyle{\frac{\partial g_1({\bf X}_{0})}{\partial x_1}} & \displaystyle{\frac{\partial g_2({\bf X}_{0})}{\partial x_1}} \\\\ \displaystyle{\frac{\partial f({\bf X}_{0})}{\partial x_2}} & \displaystyle{\frac{\partial g_1({\bf X}_{0})}{\partial x_2}}& \displaystyle{\frac{\partial g_2({\bf X}_{0})}{\partial x_2}}\\\\ \end{array}\right] \left[\begin{array}{rcccccc} 1\\-\lambda\\-\mu \end{array}\right]= \left[\begin{array}{ccccccc} 0\\0\\0 \end{array}\right].\]

    Computing the topmost entry of the vector on the left yields Equation \ref{eq:20}.
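
    The structure of the proof can be checked on a concrete instance. The following Python sketch is an illustration only, not part of the argument. It assumes the data of the first example on this page (\(f=x^2+y^2+z^2+w^2\), \(g_1=x+y+z+w-10\), \(g_2=x-y+z+3w-6\), and the point \((5/2,7/2,5/2,3/2)\)), determines \(\lambda\) and \(\mu\) from the cases \(i=1\) and \(i=2\) of Equation \ref{eq:20} alone, and then evaluates the left side of Equation \ref{eq:20} for every \(i\). The helper names `multipliers` and `residuals` are hypothetical, and indices in the code are zero-based.

```python
from fractions import Fraction as F

# Assumed illustration data, taken from the first example on this page.
# Gradients are evaluated at X0 = (5/2, 7/2, 5/2, 3/2).
grad_f = [F(5), F(7), F(5), F(3)]      # gradient of x^2 + y^2 + z^2 + w^2
grad_g1 = [F(1), F(1), F(1), F(1)]     # gradient of x + y + z + w - 10
grad_g2 = [F(1), F(-1), F(1), F(3)]    # gradient of x - y + z + 3w - 6


def multipliers(gf, g1, g2, r=0, s=1):
    """Solve the i = r and i = s cases of (20) for (lambda, mu) by Cramer's rule."""
    det = g1[r] * g2[s] - g2[r] * g1[s]
    if det == 0:
        raise ValueError("the 2x2 determinant vanishes for this choice of r and s")
    lam = (gf[r] * g2[s] - g2[r] * gf[s]) / det
    mu = (g1[r] * gf[s] - gf[r] * g1[s]) / det
    return lam, mu


def residuals(gf, g1, g2, lam, mu):
    """Left-hand side of (20) for every index i."""
    return [a - lam * b - mu * c for a, b, c in zip(gf, g1, g2)]


lam, mu = multipliers(grad_f, grad_g1, grad_g2)
print(lam, mu)                                       # multipliers from i = 1, 2
print(residuals(grad_f, grad_g1, grad_g2, lam, mu))  # should all be zero
```

    The multipliers computed here are twice those found in the example, because the example applies the method to \(f/2\) rather than to \(f\).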

    Example \(\PageIndex1\)

    Minimize

    \[f(x,y,z,w) = x^2+y^2+z^2+w^2 \nonumber\]

    subject to

    \[\label{eq:26} x+y+z+w = 10 \text{ and } x-y+z+3w = 6.\]

    Solution

    Let

    \[L = \frac{x^2+y^2+z^2+w^2}2-\lambda(x+y+z+w)-\mu(x-y+z+3w); \nonumber\]

    then

    \[\begin{align*} L_x &= x-\lambda-\mu \\ L_y &= y-\lambda+\mu \\ L_z &= z-\lambda-\mu \\ L_w &= w-\lambda-3\mu,\end{align*}\]

    so

    \[\label{eq:27} x_{0} = \lambda+\mu, \quad y_{0} = \lambda-\mu, \quad z_{0} = \lambda+\mu, \quad w_{0} = \lambda+3\mu.\]

    This and Equation \ref{eq:26} imply that

    \[\begin{aligned} (\lambda+\mu)+(\lambda-\mu)+(\lambda+\mu) + (\lambda+3\mu) &= 10 \\ (\lambda+\mu)-(\lambda-\mu)+(\lambda+\mu)+ (3\lambda+9\mu) &= \phantom{1}6.\end{aligned}\]

    Therefore,

    \[\begin{aligned} 4\lambda + \phantom{1}4\mu &= 10 \\ 4\lambda + 12\mu &= \phantom{1}6,\end{aligned}\]

    so \(\lambda=3\) and \(\mu = -1/2\). Now Equation \ref{eq:27} implies that

    \[(x_{0},y_{0},z_{0},w_{0}) = \left(\frac{5}2,\frac{7}2,\frac{5}2, \frac{3}2\right).\]

    Since \(f(x,y,z,w)\) is the square of the distance from \((x,y,z,w)\) to the origin, it attains a minimum value (but not a maximum value) subject to the constraints; hence the constrained minimum value is

    \[f\left(\frac{5}2,\frac{7}2,\frac{5}2, \frac{3}2\right)=27.\]
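
    The arithmetic in this solution can be reproduced exactly. The following Python sketch is an illustration under stated assumptions, not part of the solution: it writes the constraints as \({\bf a}\cdot{\bf X}=10\) and \({\bf b}\cdot{\bf X}=6\) (the vectors `a` and `b` and the helper names `dot` and `solve_example` are hypothetical), uses the stationarity condition \({\bf X}_0=\lambda{\bf a}+\mu{\bf b}\) from Equation \ref{eq:27}, and solves the resulting \(2\times2\) system by Cramer's rule with rational arithmetic.

```python
from fractions import Fraction

# Constraint data from the example:
#   x + y + z + w = 10   and   x - y + z + 3w = 6.
a = [Fraction(1), Fraction(1), Fraction(1), Fraction(1)]
b = [Fraction(1), Fraction(-1), Fraction(1), Fraction(3)]
rhs = (Fraction(10), Fraction(6))


def dot(u, v):
    """Euclidean inner product of two equal-length sequences."""
    return sum(ui * vi for ui, vi in zip(u, v))


def solve_example():
    # Substituting X0 = lam*a + mu*b into the constraints gives
    #   (a.a) lam + (a.b) mu = 10
    #   (a.b) lam + (b.b) mu = 6
    aa, ab, bb = dot(a, a), dot(a, b), dot(b, b)
    det = aa * bb - ab * ab
    lam = (rhs[0] * bb - ab * rhs[1]) / det
    mu = (aa * rhs[1] - ab * rhs[0]) / det
    point = [lam * ai + mu * bi for ai, bi in zip(a, b)]
    return lam, mu, point


lam, mu, point = solve_example()
print(lam, mu)              # multipliers
print(point)                # constrained critical point
print(dot(point, point))    # value of f at that point
```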

    Example \(\PageIndex2\)

    The distance between two curves in \(\mathbb{R}^2\) is the minimum value of

    \[\sqrt{(x_1-x_2)^2+(y_1-y_2)^2}, \nonumber\]

    where \((x_1,y_1)\) is on one curve and \((x_2,y_2)\) is on the other. Find the distance between the ellipse

    \[x^2+2y^2=1 \nonumber\]

    and the line

    \[\label{eq:28} x+y=4.\]

    Solution

    We must minimize

    \[d^2=(x_1-x_2)^2 + (y_1-y_2)^2 \nonumber\]

    subject to

    \[x_1^2 + 2y_1^2 =1 \text{ and } x_2+y_2 = 4.\]

    Let

    \[L = \frac{(x_1-x_2)^2 + (y_1-y_2)^2 - \lambda(x_1^2 + 2y_1^2)}2 -\mu(x_2+y_2);\]

    then

    \[\begin{aligned} L_{x_1}&=x_1-x_2-\lambda x_1\\ L_{y_1}&=y_1-y_2-2\lambda y_1\\ L_{x_2}&=x_2-x_1-\mu\\ L_{y_2}&=y_2-y_1-\mu,\end{aligned}\]

    so

    \[\begin{aligned} x_{10}-x_{20}&=\lambda x_{10} &&\text{(i)}\\ y_{10}-y_{20}&=2\lambda y_{10} &&\text{(ii)}\\ x_{20}-x_{10}&=\mu &&\text{(iii)}\\ y_{20}-y_{10}&=\mu. &&\text{(iv)}\end{aligned}\]

    From (i) and (iii), \(\mu=-\lambda x_{10}\); from (ii) and (iv), \(\mu=-2\lambda y_{10}\). Since the curves do not intersect, \(\lambda\ne0\), so \(x_{10}=2y_{10}\). Since \(x_{10}^2+2y_{10}^2=1\) and \((x_{10},y_{10})\) is in the first quadrant,

    \[\label{eq:29} (x_{10},y_{10})=\left(\frac2{\sqrt{6}},\frac1{\sqrt{6}}\right).\]

    Now (iii), (iv), and Equation \ref{eq:28} yield the simultaneous system

    \[x_{20}-y_{20}=x_{10}-y_{10}=\frac1{\sqrt{6}},\quad x_{20}+y_{20}=4,\]

    so

    \[(x_{20},y_{20}) = \left(2+\frac1{2\sqrt{6}}, 2-\frac1{2\sqrt{6}}\right).\]

    From this and Equation \ref{eq:29}, the distance between the curves is

    \[\left[\left(2+\frac1{2\sqrt{6}} -\frac2{\sqrt{6}} \right)^2 + \left(2- \frac1{2\sqrt{6}} - \frac1{ \sqrt{6}}\right)^2\right]^{1/2} = \sqrt2 \left(2-\frac{3}{2\sqrt{6}}\right).\]
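
    This value can be cross-checked numerically by a different route. The following Python sketch is an illustration only and does not use Lagrange multipliers: it assumes the parametrization \((\cos t,\sin t/\sqrt2)\) of the ellipse and the standard point-to-line distance formula \(|x+y-4|/\sqrt2\), neither of which appears in the solution above, and compares a grid minimum with the closed form. The function names and the sample count are hypothetical choices.

```python
import math


def closed_form_distance():
    # Value obtained in the example: sqrt(2) * (2 - 3 / (2 * sqrt(6))).
    return math.sqrt(2) * (2 - 3 / (2 * math.sqrt(6)))


def grid_distance(samples=100_000):
    # Parametrize the ellipse x^2 + 2y^2 = 1 as (cos t, sin t / sqrt(2)).
    # The distance from (x, y) to the line x + y = 4 is |x + y - 4| / sqrt(2).
    best = float("inf")
    for k in range(samples):
        t = 2 * math.pi * k / samples
        x, y = math.cos(t), math.sin(t) / math.sqrt(2)
        best = min(best, abs(x + y - 4) / math.sqrt(2))
    return best


# Closest points found in the example.
s6 = math.sqrt(6)
p1 = (2 / s6, 1 / s6)                        # on the ellipse
p2 = (2 + 1 / (2 * s6), 2 - 1 / (2 * s6))    # on the line

print(closed_form_distance())
print(grid_distance())
print(math.hypot(p2[0] - p1[0], p2[1] - p1[1]))
```

    The three printed values should agree to within the resolution of the grid.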