Mathematics LibreTexts

2: Extrema Subject to One Constraint

    Here is Theorem [theorem:1] with \(m=1\).

    Theorem \(\PageIndex1\)

    [theorem:2]

    Suppose that \(n>1.\) If  \({\bf X}_0\) is a local extreme point of \(f\) subject to \(g({\bf X})=0\) and \(g_{x_{r}}({\bf X}_0)\ne0\) for some \(r\in\{1,2,\dots,n\},\) then there is a constant \(\lambda\) such that

    \[\label{eq:7} f_{x_i}({\bf X}_0)-\lambda g_{x_i}({\bf X}_0)=0,\quad 1\le i\le n; \]

    thus, \({\bf X}_0\) is a critical point of \(f-\lambda g\).

    Proof

    For notational convenience, let \(r=1\) and denote

    \[{\bf U}=(x_2,x_{3},\dots, x_{n})\text{ and } {\bf U}_0=(x_{20},x_{30},\dots, x_{n0}). \nonumber \]

    Since \(g_{x_1}({\bf X}_0)\ne0\), the Implicit Function Theorem (Corollary 6.4.2, p. 423) implies that there is a unique continuously differentiable function \(h=h({\bf U}),\) defined on a neighborhood \(N \subset{\mathbb R}^{n-1}\) of \({\bf U}_0,\) such that \((h({\bf U}),{\bf U})\in D\) for all \({\bf U}\in N\), \(h({\bf U}_0)=x_{10}\), and

    \[\label{eq:8} g(h({\bf U}),{\bf U})=0,\quad {\bf U}\in N. \]

    Now define

    \[\label{eq:9} \lambda=\frac{f_{x_1}({\bf X}_0)}{g_{x_1}({\bf X}_0)}, \]

    which is permissible, since \(g_{x_1}({\bf X}_0)\ne0\). This implies Equation \ref{eq:7} with \(i=1\). If \(i> 1\), differentiating Equation \ref{eq:8} with respect to \(x_i\) yields

    \[\label{eq:10} \frac{\partial g(h({\bf U}),{\bf U})}{\partial x_i} + \frac{\partial g(h({\bf U}),{\bf U})}{\partial x_1} \frac{\partial h({\bf U})}{\partial x_i}=0,\quad {\bf U}\in N. \]

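    Also, by the chain rule, the partial derivative of the composite function \({\bf U}\mapsto f(h({\bf U}),{\bf U})\) with respect to \(x_i\) (the left side below) is

    \[\label{eq:11} \frac{\partial f(h({\bf U}),{\bf U})}{\partial x_i}= \frac{\partial f(h({\bf U}),{\bf U})}{\partial x_i}+ \frac{\partial f(h({\bf U}),{\bf U})}{\partial x_1} \frac{\partial h({\bf U})}{\partial x_i},\quad {\bf U}\in N, \]

    where the partial derivatives of \(f\) on the right are evaluated at the point \((h({\bf U}),{\bf U})\).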

    Since \((h({\bf U}_0),{\bf U}_0)={\bf X}_0\), Equation \ref{eq:10} implies that

    \[\label{eq:12} \frac{\partial g({\bf X}_0)}{\partial x_i}+ \frac{\partial g({\bf X}_0)}{\partial x_1} \frac{\partial h({\bf U}_0)}{\partial x_i}=0. \]

    If  \({\bf X}_0\) is a local extreme point of \(f\) subject to \(g({\bf X})=0\), then \({\bf U}_0\) is an unconstrained local extreme point of \(f(h({\bf U}),{\bf U})\); therefore, Equation \ref{eq:11} implies that

    \[\label{eq:13} \frac{\partial f({\bf X}_0)}{\partial x_i}+ \frac{\partial f({\bf X}_0)}{\partial x_1} \frac{\partial h({\bf U}_0)}{\partial x_i}=0. \]

    Since a linear homogeneous system

    \[\left[\begin{array}{ccccccc} a&b\\c&d \end{array}\right] \left[\begin{array}{ccccccc} u\\v \end{array}\right]= \left[\begin{array}{ccccccc} 0\\0 \end{array}\right] \nonumber \]

    has a nontrivial solution if and only if

    \[\left|\begin{array}{ccccccc} a&b\\c&d \end{array}\right|=0, \nonumber \] (Theorem 6.1.15), Equations \ref{eq:12} and \ref{eq:13} imply that

    \[\left|\begin{array}{ccccccc} \displaystyle{\frac{\partial f({\bf X}_0)}{\partial x_i}}& \displaystyle{\frac{\partial f({\bf X}_0)}{\partial x_1}}\\\\ \displaystyle{\frac{\partial g({\bf X}_0)}{\partial x_i}}& \displaystyle{\frac{\partial g({\bf X}_0)}{\partial x_1}} \end{array}\right|=0,\quad\text{so}\quad \left|\begin{array}{ccccccc} \displaystyle{\frac{\partial f({\bf X}_0)}{\partial x_i}}& \displaystyle{\frac{\partial g({\bf X}_0)}{\partial x_i}}\\\\ \displaystyle{\frac{\partial f({\bf X}_0)}{\partial x_1}}& \displaystyle{\frac{\partial g({\bf X}_0)}{\partial x_1}} \end{array}\right|=0, \nonumber \]

    since the determinants of a matrix and its transpose are equal. Therefore, the system

    \[\left[\begin{array}{ccccccc} \displaystyle{\frac{\partial f({\bf X}_0)}{\partial x_i}}& \displaystyle{\frac{\partial g({\bf X}_0)}{\partial x_i}}\\\\ \displaystyle{\frac{\partial f({\bf X}_0)}{\partial x_1}}& \displaystyle{\frac{\partial g({\bf X}_0)}{\partial x_1}} \end{array}\right] \left[\begin{array}{ccccccc} u\\v \end{array}\right]= \left[\begin{array}{ccccccc} 0\\0 \end{array}\right] \nonumber \]

    has a nontrivial solution (Theorem 6.1.15). Since \(g_{x_1}({\bf X}_0)\ne0\), \(u\) must be nonzero in a nontrivial solution. Hence, we may assume that \(u=1\), so

    \[\label{eq:14} \left[\begin{array}{ccccccc} \displaystyle{\frac{\partial f({\bf X}_0)}{\partial x_i}}& \displaystyle{\frac{\partial g({\bf X}_0)}{\partial x_i}}\\\\ \displaystyle{\frac{\partial f({\bf X}_0)}{\partial x_1}}& \displaystyle{\frac{\partial g({\bf X}_0)}{\partial x_1}} \end{array}\right] \left[\begin{array}{ccccccc} 1\\ v \end{array}\right]= \left[\begin{array}{ccccccc} 0\\0 \end{array}\right]. \]

    In particular,

    \[\frac{\partial f({\bf X}_0)}{\partial x_1}+ v\frac{\partial g({\bf X}_0)}{\partial x_1}=0, \quad\text{so}\quad -v=\frac{f_{x_1}({\bf X}_0)}{g_{x_1}({\bf X}_0)}. \nonumber \]

    Now Equation \ref{eq:9} implies that \(-v=\lambda\), and Equation \ref{eq:14} becomes

    \[\left[\begin{array}{ccccccc} \displaystyle{\frac{\partial f({\bf X}_0)}{\partial x_i}}& \displaystyle{\frac{\partial g({\bf X}_0)}{\partial x_i}}\\\\ \displaystyle{\frac{\partial f({\bf X}_0)}{\partial x_1}}& \displaystyle{\frac{\partial g({\bf X}_0)}{\partial x_1}} \end{array}\right] \left[\begin{array}{rcccccc} 1\\ -\lambda \end{array}\right]= \left[\begin{array}{ccccccc} 0\\0 \end{array}\right]. \nonumber \]

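    Computing the topmost entry of the vector on the left yields \(f_{x_i}({\bf X}_0)-\lambda g_{x_i}({\bf X}_0)=0\), which is Equation \ref{eq:7} for \(i>1\). This completes the proof.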
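
    The conclusion of the theorem can be checked numerically at a known constrained extreme point. The following is a minimal sketch, not part of the proof: the helper name `multiplier_and_residuals` and the sample data (\(f=2x+y\), \(g=x^2+y^2-4\)) are assumptions chosen for illustration. It computes \(\lambda\) as in Equation \ref{eq:9} (with a zero-based index `r`) and then evaluates the left sides of Equation \ref{eq:7}.

```python
import math


def multiplier_and_residuals(grad_f, grad_g, r=0):
    """Return lam = f_{x_r}/g_{x_r} and the list of f_{x_i} - lam*g_{x_i}.

    grad_f and grad_g hold the partial derivatives of f and g at X_0.
    The index r is zero-based, so r=0 corresponds to x_1 in the text.
    """
    if grad_g[r] == 0:
        raise ValueError("g_{x_r} must be nonzero at the point")
    lam = grad_f[r] / grad_g[r]
    residuals = [fi - lam * gi for fi, gi in zip(grad_f, grad_g)]
    return lam, residuals


# Illustrative data (an assumption, not taken from the proof):
# f = 2x + y, g = x^2 + y^2 - 4, at the point (4/sqrt(5), 2/sqrt(5)).
x0, y0 = 4 / math.sqrt(5), 2 / math.sqrt(5)
lam, residuals = multiplier_and_residuals([2.0, 1.0], [2 * x0, 2 * y0])
print(lam, residuals)
```

    If the point really is a constrained extreme point, every residual should vanish up to rounding error; a residual that is clearly nonzero rules the point out.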

    Example \(\PageIndex1\)

    [example:1]

    Find the point \((x_0,y_0)\) on the line

    \[ax+by=d \nonumber \]

    closest to a given point \((x_1,y_1)\).

    Solution

    We must minimize \(\sqrt{(x-x_1)^2+(y-y_1)^2}\) subject to the constraint. This is equivalent to minimizing \((x-x_1)^2+(y-y_1)^2\) subject to the constraint, which is simpler. For this, we could let

    \[L=(x-x_1)^2+(y-y_1)^2-\lambda (ax+by-d); \nonumber \]

    however,

    \[L=\frac{(x-x_1)^2+(y-y_1)^2}2-\lambda (ax+by) \nonumber \]

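    is better: the factor \(1/2\) cancels the \(2\) produced by differentiation, and the constant term \(\lambda d\) can be dropped because it does not affect \(L_x\) or \(L_y\). Since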

    \[L_{x}=x-x_1-\lambda a\quad\text{and}\quad L_{y}=y-y_1-\lambda b, \nonumber \]

    \((x_0,y_0)=(x_1+\lambda a, y_1+\lambda b)\), where we must choose \(\lambda\) so that \(ax_0+by_0=d\). Therefore,

    \[ax_0+by_0=ax_1+by_1+\lambda(a^2+b^2)=d, \nonumber \]

    so

    \[\lambda= \frac{d-ax_1-by_1}{a^2+b^2}, \nonumber \]

    \[x_0=x_1+\frac{(d-ax_1-by_1)a}{a^2+b^2}, \text{ and } y_0=y_1+\frac{(d-ax_1-by_1)b}{a^2+b^2}. \nonumber \]

    The distance from \((x_1,y_1)\) to the line is

    \[\sqrt{(x_0-x_1)^2+(y_0-y_1)^2}= \frac{|d-ax_1-by_1|}{\sqrt{a^2+b^2}}. \nonumber \]
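
    The closed-form answer translates directly into code. The sketch below is illustrative only; the function names `closest_point_on_line` and `distance_to_line` are hypothetical, and the sample line \(3x+4y=10\) with the point \((0,0)\) is an assumed test case.

```python
import math


def closest_point_on_line(a, b, d, x1, y1):
    """Return (x0, y0, lam) for the point on ax + by = d nearest (x1, y1)."""
    norm_sq = a * a + b * b
    if norm_sq == 0:
        raise ValueError("a and b must not both be zero")
    lam = (d - a * x1 - b * y1) / norm_sq
    return x1 + lam * a, y1 + lam * b, lam


def distance_to_line(a, b, d, x1, y1):
    """Distance from (x1, y1) to the line ax + by = d."""
    return abs(d - a * x1 - b * y1) / math.sqrt(a * a + b * b)


# Assumed sample data: the line 3x + 4y = 10 and the point (0, 0).
x0, y0, lam = closest_point_on_line(3.0, 4.0, 10.0, 0.0, 0.0)
print(x0, y0, lam, distance_to_line(3.0, 4.0, 10.0, 0.0, 0.0))
```

    The returned point should satisfy the constraint, and its distance from \((x_1,y_1)\) should agree with the distance formula up to rounding.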

    Example \(\PageIndex2\)

    [example:2]

    Find the extreme values of \(f(x,y)=2x+y\) subject to

    \[x^2+y^2=4. \nonumber \]

    Solution

    Let

    \[L=2x+y-\frac{\lambda}2(x^2+y^2); \nonumber \]

    then

    \[L_{x}=2-\lambda x\text{ and } L_{y}=1-\lambda y, \nonumber \]

    so \((x_0,y_0)=(2/\lambda,1/\lambda)\). Since \(x_0^2+y_0^2=4\), \(\lambda=\pm \sqrt{5}/2\). Hence, the constrained maximum is \(2\sqrt{5}\), attained at \((4/\sqrt{5},2/\sqrt{5})\), and the constrained minimum is \(-2\sqrt{5}\), attained at \((-4/\sqrt{5},-2/\sqrt{5})\).
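
    As a sanity check, the two critical points can be compared with a brute-force scan of the circle. This is only a sketch: the parametrization \((2\cos t,2\sin t)\), the sample count, and the function names are assumptions made here, not part of the solution.

```python
import math


def f(x, y):
    return 2 * x + y


def critical_points():
    """Points (2/lam, 1/lam) for lam = sqrt(5)/2 and lam = -sqrt(5)/2."""
    lams = (math.sqrt(5) / 2, -math.sqrt(5) / 2)
    return [(2 / lam, 1 / lam) for lam in lams]


def sampled_extremes(samples=100_000):
    """Minimum and maximum of f over evenly spaced points of the circle."""
    values = [
        f(2 * math.cos(t), 2 * math.sin(t))
        for t in (2 * math.pi * k / samples for k in range(samples))
    ]
    return min(values), max(values)


print(critical_points())
print(sampled_extremes())
```

    The sampled extremes should approach \(\pm2\sqrt5\) from inside as the number of samples grows.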

    Example \(\PageIndex{3}\)

    [example:3]

    Find the point in the plane

    \[\label{eq:15} 3x+4y+z=1 \] closest to \((-1,1,1)\).

    Solution

    We must minimize

    \[f(x,y,z)=(x+1)^2+(y-1)^2+(z-1)^2 \nonumber \]

    subject to Equation \ref{eq:15}. Let

    \[L=\frac{(x+1)^2+(y-1)^2+(z-1)^2}2-\lambda(3x+4y+z); \nonumber \]

    then

    \[L_{x}= x+1-3\lambda,\quad L_{y}= y-1-4\lambda,\text{ and } L_{z}= z-1-\lambda, \nonumber \]

    so

    \[x_0=-1+3\lambda,\quad y_0=1+4\lambda,\quad z_0=1+\lambda. \nonumber \]

    From Equation \ref{eq:15},

    \[3(-1+3\lambda)+4(1+4\lambda)+(1+\lambda)-1=1+26\lambda=0, \nonumber \] so \(\lambda=-1/26\) and

    \[(x_0,y_0,z_0)= \left(-\frac{29}{26},\frac{22}{26},\frac{25}{26}\right). \nonumber \]

    The distance from \((x_0,y_0,z_0)\) to \((-1,1,1)\) is

    \[\sqrt{(x_0+1)^2+(y_0-1)^2+(z_0-1)^2}=\frac1{\sqrt{26}}. \nonumber \]
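
    The arithmetic above can be reproduced exactly with rational numbers. The following sketch assumes integer (or rational) inputs; the helper name `closest_point_in_plane` is hypothetical. It uses the same pattern as the solution: the closest point is the given point plus \(\lambda\) times the normal vector.

```python
from fractions import Fraction


def closest_point_in_plane(normal, rhs, point):
    """Closest point to `point` in the plane normal . X = rhs, plus lam.

    Inputs are assumed to be integers or Fractions so the result is exact.
    """
    n_dot_p = sum(n * p for n, p in zip(normal, point))
    n_dot_n = sum(n * n for n in normal)
    if n_dot_n == 0:
        raise ValueError("normal vector must be nonzero")
    lam = Fraction(rhs - n_dot_p, n_dot_n)
    closest = tuple(p + lam * n for p, n in zip(point, normal))
    return closest, lam


closest, lam = closest_point_in_plane((3, 4, 1), 1, (-1, 1, 1))
squared_distance = sum((c - p) ** 2 for c, p in zip(closest, (-1, 1, 1)))
print(closest, lam, squared_distance)
```

    Note that `Fraction` reduces \(22/26\) to \(11/13\), so the second coordinate prints in lowest terms.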

    Example \(\PageIndex{4}\)

    [example:4]

    Assume that \(n\ge 2\).

    (a) Find the extreme values of \(\displaystyle{\sum_{i=1}^{n}x_i}\) subject to \(\displaystyle{\sum_{i=1}^{n}x_i^2=1}\).
    (b) Find the minimum value of \(\displaystyle{\sum_{i=1}^{n}x_i^2}\) subject to \(\displaystyle{\sum_{i=1}^{n}x_i=1}\).

    Solution

    (a) Let

    \[L= \sum_{i=1}^{n}x_i-\frac{\lambda}2\sum_{i=1}^{n}x_i^2; \nonumber \] then

    \[L_{x_i}=1-\lambda x_i, \quad\text{so}\quad x_{i0}=\frac1{\lambda}, \quad 1\le i\le n. \nonumber \]

    Hence, \(\displaystyle{\sum_{i=1}^{n}x_{i0}^2}=n/\lambda^2\), so \(\lambda=\pm\sqrt{n}\)  and

    \[(x_{10},x_{20},\dots,x_{n0})= \pm\left(\frac1{\sqrt{n}},\frac1{\sqrt{n}}, \dots, \frac1{\sqrt{n}}\right). \nonumber \]

    Therefore, the constrained maximum is \(\sqrt{n}\) and the constrained minimum is \(-\sqrt{n}\).

    (b) Let

    \[L=\frac12 \sum_{i=1}^{n}x_i^2-\lambda\sum_{i=1}^{n}x_i; \nonumber \] then

    \[L_{x_i}=x_i-\lambda, \quad\text{so}\quad x_{i0}=\lambda,\quad 1\le i\le n. \nonumber \]

    Hence, \(\displaystyle{\sum_{i=1}^{n}x_{i0}}=n\lambda=1\), so \(x_{i0}=\lambda=1/n\) and the constrained minimum is

    \[\displaystyle{\sum_{i=1}^{n}x_{i0}^2}=\frac1{n}. \nonumber \]

    There is no constrained maximum. (Why?)
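
    Both bounds can be spot-checked by random sampling. The sketch below follows the worked solution in allowing coordinates of either sign; the function names, the Gaussian sampling scheme, the trial count, and the tolerance are all assumptions made for illustration. A passing check is evidence for the bounds, not a proof of them.

```python
import math
import random


def sum_bounded_on_sphere(n, trials=2000, seed=0):
    """Check |x_1 + ... + x_n| <= sqrt(n) at random points with sum x_i^2 = 1."""
    rng = random.Random(seed)
    bound = math.sqrt(n)
    for _ in range(trials):
        v = [rng.gauss(0, 1) for _ in range(n)]
        norm = math.sqrt(sum(x * x for x in v))
        u = [x / norm for x in v]  # rescaled so the squares sum to 1
        if abs(sum(u)) > bound + 1e-12:
            return False
    return True


def square_sum_bounded_on_plane(n, trials=2000, seed=0):
    """Check sum x_i^2 >= 1/n at random points with x_1 + ... + x_n = 1."""
    rng = random.Random(seed)
    for _ in range(trials):
        v = [rng.gauss(0, 1) for _ in range(n)]
        shift = (1 - sum(v)) / n
        w = [x + shift for x in v]  # shifted so the coordinates sum to 1
        if sum(x * x for x in w) < 1 / n - 1e-12:
            return False
    return True


print(sum_bounded_on_sphere(5), square_sum_bounded_on_plane(5))
```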

    Example \(\PageIndex{5}\)

    [example:5]

    Show that

    \[x^{1/p}y^{1/q} \le \frac{x}{p}+\frac{y}{q}, \quad x,y \ge 0, \nonumber \]

    if

    \[\label{eq:16} \frac1{p} +\frac1{q} = 1, \quad p > 0, \text{ and } q > 0. \]

    Solution

    We first find the maximum of

    \[f(x,y) = x^{1/p}y^{1/q} \nonumber \]

    subject to

    \[\label{eq:17} \frac{x}{p}+\frac{y}{q} = \sigma, \quad x \ge 0, \quad y \ge 0, \]

    where \(\sigma\) is a fixed but arbitrary positive number. Since \(f\) is continuous and the line segment defined by Equation \ref{eq:17} is closed and bounded, \(f\) must assume a maximum at some point \((x_0,y_0)\) on the segment. Moreover, \((x_0,y_0)\) cannot be an endpoint of the segment, since \(f(p\sigma,0) = f(0,q\sigma)=0\) while \(f>0\) at every other point of the segment. Therefore, \((x_0,y_0)\) is in the open first quadrant.

    Let

    \[L = x^{1/p}y^{1/q} -\lambda \left(\frac{x}{p}+\frac{y}{q}\right). \nonumber \]

    Then

    \[L_x = \frac1{px} f(x,y) - \frac{\lambda}{p} \text{ and } L_y = \frac1{qy} f(x,y) - \frac{\lambda}{q}, \nonumber \]

    so setting \(L_x=L_y=0\) at \((x_0,y_0)\) yields \(x_0 = y_0=f(x_0,y_0)/\lambda\). Now Equations \ref{eq:16} and \ref{eq:17} imply that \(x_0 =y_0 = \sigma\). Therefore,

    \[f(x,y) \le f(\sigma,\sigma) = \sigma^{1/p}\sigma^{1/q} = \sigma=\frac{x}{p}+\frac{y}{q}. \nonumber \]

    This can be generalized (Exercise [exer:53]). It can also be used to generalize Schwarz’s inequality (Exercise [exer:54]).
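
    The inequality of this example is easy to test numerically. In the sketch below, the name `young_gap` is hypothetical, and \(q\) is computed from \(p\) as \(q=p/(p-1)\), which requires \(p>1\) (a consequence of Equation \ref{eq:16}). The gap should be nonnegative and should vanish when \(x=y\).

```python
import random


def young_gap(x, y, p):
    """Return x/p + y/q - x**(1/p) * y**(1/q), where 1/p + 1/q = 1."""
    if p <= 1:
        raise ValueError("p must exceed 1 so that q = p/(p-1) is positive")
    q = p / (p - 1)
    return x / p + y / q - x ** (1 / p) * y ** (1 / q)


rng = random.Random(0)
gaps = [
    young_gap(rng.uniform(0, 10), rng.uniform(0, 10), rng.uniform(1.1, 5.0))
    for _ in range(1000)
]
print(min(gaps), young_gap(3.0, 3.0, 2.5))
```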


    This page titled 2: Extrema Subject to One Constraint is shared under a CC BY-NC-SA 3.0 license and was authored, remixed, and/or curated by William F. Trench.
