Mathematics LibreTexts

1: Introduction to Lagrange Multipliers


    To avoid repetition, it is to be understood throughout that \(f\) and \(g_{1}\), \(g_{2}\),…, \(g_{m}\) are continuously differentiable on an open set \(D\) in \(\mathbb{R}^{n}\).

    Suppose that \(m<n\) and

    \[\label{eq:1} g_{1}(\mathbf{X}) = g_2(\mathbf{X}) = \cdots = g_{m}(\mathbf{X})=0\]

    on a nonempty subset \(D_{1}\) of \(D\). If  \(\mathbf{X}_{0} \in D_{1}\) and there is a neighborhood \(N\) of \(\mathbf{X}_{0}\) such that

    \[\label{eq:2} f(\mathbf{X}) \le f(\mathbf{X}_{0})\]

    for every \(\mathbf{X}\) in \(N \cap D_{1}\), then \(\mathbf{X}_{0}\) is a local maximum point of \(f\) subject to the constraints Equation \ref{eq:1}. However, we will usually say “subject to” rather than “subject to the constraint(s).”

    If Equation \ref{eq:2} is replaced by

    \[\label{eq:3} f(\mathbf{X}) \ge f(\mathbf{X}_{0}),\]

    then “maximum” is replaced by “minimum.” A local maximum or minimum of \(f\) subject to Equation \ref{eq:1} is also called a local extreme point of \(f\) subject to Equation \ref{eq:1}. More briefly, we also speak of constrained local maximum, minimum, or extreme points. If Equation \ref{eq:2} or Equation \ref{eq:3} holds for all \(\mathbf{X}\) in \(D_{1}\), we omit “local.”
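
    For example (an illustration chosen here for concreteness, with \(n=2\) and \(m=1\)), let \(f(x,y)=x+y\) and \(g_{1}(x,y)=x^{2}+y^{2}-2\). On \(\mathbb{R}^{2}\), \(f\) has no local extreme points, since \(f(x+t,y+t)=f(x,y)+2t\) is larger or smaller than \(f(x,y)\) according to the sign of \(t\). However, if \(g_{1}(x,y)=0\), then

    \[(x+y)^{2}\le 2(x^{2}+y^{2})=4, \nonumber\]

    so \(-2\le f(x,y)\le 2\), with equality on the right only at \((1,1)\) and on the left only at \((-1,-1)\). Hence, \((1,1)\) is a maximum point and \((-1,-1)\) is a minimum point of \(f\) subject to \(g_{1}=0\).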

    Recall that \({\bf X}_{0}=(x_{10}, x_{20},\dots,x_{n0})\) is a critical point of a differentiable function \(L=L(x_{1},x_{2},\dots,x_{n})\) if

    \[L_{x_{i}}(x_{10},x_{20},\dots,x_{n0})=0,\quad 1\le i\le n. \nonumber\]

    Every local extreme point of a differentiable function \(L\) is a critical point of \(L\); however, a critical point of \(L\) is not necessarily a local extreme point of \(L\).
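
    A standard illustration of the second statement (the function is chosen here as an example) is \(L(x_{1},x_{2})=x_{1}^{2}-x_{2}^{2}\), for which

    \[L_{x_{1}}(x_{1},x_{2})=2x_{1},\quad L_{x_{2}}(x_{1},x_{2})=-2x_{2}. \nonumber\]

    Both partial derivatives vanish at \((0,0)\), so \((0,0)\) is a critical point of \(L\). However, \(L(x_{1},0)=x_{1}^{2}>0\) if \(x_{1}\ne0\) and \(L(0,x_{2})=-x_{2}^{2}<0\) if \(x_{2}\ne0\), so every neighborhood of \((0,0)\) contains points where \(L\) is larger than \(L(0,0)=0\) and points where it is smaller. Hence, \((0,0)\) is not a local extreme point of \(L\).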

    Suppose that the system Equation \ref{eq:1} of simultaneous equations can be solved for \(x_{1}\), …, \(x_{m}\) in terms of the remaining variables \(x_{m+1}\), …, \(x_{n}\); thus,

    \[\label{eq:4} x_{j}=h_{j}(x_{m+1},\dots,x_{n}),\quad 1\le j\le m.\]

    Then a constrained extreme value of \(f\) is an unconstrained extreme value of

    \[\label{eq:5} f(h_{1}(x_{m+1},\dots,x_{n}),\dots,h_{m}(x_{m+1},\dots,x_{n}),x_{m+1},\dots,x_{n}).\]

    However, it may be difficult or impossible to find explicit formulas for \(h_{1}\), \(h_{2}\), …, \(h_{m}\), and, even if it is possible, the composite function Equation \ref{eq:5} is almost always complicated. Fortunately, there is a better way to find constrained extrema, which also requires the solvability assumption, but does not require an explicit formula as indicated in Equation \ref{eq:4}. It is based on the following theorem. Since the proof is complicated, we consider two special cases first.
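
    To see the substitution approach in a case where it is easy (an example chosen here for concreteness, with \(n=2\) and \(m=1\)), let \(f(x,y)=xy\) and \(g_{1}(x,y)=x+y-2\). The constraint \(g_{1}(x,y)=0\) can be solved for \(x\) in terms of \(y\):

    \[x=h_{1}(y)=2-y. \nonumber\]

    The composite function is then

    \[f(h_{1}(y),y)=(2-y)y=1-(y-1)^{2}, \nonumber\]

    which attains its unconstrained maximum value \(1\) at \(y=1\). Since \(h_{1}(1)=1\), the point \((1,1)\) is a maximum point of \(f\) subject to \(g_{1}=0\). Here the elimination is trivial because the constraint is linear; for most constraints no such explicit formula is available.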

    Theorem \(\PageIndex{1}\)

    Suppose that \(n>m\). If \({\bf X}_{0}\) is a local extreme point of \(f\) subject to

    \[g_{1}({\bf X})=g_{2}({\bf X})=\cdots =g_{m}({\bf X})=0 \nonumber\]


    and

    \[\label{eq:6} \left|\begin{array}{cccc} \displaystyle{\frac{\partial{g_{1}(\mathbf{X}_{0})}}{\partial{x_{r_{1}}}}} & \displaystyle{\frac{\partial{g_{1}(\mathbf{X}_{0})}}{\partial{x_{r_{2}}}}} & \cdots & \displaystyle{\frac{\partial{g_{1}(\mathbf{X}_{0})}}{\partial{x_{r_{m}}}}} \\ \displaystyle{\frac{\partial{g_{2}(\mathbf{X}_{0})}}{\partial{x_{r_{1}}}}} & \displaystyle{\frac{\partial{g_{2}(\mathbf{X}_{0})}}{\partial{x_{r_{2}}}}} & \cdots & \displaystyle{\frac{\partial{g_{2}(\mathbf{X}_{0})}}{\partial{x_{r_{m}}}}} \\ \vdots & \vdots & \ddots & \vdots \\ \displaystyle{\frac{\partial{g_{m}(\mathbf{X}_{0})}}{\partial{x_{r_{1}}}}} & \displaystyle{\frac{\partial{g_{m}(\mathbf{X}_{0})}}{\partial{x_{r_{2}}}}} & \cdots & \displaystyle{\frac{\partial{g_{m}(\mathbf{X}_{0})}}{\partial{x_{r_{m}}}}} \end{array}\right|\ne0\]

    for at least one choice of \(r_{1}<r_{2}<\dots <r_{m}\) in \(\{1,2,\dots,n\}\), then there are constants \(\lambda_{1}\), \(\lambda_{2}\), …, \(\lambda_{m}\) such that \({\bf X}_{0}\) is a critical point of

    \[f-\lambda_{1}g_{1}-\lambda_{2}g_{2}-\cdots-\lambda_{m} g_{m}; \nonumber\]

    that is,

    \[\frac{\partial{f({\bf X}_{0})}}{\partial x_{i}} -\lambda_{1}\frac{\partial{g_{1}({\bf X}_{0})}}{\partial x_{i}} -\lambda_{2}\frac{\partial{g_{2}({\bf X}_{0})}}{\partial x_{i}}-\cdots -\lambda_{m}\frac{\partial{g_{m}({\bf X}_{0})}}{\partial x_{i}}=0, \nonumber\]

    \(1\le i\le n\).
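
    As a check of the conclusion on a small case (an example chosen here for illustration, with \(n=3\) and \(m=2\)), let \(f(x,y,z)=x^{2}+y^{2}+z^{2}\), \(g_{1}(x,y,z)=x+y-2\), and \(g_{2}(x,y,z)=y+z-2\). The constraints give \(x=z=2-y\), so on the constraint set

    \[f=2(2-y)^{2}+y^{2}=3\left(y-\frac{4}{3}\right)^{2}+\frac{8}{3}, \nonumber\]

    and therefore \({\bf X}_{0}=\left(\frac{2}{3},\frac{4}{3},\frac{2}{3}\right)\) is a constrained minimum point. With \(r_{1}=1\) and \(r_{2}=2\), the determinant in Equation \ref{eq:6} is

    \[\left|\begin{array}{cc} 1 & 1 \\ 0 & 1 \end{array}\right|=1\ne0, \nonumber\]

    so the theorem applies. The three equations in its conclusion are

    \[2x-\lambda_{1}=0,\quad 2y-\lambda_{1}-\lambda_{2}=0,\quad 2z-\lambda_{2}=0, \nonumber\]

    and at \({\bf X}_{0}\) they are satisfied by \(\lambda_{1}=\lambda_{2}=\frac{4}{3}\), since \(2\cdot\frac{4}{3}-\frac{4}{3}-\frac{4}{3}=0\).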

    The following implementation of this theorem is the method of Lagrange multipliers.

    Method of Lagrange Multipliers

    1. Find the critical points of \[f-\lambda_{1}g_{1}-\lambda_{2}g_{2}-\cdots-\lambda_{m} g_{m}, \nonumber\] treating \(\lambda_{1}\), \(\lambda_{2}\), …, \(\lambda_{m}\) as unspecified constants.
    2. Find \(\lambda_{1}\), \(\lambda_{2}\), …, \(\lambda_{m}\) so that the critical points obtained in step 1 satisfy the constraints.
    3. Determine which of the critical points are constrained extreme points of \(f\). This can usually be done by physical or intuitive arguments.

    If \(a\) and \(b_{1}\), \(b_{2}\), …, \(b_{m}\) are nonzero constants and \(c\) is an arbitrary constant, then the local extreme points of \(f\) subject to \(g_{1}=g_{2}= \cdots =g_{m}=0\) are the same as the local extreme points of \(af-c\) subject to \(b_{1}g_{1}=b_{2}g_{2}=\cdots=b_{m}g_{m}=0\). Therefore, we can replace \(f-\lambda_{1} g_{1}-\lambda_{2}g_{2}- \cdots-\lambda_{m} g_{m}\) by \(af-\lambda_{1}b_{1}g_{1}-\lambda_{2}b_{2}g_{2}- \cdots- \lambda_{m}b_{m}g_{m}-c\) to simplify computations. (Usually, the “\(-c\)” indicates dropping additive constants.) We will denote the final form by \(L\) (for Lagrangian).
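
    The steps can be traced on a small problem (an example chosen here for illustration): find the extreme values of \(f(x,y)=2x+3y\) subject to \(g_{1}(x,y)=x^{2}+y^{2}-13=0\). Using the rescaling observation above with \(b_{1}=\frac{1}{2}\) and dropping the additive constant, we take

    \[L=2x+3y-\frac{\lambda}{2}(x^{2}+y^{2}). \nonumber\]

    Step 1: setting \(L_{x}=2-\lambda x=0\) and \(L_{y}=3-\lambda y=0\) forces \(\lambda\ne0\) and gives \(x=2/\lambda\), \(y=3/\lambda\).

    Step 2: substituting into the constraint yields

    \[\frac{4}{\lambda^{2}}+\frac{9}{\lambda^{2}}=13, \nonumber\]

    so \(\lambda=\pm1\), and the critical points satisfying the constraint are \((2,3)\) and \((-2,-3)\).

    Step 3: the circle \(x^{2}+y^{2}=13\) is closed and bounded and \(f\) is continuous, so \(f\) attains a maximum and a minimum on it. Since \(\partial g_{1}/\partial x=2x\) and \(\partial g_{1}/\partial y=2y\) are never both zero on the circle, the theorem implies that the extreme points are among the two points just found. Because \(f(2,3)=13\) and \(f(-2,-3)=-13\), the constrained maximum point is \((2,3)\) and the constrained minimum point is \((-2,-3)\).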