$$\newcommand{\id}{\mathrm{id}}$$ $$\newcommand{\Span}{\mathrm{span}}$$ $$\newcommand{\kernel}{\mathrm{null}\,}$$ $$\newcommand{\range}{\mathrm{range}\,}$$ $$\newcommand{\RealPart}{\mathrm{Re}}$$ $$\newcommand{\ImaginaryPart}{\mathrm{Im}}$$ $$\newcommand{\Argument}{\mathrm{Arg}}$$ $$\newcommand{\norm}[1]{\| #1 \|}$$ $$\newcommand{\inner}[2]{\langle #1, #2 \rangle}$$ $$\newcommand{\Span}{\mathrm{span}}$$

# 1: Introduction to Lagrange Multipliers

$$\newcommand{\vecs}[1]{\overset { \scriptstyle \rightharpoonup} {\mathbf{#1}} }$$ $$\newcommand{\vecd}[1]{\overset{-\!-\!\rightharpoonup}{\vphantom{a}\smash {#1}}}$$

To avoid repetition, it is to be understood throughout that $$f$$ and $$g_{1}$$, $$g_{2}$$,…, $$g_{m}$$ are continuously differentiable on an open set $$D$$ in $$\mathbb{R}^{n}$$.

Suppose that $$m<n$$ and

$\label{eq:1} g_{1}(\mathbf{X}) = g_2(\mathbf{X}) = \cdots = g_{m}(\mathbf{X})=0$

on a nonempty subset $$D_{1}$$ of $$D$$. If  $$\mathbf{X}_{0} \in D_{1}$$ and there is a neighborhood $$N$$ of $$\mathbf{X}_{0}$$ such that

$\label{eq:2} f(\mathbf{X}) \le f(\mathbf{X}_{0})$

for every $$\mathbf{X}$$ in $$N \cap D_{1}$$, then $$\mathbf{X}_{0}$$ is a local maximum point of $$f$$ subject to the constraints Equation \ref{eq:1}. However, we will usually say “subject to” rather than “subject to the constraint(s).”

If Equation \ref{eq:2} is replaced by

$\label{eq:3} f(\mathbf{X}) \ge f(\mathbf{X}_{0}),$

then “maximum” is replaced by “minimum.” A local maximum or minimum of $$f$$ subject to Equation \ref{eq:1} is also called a local extreme point of $$f$$ subject to Equation \ref{eq:1}. More briefly, we also speak of constrained local maximum, minimum, or extreme points. If Equation \ref{eq:2} or Equation \ref{eq:3} holds for all $$\mathbf{X}$$ in $$D_{1}$$, we omit “local.”

Recall that $${\bf X}_{0}=(x_{10}, x_{20},\dots,x_{n0})$$ is a critical point of a differentiable function $$L=L(x_{1},x_{2},\dots,x_{n})$$ if

$L_{x_{i}}(x_{10},x_{20},\dots,x_{n0})=0,\quad 1\le i\le n. \nonumber$

Every local extreme point of $$L$$ is a critical point of $$L$$; however, a critical point of $$L$$ is not necessarily a local extreme point of $$L$$.
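
As a brief illustration of the second statement (an example chosen here, not taken from the text above), consider

$L(x,y)=x^{2}-y^{2}. \nonumber$

Here $$L_{x}(0,0)=L_{y}(0,0)=0$$, so $$(0,0)$$ is a critical point of $$L$$. However, $$L(x,0)=x^{2}>0$$ and $$L(0,y)=-y^{2}<0$$ for nonzero $$x$$ and $$y$$, so every neighborhood of $$(0,0)$$ contains points where $$L$$ exceeds $$L(0,0)=0$$ and points where $$L$$ is less than $$L(0,0)$$. Thus $$(0,0)$$ is not a local extreme point of $$L$$.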

Suppose that the system Equation \ref{eq:1} of simultaneous equations can be solved for $$x_{1}$$, …, $$x_{m}$$ in terms of $$x_{m+1}$$, …, $$x_{n}$$; thus,

$\label{eq:4} x_{j}=h_{j}(x_{m+1},\dots,x_{n}),\quad 1\le j\le m.$

Then a constrained extreme value of $$f$$ is an unconstrained extreme value of

$\label{eq:5} f(h_{1}(x_{m+1},\dots,x_{n}),\dots,h_{m}(x_{m+1},\dots,x_{n}),x_{m+1},\dots,x_{n}).$
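
To illustrate this substitution in a simple case (an example chosen here for illustration, with $$n=2$$ and $$m=1$$), consider the extreme values of $$f(x,y)=xy$$ subject to $$g(x,y)=x+y-2=0$$. The constraint can be solved for $$x$$ as $$x=h(y)=2-y$$, so the constrained problem reduces to finding the unconstrained extreme values of

$f(h(y),y)=(2-y)y=2y-y^{2}. \nonumber$

Since

$\frac{d}{dy}(2y-y^{2})=2-2y=0 \nonumber$

only if $$y=1$$, and the second derivative is $$-2<0$$, the composite function has a maximum at $$y=1$$. Hence $$(x,y)=(1,1)$$ is a constrained maximum point of $$f$$, with $$f(1,1)=1$$.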

However, it may be difficult or impossible to find explicit formulas for $$h_{1}$$, $$h_{2}$$, …, $$h_{m}$$, and, even if it is possible, the composite function Equation \ref{eq:5} is almost always complicated. Fortunately, there is a better way to find constrained extrema, which also requires the solvability assumption but does not require explicit formulas like those in Equation \ref{eq:4}. It is based on the following theorem. Since the proof is complicated, we consider two special cases first.

Theorem $$\PageIndex{1}$$

Suppose that $$n>m$$. If $${\bf X}_{0}$$ is a local extreme point of $$f$$ subject to

$g_{1}({\bf X})=g_{2}({\bf X})=\cdots =g_{m}({\bf X})=0 \nonumber$

and

$\label{eq:6} \left|\begin{array}{cccc} \displaystyle{\frac{\partial{g_{1}(\mathbf{X}_{0})}}{\partial{x_{r_{1}}}}} & \displaystyle{\frac{\partial{g_{1}(\mathbf{X}_{0})}}{\partial{x_{r_{2}}}}} & \cdots & \displaystyle{\frac{\partial{g_{1}(\mathbf{X}_{0})}}{\partial{x_{r_{m}}}}} \\ \displaystyle{\frac{\partial{g_{2}(\mathbf{X}_{0})}}{\partial{x_{r_{1}}}}} & \displaystyle{\frac{\partial{g_{2}(\mathbf{X}_{0})}}{\partial{x_{r_{2}}}}} & \cdots & \displaystyle{\frac{\partial{g_{2}(\mathbf{X}_{0})}}{\partial{x_{r_{m}}}}} \\ \vdots & \vdots & \ddots & \vdots \\ \displaystyle{\frac{\partial{g_{m}(\mathbf{X}_{0})}}{\partial{x_{r_{1}}}}} & \displaystyle{\frac{\partial{g_{m}(\mathbf{X}_{0})}}{\partial{x_{r_{2}}}}} & \cdots & \displaystyle{\frac{\partial{g_{m}(\mathbf{X}_{0})}}{\partial{x_{r_{m}}}}} \end{array}\right|\ne0$

for at least one choice of $$r_{1}<r_{2}<\dots <r_{m}$$ in $$\{1,2,\dots,n\}$$, then there are constants $$\lambda_{1}$$, $$\lambda_{2}$$, …, $$\lambda_{m}$$ such that $${\bf X}_{0}$$ is a critical point of

$f-\lambda_{1}g_{1}-\lambda_{2}g_{2}-\cdots-\lambda_{m} g_{m}; \nonumber$

that is,

$\frac{\partial{f({\bf X}_{0})}}{\partial x_{i}} -\lambda_{1}\frac{\partial{g_{1}({\bf X}_{0})}}{\partial x_{i}} -\lambda_{2}\frac{\partial{g_{2}({\bf X}_{0})}}{\partial x_{i}}-\cdots -\lambda_{m}\frac{\partial{g_{m}({\bf X}_{0})}}{\partial x_{i}}=0, \nonumber$

$$1\le i\le n$$.
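
As a sketch of how the theorem applies (the following example is an illustration chosen here, with $$n=2$$ and $$m=1$$), let $$f(x,y)=x+y$$ and $$g(x,y)=x^{2}+y^{2}-2$$. The constraint set is a circle, which is closed and bounded, so the continuous function $$f$$ attains a maximum and a minimum on it. If $$(x_{0},y_{0})$$ is a constrained extreme point, then $$x_{0}$$ and $$y_{0}$$ are not both zero, so at least one of $$g_{x}(x_{0},y_{0})=2x_{0}$$ and $$g_{y}(x_{0},y_{0})=2y_{0}$$ is nonzero, and Equation \ref{eq:6} holds with $$r_{1}=1$$ or $$r_{1}=2$$. The theorem then yields a constant $$\lambda$$ such that

$1-2\lambda x_{0}=0,\quad 1-2\lambda y_{0}=0. \nonumber$

Therefore $$x_{0}=y_{0}=1/(2\lambda)$$, and substituting this into the constraint gives

$\frac{1}{2\lambda^{2}}=2,\quad \text{so}\quad \lambda=\pm\frac{1}{2}. \nonumber$

The only candidates are $$(1,1)$$, with $$\lambda=1/2$$, and $$(-1,-1)$$, with $$\lambda=-1/2$$. Since $$f(1,1)=2$$ and $$f(-1,-1)=-2$$, these are the constrained maximum and minimum points of $$f$$, respectively.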

The following implementation of this theorem is the method of Lagrange multipliers.

Method of Lagrange Multipliers

1. Find the critical points of $f-\lambda_{1}g_{1}-\lambda_{2}g_{2}-\cdots-\lambda_{m} g_{m}, \nonumber$ treating $$\lambda_{1}$$, $$\lambda_{2}$$, …, $$\lambda_{m}$$ as unspecified constants.
2. Find $$\lambda_{1}$$, $$\lambda_{2}$$, …, $$\lambda_{m}$$ so that the critical points obtained in step 1 satisfy the constraints.
3. Determine which of the critical points are constrained extreme points of $$f$$. This can usually be done by physical or intuitive arguments.

If $$a$$ and $$b_{1}$$, $$b_{2}$$, …, $$b_{m}$$ are nonzero constants and $$c$$ is an arbitrary constant, then the local extreme points of $$f$$ subject to $$g_{1}=g_{2}= \cdots =g_{m}=0$$ are the same as the local extreme points of $$af-c$$ subject to $$b_{1}g_{1}=b_{2}g_{2}=\cdots=b_{m}g_{m}=0$$. Therefore, we can replace $$f-\lambda_{1} g_{1}-\lambda_{2}g_{2}- \cdots-\lambda_{m} g_{m}$$ by $$af-\lambda_{1}b_{1}g_{1}-\lambda_{2}b_{2}g_{2}- \cdots- \lambda_{m}b_{m}g_{m}-c$$ to simplify computations. (Usually, the “$$-c$$” indicates dropping additive constants.) We will denote the final form by $$L$$ (for Lagrangian).
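
The critical-point condition behind steps 1 and 2 can also be checked numerically for a proposed point. The Python sketch below is only an illustration and not part of the method as stated: the helper names `gradient`, `solve_linear`, and `multipliers`, and the example functions `f`, `g1`, and `g2`, are assumptions introduced here. Given a point that already satisfies the constraints, it approximates the gradients by central differences, chooses the multipliers by least squares, and reports the largest component of $$\nabla f-\lambda_{1}\nabla g_{1}-\cdots-\lambda_{m}\nabla g_{m}$$. A residual near zero is consistent with the point being a critical point of $$f-\lambda_{1}g_{1}-\cdots-\lambda_{m}g_{m}$$; a large residual rules it out.

```python
from typing import Callable, List, Sequence, Tuple

Func = Callable[[Sequence[float]], float]


def gradient(func: Func, x: Sequence[float], h: float = 1e-6) -> List[float]:
    """Central-difference approximation of the gradient of func at x."""
    grad = []
    for i in range(len(x)):
        forward = list(x)
        backward = list(x)
        forward[i] += h
        backward[i] -= h
        grad.append((func(forward) - func(backward)) / (2.0 * h))
    return grad


def solve_linear(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a square linear system by Gaussian elimination with partial pivoting."""
    m = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("constraint gradients look linearly dependent")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, m):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, m + 1):
                aug[r][c] -= factor * aug[col][c]
    sol = [0.0] * m
    for r in range(m - 1, -1, -1):
        tail = sum(aug[r][c] * sol[c] for c in range(r + 1, m))
        sol[r] = (aug[r][m] - tail) / aug[r][r]
    return sol


def multipliers(
    f: Func, gs: Sequence[Func], x0: Sequence[float]
) -> Tuple[List[float], float]:
    """Return least-squares multipliers and the largest component of
    grad f - sum_j lambda_j grad g_j at x0 (x0 is assumed to satisfy the constraints)."""
    grad_f = gradient(f, x0)
    grad_gs = [gradient(g, x0) for g in gs]
    m = len(grad_gs)
    # Normal equations (G G^T) lambda = G grad_f, where row j of G is grad g_j.
    gram = [
        [sum(p * q for p, q in zip(grad_gs[i], grad_gs[j])) for j in range(m)]
        for i in range(m)
    ]
    rhs = [sum(p * q for p, q in zip(grad_gs[i], grad_f)) for i in range(m)]
    lam = solve_linear(gram, rhs)
    residual = max(
        abs(grad_f[i] - sum(lam[j] * grad_gs[j][i] for j in range(m)))
        for i in range(len(x0))
    )
    return lam, residual


# Hypothetical example: f = x^2 + y^2 + z^2 subject to x + y + z - 3 = 0 and x - y = 0.
def f(p: Sequence[float]) -> float:
    return p[0] ** 2 + p[1] ** 2 + p[2] ** 2


def g1(p: Sequence[float]) -> float:
    return p[0] + p[1] + p[2] - 3.0


def g2(p: Sequence[float]) -> float:
    return p[0] - p[1]


if __name__ == "__main__":
    # Both points satisfy the constraints; only the first passes the critical-point test.
    for point in ([1.0, 1.0, 1.0], [0.0, 0.0, 3.0]):
        lam, residual = multipliers(f, [g1, g2], point)
        print(point, lam, residual)
```

The Gram matrix in `multipliers` is nonsingular exactly when the constraint gradients are linearly independent at the point, which is the same requirement as Equation \ref{eq:6} holding for some choice of $$r_{1}<r_{2}<\dots<r_{m}$$. Passing this check only identifies a candidate; step 3 is still needed to decide whether the point is a constrained extreme point.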