3.3: Dantzig's Algorithm
In simple situations a graphical method might suffice, but in many applications there may be thousands or even millions of variables and constraints. Clearly an algorithm that can be implemented on a computer is needed. The simplex algorithm (usually attributed to George Dantzig) provides exactly that. It begins with a standard problem:
Problem \(\PageIndex{1}\):
Maximize \(f(x_1,\ldots,x_n)\) where \(f\) is linear, \(x_i\geq 0\) (\(i=1,\ldots, n\)) subject to
\[Mx = v,~~~~~x:=\begin{pmatrix}x_{1}\\\vdots \\ x_{n}\end{pmatrix} ,\]
where the \(m\times n\) matrix \(M\) and \(m\times 1\) column vector \(v\) are given.
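Such a standard problem can also be handed to a numerical solver. Below is a minimal sketch, assuming SciPy is available; the tiny data set is invented purely for illustration, and scipy.optimize.linprog minimizes, so the objective is negated.

import numpy as np
from scipy.optimize import linprog

# Hypothetical toy problem: maximize f(x1, x2) = x1 + 2*x2
# subject to x1 + x2 = 3 and x1, x2 >= 0.
M = np.array([[1., 1.]])            # constraint matrix in M x = v
v = np.array([3.])
c = np.array([1., 2.])              # coefficients of the objective f

# linprog minimizes, so pass -c; the bounds enforce x >= 0.
result = linprog(-c, A_eq=M, b_eq=v, bounds=[(0, None)] * 2)
print(result.x, -result.fun)        # expect x = (0, 3) and maximum f = 6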
The problem is solved by arranging the information in an augmented matrix and then applying elementary row operations (EROs). To see how this works, let's try an example.
Example \(\PageIndex{2}\):
Maximize \(f(x,y,z,w)=3x-3y-z+4w\) subject to constraints
\begin{equation*}
\begin{array}{rcccl}
c_1&:=&x+y+z+w&=&5\\[2mm]
c_2&:=&x+2y+3z+2w&=&6\, ,
\end{array}
\end{equation*}
where \(x\geq0\), \(y\geq0\), \(z\geq0\) and \(w\geq 0\).
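In the notation of Problem \(\PageIndex{1}\), these constraints correspond to
\[
M=\begin{pmatrix}1&1&1&1\\ 1&2&3&2\end{pmatrix}\, ,\qquad
v=\begin{pmatrix}5\\ 6\end{pmatrix}\, .
\]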
Solution
The key observation is this: Suppose we are trying to maximize \(f(x_1,\ldots,x_n)\) subject to a constraint \(c(x_1,\ldots,x_n)=k\) for some constant \(k\) (\(c\) and \(k\) would be the entries of \(Mx\) and \(v\), respectively, in the above). Then we can also try to maximize
\[f(x_1,\ldots,x_n)+\alpha c(x_1,\ldots,x_n)\,\]
because this is only a constant shift \(f\to f+\alpha k\). Choosing \(\alpha\) carefully can lead to a simple form for the function we are extremizing.
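In other words, since \(c(x_1,\ldots,x_n)=k\) at every feasible point, the two functions differ there only by the constant \(\alpha k\):
\[
f(x_1,\ldots,x_n)+\alpha\, c(x_1,\ldots,x_n)=f(x_1,\ldots,x_n)+\alpha k\, ,
\]
so both attain their maxima at exactly the same feasible points.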
Example \(\PageIndex{3}\): Setting up an augmented matrix
Since we are interested in the optimum value of \(f\), we treat it as an additional variable and add one further equation
\[-3x+3y+z-4w+f=0\, .\]
We arrange this equation and the two constraints in an augmented matrix
\[
\begin{array}{ccc}
\left(\begin{array}{rrrrr|r}
1&1&1&1&0&5\\
1&2&3&2&0&6\\\hline
\!-3&3&1&\!\!-4&\ 1&\ 0
\end{array}\right)
\quad &\Leftrightarrow & \quad
\left\{
\begin{array}{lcl}
c_1&=&5\\
c_2&=&6\\
f&=&3x-3y-z+4w
\end{array}\right.
\end{array}.
\]
Keep in mind that the first four columns correspond to the positive variables \((x,y,z,w)\) and that the last row records the information about the function \(f\).
Now the system is written as an augmented matrix where the last row encodes the objective function and the other rows the constraints. Clearly we can perform row operations on the constraint rows since this will not change the solutions to the constraints. Moreover, we can add any amount of the constraint rows to the last row, since this just amounts to adding a constant to the function we want to extremize.
Example \(\PageIndex{4}\): Performing EROs
We scan the last row and notice the (most negative) coefficient \(-4\). Naïvely you might think that this is good because it multiplies the positive variable \(w\) and only helps the objective function \(f=4w+\cdots\). However, what it actually means is that the variable \(w\) will be large but determined by the constraints. Therefore we want to remove it from the objective function. We can zero out this entry by performing a row operation. For that, either of the first two rows could be used. To decide which, we remember that we still have to solve the constraints for variables that are positive, so we should try to keep the first two entries in the last column positive. Hence we choose the row which will add the smallest constant to \(f\) when we zero out the \(-4\): look at the last column (where the values of the constraints are stored). We see that adding four times the first row to the last row would zero out the \(-4\) entry but add \(20\) to \(f\), while adding two times the second row to the last row would also zero out the \(-4\) but only add \(12\) to \(f\). (You can follow this by watching what happens to the last entry in the last row.) So we perform the latter row operation and obtain the following:
\[
\begin{array}{c|c}
\left(
\begin{array}{rrrrr|r}
1&1&1&1&0&5\\
1&2&3&2&0&6\\\hline
\!-1&7&7&0&\ 1&\ 12
\end{array}\right)
\quad\quad &\quad
\begin{array}{rcl}
c_1&=&5\\
c_2&=&6\\
f+2c_2&=&12+x-7y-7z\, .
\end{array}
\end{array}
\]
We do not want to undo any of our good work when we perform further row operations, so now we use the second row to zero out all other entries in the fourth column. This is achieved by subtracting half the second row from the first:
\[
\begin{array}{c|c}
\left(
\begin{array}{rrrrr|r}
\frac12&0&-\frac12&0&0&2\\
1&2&3&2&0&6\\\hline
\!-1&7&7&0&\ 1&\ 12
\end{array}\right)
\quad\quad &\quad
\begin{array}{rcl}
c_1-\frac12 c_2&=&2\\
c_2&=&6\\
f+2c_2&=&12+x-7y-7z\, .
\end{array}
\end{array}
\]
Precisely because we chose the second row to perform our row operations, all entries in the last column remain positive. This allows us to continue the algorithm.
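In general, if the column being cleared has entry \(m_i>0\) in constraint row \(i\) and the last column has entry \(v_i\) in that row, then pivoting on the row with the smallest ratio \(v_i/m_i\) adds the least to \(f\) and keeps the last column non-negative. This rule of thumb is commonly known as the minimum ratio test.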
We now repeat the above procedure: There is a \(-1\) in the first column of the last row. We want to zero it out while adding as little to
\(f\) as possible. This is achieved by adding twice the first row to the last row:
\[
\begin{array}{c|c}
\left(
\begin{array}{rrrrr|r}
\frac12&0&-\frac12&0&0&2\\
1&2&3&2&0&6\\\hline
0&7&6&0&\ 1&\ 16
\end{array}\right)
\quad\quad &\quad
\begin{array}{rcl}
c_1-\frac12 c_2&=&2\\
c_2&=&6\\
f+2c_2+2(c_1-\frac12 c_2)&=&16-7y-6z\, .
\end{array}
\end{array}
\]
The Dantzig algorithm terminates if all the coefficients in the last row (save perhaps for the last entry, which encodes the value of the objective) are non-negative. To see why we are done, let's write out what our row operations have done in terms of the function \(f\) and the constraints \((c_1,c_2)\). First we have
\[
f+2c_2+2(c_1-\frac12 c_2)=16-7y-6z
\]
with both \(y\) and \(z\) non-negative. Hence to maximize \(f\) we should choose \(y=0=z\), in which case we obtain the optimum value
\[
\underline{f=16\, .}
\]
Finally, we check that the constraints can be solved with \(y=0=z\) and positive \((x,w)\). Indeed they can: setting \(y=0=z\) in the constraints \(c_1\) and \(c_2\) gives \(x=4\) and \(w=1\), and then \(f=3\cdot4+4\cdot1=16\) as claimed.
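For readers who want to verify the computation numerically, here is a minimal sketch, assuming NumPy is available, that replays the three row operations above on the augmented matrix and prints the resulting tableau.

import numpy as np

# Augmented matrix of Example 3: columns correspond to x, y, z, w, f | value.
T = np.array([
    [ 1.,  1.,  1.,  1.,  0.,  5.],   # constraint c1
    [ 1.,  2.,  3.,  2.,  0.,  6.],   # constraint c2
    [-3.,  3.,  1., -4.,  1.,  0.],   # objective row: -3x + 3y + z - 4w + f = 0
])

T[2] += 2.0 * T[1]   # clear the -4 in the w column (adds 12 to the bottom-right entry)
T[0] -= 0.5 * T[1]   # clear the remaining w entry in the first row
T[2] += 2.0 * T[0]   # clear the -1 in the x column (adds 4 to the bottom-right entry)

print(T)             # the last row should read (0, 7, 6, 0, 1 | 16)

The printed last row reproduces the final tableau above, confirming the maximum value \(f=16\).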
Contributor
David Cherney, Tom Denton, and Andrew Waldron (UC Davis)