3.4: Operation Counts
To estimate how much computational time is required for an algorithm, one can count the number of operations required (multiplications, divisions, additions, and subtractions). Usually, what is of interest is how the algorithm scales with the size of the problem. For example, suppose one wants to multiply two full \(n \times n\) matrices. The calculation of each element requires \(n\) multiplications and \(n-1\) additions, that is, \(2 n-1\) operations. There are \(n^{2}\) elements to compute, so the total operation count is \(n^{2}(2 n-1)\). If \(n\) is large, we might want to know how the computational time changes when \(n\) is doubled. What matters most is the fastest-growing, leading-order term in the operation count.
In this matrix multiplication example, the operation count is \(n^{2}(2 n-1)=2 n^{3}-n^{2}\), and the leading-order term is \(2 n^{3}\). The factor of 2 is unimportant for the scaling, and we say that the algorithm scales like \(\mathrm{O}\left(n^{3}\right)\), which is read as "big Oh of n cubed." When using the big-Oh notation, we will drop both lower-order terms and constant multipliers.
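As an illustrative sketch (not how MATLAB itself multiplies matrices), the following triple loop computes the product while counting the operations, and the count agrees with \(n^{2}(2 n-1)\):

% Naive triple-loop matrix product with an operation counter (illustrative sketch).
n = 100;
A = rand(n); B = rand(n);
C = zeros(n);
ops = 0;
for i = 1:n
    for j = 1:n
        s = A(i,1)*B(1,j);             % 1 multiplication
        ops = ops + 1;
        for k = 2:n
            s = s + A(i,k)*B(k,j);     % 1 multiplication and 1 addition
            ops = ops + 2;
        end
        C(i,j) = s;
    end
end
fprintf('counted ops = %d, n^2(2n-1) = %d\n', ops, n^2*(2*n-1));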
The big-Oh notation tells us how the computational time of an algorithm scales. For example, suppose that the multiplication of two large \(n \times n\) matrices took a computational time of \(T_{n}\) seconds. With the known operation count going like \(\mathrm{O}\left(n^{3}\right)\), we can write
\[T_{n}=k n^{3} \nonumber \]
for some unknown constant \(k\). To determine how much longer the multiplication of two \(2 n \times 2 n\) matrices will take, we write
\[\begin{aligned} T_{2 n} &=k(2 n)^{3} \\ &=8 k n^{3} \\ &=8 T_{n} \end{aligned} \nonumber \]
so that doubling the size of the matrix is expected to increase the computational time by a factor of \(2^{3}=8\).
Running MATLAB on my computer, the multiplication of two \(2048 \times 2048\) matrices took about \(0.75 \mathrm{sec}\). The multiplication of two \(4096 \times 4096\) matrices took about \(6 \mathrm{sec}\), which is 8 times longer. Code in MATLAB can be timed using the built-in stopwatch functions tic and toc.
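For example, a timing experiment of this kind might look like the following sketch (the measured times will of course depend on the machine):

% Time an n-by-n and a 2n-by-2n matrix product and compare (machine-dependent timings).
n = 2048;
A = rand(n); B = rand(n);
tic; C = A*B; t1 = toc;
A = rand(2*n); B = rand(2*n);
tic; C = A*B; t2 = toc;
fprintf('T_n = %.2f s, T_2n = %.2f s, ratio = %.1f (about 8 expected)\n', t1, t2, t2/t1);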
What is the operation count and therefore the scaling of Gaussian elimination? Consider an elimination step with the pivot in the \(i\)th row and \(i\)th column. There are \(n-i\) rows below the pivot and \(n-i\) columns to the right of the pivot. To eliminate one of the rows below the pivot, each element of the pivot row to the right of the pivot must be multiplied by a factor and added to the corresponding element of that lower row. Doing this for all \(n-i\) rows below the pivot requires \((n-i)(n-i)\) multiplication-additions for this pivot. Since we are interested in only the scaling of the algorithm, I will just count a multiplication-addition as one operation.
To find the total operation count, we need to perform elimination using \(n-1\) pivots, so that
\[\begin{aligned} \text { op. counts } &=\sum_{i=1}^{n-1}(n-i)^{2} \\ &=(n-1)^{2}+(n-2)^{2}+\cdots+(1)^{2} \\ &=\sum_{i=1}^{n-1} i^{2} \end{aligned} \nonumber \]
Three summation formulas will come in handy. They are
\[\begin{aligned} &\sum_{i=1}^{n} 1=n \\ &\sum_{i=1}^{n} i=\frac{1}{2} n(n+1) \\ &\sum_{i=1}^{n} i^{2}=\frac{1}{6} n(2 n+1)(n+1) \end{aligned} \nonumber \]
which can be proved by mathematical induction or derived by various tricks, one of which is sketched below.
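For example, the third formula follows from the telescoping sum
\[\sum_{i=1}^{n}\left[(i+1)^{3}-i^{3}\right]=(n+1)^{3}-1, \nonumber \]
since expanding \((i+1)^{3}-i^{3}=3 i^{2}+3 i+1\) and using the first two formulas gives
\[3 \sum_{i=1}^{n} i^{2}=(n+1)^{3}-1-n-\frac{3}{2} n(n+1)=\frac{1}{2} n(n+1)(2 n+1). \nonumber \]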
The operation count for Gaussian elimination is therefore
\[\begin{aligned} \text { op. counts } &=\sum_{i=1}^{n-1} i^{2} \\ &=\frac{1}{6}(n-1)(2 n-1)(n) \end{aligned} \nonumber \]
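As a check, a small MATLAB sketch of the elimination loop (without pivoting, and counting each multiplication-addition as one operation, as above) reproduces this count:

% Gaussian elimination with an operation counter (no pivoting; sketch only).
n = 50;
A = rand(n);
ops = 0;
for i = 1:n-1                                      % pivot in row i, column i
    for k = i+1:n                                  % eliminate each row below the pivot
        m = A(k,i)/A(i,i);                         % multiplier (division not counted)
        A(k,i+1:n) = A(k,i+1:n) - m*A(i,i+1:n);    % update entries to the right of the pivot
        A(k,i) = 0;                                % entry below the pivot is eliminated
        ops = ops + (n-i);                         % (n-i) multiplication-additions per row
    end
end
fprintf('counted ops = %d, (n-1)(2n-1)n/6 = %d\n', ops, (n-1)*(2*n-1)*n/6);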
The leading-order term is therefore \(n^{3} / 3\), and we say that Gaussian elimination scales like \(\mathrm{O}\left(n^{3}\right)\). Since LU decomposition with partial pivoting is essentially Gaussian elimination, that algorithm also scales like \(\mathrm{O}\left(n^{3}\right)\). However, once the LU decomposition of a matrix A is known, the solution of \(\mathrm{A} \mathbf{x}=\mathbf{b}\) can proceed by a forward and backward substitution. How does a backward substitution, say, scale? For backward substitution, the matrix equation to be solved is of the form
\[\left(\begin{array}{ccccc} a_{1,1} & a_{1,2} & \cdots & a_{1, n-1} & a_{1, n} \\ 0 & a_{2,2} & \cdots & a_{2, n-1} & a_{2, n} \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & \cdots & a_{n-1, n-1} & a_{n-1, n} \\ 0 & 0 & \cdots & 0 & a_{n, n} \end{array}\right)\left(\begin{array}{c} x_{1} \\ x_{2} \\ \vdots \\ x_{n-1} \\ x_{n} \end{array}\right)=\left(\begin{array}{c} b_{1} \\ b_{2} \\ \vdots \\ b_{n-1} \\ b_{n} \end{array}\right) \nonumber \]
The solution for \(x_{i}\) is found after solving for \(x_{j}\) with \(j>i\). The explicit solution for \(x_{i}\) is given by
\[x_{i}=\frac{1}{a_{i, i}}\left(b_{i}-\sum_{j=i+1}^{n} a_{i, j} x_{j}\right) \nonumber \]
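A direct MATLAB transcription of this formula might look like the following sketch (the function name backsub is just illustrative):

function x = backsub(A, b)
% Solve the upper-triangular system A*x = b by backward substitution.
n = length(b);
x = zeros(n,1);
for i = n:-1:1
    x(i) = (b(i) - A(i,i+1:n)*x(i+1:n)) / A(i,i);   % subtract known x_j, j > i, then divide by the pivot
end
end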
The solution for \(x_{i}\) requires \(n-i+1\) multiplication-additions, and since this must be done for all \(n\) values of \(i\), we have
\[\begin{aligned} \text { op. counts } &=\sum_{i=1}^{n}(n-i+1) \\ &=n+(n-1)+\cdots+1 \\ &=\sum_{i=1}^{n} i \\ &=\frac{1}{2} n(n+1) \end{aligned} \nonumber \]
The leading-order term is \(n^{2} / 2\) and the scaling of backward substitution is \(\mathrm{O}\left(n^{2}\right)\). After the LU decomposition of a matrix A is found, only a single forward and backward substitution is required to solve \(A \mathbf{x}=\mathbf{b}\), and the scaling of the algorithm to solve this matrix equation is therefore still \(\mathrm{O}\left(n^{2}\right)\). For large \(n\), one should expect that solving \(\mathbf{A x}=\mathbf{b}\) by a forward and backward substitution should be substantially faster than a direct solution using Gaussian elimination.
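For example, a rough MATLAB comparison along these lines (timings are machine dependent; the backslash operator applied to a triangular matrix performs a substitution) might look like:

% Compare a full solve with reusing a precomputed LU decomposition (machine-dependent timings).
n = 4096;
A = rand(n); b = rand(n,1);
tic; x1 = A\b; t_full = toc;           % Gaussian elimination, O(n^3)
[L, U, P] = lu(A);                     % factorization cost not included in the timing below
tic; x2 = U\(L\(P*b)); t_sub = toc;    % forward and backward substitution, O(n^2)
fprintf('full solve: %.3f s, substitutions only: %.3f s\n', t_full, t_sub);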