2.10: \(LU\) Factorization
An \(LU\) factorization of a matrix involves writing the given matrix as the product of a lower triangular matrix \(L\) which has the main diagonal consisting entirely of ones, and an upper triangular matrix \(U\) in the indicated order. This is the version discussed here but it is sometimes the case that the \(L\) has numbers other than 1 down the main diagonal. It is still a useful concept. The \(L\) goes with “lower” and the \(U\) with “upper”.
It turns out many matrices can be written in this way and when this is possible, people get excited about slick ways of solving the system of equations, \(AX=B\) . It is for this reason that you want to study the \(LU\) factorization. It allows you to work only with triangular matrices. It turns out that it takes about half as many operations to obtain an \(LU\) factorization as it does to find the row reduced echelon form.
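As a brief sketch of why triangular factors help, suppose \(A=LU\) and introduce the intermediate unknown \(Y=UX\). Then \[AX=B \quad \Longleftrightarrow \quad L\left( UX\right) =B \quad \Longleftrightarrow \quad LY=B \text{ and } UX=Y.\nonumber \] The system \(LY=B\) can be solved from the top row down, since \(L\) is lower triangular, and then \(UX=Y\) can be solved from the bottom row up, since \(U\) is upper triangular. No further row reduction is needed.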
First it should be noted not all matrices have an \(LU\) factorization and so we will emphasize the techniques for achieving it rather than formal proofs.
Can you write \(\left[ \begin{array}{rr} 0 & 1 \\ 1 & 0 \end{array} \right]\) in the form \(LU\) as just described?
Solution
To do so you would need \[\left[ \begin{array}{rr} 1 & 0 \\ x & 1 \end{array} \right] \left[ \begin{array}{rr} a & b \\ 0 & c \end{array} \right] = \ \left[ \begin{array}{cc} a & b \\ xa & xb+c \end{array} \right] =\left[ \begin{array}{rr} 0 & 1 \\ 1 & 0 \end{array} \right] .\nonumber \]
Therefore, \(b=1\) and \(a=0.\) Also, from the bottom rows, \(xa=1\), which is impossible when \(a=0.\) Therefore, you can’t write this matrix in the form \(LU.\) It has no \(LU\) factorization. This is what we mean above by saying that not all matrices have an \(LU\) factorization.
Nevertheless the method is often extremely useful, and we will describe below one of the many methods used to produce an \(LU\) factorization when possible.
Finding An \(LU\) Factorization By Inspection
Which matrices have an \(LU\) factorization? It turns out it is those whose row-echelon form can be achieved without switching rows. In other words, these are the matrices whose row-echelon form can be obtained using only row operations of type 2 or 3.
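The criterion above can be tested mechanically. The following Python sketch is one possible way to do it, not a standard routine: the function name `needs_row_swap` is made up for this illustration. It carries out elimination using only rows above the one being modified, and reports whether a zero pivot with a nonzero entry below it is ever encountered.

```python
from fractions import Fraction


def needs_row_swap(rows):
    """Return True if elimination meets a zero pivot with a nonzero entry
    below it, so that a row interchange would be required."""
    a = [[Fraction(v) for v in row] for row in rows]
    m, n = len(a), len(a[0])
    r = 0  # index of the current pivot row
    for c in range(n):
        if r >= m:
            break
        if a[r][c] == 0:
            # A nonzero entry below a zero pivot forces a row interchange.
            if any(a[i][c] != 0 for i in range(r + 1, m)):
                return True
            continue  # nothing to eliminate in this column; move right
        for i in range(r + 1, m):
            factor = a[i][c] / a[r][c]
            a[i] = [x - factor * y for x, y in zip(a[i], a[r])]
        r += 1
    return False


print(needs_row_swap([[0, 1], [1, 0]]))  # True
print(needs_row_swap([[1, 2, 0, 2], [1, 3, 2, 1], [2, 3, 4, 0]]))  # False
```

Exact fractions are used so that the test for a zero pivot is not affected by rounding.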
Find an \(LU\) factorization of \(A=\left[ \begin{array}{cccc} 1 & 2 & 0 & 2 \\ 1 & 3 & 2 & 1 \\ 2 & 3 & 4 & 0 \end{array} \right] .\)
Solution
One way to find the \(LU\) factorization is to simply look for it directly. You need
\[\left[ \begin{array}{cccc} 1 & 2 & 0 & 2 \\ 1 & 3 & 2 & 1 \\ 2 & 3 & 4 & 0 \end{array} \right] =\left[ \begin{array}{ccc} 1 & 0 & 0 \\ x & 1 & 0 \\ y & z & 1 \end{array} \right] \left[ \begin{array}{cccc} a & d & h & j \\ 0 & b & e & i \\ 0 & 0 & c & f \end{array} \right] .\nonumber \]
Then multiplying these you get \[ \ \left[ \begin{array}{cccc} a & d & h & j \\ xa & xd+b & xh+e & xj+i \\ ya & yd+zb & yh+ze+c & yj+iz+f \end{array} \right]\nonumber \] and so you can now tell what the various quantities equal. From the first column, you need \(a=1,x=1,y=2.\) Now go to the second column. You need \(d=2,xd+b=3\) so \(b=1,yd+zb=3\) so \(z=-1.\) From the third column, \(h=0,e=2,c=6.\) Now from the fourth column, \(j=2,i=-1,f=-5.\) Therefore, an \(LU\) factorization is \[\left[ \begin{array}{rrr} 1 & 0 & 0 \\ 1 & 1 & 0 \\ 2 & -1 & 1 \end{array} \right] \left[ \begin{array}{rrrr} 1 & 2 & 0 & 2 \\ 0 & 1 & 2 & -1 \\ 0 & 0 & 6 & -5 \end{array} \right] \nonumber .\] You can check whether you got it right by simply multiplying these two.
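The check suggested above, multiplying the two factors, can be done by hand or with a few lines of Python. This is only an illustrative sketch; the helper `matmul` and the variable names are chosen here and are not part of the text.

```python
def matmul(p, q):
    """Multiply two matrices given as lists of rows."""
    return [
        [sum(p[i][k] * q[k][j] for k in range(len(q))) for j in range(len(q[0]))]
        for i in range(len(p))
    ]


A = [[1, 2, 0, 2], [1, 3, 2, 1], [2, 3, 4, 0]]
L = [[1, 0, 0], [1, 1, 0], [2, -1, 1]]
U = [[1, 2, 0, 2], [0, 1, 2, -1], [0, 0, 6, -5]]

print(matmul(L, U) == A)  # True
```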
\(LU\) Factorization, Multiplier Method
Remember that for a matrix \(A\) to be written in the form \(A=LU\) , you must be able to reduce it to its row-echelon form without interchanging rows. The following method gives a process for calculating the \(LU\) factorization of such a matrix \(A\) .
Find an \(LU\) factorization for \[\left[ \begin{array}{rrr} 1 & 2 & 3 \\ 2 & 3 & 1 \\ -2 & 3 & -2 \end{array} \right]\nonumber \]
Solution
Write the matrix as the following product. \[\left[ \begin{array}{rrr} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{array} \right] \left[ \begin{array}{rrr} 1 & 2 & 3 \\ 2 & 3 & 1 \\ -2 & 3 & -2 \end{array} \right]\nonumber \]
In the matrix on the right, begin with the left column and zero out the entries below the top using the row operation which involves adding a multiple of a row to another row. You do this and also update the matrix on the left so that the product will be unchanged. Here is the first step. Take \(-2\) times the top row and add to the second in the matrix on the right. To keep the product unchanged, place a \(2\) in the second row, first column of the matrix on the left. \[\left[ \begin{array}{rrr} 1 & 0 & 0 \\ 2 & 1 & 0 \\ 0 & 0 & 1 \end{array} \right] \left[ \begin{array}{rrr} 1 & 2 & 3 \\ 0 & -1 & -5 \\ -2 & 3 & -2 \end{array} \right]\nonumber \] The next step is to take \(2\) times the top row and add to the bottom in the matrix on the right. To ensure that the product is unchanged, you place a \(-2\) in the bottom left in the matrix on the left. Thus the next step yields \[\left[ \begin{array}{rrr} 1 & 0 & 0 \\ 2 & 1 & 0 \\ -2 & 0 & 1 \end{array} \right] \left[ \begin{array}{rrr} 1 & 2 & 3 \\ 0 & -1 & -5 \\ 0 & 7 & 4 \end{array} \right]\nonumber \] Next take \(7\) times the middle row on the right and add to the bottom row. Updating the matrix on the left in a similar manner to what was done earlier, that is, placing a \(-7\) in the bottom row, second column, \[\left[ \begin{array}{rrr} 1 & 0 & 0 \\ 2 & 1 & 0 \\ -2 & -7 & 1 \end{array} \right] \left[ \begin{array}{rrr} 1 & 2 & 3 \\ 0 & -1 & -5 \\ 0 & 0 & -31 \end{array} \right]\nonumber \] At this point, stop. You are done.
The method just described is called the multiplier method.
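The multiplier method translates into a short program. The following Python sketch is one possible implementation under the assumption that no row interchange is needed; the function name `lu_multiplier` is hypothetical and chosen only for this illustration. Note the sign convention: `factor` below is the number whose multiple of the pivot row is subtracted, so it equals \(-1\) times the multiplier in the sense of the text and is exactly the entry placed in \(L\).

```python
from fractions import Fraction


def lu_multiplier(rows):
    """Return (L, U) with A = L U, L unit lower triangular, using no row swaps.

    Raises ValueError if a row interchange would be needed.
    """
    u = [[Fraction(v) for v in row] for row in rows]
    m, n = len(u), len(u[0])
    l = [[Fraction(int(i == j)) for j in range(m)] for i in range(m)]
    r = 0  # current pivot row
    for c in range(n):
        if r >= m - 1:
            break  # no rows left below the pivot row
        if u[r][c] == 0:
            if any(u[i][c] != 0 for i in range(r + 1, m)):
                raise ValueError("a row interchange would be needed")
            continue  # column already zero below row r
        for i in range(r + 1, m):
            factor = u[i][c] / u[r][c]
            # Subtract factor * (pivot row) from row i of the right matrix ...
            u[i] = [x - factor * y for x, y in zip(u[i], u[r])]
            # ... and record factor in the left matrix to keep the product fixed.
            l[i][r] = factor
        r += 1
    return l, u


L, U = lu_multiplier([[1, 2, 3], [2, 3, 1], [-2, 3, -2]])
print(L == [[1, 0, 0], [2, 1, 0], [-2, -7, 1]])    # True
print(U == [[1, 2, 3], [0, -1, -5], [0, 0, -31]])  # True
```

Exact fractions are used in place of floating point so the output can be compared directly with the hand computation.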
Solving Systems using \(LU\) Factorization
One reason people care about the \(LU\) factorization is that it allows the quick solution of systems of equations. Here is an example.
Suppose you want to find the solutions to \[\left[ \begin{array}{rrrr} 1 & 2 & 3 & 2 \\ 4 & 3 & 1 & 1 \\ 1 & 2 & 3 & 0 \end{array} \right] \left[ \begin{array}{c} x \\ y \\ z \\ w \end{array} \right] =\left[ \begin{array}{c} 1 \\ 2 \\ 3 \end{array} \right] .\nonumber \]
Solution
Of course one way is to write the augmented matrix and grind away. However, this involves more row operations than the computation of the \(LU\) factorization and it turns out that the \(LU\) factorization can give the solution quickly. Here is how. The following is an \(LU\) factorization for the matrix. \[\left[ \begin{array}{rrrr} 1 & 2 & 3 & 2 \\ 4 & 3 & 1 & 1 \\ 1 & 2 & 3 & 0 \end{array} \right] \ = \ \left[ \begin{array}{rrr} 1 & 0 & 0 \\ 4 & 1 & 0 \\ 1 & 0 & 1 \end{array} \right] \left[ \begin{array}{rrrr} 1 & 2 & 3 & 2 \\ 0 & -5 & -11 & -7 \\ 0 & 0 & 0 & -2 \end{array} \right] .\nonumber\]
Let \(UX=Y\) and consider \(LY=B\) where in this case, \(B=\left[ 1,2,3\right] ^{T}\) . Thus \[ \ \left[ \begin{array}{rrr} 1 & 0 & 0 \\ 4 & 1 & 0 \\ 1 & 0 & 1 \end{array} \right] \left[ \begin{array}{c} y_{1} \\ y_{2} \\ y_{3} \end{array} \right] =\left[ \begin{array}{c} 1 \\ 2 \\ 3 \end{array} \right]\nonumber \] which yields very quickly that \(Y=\left[ \begin{array}{r} 1 \\ -2 \\ 2 \end{array} \right] \ .\)
Now you can find \(X\) by solving \(UX=Y\) . Thus in this case, \[\left[ \begin{array}{rrrr} 1 & 2 & 3 & 2 \\ 0 & -5 & -11 & -7 \\ 0 & 0 & 0 & -2 \end{array} \right] \left[ \begin{array}{c} x \\ y \\ z \\ w \end{array} \right] =\left[ \begin{array}{r} 1 \\ -2 \\ 2 \end{array} \right]\nonumber \] which yields \[X=\left[ \begin{array}{c} - \frac{3}{5}+\frac{7}{5}t \\ \frac{9}{5}-\frac{11}{5}t \\ t \\ -1 \end{array} \right] ,\enspace t\in \mathbb{R}\text{.}\nonumber \]
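The two stages of this example can be checked numerically. The Python sketch below is an illustration only, with helper names (`forward_substitute`, `apply`, `solution`) invented here. It solves \(LY=B\) by forward substitution and then confirms, for a few sample values of \(t\), that the stated \(X\) satisfies both \(UX=Y\) and \(AX=B\).

```python
from fractions import Fraction


def forward_substitute(l, b):
    """Solve L y = b where L is lower triangular with ones on the diagonal."""
    y = []
    for i, row in enumerate(l):
        y.append(Fraction(b[i]) - sum(row[k] * y[k] for k in range(i)))
    return y


def apply(matrix, x):
    """Return the matrix-vector product."""
    return [sum(a * v for a, v in zip(row, x)) for row in matrix]


def solution(t):
    """The parametric solution X from the example, for a given value of t."""
    t = Fraction(t)
    return [Fraction(-3, 5) + Fraction(7, 5) * t,
            Fraction(9, 5) - Fraction(11, 5) * t,
            t,
            Fraction(-1)]


A = [[1, 2, 3, 2], [4, 3, 1, 1], [1, 2, 3, 0]]
L = [[1, 0, 0], [4, 1, 0], [1, 0, 1]]
U = [[1, 2, 3, 2], [0, -5, -11, -7], [0, 0, 0, -2]]
B = [1, 2, 3]

Y = forward_substitute(L, B)
print(Y == [1, -2, 2])  # True

for t in (0, 1, -2):
    X = solution(t)
    assert apply(U, X) == Y
    assert apply(A, X) == B
```

Because \(L\) has ones on its diagonal, forward substitution needs no division.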
Justification for the Multiplier Method
Why does the multiplier method work for finding the \(LU\) factorization? Suppose \(A\) is a matrix which has the property that the row-echelon form for \(A\) may be achieved without switching rows. Thus every row which is replaced using this row operation in obtaining the row-echelon form may be modified by using a row which is above it.
Let \(L\) be a lower (upper) triangular matrix \(m\times m\) which has ones down the main diagonal. Then \(L^{-1}\) also is a lower (upper) triangular matrix which has ones down the main diagonal. In the case that \(L\) is of the form \[L=\left[ \begin{array}{cccc} 1 & & & \\ a_{1} & 1 & & \\ \vdots & & \ddots & \\ a_{n} & & & 1 \end{array} \right] \label{4nove1h}\] where all entries are zero except for the left column and main diagonal, it is also the case that \(L^{-1}\) is obtained from \(L\) by simply multiplying each entry below the main diagonal in \(L\) with \(-1\) . The same is true if the single nonzero column is in another position.
Proof
Consider the usual setup for finding the inverse \(\left[ \begin{array}{cc} L & I \end{array} \right] .\) Then each row operation done to \(L\) to reduce to row reduced echelon form results in changing only the entries in \(I\) below the main diagonal. In the special case of \(L\) given in \(\eqref{4nove1h}\), or when the single nonzero column is in another position, multiplication by \(-1\) as described in the lemma clearly results in \(L^{-1}\) .
For a simple illustration of the last claim, \[\left[ \begin{array}{cccccc} 1 & 0 & 0 & 1 & 0 & 0 \\ 0 & 1 & 0 & 0 & 1 & 0 \\ 0 & a & 1 & 0 & 0 & 1 \end{array} \right] \rightarrow \left[ \begin{array}{cccccc} 1 & 0 & 0 & 1 & 0 & 0 \\ 0 & 1 & 0 & 0 & 1 & 0 \\ 0 & 0 & 1 & 0 & -a & 1 \end{array} \right]\nonumber \]
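The special case in the lemma is easy to test numerically. The Python sketch below uses helper names invented for this illustration. It negates the entries below the main diagonal and checks whether the result is the inverse. The second matrix has two nonzero columns below the diagonal and shows that the shortcut should not be expected to work outside the special case.

```python
def matmul(p, q):
    """Multiply two matrices given as lists of rows."""
    return [
        [sum(p[i][k] * q[k][j] for k in range(len(q))) for j in range(len(q[0]))]
        for i in range(len(p))
    ]


def identity(m):
    """Return the m x m identity matrix."""
    return [[int(i == j) for j in range(m)] for i in range(m)]


def negate_below_diagonal(mat):
    """Multiply each entry below the main diagonal by -1."""
    return [[-v if i > j else v for j, v in enumerate(row)]
            for i, row in enumerate(mat)]


# One nonzero column below the diagonal: negation gives the inverse.
E = [[1, 0, 0], [2, 1, 0], [-2, 0, 1]]
print(matmul(E, negate_below_diagonal(E)) == identity(3))  # True

# Two nonzero columns below the diagonal: negation is not the inverse here.
T = [[1, 0, 0], [2, 1, 0], [-2, -7, 1]]
print(matmul(T, negate_below_diagonal(T)) == identity(3))  # False
```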
Now let \(A\) be an \(m\times n\) matrix, say \[A=\left[ \begin{array}{cccc} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{array} \right]\nonumber \] and assume \(A\) can be row reduced to an upper triangular form using only row operation 3. Thus, in particular, \(a_{11}\neq 0\) . Multiply on the left by \[E_{1}=\left[ \begin{array}{cccc} 1 & 0 & \cdots & 0 \\ - \frac{a_{21}}{a_{11}} & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ -\frac{a_{m1}}{a_{11}} & 0 & \cdots & 1 \end{array} \right]\nonumber \] This is the product of elementary matrices which make modifications in the first column only. It is equivalent to taking \(-a_{21}/a_{11}\) times the first row and adding to the second, then taking \(-a_{31}/a_{11}\) times the first row and adding to the third, and so forth. The quotients in the first column of the above matrix are the multipliers. Since the first row is unchanged, the result is of the form \[E_{1}A=\left[ \begin{array}{cccc} a_{11} & a_{12} & \cdots & a_{1n} \\ 0 & a_{22}^{\prime } & \cdots & a_{2n}^{\prime } \\ \vdots & \vdots & & \vdots \\ 0 & a_{m2}^{\prime } & \cdots & a_{mn}^{\prime } \end{array} \right]\nonumber \] By assumption, \(a_{22}^{\prime }\neq 0\) and so it is possible to use this entry to zero out all the entries below it in the matrix on the right by multiplication by a matrix of the form \(E_{2}=\left[ \begin{array}{cc} 1 & \mathbf{0} \\ \mathbf{0} & E \end{array} \right]\) where \(E\) is an \(\left( m-1\right) \times \left( m-1\right)\) matrix of the form \[E=\left[ \begin{array}{cccc} 1 & 0 & \cdots & 0 \\ -\frac{a_{32}^{\prime }}{a_{22}^{\prime }} & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ -\frac{a_{m2}^{\prime }}{a_{22}^{\prime }} & 0 & \cdots & 1 \end{array} \right]\nonumber \] Again, the entries in the first column below the 1 are the multipliers.
Continuing this way, zeroing out the entries below the diagonal entries, finally leads to \[E_{m-1}E_{m-2}\cdots E_{1}A=U\nonumber \] where \(U\) is upper triangular. Each \(E_{j}\) has all ones down the main diagonal and is lower triangular. Now multiply both sides by the inverses of the \(E_{j}\) in the reverse order. This yields \[A=E_{1}^{-1}E_{2}^{-1}\cdots E_{m-1}^{-1}U\nonumber \] By Lemma \(\PageIndex{1}\) , this implies that the product of those \(E_{j}^{-1}\) is a lower triangular matrix having all ones down the main diagonal.
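To make the argument concrete, here is a small Python check using the \(3\times 3\) matrix from the multiplier method example. The matrices `E1` and `E2` are written out by hand from the formulas above, and the helper `matmul` is introduced only for this sketch.

```python
def matmul(p, q):
    """Multiply two matrices given as lists of rows."""
    return [
        [sum(p[i][k] * q[k][j] for k in range(len(q))) for j in range(len(q[0]))]
        for i in range(len(p))
    ]


A = [[1, 2, 3], [2, 3, 1], [-2, 3, -2]]

# E1 holds -a21/a11 = -2 and -a31/a11 = 2 in its first column.
E1 = [[1, 0, 0], [-2, 1, 0], [2, 0, 1]]
# After the first step the second column is (2, -1, 7), so the entry is -7/(-1) = 7.
E2 = [[1, 0, 0], [0, 1, 0], [0, 7, 1]]

U = matmul(E2, matmul(E1, A))
print(U)  # [[1, 2, 3], [0, -1, -5], [0, 0, -31]]

# The inverses are obtained by negating the entries below the diagonal.
E1_inv = [[1, 0, 0], [2, 1, 0], [-2, 0, 1]]
E2_inv = [[1, 0, 0], [0, 1, 0], [0, -7, 1]]
L = matmul(E1_inv, E2_inv)
print(L)                   # [[1, 0, 0], [2, 1, 0], [-2, -7, 1]]
print(matmul(L, U) == A)   # True
```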
The above discussion and lemma give the justification for the multiplier method. The expressions \[\frac{-a_{21}}{a_{11}},\frac{-a_{31}}{a_{11}},\cdots, \frac{-a_{m1}}{a_{11}}\nonumber \] which were obtained in building \(E_{1}\), and which are denoted respectively by \(M_{21},\cdots ,M_{m1}\) to save notation, are the multipliers. Then according to the lemma, to find \(E_{1}^{-1}\) you simply write \[\left[ \begin{array}{cccc} 1 & 0 & \cdots & 0 \\ -M_{21} & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ -M_{m1} & 0 & \cdots & 1 \end{array} \right]\nonumber \] Similar considerations apply to the other \(E_{j}^{-1}.\) Thus \(L\) is a product of the form \[\left[ \begin{array}{cccc} 1 & 0 & \cdots & 0 \\ -M_{21} & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ -M_{m1} & 0 & \cdots & 1 \end{array} \right] \cdots \left[ \begin{array}{cccc} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & 0 & \ddots & \vdots \\ 0 & \cdots & -M_{m\left( m-1\right) } & 1 \end{array} \right]\nonumber \] each factor having at most one nonzero column, the position of which moves from left to right in scanning the above product of matrices from left to right. It follows from what we know about the effect of multiplying on the left by an elementary matrix that the above product is of the form \[\left[ \begin{array}{ccccc} 1 & 0 & \cdots & 0 & 0 \\ -M_{21} & 1 & \cdots & 0 & 0 \\ \vdots & -M_{32} & \ddots & \vdots & \vdots \\ -M_{\left( m-1\right) 1} & \vdots & \cdots & 1 & 0 \\ -M_{m1} & -M_{m2} & \cdots & -M_{m\left( m-1\right) } & 1 \end{array} \right]\nonumber \]
In words, beginning at the left column and moving toward the right, you simply insert, into the corresponding position in the identity matrix, \(-1\) times the multiplier which was used to zero out an entry in that position below the main diagonal in \(A,\) while retaining the main diagonal which consists entirely of ones. This is \(L.\)
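The rule in words can be compared with the product of the single-column factors. The Python sketch below is an illustration under stated assumptions: the dictionary `multipliers` uses zero-based (row, column) keys and holds the multipliers \(M_{21}=-2\), \(M_{31}=2\), \(M_{32}=7\) from the earlier \(3\times 3\) example, and the helper names are invented here.

```python
def matmul(p, q):
    """Multiply two matrices given as lists of rows."""
    return [
        [sum(p[i][k] * q[k][j] for k in range(len(q))) for j in range(len(q[0]))]
        for i in range(len(p))
    ]


def identity(m):
    """Return the m x m identity matrix."""
    return [[int(i == j) for j in range(m)] for i in range(m)]


m = 3
# Zero-based (row, column) -> multiplier used to zero out that position.
multipliers = {(1, 0): -2, (2, 0): 2, (2, 1): 7}

# Rule in words: insert -1 times each multiplier into the identity matrix.
L_by_insertion = identity(m)
for (i, j), mult in multipliers.items():
    L_by_insertion[i][j] = -mult

# Product of the inverse factors, one nonzero column each, left to right.
product = identity(m)
for col in range(m - 1):
    factor = identity(m)
    for (i, j), mult in multipliers.items():
        if j == col:
            factor[i][j] = -mult
    product = matmul(product, factor)

print(L_by_insertion)             # [[1, 0, 0], [2, 1, 0], [-2, -7, 1]]
print(product == L_by_insertion)  # True
```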