34.3: Focus on LU

Last updated
Save as PDF

Page ID: 70171

\( \newcommand{\vecs}[1]{\overset { \scriptstyle \rightharpoonup} {\mathbf{#1}} } \) \( \newcommand{\vecd}[1]{\overset{-\!-\!\rightharpoonup}{\vphantom{a}\smash {#1}}} \)\(\newcommand{\id}{\mathrm{id}}\) \( \newcommand{\Span}{\mathrm{span}}\) \( \newcommand{\kernel}{\mathrm{null}\,}\) \( \newcommand{\range}{\mathrm{range}\,}\) \( \newcommand{\RealPart}{\mathrm{Re}}\) \( \newcommand{\ImaginaryPart}{\mathrm{Im}}\) \( \newcommand{\Argument}{\mathrm{Arg}}\) \( \newcommand{\norm}[1]{\| #1 \|}\) \( \newcommand{\inner}[2]{\langle #1, #2 \rangle}\) \( \newcommand{\Span}{\mathrm{span}}\) \(\newcommand{\id}{\mathrm{id}}\) \( \newcommand{\Span}{\mathrm{span}}\) \( \newcommand{\kernel}{\mathrm{null}\,}\) \( \newcommand{\range}{\mathrm{range}\,}\) \( \newcommand{\RealPart}{\mathrm{Re}}\) \( \newcommand{\ImaginaryPart}{\mathrm{Im}}\) \( \newcommand{\Argument}{\mathrm{Arg}}\) \( \newcommand{\norm}[1]{\| #1 \|}\) \( \newcommand{\inner}[2]{\langle #1, #2 \rangle}\) \( \newcommand{\Span}{\mathrm{span}}\)\(\newcommand{\AA}{\unicode[.8,0]{x212B}}\)

In this assignment we will create algorithms that factorize invertible matrices (i.e., square matrices with linearly independent columns), \(A\), into the following decomposition (note: the terms decomposition and factorization are used interchangeably in literature):

LU Decomposition: \(A=LU\)

Each matrix in these decompositions is the matrix product of elementary matrices. Recall that an elementary matrix differs from the identity matrix (the square matrix with \(1\)s on the diagonal and \(0\)s elsewhere) by an elementary row operation.

The use of these matrix decompositions in Numerical Linear Algebra is motivated by computational efficiency or reduction of computational complexity (recall “Big-O notation” and counting the loops in your matrix multiplication algorithm) and numerical stability. Solving our old friend \(Ax=b\) by computing the inverse of \(A\), denoted \(A^{−1}\), and taking the matrix-vector product \(A^{−1}b=x\) is computational resource intensive and numerically unstable, in general. If the \(LU\) decomposition of \(A\) exists, then it will be less costly and more stable to:

Solve \(Ly=b\) for \(y\) by forward-substitution; and then
Solve \(Ux=y\) for \(x\) by backward-substitution

A final note to relate this assignment to the beginning of the course: The algorithms presented here are of a different class than the Jacobi Algorithm and Gauss-Siedel Algorithm. These are iterative algorithms. As you now know, this means that the algorithmic procedure is applied once, twice, and so on, until the output is within a desired tolerance, or until the process has been executed a given number of times (e.g., 100 iterations).

Gaussian Elimination & LU Decomposition

Recall that Gaussian elimination converts an arbitrary matrix, \(A\), into its row echelon form. For our purposes, let’s suppose that \(A\) is a square matrix and, therefore, an \(n \times n\) matrix. To simplify our tasks, let us further impose the condition that \(A\) is invertible. Thus, the columns of \(A\) are linearly independent. This means that Gaussian elimination will yield an upper-triangular matrix. Let us denote this matrix \(U\) for upper-triangular.

If there were a function, \(f\) that could take \(A\) and output \(U\), we could think of Gaussian Elimination as the following process:

\[f(A)=U \nonumber\]

With this information, we may now rewrite our equation from above as:

\[L^{-1}A = U \nonumber\]

You may have noticed the superscript in \(L^{−1}\). This just says that \(L^{−1}\) is the inverse of some matrix \(L\). And for any invertible matrix, \(L\), we have that the matrix products:

\[L^{-1}L = LL^{-1} = I \nonumber\]

This is analogous to (for every real number \(a \neq 0\)):

\[a^{-1}\times a = a\times a^{-1} = 1 \nonumber\]

Using the rules of matrix multiplication, verify the formula above by computing the following:

\[\begin{split}
L_{1}^{-1}L_{1} =
\left(
\begin{array}{*5{c}}
1 & 0 & 0 & 0 & 0 \\
-l_{21} & 1 & 0 & 0 & 0 \\
-l_{31} & 0 & 1 & 0 & 0 \\
-l_{41} & 0 & 0 & 1 & 0 \\
-l_{51} & 0 & 0 & 0 & 1 \\
\end{array}
\right) \left(
\begin{array}{*5{c}}
1 & 0 & 0 & 0 & 0 \\
l_{21} & 1 & 0 & 0 & 0 \\
l_{31} & 0 & 1 & 0 & 0 \\
l_{41} & 0 & 0 & 1 & 0 \\
l_{51} & 0 & 0 & 0 & 1 \\
\end{array}
\right)
=
\left(
\begin{array}{*5{c}}
1 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 1 \\
\end{array}
\right)
= I
\end{split} \nonumber\]

\[\begin{split}
L_{2}^{-1}L_{2} =
\left(
\begin{array}{*5{c}}
1 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 \\
0 & -l_{32} & 1 & 0 & 0 \\
0 & -l_{42} & 0 & 1 & 0 \\
0 & -l_{52} & 0 & 0 & 1 \\
\end{array}
\right) \left(
\begin{array}{*5{c}}
1 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 \\
0 & l_{32} & 1 & 0 & 0 \\
0 & l_{42} & 0 & 1 & 0 \\
0 & l_{52} & 0 & 0 & 1 \\
\end{array}
\right)
=
\left(
\begin{array}{*5{c}}
1 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 1 \\
\end{array}
\right)
= I
\end{split} \nonumber\]

\[\begin{split}
L_{4}^{-1}L_{4} =
\left(
\begin{array}{*5{c}}
1 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & -l_{54} & 1 \\
\end{array}
\right) \left(
\begin{array}{*5{c}}
1 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & l_{54} & 1 \\
\end{array}
\right)
=
\left(
\begin{array}{*5{c}}
1 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 1 \\
\end{array}
\right)
= I
\end{split} \nonumber\]

To understand \(L^{−1}\) more deeply, let’s turn our attention back to Gaussian elimination for a moment. Take as a given that, for a “sufficiently nice” \(n \times n\) matrix \(A\), the matrix \(L^{−1}\) that takes \(A\) to its upper-triangular or row echelon form, \(U\), has the structure:

\[L^{-1} = L_{n-1}L_{n-2} \ldots L_{2}L_{1} \nonumber\]

Each of the \(L_i\)is above is an elementary matrix that zeros out the subdiagonal entries of the \(i^{th}\) column of \(A\). This is the \(i^{th}\) step of Gaussian Elimination applied to the entire \(i^{th}\) column of A below the \(i^{th}\) diagonal element.

Let’s show this by computation of \(L_i\) for a “nice” matrix \(A\).

## Import all necessary packages
%matplotlib inline
import scipy.sparse as sparse #this helps to speed up the algorithms, but you will not use it. 
import numpy as np
import matplotlib.pyplot as plt
import sympy as sym
sym.init_printing(use_unicode=True)

## These will allow us to see a cool simulation of the Heat Equation problem (if we compute our answers correctly!)
from matplotlib import animation, rc
from IPython.display import HTML

Gaussian Elimination by Elementary Matrices, \(L_i\)

Let \(A\) be the following matrix:

\[\begin{split}A =
\begin{bmatrix}
2 & 1 & 1 & 0 \\
4 & 3 & 3 & 1 \\
8 & 7 & 9 & 5 \\
6 & 7 & 9 & 8 \\
\end{bmatrix}
\end{split} \nonumber\]

Do This

Create a \(4 \times 4\) unit lower-triangular matrix, \(L_1\) that eliminates all of the subdiagonal entries of the first column of \(A\). Display the matrix \(L_1\) using SymPy.

A = np.matrix([[2,1,1,0],[4,3,3,1],[8,7,9,5],[6,7,9,8]]) # Here is A for your convenience
As = sym.Matrix(A)
As

## Type your answer here ##
L1 = np.matrix([[1,,,],[,1,,],[,,1,],[,,,1]])

We should now have the following:

\[\begin{split}L_{1}A = A^{(1)} =
\begin{bmatrix}
2 & 1 & 1 & 0 \\
0 & u_{22} & u_{23} & u_{24} \\
0 & x & x & x \\
0 & x & x & x \\
\end{bmatrix}
=
\begin{bmatrix}
u_{11} & u_{12} & u_{13} & u_{14} \\
0 & u_{22} & u_{23} & u_{24} \\
0 & x & x & x \\
0 & x & x & x \\
\end{bmatrix}
\end{split} \nonumber\]

Since our first row remained unchanged, we know that our \(u_{1i}\) (the first row entries of \(U\)) are now determined. Similarly, we have that the \(u_{2i}\) (the second row entries of \(U\)) are determined as well. The \(x\) elements are elements that have changed, but are not yet in their final form. Note: Your \(u_{ij}\) will be whole, or integer (\(\mathbb{Z}\)), numbers.

Do This

Left-multiply \(A\) by \(L_1\) to confirm that all of the subdiagonal entries of the first column of \(A^{(1)}\) are zero. Display the result via SymPy.

## Type your answer here ##

Our next step will be to eliminate all nonzero entries from the second column of \(A^{(1)}=L_{1}A\) by left multiplication of \(L_2\). This should yield:

\[\begin{split}\begin{align}A^{(2)} &= L_{2}A^{(1)} \nonumber\\
&= L_{2}L_{1}A \nonumber\\
&=
\begin{bmatrix}
u_{11} & u_{12} & u_{13} & u_{14} \\
0 & u_{22} & u_{23} & u_{24} \\
0 & 0 & u_{33} & u_{34} \\
0 & 0 & x & x \\
\end{bmatrix}\nonumber
\end{align}
\end{split} \nonumber\]

Do This

Create a \(4 \times 4\) unit lower-triangular matrix, \(L_2\) that eliminates all of the subdiagonal entries of the second column of \(A^{(1)}\) yielding \(A^{(2)}\) as above. Display the matrix \(L_2\) using SymPy.

## Type your answer here ##
L2 = np.matrix([[1,,,],[,1,,],[,,1,],[,,,1]]) # for your convenience

Do This

Left-multiply \(A^{(1)}\) by \(L_2\) to confirm that all of the subdiagonal entries of column 2 of \(A^{(2)}\) are zero. Display the result via SymPy. Note: Your \(u_{ij}\) will be whole, or Integer (\(\mathbb{Z}\)), numbers.

## Type your answer here ##

We should now have:

\[\begin{split}
\begin{align}A^{(2)} &= L_{2}A^{(1)} \nonumber\\
&= L_{2}L_{1}A \nonumber\\
&=
\begin{bmatrix}
u_{11} & u_{12} & u_{13} & u_{14} \\
0 & u_{22} & u_{23} & u_{24} \\
0 & 0 & u_{33} & u_{34} \\
0 & 0 & x & x \\
\end{bmatrix}\nonumber
\end{align}
\end{split} \nonumber\]

We now want to build the final matrix \(L_3\) that will take our matrix \(A^{(2)}\) to upper-triangular form. So, let’s do that!

Do This

Create a \(4 \times 4\) unit lower-triangular matrix, \(L_3\) that eliminates all of the subdiagonal entries of the third column of \(A^{(2)}\) yielding:

\[
\begin{align}A^{(3)} &= L_{3}A^{(2)} \nonumber\\
&= L_{3}L_{2}A^{(1)} \nonumber\\
&= L_{3}L_{2}L_{1}A \nonumber\\
&= U \nonumber
\end{align}
\nonumber\]

Display the matrix \(L_3\) using SymPy.

## Type your answer here ##
L3 = np.matrix([[1,,,],[,1,,],[,,1,],[,,,1]]) # for your convenience

Do This

Left-multiply \(A^{(2)}\) by \(L_3\) to confirm that all of the subdiagonal entries of column 3 of \(A^{(3)}\) are zero. Display the result via SymPy. Note: Your \(u_{ij}\) will be whole, or integer (\(\mathbb{Z}\)), numbers. You should now notice that \(A^{(3)} = U\) is in row echelon form, and, hence, \(U\) is an upper-triangular matrix with \(0\)s below the diagonal!

## Type your answer here ##

Congratulations!

You have just decomposed your first matrix via the process below (and should now have a matrix, \(U\), that looks like the one below):

\[\begin{split}
\begin{align}L^{-1}A &= L_{3}L_{2}L_{1}A \nonumber\\
&= L_{3}L_{2}A^{(1)} \nonumber\\
&= L_{3}A^{(2)} \nonumber\\
&= A^{(3)} \nonumber\\
&= U \nonumber\\
&=
\begin{bmatrix}
2 & 1 & 1 & 0 \\
0 & 1 & 1 & 1 \\
0 & 0 & 2 & 2 \\
0 & 0 & 0 & 2
\end{bmatrix} \nonumber
\end{align}
\end{split} \nonumber\]

Do This

Finally, let’s explicitly generate the matrices \(L^{−1}\) and \(L\). Then, display them using SymPy.

It will be helpful to use the following:

\[\begin{align}L^{-1} &= L_{n-1}L_{n-2} \ldots L_{2}L_{1} \nonumber \end{align} \nonumber\]

and

\[\begin{align}L &= (L^{-1})^{-1} \nonumber \\
&= (L_{n-1}L_{n-2}...L_{2}L_{1})^{-1} \nonumber \\
&= L_{1}^{-1}L_{2}^{-1}...L_{n-2}^{-1}L_{n-1}^{-1} \nonumber
\end{align}
\nonumber\]

If you’re stuck, refer to the paragraph at the beginning of this section for the explicit formula. Recall: \(L^{-1}L = LL^{-1} = I\)

## Type your answer here ##

Do This

Look at all the matrices \(L_i\) and see the connections between the final \(L\).

print(L1)
print(L2)
print(L3)
print(L)

For our last bit of LU decomposition fun, let’s confirm that your matrices \(L\) and \(U\) fulfill the equality:

\[A = LU \nonumber\]

Indeed, there is a function in SymPy that will compute the LU decomposition for us.

Do This

Run the following function and print its outputs:

\[\texttt{L_actual, U_actual, _ = As.LUdecomposition()} \nonumber\]

Then, compute:

\[\texttt{L_actual*U_actual - As} \nonumber\]

and confirm that it outputs the zero matrix.

## Type your answer here ##

General LU Decomposition Algorithm

Do This

Using the scaffolded code below, complete the LU decomposition algorithm. (It may be helpful to test your code on the matrix \(A\) from above.)

## Type your answer here ##
C = np.matrix([[2,1,1,0],[4,3,3,1],[8,7,9,5],[6,7,9,8]]) # to test

def LU_decomp(B):
    n = len(B)
    U = B.copy()
    L = np.identity(n)
    for k in np.arange(0,n-1):
        for j in np.arange(k+1,n):
            L[j,k] =
            U[j,k:n] = U[,:] - L[,]*U[,:]
    return np.mat(L), np.mat(U)

L1,U1 = LU_dec(C) # syntax for returning matrices
np.linalg.norm(L1*U1 - A) # Test: should return 0

Solve \(Ax=b\) via LU Decomposition

You may wish to refer to the introduction of this assignment for a general overview of how to use LU Decomposition to solve \(Ax=b\).

Do This

Using the scaffolded code below, complete the LU solver algorithm. The algorithm should solve \(Ly = b\) for \(y\) via Forward-Substitution and then \(Ux=y\) for \(x\) by Backward-Substitution. (It may be helpful to test your code on a matrix \(A\) and vector \(b\) from homework 1 or another source.)

## Type your answer here ##
def LU_Axb_solve(A,b):
    L,U = LU_decomp(A)
    n = len(A)
    # Forward-Substitution: Ly = b for y
    y = np.zeros((,))
    for i in np.arange(0,n):
        y[i] = b[i].copy()
        for j in np.arange(0,i):
            y[] = y[] - L[,]*y[]
       
    # Backward-Substitution: Ux = y for x
    x = np.zeros((n,1))
    for i in np.arange(n-1,-1,-1):
        x[] = y[].copy()
        for j in np.arange(n-1,i,-1):
            x[] = x[] - U[,]*x[]
        x[] = x[]/U[,]
    
    return np.mat(x)