Skip to main content
Mathematics LibreTexts

34.3: Focus on LU

  • Page ID
    70171
  • \( \newcommand{\vecs}[1]{\overset { \scriptstyle \rightharpoonup} {\mathbf{#1}} } \) \( \newcommand{\vecd}[1]{\overset{-\!-\!\rightharpoonup}{\vphantom{a}\smash {#1}}} \)\(\newcommand{\id}{\mathrm{id}}\) \( \newcommand{\Span}{\mathrm{span}}\) \( \newcommand{\kernel}{\mathrm{null}\,}\) \( \newcommand{\range}{\mathrm{range}\,}\) \( \newcommand{\RealPart}{\mathrm{Re}}\) \( \newcommand{\ImaginaryPart}{\mathrm{Im}}\) \( \newcommand{\Argument}{\mathrm{Arg}}\) \( \newcommand{\norm}[1]{\| #1 \|}\) \( \newcommand{\inner}[2]{\langle #1, #2 \rangle}\) \( \newcommand{\Span}{\mathrm{span}}\) \(\newcommand{\id}{\mathrm{id}}\) \( \newcommand{\Span}{\mathrm{span}}\) \( \newcommand{\kernel}{\mathrm{null}\,}\) \( \newcommand{\range}{\mathrm{range}\,}\) \( \newcommand{\RealPart}{\mathrm{Re}}\) \( \newcommand{\ImaginaryPart}{\mathrm{Im}}\) \( \newcommand{\Argument}{\mathrm{Arg}}\) \( \newcommand{\norm}[1]{\| #1 \|}\) \( \newcommand{\inner}[2]{\langle #1, #2 \rangle}\) \( \newcommand{\Span}{\mathrm{span}}\)\(\newcommand{\AA}{\unicode[.8,0]{x212B}}\)

    In this assignment we will create algorithms that factorize invertible matrices (i.e., square matrices with linearly independent columns), \(A\), into the following decomposition (note: the terms decomposition and factorization are used interchangeably in literature):

    • LU Decomposition: \(A=LU\)

    Each matrix in these decompositions is the matrix product of elementary matrices. Recall that an elementary matrix differs from the identity matrix (the square matrix with \(1\)s on the diagonal and \(0\)s elsewhere) by an elementary row operation.

    The use of these matrix decompositions in Numerical Linear Algebra is motivated by computational efficiency or reduction of computational complexity (recall “Big-O notation” and counting the loops in your matrix multiplication algorithm) and numerical stability. Solving our old friend \(Ax=b\) by computing the inverse of \(A\), denoted \(A^{−1}\), and taking the matrix-vector product \(A^{−1}b=x\) is computational resource intensive and numerically unstable, in general. If the \(LU\) decomposition of \(A\) exists, then it will be less costly and more stable to:

    1. Solve \(Ly=b\) for \(y\) by forward-substitution; and then
    2. Solve \(Ux=y\) for \(x\) by backward-substitution

    A final note to relate this assignment to the beginning of the course: The algorithms presented here are of a different class than the Jacobi Algorithm and Gauss-Siedel Algorithm. These are iterative algorithms. As you now know, this means that the algorithmic procedure is applied once, twice, and so on, until the output is within a desired tolerance, or until the process has been executed a given number of times (e.g., 100 iterations).

    Gaussian Elimination & LU Decomposition

    Recall that Gaussian elimination converts an arbitrary matrix, \(A\), into its row echelon form. For our purposes, let’s suppose that \(A\) is a square matrix and, therefore, an \(n \times n\) matrix. To simplify our tasks, let us further impose the condition that \(A\) is invertible. Thus, the columns of \(A\) are linearly independent. This means that Gaussian elimination will yield an upper-triangular matrix. Let us denote this matrix \(U\) for upper-triangular.

    If there were a function, \(f\) that could take \(A\) and output \(U\), we could think of Gaussian Elimination as the following process:

    \[f(A)=U \nonumber\]

    With this information, we may now rewrite our equation from above as:

    \[L^{-1}A = U \nonumber\]

    You may have noticed the superscript in \(L^{−1}\). This just says that \(L^{−1}\) is the inverse of some matrix \(L\). And for any invertible matrix, \(L\), we have that the matrix products:

    \[L^{-1}L = LL^{-1} = I \nonumber\]

    This is analogous to (for every real number \(a \neq 0\)):

    \[a^{-1}\times a = a\times a^{-1} = 1 \nonumber\]

    Using the rules of matrix multiplication, verify the formula above by computing the following:

    \[\begin{split}
    L_{1}^{-1}L_{1} =
    \left(
    \begin{array}{*5{c}}
    1 & 0 & 0 & 0 & 0 \\
    -l_{21} & 1 & 0 & 0 & 0 \\
    -l_{31} & 0 & 1 & 0 & 0 \\
    -l_{41} & 0 & 0 & 1 & 0 \\
    -l_{51} & 0 & 0 & 0 & 1 \\
    \end{array}
    \right) \left(
    \begin{array}{*5{c}}
    1 & 0 & 0 & 0 & 0 \\
    l_{21} & 1 & 0 & 0 & 0 \\
    l_{31} & 0 & 1 & 0 & 0 \\
    l_{41} & 0 & 0 & 1 & 0 \\
    l_{51} & 0 & 0 & 0 & 1 \\
    \end{array}
    \right)
    =
    \left(
    \begin{array}{*5{c}}
    1 & 0 & 0 & 0 & 0 \\
    0 & 1 & 0 & 0 & 0 \\
    0 & 0 & 1 & 0 & 0 \\
    0 & 0 & 0 & 1 & 0 \\
    0 & 0 & 0 & 0 & 1 \\
    \end{array}
    \right)
    = I
    \end{split} \nonumber\]

    \[\begin{split}
    L_{2}^{-1}L_{2} =
    \left(
    \begin{array}{*5{c}}
    1 & 0 & 0 & 0 & 0 \\
    0 & 1 & 0 & 0 & 0 \\
    0 & -l_{32} & 1 & 0 & 0 \\
    0 & -l_{42} & 0 & 1 & 0 \\
    0 & -l_{52} & 0 & 0 & 1 \\
    \end{array}
    \right) \left(
    \begin{array}{*5{c}}
    1 & 0 & 0 & 0 & 0 \\
    0 & 1 & 0 & 0 & 0 \\
    0 & l_{32} & 1 & 0 & 0 \\
    0 & l_{42} & 0 & 1 & 0 \\
    0 & l_{52} & 0 & 0 & 1 \\
    \end{array}
    \right)
    =
    \left(
    \begin{array}{*5{c}}
    1 & 0 & 0 & 0 & 0 \\
    0 & 1 & 0 & 0 & 0 \\
    0 & 0 & 1 & 0 & 0 \\
    0 & 0 & 0 & 1 & 0 \\
    0 & 0 & 0 & 0 & 1 \\
    \end{array}
    \right)
    = I
    \end{split} \nonumber\]

    \[\begin{split}
    L_{4}^{-1}L_{4} =
    \left(
    \begin{array}{*5{c}}
    1 & 0 & 0 & 0 & 0 \\
    0 & 1 & 0 & 0 & 0 \\
    0 & 0 & 1 & 0 & 0 \\
    0 & 0 & 0 & 1 & 0 \\
    0 & 0 & 0 & -l_{54} & 1 \\
    \end{array}
    \right) \left(
    \begin{array}{*5{c}}
    1 & 0 & 0 & 0 & 0 \\
    0 & 1 & 0 & 0 & 0 \\
    0 & 0 & 1 & 0 & 0 \\
    0 & 0 & 0 & 1 & 0 \\
    0 & 0 & 0 & l_{54} & 1 \\
    \end{array}
    \right)
    =
    \left(
    \begin{array}{*5{c}}
    1 & 0 & 0 & 0 & 0 \\
    0 & 1 & 0 & 0 & 0 \\
    0 & 0 & 1 & 0 & 0 \\
    0 & 0 & 0 & 1 & 0 \\
    0 & 0 & 0 & 0 & 1 \\
    \end{array}
    \right)
    = I
    \end{split} \nonumber\]

    To understand \(L^{−1}\) more deeply, let’s turn our attention back to Gaussian elimination for a moment. Take as a given that, for a “sufficiently nice” \(n \times n\) matrix \(A\), the matrix \(L^{−1}\) that takes \(A\) to its upper-triangular or row echelon form, \(U\), has the structure:

    \[L^{-1} = L_{n-1}L_{n-2} \ldots L_{2}L_{1} \nonumber\]

    Each of the \(L_i\)is above is an elementary matrix that zeros out the subdiagonal entries of the \(i^{th}\) column of \(A\). This is the \(i^{th}\) step of Gaussian Elimination applied to the entire \(i^{th}\) column of A below the \(i^{th}\) diagonal element.

    Let’s show this by computation of \(L_i\) for a “nice” matrix \(A\).

    ## Import all necessary packages
    %matplotlib inline
    import scipy.sparse as sparse #this helps to speed up the algorithms, but you will not use it. 
    import numpy as np
    import matplotlib.pyplot as plt
    import sympy as sym
    sym.init_printing(use_unicode=True)
    
    ## These will allow us to see a cool simulation of the Heat Equation problem (if we compute our answers correctly!)
    from matplotlib import animation, rc
    from IPython.display import HTML

    Gaussian Elimination by Elementary Matrices, \(L_i\)

    Let \(A\) be the following matrix:

    \[\begin{split}A =
    \begin{bmatrix}
    2 & 1 & 1 & 0 \\
    4 & 3 & 3 & 1 \\
    8 & 7 & 9 & 5 \\
    6 & 7 & 9 & 8 \\
    \end{bmatrix}
    \end{split} \nonumber\]

    Do This

    Create a \(4 \times 4\) unit lower-triangular matrix, \(L_1\) that eliminates all of the subdiagonal entries of the first column of \(A\). Display the matrix \(L_1\) using SymPy.

    A = np.matrix([[2,1,1,0],[4,3,3,1],[8,7,9,5],[6,7,9,8]]) # Here is A for your convenience
    As = sym.Matrix(A)
    As
    ## Type your answer here ##
    L1 = np.matrix([[1,,,],[,1,,],[,,1,],[,,,1]])

    We should now have the following:

    \[\begin{split}L_{1}A = A^{(1)} =
    \begin{bmatrix}
    2 & 1 & 1 & 0 \\
    0 & u_{22} & u_{23} & u_{24} \\
    0 & x & x & x \\
    0 & x & x & x \\
    \end{bmatrix}
    =
    \begin{bmatrix}
    u_{11} & u_{12} & u_{13} & u_{14} \\
    0 & u_{22} & u_{23} & u_{24} \\
    0 & x & x & x \\
    0 & x & x & x \\
    \end{bmatrix}
    \end{split} \nonumber\]

    Since our first row remained unchanged, we know that our \(u_{1i}\) (the first row entries of \(U\)) are now determined. Similarly, we have that the \(u_{2i}\) (the second row entries of \(U\)) are determined as well. The \(x\) elements are elements that have changed, but are not yet in their final form. Note: Your \(u_{ij}\) will be whole, or integer (\(\mathbb{Z}\)), numbers.

    Do This

    Left-multiply \(A\) by \(L_1\) to confirm that all of the subdiagonal entries of the first column of \(A^{(1)}\) are zero. Display the result via SymPy.

    ## Type your answer here ##

    Our next step will be to eliminate all nonzero entries from the second column of \(A^{(1)}=L_{1}A\) by left multiplication of \(L_2\). This should yield:

    \[\begin{split}\begin{align}A^{(2)} &= L_{2}A^{(1)} \nonumber\\
    &= L_{2}L_{1}A \nonumber\\
    &=
    \begin{bmatrix}
    u_{11} & u_{12} & u_{13} & u_{14} \\
    0 & u_{22} & u_{23} & u_{24} \\
    0 & 0 & u_{33} & u_{34} \\
    0 & 0 & x & x \\
    \end{bmatrix}\nonumber
    \end{align}
    \end{split} \nonumber\]

    Do This

    Create a \(4 \times 4\) unit lower-triangular matrix, \(L_2\) that eliminates all of the subdiagonal entries of the second column of \(A^{(1)}\) yielding \(A^{(2)}\) as above. Display the matrix \(L_2\) using SymPy.

    ## Type your answer here ##
    L2 = np.matrix([[1,,,],[,1,,],[,,1,],[,,,1]]) # for your convenience
    Do This

    Left-multiply \(A^{(1)}\) by \(L_2\) to confirm that all of the subdiagonal entries of column 2 of \(A^{(2)}\) are zero. Display the result via SymPy. Note: Your \(u_{ij}\) will be whole, or Integer (\(\mathbb{Z}\)), numbers.

    ## Type your answer here ##

    We should now have:

    \[\begin{split}
    \begin{align}A^{(2)} &= L_{2}A^{(1)} \nonumber\\
    &= L_{2}L_{1}A \nonumber\\
    &=
    \begin{bmatrix}
    u_{11} & u_{12} & u_{13} & u_{14} \\
    0 & u_{22} & u_{23} & u_{24} \\
    0 & 0 & u_{33} & u_{34} \\
    0 & 0 & x & x \\
    \end{bmatrix}\nonumber
    \end{align}
    \end{split} \nonumber\]

    We now want to build the final matrix \(L_3\) that will take our matrix \(A^{(2)}\) to upper-triangular form. So, let’s do that!

    Do This

    Create a \(4 \times 4\) unit lower-triangular matrix, \(L_3\) that eliminates all of the subdiagonal entries of the third column of \(A^{(2)}\) yielding:

    \[
    \begin{align}A^{(3)} &= L_{3}A^{(2)} \nonumber\\
    &= L_{3}L_{2}A^{(1)} \nonumber\\
    &= L_{3}L_{2}L_{1}A \nonumber\\
    &= U \nonumber
    \end{align}
    \nonumber\]

    Display the matrix \(L_3\) using SymPy.

    ## Type your answer here ##
    L3 = np.matrix([[1,,,],[,1,,],[,,1,],[,,,1]]) # for your convenience
    Do This

    Left-multiply \(A^{(2)}\) by \(L_3\) to confirm that all of the subdiagonal entries of column 3 of \(A^{(3)}\) are zero. Display the result via SymPy. Note: Your \(u_{ij}\) will be whole, or integer (\(\mathbb{Z}\)), numbers. You should now notice that \(A^{(3)} = U\) is in row echelon form, and, hence, \(U\) is an upper-triangular matrix with \(0\)s below the diagonal!

    ## Type your answer here ##

    Congratulations!

    You have just decomposed your first matrix via the process below (and should now have a matrix, \(U\), that looks like the one below):

    \[\begin{split}
    \begin{align}L^{-1}A &= L_{3}L_{2}L_{1}A \nonumber\\
    &= L_{3}L_{2}A^{(1)} \nonumber\\
    &= L_{3}A^{(2)} \nonumber\\
    &= A^{(3)} \nonumber\\
    &= U \nonumber\\
    &=
    \begin{bmatrix}
    2 & 1 & 1 & 0 \\
    0 & 1 & 1 & 1 \\
    0 & 0 & 2 & 2 \\
    0 & 0 & 0 & 2
    \end{bmatrix} \nonumber
    \end{align}
    \end{split} \nonumber\]

    Do This

    Finally, let’s explicitly generate the matrices \(L^{−1}\) and \(L\). Then, display them using SymPy.

    It will be helpful to use the following:

    \[\begin{align}L^{-1} &= L_{n-1}L_{n-2} \ldots L_{2}L_{1} \nonumber \end{align} \nonumber\]

    and

    \[\begin{align}L &= (L^{-1})^{-1} \nonumber \\
    &= (L_{n-1}L_{n-2}...L_{2}L_{1})^{-1} \nonumber \\
    &= L_{1}^{-1}L_{2}^{-1}...L_{n-2}^{-1}L_{n-1}^{-1} \nonumber
    \end{align}
    \nonumber\]

    If you’re stuck, refer to the paragraph at the beginning of this section for the explicit formula. Recall: \(L^{-1}L = LL^{-1} = I\)

    ## Type your answer here ##
    Do This

    Look at all the matrices \(L_i\) and see the connections between the final \(L\).

    print(L1)
    print(L2)
    print(L3)
    print(L)

    For our last bit of LU decomposition fun, let’s confirm that your matrices \(L\) and \(U\) fulfill the equality:

    \[A = LU \nonumber\]

    Indeed, there is a function in SymPy that will compute the LU decomposition for us.

    Do This

    Run the following function and print its outputs:

    \[\texttt{L_actual, U_actual, _ = As.LUdecomposition()} \nonumber\]

    Then, compute:

    \[\texttt{L_actual*U_actual - As} \nonumber\]

    and confirm that it outputs the zero matrix.

    ## Type your answer here ##

    General LU Decomposition Algorithm

    Do This

    Using the scaffolded code below, complete the LU decomposition algorithm. (It may be helpful to test your code on the matrix \(A\) from above.)

    ## Type your answer here ##
    C = np.matrix([[2,1,1,0],[4,3,3,1],[8,7,9,5],[6,7,9,8]]) # to test
    
    def LU_decomp(B):
        n = len(B)
        U = B.copy()
        L = np.identity(n)
        for k in np.arange(0,n-1):
            for j in np.arange(k+1,n):
                L[j,k] =
                U[j,k:n] = U[,:] - L[,]*U[,:]
        return np.mat(L), np.mat(U)
    
    L1,U1 = LU_dec(C) # syntax for returning matrices
    np.linalg.norm(L1*U1 - A) # Test: should return 0

    Solve \(Ax=b\) via LU Decomposition

    You may wish to refer to the introduction of this assignment for a general overview of how to use LU Decomposition to solve \(Ax=b\).

    Do This

    Using the scaffolded code below, complete the LU solver algorithm. The algorithm should solve \(Ly = b\) for \(y\) via Forward-Substitution and then \(Ux=y\) for \(x\) by Backward-Substitution. (It may be helpful to test your code on a matrix \(A\) and vector \(b\) from homework 1 or another source.)

    ## Type your answer here ##
    def LU_Axb_solve(A,b):
        L,U = LU_decomp(A)
        n = len(A)
        # Forward-Substitution: Ly = b for y
        y = np.zeros((,))
        for i in np.arange(0,n):
            y[i] = b[i].copy()
            for j in np.arange(0,i):
                y[] = y[] - L[,]*y[]
           
        # Backward-Substitution: Ux = y for x
        x = np.zeros((n,1))
        for i in np.arange(n-1,-1,-1):
            x[] = y[].copy()
            for j in np.arange(n-1,i,-1):
                x[] = x[] - U[,]*x[]
            x[] = x[]/U[,]
        
        return np.mat(x)

    This page titled 34.3: Focus on LU is shared under a CC BY-NC 4.0 license and was authored, remixed, and/or curated by Dirk Colbry via source content that was edited to the style and standards of the LibreTexts platform; a detailed edit history is available upon request.

    • Was this article helpful?