
6.2: Linear Maps and Functionals. Matrices


    For an adequate definition of differentiability, we need the notion of a linear map. Below, \(E^{\prime}, E^{\prime \prime},\) and \(E\) denote normed spaces over the same scalar field, \(E^{1}\) or \(C.\)

    Definition 1

    A function \(f : E^{\prime} \rightarrow E\) is a linear map if and only if for all \(\vec{x}, \vec{y} \in E^{\prime}\) and scalars \(a, b\)

\[f(a \vec{x}+b \vec{y})=a f(\vec{x})+b f(\vec{y}); \tag{1}\]

    equivalently, iff for all such \(\vec{x}, \vec{y},\) and \(a\)

\[f(\vec{x}+\vec{y})=f(\vec{x})+f(\vec{y}) \text { and } f(a \vec{x})=a f(\vec{x}). \text { (Verify!)}\]

    If \(E=E^{\prime},\) such a map is also called a linear operator.

If the range space \(E\) is the scalar field of \(E^{\prime}\) (i.e., \(E^{1}\) or \(C\)), the linear map \(f\) is also called a (real or complex) linear functional on \(E^{\prime}.\)

    Note 1. Induction extends formula (1) to any "linear combinations":

\[f\left(\sum_{i=1}^{m} a_{i} \vec{x}_{i}\right)=\sum_{i=1}^{m} a_{i} f\left(\vec{x}_{i}\right) \tag{2}\]

    for all \(\vec{x}_{i} \in E^{\prime}\) and scalars \(a_{i}\).

    Briefly: A linear map \(f\) preserves linear combinations.

    Note 2. Taking \(a=b=0\) in (1), we obtain \(f(\overrightarrow{0})=0\) if \(f\) is linear.

    Examples

    (a) Let \(E^{\prime}=E^{n}\left(C^{n}\right).\) Fix a vector \(\vec{v}=\left(v_{1}, \ldots, v_{n}\right)\) in \(E^{\prime}\) and set

    \[\left(\forall \vec{x} \in E^{\prime}\right) \quad f(\vec{x})=\vec{x} \cdot \vec{v}\]

    (inner product; see Chapter 3, §§1-3 and §9).

    Then

    \[\begin{aligned} f(a \vec{x}+b \vec{y}) &=(a \vec{x}) \cdot \vec{v}+(b \vec{y}) \cdot \vec{v} \\ &=a(\vec{x} \cdot \vec{v})+b(\vec{y} \cdot \vec{v}) \\ &=a f(\vec{x})+b f(\vec{y}); \end{aligned}\]

    so \(f\) is linear. Note that if \(E^{\prime}=E^{n},\) then by definition,

    \[f(\vec{x})=\vec{x} \cdot \vec{v}=\sum_{k=1}^{n} x_{k} v_{k}=\sum_{k=1}^{n} v_{k} x_{k}.\]

    If, however, \(E^{\prime}=C^{n},\) then

    \[f(\vec{x})=\vec{x} \cdot \vec{v}=\sum_{k=1}^{n} x_{k} \overline{v}_{k}=\sum_{k=1}^{n} \overline{v}_{k} x_{k},\]

    where \(\overline{v}_{k}\) is the conjugate of the complex number \(v_{k}\).

    By Theorem 3 in Chapter 4, §3, \(f\) is continuous (a polynomial!).

    Moreover, \(f(\vec{x})=\vec{x} \cdot \vec{v}\) is a scalar (in \(E^{1}\) or \(C).\) Thus the range of \(f\) lies in the scalar field of \(E^{\prime};\) so \(f\) is a linear functional on \(E^{\prime}.\)
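Example (a) can be sketched numerically; the following Python snippet (with NumPy standing in for \(E^{3}\), and arbitrary sample vectors not taken from the text) checks the linearity identity (1), the fact \(f(\overrightarrow{0})=0\) of Note 2, and the conjugation convention in the complex case:

```python
import numpy as np

# Fix a vector v in E^3 and set f(x) = x . v, as in Example (a).
# The particular numbers are arbitrary illustrations.
v = np.array([1.0, -2.0, 3.0])

def f(x):
    return x @ v  # dot product with the fixed vector v

x = np.array([2.0, 0.5, -1.0])
y = np.array([-1.0, 4.0, 2.0])
a, b = 3.0, -0.5

# linearity: f(a x + b y) = a f(x) + b f(y)
assert np.isclose(f(a * x + b * y), a * f(x) + b * f(y))

# Note 2: f(0) = 0
assert f(np.zeros(3)) == 0.0

# the complex case uses conjugates: f(z) = sum z_k * conj(w_k);
# NumPy's vdot conjugates its first argument
w = np.array([1 + 2j, -1j])
z = np.array([3 - 1j, 2 + 2j])
assert np.isclose(np.vdot(w, z), np.sum(z * np.conj(w)))
```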

    (b) Let \(I=[0,1].\) Let \(E^{\prime}\) be the set of all functions \(u : I \rightarrow E\) that are of class \(CD^{\infty}\) (Chapter 5, §6) on \(I\), hence bounded there (Theorem 2 of Chapter 4, §8).

    As in Example (C) in Chapter 3, §10, \(E^{\prime}\) is a normed linear space, with norm

    \[\|u\|=\sup _{x \in I}|u(x)|.\]

    Here each function \(u \in E^{\prime}\) is treated as a single "point" in \(E^{\prime}.\) The
    distance between two such points, \(u\) and \(v,\) equals \(\|u-v\|,\) by definition.

    Now define a map \(D\) on \(E^{\prime}\) by setting \(D(u)=u^{\prime}\) (derivative of \(u\) on \(I\)). As every \(u \in E^{\prime}\) is of class \(CD^{\infty},\) so is \(u^{\prime}.\)

    Thus \(D(u)=u^{\prime} \in E^{\prime},\) and so \(D : E^{\prime} \rightarrow E^{\prime}\) is a linear operator. (Its linearity follows from Theorem 4 in Chapter 5, §1.)

    (c) Let again \(I=[0,1].\) Let \(E^{\prime}\) be the set of all functions \(u : I \rightarrow E\) that are bounded and have antiderivatives (Chapter 5, §5) on \(I.\) With norm \(\|u\|\) as in Example (b), \(E^{\prime}\) is a normed linear space.

    Now define \(\phi : E^{\prime} \rightarrow E\) by

    \[\phi(u)=\int_{0}^{1} u,\]

    with \(\int u\) as in Chapter 5, §5. (Recall that \(\int_{0}^{1} u\) is an element of \(E\) if \(u : I \rightarrow E.\) ) By Corollary 1 in Chapter 5, §5, \(\phi\) is a linear map of \(E^{\prime}\) into \(E\). (Why?)
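The linearity of \(\phi\) in Example (c) can also be seen numerically. The sketch below approximates \(\int_{0}^{1} u\) by a trapezoid sum on a grid (an illustration only, not the antiderivative construction of Chapter 5, §5), and checks \(\phi(a u+b v)=a \phi(u)+b \phi(v)\):

```python
import numpy as np

# Approximate phi(u) = integral of u over [0, 1] by a trapezoid sum.
t = np.linspace(0.0, 1.0, 10001)

def phi(u):
    vals = u(t)
    # trapezoid rule on the grid t
    return float(np.sum(0.5 * (vals[1:] + vals[:-1]) * np.diff(t)))

a, b = 2.0, -3.0
lhs = phi(lambda s: a * np.sin(s) + b * np.exp(s))
rhs = a * phi(np.sin) + b * phi(np.exp)

# phi(a u + b v) = a phi(u) + b phi(v)
assert np.isclose(lhs, rhs)
```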

    (d) The zero map \(f=0\) on \(E^{\prime}\) is always linear. (Why?)

    Theorem \(\PageIndex{1}\)

    A linear map \(f : E^{\prime} \rightarrow E\) is continuous (even uniformly so) on all of \(E^{\prime}\) iff it is continuous at \(\overrightarrow{0};\) equivalently, iff there is a real \(c>0\) such that

\[\left(\forall \vec{x} \in E^{\prime}\right) \quad|f(\vec{x})| \leq c|\vec{x}|. \tag{3}\]

    (We call this property linear boundedness.)

    Proof

    Assume that \(f\) is continuous at \(\overrightarrow{0}.\) Then, given \(\varepsilon>0,\) there is \(\delta>0\) such that

    \[|f(\vec{x})-f(\overrightarrow{0})|=|f(\vec{x})| \leq \varepsilon\]

    whenever \(|\vec{x}-\overrightarrow{0}|=|\vec{x}|<\delta\).

    Now, for any \(\vec{x} \neq \overrightarrow{0},\) we surely have

\[\left|\frac{\delta \vec{x}}{2|\vec{x}|}\right|=\frac{\delta}{2}<\delta.\]

    Hence

    \[(\forall \vec{x} \neq \overrightarrow{0}) \quad\left|f\left(\frac{\delta \vec{x}}{2|\vec{x}|}\right)\right| \leq \varepsilon,\]

    or, by linearity,

    \[\frac{\delta}{2|\vec{x}|}|f(\vec{x})| \leq \varepsilon,\]

    i.e.,

    \[|f(\vec{x})| \leq \frac{2 \varepsilon}{\delta}|\vec{x}|.\]

    By Note 2, this also holds if \(\vec{x}=\overrightarrow{0}\).

    Thus, taking \(c=2 \varepsilon / \delta,\) we obtain

\[\left(\forall \vec{x} \in E^{\prime}\right) \quad|f(\vec{x})| \leq c|\vec{x}| \quad \text {(linear boundedness).}\]

    Now assume (3). Then

    \[\left(\forall \vec{x}, \vec{y} \in E^{\prime}\right) \quad|f(\vec{x}-\vec{y})| \leq c|\vec{x}-\vec{y}|;\]

    or, by linearity,

    \[\left(\forall \vec{x}, \vec{y} \in E^{\prime}\right) \quad|f(\vec{x})-f(\vec{y})| \leq c|\vec{x}-\vec{y}|.\]

    Hence \(f\) is uniformly continuous (given \(\varepsilon>0,\) take \(\delta=\varepsilon / c).\) This, in turn, implies continuity at \(\overrightarrow{0};\) so all conditions are equivalent, as claimed. \(\quad \square\)

    A linear map need not be continuous. But, for \(E^{n}\) and \(C^{n},\) we have the following result.

    Theorem \(\PageIndex{2}\)

    (i) Any linear map on \(E^{n}\) or \(C^{n}\) is uniformly continuous.

    (ii) Every linear functional on \(E^{n}\left(C^{n}\right)\) has the form

\[f(\vec{x})=\vec{x} \cdot \vec{v} \quad \text {(dot product)} \tag{4}\]

    for some unique vector \(\vec{v} \in E^{n}\left(C^{n}\right),\) dependent on \(f\) only.

    Proof

    Suppose \(f : E^{n} \rightarrow E\) is linear; so \(f\) preserves linear combinations.

    But every \(\vec{x} \in E^{n}\) is such a combination,

    \[\vec{x}=\sum_{k=1}^{n} x_{k} \vec{e}_{k} \quad \text {(Theorem 2 in Chapter 3, §§1-3).}\]

    Thus, by Note 1,

    \[f(\vec{x})=f\left(\sum_{k=1}^{n} x_{k} \vec{e}_{k}\right)=\sum_{k=1}^{n} x_{k} f\left(\vec{e}_{k}\right).\]

    Here the function values \(f\left(\vec{e}_{k}\right)\) are fixed vectors in the range space \(E,\) say,

    \[f\left(\vec{e}_{k}\right)=v_{k} \in E,\]

    so that

\[f(\vec{x})=\sum_{k=1}^{n} x_{k} f\left(\vec{e}_{k}\right)=\sum_{k=1}^{n} x_{k} v_{k}, \quad v_{k} \in E. \tag{5}\]

    Thus \(f\) is a polynomial in \(n\) real variables \(x_{k},\) hence continuous (even uniformly so, by Theorem 1).

    In particular, if \(E=E^{1}\) (i.e., \(f\) is a linear functional) then all \(v_{k}\) in (5) are real numbers; so they form a vector

\[\vec{v}=\left(v_{1}, \ldots, v_{n}\right) \text { in } E^{n},\]

    and (5) can be written as

    \[f(\vec{x})=\vec{x} \cdot \vec{v}.\]

    The vector \(\vec{v}\) is unique. For suppose there are two vectors, \(\vec{u}\) and \(\vec{v},\) such that

    \[\left(\forall \vec{x} \in E^{n}\right) \quad f(\vec{x})=\vec{x} \cdot \vec{v}=\vec{x} \cdot \vec{u}.\]

    Then

    \[\left(\forall \vec{x} \in E^{n}\right) \quad \vec{x} \cdot(\vec{v}-\vec{u})=0.\]

    By Problem 10 of Chapter 3, §§1-3, this yields \(\vec{v}-\vec{u}=\overrightarrow{0},\) or \(\vec{v}=\vec{u}.\) This completes the proof for \(E=E^{n}.\)

    It is analogous for \(C^{n};\) only in (ii) the \(v_{k}\) are complex and one has to replace them by their conjugates \(\overline{v}_{k}\) when forming the vector \(\vec{v}\) to obtain \(f(\vec{x})=\vec{x} \cdot \vec{v}\). Thus all is proved. \(\quad \square\)

    Note 3. Formula (5) shows that a linear map \(f : E^{n}\left(C^{n}\right) \rightarrow E\) is uniquely determined by the \(n\) function values \(v_{k}=f\left(\vec{e}_{k}\right)\).

If further \(E=E^{m}\left(C^{m}\right),\) the vectors \(v_{k}\) are \(m\)-tuples of scalars,

    \[v_{k}=\left(v_{1 k}, \ldots, v_{m k}\right).\]

    We often write such vectors vertically, as the \(n\) "columns" in an array of \(m\) "rows" and \(n\) "columns":

\[\left(\begin{array}{cccc}{v_{11}} & {v_{12}} & {\dots} & {v_{1 n}} \\ {v_{21}} & {v_{22}} & {\dots} & {v_{2 n}} \\ {\vdots} & {\vdots} & {\ddots} & {\vdots} \\ {v_{m 1}} & {v_{m 2}} & {\dots} & {v_{m n}} \end{array}\right). \tag{6}\]

    Formally, (6) is a double sequence of \(m n\) terms, called an \(m \times n\) matrix. We denote it by \([f]=\left(v_{i k}\right),\) where for \(k=1,2, \ldots, n\),

    \[f\left(\vec{e}_{k}\right)=v_{k}=\left(v_{1 k}, \ldots, v_{m k}\right).\]

    Thus linear maps \(f : E^{n} \rightarrow E^{m}\) (or \(f : C^{n} \rightarrow C^{m})\) correspond one-to-one to their matrices \([f].\)
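Note 3 can be sketched concretely: given any linear \(f,\) the columns \(f\left(\vec{e}_{k}\right)\) assemble into \([f],\) and formula (5) says applying \(f\) is matrix-vector multiplication. The map below is an arbitrary illustration, not from the text:

```python
import numpy as np

# An arbitrary linear map f : E^3 -> E^2.
def f(x):
    x1, x2, x3 = x
    return np.array([2 * x1 - x2, x2 + 4 * x3])

n = 3
basis = np.eye(n)
# the k-th column of [f] is f(e_k)
F = np.column_stack([f(basis[:, k]) for k in range(n)])

x = np.array([1.0, -2.0, 0.5])
# formula (5): f(x) = sum_k x_k f(e_k) = [f] x
assert np.allclose(F @ x, f(x))
```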

    The easy proof of Corollaries 1 to 3 below is left to the reader.

    Corollary \(\PageIndex{1}\)

    If \(f, g : E^{\prime} \rightarrow E\) are linear, so is

    \[h=a f+b g\]

    for any scalars \(a, b\).

    If further \(E^{\prime}=E^{n}\left(C^{n}\right)\) and \(E=E^{m}\left(C^{m}\right),\) with \([f]=\left(v_{i k}\right)\) and \([g]=\left(w_{i k}\right)\), then

    \[[h]=\left(a v_{i k}+b w_{i k}\right).\]

    Corollary \(\PageIndex{2}\)

    A map \(f : E^{n}\left(C^{n}\right) \rightarrow E\) is linear iff

    \[f(\vec{x})=\sum_{k=1}^{n} v_{k} x_{k},\]

    where \(v_{k}=f\left(\vec{e}_{k}\right)\).

    Hint: For the "if," use Corollary 1. For the "only if," use formula (5) above.

    Corollary \(\PageIndex{3}\)

    If \(f : E^{\prime} \rightarrow E^{\prime \prime}\) and \(g : E^{\prime \prime} \rightarrow E\) are linear, so is the composite \(h=g \circ f.\)

Our next theorem deals with the matrix of the composite linear map \(g \circ f.\)

    Theorem \(\PageIndex{3}\)

    Let \(f : E^{\prime} \rightarrow E^{\prime \prime}\) and \(g : E^{\prime \prime} \rightarrow E\) be linear, with

    \[E^{\prime}=E^{n}\left(C^{n}\right), E^{\prime \prime}=E^{m}\left(C^{m}\right), \text { and } E=E^{r}\left(C^{r}\right).\]

    If \([f]=\left(v_{i k}\right)\) and \([g]=\left(w_{j i}\right),\) then

    \[[h]=[g \circ f]=\left(z_{j k}\right),\]

    where

\[z_{j k}=\sum_{i=1}^{m} w_{j i} v_{i k}, \quad j=1,2, \ldots, r, \; k=1,2, \ldots, n. \tag{7}\]

    Proof

    Denote the basic unit vectors in \(E^{\prime}\) by

    \[e_{1}^{\prime}, \ldots, e_{n}^{\prime},\]

    those in \(E^{\prime \prime}\) by

    \[e_{1}^{\prime \prime}, \ldots, e_{m}^{\prime \prime},\]

    and those in \(E\) by

    \[e_{1}, \ldots, e_{r}.\]

    Then for \(k=1,2, \ldots, n\),

    \[f\left(e_{k}^{\prime}\right)=v_{k}=\sum_{i=1}^{m} v_{i k} e_{i}^{\prime \prime} \text { and } h\left(e_{k}^{\prime}\right)=\sum_{j=1}^{r} z_{j k} e_{j},\]

and for \(i=1, \ldots, m\),

    \[g\left(e_{i}^{\prime \prime}\right)=\sum_{j=1}^{r} w_{j i} e_{j}.\]

    Also,

    \[h\left(e_{k}^{\prime}\right)=g\left(f\left(e_{k}^{\prime}\right)\right)=g\left(\sum_{i=1}^{m} v_{i k} e_{i}^{\prime \prime}\right)=\sum_{i=1}^{m} v_{i k} g\left(e_{i}^{\prime \prime}\right)=\sum_{i=1}^{m} v_{i k}\left(\sum_{j=1}^{r} w_{j i} e_{j}\right).\]

    Thus

    \[h\left(e_{k}^{\prime}\right)=\sum_{j=1}^{r} z_{j k} e_{j}=\sum_{j=1}^{r}\left(\sum_{i=1}^{m} w_{j i} v_{i k}\right) e_{j}.\]

    But the representation in terms of the \(e_{j}\) is unique (Theorem 2 in Chapter 3, §§1-3), so, equating coefficients, we get (7). \(\quad \square\)

    Note 4. Observe that \(z_{j k}\) is obtained, so to say, by "dot-multiplying" the \(j\)th row of \([g]\) (an \(r \times m\) matrix) by the \(k\)th column of \([f]\) (an \(m \times n\) matrix).

    It is natural to set

    \[[g][f]=[g \circ f],\]

    or

    \[\left(w_{j i}\right)\left(v_{i k}\right)=\left(z_{j k}\right),\]

    with \(z_{j k}\) as in (7).

    Caution. Matrix multiplication, so defined, is not commutative.
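Both points, Theorem 3 and the caution, can be checked numerically. In the sketch below (random matrices as arbitrary illustrations), `G @ F` plays \([g][f]\); the composite agrees with applying \(f,\) then \(g,\) while two square matrices are exhibited that do not commute:

```python
import numpy as np

# [f] : E^4 -> E^3 and [g] : E^3 -> E^2, chosen at random.
rng = np.random.default_rng(0)
F = rng.standard_normal((3, 4))
G = rng.standard_normal((2, 3))

x = rng.standard_normal(4)
# Theorem 3: the matrix of g o f is the product [g][f]
assert np.allclose((G @ F) @ x, G @ (F @ x))

# Caution: even square matrices generally do not commute
A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0, 0.0], [1.0, 0.0]])
assert not np.allclose(A @ B, B @ A)
```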

    Definition 2

    The set of all continuous linear maps \(f : E^{\prime} \rightarrow E\) (for fixed \(E^{\prime} \) and \(E\)) is denoted \(L(E^{\prime}, E).\)

    If \(E=E^{\prime},\) we write \(L(E)\) instead.

    For each \(f\) in \(L\left(E^{\prime}, E\right),\) we define its norm by

    \[\|f\|=\sup _{|\vec{x}| \leq 1}|f(\vec{x})|.\]

    Note that \(\|f\|<+\infty,\) by Theorem 1.

    Theorem \(\PageIndex{4}\)

    \(L(E^{\prime}, E)\) is a normed linear space under the norm defined above and under the usual operations on functions, as in Corollary 1.

    Proof

    Corollary 1 easily implies that \(L(E^{\prime}, E)\) is a vector space. We now show that \(\|\cdot\|\) is a genuine norm.

    The triangle law,

    \[\|f+g\| \leq\|f\|+\|g\|,\]

    follows exactly as in Example (C) of Chapter 3, §10. (Verify!)

    Also, by Problem 5 in Chapter 2, §§8-9, \(\sup |a f(\vec{x})|=|a| \sup |f(\vec{x})|.\) Hence \(\|a f\|=|a|\|f\|\) for any scalar \(a.\)

    As noted above, \(0 \leq\|f\|<+\infty\).

    It remains to show that \(\|f\|=0\) iff \(f\) is the zero map. If

    \[\|f\|=\sup _{|\vec{x}| \leq 1}|f(\vec{x})|=0,\]

    then \(|f(\vec{x})|=0\) when \(|\vec{x}| \leq 1.\) Hence, if \(\vec{x} \neq \overrightarrow{0}\),

\[f\left(\frac{\vec{x}}{|\vec{x}|}\right)=\frac{1}{|\vec{x}|} f(\vec{x})=0.\]

    As \(f(\overrightarrow{0})=0,\) we have \(f(\vec{x})=0\) for all \(\vec{x} \in E^{\prime}\).

    Thus \(\|f\|=0\) implies \(f=0,\) and the converse is clear. Thus all is proved. \(\quad \square\)

    Note 5. A similar proof, via \(f\left(\frac{\vec{x}}{|\vec{x}|}\right)\) and properties of lub, shows that

    \[\|f\|=\sup _{\vec{x} \neq 0}\left|\frac{f(\vec{x})}{|\vec{x}|}\right|\]

    and

    \[(\forall \vec{x} \in E^{\prime}) \quad|f(\vec{x})| \leq\|f\||\vec{x}|.\]

    It also follows that \(\|f\|\) is the least real \(c\) such that

    \[(\forall \vec{x} \in E^{\prime}) \quad|f(\vec{x})| \leq c|\vec{x}|.\]

    Verify. (See Problem 3'.)
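For matrix maps with the Euclidean norm, \(\|f\|\) is what NumPy computes as the spectral norm, so Note 5 can be sampled numerically. The matrix below is an arbitrary example; random unit vectors approximate the sup over \(|\vec{x}| \leq 1\):

```python
import numpy as np

# f(x) = A x on E^2; for the Euclidean norm, ||f|| is the spectral norm.
A = np.array([[3.0, 1.0], [0.0, 2.0]])
op_norm = np.linalg.norm(A, 2)

# sample the sup over the unit sphere with random unit vectors
rng = np.random.default_rng(1)
xs = rng.standard_normal((1000, 2))
xs /= np.linalg.norm(xs, axis=1, keepdims=True)
vals = np.linalg.norm(xs @ A.T, axis=1)

assert vals.max() <= op_norm + 1e-9                # |f(x)| <= ||f|| |x|
assert np.isclose(vals.max(), op_norm, atol=0.05)  # sup is nearly attained
```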

    As in any normed space, we define distances in \(L(E^{\prime}, E)\) by

    \[\rho(f, g)=\|f-g\|,\]

    making it a metric space; so we may speak of convergence, limits, etc. in it.

    Corollary \(\PageIndex{4}\)

    If \(f \in L(E^{\prime}, E^{\prime \prime})\) and \(g \in L(E^{\prime \prime}, E),\) then

    \[\|g \circ f\| \leq\|g\|\|f\|.\]

    Proof

    By Note 5,

    \[\left(\forall \vec{x} \in E^{\prime}\right) \quad|g(f(\vec{x}))| \leq\|g\||f(\vec{x})| \leq\|g\|\|f\||\vec{x}|.\]

    Hence

    \[(\forall \vec{x} \neq \overrightarrow{0}) \quad\left|\frac{(g \circ f)(\vec{x})}{|\vec{x}|}\right| \leq\|g\|\|f\|,\]

    and so

\[\|g\|\|f\| \geq \sup _{\vec{x} \neq \overrightarrow{0}} \frac{|(g \circ f)(\vec{x})|}{|\vec{x}|}=\|g \circ f\|. \quad \square\]
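Corollary 4 too admits a quick numeric sketch for matrix maps under the Euclidean norm, with `np.linalg.norm(., 2)` playing \(\|\cdot\|\); the random matrices are arbitrary illustrations:

```python
import numpy as np

# [f] : E^4 -> E^3 and [g] : E^3 -> E^2, chosen at random.
rng = np.random.default_rng(2)
F = rng.standard_normal((3, 4))
G = rng.standard_normal((2, 3))

norm_gf = np.linalg.norm(G @ F, 2)
# Corollary 4: ||g o f|| <= ||g|| ||f||
assert norm_gf <= np.linalg.norm(G, 2) * np.linalg.norm(F, 2) + 1e-12
```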