$$\newcommand{\id}{\mathrm{id}}$$ $$\newcommand{\Span}{\mathrm{span}}$$ $$\newcommand{\kernel}{\mathrm{null}\,}$$ $$\newcommand{\range}{\mathrm{range}\,}$$ $$\newcommand{\RealPart}{\mathrm{Re}}$$ $$\newcommand{\ImaginaryPart}{\mathrm{Im}}$$ $$\newcommand{\Argument}{\mathrm{Arg}}$$ $$\newcommand{\norm}[1]{\| #1 \|}$$ $$\newcommand{\inner}[2]{\langle #1, #2 \rangle}$$ $$\newcommand{\Span}{\mathrm{span}}$$

# 6.2: Linear Maps and Functionals. Matrices


For an adequate definition of differentiability, we need the notion of a linear map. Below, $$E^{\prime}, E^{\prime \prime},$$ and $$E$$ denote normed spaces over the same scalar field, $$E^{1}$$ or $$C.$$

Definition 1

A function $$f : E^{\prime} \rightarrow E$$ is a linear map if and only if for all $$\vec{x}, \vec{y} \in E^{\prime}$$ and scalars $$a, b$$

$f(a \vec{x}+b \vec{y})=a f(\vec{x})+b f(\vec{y});$

equivalently, iff for all such $$\vec{x}, \vec{y},$$ and $$a$$

$f(\vec{x}+\vec{y})=f(\vec{x})+f(\vec{y}) \text { and } f(a \vec{x})=a f(\vec{x}). \text {(Verify!)}$

If $$E=E^{\prime},$$ such a map is also called a linear operator.

If the range space $$E$$ is the scalar field of $$E^{\prime}$$ (i.e., $$E^{1}$$ or $$C$$), the linear $$f$$ is also called a (real or complex) linear functional on $$E^{\prime}.$$

Note 1. Induction extends formula (1) to any "linear combinations":

$f\left(\sum_{i=1}^{m} a_{i} \vec{x}_{i}\right)=\sum_{i=1}^{m} a_{i} f\left(\vec{x}_{i}\right)$

for all $$\vec{x}_{i} \in E^{\prime}$$ and scalars $$a_{i}$$.

Briefly: A linear map $$f$$ preserves linear combinations.

Note 2. Taking $$a=b=0$$ in (1), we obtain $$f(\overrightarrow{0})=0$$ if $$f$$ is linear.
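Formula (1) and Note 2 can be sketched numerically. Below is a minimal check, with a hypothetical linear map $$f$$ on $$E^{2}$$ chosen for illustration, that $$f(a\vec{x}+b\vec{y})=af(\vec{x})+bf(\vec{y})$$ and $$f(\overrightarrow{0})=0$$:

```python
# Numerical sketch (illustrative map, not from the text): verify that
# f : R^2 -> R^2 satisfies f(a*x + b*y) = a*f(x) + b*f(y) and f(0) = 0.

def f(x):
    # f(x1, x2) = (2*x1 - x2, 3*x2): a linear map on R^2
    return (2 * x[0] - x[1], 3 * x[1])

def comb(a, x, b, y):
    # the linear combination a*x + b*y, componentwise
    return tuple(a * xi + b * yi for xi, yi in zip(x, y))

x, y = (1.0, 4.0), (-2.0, 5.0)
a, b = 3.0, -0.5

lhs = f(comb(a, x, b, y))
rhs = comb(a, f(x), b, f(y))
assert lhs == rhs                     # formula (1)
assert f((0.0, 0.0)) == (0.0, 0.0)    # Note 2: f(0) = 0
```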

Examples

(a) Let $$E^{\prime}=E^{n}\left(C^{n}\right).$$ Fix a vector $$\vec{v}=\left(v_{1}, \ldots, v_{n}\right)$$ in $$E^{\prime}$$ and set

$\left(\forall \vec{x} \in E^{\prime}\right) \quad f(\vec{x})=\vec{x} \cdot \vec{v}$

(inner product; see Chapter 3, §§1-3 and §9).

Then

$\begin{aligned} f(a \vec{x}+b \vec{y}) &=(a \vec{x}+b \vec{y}) \cdot \vec{v}=(a \vec{x}) \cdot \vec{v}+(b \vec{y}) \cdot \vec{v} \\ &=a(\vec{x} \cdot \vec{v})+b(\vec{y} \cdot \vec{v}) \\ &=a f(\vec{x})+b f(\vec{y}); \end{aligned}$

so $$f$$ is linear. Note that if $$E^{\prime}=E^{n},$$ then by definition,

$f(\vec{x})=\vec{x} \cdot \vec{v}=\sum_{k=1}^{n} x_{k} v_{k}=\sum_{k=1}^{n} v_{k} x_{k}.$

If, however, $$E^{\prime}=C^{n},$$ then

$f(\vec{x})=\vec{x} \cdot \vec{v}=\sum_{k=1}^{n} x_{k} \overline{v}_{k}=\sum_{k=1}^{n} \overline{v}_{k} x_{k},$

where $$\overline{v}_{k}$$ is the conjugate of the complex number $$v_{k}$$.

By Theorem 3 in Chapter 4, §3, $$f$$ is continuous (a polynomial!).

Moreover, $$f(\vec{x})=\vec{x} \cdot \vec{v}$$ is a scalar (in $$E^{1}$$ or $$C).$$ Thus the range of $$f$$ lies in the scalar field of $$E^{\prime};$$ so $$f$$ is a linear functional on $$E^{\prime}.$$
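The complex case of Example (a) can be sketched as follows; note that because the conjugate falls on the coordinates of $$\vec{v}$$ (not of $$\vec{x}$$), the functional stays linear in $$\vec{x}$$. The particular vectors and scalars are illustrative:

```python
# Sketch of Example (a) for E' = C^n: f(x) = x . v = sum of x_k * conj(v_k).
# Linearity in x holds because only v is conjugated. Values are illustrative.

def dot(x, v):
    # complex dot product: sum of x_k * conj(v_k)
    return sum(xk * vk.conjugate() for xk, vk in zip(x, v))

v = [1 + 2j, -3j, 2 + 0j]
x = [2 - 1j, 1 + 1j, -1 + 0j]
y = [0 + 3j, 2 - 2j, 4 + 1j]
a, b = 1 - 1j, 2 + 0j

# linearity (formula (1)):
lhs = dot([a * xi + b * yi for xi, yi in zip(x, y)], v)
rhs = a * dot(x, v) + b * dot(y, v)
assert abs(lhs - rhs) < 1e-12
```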

(b) Let $$I=[0,1].$$ Let $$E^{\prime}$$ be the set of all functions $$u : I \rightarrow E$$ that are of class $$CD^{\infty}$$ (Chapter 5, §6) on $$I$$, hence bounded there (Theorem 2 of Chapter 4, §8).

As in Example (C) in Chapter 3, §10, $$E^{\prime}$$ is a normed linear space, with norm

$\|u\|=\sup _{x \in I}|u(x)|.$

Here each function $$u \in E^{\prime}$$ is treated as a single "point" in $$E^{\prime}.$$ The
distance between two such points, $$u$$ and $$v,$$ equals $$\|u-v\|,$$ by definition.

Now define a map $$D$$ on $$E^{\prime}$$ by setting $$D(u)=u^{\prime}$$ (derivative of $$u$$ on $$I$$). As every $$u \in E^{\prime}$$ is of class $$CD^{\infty},$$ so is $$u^{\prime}.$$

Thus $$D(u)=u^{\prime} \in E^{\prime},$$ and so $$D : E^{\prime} \rightarrow E^{\prime}$$ is a linear operator. (Its linearity follows from Theorem 4 in Chapter 5, §1.)
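On a convenient subspace of Example (b), the linearity of $$D$$ can be checked concretely: represent polynomials on $$I$$ by coefficient lists (a hypothetical encoding chosen for this sketch) and verify $$D(au+bv)=aD(u)+bD(v)$$:

```python
# Sketch of Example (b) on the subspace of polynomials, stored as
# coefficient lists [c0, c1, c2, ...]. D(u) = u' is linear.

def D(coeffs):
    # derivative of c0 + c1 x + c2 x^2 + ... is c1 + 2 c2 x + ...
    return [k * c for k, c in enumerate(coeffs)][1:]

def lin(a, u, b, v):
    # a*u + b*v coefficientwise (u, v padded to equal length)
    n = max(len(u), len(v))
    u = u + [0.0] * (n - len(u))
    v = v + [0.0] * (n - len(v))
    return [a * ui + b * vi for ui, vi in zip(u, v)]

u = [1.0, 0.0, 3.0]        # 1 + 3x^2
v = [0.0, 2.0, 0.0, 5.0]   # 2x + 5x^3
a, b = 2.0, -1.0

assert D(lin(a, u, b, v)) == lin(a, D(u), b, D(v))
```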

(c) Let again $$I=[0,1].$$ Let $$E^{\prime}$$ be the set of all functions $$u : I \rightarrow E$$ that are bounded and have antiderivatives (Chapter 5, §5) on $$I.$$ With norm $$\|u\|$$ as in Example (b), $$E^{\prime}$$ is a normed linear space.

Now define $$\phi : E^{\prime} \rightarrow E$$ by

$\phi(u)=\int_{0}^{1} u,$

with $$\int u$$ as in Chapter 5, §5. (Recall that $$\int_{0}^{1} u$$ is an element of $$E$$ if $$u : I \rightarrow E.$$ ) By Corollary 1 in Chapter 5, §5, $$\phi$$ is a linear map of $$E^{\prime}$$ into $$E$$. (Why?)
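A numerical sketch of Example (c): approximating $$\phi(u)=\int_{0}^{1}u$$ by a Riemann sum, linearity $$\phi(au+bv)=a\phi(u)+b\phi(v)$$ holds exactly for the sum, mirroring the cited Corollary 1. The integrands below are illustrative:

```python
# Sketch of Example (c): phi(u) = integral of u over [0, 1], approximated
# by a left Riemann sum; the sum itself is exactly linear in u.

N = 1000

def phi(u):
    # left Riemann sum on [0, 1] with N subintervals
    return sum(u(k / N) for k in range(N)) / N

u = lambda t: t * t
v = lambda t: 3 * t + 1
a, b = 2.0, -4.0

lhs = phi(lambda t: a * u(t) + b * v(t))
rhs = a * phi(u) + b * phi(v)
assert abs(lhs - rhs) < 1e-9
```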

(d) The zero map $$f=0$$ on $$E^{\prime}$$ is always linear. (Why?)

Theorem $$\PageIndex{1}$$

A linear map $$f : E^{\prime} \rightarrow E$$ is continuous (even uniformly so) on all of $$E^{\prime}$$ iff it is continuous at $$\overrightarrow{0};$$ equivalently, iff there is a real $$c>0$$ such that

$\left(\forall \vec{x} \in E^{\prime}\right) \quad|f(\vec{x})| \leq c|\vec{x}|.$

(We call this property linear boundedness.)

Proof

Assume that $$f$$ is continuous at $$\overrightarrow{0}.$$ Then, given $$\varepsilon>0,$$ there is $$\delta>0$$ such that

$|f(\vec{x})-f(\overrightarrow{0})|=|f(\vec{x})| \leq \varepsilon$

whenever $$|\vec{x}-\overrightarrow{0}|=|\vec{x}|<\delta$$.

Now, for any $$\vec{x} \neq \overrightarrow{0},$$ we surely have

$\left|\frac{\delta \vec{x}}{2|\vec{x}|}\right|=\frac{\delta}{2}<\delta.$

Hence

$(\forall \vec{x} \neq \overrightarrow{0}) \quad\left|f\left(\frac{\delta \vec{x}}{2|\vec{x}|}\right)\right| \leq \varepsilon,$

or, by linearity,

$\frac{\delta}{2|\vec{x}|}|f(\vec{x})| \leq \varepsilon,$

i.e.,

$|f(\vec{x})| \leq \frac{2 \varepsilon}{\delta}|\vec{x}|.$

By Note 2, this also holds if $$\vec{x}=\overrightarrow{0}$$.

Thus, taking $$c=2 \varepsilon / \delta,$$ we obtain

$\left(\forall \vec{x} \in E^{\prime}\right) \quad|f(\vec{x})| \leq c|\vec{x}| \quad \text {(linear boundedness).}$

Now assume (3). Then

$\left(\forall \vec{x}, \vec{y} \in E^{\prime}\right) \quad|f(\vec{x}-\vec{y})| \leq c|\vec{x}-\vec{y}|;$

or, by linearity,

$\left(\forall \vec{x}, \vec{y} \in E^{\prime}\right) \quad|f(\vec{x})-f(\vec{y})| \leq c|\vec{x}-\vec{y}|.$

Hence $$f$$ is uniformly continuous (given $$\varepsilon>0,$$ take $$\delta=\varepsilon / c).$$ This, in turn, implies continuity at $$\overrightarrow{0};$$ so all conditions are equivalent, as claimed. $$\quad \square$$

A linear map need not be continuous. But, for $$E^{n}$$ and $$C^{n},$$ we have the following result.
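The remark above can be made concrete with the operator $$D$$ of Example (b): for $$u_{n}(x)=x^{n}/n$$ on $$I$$ we get $$\|u_{n}\|=1/n \rightarrow 0$$ while $$\|D(u_{n})\|=\|x^{n-1}\|=1$$, so no constant $$c$$ as in Theorem 1 can exist. A minimal numerical sketch (sup-norms approximated by sampling):

```python
# The derivative operator D is linear but NOT linearly bounded:
# ||u_n|| = 1/n -> 0 while ||D(u_n)|| = 1 for u_n(x) = x^n / n on [0, 1].

def sup_norm(u, samples=1000):
    # approximate sup-norm on [0, 1] by sampling (exact for monotone u here)
    return max(abs(u(k / samples)) for k in range(samples + 1))

for n in [1, 10, 100, 1000]:
    u = lambda x, n=n: x ** n / n          # u_n
    Du = lambda x, n=n: x ** (n - 1)       # its derivative
    assert abs(sup_norm(u) - 1 / n) < 1e-12
    assert abs(sup_norm(Du) - 1.0) < 1e-12
```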

Theorem $$\PageIndex{2}$$

(i) Any linear map on $$E^{n}$$ or $$C^{n}$$ is uniformly continuous.

(ii) Every linear functional on $$E^{n}\left(C^{n}\right)$$ has the form

$f(\vec{x})=\vec{x} \cdot \vec{v} \quad \text {(dot product)}$

for some unique vector $$\vec{v} \in E^{n}\left(C^{n}\right),$$ dependent on $$f$$ only.

Proof

Suppose $$f : E^{n} \rightarrow E$$ is linear; so $$f$$ preserves linear combinations.

But every $$\vec{x} \in E^{n}$$ is such a combination,

$\vec{x}=\sum_{k=1}^{n} x_{k} \vec{e}_{k} \quad \text {(Theorem 2 in Chapter 3, §§1-3).}$

Thus, by Note 1,

$f(\vec{x})=f\left(\sum_{k=1}^{n} x_{k} \vec{e}_{k}\right)=\sum_{k=1}^{n} x_{k} f\left(\vec{e}_{k}\right).$

Here the function values $$f\left(\vec{e}_{k}\right)$$ are fixed vectors in the range space $$E,$$ say,

$f\left(\vec{e}_{k}\right)=v_{k} \in E,$

so that

$f(\vec{x})=\sum_{k=1}^{n} x_{k} f\left(\vec{e}_{k}\right)=\sum_{k=1}^{n} x_{k} v_{k}, \quad v_{k} \in E.$

Thus $$f$$ is a polynomial in $$n$$ real variables $$x_{k},$$ hence continuous (even uniformly so, by Theorem 1).

In particular, if $$E=E^{1}$$ (i.e., $$f$$ is a linear functional) then all $$v_{k}$$ in (5) are real numbers; so they form a vector

$\vec{v}=\left(v_{1}, \ldots, v_{n}\right) \text { in } E^{n},$

and (5) can be written as

$f(\vec{x})=\vec{x} \cdot \vec{v}.$

The vector $$\vec{v}$$ is unique. For suppose there are two vectors, $$\vec{u}$$ and $$\vec{v},$$ such that

$\left(\forall \vec{x} \in E^{n}\right) \quad f(\vec{x})=\vec{x} \cdot \vec{v}=\vec{x} \cdot \vec{u}.$

Then

$\left(\forall \vec{x} \in E^{n}\right) \quad \vec{x} \cdot(\vec{v}-\vec{u})=0.$

By Problem 10 of Chapter 3, §§1-3, this yields $$\vec{v}-\vec{u}=\overrightarrow{0},$$ or $$\vec{v}=\vec{u}.$$ This completes the proof for $$E=E^{n}.$$

It is analogous for $$C^{n};$$ only in (ii) the $$v_{k}$$ are complex and one has to replace them by their conjugates $$\overline{v}_{k}$$ when forming the vector $$\vec{v}$$ to obtain $$f(\vec{x})=\vec{x} \cdot \vec{v}$$. Thus all is proved. $$\quad \square$$

Note 3. Formula (5) shows that a linear map $$f : E^{n}\left(C^{n}\right) \rightarrow E$$ is uniquely determined by the $$n$$ function values $$v_{k}=f\left(\vec{e}_{k}\right)$$.

If further $$E=E^{m}\left(C^{m}\right),$$ the vectors $$v_{k}$$ are $$m$$ -tuples of scalars,

$v_{k}=\left(v_{1 k}, \ldots, v_{m k}\right).$

We often write such vectors vertically, as the $$n$$ "columns" in an array of $$m$$ "rows" and $$n$$ "columns":

$\left(\begin{array}{cccc}{v_{11}} & {v_{12}} & {\dots} & {v_{1 n}} \\ {v_{21}} & {v_{22}} & {\dots} & {v_{2 n}} \\ {\vdots} & {\vdots} & {\ddots} & {\vdots} \\ {v_{m 1}} & {v_{m 2}} & {\dots} & {v_{m n}} \end{array}\right).$

Formally, (6) is a double sequence of $$m n$$ terms, called an $$m \times n$$ matrix. We denote it by $$[f]=\left(v_{i k}\right),$$ where for $$k=1,2, \ldots, n$$,

$f\left(\vec{e}_{k}\right)=v_{k}=\left(v_{1 k}, \ldots, v_{m k}\right).$

Thus linear maps $$f : E^{n} \rightarrow E^{m}$$ (or $$f : C^{n} \rightarrow C^{m})$$ correspond one-to-one to their matrices $$[f].$$
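Note 3 and formula (5) can be sketched directly: build $$[f]$$ column by column from the values $$f(\vec{e}_{k})$$ and check that the matrix reproduces $$f$$. The particular map below is a hypothetical example:

```python
# Sketch of Note 3: a linear f : R^3 -> R^2 is determined by f(e_k);
# those values, as columns, form the m x n matrix [f].

def f(x):
    # f(x1, x2, x3) = (x1 + 2*x3, 4*x2 - x3): linear from R^3 to R^2
    return (x[0] + 2 * x[2], 4 * x[1] - x[2])

n, m = 3, 2
e = [[1.0 if i == k else 0.0 for i in range(n)] for k in range(n)]

# columns of [f] are v_k = f(e_k); store [f] as m rows of n entries
cols = [f(ek) for ek in e]
matrix = [[cols[k][i] for k in range(n)] for i in range(m)]

def apply(M, x):
    # formula (5): f(x) = sum_k x_k * v_k, i.e. the matrix-vector product
    return tuple(sum(M[i][k] * x[k] for k in range(n)) for i in range(m))

x = (1.0, -2.0, 5.0)
assert apply(matrix, x) == f(x)
```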

The easy proof of Corollaries 1 to 3 below is left to the reader.

Corollary $$\PageIndex{1}$$

If $$f, g : E^{\prime} \rightarrow E$$ are linear, so is

$h=a f+b g$

for any scalars $$a, b$$.

If further $$E^{\prime}=E^{n}\left(C^{n}\right)$$ and $$E=E^{m}\left(C^{m}\right),$$ with $$[f]=\left(v_{i k}\right)$$ and $$[g]=\left(w_{i k}\right)$$, then

$[h]=\left(a v_{i k}+b w_{i k}\right).$

Corollary $$\PageIndex{2}$$

A map $$f : E^{n}\left(C^{n}\right) \rightarrow E$$ is linear iff

$f(\vec{x})=\sum_{k=1}^{n} v_{k} x_{k},$

where $$v_{k}=f\left(\vec{e}_{k}\right)$$.

Hint: For the "if," use Corollary 1. For the "only if," use formula (5) above.

Corollary $$\PageIndex{3}$$

If $$f : E^{\prime} \rightarrow E^{\prime \prime}$$ and $$g : E^{\prime \prime} \rightarrow E$$ are linear, so is the composite $$h=g \circ f.$$

Our next theorem deals with the matrix of the composite linear map $$g \circ f.$$

Theorem $$\PageIndex{3}$$

Let $$f : E^{\prime} \rightarrow E^{\prime \prime}$$ and $$g : E^{\prime \prime} \rightarrow E$$ be linear, with

$E^{\prime}=E^{n}\left(C^{n}\right), E^{\prime \prime}=E^{m}\left(C^{m}\right), \text { and } E=E^{r}\left(C^{r}\right).$

If $$[f]=\left(v_{i k}\right)$$ and $$[g]=\left(w_{j i}\right),$$ then

$[h]=[g \circ f]=\left(z_{j k}\right),$

where

$z_{j k}=\sum_{i=1}^{m} w_{j i} v_{i k}, \quad j=1,2, \ldots, r, k=1,2, \ldots, n.$

Proof

Denote the basic unit vectors in $$E^{\prime}$$ by

$e_{1}^{\prime}, \ldots, e_{n}^{\prime},$

those in $$E^{\prime \prime}$$ by

$e_{1}^{\prime \prime}, \ldots, e_{m}^{\prime \prime},$

and those in $$E$$ by

$e_{1}, \ldots, e_{r}.$

Then for $$k=1,2, \ldots, n$$,

$f\left(e_{k}^{\prime}\right)=v_{k}=\sum_{i=1}^{m} v_{i k} e_{i}^{\prime \prime} \text { and } h\left(e_{k}^{\prime}\right)=\sum_{j=1}^{r} z_{j k} e_{j},$

and for $$i=1, \ldots, m$$,

$g\left(e_{i}^{\prime \prime}\right)=\sum_{j=1}^{r} w_{j i} e_{j}.$

Also,

$h\left(e_{k}^{\prime}\right)=g\left(f\left(e_{k}^{\prime}\right)\right)=g\left(\sum_{i=1}^{m} v_{i k} e_{i}^{\prime \prime}\right)=\sum_{i=1}^{m} v_{i k} g\left(e_{i}^{\prime \prime}\right)=\sum_{i=1}^{m} v_{i k}\left(\sum_{j=1}^{r} w_{j i} e_{j}\right).$

Thus

$h\left(e_{k}^{\prime}\right)=\sum_{j=1}^{r} z_{j k} e_{j}=\sum_{j=1}^{r}\left(\sum_{i=1}^{m} w_{j i} v_{i k}\right) e_{j}.$

But the representation in terms of the $$e_{j}$$ is unique (Theorem 2 in Chapter 3, §§1-3), so, equating coefficients, we get (7). $$\quad \square$$

Note 4. Observe that $$z_{j k}$$ is obtained, so to say, by "dot-multiplying" the $$j$$th row of $$[g]$$ (an $$r \times m$$ matrix) by the $$k$$th column of $$[f]$$ (an $$m \times n$$ matrix).

It is natural to set

$[g][f]=[g \circ f],$

or

$\left(w_{j i}\right)\left(v_{i k}\right)=\left(z_{j k}\right),$

with $$z_{j k}$$ as in (7).

Caution. Matrix multiplication, so defined, is not commutative.
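Formula (7) and the caution above can be sketched numerically; the $$2 \times 2$$ matrices below are illustrative:

```python
# Sketch of Theorem 3 and Note 4: [g][f] = [g o f], where z_jk is the
# "dot product" of the j-th row of [g] with the k-th column of [f];
# also a check that matrix multiplication is not commutative.

def matmul(W, V):
    # (w_ji)(v_ik) = (z_jk), z_jk = sum_i w_ji * v_ik  (formula (7))
    r, m, n = len(W), len(V), len(V[0])
    return [[sum(W[j][i] * V[i][k] for i in range(m)) for k in range(n)]
            for j in range(r)]

def apply(M, x):
    return [sum(M[i][k] * x[k] for k in range(len(x))) for i in range(len(M))]

F = [[1, 2], [0, 1]]   # [f]
G = [[0, 1], [1, 0]]   # [g]

x = [3, -4]
# [g][f] applied to x agrees with g(f(x)):
assert apply(matmul(G, F), x) == apply(G, apply(F, x))
# matrix multiplication is not commutative:
assert matmul(G, F) != matmul(F, G)
```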

Definition 2

The set of all continuous linear maps $$f : E^{\prime} \rightarrow E$$ (for fixed $$E^{\prime}$$ and $$E$$) is denoted $$L(E^{\prime}, E).$$

If $$E=E^{\prime},$$ we write $$L(E)$$ instead.

For each $$f$$ in $$L\left(E^{\prime}, E\right),$$ we define its norm by

$\|f\|=\sup _{|\vec{x}| \leq 1}|f(\vec{x})|.$

Note that $$\|f\|<+\infty,$$ by Theorem 1.
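For a matrix map on $$E^{2}$$, the norm $$\|f\|$$ of Definition 2 can be estimated by sampling the unit circle, and the resulting bound $$|f(\vec{x})| \leq \|f\||\vec{x}|$$ checked on random vectors. The matrix below is illustrative:

```python
import math, random

# Sketch of Definition 2 for a matrix map on R^2: estimate
# ||f|| = sup_{|x| <= 1} |f(x)| by sampling the unit circle, then check
# |f(x)| <= ||f|| |x| on random vectors. The matrix M is illustrative.

random.seed(0)
M = [[3.0, 1.0], [0.0, 2.0]]

def f(x):
    return (M[0][0]*x[0] + M[0][1]*x[1], M[1][0]*x[0] + M[1][1]*x[1])

def norm(x):
    return math.sqrt(sum(c * c for c in x))

# the sup over |x| <= 1 is attained on the unit circle (f is linear)
est = max(norm(f((math.cos(2 * math.pi * k / 10000),
                  math.sin(2 * math.pi * k / 10000))))
          for k in range(10000))

# |f(x)| <= ||f|| |x|  (small slack covers the sampling error in est)
for _ in range(100):
    x = (random.uniform(-5, 5), random.uniform(-5, 5))
    assert norm(f(x)) <= (est + 1e-6) * norm(x) + 1e-9
```

Here `est` is a close lower estimate of $$\|f\|$$ (for this matrix, the exact value is the largest singular value).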

Theorem $$\PageIndex{4}$$

$$L(E^{\prime}, E)$$ is a normed linear space under the norm defined above and under the usual operations on functions, as in Corollary 1.

Proof

Corollary 1 easily implies that $$L(E^{\prime}, E)$$ is a vector space. We now show that $$\|\cdot\|$$ is a genuine norm.

The triangle law,

$\|f+g\| \leq\|f\|+\|g\|,$

follows exactly as in Example (C) of Chapter 3, §10. (Verify!)

Also, by Problem 5 in Chapter 2, §§8-9, $$\sup |a f(\vec{x})|=|a| \sup |f(\vec{x})|.$$ Hence $$\|a f\|=|a|\|f\|$$ for any scalar $$a.$$

As noted above, $$0 \leq\|f\|<+\infty$$.

It remains to show that $$\|f\|=0$$ iff $$f$$ is the zero map. If

$\|f\|=\sup _{|\vec{x}| \leq 1}|f(\vec{x})|=0,$

then $$|f(\vec{x})|=0$$ when $$|\vec{x}| \leq 1.$$ Hence, if $$\vec{x} \neq \overrightarrow{0}$$,

$f(\frac{\vec{x}}{|\vec{x}|})=\frac{1}{|\vec{x}|} f(\vec{x})=0.$

As $$f(\overrightarrow{0})=0,$$ we have $$f(\vec{x})=0$$ for all $$\vec{x} \in E^{\prime}$$.

Thus $$\|f\|=0$$ implies $$f=0,$$ and the converse is clear. Thus all is proved. $$\quad \square$$

Note 5. A similar proof, via $$f\left(\frac{\vec{x}}{|\vec{x}|}\right)$$ and properties of lub, shows that

$\|f\|=\sup _{\vec{x} \neq 0}\left|\frac{f(\vec{x})}{|\vec{x}|}\right|$

and

$(\forall \vec{x} \in E^{\prime}) \quad|f(\vec{x})| \leq\|f\||\vec{x}|.$

It also follows that $$\|f\|$$ is the least real $$c$$ such that

$(\forall \vec{x} \in E^{\prime}) \quad|f(\vec{x})| \leq c|\vec{x}|.$

Verify. (See Problem 3'.)

As in any normed space, we define distances in $$L(E^{\prime}, E)$$ by

$\rho(f, g)=\|f-g\|,$

making it a metric space; so we may speak of convergence, limits, etc. in it.

Corollary $$\PageIndex{4}$$

If $$f \in L(E^{\prime}, E^{\prime \prime})$$ and $$g \in L(E^{\prime \prime}, E),$$ then

$\|g \circ f\| \leq\|g\|\|f\|.$

Proof

By Note 5,

$\left(\forall \vec{x} \in E^{\prime}\right) \quad|g(f(\vec{x}))| \leq\|g\||f(\vec{x})| \leq\|g\|\|f\||\vec{x}|.$

Hence

$(\forall \vec{x} \neq \overrightarrow{0}) \quad\left|\frac{(g \circ f)(\vec{x})}{|\vec{x}|}\right| \leq\|g\|\|f\|,$

and so

$\|g\|\|f\| \geq \sup _{\vec{x} \neq \overrightarrow{0}} \frac{|(g \circ f)(\vec{x})|}{|\vec{x}|}=\|g \circ f\|. \quad \square$
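Corollary 4 can be sketched in coordinates: for matrix maps on $$E^{2}$$, estimate the norms of $$f$$, $$g$$, and $$g \circ f$$ by sampling the unit circle and check $$\|g \circ f\| \leq \|g\|\|f\|$$. The matrices are illustrative, and the small slack covers the sampling error:

```python
import math

# Sketch of Corollary 4 for matrix maps on R^2: sampled operator norms
# satisfy ||g o f|| <= ||g|| ||f||. F and G are illustrative matrices.

F = [[2.0, 0.0], [0.0, 1.0]]
G = [[1.0, 1.0], [0.0, 1.0]]

def matmul(W, V):
    return [[sum(W[j][i] * V[i][k] for i in range(2)) for k in range(2)]
            for j in range(2)]

def op_norm(M, samples=10000):
    # sampled sup of |M x| over unit vectors x (a close lower estimate)
    best = 0.0
    for k in range(samples):
        t = 2 * math.pi * k / samples
        x = (math.cos(t), math.sin(t))
        y = (M[0][0]*x[0] + M[0][1]*x[1], M[1][0]*x[0] + M[1][1]*x[1])
        best = max(best, math.hypot(y[0], y[1]))
    return best

assert op_norm(matmul(G, F)) <= op_norm(G) * op_norm(F) + 1e-6
```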