6.2: Linear Maps and Functionals. Matrices
For an adequate definition of differentiability, we need the notion of a linear map. Below, E′,E′′, and E denote normed spaces over the same scalar field, E1 or C.
A function f:E′→E is a linear map if and only if for all →x,→y∈E′ and scalars a,b
f(a→x+b→y)=af(→x)+bf(→y);  (1)
equivalently, iff for all such →x,→y, and a
f(→x+→y)=f(→x)+f(→y) and f(a→x)=af(→x). (Verify!)
If E=E′, such a map is also called a linear operator.
If the range space E is the scalar field of E′ (i.e., E1 or C), the linear f is also called a (real or complex) linear functional on E′.
Note 1. Induction extends formula (1) to any "linear combinations":
f(m∑i=1ai→xi)=m∑i=1aif(→xi)
for all →xi∈E′ and scalars ai.
Briefly: A linear map f preserves linear combinations.
Note 2. Taking a=b=0 in (1), we obtain f(→0)=0 if f is linear.
Examples.
(a) Let E′=En(Cn). Fix a vector →v=(v1,…,vn) in E′ and set
(∀→x∈E′)f(→x)=→x⋅→v
(inner product; see Chapter 3, §§1-3 and §9).
Then
f(a→x+b→y)=(a→x)⋅→v+(b→y)⋅→v=a(→x⋅→v)+b(→y⋅→v)=af(→x)+bf(→y);
so f is linear. Note that if E′=En, then by definition,
f(→x)=→x⋅→v=n∑k=1xkvk=n∑k=1vkxk.
If, however, E′=Cn, then
f(→x)=→x⋅→v=n∑k=1xk¯vk=n∑k=1¯vkxk,
where ¯vk is the conjugate of the complex number vk.
By Theorem 3 in Chapter 4, §3, f is continuous (a polynomial!).
Moreover, f(→x)=→x⋅→v is a scalar (in E1 or C). Thus the range of f lies in the scalar field of E′; so f is a linear functional on E′.
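Example (a) is easy to check numerically. The sketch below (not part of the text; the vectors v, x, y and scalars a, b are arbitrary illustrative choices) verifies formula (1) for f(→x)=→x⋅→v in E3:

```python
def dot(x, y):
    """Inner product on E^n (real case)."""
    return sum(xk * yk for xk, yk in zip(x, y))

v = (2.0, -1.0, 3.0)      # the fixed vector defining f

def f(x):
    return dot(x, v)

x = (1.0, 0.0, 4.0)
y = (-2.0, 5.0, 1.0)
a, b = 3.0, -0.5
ax_by = tuple(a * xk + b * yk for xk, yk in zip(x, y))   # a x + b y
lhs = f(ax_by)
rhs = a * f(x) + b * f(y)
print(lhs == rhs)   # True: f(a x + b y) = a f(x) + b f(y)
```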
(b) Let I=[0,1]. Let E′ be the set of all functions u:I→E that are of class CD∞ (Chapter 5, §6) on I, hence bounded there (Theorem 2 of Chapter 4, §8).
As in Example (C) in Chapter 3, §10, E′ is a normed linear space, with norm
‖u‖=supx∈I|u(x)|.
Here each function u∈E′ is treated as a single "point" in E′. The
distance between two such points, u and v, equals ‖u−v‖, by definition.
Now define a map D on E′ by setting D(u)=u′ (derivative of u on I). As every u∈E′ is of class CD∞, so is u′.
Thus D(u)=u′∈E′, and so D:E′→E′ is a linear operator. (Its linearity follows from Theorem 4 in Chapter 5, §1.)
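As an illustration only (the text's E′ contains all CD∞ functions, not just polynomials): restricting to polynomials, which are of class CD∞, the operator D acts on coefficient lists, and its linearity can be checked directly.

```python
def D(c):
    """u' for the polynomial u with coefficient list c (c[k] multiplies x^k)."""
    return [k * c[k] for k in range(1, len(c))]

def add(c, d):
    n = max(len(c), len(d))
    return [(c[k] if k < len(c) else 0) + (d[k] if k < len(d) else 0)
            for k in range(n)]

def scale(a, c):
    return [a * ck for ck in c]

u = [1, 0, 3]     # u(x) = 1 + 3x^2
w = [0, 2]        # w(x) = 2x
a, b = 2, -1
lhs = D(add(scale(a, u), scale(b, w)))     # D(a u + b w)
rhs = add(scale(a, D(u)), scale(b, D(w)))  # a D(u) + b D(w)
print(lhs == rhs)   # True: D is linear on these polynomials
```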
(c) Let again I=[0,1]. Let E′ be the set of all functions u:I→E that are bounded and have antiderivatives (Chapter 5, §5) on I. With norm ‖u‖ as in Example (b), E′ is a normed linear space.
Now define ϕ:E′→E by
ϕ(u) = ∫₀¹ u,
with ∫u as in Chapter 5, §5. (Recall that ∫₀¹ u is an element of E if u:I→E.) By Corollary 1 in Chapter 5, §5, ϕ is a linear map of E′ into E. (Why?)
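Again restricting to polynomials for illustration (an assumption; the text's E′ is larger), ϕ can be computed from coefficients term by term, and its linearity checked:

```python
def phi(c):
    """Integral over I = [0,1] of the polynomial with coefficient list c."""
    return sum(ck / (k + 1) for k, ck in enumerate(c))

u = [1.0, 0.0, 3.0]   # 1 + 3x^2; phi(u) = 1 + 1 = 2
w = [0.0, 2.0]        # 2x;       phi(w) = 1
a, b = 4.0, -2.0
combo = [a * u[k] + (b * w[k] if k < len(w) else 0.0)
         for k in range(len(u))]              # a u + b w
print(phi(combo), a * phi(u) + b * phi(w))    # 6.0 6.0
```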
(d) The zero map f=0 on E′ is always linear. (Why?)
Theorem 1. A linear map f:E′→E is continuous (even uniformly so) on all of E′ iff it is continuous at →0; equivalently, iff there is a real c>0 such that
(∀→x∈E′) |f(→x)| ≤ c|→x|.  (3)
(We call this property linear boundedness.)
Proof.
Assume that f is continuous at →0. Then, given ε>0, there is δ>0 such that
|f(→x)−f(→0)|=|f(→x)|≤ε
whenever |→x−→0|=|→x|<δ.
Now, for any →x≠→0, we surely have
|δ→x/(2|→x|)| = δ/2 < δ.
Hence
(∀→x≠→0) |f(δ→x/(2|→x|))| ≤ ε,
or, by linearity,
(δ/(2|→x|)) |f(→x)| ≤ ε,
i.e.,
|f(→x)| ≤ (2ε/δ) |→x|.
By Note 2, this also holds if →x=→0.
Thus, taking c=2ε/δ, we obtain
(∀→x∈E′) |f(→x)| ≤ c|→x|  (linear boundedness).
Now assume (3). Then
(∀→x,→y∈E′)|f(→x−→y)|≤c|→x−→y|;
or, by linearity,
(∀→x,→y∈E′)|f(→x)−f(→y)|≤c|→x−→y|.
Hence f is uniformly continuous (given ε>0, take δ=ε/c). This, in turn, implies continuity at →0; so all conditions are equivalent, as claimed. ◻
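For the dot-product functional of Example (a), Theorem 1's constant may be taken as c=|→v| (by the Cauchy-Schwarz inequality of Chapter 3, §§1-3). The sketch below (illustrative choices of v and random test vectors, not from the text) checks the uniform-continuity estimate |f(→x)−f(→y)| ≤ c|→x−→y|:

```python
import math, random

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def norm(x):
    return math.sqrt(dot(x, x))

v = (2.0, -1.0, 3.0)
c = norm(v)                       # Cauchy-Schwarz: |x . v| <= |v| |x|
f = lambda x: dot(x, v)

random.seed(0)
ok = True
for _ in range(1000):
    x = [random.uniform(-10, 10) for _ in range(3)]
    y = [random.uniform(-10, 10) for _ in range(3)]
    diff = [xk - yk for xk, yk in zip(x, y)]
    # uniform continuity from linear boundedness: |f(x) - f(y)| <= c |x - y|
    ok = ok and abs(f(x) - f(y)) <= c * norm(diff) + 1e-9
print(ok)   # True
```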
A linear map need not be continuous. But, for En and Cn, we have the following result.
Theorem 2. (i) Any linear map on En or Cn is uniformly continuous.
(ii) Every linear functional on En(Cn) has the form
f(→x)=→x⋅→v(dot product)
for some unique vector →v∈En(Cn), dependent on f only.
Proof.
Suppose f:En→E is linear; so f preserves linear combinations.
But every →x∈En is such a combination,
→x=n∑k=1xk→ek(Theorem 2 in Chapter 3, §§1-3).
Thus, by Note 1,
f(→x)=f(n∑k=1xk→ek)=n∑k=1xkf(→ek).
Here the function values f(→ek) are fixed vectors in the range space E, say,
f(→ek)=vk∈E,
so that
f(→x)=n∑k=1xkf(→ek)=n∑k=1xkvk, vk∈E.  (5)
Thus f is a polynomial in n real variables xk, hence continuous (even uniformly so, by Theorem 1).
In particular, if E=E1 (i.e., f is a linear functional) then all vk in (5) are real numbers; so they form a vector
→v=(v1,…,vn) in En,
and (5) can be written as
f(→x)=→x⋅→v.
The vector →v is unique. For suppose there are two vectors, →u and →v, such that
(∀→x∈En)f(→x)=→x⋅→v=→x⋅→u.
Then
(∀→x∈En)→x⋅(→v−→u)=0.
By Problem 10 of Chapter 3, §§1-3, this yields →v−→u=→0, or →v=→u. This completes the proof for E=En.
It is analogous for Cn; only in (ii) the vk are complex and one has to replace them by their conjugates ¯vk when forming the vector →v to obtain f(→x)=→x⋅→v. Thus all is proved. ◻
Note 3. Formula (5) shows that a linear map f:En(Cn)→E is uniquely determined by the n function values vk=f(→ek).
If further E=Em(Cm), the vectors vk are m-tuples of scalars,
vk=(v1k,…,vmk).
We often write such vectors vertically, as the n "columns" in an array of m "rows" and n "columns":
(v11 v12 … v1n
 v21 v22 … v2n
  ⋮   ⋮  ⋱  ⋮
 vm1 vm2 … vmn).  (6)
Formally, (6) is a double sequence of mn terms, called an m×n matrix. We denote it by [f]=(vik), where for k=1,2,…,n,
f(→ek)=vk=(v1k,…,vmk).
Thus linear maps f:En→Em (or f:Cn→Cm) correspond one-to-one to their matrices [f].
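To make Note 3 concrete: the hypothetical linear map f:E3→E2 below (chosen for illustration only) has its matrix recovered column by column as vk=f(→ek), and f itself is recovered from the matrix via formula (5).

```python
def f(x):
    """A sample linear map f : E^3 -> E^2 (linear by construction)."""
    x1, x2, x3 = x
    return (2 * x1 - x3, x1 + 4 * x2)

n, m = 3, 2
e = [tuple(1 if j == k else 0 for j in range(n)) for k in range(n)]
cols = [f(ek) for ek in e]            # the columns v_k = f(e_k)
matrix = [[cols[k][i] for k in range(n)] for i in range(m)]  # rows of [f]
print(matrix)                         # [[2, 0, -1], [1, 4, 0]]

# Recovering f from [f] via formula (5): f(x) = sum_k x_k v_k
x = (1, 2, 3)
fx = tuple(sum(matrix[i][k] * x[k] for k in range(n)) for i in range(m))
print(fx == f(x))                     # True
```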
The easy proof of Corollaries 1 to 3 below is left to the reader.
Corollary 1. If f,g:E′→E are linear, so is
h=af+bg
for any scalars a,b.
If further E′=En(Cn) and E=Em(Cm), with [f]=(vik) and [g]=(wik), then
[h]=(avik+bwik).
Corollary 2. A map f:En(Cn)→E is linear iff
f(→x)=n∑k=1vkxk,
where vk=f(→ek).
Hint: For the "if," use Corollary 1. For the "only if," use formula (5) above.
Corollary 3. If f:E′→E′′ and g:E′′→E are linear, so is the composite h=g∘f.
Our next theorem deals with the matrix of the composite linear map g∘f.
Theorem 3. Let f:E′→E′′ and g:E′′→E be linear, with
E′=En(Cn),E′′=Em(Cm), and E=Er(Cr).
If [f]=(vik) and [g]=(wji), then
[h]=[g∘f]=(zjk),
where
zjk=m∑i=1wjivik,  j=1,2,…,r,  k=1,2,…,n.  (7)
Proof.
Denote the basic unit vectors in E′ by
e′1,…,e′n,
those in E′′ by
e′′1,…,e′′m,
and those in E by
e1,…,er.
Then for k=1,2,…,n,
f(e′k)=vk=m∑i=1vike′′i and h(e′k)=r∑j=1zjkej,
and for i=1,…,m,
g(e′′i)=r∑j=1wjiej.
Also,
h(e′k)=g(f(e′k))=g(m∑i=1vike′′i)=m∑i=1vikg(e′′i)=m∑i=1vik(r∑j=1wjiej).
Thus
h(e′k)=r∑j=1zjkej=r∑j=1(m∑i=1wjivik)ej.
But the representation in terms of the ej is unique (Theorem 2 in Chapter 3, §§1-3), so, equating coefficients, we get (7). ◻
Note 4. Observe that zjk is obtained, so to say, by "dot-multiplying" the jth row of [g] (an r×m matrix) by the kth column of [f] (an m×n matrix).
It is natural to set
[g][f]=[g∘f],
or
(wji)(vik)=(zjk),
with zjk as in (7).
Caution. Matrix multiplication, so defined, is not commutative.
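Formula (7) translates directly into code. The sketch below (sample matrices of my own choosing) computes [g][f] by "row times column", checks that the product matrix represents g∘f, and exhibits the failure of commutativity:

```python
def matmul(W, V):
    """[g][f] per formula (7): z_jk = sum_i w_ji * v_ik."""
    r, m, n = len(W), len(V), len(V[0])
    return [[sum(W[j][i] * V[i][k] for i in range(m)) for k in range(n)]
            for j in range(r)]

def apply(M, x):
    """Apply the matrix M to a vector x."""
    return [sum(M[j][i] * x[i] for i in range(len(x))) for j in range(len(M))]

V = [[1, 2],
     [3, 4]]          # [f]
W = [[0, 1],
     [1, 0]]          # [g]: swaps the two coordinates

print(matmul(W, V))   # [[3, 4], [1, 2]] -- rows of [f] swapped
print(matmul(V, W))   # [[2, 1], [4, 3]] -- columns of [f] swapped; not equal!

x = [5, 6]
print(apply(matmul(W, V), x) == apply(W, apply(V, x)))   # True: [g][f] acts as g∘f
```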
The set of all continuous linear maps f:E′→E (for fixed E′ and E) is denoted L(E′,E).
If E=E′, we write L(E) instead.
For each f in L(E′,E), we define its norm by
‖f‖ = sup{|f(→x)| : |→x| ≤ 1}.
Note that ‖f‖<+∞, by Theorem 1.
Theorem 4. L(E′,E) is a normed linear space under the norm defined above and under the usual operations on functions, as in Corollary 1.
Proof.
Corollary 1 easily implies that L(E′,E) is a vector space. We now show that ‖⋅‖ is a genuine norm.
The triangle law,
‖f+g‖≤‖f‖+‖g‖,
follows exactly as in Example (C) of Chapter 3, §10. (Verify!)
Also, by Problem 5 in Chapter 2, §§8-9, sup|af(→x)|=|a|sup|f(→x)|. Hence ‖af‖=|a|‖f‖ for any scalar a.
As noted above, 0≤‖f‖<+∞.
It remains to show that ‖f‖=0 iff f is the zero map. If
‖f‖ = sup{|f(→x)| : |→x| ≤ 1} = 0,
then |f(→x)|=0 when |→x|≤1. Hence, if →x≠→0,
f(→x/|→x|) = (1/|→x|) f(→x) = 0.
As f(→0)=0, we have f(→x)=0 for all →x∈E′.
Thus ‖f‖=0 implies f=0, and the converse is clear. Thus all is proved. ◻
Note 5. A similar proof, via f(→x/|→x|) and properties of lub, shows that
‖f‖ = sup{|f(→x)|/|→x| : →x ≠ →0}
and
(∀→x∈E′)|f(→x)|≤‖f‖|→x|.
It also follows that ‖f‖ is the least real c such that
(∀→x∈E′)|f(→x)|≤c|→x|.
Verify. (See Problem 3'.)
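For a map given by a 2×2 real matrix (with the Euclidean norm on domain and range), ‖f‖ equals the largest singular value of the matrix; the closed form below uses the standard identities σ1²+σ2² = ‖A‖F² and σ1σ2 = |det A| (a standard linear-algebra fact, not proved in the text). The sketch (illustrative matrix, sampled unit vectors) confirms that sup{|f(→x)| : |→x| ≤ 1} is bounded by, and nearly attains, this value.

```python
import math, random

A = [[3.0, 1.0],
     [0.0, 2.0]]      # a sample [f] : E^2 -> E^2

def opnorm(A):
    """||f|| w.r.t. Euclidean norms = largest singular value of the 2x2 matrix A,
    from s1^2 + s2^2 = ||A||_F^2 and s1 * s2 = |det A|."""
    T = sum(a * a for row in A for a in row)
    D = abs(A[0][0] * A[1][1] - A[0][1] * A[1][0])
    return math.sqrt((T + math.sqrt(T * T - 4 * D * D)) / 2)

def apply(A, x):
    return (A[0][0] * x[0] + A[0][1] * x[1], A[1][0] * x[0] + A[1][1] * x[1])

def norm(x):
    return math.sqrt(x[0] * x[0] + x[1] * x[1])

c = opnorm(A)
random.seed(1)
best = 0.0
for _ in range(20000):
    t = random.uniform(0.0, 2.0 * math.pi)
    x = (math.cos(t), math.sin(t))       # |x| = 1
    fx = norm(apply(A, x))
    assert fx <= c + 1e-9                # |f(x)| <= ||f|| |x|
    best = max(best, fx)
print(best <= c + 1e-9 and best > 0.9 * c)   # True: the sup over |x| <= 1 is ||f||
```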
As in any normed space, we define distances in L(E′,E) by
ρ(f,g)=‖f−g‖,
making it a metric space; so we may speak of convergence, limits, etc. in it.
Corollary 4. If f∈L(E′,E′′) and g∈L(E′′,E), then
‖g∘f‖≤‖g‖‖f‖.
Proof.
By Note 5,
(∀→x∈E′)|g(f(→x))|≤‖g‖|f(→x)|≤‖g‖‖f‖|→x|.
Hence
(∀→x≠→0) |(g∘f)(→x)|/|→x| ≤ ‖g‖‖f‖,
and so
‖g‖‖f‖ ≥ sup{|(g∘f)(→x)|/|→x| : →x ≠ →0} = ‖g∘f‖. ◻
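A numeric spot-check of the inequality ‖g∘f‖ ≤ ‖g‖‖f‖ (sample 2×2 matrices; the exact operator norm is computed from the singular values, a standard fact not in the text):

```python
import math

def opnorm(A):
    """Exact ||.|| of a 2x2 real matrix (its largest singular value)."""
    T = sum(a * a for row in A for a in row)
    D = abs(A[0][0] * A[1][1] - A[0][1] * A[1][0])
    return math.sqrt((T + math.sqrt(T * T - 4 * D * D)) / 2)

def matmul(W, V):
    """Matrix product per formula (7)."""
    return [[sum(W[j][i] * V[i][k] for i in range(2)) for k in range(2)]
            for j in range(2)]

F = [[1.0, 2.0], [0.0, 1.0]]    # [f]
G = [[0.0, -1.0], [3.0, 1.0]]   # [g]
print(opnorm(matmul(G, F)) <= opnorm(G) * opnorm(F) + 1e-9)   # True
```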