6.6: Determinants. Jacobians. Bijective Linear Operators
We assume the reader to be familiar with elements of linear algebra. Thus we only briefly recall some definitions and well-known rules.
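Given a linear operator \(\phi: E^{n} \rightarrow E^{n}\) (or \(\phi: C^{n} \rightarrow C^{n}\)), with matrix
\[[\phi]=\left(v_{ik}\right), \quad i, k=1, \ldots, n,\]
we define the determinant of \([\phi]\) by
\[\det[\phi]=\det\left(v_{ik}\right)=\left|\begin{array}{cccc}
v_{11} & v_{12} & \ldots & v_{1n} \\
v_{21} & v_{22} & \ldots & v_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
v_{n1} & v_{n2} & \ldots & v_{nn}
\end{array}\right|=\sum(-1)^{\lambda} v_{1 k_{1}} v_{2 k_{2}} \ldots v_{n k_{n}}, \tag{1}\]
where the sum is over all ordered \(n\)-tuples \(\left(k_{1}, \ldots, k_{n}\right)\) of distinct integers \(k_{j}\) \(\left(1 \leq k_{j} \leq n\right)\), and
\[\lambda=\begin{cases}
0 & \text{if } \prod_{j<m}\left(k_{m}-k_{j}\right)>0, \\
1 & \text{if } \prod_{j<m}\left(k_{m}-k_{j}\right)<0.
\end{cases}\]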
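The defining sum can be evaluated literally for small matrices. The following is a minimal Python sketch, not an efficient algorithm (it visits all \(n!\) tuples); the function names `sign_exponent` and `det_by_definition` are illustrative and do not come from the text, and indices run from 0 rather than 1.

```python
from itertools import permutations
from math import prod


def sign_exponent(ks):
    """Return lambda: 0 if the product of (k_m - k_j) over j < m is positive, else 1."""
    p = prod(ks[m] - ks[j] for m in range(len(ks)) for j in range(m))
    return 0 if p > 0 else 1


def det_by_definition(v):
    """Sum (-1)^lambda * v[0][k_0] * ... * v[n-1][k_{n-1}] over all tuples of distinct k_j."""
    n = len(v)
    total = 0
    for ks in permutations(range(n)):  # ordered n-tuples of distinct indices
        term = prod(v[i][ks[i]] for i in range(n))
        total += (-1) ** sign_exponent(ks) * term
    return total


if __name__ == "__main__":
    print(det_by_definition([[1, 2], [3, 4]]))  # 1*4 - 2*3
```

For a \(2 \times 2\) matrix the sum has two terms: the tuple \((1,2)\) gives \(\lambda=0\), and \((2,1)\) gives \(\lambda=1\).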
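Recall (Problem 12 in §2) that a set \(B=\left\{\vec{v}_{1}, \vec{v}_{2}, \ldots, \vec{v}_{n}\right\}\) in a vector space \(E\) is a basis iff
(i) \(B\) spans \(E\), i.e., each \(\vec{v} \in E\) has the form
\[\vec{v}=\sum_{i=1}^{n} a_{i} \vec{v}_{i}\]
for some scalars \(a_{i}\), and
(ii) this representation is unique.
The latter is true iff the \(\vec{v}_{i}\) are independent, i.e.,
\[\sum_{i=1}^{n} a_{i} \vec{v}_{i}=\overrightarrow{0} \Longleftrightarrow a_{i}=0, \quad i=1, \ldots, n.\]
If \(E\) has a basis of \(n\) vectors, we call \(E\) \(n\)-dimensional (e.g., \(E^{n}\) and \(C^{n}\)).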
Determinants and bases satisfy the following rules.
(a) Multiplication rule. If \(\phi, g: E^{n} \rightarrow E^{n}\) (or \(C^{n} \rightarrow C^{n}\)) are linear, then
\[\det[g] \cdot \det[\phi]=\det([g][\phi])=\det[g \circ \phi]\]
(see §2, Theorem 3 and Note 4).
(b) If \(\phi(\vec{x})=\vec{x}\) (identity map), then \([\phi]=\left(v_{ik}\right)\), where
\[v_{ik}=\begin{cases}
0 & \text{if } i \neq k, \\
1 & \text{if } i=k;
\end{cases}\]
hence \(\det[\phi]=1\). (Why?) See also the Problems.
(c) An \(n\)-dimensional space \(E\) is spanned by a set of \(n\) vectors iff they are independent. If so, each basis consists of exactly \(n\) vectors.
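Rules (a) and (b) are easy to check on a concrete pair of integer matrices. The sketch below is only a numerical illustration, not a proof; the matrices `g` and `phi` and the helper names `det`, `matmul`, and `identity` are arbitrary choices made here.

```python
from itertools import permutations
from math import prod


def det(v):
    """Determinant as the signed sum over all tuples of distinct column indices."""
    n = len(v)
    total = 0
    for ks in permutations(range(n)):
        # The sign of this product decides whether the term enters with + or -.
        sign = prod(ks[m] - ks[j] for m in range(n) for j in range(m))
        total += (1 if sign > 0 else -1) * prod(v[i][ks[i]] for i in range(n))
    return total


def matmul(a, b):
    """Matrix product [a][b], which is the matrix of the composite map a∘b."""
    n = len(a)
    return [[sum(a[i][j] * b[j][k] for j in range(n)) for k in range(n)]
            for i in range(n)]


def identity(n):
    """Matrix of the identity map: 1 on the diagonal, 0 elsewhere."""
    return [[1 if i == k else 0 for k in range(n)] for i in range(n)]


g = [[1, 2, 0], [0, 1, 3], [4, 0, 1]]
phi = [[2, 1, 1], [0, 3, 1], [1, 0, 2]]

if __name__ == "__main__":
    print(det(g) * det(phi), det(matmul(g, phi)))  # rule (a): the two agree
    print(det(identity(3)))                        # rule (b)
```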
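For any function \(f: E^{n} \rightarrow E^{n}\) (or \(f: C^{n} \rightarrow C^{n}\)), we define the \(f\)-induced Jacobian map \(J_{f}: E^{n} \rightarrow E^{1}\) \(\left(J_{f}: C^{n} \rightarrow C\right)\) by setting
\[J_{f}(\vec{x})=\det\left(v_{ik}\right),\]
where \(v_{ik}=D_{k} f_{i}(\vec{x})\), \(\vec{x} \in E^{n}\left(C^{n}\right)\), and \(f=\left(f_{1}, \ldots, f_{n}\right)\).
The determinant
\[J_{f}(\vec{p})=\det\left(D_{k} f_{i}(\vec{p})\right)\]
is called the Jacobian of \(f\) at \(\vec{p}\).
By our conventions, it is always defined, as are the functions \(D_{k} f_{i}\).
Explicitly, \(J_{f}(\vec{p})\) is the determinant of the right-side matrix in formula (14) in §3. Briefly,
\[J_{f}=\det\left(D_{k} f_{i}\right).\]
By Definition 2 and Note 2 in §5,
\[J_{f}(\vec{p})=\det\left[d^{1} f(\vec{p}; \cdot)\right].\]
If \(f\) is differentiable at \(\vec{p}\),
\[J_{f}(\vec{p})=\det\left[f^{\prime}(\vec{p})\right].\]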
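Note 1. More generally, given any functions \(v_{ik}: E^{\prime} \rightarrow E^{1}(C)\), we can define a map \(f: E^{\prime} \rightarrow E^{1}(C)\) by
\[f(\vec{x})=\det\left(v_{ik}(\vec{x})\right);\]
briefly, \(f=\det\left(v_{ik}\right)\), \(i, k=1, \ldots, n\).
We then call \(f\) a functional determinant.
If \(E^{\prime}=E^{n}\left(C^{n}\right)\), then \(f\) is a function of \(n\) variables, since \(\vec{x}=\left(x_{1}, x_{2}, \ldots, x_{n}\right)\). If all \(v_{ik}\) are continuous or differentiable at some \(\vec{p} \in E^{\prime}\), so is \(f\); for by (1), \(f\) is a finite sum of functions of the form
\[(-1)^{\lambda} v_{1 k_{1}} v_{2 k_{2}} \ldots v_{n k_{n}},\]
and each of these is continuous or differentiable if the \(v_{i k_{i}}\) are (see Problems 7 and 8 in §3).
Note 2. Hence the Jacobian map \(J_{f}\) is continuous or differentiable at \(\vec{p}\) if all the partially derived functions \(D_{k} f_{i}\) \((i, k \leq n)\) are.
If, in addition, \(J_{f}(\vec{p}) \neq 0\), then \(J_{f} \neq 0\) on some globe about \(\vec{p}\). (Apply Problem 7 in Chapter 4, §2, to \(\left|J_{f}\right|\).)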
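In classical notation, one writes
\[\frac{\partial\left(f_{1}, \ldots, f_{n}\right)}{\partial\left(x_{1}, \ldots, x_{n}\right)} \text{ or } \frac{\partial\left(y_{1}, \ldots, y_{n}\right)}{\partial\left(x_{1}, \ldots, x_{n}\right)}\]
for \(J_{f}(\vec{x})\). Here \(\left(y_{1}, \ldots, y_{n}\right)=f\left(x_{1}, \ldots, x_{n}\right)\).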
The remarks made in §4 apply to this "variable" notation too. The chain rule easily yields the following corollary.
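Corollary 1. If \(f: E^{n} \rightarrow E^{n}\) and \(g: E^{n} \rightarrow E^{n}\) (or \(f, g: C^{n} \rightarrow C^{n}\)) are differentiable at \(\vec{p}\) and \(\vec{q}=f(\vec{p})\), respectively, and if
\[h=g \circ f,\]
then
\[J_{h}(\vec{p})=J_{g}(\vec{q}) \cdot J_{f}(\vec{p})=\det\left(z_{ik}\right), \tag{i}\]
where
\[z_{ik}=D_{k} h_{i}(\vec{p}), \quad i, k=1, \ldots, n;\]
or, setting
\[\left(u_{1}, \ldots, u_{n}\right)=g\left(y_{1}, \ldots, y_{n}\right) \text{ and } \left(y_{1}, \ldots, y_{n}\right)=f\left(x_{1}, \ldots, x_{n}\right) \text{ ("variables"),}\]
we have
\[\frac{\partial\left(u_{1}, \ldots, u_{n}\right)}{\partial\left(x_{1}, \ldots, x_{n}\right)}=\frac{\partial\left(u_{1}, \ldots, u_{n}\right)}{\partial\left(y_{1}, \ldots, y_{n}\right)} \cdot \frac{\partial\left(y_{1}, \ldots, y_{n}\right)}{\partial\left(x_{1}, \ldots, x_{n}\right)}=\det\left(z_{ik}\right), \tag{ii}\]
where
\[z_{ik}=\frac{\partial u_{i}}{\partial x_{k}}, \quad i, k=1, \ldots, n.\]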
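Proof.
By Note 2 in §4,
\[\left[h^{\prime}(\vec{p})\right]=\left[g^{\prime}(\vec{q})\right] \cdot\left[f^{\prime}(\vec{p})\right].\]
Thus by rule (a) above,
\[\det\left[h^{\prime}(\vec{p})\right]=\det\left[g^{\prime}(\vec{q})\right] \cdot \det\left[f^{\prime}(\vec{p})\right],\]
i.e.,
\[J_{h}(\vec{p})=J_{g}(\vec{q}) \cdot J_{f}(\vec{p}).\]
Also, if \(\left[h^{\prime}(\vec{p})\right]=\left(z_{ik}\right)\), Definition 2 yields \(z_{ik}=D_{k} h_{i}(\vec{p})\).
This proves (i), hence (ii) also. \(\square\)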
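As a quick check of the product formula, here is a worked instance in \(E^{2}\). The maps \(f\) and \(g\) below are illustrative choices made for this example only; they do not appear in the text.

```latex
% Illustrative maps (assumed for this example):
%   f(x_1, x_2) = (x_1 + x_2,\; x_1 x_2), \qquad g(y_1, y_2) = (y_1^2,\; y_2).
\[
J_f(\vec{x}) =
\begin{vmatrix} 1 & 1 \\ x_2 & x_1 \end{vmatrix} = x_1 - x_2,
\qquad
J_g(\vec{y}) =
\begin{vmatrix} 2 y_1 & 0 \\ 0 & 1 \end{vmatrix} = 2 y_1 .
\]
% The composite map is h = g \circ f, so h(x_1, x_2) = ((x_1 + x_2)^2,\; x_1 x_2), and
\[
J_h(\vec{x}) =
\begin{vmatrix} 2(x_1 + x_2) & 2(x_1 + x_2) \\ x_2 & x_1 \end{vmatrix}
= 2(x_1 + x_2)(x_1 - x_2).
\]
% With \vec{y} = f(\vec{x}), i.e. y_1 = x_1 + x_2, the product of the two Jacobians is
\[
J_g(f(\vec{x})) \cdot J_f(\vec{x}) = 2(x_1 + x_2)\,(x_1 - x_2) = J_h(\vec{x}).
\]
```

Note that \(J_{g}\) must be evaluated at \(\vec{q}=f(\vec{p})\), not at \(\vec{p}\) itself.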
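In practice, Jacobians mostly occur when a change of variables is made. For instance, in \(E^{2}\), we may pass from Cartesian coordinates \((x, y)\) to another system \((u, v)\) such that
\[x=f_{1}(u, v) \text{ and } y=f_{2}(u, v).\]
We then set \(f=\left(f_{1}, f_{2}\right)\) and obtain \(f: E^{2} \rightarrow E^{2}\),
\[J_{f}=\det\left(D_{k} f_{i}\right), \quad k, i=1,2.\]
For polar coordinates, let \(x=f_{1}(r, \theta)=r \cos \theta\) and \(y=f_{2}(r, \theta)=r \sin \theta\).
Then using the "variable" notation, we obtain \(J_{f}(r, \theta)\) as
\[\frac{\partial(x, y)}{\partial(r, \theta)}=\left|\begin{array}{ll}
\frac{\partial x}{\partial r} & \frac{\partial x}{\partial \theta} \\
\frac{\partial y}{\partial r} & \frac{\partial y}{\partial \theta}
\end{array}\right|=\left|\begin{array}{cc}
\cos \theta & -r \sin \theta \\
\sin \theta & r \cos \theta
\end{array}\right|=r \cos ^{2} \theta+r \sin ^{2} \theta=r.\]
Thus here \(J_{f}(r, \theta)=r\) for all \(r, \theta \in E^{1}\); \(J_{f}\) is independent of \(\theta\).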
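The value \(J_{f}(r, \theta)=r\) can also be confirmed numerically. The sketch below approximates the four partials \(D_{k} f_{i}\) by central differences and takes the \(2 \times 2\) determinant. The step size `h` and the function names `polar` and `jacobian_2d` are assumptions made for this illustration, and the result is only accurate up to discretization and rounding error.

```python
import math


def polar(r, theta):
    """The map f(r, theta) = (r cos(theta), r sin(theta))."""
    return (r * math.cos(theta), r * math.sin(theta))


def jacobian_2d(f, u, v, h=1e-6):
    """Approximate det(D_k f_i) at (u, v) using central differences with step h."""
    fu_plus, fu_minus = f(u + h, v), f(u - h, v)
    fv_plus, fv_minus = f(u, v + h), f(u, v - h)
    d1f1 = (fu_plus[0] - fu_minus[0]) / (2 * h)  # partial of f_1 in the first variable
    d1f2 = (fu_plus[1] - fu_minus[1]) / (2 * h)  # partial of f_2 in the first variable
    d2f1 = (fv_plus[0] - fv_minus[0]) / (2 * h)  # partial of f_1 in the second variable
    d2f2 = (fv_plus[1] - fv_minus[1]) / (2 * h)  # partial of f_2 in the second variable
    return d1f1 * d2f2 - d2f1 * d1f2


if __name__ == "__main__":
    for r, theta in [(2.0, 0.7), (0.5, 3.0), (-1.5, 1.2)]:
        print(r, jacobian_2d(polar, r, theta))  # second value should be close to r
```

The third sample point uses a negative \(r\), which is allowed here since the formula holds for all \(r, \theta \in E^{1}\).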
We now concentrate on one-to-one (invertible) functions.
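Theorem 1. For a linear map \(\phi: E^{n} \rightarrow E^{n}\) (or \(\phi: C^{n} \rightarrow C^{n}\)), the following are equivalent:
(i) \(\phi\) is one-to-one;
(ii) the column vectors \(\vec{v}_{1}, \ldots, \vec{v}_{n}\) of the matrix \([\phi]\) are independent;
(iii) \(\phi\) is onto \(E^{n}\left(C^{n}\right)\);
(iv) \(\det[\phi] \neq 0\).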
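Proof.
Assume (i) and let
\[\sum_{k=1}^{n} c_{k} \vec{v}_{k}=\overrightarrow{0}.\]
To deduce (ii), we must show that all \(c_{k}\) vanish.
Now, by Note 3 in §2, \(\vec{v}_{k}=\phi\left(\vec{e}_{k}\right)\); so by linearity,
\[\sum_{k=1}^{n} c_{k} \vec{v}_{k}=\overrightarrow{0}\]
implies
\[\phi\left(\sum_{k=1}^{n} c_{k} \vec{e}_{k}\right)=\overrightarrow{0}.\]
As \(\phi\) is one-to-one, it can vanish at \(\overrightarrow{0}\) only. Thus
\[\sum_{k=1}^{n} c_{k} \vec{e}_{k}=\overrightarrow{0}.\]
Hence by Theorem 2 in Chapter 3, §§1-3, \(c_{k}=0\), \(k=1, \ldots, n\), and (ii) follows.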
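Next, assume (ii); so, by rule (c) above, \(\left\{\vec{v}_{1}, \ldots, \vec{v}_{n}\right\}\) is a basis.
Thus each \(\vec{y} \in E^{n}\left(C^{n}\right)\) has the form
\[\vec{y}=\sum_{k=1}^{n} a_{k} \vec{v}_{k}=\sum_{k=1}^{n} a_{k} \phi\left(\vec{e}_{k}\right)=\phi\left(\sum_{k=1}^{n} a_{k} \vec{e}_{k}\right)=\phi(\vec{x}),\]
where
\[\vec{x}=\sum_{k=1}^{n} a_{k} \vec{e}_{k} \quad \text{(uniquely).}\]
Hence (ii) implies both (iii) and (i). (Why?)
Now assume (iii). Then each \(\vec{y} \in E^{n}\left(C^{n}\right)\) has the form \(\vec{y}=\phi(\vec{x})\), where
\[\vec{x}=\sum_{k=1}^{n} x_{k} \vec{e}_{k},\]
by Theorem 2 in Chapter 3, §§1-3. Hence again
\[\vec{y}=\sum_{k=1}^{n} x_{k} \phi\left(\vec{e}_{k}\right)=\sum_{k=1}^{n} x_{k} \vec{v}_{k};\]
so the \(\vec{v}_{k}\) span all of \(E^{n}\left(C^{n}\right)\). By rule (c) above, this implies (ii), hence (i), too. Thus (i), (ii), and (iii) are equivalent.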
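Also, by rules (a) and (b), we have
\[\det[\phi] \cdot \det\left[\phi^{-1}\right]=\det\left[\phi \circ \phi^{-1}\right]=1\]
if \(\phi\) is one-to-one (for \(\phi \circ \phi^{-1}\) is the identity map). Hence \(\det[\phi] \neq 0\) if (i) holds.
For the converse, suppose \(\phi\) is not one-to-one. Then by (ii), the \(\vec{v}_{k}\) are not independent. Thus one of them is a linear combination of the others, say,
\[\vec{v}_{1}=\sum_{k=2}^{n} a_{k} \vec{v}_{k}.\]
But by linear algebra (Problem 13(iii)), \(\det[\phi]\) does not change if \(\vec{v}_{1}\) is replaced by
\[\vec{v}_{1}-\sum_{k=2}^{n} a_{k} \vec{v}_{k}=\overrightarrow{0}.\]
Thus \(\det[\phi]=0\) (one column turning to \(\overrightarrow{0}\)). This completes the proof. \(\square\)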
Note 3. Maps that are both onto and one-to-one are called bijective. Such is \(\phi\) in Theorem 1. This means that the equation
\[\phi(\vec{x})=\vec{y}\]
has a unique solution
\[\vec{x}=\phi^{-1}(\vec{y})\]
for each \(\vec{y}\). Componentwise, by Theorem 1, the equations
\[\sum_{k=1}^{n} x_{k} v_{ik}=y_{i}, \quad i=1, \ldots, n,\]
have a unique solution for the \(x_{k}\) iff \(\det\left(v_{ik}\right) \neq 0\).
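The criterion in Note 3 can be tried out on small systems. The sketch below uses Cramer's rule, a standard technique that the text itself does not invoke, with exact rational arithmetic; the names `det` and `solve_cramer` are illustrative. It returns `None` when the determinant vanishes, since a unique solution then does not exist.

```python
from fractions import Fraction
from itertools import permutations
from math import prod


def det(v):
    """Determinant as the signed sum over all tuples of distinct column indices."""
    n = len(v)
    total = 0
    for ks in permutations(range(n)):
        sign = prod(ks[m] - ks[j] for m in range(n) for j in range(m))
        total += (1 if sign > 0 else -1) * prod(v[i][ks[i]] for i in range(n))
    return total


def solve_cramer(v, y):
    """Solve sum_k x_k v[i][k] = y[i] for integer data; None if det(v) == 0."""
    d = det(v)
    if d == 0:
        return None  # no unique solution
    xs = []
    for k in range(len(v)):
        # Replace column k of v by the right-hand side y.
        replaced = [row[:k] + [y[i]] + row[k + 1:] for i, row in enumerate(v)]
        xs.append(Fraction(det(replaced), d))
    return xs


if __name__ == "__main__":
    print(solve_cramer([[2, 1], [1, 3]], [3, 5]))  # determinant 5, unique solution
    print(solve_cramer([[1, 2], [2, 4]], [1, 2]))  # dependent columns, so None
```

In the second call the columns are dependent, so the map is neither one-to-one nor onto, in line with Theorem 1.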
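Corollary 2. If \(\phi \in L\left(E^{\prime}, E\right)\) is bijective, with \(E^{\prime}\) and \(E\) complete, then \(\phi^{-1} \in L\left(E, E^{\prime}\right)\).
Proof for \(E=E^{n}\left(C^{n}\right)\).
The notation \(\phi \in L\left(E^{\prime}, E\right)\) means that \(\phi: E^{\prime} \rightarrow E\) is linear and continuous.
As \(\phi\) is bijective, \(\phi^{-1}: E \rightarrow E^{\prime}\) is linear (Problem 12).
If \(E=E^{n}\left(C^{n}\right)\), it is continuous, too (Theorem 2 in §2).
Thus \(\phi^{-1} \in L\left(E, E^{\prime}\right)\). \(\square\)
Note. The case \(E=E^{n}\left(C^{n}\right)\) suffices for an undergraduate course. (The beginner is advised to omit the "starred" §8.) Corollary 2 and Theorem 2 below, however, are valid in the general case. So is Theorem 1 in §7.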
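Theorem 2. Let \(E, E^{\prime}\) and \(\phi\) be as in Corollary 2. Set
\[\left\|\phi^{-1}\right\|=\frac{1}{\varepsilon}.\]
Then any map \(\theta \in L\left(E^{\prime}, E\right)\) with \(\|\theta-\phi\|<\varepsilon\) is one-to-one, and \(\theta^{-1}\) is uniformly continuous.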
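Proof.
By Corollary 2, \(\phi^{-1} \in L\left(E, E^{\prime}\right)\), so \(\left\|\phi^{-1}\right\|\) is defined and \(>0\) (for \(\phi^{-1}\) is not the zero map, being one-to-one).
Thus we may set
\[\varepsilon=\frac{1}{\left\|\phi^{-1}\right\|}, \quad \left\|\phi^{-1}\right\|=\frac{1}{\varepsilon}.\]
Clearly \(\vec{x}=\phi^{-1}(\vec{y})\) if \(\vec{y}=\phi(\vec{x})\). Also,
\[\left|\phi^{-1}(\vec{y})\right| \leq \frac{1}{\varepsilon}|\vec{y}|\]
by Note 5 in §2. Hence
\[|\vec{y}| \geq \varepsilon\left|\phi^{-1}(\vec{y})\right|,\]
i.e.,
\[|\phi(\vec{x})| \geq \varepsilon|\vec{x}| \tag{2}\]
for all \(\vec{x} \in E^{\prime}\) and \(\vec{y} \in E\).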
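Now suppose \(\theta \in L\left(E^{\prime}, E\right)\) and \(\|\theta-\phi\|=\sigma<\varepsilon\).
Obviously, \(\theta=\phi-(\phi-\theta)\), and by Note 5 in §2,
\[|(\phi-\theta)(\vec{x})| \leq\|\phi-\theta\||\vec{x}|=\sigma|\vec{x}|.\]
Thus for every \(\vec{x} \in E^{\prime}\),
\[|\theta(\vec{x})| \geq|\phi(\vec{x})|-|(\phi-\theta)(\vec{x})| \geq|\phi(\vec{x})|-\sigma|\vec{x}| \geq(\varepsilon-\sigma)|\vec{x}| \tag{3}\]
by (2). Therefore, given \(\vec{p} \neq \vec{r}\) in \(E^{\prime}\) and setting \(\vec{x}=\vec{p}-\vec{r} \neq \overrightarrow{0}\), we obtain
\[|\theta(\vec{p})-\theta(\vec{r})|=|\theta(\vec{p}-\vec{r})|=|\theta(\vec{x})| \geq(\varepsilon-\sigma)|\vec{x}|>0\]
(since \(\sigma<\varepsilon\)).
We see that \(\vec{p} \neq \vec{r}\) implies \(\theta(\vec{p}) \neq \theta(\vec{r})\); so \(\theta\) is one-to-one, indeed.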
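Also, setting \(\theta(\vec{x})=\vec{z}\) and \(\vec{x}=\theta^{-1}(\vec{z})\) in (3), we get
\[|\vec{z}| \geq(\varepsilon-\sigma)\left|\theta^{-1}(\vec{z})\right|;\]
that is,
\[\left|\theta^{-1}(\vec{z})\right| \leq(\varepsilon-\sigma)^{-1}|\vec{z}| \tag{5}\]
for all \(\vec{z}\) in the range of \(\theta\) (domain of \(\theta^{-1}\)).
Thus \(\theta^{-1}\) is linearly bounded (by Theorem 1 in §2), hence uniformly continuous, as claimed. \(\square\)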
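Corollary 3. If \(E^{\prime}=E=E^{n}\left(C^{n}\right)\) in Theorem 2 above, then for given \(\phi\) and \(\delta>0\), there always is \(\delta^{\prime}>0\) such that
\[\|\theta-\phi\|<\delta^{\prime} \text{ implies } \left\|\theta^{-1}-\phi^{-1}\right\|<\delta.\]
In other words, the transformation \(\phi \rightarrow \phi^{-1}\) is continuous on the set of bijective maps in \(L(E)\), \(E=E^{n}\left(C^{n}\right)\).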
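Proof.
First, since \(E^{\prime}=E=E^{n}\left(C^{n}\right)\), \(\theta\) is bijective by Theorem 1(iii), so \(\theta^{-1} \in L(E)\).
As before, set \(\|\theta-\phi\|=\sigma<\varepsilon\).
By Note 5 in §2, formula (5) above implies that
\[\left\|\theta^{-1}\right\| \leq \frac{1}{\varepsilon-\sigma}.\]
Also,
\[\phi^{-1} \circ(\theta-\phi) \circ \theta^{-1}=\phi^{-1}-\theta^{-1}\]
(see Problem 11).
Hence by Corollary 4 in §2, recalling that \(\left\|\phi^{-1}\right\|=1 / \varepsilon\), we get
\[\left\|\theta^{-1}-\phi^{-1}\right\| \leq\left\|\phi^{-1}\right\| \cdot\|\theta-\phi\| \cdot\left\|\theta^{-1}\right\| \leq \frac{\sigma}{\varepsilon(\varepsilon-\sigma)} \rightarrow 0 \text{ as } \sigma \rightarrow 0. \quad \square\]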
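The final estimate can be observed numerically in \(E^{2}\). The sketch below assumes the Euclidean norm on \(E^{2}\), so that the operator norm of a real \(2 \times 2\) matrix is the square root of the largest eigenvalue of its transpose times itself; the matrices `phi` and `theta` and all helper names are illustrative choices, not taken from the text.

```python
import math


def op_norm(a):
    """Euclidean operator norm of a real 2x2 matrix a."""
    (p, q), (r, s) = a
    t = p * p + q * q + r * r + s * s   # trace of (a^T a)
    d = (p * s - q * r) ** 2            # determinant of (a^T a)
    return math.sqrt((t + math.sqrt(max(t * t - 4 * d, 0.0))) / 2)


def inverse(a):
    """Inverse of a 2x2 matrix with nonzero determinant."""
    (p, q), (r, s) = a
    det = p * s - q * r
    return [[s / det, -q / det], [-r / det, p / det]]


def sub(a, b):
    """Entrywise difference a - b."""
    return [[a[i][k] - b[i][k] for k in range(2)] for i in range(2)]


phi = [[2.0, 1.0], [0.0, 3.0]]
theta = [[2.1, 0.9], [0.05, 3.0]]       # a small perturbation of phi

eps = 1 / op_norm(inverse(phi))         # epsilon = 1 / ||phi^{-1}||
sigma = op_norm(sub(theta, phi))        # sigma = ||theta - phi||
lhs = op_norm(sub(inverse(theta), inverse(phi)))
bound = sigma / (eps * (eps - sigma))   # the bound from the last display

if __name__ == "__main__":
    print(sigma < eps, lhs <= bound)
```

Shrinking the perturbation in `theta` makes `sigma`, and with it `bound`, tend to zero, which is the continuity of the passage from a map to its inverse.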