6.6: Determinants. Jacobians. Bijective Linear Operators
We assume the reader to be familiar with elements of linear algebra. Thus we only briefly recall some definitions and well-known rules.
Given a linear operator \(\phi : E^{n} \rightarrow E^{n}\left(\text { or } \phi : C^{n} \rightarrow C^{n}\right),\) with matrix
\[
[\phi]=\left(v_{i k}\right), \quad i, k=1, \ldots, n,
\]
we define the determinant of \([\phi]\) by
\[
\begin{aligned} \operatorname{det}[\phi]=\operatorname{det}\left(v_{i k}\right) &=\left|\begin{array}{cccc}{v_{11}} & {v_{12}} & {\dots} & {v_{1 n}} \\ {v_{21}} & {v_{22}} & {\dots} & {v_{2 n}} \\ {\vdots} & {\vdots} & {\ddots} & {\vdots} \\ {v_{n 1}} & {v_{n 2}} & {\dots} & {v_{n n}}\end{array}\right| \\[12pt] &=\sum(-1)^{\lambda} v_{1 k_{1}} v_{2 k_{2}} \ldots v_{n k_{n}} \end{aligned}
\tag{1}
\]
where the sum is over all ordered \(n\)-tuples \(\left(k_{1}, \ldots, k_{n}\right)\) of distinct integers \(k_{j}\left(1 \leq k_{j} \leq n\right),\) and
\[
\lambda=\left\{\begin{array}{ll}{0} & {\text { if } \prod_{j<m}\left(k_{m}-k_{j}\right)>0 \text { and }} \\ {1} & {\text { if } \prod_{j<m}\left(k_{m}-k_{j}\right)<0}\end{array}\right.
\]
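To see the sign rule at work, here is a sketch of the expansion for \(n=2\) (an illustrative special case added here, not part of the original development). The only admissible pairs \(\left(k_{1}, k_{2}\right)\) are \((1,2)\) and \((2,1)\). For \((1,2)\) we have \(k_{2}-k_{1}=1>0,\) so \(\lambda=0\); for \((2,1)\) we have \(k_{2}-k_{1}=-1<0,\) so \(\lambda=1\). Hence
\[
\left|\begin{array}{ll}{v_{11}} & {v_{12}} \\ {v_{21}} & {v_{22}}\end{array}\right|=v_{11} v_{22}-v_{12} v_{21}.
\]
For \(n=3\) the sum has \(3 !=6\) terms. For instance, \(\left(k_{1}, k_{2}, k_{3}\right)=(2,3,1)\) gives
\[
\prod_{j<m}\left(k_{m}-k_{j}\right)=\left(k_{2}-k_{1}\right)\left(k_{3}-k_{1}\right)\left(k_{3}-k_{2}\right)=(1)(-1)(-2)=2>0,
\]
so \(\lambda=0\) and the corresponding term is \(+v_{12} v_{23} v_{31}\).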
Recall (Problem 12 in §2) that a set \(B=\left\{\vec{v}_{1}, \vec{v}_{2}, \ldots, \vec{v}_{n}\right\}\) in a vector space \(E\) is a basis iff
(i) \(B\) spans \(E,\) i.e., each \(\vec{v} \in E\) has the form
\[
\vec{v}=\sum_{i=1}^{n} a_{i} \vec{v}_{i}
\]
for some scalars \(a_{i},\) and
(ii) this representation is unique.
The latter is true iff the \(\vec{v}_{i}\) are independent, i.e.,
\[
\sum_{i=1}^{n} a_{i} \vec{v}_{i}=\overrightarrow{0} \Longleftrightarrow a_{i}=0, i=1, \ldots, n.
\]
If \(E\) has a basis of \(n\) vectors, we call \(E\) \(n\)-dimensional (e.g., \(E^{n}\) and \(C^{n}\)).
Determinants and bases satisfy the following rules.
(a) Multiplication rule. If \(\phi, g : E^{n} \rightarrow E^{n}\left(\text { or } C^{n} \rightarrow C^{n}\right)\) are linear, then
\[
\operatorname{det}[g] \cdot \operatorname{det}[\phi]=\operatorname{det}([g][\phi])=\operatorname{det}[g \circ \phi]
\]
(see §2, Theorem 3 and Note 4).
(b) If \(\phi(\vec{x})=\vec{x}\) (identity map), then \([\phi]=\left(v_{i k}\right)\), where
\[
v_{i k}=\left\{\begin{array}{ll}{0} & {\text { if } i \neq k \text { and }} \\ {1} & {\text { if } i=k}\end{array}\right.
\]
hence \(\operatorname{det}[\phi]=1\). (Why?) See also the Problems.
(c) An \(n\) -dimensional space \(E\) is spanned by a set of \(n\) vectors iff they are independent. If so, each basis consists of exactly \(n\) vectors.
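Rule (a) is easy to test numerically. The two matrices below are chosen only for illustration; they do not come from the text. Take
\[
[g]=\left(\begin{array}{ll}{1} & {2} \\ {3} & {4}\end{array}\right), \quad[\phi]=\left(\begin{array}{ll}{0} & {1} \\ {1} & {1}\end{array}\right), \quad \text { so } \quad[g][\phi]=\left(\begin{array}{ll}{2} & {3} \\ {4} & {7}\end{array}\right).
\]
Then
\[
\operatorname{det}[g]=1 \cdot 4-2 \cdot 3=-2, \quad \operatorname{det}[\phi]=0 \cdot 1-1 \cdot 1=-1, \quad \operatorname{det}([g][\phi])=2 \cdot 7-3 \cdot 4=2,
\]
and indeed \((-2) \cdot(-1)=2\).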
For any function \(f : E^{n} \rightarrow E^{n}\) (or \(f : C^{n} \rightarrow C^{n}\)), we define the \(f\)-induced Jacobian map \(J_{f} : E^{n} \rightarrow E^{1}\) \(\left(J_{f} : C^{n} \rightarrow C\right)\) by setting
\[
J_{f}(\vec{x})=\operatorname{det}\left(v_{i k}\right),
\]
where \(v_{i k}=D_{k} f_{i}(\vec{x}), \vec{x} \in E^{n}\left(C^{n}\right),\) and \(f=\left(f_{1}, \ldots, f_{n}\right)\).
The determinant
\[
J_{f}(\vec{p})=\operatorname{det}\left(D_{k} f_{i}(\vec{p})\right)
\]
is called the Jacobian of \(f\) at \(\vec{p}\).
By our conventions, it is always defined, as are the functions \(D_{k} f_{i}\).
Explicitly, \(J_{f}(\vec{p})\) is the determinant of the right-side matrix in formula \((14)\) in §3. Briefly,
\[
J_{f}=\operatorname{det}\left(D_{k} f_{i}\right).
\]
By Definition 2 and Note 2 in §5,
\[
J_{f}(\vec{p})=\operatorname{det}\left[d^{1} f(\vec{p} ; \cdot)\right].
\]
If \(f\) is differentiable at \(\vec{p}\),
\[
J_{f}(\vec{p})=\operatorname{det}\left[f^{\prime}(\vec{p})\right].
\]
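As a concrete instance of these formulas, consider the map \(f : E^{2} \rightarrow E^{2}\) with \(f(x, y)=\left(x^{2}-y^{2}, 2 x y\right)\) (a hypothetical example, picked here only because its partials are easy to compute). With \(f_{1}=x^{2}-y^{2}\) and \(f_{2}=2 x y\),
\[
J_{f}(x, y)=\left|\begin{array}{ll}{D_{1} f_{1}} & {D_{2} f_{1}} \\ {D_{1} f_{2}} & {D_{2} f_{2}}\end{array}\right|=\left|\begin{array}{cc}{2 x} & {-2 y} \\ {2 y} & {2 x}\end{array}\right|=4 x^{2}+4 y^{2}.
\]
Since the components are polynomials, \(f\) is differentiable everywhere, so here \(J_{f}(\vec{p})\) is also \(\operatorname{det}\left[f^{\prime}(\vec{p})\right]\) at each \(\vec{p}\).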
Note 1. More generally, given any functions \(v_{i k} : E^{\prime} \rightarrow E^{1}(C),\) we can define a map \(f : E^{\prime} \rightarrow E^{1}(C)\) by
\[
f(\vec{x})=\operatorname{det}\left(v_{i k}(\vec{x})\right);
\]
briefly \(f=\operatorname{det}\left(v_{i k}\right), i, k=1, \ldots, n\).
We then call \(f\) a functional determinant.
If \(E^{\prime}=E^{n}\left(C^{n}\right)\), then \(f\) is a function of \(n\) variables, since \(\vec{x}=\left(x_{1}, x_{2}, \ldots, x_{n}\right)\). If all \(v_{i k}\) are continuous or differentiable at some \(\vec{p} \in E^{\prime},\) so is \(f\); for by (1), \(f\) is a finite sum of functions of the form
\[
(-1)^{\lambda} v_{1 k_{1}} v_{2 k_{2}} \dots v_{n k_{n}},
\]
and each of these is continuous or differentiable if the \(v_{i k_{i}}\) are (see Problems 7 and 8 in §3).
Note 2. Hence the Jacobian map \(J_{f}\) is continuous or differentiable at \(\vec{p}\) if all the partially derived functions \(D_{k} f_{i}\) \((i, k \leq n)\) are.
If, in addition, \(J_{f}(\vec{p}) \neq 0,\) then \(J_{f} \neq 0\) on some globe about \(\vec{p}\). (Apply Problem 7 in Chapter 4, §2, to \(\left|J_{f}\right|\).)
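A minimal illustration of Note 2, using a map chosen here for convenience (it is not from the text): let \(f(x, y)=\left(x^{2}, y\right)\) on \(E^{2}\). Then
\[
J_{f}(x, y)=\left|\begin{array}{cc}{2 x} & {0} \\ {0} & {1}\end{array}\right|=2 x,
\]
which is continuous on all of \(E^{2}\). At \(\vec{p}=(1,0)\) we get \(J_{f}(\vec{p})=2 \neq 0\). Every point \((x, y)\) with \(|(x, y)-\vec{p}|<1\) has \(x>0,\) so \(J_{f}>0\) on the globe of radius 1 about \(\vec{p}\). No larger radius works for this \(\vec{p},\) since \(J_{f}(0,0)=0\).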
In classical notation, one writes
\[
\frac{\partial\left(f_{1}, \ldots, f_{n}\right)}{\partial\left(x_{1}, \ldots, x_{n}\right)} \text { or } \frac{\partial\left(y_{1}, \ldots, y_{n}\right)}{\partial\left(x_{1}, \ldots, x_{n}\right)}
\]
for \(J_{f}(\vec{x})\). Here \(\left(y_{1}, \ldots, y_{n}\right)=f\left(x_{1}, \ldots, x_{n}\right)\).
The remarks made in §4 apply to this "variable" notation too. The chain rule easily yields the following corollary.
Corollary 1. If \(f : E^{n} \rightarrow E^{n}\) and \(g : E^{n} \rightarrow E^{n}\) (or \(f, g : C^{n} \rightarrow C^{n}\)) are differentiable at \(\vec{p}\) and \(\vec{q}=f(\vec{p}),\) respectively, and if
\[h=g \circ f,\]
then
\[J_{h}(\vec{p})=J_{g}(\vec{q}) \cdot J_{f}(\vec{p})=\operatorname{det}\left(z_{i k}\right),\]
where
\[z_{i k}=D_{k} h_{i}(\vec{p}), \quad i, k=1, \ldots, n;\]
or, setting
\[\begin{aligned}\left(u_{1}, \ldots, u_{n}\right) &=g\left(y_{1}, \ldots, y_{n}\right) \text { and } \\\left(y_{1}, \ldots, y_{n}\right) &=f\left(x_{1}, \ldots, x_{n}\right) \text { ("variables")}, \end{aligned}\]
we have
\[\frac{\partial\left(u_{1}, \ldots, u_{n}\right)}{\partial\left(x_{1}, \ldots, x_{n}\right)}=\frac{\partial\left(u_{1}, \ldots, u_{n}\right)}{\partial\left(y_{1}, \ldots, y_{n}\right)} \cdot \frac{\partial\left(y_{1}, \ldots, y_{n}\right)}{\partial\left(x_{1}, \ldots, x_{n}\right)}=\operatorname{det}\left(z_{i k}\right),\]
where
\[z_{i k}=\frac{\partial u_{i}}{\partial x_{k}}, \quad i, k=1, \ldots, n.\]
Proof. By Note 2 in §4,
\[\left[h^{\prime}(\vec{p})\right]=\left[g^{\prime}(\vec{q})\right] \cdot\left[f^{\prime}(\vec{p})\right].\]
Thus by rule (a) above,
\[\operatorname{det}\left[h^{\prime}(\vec{p})\right]=\operatorname{det}\left[g^{\prime}(\vec{q})\right] \cdot \operatorname{det}\left[f^{\prime}(\vec{p})\right],\]
i.e.,
\[J_{h}(\vec{p})=J_{g}(\vec{q}) \cdot J_{f}(\vec{p}).\]
Also, if \(\left[h^{\prime}(\vec{p})\right]=\left(z_{i k}\right),\) Definition 2 yields \(z_{i k}=D_{k} h_{i}(\vec{p})\).
This proves the first formula of the corollary, hence also the second, which restates it in the "variable" notation. \(\quad \square\)
In practice, Jacobians mostly occur when a change of variables is made. For instance, in \(E^{2},\) we may pass from Cartesian coordinates \((x, y)\) to another system \((u, v)\) such that
\[x=f_{1}(u, v) \text { and } y=f_{2}(u, v).\]
We then set \(f=\left(f_{1}, f_{2}\right)\) and obtain \(f : E^{2} \rightarrow E^{2}\),
\[J_{f}=\operatorname{det}\left(D_{k} f_{i}\right), \quad k, i=1,2.\]
Example (polar coordinates). Let \(x=f_{1}(r, \theta)=r \cos \theta\) and \(y=f_{2}(r, \theta)=r \sin \theta\).
Then using the "variable" notation, we obtain \(J_{f}(r, \theta)\) as
\[\begin{aligned} \frac{\partial(x, y)}{\partial(r, \theta)}=\left|\begin{array}{ll}{\frac{\partial x}{\partial r}} & {\frac{\partial x}{\partial \theta}} \\ {\frac{\partial y}{\partial r}} & {\frac{\partial y}{\partial \theta}}\end{array}\right| &=\left|\begin{array}{cc}{\cos \theta} & {-r \sin \theta} \\ {\sin \theta} & {r \cos \theta}\end{array}\right| \\ &=r \cos ^{2} \theta+r \sin ^{2} \theta=r. \end{aligned}\]
Thus here \(J_{f}(r, \theta)=r\) for all \(r, \theta \in E^{1} ; J_{f}\) is independent of \(\theta\).
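The polar map also gives a quick check of the product formula for Jacobians of composites. As a sketch, take the hypothetical outer map \(g(x, y)=\left(x^{2}-y^{2}, 2 x y\right)\) (chosen here for illustration) and set \(h=g \circ f\). By the double-angle formulas,
\[
h(r, \theta)=\left(r^{2} \cos ^{2} \theta-r^{2} \sin ^{2} \theta, 2 r^{2} \cos \theta \sin \theta\right)=\left(r^{2} \cos 2 \theta, r^{2} \sin 2 \theta\right).
\]
Computing directly,
\[
J_{h}(r, \theta)=\left|\begin{array}{cc}{2 r \cos 2 \theta} & {-2 r^{2} \sin 2 \theta} \\ {2 r \sin 2 \theta} & {2 r^{2} \cos 2 \theta}\end{array}\right|=4 r^{3} \cos ^{2} 2 \theta+4 r^{3} \sin ^{2} 2 \theta=4 r^{3}.
\]
On the other hand,
\[
J_{g}(x, y)=\left|\begin{array}{cc}{2 x} & {-2 y} \\ {2 y} & {2 x}\end{array}\right|=4\left(x^{2}+y^{2}\right)=4 r^{2} \quad \text { at } \quad(x, y)=f(r, \theta),
\]
so \(J_{g}(f(r, \theta)) \cdot J_{f}(r, \theta)=4 r^{2} \cdot r=4 r^{3},\) in agreement with the direct computation.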
We now concentrate on one-to-one (invertible) functions.
Theorem 1. For a linear map \(\phi : E^{n} \rightarrow E^{n}\left(\text { or } \phi : C^{n} \rightarrow C^{n}\right),\) the following are equivalent:
(i) \(\phi\) is one-to-one;
(ii) the column vectors \(\vec{v}_{1}, \ldots, \vec{v}_{n}\) of the matrix \([\phi]\) are independent;
(iii) \(\phi\) is onto \(E^{n}\left(C^{n}\right)\);
(iv) \(\operatorname{det}[\phi] \neq 0\).
Proof. Assume (i) and let
\[\sum_{k=1}^{n} c_{k} \vec{v}_{k}=\overrightarrow{0}.\]
To deduce (ii), we must show that all \(c_{k}\) vanish.
Now, by Note 3 in §2, \(\vec{v}_{k}=\phi\left(\vec{e}_{k}\right);\) so by linearity,
\[\sum_{k=1}^{n} c_{k} \vec{v}_{k}=\overrightarrow{0}\]
implies
\[\phi\left(\sum_{k=1}^{n} c_{k} \vec{e}_{k}\right)=\overrightarrow{0}.\]
As \(\phi\) is one-to-one, it can vanish at \(\overrightarrow{0}\) only. Thus
\[\sum_{k=1}^{n} c_{k} \vec{e}_{k}=\overrightarrow{0}.\]
Hence by Theorem 2 in Chapter 3, §§1-3, \(c_{k}=0, k=1, \ldots, n,\) and (ii) follows.
Next, assume (ii); so, by rule (c) above, \(\left\{\vec{v}_{1}, \ldots, \vec{v}_{n}\right\}\) is a basis.
Thus each \(\vec{y} \in E^{n}\left(C^{n}\right)\) has the form
\[\vec{y}=\sum_{k=1}^{n} a_{k} \vec{v}_{k}=\sum_{k=1}^{n} a_{k} \phi\left(\vec{e}_{k}\right)=\phi\left(\sum_{k=1}^{n} a_{k} \vec{e}_{k}\right)=\phi(\vec{x}),\]
where
\[\vec{x}=\sum_{k=1}^{n} a_{k} \vec{e}_{k} \text { (uniquely).}\]
Hence (ii) implies both (iii) and (i). (Why?)
Now assume (iii). Then each \(\vec{y} \in E^{n}\left(C^{n}\right)\) has the form \(\vec{y}=\phi(\vec{x}),\) where
\[\vec{x}=\sum_{k=1}^{n} x_{k} \vec{e}_{k},\]
by Theorem 2 in Chapter 3, §§1-3. Hence again
\[\vec{y}=\sum_{k=1}^{n} x_{k} \phi\left(\vec{e}_{k}\right)=\sum_{k=1}^{n} x_{k} \vec{v}_{k};\]
so the \(\vec{v}_{k}\) span all of \(E^{n}\left(C^{n}\right).\) By rule (c) above, this implies (ii), hence (i), too. Thus (i), (ii), and (iii) are equivalent.
Also, by rules (a) and (b), we have
\[\operatorname{det}[\phi] \cdot \operatorname{det}\left[\phi^{-1}\right]=\operatorname{det}\left[\phi \circ \phi^{-1}\right]=1\]
if \(\phi\) is one-to-one (for \(\phi \circ \phi^{-1}\) is the identity map). Hence \(\operatorname{det}[\phi] \neq 0\) if (i) holds.
For the converse, suppose \(\phi\) is not one-to-one. Then by (ii), the \(\vec{v}_{k}\) are not independent. Thus one of them is a linear combination of the others, say,
\[\vec{v}_{1}=\sum_{k=2}^{n} a_{k} \vec{v}_{k}.\]
But by linear algebra (Problem 13(iii)), \(\operatorname{det}[\phi]\) does not change if \(\vec{v}_{1}\) is replaced by
\[\vec{v}_{1}-\sum_{k=2}^{n} a_{k} \vec{v}_{k}=\overrightarrow{0}.\]
Thus \(\operatorname{det}[\phi]=0\) (one column turning to \(\overrightarrow{0}\)). This completes the proof. \(\quad \square\)
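It may help to watch all four conditions fail together in a small case. The matrix below is a hypothetical example, chosen here only for illustration. Let \(\phi : E^{2} \rightarrow E^{2}\) have
\[
[\phi]=\left(\begin{array}{ll}{1} & {2} \\ {2} & {4}\end{array}\right), \quad \vec{v}_{1}=(1,2), \quad \vec{v}_{2}=(2,4)=2 \vec{v}_{1}.
\]
The columns are dependent, and \(\operatorname{det}[\phi]=1 \cdot 4-2 \cdot 2=0\). Moreover,
\[
\phi(2,-1)=2 \vec{v}_{1}-\vec{v}_{2}=\overrightarrow{0}=\phi(\overrightarrow{0}),
\]
so \(\phi\) is not one-to-one. Finally, \(\phi\left(x_{1}, x_{2}\right)=\left(x_{1}+2 x_{2}\right) \vec{v}_{1},\) so the range of \(\phi\) consists of multiples of \((1,2)\) only; e.g., \((1,0)\) is not a value of \(\phi,\) and \(\phi\) is not onto \(E^{2}\).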
Note 3. Maps that are both onto and one-to-one are called bijective. Such is \(\phi\) in Theorem 1. This means that the equation
\[\phi(\vec{x})=\vec{y}\]
has a unique solution
\[\vec{x}=\phi^{-1}(\vec{y})\]
for each \(\vec{y}.\) Componentwise, by Theorem 1, the equations
\[\sum_{k=1}^{n} x_{k} v_{i k}=y_{i}, \quad i=1, \ldots, n,\]
have a unique solution for the \(x_{k}\) iff \(\operatorname{det}\left(v_{i k}\right) \neq 0\).
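For instance (with coefficients chosen here purely for illustration), the system
\[
x_{1}+2 x_{2}=y_{1}, \quad 3 x_{1}+4 x_{2}=y_{2}
\]
has \(\operatorname{det}\left(v_{i k}\right)=1 \cdot 4-2 \cdot 3=-2 \neq 0,\) so it should be uniquely solvable for every \(\left(y_{1}, y_{2}\right)\). Eliminating \(x_{2}\) (twice the first equation minus the second) gives \(-x_{1}=2 y_{1}-y_{2},\) and back-substitution yields
\[
x_{1}=-2 y_{1}+y_{2}, \quad x_{2}=\frac{3}{2} y_{1}-\frac{1}{2} y_{2}.
\]
As a check, \(x_{1}+2 x_{2}=\left(-2 y_{1}+y_{2}\right)+\left(3 y_{1}-y_{2}\right)=y_{1}\) and \(3 x_{1}+4 x_{2}=\left(-6 y_{1}+3 y_{2}\right)+\left(6 y_{1}-2 y_{2}\right)=y_{2}\).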
Corollary 2. If \(\phi \in L\left(E^{\prime}, E\right)\) is bijective, with \(E^{\prime}\) and \(E\) complete, then \(\phi^{-1} \in L\left(E, E^{\prime}\right)\).
Proof for \(E=E^{n}\left(C^{n}\right)\).
The notation \(\phi \in L\left(E^{\prime}, E\right)\) means that \(\phi : E^{\prime} \rightarrow E\) is linear and continuous.
As \(\phi\) is bijective, \(\phi^{-1} : E \rightarrow E^{\prime}\) is linear (Problem 12).
If \(E=E^{n}\left(C^{n}\right),\) it is continuous, too (Theorem 2 in §2).
Thus \(\phi^{-1} \in L\left(E, E^{\prime}\right). \quad \square\)
Note. The case \(E=E^{n}\left(C^{n}\right)\) suffices for an undergraduate course. (The beginner is advised to omit the "starred" §8.) Corollary 2 and Theorem 2 below, however, are valid in the general case. So is Theorem 1 in §7.
Theorem 2. Let \(E, E^{\prime},\) and \(\phi\) be as in Corollary 2. Set
\[\left\|\phi^{-1}\right\|=\frac{1}{\varepsilon}.\]
Then any map \(\theta \in L\left(E^{\prime}, E\right)\) with \(\|\theta-\phi\|<\varepsilon\) is one-to-one, and \(\theta^{-1}\) is uniformly continuous.
Proof. By Corollary 2, \(\phi^{-1} \in L\left(E, E^{\prime}\right),\) so \(\left\|\phi^{-1}\right\|\) is defined and \(>0\) (for \(\phi^{-1}\) is not the zero map, being one-to-one).
Thus we may set
\[\varepsilon=\frac{1}{\left\|\phi^{-1}\right\|}, \quad\left\|\phi^{-1}\right\|=\frac{1}{\varepsilon}.\]
Clearly \(\vec{x}=\phi^{-1}(\vec{y})\) if \(\vec{y}=\phi(\vec{x}).\) Also,
\[\left|\phi^{-1}(\vec{y})\right| \leq \frac{1}{\varepsilon}|\vec{y}|\]
by Note 5 in §2. Hence
\[|\vec{y}| \geq \varepsilon\left|\phi^{-1}(\vec{y})\right|,\]
i.e.,
\[|\phi(\vec{x})| \geq \varepsilon|\vec{x}| \tag{2}\]
for all \(\vec{x} \in E^{\prime}\) and \(\vec{y} \in E\).
Now suppose \(\theta \in L\left(E^{\prime}, E\right)\) and \(\|\theta-\phi\|=\sigma<\varepsilon\).
Obviously, \(\theta=\phi-(\phi-\theta),\) and by Note 5 in §2,
\[|(\phi-\theta)(\vec{x})| \leq\|\phi-\theta\||\vec{x}|=\sigma|\vec{x}|.\]
Thus for every \(\vec{x} \in E^{\prime}\),
\[\begin{aligned}|\theta(\vec{x})| & \geq|\phi(\vec{x})|-|(\phi-\theta)(\vec{x})| \\ & \geq|\phi(\vec{x})|-\sigma|\vec{x}| \\ & \geq(\varepsilon-\sigma)|\vec{x}| \end{aligned} \tag{3}\]
by (2). Therefore, given \(\vec{p} \neq \vec{r}\) in \(E^{\prime}\) and setting \(\vec{x}=\vec{p}-\vec{r} \neq \overrightarrow{0},\) we obtain
\[|\theta(\vec{p})-\theta(\vec{r})|=|\theta(\vec{p}-\vec{r})|=|\theta(\vec{x})| \geq(\varepsilon-\sigma)|\vec{x}|>0 \tag{4}\]
(since \(\sigma<\varepsilon\)).
We see that \(\vec{p} \neq \vec{r}\) implies \(\theta(\vec{p}) \neq \theta(\vec{r});\) so \(\theta\) is one-to-one, indeed.
Also, setting \(\theta(\vec{x})=\vec{z}\) and \(\vec{x}=\theta^{-1}(\vec{z})\) in (3), we get
\[|\vec{z}| \geq(\varepsilon-\sigma)\left|\theta^{-1}(\vec{z})\right|;\]
that is,
\[\left|\theta^{-1}(\vec{z})\right| \leq(\varepsilon-\sigma)^{-1}|\vec{z}| \tag{5}\]
for all \(\vec{z}\) in the range of \(\theta\) (domain of \(\theta^{-1}\)).
Thus \(\theta^{-1}\) is linearly bounded (by Theorem 1 in §2), hence uniformly continuous, as claimed.\(\quad \square\)
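A simple family of maps suggests that the bound \(\|\theta-\phi\|<\varepsilon\) in Theorem 2 cannot be relaxed. This is only a sketch under stated assumptions: the maps are chosen here for illustration, and \(\|\cdot\|\) is taken to be the norm \(\|\phi\|=\sup _{|\vec{x}| \leq 1}|\phi(\vec{x})|\) of §2. Let \(E^{\prime}=E=E^{2}\) and let \(\phi\) be the identity map, so \(\left\|\phi^{-1}\right\|=1\) and \(\varepsilon=1\). For \(0 \leq t \leq 1,\) define
\[
\theta_{t}\left(x_{1}, x_{2}\right)=\left((1-t) x_{1}, x_{2}\right), \quad \text { so that } \quad\left(\theta_{t}-\phi\right)\left(x_{1}, x_{2}\right)=\left(-t x_{1}, 0\right) \text { and } \left\|\theta_{t}-\phi\right\|=t=\sigma.
\]
If \(t<1,\) then
\[
\left|\theta_{t}(\vec{x})\right|^{2}=(1-t)^{2} x_{1}^{2}+x_{2}^{2} \geq(1-t)^{2}|\vec{x}|^{2}, \quad \text { i.e., } \quad\left|\theta_{t}(\vec{x})\right| \geq(\varepsilon-\sigma)|\vec{x}|,
\]
so \(\theta_{t}\) is one-to-one, with \(\theta_{t}^{-1}\left(z_{1}, z_{2}\right)=\left(z_{1} /(1-t), z_{2}\right)\) and \(\left|\theta_{t}^{-1}(\vec{z})\right| \leq(1-t)^{-1}|\vec{z}|\). If \(t=1,\) however, \(\left\|\theta_{1}-\phi\right\|=1=\varepsilon\) and \(\theta_{1}\left(x_{1}, x_{2}\right)=\left(0, x_{2}\right)\) is not one-to-one.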
Corollary 3. If \(E^{\prime}=E=E^{n}\left(C^{n}\right)\) in Theorem 2 above, then for given \(\phi\) and \(\delta>0,\) there always is \(\delta^{\prime}>0\) such that
\[\|\theta-\phi\|<\delta^{\prime} \text { implies }\left\|\theta^{-1}-\phi^{-1}\right\|<\delta.\]
In other words, the transformation \(\phi \rightarrow \phi^{-1}\) is continuous on \(L(E)\), \(E=E^{n}\left(C^{n}\right)\).
Proof. First, since \(E^{\prime}=E=E^{n}\left(C^{n}\right)\), \(\theta\) is bijective by Theorem 1(iii), so \(\theta^{-1} \in L(E)\).
As before, set \(\|\theta-\phi\|=\sigma<\varepsilon\).
By Note 5 in §2, formula (5) above implies that
\[\left\|\theta^{-1}\right\| \leq \frac{1}{\varepsilon-\sigma}.\]
Also,
\[\phi^{-1} \circ(\theta-\phi) \circ \theta^{-1}=\phi^{-1}-\theta^{-1}\]
(see Problem 11).
Hence by Corollary 4 in §2, recalling that \(\left\|\phi^{-1}\right\|=1 / \varepsilon,\) we get
\[\left\|\theta^{-1}-\phi^{-1}\right\| \leq\left\|\phi^{-1}\right\| \cdot\|\theta-\phi\| \cdot\left\|\theta^{-1}\right\| \leq \frac{\sigma}{\varepsilon(\varepsilon-\sigma)} \rightarrow 0 \text { as } \sigma \rightarrow 0. \quad \square\]
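The final estimate also suggests an explicit choice of \(\delta^{\prime}\). The following is a sketch derived from the bound just obtained, not a formula stated in the text. For \(0 \leq \sigma<\varepsilon\),
\[
\frac{\sigma}{\varepsilon(\varepsilon-\sigma)}<\delta \Longleftrightarrow \sigma<\delta \varepsilon(\varepsilon-\sigma) \Longleftrightarrow \sigma(1+\delta \varepsilon)<\delta \varepsilon^{2} \Longleftrightarrow \sigma<\frac{\delta \varepsilon^{2}}{1+\delta \varepsilon},
\]
and the last bound is automatically less than \(\varepsilon\). So one may take \(\delta^{\prime}=\delta \varepsilon^{2} /(1+\delta \varepsilon)\). As an illustration (with maps chosen here for the purpose), let \(\phi\) be the identity on \(E^{2},\) so \(\varepsilon=1,\) and let \(\theta\left(x_{1}, x_{2}\right)=\left((1-t) x_{1}, x_{2}\right)\) with \(0 \leq t<1,\) so \(\sigma=\|\theta-\phi\|=t\). Then
\[
\left(\theta^{-1}-\phi^{-1}\right)\left(z_{1}, z_{2}\right)=\left(\frac{t}{1-t} z_{1}, 0\right), \quad\left\|\theta^{-1}-\phi^{-1}\right\|=\frac{t}{1-t}=\frac{\sigma}{\varepsilon(\varepsilon-\sigma)},
\]
so the estimate holds with equality here, and \(\left\|\theta^{-1}-\phi^{-1}\right\|<\delta\) exactly when \(t<\delta /(1+\delta)\).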