
6.6: Determinants. Jacobians. Bijective Linear Operators


    We assume the reader to be familiar with elements of linear algebra. Thus we only briefly recall some definitions and well-known rules.


    Given a linear operator \(\phi : E^{n} \rightarrow E^{n}\left(\text { or } \phi : C^{n} \rightarrow C^{n}\right),\) with matrix
\[[\phi]=\left(v_{i k}\right), \quad i, k=1, \ldots, n,\]
    we define the determinant of \([\phi]\) by
\[\begin{aligned} \operatorname{det}[\phi]=\operatorname{det}\left(v_{i k}\right) &=\left|\begin{array}{cccc}{v_{11}} & {v_{12}} & {\dots} & {v_{1 n}} \\ {v_{21}} & {v_{22}} & {\dots} & {v_{2 n}} \\ {\vdots} & {\vdots} & {\ddots} & {\vdots} \\ {v_{n 1}} & {v_{n 2}} & {\dots} & {v_{n n}}\end{array}\right| \\[12pt] &=\sum(-1)^{\lambda} v_{1 k_{1}} v_{2 k_{2}} \ldots v_{n k_{n}} \end{aligned} \tag{1}\]
    where the sum is over all ordered \(n\)-tuples \(\left(k_{1}, \ldots, k_{n}\right)\) of distinct integers \(k_{j}\left(1 \leq k_{j} \leq n\right),\) and
\[\lambda=\left\{\begin{array}{ll}{0} & {\text { if } \prod_{j<m}\left(k_{m}-k_{j}\right)>0 \text { and }} \\ {1} & {\text { if } \prod_{j<m}\left(k_{m}-k_{j}\right)<0.}\end{array}\right.\]
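As a numerical sanity check, the permutation-sum definition of the determinant can be coded directly. The sketch below is our own illustration (names are not from the text); it determines the sign \((-1)^{\lambda}\) by counting inverted pairs in \((k_{1}, \ldots, k_{n})\) and agrees with numpy's built-in determinant.

```python
from itertools import permutations

import numpy as np


def det_by_permutations(a):
    """Determinant as the signed sum over all orderings (k_1, ..., k_n)."""
    n = len(a)
    total = 0.0
    for ks in permutations(range(n)):
        # lambda = 0 iff prod_{j<m} (k_m - k_j) > 0, i.e. the ordering has
        # an even number of inverted pairs; each inversion flips the sign.
        sign = 1
        for m in range(n):
            for j in range(m):
                if ks[j] > ks[m]:
                    sign = -sign
        total += sign * np.prod([a[i, ks[i]] for i in range(n)])
    return total


a = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 4.0]])
print(det_by_permutations(a))   # agrees with np.linalg.det(a)
```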

    Recall (Problem 12 in §2) that a set \(B=\left\{\vec{v}_{1}, \vec{v}_{2}, \ldots, \vec{v}_{n}\right\}\) in a vector space \(E\) is a basis iff
    (i) \(B\) spans \(E,\) i.e., each \(\vec{v} \in E\) has the form
\[\vec{v}=\sum_{i=1}^{n} a_{i} \vec{v}_{i}\]
    for some scalars \(a_{i},\) and
    (ii) this representation is unique.
    The latter is true iff the \(\vec{v}_{i}\) are independent, i.e.,
\[\sum_{i=1}^{n} a_{i} \vec{v}_{i}=\overrightarrow{0} \Longleftrightarrow a_{i}=0, i=1, \ldots, n.\]
    If \(E\) has a basis of \(n\) vectors, we call \(E\) n-dimensional (e.g., \(E^{n}\) and \(C^{n} )\).
    Determinants and bases satisfy the following rules.
    (a) Multiplication rule. If \(\phi, g : E^{n} \rightarrow E^{n}\left(\text { or } C^{n} \rightarrow C^{n}\right)\) are linear, then
\[\operatorname{det}[g] \cdot \operatorname{det}[\phi]=\operatorname{det}([g][\phi])=\operatorname{det}[g \circ \phi]\]
    (see §2, Theorem 3 and Note 4).
    (b) If \(\phi(\vec{x})=\vec{x}\) (identity map), then \([\phi]=\left(v_{i k}\right)\), where
\[v_{i k}=\left\{\begin{array}{ll}{0} & {\text { if } i \neq k \text { and }} \\ {1} & {\text { if } i=k;}\end{array}\right.\]
hence \(\operatorname{det}[\phi]=1\). (Why?) See also the Problems.
    (c) An \(n\) -dimensional space \(E\) is spanned by a set of \(n\) vectors iff they are independent. If so, each basis consists of exactly \(n\) vectors.
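Rules (a) and (b) are easy to verify numerically for concrete matrices; the following sketch (ours, not the author's) uses numpy:

```python
import numpy as np

rng = np.random.default_rng(0)
g = rng.standard_normal((4, 4))     # matrix [g] of a linear map
phi = rng.standard_normal((4, 4))   # matrix [phi] of another

# Rule (a): det[g] * det[phi] = det([g][phi]).
lhs = np.linalg.det(g) * np.linalg.det(phi)
rhs = np.linalg.det(g @ phi)
print(abs(lhs - rhs) < 1e-9)        # True

# Rule (b): the identity map has the identity matrix, with determinant 1.
print(abs(np.linalg.det(np.eye(4)) - 1.0) < 1e-12)  # True
```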


    For any function \(f : E^{n} \rightarrow E^{n}\) (or \(f : C^{n} \rightarrow C^{n} ),\) we define the \(f\)-induced Jacobian map \(J_{f} : E^{n} \rightarrow E^{1}\left(J_{f} : C^{n} \rightarrow C\right)\) by setting
\[J_{f}(\vec{x})=\operatorname{det}\left(v_{i k}\right),\]
    where \(v_{i k}=D_{k} f_{i}(\vec{x}), \vec{x} \in E^{n}\left(C^{n}\right),\) and \(f=\left(f_{1}, \ldots, f_{n}\right)\).
    The determinant
\[J_{f}(\vec{p})=\operatorname{det}\left(D_{k} f_{i}(\vec{p})\right)\]
    is called the Jacobian of \(f\) at \(\vec{p}\).
    By our conventions, it is always defined, as are the functions \(D_{k} f_{i}\).

    Explicitly, \(J_{f}(\vec{p})\) is the determinant of the right-side matrix in formula \((14)\) in §3. Briefly,

\[J_{f}=\operatorname{det}\left(D_{k} f_{i}\right).\]
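In code, the matrix \(\left(D_{k} f_{i}\right)\) can be approximated by difference quotients; the Python sketch below (our illustration, with a hypothetical helper name) builds it column by column and takes the determinant.

```python
import numpy as np


def jacobian_det(f, p, h=1e-6):
    """Approximate J_f(p) = det(D_k f_i(p)) by central differences."""
    p = np.asarray(p, dtype=float)
    n = p.size
    jac = np.empty((n, n))
    for k in range(n):
        e = np.zeros(n)
        e[k] = h
        # Column k holds the partials D_k f_i of all components f_i.
        jac[:, k] = (np.asarray(f(p + e)) - np.asarray(f(p - e))) / (2 * h)
    return np.linalg.det(jac)


# Example map f(x, y) = (x^2 - y, x*y); by hand, J_f = 2x^2 + y.
f = lambda v: (v[0] ** 2 - v[1], v[0] * v[1])
print(jacobian_det(f, (1.0, 2.0)))   # close to 2*1 + 2 = 4
```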

By Definition 2 and Note 2 in §5,

\[J_{f}(\vec{p})=\operatorname{det}\left[d^{1} f(\vec{p} ; \cdot)\right].\]

If \(f\) is differentiable at \(\vec{p}\), this is also \(\operatorname{det}\left[f^{\prime}(\vec{p})\right]\).


    Note 1. More generally, given any functions \(v_{i k} : E^{\prime} \rightarrow E^{1}(C),\) we can define a map \(f : E^{\prime} \rightarrow E^{1}(C)\) by

\[f(\vec{x})=\operatorname{det}\left(v_{i k}(\vec{x})\right);\]

    briefly \(f=\operatorname{det}\left(v_{i k}\right), i, k=1, \ldots, n\).

    We then call \(f\) a functional determinant.

    If \(E^{\prime}=E^{n}\left(C^{n}\right)\) then \(f\) is a function of \(n\) variables, since \(\vec{x}=\left(x_{1}, x_{2}, \ldots, x_{n}\right)\). If all \(v_{i k}\) are continuous or differentiable at some \(\vec{p} \in E^{\prime},\) so is \(f ;\) for by \((1), f\) is a finite sum of functions of the form

\[(-1)^{\lambda} v_{1 k_{1}} v_{2 k_{2}} \dots v_{n k_{n}},\]

    and each of these is continuous or differentiable if the \(v_{i k}\) are (see Problems 7 and 8 in §3).

Note 2. Hence the Jacobian map \(J_{f}\) is continuous or differentiable at \(\vec{p}\) if all the partially derived functions \(D_{k} f_{i}\) \((i, k \leq n)\) are.

    If, in addition, \(J_{f}(\vec{p}) \neq 0,\) then \(J_{f} \neq 0\) on some globe about \(\vec{p}.\) (Apply Problem 7 in Chapter 4, §2, to \(\left|J_{f}\right|.)\)

    In classical notation, one writes

\[\frac{\partial\left(f_{1}, \ldots, f_{n}\right)}{\partial\left(x_{1}, \ldots, x_{n}\right)} \text { or } \frac{\partial\left(y_{1}, \ldots, y_{n}\right)}{\partial\left(x_{1}, \ldots, x_{n}\right)}\]

    for \(J_{f}(\vec{x}) .\) Here \(\left(y_{1}, \ldots, y_{n}\right)=f\left(x_{1}, \ldots, x_{n}\right)\).

    The remarks made in §4 apply to this "variable" notation too. The chain rule easily yields the following corollary.

    Corollary \(\PageIndex{1}\)

    If \(f : E^{n} \rightarrow E^{n}\) and \(g : E^{n} \rightarrow E^{n}\) (or \(f, g : C^{n} \rightarrow C^{n})\) are differentiable at \(\vec{p}\) and \(\vec{q}=f(\vec{p}),\) respectively, and if

\[h=g \circ f,\]

then

(i)

\[J_{h}(\vec{p})=J_{g}(\vec{q}) \cdot J_{f}(\vec{p})=\operatorname{det}\left(z_{i k}\right),\]

where

\[z_{i k}=D_{k} h_{i}(\vec{p}), \quad i, k=1, \ldots, n;\]

or, (ii) setting

    \[\begin{aligned}\left(u_{1}, \ldots, u_{n}\right) &=g\left(y_{1}, \ldots, y_{n}\right) \text { and } \\\left(y_{1}, \ldots, y_{n}\right) &=f\left(x_{1}, \ldots, x_{n}\right) \text { ("variables")}, \end{aligned}\]

    we have

\[\frac{\partial\left(u_{1}, \ldots, u_{n}\right)}{\partial\left(x_{1}, \ldots, x_{n}\right)}=\frac{\partial\left(u_{1}, \ldots, u_{n}\right)}{\partial\left(y_{1}, \ldots, y_{n}\right)} \cdot \frac{\partial\left(y_{1}, \ldots, y_{n}\right)}{\partial\left(x_{1}, \ldots, x_{n}\right)}=\operatorname{det}\left(z_{i k}\right),\]

where

\[z_{i k}=\frac{\partial u_{i}}{\partial x_{k}}, \quad i, k=1, \ldots, n.\]


Proof. By Note 2 in §4,

    \[\left[h^{\prime}(\vec{p})\right]=\left[g^{\prime}(\vec{q})\right] \cdot\left[f^{\prime}(\vec{p})\right].\]

    Thus by rule (a) above,

\[\operatorname{det}\left[h^{\prime}(\vec{p})\right]=\operatorname{det}\left[g^{\prime}(\vec{q})\right] \cdot \operatorname{det}\left[f^{\prime}(\vec{p})\right],\]

i.e.,

\[J_{h}(\vec{p})=J_{g}(\vec{q}) \cdot J_{f}(\vec{p}).\]

    Also, if \(\left[h^{\prime}(\vec{p})\right]=\left(z_{i k}\right),\) Definition 2 yields \(z_{i k}=D_{k} h_{i}(\vec{p})\).

    This proves (i), hence (ii) also. \(\quad \square\)
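A quick numerical instance of Corollary 1 (a hypothetical example of ours, not from the text): take \(f(x, y)=\left(x+y^{2},\ x y\right)\) and \(g(u, v)=(u v,\ u-v)\), whose partials are easy to write down, and compare \(J_{h}(\vec{p})\) with \(J_{g}(\vec{q}) \cdot J_{f}(\vec{p})\) at \(\vec{p}=(1,2)\).

```python
import numpy as np

# f(x, y) = (x + y^2, x*y),  g(u, v) = (u*v, u - v),  h = g o f.
p = np.array([1.0, 2.0])
q = np.array([p[0] + p[1] ** 2, p[0] * p[1]])              # q = f(p) = (5, 2)

Jf = np.linalg.det(np.array([[1.0, 2 * p[1]],
                             [p[1], p[0]]]))               # det(D_k f_i)(p)
Jg = np.linalg.det(np.array([[q[1], q[0]],
                             [1.0, -1.0]]))                # det(D_k g_i)(q)

# Partials of h = g o f, differentiated by hand:
#   h_1 = x^2 y + x y^3,   h_2 = x + y^2 - x y.
Jh = np.linalg.det(np.array(
    [[2 * p[0] * p[1] + p[1] ** 3, p[0] ** 2 + 3 * p[0] * p[1] ** 2],
     [1 - p[1], 2 * p[1] - p[0]]]))

print(abs(Jh - Jg * Jf) < 1e-9)   # True: J_h(p) = J_g(q) * J_f(p)
```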

    In practice, Jacobians mostly occur when a change of variables is made. For instance, in \(E^{2},\) we may pass from Cartesian coordinates \((x, y)\) to another system \((u, v)\) such that

    \[x=f_{1}(u, v) \text { and } y=f_{2}(u, v).\]

    We then set \(f=\left(f_{1}, f_{2}\right)\) and obtain \(f : E^{2} \rightarrow E^{2}\),

    \[J_{f}=\operatorname{det}\left(D_{k} f_{i}\right), \quad k, i=1,2.\]

    Example (passage to polar coordinates)

    Let \(x=f_{1}(r, \theta)=r \cos \theta\) and \(y=f_{2}(r, \theta)=r \sin \theta\).

    Then using the "variable" notation, we obtain \(J_{f}(r, \theta)\) as

    \[\begin{aligned} \frac{\partial(x, y)}{\partial(r, \theta)}=\left|\begin{array}{ll}{\frac{\partial x}{\partial r}} & {\frac{\partial x}{\partial \theta}} \\ {\frac{\partial y}{\partial r}} & {\frac{\partial y}{\partial \theta}}\end{array}\right| &=\left|\begin{array}{cc}{\cos \theta} & {-r \sin \theta} \\ {\sin \theta} & {r \cos \theta}\end{array}\right| \\ &=r \cos ^{2} \theta+r \sin ^{2} \theta=r. \end{aligned}\]

    Thus here \(J_{f}(r, \theta)=r\) for all \(r, \theta \in E^{1} ; J_{f}\) is independent of \(\theta\).
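The result \(J_{f}(r, \theta)=r\) is easy to confirm numerically; the sketch below (our illustration) approximates the partials by central differences at a few sample points.

```python
import math

import numpy as np


def jacobian_det(f, p, h=1e-6):
    """Approximate det(D_k f_i(p)) by central differences."""
    p = np.asarray(p, dtype=float)
    n = p.size
    jac = np.empty((n, n))
    for k in range(n):
        e = np.zeros(n)
        e[k] = h
        jac[:, k] = (np.asarray(f(p + e)) - np.asarray(f(p - e))) / (2 * h)
    return np.linalg.det(jac)


# Polar-to-Cartesian map: x = r cos(theta), y = r sin(theta).
polar = lambda v: (v[0] * math.cos(v[1]), v[0] * math.sin(v[1]))

for r, theta in [(1.0, 0.3), (2.5, 2.0), (0.5, -1.2)]:
    print(abs(jacobian_det(polar, (r, theta)) - r) < 1e-5)  # True each time
```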

    We now concentrate on one-to-one (invertible) functions.

    Theorem \(\PageIndex{1}\)

For a linear map \(\phi : E^{n} \rightarrow E^{n}\left(\text { or } \phi : C^{n} \rightarrow C^{n}\right),\) the following are equivalent:

    (i) \(\phi\) is one-to-one;

    (ii) the column vectors \(\vec{v}_{1}, \ldots, \vec{v}_{n}\) of the matrix \([\phi]\) are independent;

    (iii) \(\phi\) is onto \(E^{n}\left(C^{n}\right)\);

    (iv) \(\operatorname{det}[\phi] \neq 0\).


Proof. Assume (i) and let

    \[\sum_{k=1}^{n} c_{k} \vec{v}_{k}=\overrightarrow{0}.\]

    To deduce (ii), we must show that all \(c_{k}\) vanish.

    Now, by Note 3 in §2, \(\vec{v}_{k}=\phi\left(\vec{e}_{k}\right);\) so by linearity,

\[\sum_{k=1}^{n} c_{k} \vec{v}_{k}=\overrightarrow{0}\]

becomes

\[\phi\left(\sum_{k=1}^{n} c_{k} \vec{e}_{k}\right)=\overrightarrow{0}.\]

    As \(\phi\) is one-to-one, it can vanish at \(\overrightarrow{0}\) only. Thus

    \[\sum_{k=1}^{n} c_{k} \vec{e}_{k}=\overrightarrow{0}.\]

    Hence by Theorem 2 in Chapter 3, §§1-3, \(c_{k}=0, k=1, \ldots, n,\) and (ii) follows.

    Next, assume (ii); so, by rule (c) above, \(\left\{\vec{v}_{1}, \ldots, \vec{v}_{n}\right\}\) is a basis.

    Thus each \(\vec{y} \in E^{n}\left(C^{n}\right)\) has the form

\[\vec{y}=\sum_{k=1}^{n} a_{k} \vec{v}_{k}=\sum_{k=1}^{n} a_{k} \phi\left(\vec{e}_{k}\right)=\phi\left(\sum_{k=1}^{n} a_{k} \vec{e}_{k}\right)=\phi(\vec{x}),\]

where

\[\vec{x}=\sum_{k=1}^{n} a_{k} \vec{e}_{k} \text { (uniquely).}\]

    Hence (ii) implies both (iii) and (i). (Why?)

    Now assume (iii). Then each \(\vec{y} \in E^{n}\left(C^{n}\right)\) has the form \(\vec{y}=\phi(\vec{x}),\) where

    \[\vec{x}=\sum_{k=1}^{n} x_{k} \vec{e}_{k},\]

    by Theorem 2 in Chapter 3, §§1-3. Hence again

    \[\vec{y}=\sum_{k=1}^{n} x_{k} \phi\left(\vec{e}_{k}\right)=\sum_{k=1}^{n} x_{k} \vec{v}_{k};\]

    so the \(\vec{v}_{k}\) span all of \(E^{n}\left(C^{n}\right).\) By rule (c) above, this implies (ii), hence (i), too. Thus (i), (ii), and (iii) are equivalent.

    Also, by rules (a) and (b), we have

    \[\operatorname{det}[\phi] \cdot \operatorname{det}\left[\phi^{-1}\right]=\operatorname{det}\left[\phi \circ \phi^{-1}\right]=1\]

    if \(\phi\) is one-to-one (for \(\phi \circ \phi^{-1}\) is the identity map). Hence \(\operatorname{det}[\phi] \neq 0\) if (i) holds.

    For the converse, suppose \(\phi\) is not one-to-one. Then by (ii), the \(\vec{v}_{k}\) are not independent. Thus one of them is a linear combination of the others, say,

    \[\vec{v}_{1}=\sum_{k=2}^{n} a_{k} \vec{v}_{k}.\]

    But by linear algebra (Problem 13(iii)), \(\operatorname{det}[\phi]\) does not change if \(\vec{v}_{1}\) is replaced by

    \[\vec{v}_{1}-\sum_{k=2}^{n} a_{k} \vec{v}_{k}=\overrightarrow{0}.\]

    Thus \(\operatorname{det}[\phi]=0\) (one column turning to \(\overrightarrow{0}).\) This completes the proof. \(\quad \square\)

Note 3. Maps that are both onto and one-to-one are called bijective. Such is \(\phi\) in Theorem 1. This means that the equation

\[\phi(\vec{x})=\vec{y}\]

has a unique solution

\[\vec{x}=\phi^{-1}(\vec{y})\]

for each \(\vec{y}.\) Componentwise, by Theorem 1, the equations

    \[\sum_{k=1}^{n} x_{k} v_{i k}=y_{i}, \quad i=1, \ldots, n,\]

    have a unique solution for the \(x_{k}\) iff \(\operatorname{det}\left(v_{i k}\right) \neq 0\).
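Concretely, this is the solvability criterion for linear systems; a small numpy sketch (our illustration):

```python
import numpy as np

v = np.array([[2.0, 1.0],
              [1.0, 3.0]])        # matrix (v_ik) of phi
y = np.array([4.0, 7.0])

# det(v_ik) = 5 != 0, so by Theorem 1 the system sum_k x_k v_ik = y_i
# has exactly one solution x.
assert np.linalg.det(v) != 0
x = np.linalg.solve(v, y)
print(x, np.allclose(v @ x, y))   # x = (1, 2), and phi(x) = y
```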

    Corollary \(\PageIndex{2}\)

    If \(\phi \in L\left(E^{\prime}, E\right)\) is bijective, with \(E^{\prime}\) and \(E\) complete, then \(\phi^{-1} \in L\left(E, E^{\prime}\right).\)

    Proof for \(E=E^{n}\left(C^{n}\right)\)

    The notation \(\phi \in L\left(E^{\prime}, E\right)\) means that \(\phi : E^{\prime} \rightarrow E\) is linear and continuous.

    As \(\phi\) is bijective, \(\phi^{-1} : E \rightarrow E^{\prime}\) is linear (Problem 12).

    If \(E=E^{n}\left(C^{n}\right),\) it is continuous, too (Theorem 2 in §2).

    Thus \(\phi^{-1} \in L\left(E, E^{\prime}\right). \quad \square\)

    Note. The case \(E=E^{n}\left(C^{n}\right)\) suffices for an undergraduate course. (The beginner is advised to omit the "starred" §8.) Corollary 2 and Theorem 2 below, however, are valid in the general case. So is Theorem 1 in §7.

    Theorem \(\PageIndex{2}\)

Let \(E, E^{\prime}\) and \(\phi\) be as in Corollary 2. Set

\[\varepsilon=\frac{1}{\left\|\phi^{-1}\right\|}.\]

Then any map \(\theta \in L\left(E^{\prime}, E\right)\) with \(\|\theta-\phi\|<\varepsilon\) is one-to-one, and \(\theta^{-1}\) is uniformly continuous.


    Proof. By Corollary 2, \(\phi^{-1} \in L\left(E, E^{\prime}\right),\) so \(\left\|\phi^{-1}\right\|\) is defined and \(>0\) (for \(\phi^{-1}\) is not the zero map, being one-to-one).

    Thus we may set

\[\varepsilon=\frac{1}{\left\|\phi^{-1}\right\|}, \text { i.e., }\left\|\phi^{-1}\right\|=\frac{1}{\varepsilon}.\]

    Clearly \(\vec{x}=\phi^{-1}(\vec{y})\) if \(\vec{y}=\phi(\vec{x}).\) Also,

    \[\left|\phi^{-1}(\vec{y})\right| \leq \frac{1}{\varepsilon}|\vec{y}|\]

by Note 5 in §2. Hence

\[|\vec{y}| \geq \varepsilon\left|\phi^{-1}(\vec{y})\right|,\]

i.e.,

\[|\phi(\vec{x})| \geq \varepsilon|\vec{x}| \tag{2}\]

    for all \(\vec{x} \in E^{\prime}\) and \(\vec{y} \in E\).

Now suppose \(\theta \in L\left(E^{\prime}, E\right)\) and \(\|\theta-\phi\|=\sigma<\varepsilon\).

    Obviously, \(\theta=\phi-(\phi-\theta),\) and by Note 5 in §2,

    \[|(\phi-\theta)(\vec{x})| \leq\|\phi-\theta\||\vec{x}|=\sigma|\vec{x}|.\]

    Thus for every \(\vec{x} \in E^{\prime}\),

\[\begin{aligned}|\theta(\vec{x})| & \geq|\phi(\vec{x})|-|(\phi-\theta)(\vec{x})| \\ & \geq|\phi(\vec{x})|-\sigma|\vec{x}| \\ & \geq(\varepsilon-\sigma)|\vec{x}| \end{aligned} \tag{3}\]

    by (2). Therefore, given \(\vec{p} \neq \vec{r}\) in \(E^{\prime}\) and setting \(\vec{x}=\vec{p}-\vec{r} \neq \overrightarrow{0},\) we obtain

    \[|\theta(\vec{p})-\theta(\vec{r})|=|\theta(\vec{p}-\vec{r})|=|\theta(\vec{x})| \geq(\varepsilon-\sigma)|\vec{x}|>0\]

(since \(\sigma<\varepsilon\)).

    We see that \(\vec{p} \neq \vec{r}\) implies \(\theta(\vec{p}) \neq \theta(\vec{r});\) so \(\theta\) is one-to-one, indeed.

    Also, setting \(\theta(\vec{x})=\vec{z}\) and \(\vec{x}=\theta^{-1}(\vec{z})\) in (3), we get

\[|\vec{z}| \geq(\varepsilon-\sigma)\left|\theta^{-1}(\vec{z})\right|; \tag{4}\]

    that is,

\[\left|\theta^{-1}(\vec{z})\right| \leq(\varepsilon-\sigma)^{-1}|\vec{z}| \tag{5}\]

    for all \(\vec{z}\) in the range of \(\theta\) (domain of \(\theta^{-1})\).

    Thus \(\theta^{-1}\) is linearly bounded (by Theorem 1 in §2), hence uniformly continuous, as claimed.\(\quad \square\)
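For \(E^{\prime}=E=E^{n}\) with the operator (spectral) norm, the proof's bounds can be observed directly; a sketch under these assumptions (matrices and seed are our own):

```python
import numpy as np

rng = np.random.default_rng(1)
phi = rng.standard_normal((3, 3))            # assume invertible (true here)
opnorm = lambda a: np.linalg.norm(a, 2)      # operator norm on E^n

eps = 1 / opnorm(np.linalg.inv(phi))         # eps = 1 / ||phi^{-1}||

# Perturb phi by sigma = eps/2 < eps; theta is then still one-to-one.
delta = rng.standard_normal((3, 3))
delta *= 0.5 * eps / opnorm(delta)
theta = phi + delta
sigma = opnorm(theta - phi)

# As in the proof: ||theta^{-1}|| <= 1/(eps - sigma).
print(opnorm(np.linalg.inv(theta)) <= 1 / (eps - sigma) + 1e-12)  # True
```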

    Corollary \(\PageIndex{3}\)

    If \(E^{\prime}=E=E^{n}\left(C^{n}\right)\) in Theorem 2 above, then for given \(\phi\) and \(\delta>0,\) there always is \(\delta^{\prime}>0\) such that

    \[\|\theta-\phi\|<\delta^{\prime} \text { implies }\left\|\theta^{-1}-\phi^{-1}\right\|<\delta.\]

In other words, the transformation \(\phi \rightarrow \phi^{-1}\) is continuous on \(L(E), E=E^{n}\left(C^{n}\right).\)


Proof. First, since \(E^{\prime}=E=E^{n}\left(C^{n}\right),\) \(\theta\) is one-to-one by Theorem 2, hence bijective by Theorem 1(iii), so \(\theta^{-1} \in L(E)\).

    As before, set \(\|\theta-\phi\|=\sigma<\varepsilon\).

    By Note 5 in §2, formula (5) above implies that

    \[\left\|\theta^{-1}\right\| \leq \frac{1}{\varepsilon-\sigma}.\]


Also,

\[\phi^{-1} \circ(\theta-\phi) \circ \theta^{-1}=\phi^{-1}-\theta^{-1}\]

    (see Problem 11).

    Hence by Corollary 4 in §2, recalling that \(\left\|\phi^{-1}\right\|=1 / \varepsilon,\) we get

    \[\left\|\theta^{-1}-\phi^{-1}\right\| \leq\left\|\phi^{-1}\right\| \cdot\|\theta-\phi\| \cdot\left\|\theta^{-1}\right\| \leq \frac{\sigma}{\varepsilon(\varepsilon-\sigma)} \rightarrow 0 \text { as } \sigma \rightarrow 0. \quad \square\]
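The final bound, and the continuity of \(\phi \rightarrow \phi^{-1}\), can be watched numerically as \(\sigma \rightarrow 0\); a sketch (our illustration, same conventions as above):

```python
import numpy as np

rng = np.random.default_rng(2)
phi = rng.standard_normal((3, 3))
opnorm = lambda a: np.linalg.norm(a, 2)      # operator norm
eps = 1 / opnorm(np.linalg.inv(phi))

for scale in (0.5, 0.1, 0.01):               # shrink sigma toward 0
    delta = rng.standard_normal((3, 3))
    delta *= scale * eps / opnorm(delta)     # ||theta - phi|| = scale * eps
    theta = phi + delta
    sigma = opnorm(theta - phi)
    bound = sigma / (eps * (eps - sigma))
    err = opnorm(np.linalg.inv(theta) - np.linalg.inv(phi))
    print(err <= bound + 1e-12)              # True; err -> 0 with sigma
```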