Search

Text Color

Margin Size

Font Type

Enable Dyslexic Font

6.7: Inverse and Implicit Functions. Open and Closed Maps

Last updated

Sep 5, 2021
Save as PDF
- 6.6.E: Problems on Bijective Linear Maps and Jacobians
- 6.7.E: Problems on Inverse and Implicit Functions, Open and Closed Maps

Elias Zakon
University of Windsor via The Trilla Group (support by Saylor Foundation)

$\newcommand{\vecs}[1]{\overset { \scriptstyle \rightharpoonup} {\mathbf{#1}} }$

$\newcommand{\vecd}[1]{\overset{-\!-\!\rightharpoonup}{\vphantom{a}\smash {#1}}}$

$\newcommand{\id}{\mathrm{id}}$ $\newcommand{\Span}{\mathrm{span}}$

( \newcommand{\kernel}{\mathrm{null}\,}\) $\newcommand{\range}{\mathrm{range}\,}$

$\newcommand{\RealPart}{\mathrm{Re}}$ $\newcommand{\ImaginaryPart}{\mathrm{Im}}$

$\newcommand{\Argument}{\mathrm{Arg}}$ $\newcommand{\norm}[1]{\| #1 \|}$

$\newcommand{\inner}[2]{\langle #1, #2 \rangle}$

$\newcommand{\Span}{\mathrm{span}}$

$\newcommand{\id}{\mathrm{id}}$

$\newcommand{\Span}{\mathrm{span}}$

$\newcommand{\kernel}{\mathrm{null}\,}$

$\newcommand{\range}{\mathrm{range}\,}$

$\newcommand{\RealPart}{\mathrm{Re}}$

$\newcommand{\ImaginaryPart}{\mathrm{Im}}$

$\newcommand{\Argument}{\mathrm{Arg}}$

$\newcommand{\norm}[1]{\| #1 \|}$

$\newcommand{\inner}[2]{\langle #1, #2 \rangle}$

$\newcommand{\Span}{\mathrm{span}}$ $\newcommand{\AA}{\unicode[.8,0]{x212B}}$

$\newcommand{\vectorA}[1]{\vec{#1}} % arrow$

$\newcommand{\vectorAt}[1]{\vec{\text{#1}}} % arrow$

$\newcommand{\vectorB}[1]{\overset { \scriptstyle \rightharpoonup} {\mathbf{#1}} }$

$\newcommand{\vectorC}[1]{\textbf{#1}}$

$\newcommand{\vectorD}[1]{\overrightarrow{#1}}$

$\newcommand{\vectorDt}[1]{\overrightarrow{\text{#1}}}$

$\newcommand{\vectE}[1]{\overset{-\!-\!\rightharpoonup}{\vphantom{a}\smash{\mathbf {#1}}}}$

$\newcommand{\vecs}[1]{\overset { \scriptstyle \rightharpoonup} {\mathbf{#1}} }$

$\newcommand{\vecd}[1]{\overset{-\!-\!\rightharpoonup}{\vphantom{a}\smash {#1}}}$

$\newcommand{\avec}{\mathbf a}$

$\newcommand{\bvec}{\mathbf b}$

$\newcommand{\cvec}{\mathbf c}$

$\newcommand{\dvec}{\mathbf d}$

$\newcommand{\dtil}{\widetilde{\mathbf d}}$

$\newcommand{\evec}{\mathbf e}$

$\newcommand{\fvec}{\mathbf f}$

$\newcommand{\nvec}{\mathbf n}$

$\newcommand{\pvec}{\mathbf p}$

$\newcommand{\qvec}{\mathbf q}$

$\newcommand{\svec}{\mathbf s}$

$\newcommand{\tvec}{\mathbf t}$

$\newcommand{\uvec}{\mathbf u}$

$\newcommand{\vvec}{\mathbf v}$

$\newcommand{\wvec}{\mathbf w}$

$\newcommand{\xvec}{\mathbf x}$

$\newcommand{\yvec}{\mathbf y}$

$\newcommand{\zvec}{\mathbf z}$

$\newcommand{\rvec}{\mathbf r}$

$\newcommand{\mvec}{\mathbf m}$

$\newcommand{\zerovec}{\mathbf 0}$

$\newcommand{\onevec}{\mathbf 1}$

$\newcommand{\real}{\mathbb R}$

$\newcommand{\twovec}[2]{\left[\begin{array}{r}#1 \\ #2 \end{array}\right]}$

$\newcommand{\ctwovec}[2]{\left[\begin{array}{c}#1 \\ #2 \end{array}\right]}$

$\newcommand{\threevec}[3]{\left[\begin{array}{r}#1 \\ #2 \\ #3 \end{array}\right]}$

$\newcommand{\cthreevec}[3]{\left[\begin{array}{c}#1 \\ #2 \\ #3 \end{array}\right]}$

$\newcommand{\fourvec}[4]{\left[\begin{array}{r}#1 \\ #2 \\ #3 \\ #4 \end{array}\right]}$

$\newcommand{\cfourvec}[4]{\left[\begin{array}{c}#1 \\ #2 \\ #3 \\ #4 \end{array}\right]}$

$\newcommand{\fivevec}[5]{\left[\begin{array}{r}#1 \\ #2 \\ #3 \\ #4 \\ #5 \\ \end{array}\right]}$

$\newcommand{\cfivevec}[5]{\left[\begin{array}{c}#1 \\ #2 \\ #3 \\ #4 \\ #5 \\ \end{array}\right]}$

$\newcommand{\mattwo}[4]{\left[\begin{array}{rr}#1 \amp #2 \\ #3 \amp #4 \\ \end{array}\right]}$

$\newcommand{\laspan}[1]{\text{Span}\{#1\}}$

$\newcommand{\bcal}{\cal B}$

$\newcommand{\ccal}{\cal C}$

$\newcommand{\scal}{\cal S}$

$\newcommand{\wcal}{\cal W}$

$\newcommand{\ecal}{\cal E}$

$\newcommand{\coords}[2]{\left\{#1\right\}_{#2}}$

$\newcommand{\gray}[1]{\color{gray}{#1}}$

$\newcommand{\lgray}[1]{\color{lightgray}{#1}}$

$\newcommand{\rank}{\operatorname{rank}}$

$\newcommand{\row}{\text{Row}}$

$\newcommand{\col}{\text{Col}}$

$\renewcommand{\row}{\text{Row}}$

$\newcommand{\nul}{\text{Nul}}$

$\newcommand{\var}{\text{Var}}$

$\newcommand{\corr}{\text{corr}}$

$\newcommand{\len}[1]{\left|#1\right|}$

$\newcommand{\bbar}{\overline{\bvec}}$

$\newcommand{\bhat}{\widehat{\bvec}}$

$\newcommand{\bperp}{\bvec^\perp}$

$\newcommand{\xhat}{\widehat{\xvec}}$

$\newcommand{\vhat}{\widehat{\vvec}}$

$\newcommand{\uhat}{\widehat{\uvec}}$

$\newcommand{\what}{\widehat{\wvec}}$

$\newcommand{\Sighat}{\widehat{\Sigma}}$

$\newcommand{\lt}{<}$

$\newcommand{\gt}{>}$

$\newcommand{\amp}{&}$

$\definecolor{fillinmathshade}{gray}{0.9}$

I. "If $f \in C D^{1}$ at $\vec{p},$ then $f$ resembles a linear map (namely $d f )$ at $\vec{p}."$ Pursuing this basic idea, we first make precise our notion of " $f \in C D^{1}$ at $\vec{p}$ ."

Definition 1

A map $f : E^{\prime} \rightarrow E$ is continuously differentiable, or of class $C D^{1}$ (written $f \in C D^{1}),$ at $\vec{p}$ iff the following statement is true:

$\begin{array}{l}{\text { Given any } \varepsilon>0, \text { there is } \delta>0 \text { such that } f \text { is differentiable on the }} \\ {\text { globe } \overline{G}=\overline{G_{\vec{p}}(\delta)}, \text { with }} \\ {\qquad\|d f(\vec{x} ; \cdot)-d f(\vec{p} ; \cdot)\|<\varepsilon \text { for all } \vec{x} \in \overline{G}.}\end{array}$

By Problem 10 in §5, this definition agrees with Definition 1 §5, but is no longer limited to the case $E^{\prime}=E^{n}\left(C^{n}\right).$ See also Problems 1 and 2 below.

We now obtain the following result.

Theorem $\PageIndex{1}$

Let $E^{\prime}$ and $E$ be complete. If $f : E^{\prime} \rightarrow E$ is of class $C D^{1}$ at $\vec{p}$ and if $d f(\vec{p} ; \cdot)$ is bijective (§6), then $f$ is one-to-one on some globe $\overline{G}=\overline{G_{\vec{p}}}(\delta).$

Thus $f$ "locally" resembles df $(\vec{p} ; \cdot)$ in this respect.

Proof

Set $\phi=d f(\vec{p} ; \cdot)$ and

$\left\|\phi^{-1}\right\|=\frac{1}{\varepsilon}$

(cf. Theorem 2 of §6).

By Definition 1, fix $\delta>0$ so that for $\vec{x} \in \overline{G}=\overline{G_{\vec{p}}(\delta)}$ .

$\|d f(\vec{x} ; \cdot)-\phi\|<\frac{1}{2} \varepsilon.$

Then by Note 5 in §2,

$(\forall \vec{x} \in \overline{G})\left(\forall \vec{u} \in E^{\prime}\right) \quad|d f(\vec{x} ; \vec{u})-\phi(\vec{u})| \leq \frac{1}{2} \varepsilon|\vec{u}|.$

Now fix any $\vec{r}, \vec{s} \in \overline{G}, \vec{r} \neq \vec{s},$ and set $\vec{u}=\vec{r}-\vec{s} \neq 0.$ Again, by Note 5 in §2,

$|\vec{u}|=\left|\phi^{-1}(\phi(\vec{u}))\right| \leq\left\|\phi^{-1}\right\||\phi(\vec{u})|=\frac{1}{\varepsilon}|\phi(\vec{u})|;$

$0<\varepsilon|\vec{u}| \leq|\phi(\vec{u})|.$

By convexity, $\overline{G} \supseteq I=L[\vec{s}, \vec{r}],$ so (1) holds for $\vec{x} \in I, \vec{x}=\vec{s}+t \vec{u}, 0 \leq t \leq 1$ .

Noting this, set

$h(t)=f(\vec{s}+t \vec{u})-t \phi(\vec{u}), \quad t \in E^{1}.$

Then for $0 \leq t \leq 1$ ,

$\begin{aligned} h^{\prime}(t) &=D_{\vec{u}} f(\vec{s}+t \vec{u})-\phi(\vec{u}) \\ &=d f(\vec{s}+t \vec{u} ; \vec{u})-\phi(\vec{u}). \end{aligned}$

(Verify!) Thus by (1) and (2),

$\begin{aligned} \sup _{0 \leq t \leq 1}\left|h^{\prime}(t)\right| &=\sup _{0 \leq t \leq 1}|d f(\vec{s}+t \vec{u} ; \vec{u})-\phi(\vec{u})| \\ & \leq \frac{\varepsilon}{2}|\vec{u}| \leq \frac{1}{2}|\phi(\vec{u})|. \end{aligned}$

(Explain!) Now, by Corollary 1 in Chapter 5, §4,

$|h(1)-h(0)| \leq(1-0) \cdot \sup _{0 \leq t \leq 1}\left|h^{\prime}(t)\right| \leq \frac{1}{2}|\phi(\vec{u})|.$

As $h(0)=f(\vec{s})$ and

$h(1)=f(\vec{s}+\vec{u})-\phi(\vec{u})=f(\vec{r})-\phi(\vec{u}),$

we obtain (even if $\vec{r}=\vec{s})$

$|f(\vec{r})-f(\vec{s})-\phi(\vec{u})| \leq \frac{1}{2}|\phi(\vec{u})| \quad(\vec{r}, \vec{s} \in \overline{G}, \vec{u}=\vec{r}-\vec{s}).$

But by the triangle law,

$|\phi(\vec{u})|-|f(\vec{r})-f(\vec{s})| \leq|f(\vec{r})-f(\vec{s})-\phi(\vec{u})|.$

Thus

$|f(\vec{r})-f(\vec{s})| \geq \frac{1}{2}|\phi(\vec{u})| \geq \frac{1}{2} \varepsilon|\vec{u}|=\frac{1}{2} \varepsilon|\vec{r}-\vec{s}|$

by (2).

Hence $f(\vec{r}) \neq f(\vec{s})$ whenever $\vec{r} \neq \vec{s}$ in $\overline{G};$ so $f$ is one-to-one on $\overline{G},$ as claimed. $\quad \square$

Corollary $\PageIndex{1}$

Under the assumptions of Theorem 1, the maps $f$ and $f^{-1}$ (the inverse of $f$ restricted to $\overline{G}$ ) are uniformly continuous on $\overline{G}$ and $f[\overline{G}],$ respectively.

Proof

By (3),

$\begin{aligned}|f(\vec{r})-f(\vec{s})| & \leq|\phi(\vec{u})|+\frac{1}{2}|\phi(\vec{u})| \\ & \leq|2 \phi(\vec{u})| \\ & \leq 2\|\phi\||\vec{u}| \\ &=2\|\phi\||\vec{r}-\vec{s}| \quad(\vec{r}, \vec{s} \in \overline{G}). \end{aligned}$

This implies uniform continuity for $f$ . (Why?)

Next, let $g=f^{-1}$ on $H=f[\overline{G}]$ .

If $\vec{x}, \vec{y} \in H,$ let $\vec{r}=g(\vec{x})$ and $\vec{s}=g(\vec{y});$ so $\vec{r}, \vec{s} \in \overline{G},$ with $\vec{x}=f(\vec{r})$ and $\vec{y}=f(\vec{s}).$ Hence by (4),

$|\vec{x}-\vec{y}| \geq \frac{1}{2} \varepsilon|g(\vec{x})-g(\vec{y})|,$

proving all for $g,$ too. $\quad \square$

Again, $f$ resembles $\phi$ which is uniformly continuous, along with $\phi^{-1}$ .

II. We introduce the following definition.

Definition 2

A map $f :(S, \rho) \rightarrow\left(T, \rho^{\prime}\right)$ is closed (open) on $D \subseteq S$ iff, for any $X \subseteq D$ the set $f[X]$ is closed (open) in $T$ whenever $X$ is so in $S.$

Note that continuous maps have such a property for inverse images (Problem 15 in Chapter 4, §2).

Corollary $\PageIndex{2}$

Under the assumptions of Theorem 1, $f$ is closed on $\overline{G},$ and so the set $f[\overline{G}]$ is closed in $E.$

Similarly for the map $f^{-1}$ on $f[\overline{G}]$ .

Proof for $E^{\prime}=E=E^{n}\left(C^{n}\right)$ (for the general case, see Problem 6)

Given any closed $X \subseteq \overline{G},$ we must show that $f[X]$ is closed in $E.$

Now, as $\overline{G}$ is closed and bounded, it is compact (Theorem 4 of Chapter 4, §6).

So also is $X$ (Theorem 1 in Chapter 4, §6), and so is $f[X]$ (Theorem 1 of Chapter 4, §8).

By Theorem 2 in Chapter 4, §6, $f[X]$ is closed, as required. $\quad \square$

For the rest of this section, we shall set $E^{\prime}=E=E^{n}\left(C^{n}\right)$ .

Theorem $\PageIndex{2}$

If $E^{\prime}=E=E^{n}\left(C^{n}\right)$ in Theorem 1, with other assumptions unchanged, then $f$ is open on the globe $G=G_{\vec{p}}(\delta),$ with $\delta$ sufficiently small.

Proof: We first prove the following lemma.

Lemma

$f[G]$ contains a globe $G_{\vec{q}}(\alpha)$ where $\vec{q}=f(\vec{p})$ .

Proof

Indeed, let

$\alpha=\frac{1}{4} \varepsilon \delta,$

where $\delta$ and $\varepsilon$ are as in the proof of Theorem 1. (We continue the notation and formulas of that proof.)

Fix any $\vec{c} \in G_{\vec{q}}(\alpha);$ so

$|\vec{c}-\vec{q}|<\alpha=\frac{1}{4} \varepsilon \delta.$

Set $h=|f-\vec{c}|$ on $E^{\prime}.$ As $f$ is uniformly continuous on $\overline{G},$ so is $h$ .

Now, $\overline{G}$ is compact in $E^{n}\left(C^{n}\right);$ so Theorem 2(ii) in Chapter 4, §8, yields a point $\vec{r} \in \overline{G}$ such that

$h(\vec{r})=\min h[\overline{G}].$

We claim that $\vec{r}$ is in $G$ (the interior of $\overline{G})$ .

Otherwise, $|\vec{r}-\vec{p}|=\delta ;$ for by (4),

$\begin{aligned} 2 \alpha=\frac{1}{2} \varepsilon \delta=\frac{1}{2} \varepsilon|\vec{r}-\vec{p}| & \leq|f(\vec{r})-f(\vec{p})| \\ & \leq|f(\vec{r})-\vec{c}|+|\vec{c}-f(\vec{p})| \\ &=h(\vec{r})+h(\vec{p}). \end{aligned}$

But

$h(\vec{p})=|\vec{c}-f(\vec{p})|=|\vec{c}-\vec{q}|<\alpha;$

and so (7) yields

$h(\vec{p})<\alpha<h(\vec{r}),$

contrary to the minimality of $h(\vec{r})$ (see (6)). Thus $|\vec{r}-\vec{p}|$ cannot equal $\delta$ .

We obtain $|\vec{r}-\vec{p}|<\delta,$ so $\vec{r} \in G_{\vec{p}}(\delta)=G$ and $f(\vec{r}) \in f[G].$ We shall now show that $\vec{c}=f(\vec{r}).$

To this end, we set $\vec{v}=\vec{c}-f(\vec{r})$ and prove that $\vec{v}=\overrightarrow{0}.$ Let

$\vec{u}=\phi^{-1}(\vec{v}),$

where

$\phi=d f(\vec{p} ; \cdot),$

as before. Then

$\vec{v}=\phi(\vec{u})=d f(\vec{p} ; \vec{u}).$

With $\vec{r}$ as above, fix some

$\vec{s}=\vec{r}+t \vec{u} \quad(0<t<1)$

with $t$ so small that $\vec{s} \in G$ also. Then by formula (3),

$|f(\vec{s})-f(\vec{r})-\phi(t \vec{u})| \leq \frac{1}{2}|t \vec{v}|;$

also,

$|f(\vec{r})-\vec{c}+\phi(t \vec{u})|=(1-t)|\vec{v}|=(1-t) h(\vec{r})$

by our choice of $\vec{v}, \vec{u}$ and $h.$ Hence by the triangle law,

$h(\vec{s})=|f(\vec{s})-\vec{c}| \leq\left(1-\frac{1}{2} t\right) h(\vec{r}).$

(Verify!)

As $0<t<1,$ this implies $h(\vec{r})=0$ (otherwise, $h(\vec{s})<h(\vec{r}),$ violating (6)).

Thus, indeed,

$|\vec{v}|=|f(\vec{r})-\vec{c}|=0,$

i.e.,

$\vec{c}=f(\vec{r}) \in f[G] \quad \text { for } \vec{r} \in G.$

But $\vec{c}$ was an arbitrary point of $G_{\vec{q}}(\alpha).$ Hence

$G_{\vec{q}}(\alpha) \subseteq f[G],$

proving the lemma. $\quad \square$

Proof of Theorem 2. The lemma shows that $f(\vec{p})$ is in the interior of $f[G]$ if $\vec{p}, f, d f(\vec{p} ; \cdot),$ and $\delta$ are as in Theorem 1.

But Definition 1 implies that here $f \in C D^{1}$ on all of $G$ (see Problem 1).

Also, $d f(\vec{x} ; \cdot)$ is bijective for any $\vec{x} \in G$ by our choice of $G$ and Theorems 1 and 2 in §6.

Thus $f$ maps all $\vec{x} \in G$ onto interior points of $f[G];$ i.e., $f$ maps any open set $X \subseteq G$ onto an open $f[X],$ as required. $\quad \square$

Note 1. A map

$f :(S, \rho) \underset{\text { onto }}{\longleftrightarrow} (T, \rho^{\prime})$

is both open and closed ("clopen") iff $f^{-1}$ is continuous - see Problem 15(iv)(v) in Chapter 4, §2, interchanging $f$ and $f^{-1}.$

Thus $\phi=d f(\vec{p} ; \cdot)$ in Theorem 1 is "clopen" on all of $E^{\prime}$ .

Again, $f$ locally resembles $d f(\vec{p} ; \cdot)$ .

III. The Inverse Function Theorem. We now further pursue these ideas.

Theorem $\PageIndex{3}$ (inverse functions)

Under the assumptions of Theorem 2, let $g$ be the inverse of $f_{G}\left(f \text { restricted to } G=G_{\vec{p}}(\delta)\right)$ .

Then $g \in C D^{1}$ on $f[G]$ and $d g(\vec{y} ; \cdot)$ is the inverse of $d f(\vec{x} ; \cdot)$ whenever $\vec{x}=g(\vec{y}), \vec{x} \in G.$

Briefly: "The differential of the inverse is the inverse of the differential."

Proof

Fix any $\vec{y} \in f[G]$ and $\vec{x}=g(\vec{y}) ;$ so $\vec{y}=f(\vec{x})$ and $\vec{x} \in G.$ Let $U=d f(\vec{x} ; \cdot).$

As noted above, $U$ is bijective for every $\vec{x} \in G$ by Theorems 1 and 2 in §6; so we may set $V=U^{-1}.$ We must show that $V=d g(\vec{y} ; \cdot).$

To do this, give $\vec{y}$ an arbitrary (variable) increment $\Delta \vec{y},$ so small that $\vec{y}+\Delta \vec{y}$ stays in $f[G]$ (an open set by Theorem 2).

As $g$ and $f_{G}$ are one-to-one, $\Delta \vec{y}$ uniquely determines

$\Delta \vec{x}=g(\vec{y}+\Delta \vec{y})-g(\vec{y})=\vec{t},$

and vice versa:

$\Delta \vec{y}=f(\vec{x}+\vec{t})-f(\vec{x}).$

Here $\Delta \vec{y}$ and $\vec{t}$ are the mutually corresponding increments of $\vec{y}=f(\vec{x})$ and $\vec{x}=g(\vec{y}).$ By continuity, $\vec{y} \rightarrow \overrightarrow{0}$ iff $\vec{t} \rightarrow \overrightarrow{0}.$

As $U=d f(\vec{x} ; \cdot)$ ,

$\lim _{\vec{t} \rightarrow \overline{0}} \frac{1}{|\vec{t}|}|f(\vec{x}+\vec{t})-f(\vec{t})-U(\vec{t})|=0,$

$\lim _{\vec{t} \rightarrow \overrightarrow{0}} \frac{1}{|\vec{t}|}|F(\vec{t})|=0,$

where

$F(\vec{t})=f(\vec{x}+\vec{t})-f(\vec{t})-U(\vec{t}).$

As $V=U^{-1},$ we have

$V(U(\vec{t}))=\vec{t}=g(\vec{y}+\Delta \vec{y})-g(\vec{y}).$

So from (9),

$\begin{aligned} V(F(\vec{t})) &=V(\Delta \vec{y})-\vec{t} \\ &=V(\Delta \vec{y})-[g(\vec{y}+\Delta \vec{y})-g(\vec{y})]; \end{aligned}$

that is,

$\frac{1}{|\Delta \vec{y}|}|g(\vec{y}+\Delta \vec{y})-g(\vec{y})-V(\Delta \vec{y})|=\frac{|V(F(\vec{t}))|}{|\Delta \vec{y}|}, \quad \Delta \vec{y} \neq \overrightarrow{0}.$

Now, formula (4), with $\vec{r}=\vec{x}, \vec{s}=\vec{x}+\vec{t},$ and $\vec{u}=\vec{t},$ shows that

$|f(\vec{x}+\vec{t})-f(\vec{x})| \geq \frac{1}{2} \varepsilon|\vec{t}|;$

i.e., $|\Delta \vec{y}| \geq \frac{1}{2} \varepsilon|\vec{t}|.$ Hence by (8),

$\frac{|V(F(\vec{t}))|}{|\Delta \vec{y}|} \leq \frac{|V(F(\vec{t}) |}{\frac{1}{2} \varepsilon|\vec{t}|}=\frac{2}{\varepsilon}\left|V\left(\frac{1}{|\vec{t}|} F(\vec{t})\right)\right| \leq \frac{2}{\varepsilon}\|V\| \frac{1}{|\vec{t}|}|F(\vec{t})| \rightarrow 0 \text { as } \vec{t} \rightarrow \overrightarrow{0}.$

Since $\vec{t} \rightarrow \overrightarrow{0}$ as $\Delta \vec{y} \rightarrow \overrightarrow{0}$ (change of variables!), the expression (10) tends to 0 as $\Delta \vec{y} \rightarrow \overrightarrow{0}.$

By definition, then, $g$ is differentiable at $\vec{y},$ with $d g(\vec{y};)=V=U^{-1}$ .

Moreover, Corollary 3 in §6, applies here. Thus

$\left(\forall \delta^{\prime}>0\right)\left(\exists \delta^{\prime \prime}>0\right) \quad\|U-W\|<\delta^{\prime \prime} \Rightarrow\left\|U^{-1}-W^{-1}\right\|<\delta^{\prime}.$

Taking here $U^{-1}=d g(\vec{y})$ and $W^{-1}=d g(\vec{y}+\Delta \vec{y}),$ we see that $g \in C D^{1}$ near $\vec{y}.$ This completes the proof. $\quad \square$

Note 2. If $E^{\prime}=E=E^{n}\left(C^{n}\right),$ the bijectivity of $\phi=d f(\vec{p} ; \cdot)$ is equivalent to

$\operatorname{det}[\phi]=\operatorname{det}\left[f^{\prime}(\vec{p})\right] \neq 0$

(Theorem 1 of §6).

In this case, the fact that $f$ is one-to-one on $G=G_{\vec{p}}(\delta)$ means, componentwise (see Note 3 in §6), that the system of $n$ equations

$f_{i}(\vec{x})=f\left(x_{1}, \ldots, x_{n}\right)=y_{i}, \quad i=1, \ldots, n,$

has a unique solution for the $n$ unknowns $x_{k}$ as long as

$\left(y_{1}, \ldots, y_{n}\right)=\vec{y} \in f[G].$

Theorem 3 shows that this solution has the form

$x_{k}=g_{k}(\vec{y}), \quad k=1, \ldots, n,$

where the $g_{k}$ are of class $C D^{1}$ on $f[G]$ provided the $f_{i}$ are of class $C D^{1}$ near $\vec{p}$ and det $\left[f^{\prime}(\vec{p})\right] \neq 0.$ Here

$\operatorname{det}\left[f^{\prime}(\vec{p})\right]=J_{f}(\vec{p}),$

as in §6.

Thus again $f$ "locally" resembles a linear map, $\phi=d f(\vec{p} ; \cdot)$ .

IV. The Implicit Function Theorem. Generalizing, we now ask, what about solving $n$ equations in $n+m$ unknowns $x_{1}, \ldots, x_{n}, y_{1}, \ldots, y_{m}?$ Say, we want to solve

$f_{k}\left(x_{1}, \ldots, x_{n}, y_{1}, \ldots, y_{m}\right)=0, \quad k=1,2, \ldots, n,$

for the first $n$ unknowns (or variables) $x_{k},$ thus expressing them as

$x_{k}=H_{k}\left(y_{1}, \ldots, y_{m}\right), \quad k=1, \ldots, n,$

with $H_{k} : E^{m} \rightarrow E^{1}$ or $H_{k} : C^{m} \rightarrow C$ .

Let us set $\vec{x}=\left(x_{1}, \ldots, x_{n}\right), \vec{y}=\left(y_{1}, \ldots, y_{m}\right),$ and

$(\vec{x}, \vec{y})=\left(x_{1}, \ldots, x_{n}, y_{1}, \ldots, y_{m}\right)$

so that $(\vec{x}, \vec{y}) \in E^{n+m}\left(C^{n+m}\right)$ .

Thus the system of equations (11) simplifies to

$f_{k}(\vec{x}, \vec{y})=0, \quad k=1, \ldots, n$

$f(\vec{x}, \vec{y})=\overrightarrow{0},$

where $f=\left(f_{1}, \ldots, f_{n}\right)$ is a map of $E^{n+m}\left(C^{n+m}\right)$ into $E^{n}\left(C^{n}\right) ; f$ is a function of $n+m$ variables, but it has $n$ components $f_{k};$ i.e.,

$f(\vec{x}, \vec{y})=f\left(x_{1}, \ldots, x_{n}, y_{1}, \ldots, y_{m}\right)$

is a vector in $E^{n}\left(C^{n}\right)$ .

Theorem $\PageIndex{4}$ (implicit functions)

Let $E^{\prime}=E^{n+m}\left(C^{n+m}\right), E=E^{n}\left(C^{n}\right),$ and let $f : E^{\prime} \rightarrow E$ be of class $C D^{1}$ near

$(\vec{p}, \vec{q})=\left(p_{1}, \ldots, p_{n}, q_{1}, \ldots, q_{m}\right), \quad \vec{p} \in E^{n}\left(C^{n}\right), \vec{q} \in E^{m}\left(C^{m}\right).$

Let $[\phi]$ be the $n \times n$ matrix

$\left(D_{j} f_{k}(\vec{p}, \vec{q})\right), \quad j, k=1, \ldots, n.$

If $\operatorname{det}[\phi] \neq 0$ and if $f(\vec{p}, \vec{q})=\overrightarrow{0},$ then there are open sets

$P \subseteq E^{n}\left(C^{n}\right) \text { and } Q \subseteq E^{m}\left(C^{m}\right),$

with $\vec{p} \in P$ and $\vec{q} \in Q,$ for which there is a unique map

$H : Q \rightarrow P$

with

$f(H(\vec{y}), \vec{y})=\overrightarrow{0}$

for all $\vec{y} \in Q;$ furthermore, $H \in C D^{1}$ on $Q$ .

Thus $\vec{x}=H(\vec{y})$ is a solution of (11) in vector form.

Proof

With the above notation, set

$F(\vec{x}, \vec{y})=(f(\vec{x}, \vec{y}), \vec{y}), \quad F : E^{\prime} \rightarrow E^{\prime}.$

Then

$F(\vec{p}, \vec{q})=(f(\vec{p}, \vec{q}), \vec{q})=(\overrightarrow{0}, \vec{q}),$

since $f(\vec{p}, \vec{q})=\overrightarrow{0}$ .

As $f \in C D^{1}$ near $(\vec{p}, \vec{q}),$ so is $F$ (verify componentwise via Problem 9(ii) in §3 and Definition 1 of §5).

By Theorem 4, §3, $\operatorname{det}\left[F^{\prime}(\vec{p}, \vec{q})\right]=\operatorname{det}[\phi] \neq 0$ (explain!).

Thus Theorem 1 above shows that $F$ is one-to-one on some globe $G$ about $(\vec{p}, \vec{q}).$

Clearly $G$ contains an open interval about $(\vec{p}, \vec{q}).$ We denote it by $P \times Q$ where $\vec{p} \in P, \vec{q} \in Q ; P$ is open in $E^{n}\left(C^{n}\right)$ and $Q$ is open in $E^{m}\left(C^{m}\right).$

By Theorem 3, $F_{P \times Q}$ ( $F$ restricted to $P \times Q)$ has an inverse

$g : A \underset{\text { onto }}{\longleftrightarrow} P \times Q,$

where $A=F[P \times Q]$ is open in $E^{\prime}$ (Theorem 2), and $g \in C D^{1}$ on $A.$ Let the map $u=\left(g_{1}, \ldots, g_{n}\right)$ comprise the first $n$ components of $g$ (exactly as $f$ comprises the first $n$ components of $F )$ .

Then

$g(\vec{x}, \vec{y})=(u(\vec{x}, \vec{y}), \vec{y})$

exactly as $F(\vec{x}, \vec{y})=(f(\vec{x}, \vec{y}), \vec{y}).$ Also, $u : A \rightarrow P$ is of class $C D^{1}$ on $A,$ as $g$ is (explain!).

Now set

$H(\vec{y})=u(\overrightarrow{0}, \vec{y});$

here $\vec{y} \in Q,$ while

$(\overrightarrow{0}, \vec{y}) \in A=F[P \times Q],$

for $F$ preserves $\vec{y}$ (the last $m$ coordinates). Also set

$\alpha(\vec{x}, \vec{y})=\vec{x}.$

Then $f=\alpha \circ F$ (why?), and

$f(H(\vec{y}), \vec{y})=f(u(\overrightarrow{0}, \vec{y}), \vec{y})=f(g(\overrightarrow{0}, \vec{y}))=\alpha(F(g(\overrightarrow{0}, \vec{y}))=\alpha(\overrightarrow{0}, \vec{y})=\overrightarrow{0}$

by our choice of $\alpha$ and $g$ (inverse to $F).$ Thus

$f(H(\vec{y}), \vec{y})=\overrightarrow{0}, \quad \vec{y} \in Q,$

as desired.

Moreover, as $H(\vec{y})=u(\overrightarrow{0}, \vec{y}),$ we have

$\frac{\partial}{\partial y_{i}} H(\vec{y})=\frac{\partial}{\partial y_{i}} u(\overrightarrow{0}, \vec{y}), \quad \vec{y} \in Q, i \leq m.$

As $u \in C D^{1},$ all $\partial u / \partial y_{i}$ are continuous (Definition 1 in §5); hence so are the $\partial H / \partial y_{i}.$ Thus by Theorem 3 in §3, $H \in C D^{1}$ on $Q.$

Finally, $H$ is unique for the given $P, Q;$ for

$\begin{aligned} f(\vec{x}, \vec{y})=\overrightarrow{0} & \Longrightarrow(f(\vec{x}, \vec{y}), \vec{y})=(\overrightarrow{0}, \vec{y}) \\ & \Longrightarrow F(\vec{x}, \vec{y})=(\overrightarrow{0}, \vec{y}) \\ & \Longrightarrow g(F(\vec{x}, \vec{y}))=g(\overrightarrow{0}, \vec{y}) \\ & \Longrightarrow(\vec{x}, \vec{y})=g(\overrightarrow{0}, \vec{y})=(u(\overrightarrow{0}, \vec{y}), \vec{y}) \\ & \Longrightarrow \vec{x}=u(\overrightarrow{0}, \vec{y})=H(\vec{y}). \end{aligned}$

Thus $f(\vec{x}, \vec{y})=\overrightarrow{0}$ implies $\vec{x}=H(\vec{y});$ so $H(\vec{y})$ is the only solution for $\vec{x}. \quad \square$

Note 3. $H$ is said to be implicitly defined by the equation $f(\vec{x}, \vec{y})=\overrightarrow{0}.$ In this sense we say that $H(\vec{y})$ is an implicit function, given by $f(\vec{x}, \vec{y})=\overrightarrow{0}$ .

Similarly, under suitable assumptions, $f(\vec{x}, \vec{y})=\overrightarrow{0}$ defines $\vec{y}$ as a function of $\vec{x}.$

Note 4. While $H$ is unique for a given neighborhood $P \times Q$ of $(\vec{p}, \vec{q}),$ another implicit function may result if $P \times Q$ or $(\vec{p}, \vec{q})$ is changed.

For example, let

$f(x, y)=x^{2}+y^{2}-25$

(a polynomial; hence $f \in C D^{1}$ on all of $E^{2}).$ Geometrically, $x^{2}+y^{2}-25=0$ describes a circle.

Solving for $x,$ we get $x=\pm \sqrt{25-y^{2}}.$ Thus we have two functions:

$H_{1}(y)=+\sqrt{25-y^{2}}$

and

$H_{2}(y)=-\sqrt{25-y^{2}}.$

If $P \times Q$ is in the upper part of the circle, the resulting function is $H_{1}.$ Otherwise, it is $H_{2}.$ See Figure 28.

$Screen Shot 2019-06-27 at 2.03.56 PM.png$

V. Implicit Differentiation. Theorem 4 only states the existence (and uniqueness) of a solution, but does not show how to find it, in general.

The knowledge itself that $H \in C D^{1}$ exists, however, enables us to use its derivative or partials and compute it by implicit differentiation, known from calculus.

Examples

(a) Let $f(x, y)=x^{2}+y^{2}-25=0,$ as above.

This time treating $y$ as an implicit function of $x, y=H(x),$ and writing $y^{\prime}$ for $H^{\prime}(x),$ we differentiate both sides of (x^{2}+y^{2}-25=0\) with respect to $x,$ using the chain rule for the term $y^{2}=[H(x)]^{2}$ .

This yields $2 x+2 y y^{\prime}=0,$ whence $y^{\prime}=-x / y$ .

Actually (see Note 4), two functions are involved: $y=\pm \sqrt{25-x^{2}};$ but both satisfy $x^{2}+y^{2}-25=0;$ so the result $y^{\prime}=-x / y$ applies to both.

Of course, this method is possible only if the derivative $y^{\prime}$ is known to exist. This is why Theorem 4 is important.

(b) Let

$f(x, y, z)=x^{2}+y^{2}+z^{2}-1=0, \quad x, y, z \in E^{1}.$

Again $f$ satisfies Theorem 4 for suitable $x, y,$ and $z$ .

Setting $z=H(x, y),$ differentiate the equation $f(x, y, z)=0$ partially with respect to $x$ and $y.$ From the resulting two equations, obtain $\frac{\partial z}{\partial x}$ and $\frac{\partial z}{\partial y}$ .

Definition 1

Theorem 6.7.1\PageIndex{1}

Corollary 6.7.1\PageIndex{1}

Definition 2

Corollary 6.7.2\PageIndex{2}

Theorem 6.7.2\PageIndex{2}

Lemma

Theorem 6.7.3\PageIndex{3} (inverse functions)

Theorem 6.7.4\PageIndex{4} (implicit functions)

Examples

Support Center

How can we help?

Theorem $\PageIndex{1}$

Corollary $\PageIndex{1}$

Corollary $\PageIndex{2}$

Theorem $\PageIndex{2}$

Theorem $\PageIndex{3}$ (inverse functions)

Theorem $\PageIndex{4}$ (implicit functions)