# 8.3: The Derivative

Let \(U \subset {\mathbb{R}}^n\) be a convex open set, \(f \colon U \to {\mathbb{R}}^m\) a differentiable function, and \(M\) a number such that \[\left\lVert {f'(x)} \right\rVert \leq M\] for all \(x \in U\). Then \(f\) is Lipschitz with constant \(M\), that is, \[\left\lVert {f(x)-f(y)} \right\rVert \leq M \left\lVert {x-y} \right\rVert\] for all \(x,y \in U\).
Fix \(x\) and \(y\) in \(U\) and note that \((1-t)x+ty \in U\) for all \(t \in [0,1]\) by convexity. Next, by the chain rule, \[\frac{d}{dt} \Bigl[f\bigl((1-t)x+ty\bigr)\Bigr] = f'\bigl((1-t)x+ty\bigr) (y-x) .\] By the mean value theorem above, there is some \(t \in [0,1]\) such that \[\left\lVert {f(x)-f(y)} \right\rVert \leq \left\lVert {\frac{d}{dt} \Bigl[ f\bigl((1-t)x+ty\bigr) \Bigr] } \right\rVert \leq \left\lVert {f'\bigl((1-t)x+ty\bigr)} \right\rVert \left\lVert {y-x} \right\rVert \leq M \left\lVert {y-x} \right\rVert .\]
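As a numerical sanity check of the proposition (not a substitute for the proof), the following sketch samples pairs of points in the open unit disc, which is convex, and compares the ratio \(\lVert f(p)-f(q) \rVert / \lVert p-q \rVert\) with a bound \(M\). The map \(f(x,y) = (\sin x + y, xy)\), the helper names, and the choice \(M = \sqrt{3}\) are assumptions made for illustration. The value of \(M\) comes from bounding the operator norm of \(f'\) by its Frobenius norm, whose square is \(\cos^2 x + 1 + y^2 + x^2 < 3\) on the disc.

```python
import math
import random

def f(p):
    # Illustrative map R^2 -> R^2 (an assumption, not from the text).
    x, y = p
    return (math.sin(x) + y, x * y)

# Upper bound for ||f'(x, y)|| on the open unit disc (Frobenius norm bound).
M = math.sqrt(3.0)

def random_point_in_disc(rng):
    # Rejection sampling from the square [-1, 1]^2 into the open unit disc.
    while True:
        x, y = rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)
        if x * x + y * y < 1.0:
            return (x, y)

def worst_ratio(samples=2000, seed=0):
    # Largest observed value of ||f(p) - f(q)|| / ||p - q|| over random pairs.
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(samples):
        p, q = random_point_in_disc(rng), random_point_in_disc(rng)
        d = math.dist(p, q)
        if d > 0.0:
            worst = max(worst, math.dist(f(p), f(q)) / d)
    return worst

print(worst_ratio(), "<=", M)
```

The observed ratio should never exceed \(M\), although random sampling only gives evidence for the inequality and does not establish it.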
If \(U\) is not convex, the proposition is not true. To see this fact, take the set \[U = \{ (x,y) : 0.9 < x^2+y^2 < 1.1 \} \setminus \{ (x,0) : x < 0 \} .\] Let \(f(x,y)\) be the angle that the line from the origin to \((x,y)\) makes with the positive \(x\)-axis. You can even write the formula for \(f\): \[f(x,y) = 2 \operatorname{arctan}\left( \frac{y}{x+\sqrt{x^2+y^2}}\right) .\] Think of a spiral staircase with room in the middle.
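The function is differentiable, and the derivative is bounded on \(U\), which is not hard to see. Now consider what happens near where the negative \(x\)-axis cuts the annulus: points just above and just below the cut are arbitrarily close together, yet the values of \(f\) at those points differ by nearly \(2\pi\). So no Lipschitz constant can work, and the conclusion of the proposition cannot hold.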
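The failure can be seen numerically. The sketch below evaluates the angle formula at two points of \(U\) on opposite sides of the cut. The sample points \((-1, \pm 0.01)\) and the variable names are assumptions chosen for illustration. The gradient of the angle function has norm \(1/\sqrt{x^2+y^2}\), so on \(U\) it is bounded by \(1/\sqrt{0.9}\).

```python
import math

def angle(x, y):
    # Angle formula from the text, valid away from the negative x-axis.
    return 2.0 * math.atan(y / (x + math.hypot(x, y)))

# Bound on the norm of the derivative over the slit annulus U.
derivative_bound = 1.0 / math.sqrt(0.9)

eps = 0.01
p = (-1.0, eps)    # just above the cut, inside U
q = (-1.0, -eps)   # just below the cut, inside U

gap = abs(angle(*p) - angle(*q))   # close to 2*pi
dist = math.dist(p, q)             # equal to 2*eps
ratio = gap / dist

print("gap:", gap, "distance:", dist, "ratio:", ratio)
print("derivative bound:", derivative_bound)
```

The ratio is in the hundreds while the derivative bound is close to \(1\), and shrinking `eps` makes the ratio grow without bound.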
Let us solve the differential equation \(f' = 0\).
If \(U \subset {\mathbb{R}}^n\) is connected and \(f \colon U \to {\mathbb{R}}^m\) is differentiable with \(f'(x) = 0\) for all \(x \in U\), then \(f\) is constant.
For any \(x \in U\), there is a ball \(B(x,\delta) \subset U\). The ball \(B(x,\delta)\) is convex. Since \(\left\lVert {f'(y)} \right\rVert \leq 0\) for all \(y \in B(x,\delta)\), the proposition above gives \(\left\lVert {f(x)-f(y)} \right\rVert \leq 0 \left\lVert {x-y} \right\rVert = 0\), so \(f(x) = f(y)\) for all \(y \in B(x,\delta)\).
This means that \(f^{-1}(c)\) is open for any \(c \in {\mathbb{R}}^m\). Suppose \(f^{-1}(c)\) is nonempty. The two sets \[U' = f^{-1}(c), \qquad U'' = f^{-1}({\mathbb{R}}^m\setminus\{c\}) = \bigcup_{\substack{a \in {\mathbb{R}}^m\\a\neq c}} f^{-1}(a)\] are open and disjoint, and further \(U = U' \cup U''\). As \(U'\) is nonempty and \(U\) is connected, we have that \(U'' = \emptyset\). So \(f(x) = c\) for all \(x \in U\).
## Continuously differentiable functions
We say \(f \colon U \subset {\mathbb{R}}^n \to {\mathbb{R}}^m\) is continuously differentiable, or \(C^1(U)\), if \(f\) is differentiable and \(f' \colon U \to L({\mathbb{R}}^n,{\mathbb{R}}^m)\) is continuous.
Let \(U \subset {\mathbb{R}}^n\) be open and \(f \colon U \to {\mathbb{R}}^m\). The function \(f\) is continuously differentiable if and only if all the partial derivatives exist and are continuous.
Without continuity the theorem does not hold. Just because the partial derivatives exist does not mean that \(f\) is differentiable; in fact, \(f\) may not even be continuous. See the exercises.
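A quick numerical illustration of this remark, using the function \(xy/(x^2+y^2)\) from the exercises: the difference quotients for both partial derivatives at the origin vanish, yet along the diagonal the function stays at \(1/2\) arbitrarily close to the origin. The helper names below are assumptions, and the computation is only suggestive, not a proof.

```python
def f(x, y):
    # Function from the exercises: xy / (x^2 + y^2), with f(0, 0) = 0.
    if x == 0.0 and y == 0.0:
        return 0.0
    return x * y / (x * x + y * y)

def partial_x_quotient(h):
    # Difference quotient for the partial derivative in x at the origin.
    return (f(h, 0.0) - f(0.0, 0.0)) / h

def partial_y_quotient(h):
    # Difference quotient for the partial derivative in y at the origin.
    return (f(0.0, h) - f(0.0, 0.0)) / h

steps = (1e-1, 1e-4, 1e-8)
x_quotients = [partial_x_quotient(h) for h in steps]
y_quotients = [partial_y_quotient(h) for h in steps]
diagonal_values = [f(t, t) for t in steps]

print("partial quotients:", x_quotients, y_quotients)
print("values on the diagonal:", diagonal_values)
```

The quotients are all zero, consistent with both partial derivatives at the origin being \(0\), while the diagonal values do not approach \(f(0,0) = 0\).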
We have seen that if \(f\) is differentiable, then the partial derivatives exist. Furthermore, the partial derivatives are the entries of the matrix of \(f'(x)\). So if \(f' \colon U \to L({\mathbb{R}}^n,{\mathbb{R}}^m)\) is continuous, then the entries are continuous, hence the partial derivatives are continuous.
To prove the opposite direction, suppose the partial derivatives exist and are continuous. Fix \(x \in U\). If we can show that \(f'(x)\) exists we are done, because the entries of the matrix \(f'(x)\) are then the partial derivatives and if the entries are continuous functions, the matrix valued function \(f'\) is continuous.
Let us do induction on dimension. First let us note that the conclusion is true when \(n=1\). In this case the derivative is just the regular derivative (exercise: you should check that the fact that the function is vector valued is not a problem).
Suppose the conclusion is true for \({\mathbb{R}}^{n-1}\), that is, if we restrict to the first \(n-1\) variables, the conclusion is true. It is easy to see that the first \(n-1\) partial derivatives of \(f\) restricted to the set where the last coordinate is fixed are the same as those for \(f\). In the following we think of \({\mathbb{R}}^{n-1}\) as a subset of \({\mathbb{R}}^n\), that is, the set in \({\mathbb{R}}^n\) where \(x^n = 0\). Let \[A = \begin{bmatrix} \frac{\partial f^1}{\partial x^1}(x) & \ldots & \frac{\partial f^1}{\partial x^n}(x) \\ \vdots & \ddots & \vdots \\ \frac{\partial f^m}{\partial x^1}(x) & \ldots & \frac{\partial f^m}{\partial x^n}(x) \end{bmatrix} , \qquad A_1 = \begin{bmatrix} \frac{\partial f^1}{\partial x^1}(x) & \ldots & \frac{\partial f^1}{\partial x^{n-1}}(x) \\ \vdots & \ddots & \vdots \\ \frac{\partial f^m}{\partial x^1}(x) & \ldots & \frac{\partial f^m}{\partial x^{n-1}}(x) \end{bmatrix} , \qquad v = \begin{bmatrix} \frac{\partial f^1}{\partial x^n}(x) \\ \vdots \\ \frac{\partial f^m}{\partial x^n}(x) \end{bmatrix} .\] Let \(\epsilon > 0\) be given. By the induction hypothesis, there is a \(\delta > 0\) such that for any nonzero \(k \in {\mathbb{R}}^{n-1}\) with \(\left\lVert {k} \right\rVert < \delta\) we have \[\frac{\left\lVert {f(x+k) - f(x) - A_1k} \right\rVert}{\left\lVert {k} \right\rVert} < \epsilon .\] By continuity of the partial derivatives, suppose \(\delta\) is small enough so that \[\left\lvert {\frac{\partial f^j}{\partial x^n}(x+h) - \frac{\partial f^j}{\partial x^n}(x)} \right\rvert < \epsilon ,\] for all \(j\) and all \(h\) with \(\left\lVert {h} \right\rVert < \delta\).
Let \(h = h_1 + t e_n\) be a vector in \({\mathbb{R}}^n\), where \(h_1 \in {\mathbb{R}}^{n-1}\), such that \(\left\lVert {h} \right\rVert < \delta\). Then \(\left\lVert {h_1} \right\rVert \leq \left\lVert {h} \right\rVert < \delta\). Note that \(Ah = A_1h_1 + tv\). So \[\begin{split} \left\lVert {f(x+h) - f(x) - Ah} \right\rVert & = \left\lVert {f(x+h_1 + t e_n) - f(x+h_1) - tv + f(x+h_1) - f(x) - A_1h_1} \right\rVert \\ & \leq \left\lVert {f(x+h_1 + t e_n) - f(x+h_1) -tv} \right\rVert + \left\lVert {f(x+h_1) - f(x) - A_1h_1} \right\rVert \\ & \leq \left\lVert {f(x+h_1 + t e_n) - f(x+h_1) -tv} \right\rVert + \epsilon \left\lVert {h_1} \right\rVert . \end{split}\] As all the partial derivatives exist, by the mean value theorem for each \(j\) there is some \(\theta_j \in [0,t]\) (or \([t,0]\) if \(t < 0\)), such that \[f^j(x+h_1 + t e_n) - f^j(x+h_1) = t \frac{\partial f^j}{\partial x^n}(x+h_1+\theta_j e_n).\] Note that if \(\left\lVert {h} \right\rVert < \delta\) then \(\left\lVert {h_1+\theta_j e_n} \right\rVert \leq \left\lVert {h} \right\rVert < \delta\). So to finish the estimate \[\begin{split} \left\lVert {f(x+h) - f(x) - Ah} \right\rVert & \leq \left\lVert {f(x+h_1 + t e_n) - f(x+h_1) -tv} \right\rVert + \epsilon \left\lVert {h_1} \right\rVert \\ & \leq \sqrt{\sum_{j=1}^m {\left(t\frac{\partial f^j}{\partial x^n}(x+h_1+\theta_j e_n) - t \frac{\partial f^j}{\partial x^n}(x)\right)}^2} + \epsilon \left\lVert {h_1} \right\rVert \\ & \leq \sqrt{m}\, \epsilon \left\lvert {t} \right\rvert + \epsilon \left\lVert {h_1} \right\rVert \\ & \leq (\sqrt{m}+1)\epsilon \left\lVert {h} \right\rVert . \end{split}\] Dividing by \(\left\lVert {h} \right\rVert\) for \(h \not= 0\), and recalling that \(\epsilon > 0\) was arbitrary, we find that \(f\) is differentiable at \(x\) with \(f'(x) = A\).
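The estimate above says that the matrix of partial derivatives \(A\) approximates \(f\) to first order: the quotient \(\lVert f(x+h) - f(x) - Ah \rVert / \lVert h \rVert\) tends to zero with \(h\). The sketch below checks this numerically for one \(C^1\) map. The map \(f(x,y) = (x^2 y, \sin x + y^2)\), the base point \((1,2)\), the direction \((0.6,-0.8)\), and all helper names are assumptions chosen for illustration.

```python
import math

def f(p):
    # Illustrative C^1 map R^2 -> R^2 (an assumption, not from the text).
    x, y = p
    return (x * x * y, math.sin(x) + y * y)

def partials_matrix(p):
    # Matrix of partial derivatives of f at p, computed by hand.
    x, y = p
    return ((2.0 * x * y, x * x),
            (math.cos(x), 2.0 * y))

def quotient(a, h):
    # ||f(a + h) - f(a) - A h|| / ||h|| with A the matrix of partials at a.
    A = partials_matrix(a)
    fa = f(a)
    fah = f((a[0] + h[0], a[1] + h[1]))
    Ah = (A[0][0] * h[0] + A[0][1] * h[1],
          A[1][0] * h[0] + A[1][1] * h[1])
    rem = (fah[0] - fa[0] - Ah[0], fah[1] - fa[1] - Ah[1])
    return math.hypot(*rem) / math.hypot(*h)

a = (1.0, 2.0)
scales = (1e-1, 1e-2, 1e-3)
quotients = [quotient(a, (0.6 * s, -0.8 * s)) for s in scales]

for s, q in zip(scales, quotients):
    print("||h|| =", s, " quotient =", q)
```

For this smooth map the quotient shrinks roughly in proportion to \(\lVert h \rVert\), which is stronger than what the proposition requires.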
## The Jacobian
Let \(U \subset {\mathbb{R}}^n\) and \(f \colon U \to {\mathbb{R}}^n\) be a differentiable mapping. Then define the Jacobian of \(f\) at \(x\) as \[J_f(x) := \det\bigl( f'(x) \bigr) .\] Sometimes this is written as \[\frac{\partial(f^1,\ldots,f^n)}{\partial(x^1,\ldots,x^n)} .\]
This last piece of notation may seem somewhat confusing, but it is useful when you need to specify the exact variables and function components used.
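As a small worked example of this notation (the polar coordinate map is chosen here purely for illustration), take \(f(r,\theta) = (r\cos\theta, r\sin\theta)\) with components \(x = r\cos\theta\) and \(y = r\sin\theta\). Then \[f'(r,\theta) = \begin{bmatrix} \cos\theta & -r\sin\theta \\ \sin\theta & r\cos\theta \end{bmatrix} , \qquad J_f(r,\theta) = \frac{\partial(x,y)}{\partial(r,\theta)} = r\cos^2\theta + r\sin^2\theta = r .\] The notation makes explicit that the components \(x, y\) are differentiated with respect to \(r, \theta\), in that order.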
The Jacobian \(J_f\) is a real valued function, and when \(n=1\) it is simply the derivative. When \(f\) is \(C^1\), then \(J_f\) is a continuous function. From the chain rule it follows that: \[J_{f \circ g} (x) = J_f\bigl(g(x)\bigr) J_g(x) .\]
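The product formula can be checked numerically. The sketch below compares a finite-difference Jacobian of \(f \circ g\) with the product \(J_f(g(x)) J_g(x)\) computed from hand-derived determinants. The maps \(g(x,y) = (x+y^2, xy)\) and \(f(u,v) = (e^u \cos v, e^u \sin v)\), the sample point, and the helper names are assumptions chosen for illustration.

```python
import math

def g(p):
    x, y = p
    return (x + y * y, x * y)

def f(p):
    u, v = p
    return (math.exp(u) * math.cos(v), math.exp(u) * math.sin(v))

def jac_g(p):
    # det [[1, 2y], [y, x]] = x - 2 y^2
    x, y = p
    return x - 2.0 * y * y

def jac_f(p):
    # det [[e^u cos v, -e^u sin v], [e^u sin v, e^u cos v]] = e^(2u)
    u, _ = p
    return math.exp(2.0 * u)

def numeric_jacobian(F, p, h=1e-5):
    # Determinant of the 2x2 derivative matrix, via central differences.
    x, y = p
    fxp, fxm = F((x + h, y)), F((x - h, y))
    fyp, fym = F((x, y + h)), F((x, y - h))
    a = (fxp[0] - fxm[0]) / (2.0 * h)  # d F^1 / d x
    b = (fyp[0] - fym[0]) / (2.0 * h)  # d F^1 / d y
    c = (fxp[1] - fxm[1]) / (2.0 * h)  # d F^2 / d x
    d = (fyp[1] - fym[1]) / (2.0 * h)  # d F^2 / d y
    return a * d - b * c

p = (0.5, 0.3)
lhs = numeric_jacobian(lambda q: f(g(q)), p)  # approximates J_{f o g}(p)
rhs = jac_f(g(p)) * jac_g(p)                  # J_f(g(p)) * J_g(p)

print(lhs, rhs)
```

The two numbers agree up to the finite-difference error, which reflects that the determinant of a product of matrices is the product of the determinants.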
The determinant tells us what happens to area and volume. Suppose we are in \({\mathbb{R}}^2\). If \(A\) is a linear transformation, a direct computation shows that the image of the unit square \(A([0,1]^2)\) has area \(\left\lvert {\det(A)} \right\rvert\). Note that the sign of the determinant determines “orientation”. If the determinant is negative, then the two sides of the unit square will be flipped in the image. We claim without proof that the same holds for arbitrary figures, not just the square.
Similarly, the Jacobian measures how much a differentiable mapping stretches things locally, and if it flips orientation.
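The area claim for the unit square can be verified directly with the shoelace formula, which gives the signed area of a polygon from its vertices listed in order. In the sketch below, the matrices `A` and `B` and the helper names are assumptions chosen for illustration. The signed area of the image equals the determinant, so a negative value signals flipped orientation.

```python
def apply_linear(A, p):
    # Apply the 2x2 matrix A (given as a tuple of rows) to the point p.
    (a, b), (c, d) = A
    x, y = p
    return (a * x + b * y, c * x + d * y)

def det(A):
    (a, b), (c, d) = A
    return a * d - b * c

def signed_area(polygon):
    # Shoelace formula: positive for counterclockwise vertex order.
    total = 0.0
    n = len(polygon)
    for i in range(n):
        x0, y0 = polygon[i]
        x1, y1 = polygon[(i + 1) % n]
        total += x0 * y1 - x1 * y0
    return total / 2.0

# Corners of [0,1]^2 in counterclockwise order.
UNIT_SQUARE = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]

def image_area(A):
    # Signed area of the image of the unit square under A.
    return signed_area([apply_linear(A, p) for p in UNIT_SQUARE])

A = ((2.0, 1.0), (1.0, 3.0))   # determinant 5, orientation preserved
B = ((1.0, 2.0), (3.0, 1.0))   # determinant -5, orientation flipped

print(image_area(A), det(A))
print(image_area(B), det(B))
```

Taking absolute values recovers the area \(\lvert \det(A) \rvert\) in both cases.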
## Exercises
Let \(f \colon {\mathbb{R}}^2 \to {\mathbb{R}}\) be given by \(f(x,y) = \sqrt{x^2+y^2}\). Show that \(f\) is not differentiable at the origin.
Define a function \(f \colon {\mathbb{R}}^2 \to {\mathbb{R}}\) by \[f(x,y) := \begin{cases} \frac{xy}{x^2+y^2} & \text{ if $(x,y) \not= (0,0)$}, \\ 0 & \text{ if $(x,y) = (0,0)$}. \end{cases}\] a) Show that partial derivatives \(\frac{\partial f}{\partial x}\) and \(\frac{\partial f}{\partial y}\) exist at all points (including the origin).
b) Show that \(f\) is not continuous at the origin (and hence not differentiable).
Define a function \(f \colon {\mathbb{R}}^2 \to {\mathbb{R}}\) by \[f(x,y) := \begin{cases} \frac{x^2y}{x^2+y^2} & \text{ if $(x,y) \not= (0,0)$}, \\ 0 & \text{ if $(x,y) = (0,0)$}. \end{cases}\] a) Show that partial derivatives \(\frac{\partial f}{\partial x}\) and \(\frac{\partial f}{\partial y}\) exist at all points.
b) Show that \(f\) is continuous at the origin.
c) Show that \(f\) is not differentiable at the origin.