8.1: Metric Spaces
As mentioned in the introduction, the main idea in analysis is to take limits. We have learned to take limits of sequences of real numbers, and to take limits of functions as a real number approached some other real number.
We want to take limits in more complicated contexts. For example, we might want to have sequences of points in 3-dimensional space. Or perhaps we wish to define continuous functions of several variables. We might even want to define functions on spaces that are a little harder to describe, such as the surface of the earth. We still want to talk about limits there.
Finally, we have seen the limit of a sequence of functions. We wish to unify all these notions so that we do not have to reprove theorems over and over again in each context. The concept of a metric space is an elementary yet powerful tool in analysis. And while it is not sufficient to describe every type of limit we can find in modern analysis, it gets us very far indeed.
Let \(X\) be a set and let \(d \colon X \times X \to {\mathbb{R}}\) be a function such that
- [metric:pos] \(d(x,y) \geq 0\) for all \(x, y\) in \(X\),
- [metric:zero] \(d(x,y) = 0\) if and only if \(x = y\),
- [metric:com] \(d(x,y) = d(y,x)\),
- [metric:triang] \(d(x,z) \leq d(x,y)+ d(y,z)\) (triangle inequality).
Then the pair \((X,d)\) is called a metric space. The function \(d\) is called the metric or sometimes the distance function. Sometimes we just say \(X\) is a metric space if the metric is clear from context.
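The four conditions can be tested mechanically on a finite set of points. The following is a minimal sketch, not a proof technique: the function name `is_metric` and the tolerance `tol` are assumptions made for illustration, and a finite check can only refute a candidate metric on an infinite set, never confirm it.

```python
from itertools import product

def is_metric(points, d, tol=1e-12):
    """Check the four metric conditions on a finite collection of points."""
    pts = list(points)
    for x, y in product(pts, repeat=2):
        if d(x, y) < 0:                        # [metric:pos]
            return False
        if (d(x, y) == 0) != (x == y):         # [metric:zero]
            return False
        if abs(d(x, y) - d(y, x)) > tol:       # [metric:com]
            return False
    for x, y, z in product(pts, repeat=3):
        if d(x, z) > d(x, y) + d(y, z) + tol:  # [metric:triang]
            return False
    return True

sample = [0, 1, 2, 5]
print(is_metric(sample, lambda x, y: abs(x - y)))     # True
print(is_metric(sample, lambda x, y: (x - y) ** 2))   # False
```

The second candidate fails only the triangle inequality: with \(x=0\), \(y=1\), \(z=2\) we get \((x-z)^2 = 4\) while \((x-y)^2 + (y-z)^2 = 2\).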
The geometric idea is that \(d\) is the distance between two points. Items [metric:pos]–[metric:com] have an obvious geometric interpretation: distance is always nonnegative, the only point that is distance 0 away from \(x\) is \(x\) itself, and finally that the distance from \(x\) to \(y\) is the same as the distance from \(y\) to \(x\). The triangle inequality [metric:triang] has the interpretation that going directly from \(x\) to \(z\) is never longer than going from \(x\) to \(y\) and then from \(y\) to \(z\).
For the purposes of drawing, it is convenient to draw figures and diagrams in the plane and have the metric be the standard distance. However, that is only one particular metric space. Just because a certain fact seems to be clear from drawing a picture does not mean it is true. You might be getting sidetracked by intuition from euclidean geometry, whereas the concept of a metric space is a lot more general.
Let us give some examples of metric spaces.
The set of real numbers \({\mathbb{R}}\) is a metric space with the metric \[d(x,y) := \left\lvert {x-y} \right\rvert .\] Items [metric:pos]–[metric:com] of the definition are easy to verify. The triangle inequality [metric:triang] follows immediately from the standard triangle inequality for real numbers: \[d(x,z) = \left\lvert {x-z} \right\rvert = \left\lvert {x-y+y-z} \right\rvert \leq \left\lvert {x-y} \right\rvert+\left\lvert {y-z} \right\rvert = d(x,y)+ d(y,z) .\] This metric is the standard metric on \({\mathbb{R}}\). If we talk about \({\mathbb{R}}\) as a metric space without mentioning a specific metric, we mean this particular metric.
We can also put a different metric on the set of real numbers. For example take the set of real numbers \({\mathbb{R}}\) together with the metric \[d(x,y) := \frac{\left\lvert {x-y} \right\rvert}{\left\lvert {x-y} \right\rvert+1} .\] Items [metric:pos]–[metric:com] are again easy to verify. The triangle inequality [metric:triang] is a little bit more difficult. Note that \(d(x,y) = \varphi(\left\lvert {x-y} \right\rvert)\) where \(\varphi(t) = \frac{t}{t+1}\) and note that \(\varphi\) is an increasing function (positive derivative) hence \[\begin{split} d(x,z) & = \varphi(\left\lvert {x-z} \right\rvert) = \varphi(\left\lvert {x-y+y-z} \right\rvert) \leq \varphi(\left\lvert {x-y} \right\rvert+\left\lvert {y-z} \right\rvert) \\ & = \frac{\left\lvert {x-y} \right\rvert+\left\lvert {y-z} \right\rvert}{\left\lvert {x-y} \right\rvert+\left\lvert {y-z} \right\rvert+1} = \frac{\left\lvert {x-y} \right\rvert}{\left\lvert {x-y} \right\rvert+\left\lvert {y-z} \right\rvert+1} + \frac{\left\lvert {y-z} \right\rvert}{\left\lvert {x-y} \right\rvert+\left\lvert {y-z} \right\rvert+1} \\ & \leq \frac{\left\lvert {x-y} \right\rvert}{\left\lvert {x-y} \right\rvert+1} + \frac{\left\lvert {y-z} \right\rvert}{\left\lvert {y-z} \right\rvert+1} = d(x,y)+ d(y,z) . \end{split}\] Here we have an example of a nonstandard metric on \({\mathbb{R}}\). With this metric we can see for example that \(d(x,y) < 1\) for all \(x,y \in {\mathbb{R}}\). That is, any two points are less than 1 unit apart.
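As a sanity check of this example, the sketch below evaluates the nonstandard metric on a handful of rational points using exact arithmetic. The names `phi`, `d_bounded`, and the sample `points` are illustrative assumptions, and the check covers only the listed points, so it illustrates the claims rather than proving them.

```python
from fractions import Fraction
from itertools import product

def phi(t):
    # phi(t) = t / (t + 1), the increasing function used in the text
    return t / (t + 1)

def d_bounded(x, y):
    # The nonstandard metric d(x, y) = |x - y| / (|x - y| + 1)
    return phi(abs(x - y))

# Exact rational sample points, including two that are very far apart.
points = [Fraction(p) for p in (-1000, -7, 0, 1, 2, 10**6)] + [Fraction(1, 4)]

all_below_one = all(d_bounded(x, y) < 1 for x, y in product(points, repeat=2))
triangle_ok = all(
    d_bounded(x, z) <= d_bounded(x, y) + d_bounded(y, z)
    for x, y, z in product(points, repeat=3)
)
print(all_below_one, triangle_ok)                  # True True
print(d_bounded(Fraction(0), Fraction(10**6)))     # 1000000/1000001
```

Even points a million units apart in the standard metric are at distance strictly less than 1 here.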
An important metric space is the \(n\)-dimensional euclidean space \({\mathbb{R}}^n = {\mathbb{R}} \times {\mathbb{R}}\times \cdots \times {\mathbb{R}}\). We use the following notation for points: \(x =(x_1,x_2,\ldots,x_n) \in {\mathbb{R}}^n\). We also simply write \(0 \in {\mathbb{R}}^n\) to mean the vector \((0,0,\ldots,0)\). Before making \({\mathbb{R}}^n\) a metric space, let us prove an important inequality, the so-called Cauchy-Schwarz inequality.
Take \(x =(x_1,x_2,\ldots,x_n) \in {\mathbb{R}}^n\) and \(y =(y_1,y_2,\ldots,y_n) \in {\mathbb{R}}^n\). Then \[{\biggl( \sum_{j=1}^n x_j y_j \biggr)}^2 \leq \biggl(\sum_{j=1}^n x_j^2 \biggr) \biggl(\sum_{j=1}^n y_j^2 \biggr) .\]
Any square of a real number is nonnegative. Hence any sum of squares is nonnegative: \[\begin{split} 0 & \leq \sum_{j=1}^n \sum_{k=1}^n (x_j y_k - x_k y_j)^2 \\ & = \sum_{j=1}^n \sum_{k=1}^n \bigl( x_j^2 y_k^2 + x_k^2 y_j^2 - 2 x_j x_k y_j y_k \bigr) \\ & = \biggl( \sum_{j=1}^n x_j^2 \biggr) \biggl( \sum_{k=1}^n y_k^2 \biggr) + \biggl( \sum_{j=1}^n y_j^2 \biggr) \biggl( \sum_{k=1}^n x_k^2 \biggr) - 2 \biggl( \sum_{j=1}^n x_j y_j \biggr) \biggl( \sum_{k=1}^n x_k y_k \biggr) \end{split}\] We relabel and divide by 2 to obtain \[0 \leq \biggl( \sum_{j=1}^n x_j^2 \biggr) \biggl( \sum_{j=1}^n y_j^2 \biggr) - {\biggl( \sum_{j=1}^n x_j y_j \biggr)}^2 ,\] which is precisely what we wanted.
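The identity behind this proof can be checked on a concrete pair of vectors. This is a small sketch with integer entries so that the arithmetic is exact; the helper names `cauchy_schwarz_sides` and `double_sum` and the sample vectors are assumptions chosen for illustration.

```python
def cauchy_schwarz_sides(x, y):
    """Return (left side, right side) of the Cauchy-Schwarz inequality."""
    lhs = sum(a * b for a, b in zip(x, y)) ** 2
    rhs = sum(a * a for a in x) * sum(b * b for b in y)
    return lhs, rhs

def double_sum(x, y):
    """The nonnegative double sum of squares that starts the proof."""
    n = len(x)
    return sum((x[j] * y[k] - x[k] * y[j]) ** 2
               for j in range(n) for k in range(n))

x = [1, -2, 3]
y = [4, 0, -5]
lhs, rhs = cauchy_schwarz_sides(x, y)
print(lhs, rhs)            # 121 574
print(double_sum(x, y))    # 906, which equals 2 * (574 - 121)
```

The double sum is exactly twice the gap between the two sides, which is the "divide by 2" step in the proof.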
Let us construct the standard metric for \({\mathbb{R}}^n\). Define \[d(x,y) := \sqrt{ {(x_1-y_1)}^2 + {(x_2-y_2)}^2 + \cdots + {(x_n-y_n)}^2 } = \sqrt{ \sum_{j=1}^n {(x_j-y_j)}^2 } .\] For \(n=1\), the real line, this metric agrees with what we did above. Again, the only tricky part of the definition to check is the triangle inequality. It is less messy to work with the square of the metric. In the following, note the use of the Cauchy-Schwarz inequality. \[\begin{split} d(x,z)^2 & = \sum_{j=1}^n {(x_j-z_j)}^2 \\ & = \sum_{j=1}^n {(x_j-y_j+y_j-z_j)}^2 \\ & = \sum_{j=1}^n \Bigl( {(x_j-y_j)}^2+{(y_j-z_j)}^2 + 2(x_j-y_j)(y_j-z_j) \Bigr) \\ & = \sum_{j=1}^n {(x_j-y_j)}^2 + \sum_{j=1}^n {(y_j-z_j)}^2 + \sum_{j=1}^n 2(x_j-y_j)(y_j-z_j) \\ & \leq \sum_{j=1}^n {(x_j-y_j)}^2 + \sum_{j=1}^n {(y_j-z_j)}^2 + 2 \sqrt{ \sum_{j=1}^n {(x_j-y_j)}^2 \sum_{j=1}^n {(y_j-z_j)}^2 } \\ & = {\left( \sqrt{ \sum_{j=1}^n {(x_j-y_j)}^2 } + \sqrt{ \sum_{j=1}^n {(y_j-z_j)}^2 } \right)}^2 = {\bigl( d(x,y) + d(y,z) \bigr)}^2 . \end{split}\] Taking the square root of both sides we obtain the correct inequality.
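The standard metric on \({\mathbb{R}}^n\) translates directly into code. The sketch below assumes points are given as tuples of equal length; the name `euclid`, the sample points, and the floating-point tolerance are illustrative choices.

```python
import math
from itertools import product

def euclid(x, y):
    """Standard metric on R^n for points given as equal-length tuples."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

print(euclid((0, 0), (3, 4)))   # 5.0
print(euclid((2,), (-1,)))      # 3.0, agrees with |2 - (-1)| on the real line

# Triangle inequality on a few sample points in R^3 (tolerance for rounding).
sample = [(0, 0, 0), (1, 2, 2), (-3, 0, 4), (1, 1, 1)]
triangle_ok = all(
    euclid(x, z) <= euclid(x, y) + euclid(y, z) + 1e-12
    for x, y, z in product(sample, repeat=3)
)
print(triangle_ok)              # True
```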
An example to keep in mind is the so-called discrete metric. Let \(X\) be any set and define \[d(x,y) := \begin{cases} 1 & \text{if $x \not= y$}, \\ 0 & \text{if $x = y$}. \end{cases}\] That is, all points are equally distant from each other. When \(X\) is a finite set, we can draw a diagram. Things become subtle when \(X\) is an infinite set such as the real numbers.
While this particular example seldom comes up in practice, it gives a useful “smell test.” If you make a statement about metric spaces, try it with the discrete metric. To show that \((X,d)\) is indeed a metric space is left as an exercise.
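For experimenting with the smell test, the discrete metric is a one-liner. The sketch below uses a hypothetical four-element set `X` and only inspects that finite set, so it does not replace the exercise.

```python
from itertools import product

def discrete(x, y):
    """Discrete metric: 0 for equal points, 1 otherwise."""
    return 0 if x == y else 1

X = ["a", "b", "c", 42]   # any objects that can be compared with == will do

# All distinct points are at the same distance from each other.
distances = {discrete(x, y) for x, y in product(X, repeat=2) if x != y}
print(distances)          # {1}

triangle_ok = all(
    discrete(x, z) <= discrete(x, y) + discrete(y, z)
    for x, y, z in product(X, repeat=3)
)
print(triangle_ok)        # True
```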
[example:msC01] Let \(C([a,b])\) be the set of continuous real-valued functions on the interval \([a,b]\). Define the metric on \(C([a,b])\) as \[d(f,g) := \sup_{x \in [a,b]} \left\lvert {f(x)-g(x)} \right\rvert .\] Let us check the properties. First, \(d(f,g)\) is finite as \(\left\lvert {f(x)-g(x)} \right\rvert\) is a continuous function on a closed bounded interval \([a,b]\), and so is bounded. It is clear that \(d(f,g) \geq 0\), as it is the supremum of nonnegative numbers. If \(f = g\) then \(\left\lvert {f(x)-g(x)} \right\rvert = 0\) for all \(x\) and hence \(d(f,g) = 0\). Conversely if \(d(f,g) = 0\), then for any \(x\) we have \(\left\lvert {f(x)-g(x)} \right\rvert \leq d(f,g) = 0\) and hence \(f(x) = g(x)\) for all \(x\) and \(f=g\). That \(d(f,g) = d(g,f)\) is equally trivial. To show the triangle inequality we use the standard triangle inequality. \[\begin{split} d(f,g) & = \sup_{x \in [a,b]} \left\lvert {f(x)-g(x)} \right\rvert = \sup_{x \in [a,b]} \left\lvert {f(x)-h(x)+h(x)-g(x)} \right\rvert \\ & \leq \sup_{x \in [a,b]} ( \left\lvert {f(x)-h(x)} \right\rvert+\left\lvert {h(x)-g(x)} \right\rvert ) \\ & \leq \sup_{x \in [a,b]} \left\lvert {f(x)-h(x)} \right\rvert+ \sup_{x \in [a,b]} \left\lvert {h(x)-g(x)} \right\rvert = d(f,h) + d(h,g) . \end{split}\] When we treat \(C([a,b])\) as a metric space without mentioning a metric, we mean this particular metric.
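The supremum in this metric can be approximated numerically by sampling. The sketch below is only an approximation under stated assumptions: `sup_distance` takes the maximum over \(n+1\) equally spaced points, which gives a lower bound for the true supremum, and the functions `f`, `g`, `h` are hypothetical examples on \([0,1]\).

```python
def sup_distance(f, g, a, b, n=1000):
    """Approximate sup |f - g| on [a, b] by sampling n + 1 equally spaced points."""
    xs = (a + (b - a) * i / n for i in range(n + 1))
    return max(abs(f(x) - g(x)) for x in xs)

f = lambda x: x
g = lambda x: x * x
h = lambda x: 0.0

d_fg = sup_distance(f, g, 0, 1)
d_fh = sup_distance(f, h, 0, 1)
d_hg = sup_distance(h, g, 0, 1)

print(d_fg)                  # close to 1/4, attained at x = 1/2
print(d_fg <= d_fh + d_hg)   # triangle inequality through h
```

For these particular functions \(\left\lvert x - x^2 \right\rvert\) has its maximum \(1/4\) at \(x = 1/2\), which happens to be a sample point, so the approximation is good here. In general a finer grid may be needed.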
This example may seem esoteric at first, but it turns out that working with spaces such as \(C([a,b])\) is really the meat of a large part of modern analysis. Treating sets of functions as metric spaces allows us to abstract away a lot of the grubby detail and prove powerful results such as Picard’s theorem with less work.
Oftentimes it is useful to consider a subset of a larger metric space as a metric space. We obtain the following proposition, which has a trivial proof.
Let \((X,d)\) be a metric space and \(Y \subset X\), then the restriction \(d|_{Y \times Y}\) is a metric on \(Y\).
If \((X,d)\) is a metric space, \(Y \subset X\), and \(d' := d|_{Y \times Y}\), then \((Y,d')\) is said to be a subspace of \((X,d)\).
It is common to simply write \(d\) for the metric on \(Y\), as it is the restriction of the metric on \(X\). Sometimes we will say that \(d'\) is the subspace metric and that \(Y\) has the subspace topology.
A subset of the real numbers is bounded whenever all its elements are at most some fixed distance from 0. We can also define bounded sets in a metric space. When dealing with an arbitrary metric space there may not be some natural fixed point 0. For the purposes of boundedness it does not matter.
Let \((X,d)\) be a metric space. A subset \(S \subset X\) is said to be bounded if there exists a \(p \in X\) and a \(B \in {\mathbb{R}}\) such that \[d(p,x) \leq B \quad \text{for all $x \in S$}.\] We say that \((X,d)\) is bounded if \(X\) itself is a bounded subset.
For example, the set of real numbers with the standard metric is not a bounded metric space. It is not hard to see that a subset of the real numbers is bounded in the usual sense for sets of real numbers if and only if it is bounded as a subset of the metric space of real numbers with the standard metric.
On the other hand, if we take the real numbers with the discrete metric, then we obtain a bounded metric space. In fact, any set with the discrete metric is bounded.
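The contrast between the two metrics can be seen on growing finite samples of the real numbers. In the sketch below, `radius` is a hypothetical helper returning the smallest \(B\) that works for a given center \(p\) and a finite sample \(S\); any finite set is of course bounded, so the point is only how \(B\) behaves as the sample grows.

```python
def radius(p, S, d):
    """Smallest B with d(p, x) <= B for all x in the finite sample S."""
    return max(d(p, x) for x in S)

standard = lambda x, y: abs(x - y)
discrete = lambda x, y: 0 if x == y else 1

for N in (10, 1000, 100000):
    S = range(N + 1)
    # The standard radius grows with N; the discrete radius stays at 1.
    print(N, radius(0, S, standard), radius(0, S, discrete))

# The choice of center does not affect boundedness: by the triangle
# inequality, d(q, x) <= d(q, p) + d(p, x) <= d(q, p) + B.
S = range(11)
p, q = 0, 7
print(radius(q, S, standard) <= standard(q, p) + radius(p, S, standard))  # True
```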
Exercises
Show that for any set \(X\), the discrete metric (\(d(x,y) = 1\) if \(x\not=y\) and \(d(x,x) = 0\)) does give a metric space \((X,d)\).
Let \(X := \{ 0 \}\) be a set. Can you make it into a metric space?
Let \(X := \{ a, b \}\) be a set. Can you make it into two distinct metric spaces? (define two distinct metrics on it)
Let the set \(X := \{ A, B, C \}\) represent 3 buildings on campus. Suppose we wish our distance to be the time it takes to walk from one building to the other. It takes 5 minutes either way between buildings \(A\) and \(B\). However, building \(C\) is on a hill and it takes 10 minutes from \(A\) and 15 minutes from \(B\) to get to \(C\). On the other hand it takes 5 minutes to go from \(C\) to \(A\) and 7 minutes to go from \(C\) to \(B\), as we are going downhill. Do these distances define a metric? If so, prove it; if not, say why not.
Suppose that \((X,d)\) is a metric space and \(\varphi \colon [0,\infty) \to {\mathbb{R}}\) is an increasing function such that \(\varphi(t) \geq 0\) for all \(t\) and \(\varphi(t) = 0\) if and only if \(t=0\). Also suppose that \(\varphi\) is subadditive, that is \(\varphi(s+t) \leq \varphi(s)+\varphi(t)\). Show that with \(d'(x,y) := \varphi\bigl(d(x,y)\bigr)\), we obtain a new metric space \((X,d')\).
Let \((X,d_X)\) and \((Y,d_Y)\) be metric spaces.
a) Show that \((X \times Y,d)\) with \(d\bigl( (x_1,y_1), (x_2,y_2) \bigr) := d_X(x_1,x_2) + d_Y(y_1,y_2)\) is a metric space.
b) Show that \((X \times Y,d)\) with \(d\bigl( (x_1,y_1), (x_2,y_2) \bigr) := \max \{ d_X(x_1,x_2) , d_Y(y_1,y_2) \}\) is a metric space.
Let \(X\) be the set of continuous functions on \([0,1]\). Let \(\varphi \colon [0,1] \to (0,\infty)\) be continuous. Define \[d(f,g) := \int_0^1 \left\lvert {f(x)-g(x)} \right\rvert\varphi(x)~dx .\] Show that \((X,d)\) is a metric space.
Let \((X,d)\) be a metric space. For nonempty bounded subsets \(A\) and \(B\) let \[d(x,B) := \inf \{ d(x,b) : b \in B \} \qquad \text{and} \qquad d(A,B) := \sup \{ d(a,B) : a \in A \} .\] Now define the Hausdorff metric as \[d_H(A,B) := \max \{ d(A,B) , d(B,A) \} .\] Note: \(d_H\) can be defined for arbitrary nonempty subsets if we allow the extended reals.
a) Let \(Y \subset {\mathcal{P}}(X)\) be the set of bounded nonempty subsets. Show that \((Y,d_H)\) is a metric space.
b) Show by example that \(d\) itself is not a metric. That is, \(d\) is not always symmetric.