Processing math: 87%
Skip to main content
Library homepage
 

Text Color

Text Size

 

Margin Size

 

Font Type

Enable Dyslexic Font
Mathematics LibreTexts

9.1: Vector Spaces, linear Mappings, and Convexity

( \newcommand{\kernel}{\mathrm{null}\,}\)

The euclidean space Rn has already made an appearance in the metric space chapter. In this chapter, we will extend the differential calculus we created for one variable to several variables. The key idea in differential calculus is to approximate functions by lines and linear functions. In several variables we must introduce a little bit of linear algebra before we can move on. So let us start with vector spaces and linear functions on vector spaces. While it is common to use x or the bold x for elements of Rn, especially in the applied sciences, we use just plain x, which is common in mathematics. That is xRn is a vector, which means that x=(x1,x2,,xn) is an n-tuple of real numbers. We use upper indices for identifying components, leaving us the lower index for sequences of vectors. For example, we can have vectors x1 and x2 in Rn and then x1=(x11,x21,,xn1) and x2=(x12,x22,,xn2). It is common to write vectors as column vectors, that is, n×1 matrices:

x=(x1,x2,,xn)= \scriptsize [x1x2xn] .

We will do so when convenient. We call real numbers scalars to distinguish them from vectors. Let X be a set together with operations of addition, +:X×XX, and multiplication, :R×XX, (we write ax instead of ax). X is called a vector space (or a real vector space) if the following conditions are satisfied: (Addition is associative) If u,v,wX, then u+(v+w)=(u+v)+w. (Addition is commutative) If u,vX, then u+v=v+u. (Additive identity) There is a 0X such that v+0=v for all vX. (Additive inverse) For every vX, there is a vX, such that v+(v)=0. (Distributive law) If aR, u,vX, then a(u+v)=au+av. (Distributive law) If a,bR, vX, then (a+b)v=av+bv. (Multiplication is associative) If a,bR, vX, then (ab)v=a(bv). (Multiplicative identity) 1v=v for all vX. Elements of a vector space are usually called vectors, even if they are not elements of Rn (vectors in the “traditional” sense). An example vector space is Rn, where addition and multiplication by a constant is done componentwise: if αR and x,yRn, then x+y:=(x1,x2,,xn)+(y1,y2,,yn)=(x1+y1,x2+y2,,xn+yn),αx:=α(x1,x2,,xn)=(αx1,αx2,,αxn). In this book we mostly deal with vector spaces that can be regarded as subsets of Rn, but there are other vector spaces that are useful in analysis. For example, the space C([0,1],R) of continuous functions on the interval [0,1] is a vector space. A trivial example of a vector space (the smallest one in fact) is just X={0}. The operations are defined in the obvious way. You always need a zero vector to exist, so all vector spaces are nonempty sets. It is also possible to use other fields than R in the definition (for example it is common to use the complex numbers C), but let us stick with the real numbers1. A function f:XY, when Y is not R is often called a mapping or a map rather than a function. Linear combinations and dimension If we have vectors x1,,xkRn and scalars a1,,akR, then a1x1+a2x2++akxk is called a linear combination of the vectors x1,,xk. If YRn is a set then the span of Y, or in notation span(Y), is the set of all linear combinations of some finite number of elements of Y. We also say Y spans span(Y). Let Y:={(1,1)}R2. Then

span(Y)={(x,x)R2:xR}.

That is, span(Y) is the line through the origin and the point (1,1). [example:vecspr2span] Let Y:={(1,1),(0,1)}R2. Then

span(Y)=R2,

as any point (x,y)R2 can be written as a linear combination (x,y)=x(1,1)+(yx)(0,1). A sum of two linear combinations is again a linear combination, and a scalar multiple of a linear combination is a linear combination, which proves the following proposition. Let X be a vector space. For any YX, the set span(Y) is a vector space itself. If Y is already a vector space then span(Y)=Y. A set of vectors {x1,x2,,xk} is linearly independent, if the only solution to

a1x1+a2x2++akxk=0

is the trivial solution a1=a2==ak=0. A set that is not linearly independent, is linearly dependent. A linearly independent set B of vectors such that span(B)=X is called a basis of X. For example the set Y of the two vectors in is a basis of R2. If a vector space X contains a linearly independent set of d vectors, but no linearly independent set of d+1 vectors then we say the dimension or dimX:=d. If for all dN the vector space X contains a set of d linearly independent vectors, we say X is infinite dimensional and write dimX:=. Clearly for the trivial vector space, dim{0}=0. We will see in a moment that any vector space that is a subset of Rn has a finite dimension, and that dimension is less than or equal to n. If a set is linearly dependent, then one of the vectors is a linear combination of the others. In other words, if aj0, then we can solve for xj

xj=a1ajx1++aj1ajxj1+aj+1ajxj+1++akakxk.

Clearly then the vector xj has at least two different representations as linear combinations of {x1,x2,,xk}. If B={x1,x2,,xk} is a basis of a vector space X, then every point yX has a unique representation of the form y=kj=1αjxj for some numbers α1,α2,,αk. Every yX is a linear combination of elements of B since X is the span of B. For uniqueness suppose y=kj=1αjxj=kj=1βjxj, then kj=1(αjβj)xj=0. By linear independence of the basis αj=βj for all j. For Rn we define e1:=(1,0,0,,0),e2:=(0,1,0,,0),,en:=(0,0,0,,1), and call this the standard basis of Rn. We use the same letters ej for any Rn, and which space Rn we are working in is understood from context. A direct computation shows that {e1,e2,,en} is really a basis of Rn; it is easy to show that it spans Rn and is linearly independent. In fact,

x=(x1,x2,,xn)=nj=1xjej.

Let X be a vector space. If X is spanned by d vectors, then dimXd. dimX=d if and only if X has a basis of d vectors (and so every basis has d vectors). In particular, dimRn=n. If YX is a vector space and dimX=d, then dimYd. If dimX=d and a set T of d vectors spans X, then T is linearly independent. If dimX=d and a set T of m vectors is linearly independent, then there is a set S of dm vectors such that TS is a basis of X. Let us start with (i). Suppose S={x1,x2,,xd} spans X, and T={y1,y2,,ym} is a set of linearly independent vectors of X. We wish to show that md. Write y1=dk=1αk1xk, which we can do as S spans X. One of the αk1 is nonzero (otherwise y1 would be zero), so suppose without loss of generality that this is α11. Then we can solve x1=1α11y1dk=2αk1α11xk. In particular {y1,x2,,xd} span X, since x1 can be obtained from {y1,x2,,xd}. Next, y2=α12y1+dk=2αk2xk. As T is linearly independent, we must have that one of the αk2 for k2 must be nonzero. Without loss of generality suppose α220. Proceed to solve for x2=1α22y2α12α22y1dk=3αk2α22xk. In particular {y1,y2,x3,,xd} spans X. The astute reader will think back to linear algebra and notice that we are row-reducing a matrix. We continue this procedure. If m<d, then we are done. So suppose md. After d steps we obtain that {y1,y2,,yd} spans X. Any other vector v in X is a linear combination of {y1,y2,,yd}, and hence cannot be in T as T is linearly independent. So m=d. Let us look at (ii). First notice that if we have a set T of k linearly independent vectors that do not span X, then we can always choose a vector vXspan(T). The set T{v} is linearly independent (exercise). If dimX=d, then there must exist some linearly independent set of d vectors T, and it must span X, otherwise we could choose a larger set of linearly independent vectors. So we have a basis of d vectors. On the other hand if we have a basis of d vectors, it is linearly independent and spans X. By (i) we know there is no set of d+1 linearly independent vectors, so dimension must be d. For (iii) notice that {e1,e2,,en} is a basis of Rn. To see (iv), suppose Y is a vector space and YX, where dimX=d. As X cannot contain d+1 linearly independent vectors, neither can Y. For (v) suppose T is a set of m vectors that is linearly dependent and spans X. Then one of the vectors is a linear combination of the others. Therefore if we remove it from T we obtain a set of m1 vectors that still span X and hence dimXm1. For (vi) suppose T={x1,,xm} is a linearly independent set. We follow the procedure above in the proof of (ii) to keep adding vectors while keeping the set linearly independent. As the dimension is d we can add a vector exactly dm times. Linear mappings A mapping A:XY of vector spaces X and Y is linear (or a linear transformation) if for every aR and x,yX we have A(ax)=aA(x)A(x+y)=A(x)+A(y). We usually write Ax instead of A(x) if A is linear. If A is one-to-one an onto then we say A is invertible and we denote the inverse by A1. If A:XX is linear then we say A is a linear operator on X. We write L(X,Y) for the set of all linear transformations from X to Y, and just L(X) for the set of linear operators on X. If a,bR and A,BL(X,Y), define the transformation aA+bB (aA+bB)(x)=aAx+bBx. If AL(Y,Z) and BL(X,Y), define the transformation AB as ABx:=A(Bx). Finally denote by IL(X) the identity: the linear operator such that Ix=x for all x. It is not hard to see that aA+bBL(X,Y), and that ABL(X,Z). In particular, L(X,Y) is a vector space. It is obvious that if A is linear then A0=0. If A:XY is invertible, then A1 is linear. Let aR and yY. As A is onto, then there is an x such that y=Ax, and further as it is also one-to-one A1(Az)=z for all zX. So

A1(ay)=A1(aAx)=A1(A(ax))=ax=aA1(y).

Similarly let y1,y2Y, and x1,x2X such that Ax1=y1 and Ax2=y2, then A1(y1+y2)=A1(Ax1+Ax2)=A1(A(x1+x2))=x1+x2=A1(y1)+A1(y2).\qedhere If A:XY is linear then it is completely determined by its values on a basis of X. Furthermore, if B is a basis, then any function ˜A:BY extends to a linear function on X. For infinite dimensional spaces, the proof is essentially the same, but a little trickier to write, so let us stick with finitely many dimensions. We leave the infinite dimensional case to the reader. Let {x1,x2,,xn} be a basis and suppose A(xj)=yj. Then every xX has a unique representation x=nj=1bjxj for some numbers b1,b2,,bn. Then by linearity Ax=Anj=1bjxj=nj=1bjAxj=nj=1bjyj. The “furthermore” follows by defining the extension Ax=nj=1bjyj, and noting that this is well defined by uniqueness of the representation of x. If X is a finite dimensional vector space and A:XX is linear, then A is one-to-one if and only if it is onto. Let {x1,x2,,xn} be a basis for X. Suppose A is one-to-one. Now suppose

nj=1cjAxj=Anj=1cjxj=0.

As A is one-to-one, the only vector that is taken to 0 is 0 itself. Hence, 0=nj=1cjxj and so cj=0 for all j. Therefore, {Ax1,Ax2,,Axn} is linearly independent. By an above proposition and the fact that the dimension is n, we have that {Ax1,Ax2,,Axn} span X. As any point xX can be written as x=nj=1ajAxj=Anj=1ajxj, so A is onto. Now suppose A is onto. As A is determined by the action on the basis we see that every element of X has to be in the span of {Ax1,,Axn}. Suppose Anj=1cjxj=nj=1cjAxj=0. By the same proposition as {Ax1,Ax2,,Axn} span X, the set is independent, and hence cj=0 for all j. This means that A is one-to-one. If Ax=Ay, then A(xy)=0 and so x=y. Convexity A subset U of a vector space is convex if whenever x,yU, the line segment from x to y lies in U. That is, if the convex combination (1t)x+ty is in U for all t[0,1]. See . Note that in R, every connected interval is convex. In R2 (or higher dimensions) there are lots of nonconvex connected sets. For example the set R2{0} is not convex but it is connected. To see this simply take any xR2{0} and let y:=x. Then (\nicefrac12)x+(\nicefrac12)y=0, which is not in the set. On the other hand, the ball B(x,r)Rn (using the standard metric on Rn) is always convex by the triangle inequality. Show that in Rn any ball B(x,r) for xRn and r > 0 is convex. Any subspace V of a vector space X is convex. A somewhat more complicated example is given by the following. Let C([0,1],{\mathbb{R}}) be the vector space of continuous real valued functions on {\mathbb{R}}. Let X \subset C([0,1],{\mathbb{R}}) be the set of those f such

\int_0^1 f(x)~dx \leq 1 \qquad \text{and} \qquad f(x) \geq 0 \text{ for all $x \in [0,1]$} .

Then X is convex. Take t \in [0,1] and note that if f,g \in X then t f(x) + (1-t) g(x) \geq 0 for all x. Furthermore \int_0^1 \bigl(tf(x) + (1-t)g(x)\bigr) ~dx = t \int_0^1 f(x) ~dx + (1-t)\int_0^1 g(x) ~dx \leq 1 . Note that X is not a subspace of C([0,1],{\mathbb{R}}). The intersection two closed sets is convex. In fact, If \{ C_\lambda \}_{\lambda \in I} is an arbitrary collection of convex sets, then

C := \bigcap_{\lambda \in I} C_\lambda

is convex. The proof is easy. If x, y \in C, then x,y \in C_\lambda for all \lambda \in I, and hence if t \in [0,1], then tx + (1-t)y \in C_\lambda for all \lambda \in I. Therefore tx + (1-t)y \in C and C is convex. Let T \colon V \to W be a linear mapping between two vector spaces and let C \subset V be a convex set. Then T(C) is convex. Take any two points p,q \in T(C). Then pick x,y \in C such that T(x) = p and T(y)=q. As C is convex then for all t \in [0,1] we have tx+(1-t)y \in C, so T\bigl(tx+(1-t)y\bigr) = tT(x)+(1-t)T(y) = tp+(1-t)q is in T(C). For completeness, let us A very useful construction is the convex hull. Given any set S \subset V of a vector space, define the convex hull of S, by

\operatorname{co}(S) := \bigcap \{ C \subset V : S \subset C, \text{ and $C$ is convex} \} .

That is, the convex hull is the smallest convex set containing S. Note that by a proposition above, the intersection of convex sets is convex and hence, the convex hull is convex. The convex hull of 0 and 1 in {\mathbb{R}} is [0,1]. Proof: Any convex set containing 0 and 1 must contain [0,1]. The set [0,1] is convex, therefore it must be the convex hull. Exercises Verify that {\mathbb{R}}^n is a vector space. Let X be a vector space. Prove that a finite set of vectors \{ x_1,\ldots,x_n \} \subset X is linearly independent if and only if for every j=1,2,\ldots,n \operatorname{span}( \{ x_1,\ldots,x_{j-1},x_{j+1},\ldots,x_n \}) \subsetneq \operatorname{span}( \{ x_1,\ldots,x_n \}) . That is, the span of the set with one vector removed is strictly smaller. Prove that C([0,1],{\mathbb{R}}) is an infinite dimensional vector space where the operations are defined in the obvious way: s=f+g and m=fg are defined as s(x) := f(x)+g(x) and m(x) := f(x)g(x). Hint: for the dimension, think of functions that are only nonzero on the interval (\nicefrac{1}{n+1},\nicefrac{1}{n}). Let k \colon [0,1]^2 \to {\mathbb{R}} be continuous. Show that L \colon C([0,1],{\mathbb{R}}) \to C([0,1],{\mathbb{R}}) defined by Lf(y) := \int_0^1 k(x,y)f(x)~dx is a linear operator. That is, show that L is well defined (that Lf is continuous), and that L is linear.


9.1: Vector Spaces, linear Mappings, and Convexity is shared under a not declared license and was authored, remixed, and/or curated by LibreTexts.

Support Center

How can we help?