
1.5: Linear and Affine Functions


One of the central themes of calculus is the approximation of nonlinear functions by linear functions, with the fundamental concept being the derivative of a function. This section will introduce the linear and affine functions which will be key to understanding derivatives in the chapters ahead.

Linear functions

In the following, we will use the notation \(f: \mathbb{R}^m \to \mathbb{R}^n\) to indicate a function whose domain is a subset of \(\mathbb{R}^m\) and whose range is a subset of \(\mathbb{R}^n\). In other words, \(f\) takes a vector with \(m\) coordinates for input and returns a vector with \(n\) coordinates. For example, the function

\[ f(x, y, z) = (\sin(x + y), 2x^2 + z) \]

is a function from \(\mathbb{R}^3\) to \(\mathbb{R}^2\).

Definition 1.5.1

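We say a function \(L: \mathbb{R}^m \to \mathbb{R}^n\) is linear if (1) for any vectors \(\mathbf{x}\) and \(\mathbf{y}\) in \(\mathbb{R}^m\),

\[ L(\mathbf{x} + \mathbf{y}) = L(\mathbf{x}) + L(\mathbf{y}), \]

and (2) for any vector \(\mathbf{x}\) in \(\mathbb{R}^m\) and scalar \(a\),

\[ L(a\mathbf{x}) = aL(\mathbf{x}). \]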

Example 1.5.1

Suppose \(f: \mathbb{R} \to \mathbb{R}\) is defined by \(f(x) = 3x\). Then for any \(x\) and \(y\) in \(\mathbb{R}\),

\[ f(x + y) = 3(x + y) = 3x + 3y = f(x) + f(y), \]

and for any scalar \(a\),

\[ f(ax) = 3ax = af(x). \]

Thus \(f\) is linear.

Example 1.5.2

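Suppose \(L: \mathbb{R}^2 \to \mathbb{R}^3\) is defined by

\[ L(x_1, x_2) = (2x_1 + 3x_2, x_1 - x_2, 4x_2). \]

Then if \(\mathbf{x} = (x_1, x_2)\) and \(\mathbf{y} = (y_1, y_2)\) are vectors in \(\mathbb{R}^2\),

\[ \begin{aligned}
L(\mathbf{x} + \mathbf{y}) &= L(x_1 + y_1, x_2 + y_2) \\
&= (2(x_1 + y_1) + 3(x_2 + y_2), x_1 + y_1 - (x_2 + y_2), 4(x_2 + y_2)) \\
&= (2x_1 + 3x_2, x_1 - x_2, 4x_2) + (2y_1 + 3y_2, y_1 - y_2, 4y_2) \\
&= L(x_1, x_2) + L(y_1, y_2) \\
&= L(\mathbf{x}) + L(\mathbf{y}).
\end{aligned} \]

Also, for \(\mathbf{x} = (x_1, x_2)\) and any scalar \(a\), we have

\[ \begin{aligned}
L(a\mathbf{x}) &= L(ax_1, ax_2) \\
&= (2ax_1 + 3ax_2, ax_1 - ax_2, 4ax_2) \\
&= a(2x_1 + 3x_2, x_1 - x_2, 4x_2) \\
&= aL(\mathbf{x}).
\end{aligned} \]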

Thus L is linear.
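The two conditions of Definition 1.5.1 can also be spot-checked numerically. The following is a minimal sketch in Python, assuming the map of Example 1.5.2 is \(L(x_1, x_2) = (2x_1 + 3x_2, x_1 - x_2, 4x_2)\); the helper names `add`, `scale`, and `looks_linear` are ours and are not part of the text. Integer inputs are used so that the comparisons are exact.

```python
import random

def L(x):
    """The map of Example 1.5.2, taken here as (2*x1 + 3*x2, x1 - x2, 4*x2)."""
    x1, x2 = x
    return (2 * x1 + 3 * x2, x1 - x2, 4 * x2)

def add(u, v):
    """Coordinate-wise sum of two vectors."""
    return tuple(ui + vi for ui, vi in zip(u, v))

def scale(a, u):
    """Scalar multiple a*u."""
    return tuple(a * ui for ui in u)

def looks_linear(f, dim, trials=100, seed=0):
    """Spot-check additivity and homogeneity on random integer vectors.

    Passing every trial is evidence, not a proof; a failure is a
    genuine counterexample to linearity.
    """
    rng = random.Random(seed)
    for _ in range(trials):
        x = tuple(rng.randint(-10, 10) for _ in range(dim))
        y = tuple(rng.randint(-10, 10) for _ in range(dim))
        a = rng.randint(-10, 10)
        if f(add(x, y)) != add(f(x), f(y)):
            return False
        if f(scale(a, x)) != scale(a, f(x)):
            return False
    return True

print(looks_linear(L, dim=2))  # True
```

A check of this kind can only refute linearity; the algebraic argument in the example is what establishes it.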

Now suppose \(L: \mathbb{R} \to \mathbb{R}\) is a linear function and let \(a = L(1)\). Then for any real number \(x\),

\[ L(x) = L(x \cdot 1) = xL(1) = ax. \]

Since any function \(L: \mathbb{R} \to \mathbb{R}\) defined by \(L(x) = ax\), where \(a\) is a scalar, is linear (see Exercise 1), it follows that the only functions \(L: \mathbb{R} \to \mathbb{R}\) which are linear are those of the form \(L(x) = ax\) for some real number \(a\). For example, \(f(x) = 5x\) is a linear function, but \(g(x) = \sin(x)\) is not.

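Next, suppose \(L: \mathbb{R}^m \to \mathbb{R}\) is linear and let \(a_1 = L(\mathbf{e}_1), a_2 = L(\mathbf{e}_2), \ldots, a_m = L(\mathbf{e}_m)\). If \(\mathbf{x} = (x_1, x_2, \ldots, x_m)\) is a vector in \(\mathbb{R}^m\), then we know that

\[ \mathbf{x} = x_1\mathbf{e}_1 + x_2\mathbf{e}_2 + \cdots + x_m\mathbf{e}_m. \]

Thus

\[ \begin{aligned}
L(\mathbf{x}) &= L(x_1\mathbf{e}_1 + x_2\mathbf{e}_2 + \cdots + x_m\mathbf{e}_m) \\
&= L(x_1\mathbf{e}_1) + L(x_2\mathbf{e}_2) + \cdots + L(x_m\mathbf{e}_m) \\
&= x_1L(\mathbf{e}_1) + x_2L(\mathbf{e}_2) + \cdots + x_mL(\mathbf{e}_m) \\
&= x_1a_1 + x_2a_2 + \cdots + x_ma_m \\
&= \mathbf{a} \cdot \mathbf{x},
\end{aligned} \]

where \(\mathbf{a} = (a_1, a_2, \ldots, a_m)\). Since for any vector \(\mathbf{a}\) in \(\mathbb{R}^m\), the function \(L(\mathbf{x}) = \mathbf{a} \cdot \mathbf{x}\) is linear (see Exercise 1), it follows that the only functions \(L: \mathbb{R}^m \to \mathbb{R}\) which are linear are those of the form \(L(\mathbf{x}) = \mathbf{a} \cdot \mathbf{x}\) for some fixed vector \(\mathbf{a}\) in \(\mathbb{R}^m\). For example,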

\[ f(x, y) = (2, -3) \cdot (x, y) = 2x - 3y \]

is a linear function from \(\mathbb{R}^2\) to \(\mathbb{R}\), but

\[ f(x, y, z) = x^2 y + \sin(z) \]

is not a linear function from \(\mathbb{R}^3\) to \(\mathbb{R}\).
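The derivation above says that a linear function from \(\mathbb{R}^m\) to \(\mathbb{R}\) is completely determined by its values on the standard basis vectors. As a small illustration, the sketch below recovers the vector \(\mathbf{a}\) by evaluating a function at \(\mathbf{e}_1, \ldots, \mathbf{e}_m\). It assumes the linear example is \(f(x, y) = 2x - 3y\); the names `basis_vector`, `dot`, and `coefficient_vector` are ours.

```python
def basis_vector(k, m):
    """Return e_k in R^m (k is 1-based): 1 in position k, 0 elsewhere."""
    return tuple(1 if i == k - 1 else 0 for i in range(m))

def dot(a, x):
    """Dot product of two vectors of the same length."""
    return sum(ai * xi for ai, xi in zip(a, x))

def coefficient_vector(L, m):
    """For a linear L: R^m -> R, return a = (L(e_1), ..., L(e_m))."""
    return tuple(L(basis_vector(k, m)) for k in range(1, m + 1))

def f(v):
    """The linear example, taken here as f(x, y) = 2x - 3y."""
    x, y = v
    return 2 * x - 3 * y

a = coefficient_vector(f, 2)
print(a)                 # (2, -3)
print(dot(a, (5, 7)))    # -11
print(f((5, 7)))         # -11
```

Note that `coefficient_vector` returns a meaningful answer only when its argument really is linear; applied to a nonlinear function, the resulting dot product will in general disagree with the function.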

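Now consider the general case where \(L: \mathbb{R}^m \to \mathbb{R}^n\) is a linear function. Given a vector \(\mathbf{x}\) in \(\mathbb{R}^m\), let \(L_k(\mathbf{x})\) be the \(k\)th coordinate of \(L(\mathbf{x})\), \(k = 1, 2, \ldots, n\). That is,

\[ L(\mathbf{x}) = (L_1(\mathbf{x}), L_2(\mathbf{x}), \ldots, L_n(\mathbf{x})). \]

Since \(L\) is linear, for any \(\mathbf{x}\) and \(\mathbf{y}\) in \(\mathbb{R}^m\) we have

\[ L(\mathbf{x} + \mathbf{y}) = L(\mathbf{x}) + L(\mathbf{y}), \]

or, in terms of the coordinate functions,

\[ \begin{aligned}
(L_1(\mathbf{x} + \mathbf{y}), L_2(\mathbf{x} + \mathbf{y}), \ldots, L_n(\mathbf{x} + \mathbf{y})) &= (L_1(\mathbf{x}), L_2(\mathbf{x}), \ldots, L_n(\mathbf{x})) + (L_1(\mathbf{y}), L_2(\mathbf{y}), \ldots, L_n(\mathbf{y})) \\
&= (L_1(\mathbf{x}) + L_1(\mathbf{y}), L_2(\mathbf{x}) + L_2(\mathbf{y}), \ldots, L_n(\mathbf{x}) + L_n(\mathbf{y})).
\end{aligned} \]

Hence \(L_k(\mathbf{x} + \mathbf{y}) = L_k(\mathbf{x}) + L_k(\mathbf{y})\) for \(k = 1, 2, \ldots, n\). Similarly, if \(\mathbf{x}\) is in \(\mathbb{R}^m\) and \(a\) is a scalar, then \(L(a\mathbf{x}) = aL(\mathbf{x})\), so

\[ \begin{aligned}
(L_1(a\mathbf{x}), L_2(a\mathbf{x}), \ldots, L_n(a\mathbf{x})) &= a(L_1(\mathbf{x}), L_2(\mathbf{x}), \ldots, L_n(\mathbf{x})) \\
&= (aL_1(\mathbf{x}), aL_2(\mathbf{x}), \ldots, aL_n(\mathbf{x})).
\end{aligned} \]

Hence \(L_k(a\mathbf{x}) = aL_k(\mathbf{x})\) for \(k = 1, 2, \ldots, n\). Thus for each \(k = 1, 2, \ldots, n\), \(L_k: \mathbb{R}^m \to \mathbb{R}\) is a linear function. It follows from our work above that, for each \(k = 1, 2, \ldots, n\), there is a fixed vector \(\mathbf{a}_k\) in \(\mathbb{R}^m\) such that \(L_k(\mathbf{x}) = \mathbf{a}_k \cdot \mathbf{x}\) for all \(\mathbf{x}\) in \(\mathbb{R}^m\). Hence we have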

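\[ L(\mathbf{x}) = (\mathbf{a}_1 \cdot \mathbf{x}, \mathbf{a}_2 \cdot \mathbf{x}, \ldots, \mathbf{a}_n \cdot \mathbf{x}) \]

for all \(\mathbf{x}\) in \(\mathbb{R}^m\). Since any function defined in this way is linear (see Exercise 1 again), it follows that the only linear functions from \(\mathbb{R}^m\) to \(\mathbb{R}^n\) must be of this form.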

Theorem 1.5.1

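If \(L: \mathbb{R}^m \to \mathbb{R}^n\) is linear, then there exist vectors \(\mathbf{a}_1, \mathbf{a}_2, \ldots, \mathbf{a}_n\) in \(\mathbb{R}^m\) such that

\[ L(\mathbf{x}) = (\mathbf{a}_1 \cdot \mathbf{x}, \mathbf{a}_2 \cdot \mathbf{x}, \ldots, \mathbf{a}_n \cdot \mathbf{x}) \]

for all \(\mathbf{x}\) in \(\mathbb{R}^m\).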

Example 1.5.3

In a previous example, we showed that the function \(L: \mathbb{R}^2 \to \mathbb{R}^3\) defined by

\[ L(x_1, x_2) = (2x_1 + 3x_2, x_1 - x_2, 4x_2) \]

is linear. We can see this more easily now by noting that

\[ L(x_1, x_2) = ((2, 3) \cdot (x_1, x_2), (1, -1) \cdot (x_1, x_2), (0, 4) \cdot (x_1, x_2)). \]

Example 1.5.4

The function

\[ f(x, y, z) = (x + y, \sin(x + y + z)) \]

is not linear since it cannot be written in the form given in Theorem 1.5.1. In particular, the function \(f_2(x, y, z) = \sin(x + y + z)\) is not linear; from our work above, it follows that \(f\) is not linear.
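One concrete way to see the failure is to exhibit a single vector for which homogeneity breaks. The sketch below, which uses only Python's standard `math` module, compares \(f(2\mathbf{v})\) with \(2f(\mathbf{v})\) for \(\mathbf{v} = (0, 0, \pi/2)\); the choice of \(\mathbf{v}\) is ours and is only one of many possible counterexamples.

```python
import math

def f(v):
    """The function of Example 1.5.4: (x + y, sin(x + y + z))."""
    x, y, z = v
    return (x + y, math.sin(x + y + z))

v = (0.0, 0.0, math.pi / 2)
doubled = tuple(2 * c for c in v)

lhs = f(doubled)                    # f(2v): second coordinate is sin(pi), about 0
rhs = tuple(2 * c for c in f(v))    # 2 f(v): second coordinate is 2 sin(pi/2) = 2

print(lhs)
print(rhs)
```

The first coordinates agree, as they must since \(x + y\) is linear, but the second coordinates differ by about 2, so condition (2) of Definition 1.5.1 fails.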

Matrix notation

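We will now develop some notation to simplify working with expressions such as the one in Theorem 1.5.1. First, we define an \(n \times m\) matrix to be an array of real numbers with \(n\) rows and \(m\) columns. For example,

\[ M = \begin{bmatrix} 2 & 3 \\ 1 & -1 \\ 0 & 4 \end{bmatrix} \]

is a \(3 \times 2\) matrix. Next, we will identify a vector \(\mathbf{x} = (x_1, x_2, \ldots, x_m)\) in \(\mathbb{R}^m\) with the \(m \times 1\) matrix

\[ \mathbf{x} = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_m \end{bmatrix}, \]

which is called a column vector. Now define the product \(M\mathbf{x}\) of an \(n \times m\) matrix \(M\) with an \(m \times 1\) column vector \(\mathbf{x}\) to be the \(n \times 1\) column vector whose \(k\)th entry, \(k = 1, 2, \ldots, n\), is the dot product of the \(k\)th row of \(M\) with \(\mathbf{x}\). For example,

\[ \begin{bmatrix} 2 & 3 \\ 1 & -1 \\ 0 & 4 \end{bmatrix} \begin{bmatrix} 2 \\ 1 \end{bmatrix} = \begin{bmatrix} 4 + 3 \\ 2 - 1 \\ 0 + 4 \end{bmatrix} = \begin{bmatrix} 7 \\ 1 \\ 4 \end{bmatrix}. \]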

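In fact, for any vector \(\mathbf{x} = (x_1, x_2)\) in \(\mathbb{R}^2\),

\[ \begin{bmatrix} 2 & 3 \\ 1 & -1 \\ 0 & 4 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 2x_1 + 3x_2 \\ x_1 - x_2 \\ 4x_2 \end{bmatrix}. \]

In other words, if we let

\[ L(x_1, x_2) = (2x_1 + 3x_2, x_1 - x_2, 4x_2), \]

as in a previous example, then, using column vectors, we could write

\[ L(x_1, x_2) = \begin{bmatrix} 2 & 3 \\ 1 & -1 \\ 0 & 4 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}. \]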
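The definition of the product \(M\mathbf{x}\) translates directly into code: each entry of the result is the dot product of one row of \(M\) with \(\mathbf{x}\). The following is a minimal sketch in plain Python, storing a matrix as a list of rows; the function name `matvec` is ours, and the matrix entries assume the signs used in the worked product \((7, 1, 4)\) above.

```python
def matvec(M, x):
    """Multiply an n-by-m matrix (a list of n rows) by a vector with m entries.

    Entry k of the result is the dot product of row k of M with x.
    """
    if any(len(row) != len(x) for row in M):
        raise ValueError("each row of M must have as many entries as x")
    return [sum(r * xi for r, xi in zip(row, x)) for row in M]

M = [[2, 3],
     [1, -1],
     [0, 4]]

print(matvec(M, [2, 1]))  # [7, 1, 4]
```

The length check mirrors the requirement that an \(n \times m\) matrix can only be multiplied by a column vector with \(m\) entries.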

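In general, consider a linear function \(L: \mathbb{R}^m \to \mathbb{R}^n\) defined by

\[ L(\mathbf{x}) = (\mathbf{a}_1 \cdot \mathbf{x}, \mathbf{a}_2 \cdot \mathbf{x}, \ldots, \mathbf{a}_n \cdot \mathbf{x}) \]

for some vectors \(\mathbf{a}_1, \mathbf{a}_2, \ldots, \mathbf{a}_n\) in \(\mathbb{R}^m\). If we let \(M\) be the \(n \times m\) matrix whose \(k\)th row is \(\mathbf{a}_k\), \(k = 1, 2, \ldots, n\), then

\[ L(\mathbf{x}) = M\mathbf{x} \]

for any \(\mathbf{x}\) in \(\mathbb{R}^m\). Now, from our work above,

\[ \mathbf{a}_k = (L_k(\mathbf{e}_1), L_k(\mathbf{e}_2), \ldots, L_k(\mathbf{e}_m)), \]

which means that the \(j\)th column of \(M\) is

\[ \begin{bmatrix} L_1(\mathbf{e}_j) \\ L_2(\mathbf{e}_j) \\ \vdots \\ L_n(\mathbf{e}_j) \end{bmatrix}, \]

\(j = 1, 2, \ldots, m\). But this column is just \(L(\mathbf{e}_j)\) written as a column vector. Hence \(M\) is the matrix whose columns are given by the column vectors \(L(\mathbf{e}_1), L(\mathbf{e}_2), \ldots, L(\mathbf{e}_m)\).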

Theorem 1.5.2

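Suppose \(L: \mathbb{R}^m \to \mathbb{R}^n\) is a linear function and \(M\) is the \(n \times m\) matrix whose \(j\)th column is \(L(\mathbf{e}_j)\), \(j = 1, 2, \ldots, m\). Then for any vector \(\mathbf{x}\) in \(\mathbb{R}^m\),

\[ L(\mathbf{x}) = M\mathbf{x}. \]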

Example 1.5.5

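Suppose \(L: \mathbb{R}^3 \to \mathbb{R}^2\) is defined by

\[ L(x, y, z) = (3x - 2y + z, 4x + y). \]

Then

\[ L(\mathbf{e}_1) = L(1, 0, 0) = (3, 4), \quad L(\mathbf{e}_2) = L(0, 1, 0) = (-2, 1), \]

and

\[ L(\mathbf{e}_3) = L(0, 0, 1) = (1, 0). \]

So if we let

\[ M = \begin{bmatrix} 3 & -2 & 1 \\ 4 & 1 & 0 \end{bmatrix}, \]

then

\[ L(x, y, z) = \begin{bmatrix} 3 & -2 & 1 \\ 4 & 1 & 0 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix}. \]

For example,

\[ L(1, -1, 3) = \begin{bmatrix} 3 & -2 & 1 \\ 4 & 1 & 0 \end{bmatrix} \begin{bmatrix} 1 \\ -1 \\ 3 \end{bmatrix} = \begin{bmatrix} 3 + 2 + 3 \\ 4 - 1 + 0 \end{bmatrix} = \begin{bmatrix} 8 \\ 3 \end{bmatrix}. \]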
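Theorem 1.5.2 gives a recipe for finding the matrix of a linear function: evaluate it at each standard basis vector and use the results as columns. The sketch below carries this out for the function of this example, taken here as \(L(x, y, z) = (3x - 2y + z, 4x + y)\); the helper names `matrix_of` and `matvec` are ours.

```python
def L(v):
    """The function of Example 1.5.5, taken here as (3x - 2y + z, 4x + y)."""
    x, y, z = v
    return (3 * x - 2 * y + z, 4 * x + y)

def matrix_of(L, m):
    """Return the matrix (a list of rows) whose j-th column is L(e_j)."""
    columns = []
    for j in range(m):
        e_j = tuple(1 if i == j else 0 for i in range(m))
        columns.append(L(e_j))
    # Transpose the list of columns into a list of rows.
    return [list(row) for row in zip(*columns)]

def matvec(M, x):
    """Entry k of the product is the dot product of row k of M with x."""
    return [sum(r * xi for r, xi in zip(row, x)) for row in M]

M = matrix_of(L, 3)
print(M)                       # [[3, -2, 1], [4, 1, 0]]
print(matvec(M, (1, -1, 3)))   # [8, 3]
```

The recipe is valid only because \(L\) is linear; for a nonlinear function the matrix built this way would not reproduce the function's values.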

Example 1.5.6

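Let \(R_\theta: \mathbb{R}^2 \to \mathbb{R}^2\) be the function that rotates a vector \(\mathbf{x}\) in \(\mathbb{R}^2\) counterclockwise through an angle \(\theta\), as shown in Figure 1.5.1. Geometrically, it seems reasonable that \(R_\theta\) is a linear function; that is, rotating the vector \(\mathbf{x} + \mathbf{y}\) through an angle \(\theta\) should give the same result as first rotating \(\mathbf{x}\) and \(\mathbf{y}\) separately through an angle \(\theta\) and then adding, and rotating a vector \(a\mathbf{x}\) through an angle \(\theta\) should give the same result as first rotating \(\mathbf{x}\) through an angle \(\theta\) and then multiplying by \(a\). Now, from the definition of \(\cos(\theta)\) and \(\sin(\theta)\),

\[ R_\theta(\mathbf{e}_1) = R_\theta(1, 0) = (\cos(\theta), \sin(\theta)) \]

(see Figure 1.5.2), and, since \(\mathbf{e}_2\) is \(\mathbf{e}_1\) rotated, counterclockwise, through an angle \(\frac{\pi}{2}\),

\[ R_\theta(\mathbf{e}_2) = R_{\theta + \frac{\pi}{2}}(\mathbf{e}_1) = \left(\cos\left(\theta + \frac{\pi}{2}\right), \sin\left(\theta + \frac{\pi}{2}\right)\right) = (-\sin(\theta), \cos(\theta)). \]

Hence

\[ R_\theta(x, y) = \begin{bmatrix} \cos(\theta) & -\sin(\theta) \\ \sin(\theta) & \cos(\theta) \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix}. \]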

Figure 1.5.1: Rotating a vector in the plane
Figure 1.5.2: Rotating \(\mathbf{e}_1\) through an angle \(\theta\)

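You are asked in Exercise 9 to verify that the linear function \(R_\theta\) defined by the matrix above does in fact rotate vectors through an angle \(\theta\) in the counterclockwise direction. Note that, for example, when \(\theta = \frac{\pi}{2}\), we have

\[ R_{\frac{\pi}{2}}(x, y) = \begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix}. \]

In particular, note that \(R_{\frac{\pi}{2}}(1, 0) = (0, 1)\) and \(R_{\frac{\pi}{2}}(0, 1) = (-1, 0)\); that is, \(R_{\frac{\pi}{2}}\) takes \(\mathbf{e}_1\) to \(\mathbf{e}_2\) and \(\mathbf{e}_2\) to \(-\mathbf{e}_1\). For another example, if \(\theta = \frac{\pi}{6}\), then

\[ R_{\frac{\pi}{6}}(x, y) = \begin{bmatrix} \frac{\sqrt{3}}{2} & -\frac{1}{2} \\ \frac{1}{2} & \frac{\sqrt{3}}{2} \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix}. \]

In particular,

\[ R_{\frac{\pi}{6}}(1, 2) = \begin{bmatrix} \frac{\sqrt{3}}{2} & -\frac{1}{2} \\ \frac{1}{2} & \frac{\sqrt{3}}{2} \end{bmatrix} \begin{bmatrix} 1 \\ 2 \end{bmatrix} = \begin{bmatrix} \frac{\sqrt{3}}{2} - 1 \\ \frac{1}{2} + \sqrt{3} \end{bmatrix} = \begin{bmatrix} \frac{\sqrt{3} - 2}{2} \\ \frac{1 + 2\sqrt{3}}{2} \end{bmatrix}. \]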
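These computations are easy to reproduce numerically. The sketch below builds the rotation matrix with first column \((\cos(\theta), \sin(\theta))\) and second column \((-\sin(\theta), \cos(\theta))\), as derived in this example, and applies it to the vector \((1, 2)\) with \(\theta = \pi/6\). The function names `rotation_matrix` and `rotate` are ours, and because the arithmetic is floating point, the results agree with the exact values only up to rounding.

```python
import math

def rotation_matrix(theta):
    """Matrix of counterclockwise rotation through theta, as a list of rows."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s],
            [s, c]]

def rotate(theta, v):
    """Apply the rotation matrix for theta to the vector v in R^2."""
    M = rotation_matrix(theta)
    return tuple(sum(r * vi for r, vi in zip(row, v)) for row in M)

x1, y1 = rotate(math.pi / 6, (1, 2))

# Compare with the closed forms (sqrt(3) - 2)/2 and (1 + 2 sqrt(3))/2.
print(x1, (math.sqrt(3) - 2) / 2)
print(y1, (1 + 2 * math.sqrt(3)) / 2)

# A rotation should preserve length: both values are sqrt(5) up to rounding.
print(math.hypot(x1, y1), math.hypot(1, 2))
```

Checking that lengths are preserved is a partial sanity check related to Exercise 9; it does not replace the verification asked for there.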

Affine functions

Definition 1.5.2

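We say a function \(A: \mathbb{R}^m \to \mathbb{R}^n\) is affine if there is a linear function \(L: \mathbb{R}^m \to \mathbb{R}^n\) and a vector \(\mathbf{b}\) in \(\mathbb{R}^n\) such that

\[ A(\mathbf{x}) = L(\mathbf{x}) + \mathbf{b} \]

for all \(\mathbf{x}\) in \(\mathbb{R}^m\).

An affine function is just a linear function plus a translation. From our knowledge of linear functions, it follows that if \(A: \mathbb{R}^m \to \mathbb{R}^n\) is affine, then there is an \(n \times m\) matrix \(M\) and a vector \(\mathbf{b}\) in \(\mathbb{R}^n\) such that

\[ A(\mathbf{x}) = M\mathbf{x} + \mathbf{b} \]

for all \(\mathbf{x}\) in \(\mathbb{R}^m\). In particular, if \(f: \mathbb{R} \to \mathbb{R}\) is affine, then there are real numbers \(m\) and \(b\) such that

\[ f(x) = mx + b \]

for all real numbers \(x\).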

Example 1.5.7

The function

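\[ A(x, y) = (2x + 3, y - 4x + 1) \]

is an affine function from \(\mathbb{R}^2\) to \(\mathbb{R}^2\) since we may write it in the form

\[ A(x, y) = L(x, y) + (3, 1), \]

where \(L\) is the linear function

\[ L(x, y) = (2x, y - 4x). \]

Note that \(L(1, 0) = (2, -4)\) and \(L(0, 1) = (0, 1)\), so we may also write \(A\) in the form

\[ A(x, y) = \begin{bmatrix} 2 & 0 \\ -4 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} + \begin{bmatrix} 3 \\ 1 \end{bmatrix}. \]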
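The decomposition in this example can be automated. Since \(L(\mathbf{0}) = \mathbf{0}\) for any linear \(L\), the translation is \(\mathbf{b} = A(\mathbf{0})\), and the columns of the matrix are \(A(\mathbf{e}_j) - \mathbf{b}\). The following is a minimal sketch under the assumption that the example's function is \(A(x, y) = (2x + 3, y - 4x + 1)\); the helper name `affine_parts` is ours.

```python
def A(v):
    """The affine function of Example 1.5.7, taken here as (2x + 3, y - 4x + 1)."""
    x, y = v
    return (2 * x + 3, y - 4 * x + 1)

def affine_parts(A, m):
    """Split an affine A: R^m -> R^n into (M, b) with A(x) = Mx + b."""
    zero = tuple(0 for _ in range(m))
    b = A(zero)                      # the linear part vanishes at the origin
    columns = []
    for j in range(m):
        e_j = tuple(1 if i == j else 0 for i in range(m))
        # Column j of M is L(e_j) = A(e_j) - b.
        columns.append(tuple(p - q for p, q in zip(A(e_j), b)))
    M = [list(row) for row in zip(*columns)]
    return M, list(b)

M, b = affine_parts(A, 2)
print(M)  # [[2, 0], [-4, 1]]
print(b)  # [3, 1]
```

As with the linear case, the output is meaningful only if the input really is affine.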

Example 1.5.8

The affine function

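\[ A(x, y) = \begin{bmatrix} \frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} + \begin{bmatrix} 1 \\ 2 \end{bmatrix} \]

first rotates a vector in \(\mathbb{R}^2\) counterclockwise through an angle of \(\frac{\pi}{4}\) and then translates it by the vector \((1, 2)\).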
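To make the two steps concrete, the sketch below applies the rotation through \(\pi/4\) and then the translation by \((1, 2)\) as separate operations. It assumes the matrix entries are \(\pm 1/\sqrt{2}\) with the minus sign in the upper right, which is the counterclockwise rotation matrix for \(\theta = \pi/4\). Floating-point values are approximate.

```python
import math

def A(v):
    """Rotate v counterclockwise through pi/4, then translate by (1, 2)."""
    x, y = v
    s = 1 / math.sqrt(2)                       # cos(pi/4) = sin(pi/4)
    rotated = (s * x - s * y, s * x + s * y)   # the linear part
    return (rotated[0] + 1, rotated[1] + 2)    # the translation

print(A((0, 0)))   # (1.0, 2.0)
print(A((1, 1)))   # first entry 1.0, second entry approximately 2 + sqrt(2)
```

The first line of output illustrates why this function is affine but not linear: a linear function must send the zero vector to the zero vector, whereas here \(A(0, 0) = (1, 2)\).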


This page titled 1.5: Linear and Affine Functions is shared under a CC BY-NC-SA 1.0 license and was authored, remixed, and/or curated by Dan Sloughter via source content that was edited to the style and standards of the LibreTexts platform.
