1.1: Systems of Linear Equations
- Understand the definition of \(\mathbb{R}^n\text{,}\) and what it means to use \(\mathbb{R}^n\) to label points on a geometric object.
- Pictures: solutions of systems of linear equations, parameterized solution sets.
- Vocabulary words: consistent, inconsistent, solution set.
During the first half of this textbook, we will be primarily concerned with understanding the solutions of systems of linear equations.
An equation in the unknowns \(x,y,z,\ldots\) is called linear if both sides of the equation are a sum of (constant) multiples of \(x,y,z,\ldots\text{,}\) plus an optional constant.
For instance,
\[ \begin{split} 3x + 4y &= 2z \\ -x - z &= 100 \end{split} \nonumber \]
are linear equations, but
\[ \begin{split} 3x + yz &= 3 \\ \sin(x) - \cos(y) &= 2 \end{split} \nonumber \]
are not.
We will usually move the unknowns to the left side of the equation, and move the constants to the right.
A system of linear equations is a collection of several linear equations, like
\[\label{eq:1}\left\{\begin{array}{rrrrrrr} x &+& 2y &+& 3z &=& 6\\ 2x &-& 3y &+& 2z &=& 14\\ 3x &+& y &-& z &=& -2. \end{array}\right.\]
- A solution of a system of equations is a list of numbers \(x, y, z, \ldots\) that make all of the equations true simultaneously.
- The solution set of a system of equations is the collection of all solutions.
- Solving the system means finding all solutions with formulas involving some number of parameters.
A system of linear equations need not have a solution. For example, there do not exist numbers \(x\) and \(y\) making the following two equations true simultaneously:
\[\left\{\begin{array}{rrrrc} x &+& 2y &=& 3 \\ x &+& 2y &=& -3.\end{array}\right.\nonumber\]
In this case, the solution set is empty. As this is a rather important property of a system of equations, it has its own name.
A system of equations is called inconsistent if it has no solutions. It is called consistent otherwise.
A solution of a system of equations in \(n\) variables is a list of \(n\) numbers. For example, \((x,y,z) = (1,-2,3)\) is a solution of \(\eqref{eq:1}\). As we will be studying solutions of systems of equations throughout this text, now is a good time to fix our notions regarding lists of numbers.
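As a quick check of the claim above, the following Python sketch substitutes a candidate point into each equation of the system. The encoding of each equation as a pair (coefficients, right-hand side) and the names `SYSTEM` and `is_solution` are illustrative assumptions, not notation from this text.

```python
# Each equation a*x + b*y + c*z = d is stored as ((a, b, c), d).
# This encoding and the name is_solution are assumptions made for illustration.
SYSTEM = [
    ((1, 2, 3), 6),    # x + 2y + 3z = 6
    ((2, -3, 2), 14),  # 2x - 3y + 2z = 14
    ((3, 1, -1), -2),  # 3x + y - z = -2
]

def is_solution(system, point):
    """Return True if the point makes every equation in the system true."""
    return all(
        sum(c * v for c, v in zip(coeffs, point)) == rhs
        for coeffs, rhs in system
    )

print(is_solution(SYSTEM, (1, -2, 3)))  # True
print(is_solution(SYSTEM, (1, 1, 1)))   # False: only the first equation holds
```

Note that \((1,1,1)\) satisfies the first equation but not the other two, so it is not a solution of the system: a solution must make all of the equations true simultaneously.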
Line, Plane, Space, Etc.
We use \(\mathbb{R}\) to denote the set of all real numbers, i.e., the number line. This contains numbers like \(0, \frac 32, -\pi, 104, \ldots\)
Let \(n\) be a positive whole number. We define
\[ \mathbb{R}^n = \text{all ordered \(n\)-tuples of real numbers }(x_1,x_2,x_3,\ldots,x_n). \nonumber \]
An \(n\)-tuple of real numbers is called a point of \(\mathbb{R}^n\).
In other words, \(\mathbb{R}^n\) is just the set of all (ordered) lists of \(n\) real numbers. We will draw pictures of \(\mathbb{R}^n\) in a moment, but keep in mind that this is the definition. For example, \((0, \frac 32, -\pi)\) and \((1,-2,3)\) are points of \(\mathbb{R}^3\).
When \(n=1\text{,}\) we just get \(\mathbb{R}\) back: \(\mathbb{R}^1=\mathbb{R}\). Geometrically, this is the number line.
Figure \(\PageIndex{1}\)
When \(n=2\text{,}\) we can think of \(\mathbb{R}^2\) as the \(xy\)-plane. We can do so because every point on the plane can be represented by an ordered pair of real numbers, namely, its \(x\)- and \(y\)-coordinates.
Figure \(\PageIndex{2}\)
When \(n=3\text{,}\) we can think of \(\mathbb{R}^3\) as the space we (appear to) live in. We can do so because every point in space can be represented by an ordered triple of real numbers, namely, its \(x\)-, \(y\)-, and \(z\)-coordinates.
Figure \(\PageIndex{3}\)
Figure \(\PageIndex{4}\): A point in 3-space, and its coordinates.
So what is \(\mathbb{R}^4\text{?}\) or \(\mathbb{R}^5\text{?}\) or \(\mathbb{R}^n\text{?}\) These are harder to visualize, so you have to go back to the definition: \(\mathbb{R}^n\) is the set of all ordered \(n\)-tuples of real numbers \((x_1,x_2,x_3,\ldots,x_n)\).
They are still “geometric” spaces, in the sense that our intuition for \(\mathbb{R}^2\) and \(\mathbb{R}^3\) often extends to \(\mathbb{R}^n\).
We will make definitions and state theorems that apply to any \(\mathbb{R}^n\text{,}\) but we will only draw pictures for \(\mathbb{R}^2\) and \(\mathbb{R}^3\).
The power of using these spaces is the ability to label various objects of interest, such as geometric objects and solutions of systems of equations, by the points of \(\mathbb{R}^n\).
All colors you can see can be described by three quantities: the amount of red, green, and blue light in that color. (Humans are trichromatic.) Therefore, we can use the points of \(\mathbb{R}^3\) to label all colors: for instance, the point \((.2, .4, .9)\) labels the color with \(20\%\) red, \(40\%\) green, and \(90\%\) blue intensity.
Figure \(\PageIndex{5}\)
In the Overview, we could have used \(\mathbb{R}^4\) to label the amount of traffic \((x,y,z,w)\) passing through four streets. In other words, if there are \(10,5,3,11\) cars per hour passing through roads \(x,y,z,w\text{,}\) respectively, then this can be recorded by the point \((10,5,3,11)\) in \(\mathbb{R}^4\). This is useful from a psychological standpoint: instead of having four numbers, we are now dealing with just one piece of data.
Figure \(\PageIndex{6}\)
A QR code is a method of storing data in a grid of black and white squares in a way that computers can easily read. A typical QR code is a \(29 \times 29\) grid. Reading each line left-to-right and reading the lines top-to-bottom (like you read a book) we can think of such a QR code as a sequence of \(29 \times 29 = 841\) digits, each digit being 1 (for white) or 0 (for black). In such a way, the entire QR code can be regarded as a point in \(\mathbb{R}^{841}\). As in the previous Example \(\PageIndex{6}\), it is very useful from a psychological perspective to view a QR code as a single piece of data in this way.
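The reading order described above (each line left-to-right, lines top-to-bottom) can be sketched in Python. The \(3 \times 3\) grid below is a made-up stand-in for a real \(29 \times 29\) QR code, and the name `grid_to_point` is hypothetical.

```python
# A hypothetical 3 x 3 grid standing in for a 29 x 29 QR code:
# 1 means a white square, 0 means a black square.
grid = [
    [1, 0, 1],
    [0, 0, 1],
    [1, 1, 0],
]

def grid_to_point(grid):
    """Read the rows top-to-bottom, each row left-to-right, into one tuple."""
    return tuple(digit for row in grid for digit in row)

point = grid_to_point(grid)
print(point)       # (1, 0, 1, 0, 0, 1, 1, 1, 0)
print(len(point))  # 9, so this is a point of R^9
```

Applied to a \(29 \times 29\) grid, the same function would return a tuple of length \(841\text{,}\) that is, a point of \(\mathbb{R}^{841}\).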
In the above examples, it was useful from a psychological perspective to replace a list of four numbers (representing traffic flow) or of 841 numbers (representing a QR code) by a single piece of data: a point in some \(\mathbb{R}^n\). This is a powerful concept; starting in Section 2.2, we will almost exclusively record solutions of systems of linear equations in this way.
Pictures of Solution Sets
Before discussing how to solve a system of linear equations below, it is helpful to see some pictures of what these solution sets look like geometrically.
Consider the linear equation \(x+y=1\). We can rewrite this as \(y = 1-x\text{,}\) which defines a line in the plane: the slope is \(-1\text{,}\) and the \(x\)-intercept is \(1\).
Figure \(\PageIndex{8}\)
For our purposes, a line is a ray that is straight and infinite in both directions.
Consider the linear equation \(x+y+z=1\). This is the implicit equation for a plane in space.
Figure \(\PageIndex{9}\)
A plane is a flat sheet that is infinite in all directions.
The equation \(x+y+z+w=1\) defines a “\(3\)-plane” in \(4\)-space, and more generally, a single linear equation in \(n\) variables defines an “\((n-1)\)-plane” in \(n\)-space. We will make these statements precise in Section 2.7.
Now consider the system of two linear equations
\[\left\{\begin{array}{rrrrr}\color{Violet}{x}&\color{Violet}{-}&\color{Violet}{3y}&\color{Violet}{=}&\color{Violet}{-3} \\ \color{Green}{2x}&\color{Green}{+}&\color{Green}{y}&\color{Green}{=}&\color{Green}{8.}\end{array}\right.\nonumber\]
Each equation individually defines a line in the plane, pictured below.
Figure \(\PageIndex{10}\)
A solution to the system of both equations is a pair of numbers \((x,y)\) that makes both equations true at once. In other words, it is a point that lies on both lines simultaneously. We can see in the picture above that there is only one point where the lines intersect: therefore, this system has exactly one solution. (This solution is \((3,2)\text{,}\) as the reader can verify.)
Usually, two lines in the plane will intersect in one point, but of course this is not always the case. Consider now the system of equations
\[\left\{\begin{array}{rrrrr}\color{Violet}{x}&\color{Violet}{-}&\color{Violet}{3y}&\color{Violet}{=}&\color{Violet}{-3} \\ \color{Green}{x}&\color{Green}{-}&\color{Green}{3y}&\color{Green}{=}&\color{Green}{3.}\end{array}\right.\nonumber\]
These define parallel lines in the plane.
Figure \(\PageIndex{11}\)
The fact that the lines do not intersect means that the system of equations has no solution. Of course, this is easy to see algebraically: if \(x-3y=-3\text{,}\) then it cannot also be the case that \(x-3y=3\).
There is one more possibility. Consider the system of equations
\[\left\{\begin{array}{rrrrr}\color{Violet}{x}&\color{Violet}{-}&\color{Violet}{3y}&\color{Violet}{=}&\color{Violet}{-3} \\ \color{Green}{2x}&\color{Green}{-}&\color{Green}{6y}&\color{Green}{=}&\color{Green}{-6.}\end{array}\right.\nonumber\]
The second equation is a multiple of the first, so these equations define the same line in the plane.
Figure \(\PageIndex{12}\)
In this case, there are infinitely many solutions of the system of equations.
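The three possibilities for a pair of lines (one intersection point, parallel, identical) can be told apart by a short computation. The Python sketch below is one way to do it, using the \(2 \times 2\) determinant and Cramer's rule rather than any method developed in this text; the function name `classify` and its argument order are assumptions made for illustration.

```python
from fractions import Fraction

def classify(a1, b1, c1, a2, b2, c2):
    """Classify the system a1*x + b1*y = c1, a2*x + b2*y = c2.

    Assumes integer coefficients and that each equation really is a line,
    i.e. neither (a1, b1) nor (a2, b2) is (0, 0).
    """
    det = a1 * b2 - a2 * b1
    if det != 0:
        # The lines cross at exactly one point; Cramer's rule finds it.
        x = Fraction(c1 * b2 - c2 * b1, det)
        y = Fraction(a1 * c2 - a2 * c1, det)
        return "one solution", (x, y)
    # det == 0: the lines are parallel or identical.
    if a1 * c2 == a2 * c1 and b1 * c2 == b2 * c1:
        # One equation is a multiple of the other: same line.
        return "infinitely many solutions", None
    return "no solution", None

kind, (x, y) = classify(1, -3, -3, 2, 1, 8)   # x - 3y = -3, 2x + y = 8
print(kind, x, y)                             # one solution 3 2
print(classify(1, -3, -3, 1, -3, 3)[0])       # no solution
print(classify(1, -3, -3, 2, -6, -6)[0])      # infinitely many solutions
```

The three calls correspond to the three systems pictured above, in order.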
Consider the system of two linear equations
\[\left\{\begin{array}{rrrrrrr}\color{Violet}{x}&\color{Violet}{+}&\color{Violet}{y}&\color{Violet}{+}&\color{Violet}{z}&\color{Violet}{=}&\color{Violet}{1} \\ \color{Green}{x}&\color{Green}{}&\color{Green}{}&\color{Green}{-}&\color{Green}{z}&\color{Green}{=}&\color{Green}{0.}\end{array}\right.\nonumber\]
Each equation individually defines a plane in space. The solutions of the system of both equations are the points that lie on both planes. We can see in the picture below that the planes intersect in a line. In particular, this system has infinitely many solutions.
In general, the solution set of a system of equations in \(n\) variables is the intersection of “\((n-1)\)-planes” in \(n\)-space. This is always some kind of linear space, as we will discuss in Section 2.4.
Parametric Description of Solution Sets
According to Definition \(\PageIndex{2}\) (Solution Sets), solving a system of equations means writing down all solutions in terms of some number of parameters. We will give a systematic way of doing so in Section 1.3; for now we give parametric descriptions of the examples from the previous subsection, Pictures of Solution Sets.
Consider the linear equation \(x+y=1\) of Example \(\PageIndex{8}\). In this context, we call \(x+y=1\) an implicit equation of the line. We can write the same line in parametric form as follows:
\[ (x, y) = (t,\, 1-t) \quad\text{for any}\quad t \in \mathbb{R}. \nonumber \]
This means that every point on the line has the form \((t,\, 1-t)\) for some real number \(t\). In this case, we call \(t\) a parameter, as it parameterizes the points on the line.
Figure \(\PageIndex{14}\)
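To see the parameter at work, the following Python sketch generates a few points of the form \((t,\, 1-t)\) and confirms that each satisfies the implicit equation \(x+y=1\). The function name `point_on_line` and the sample values of \(t\) are arbitrary choices for illustration.

```python
def point_on_line(t):
    """Parametric form of the line x + y = 1: the parameter t gives (t, 1 - t)."""
    return (t, 1 - t)

for t in [-2, 0, 1, 3]:
    x, y = point_on_line(t)
    # Each generated point should satisfy the implicit equation x + y = 1.
    print(t, (x, y), x + y == 1)
# -2 (-2, 3) True
# 0 (0, 1) True
# 1 (1, 0) True
# 3 (3, -2) True
```

Conversely, a point \((x,y)\) on the line is produced by the parameter value \(t=x\text{,}\) so every point of the line is reached exactly once.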
Now consider the system of two linear equations
\[\left\{\begin{array}{rrrrrrr}\color{Violet}{x}&\color{Violet}{+}&\color{Violet}{y}&\color{Violet}{+}&\color{Violet}{z}&\color{Violet}{=}&\color{Violet}{1} \\ \color{Green}{x}&\color{Green}{}&\color{Green}{}&\color{Green}{-}&\color{Green}{z}&\color{Green}{=}&\color{Green}{0.}\end{array}\right.\nonumber\]
of Example \(\PageIndex{11}\). These collectively form the implicit equations for a line in \(\mathbb{R}^3\). (At least two equations are needed to define a line in space.) This line also has a parametric form with one parameter \(t\text{:}\)
\[ (x,\, y,\, z) = (t,\, 1-2t,\, t). \nonumber \]
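One informal way to arrive at this parametric form, ahead of the systematic method of Section 1.3, is by substitution. The second equation says \(z = x\text{,}\) and putting this into the first equation gives \(x + y + x = 1\). Taking \(x = t\) as the parameter (an arbitrary choice; \(z\) would serve equally well), we get
\[ \begin{split} z &= x = t \\ y &= 1 - x - z = 1 - 2t, \end{split} \nonumber \]
which recovers \((x,\, y,\, z) = (t,\, 1-2t,\, t)\) for any \(t \in \mathbb{R}\).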
Note that in each case, the parameter \(t\) allows us to use \(\mathbb{R}\) to label the points on the line. However, neither line is the same as the number line \(\mathbb{R}\text{:}\) indeed, every point on the first line has two coordinates, like the point \((0,1)\text{,}\) and every point on the second line has three coordinates, like \((0,1,0)\).
Consider the linear equation \(x+y+z=1\) of Example \(\PageIndex{9}\). This is an implicit equation of a plane in space. This plane has an equation in parametric form: we can write every point on the plane as
\[ (x,\, y,\, z) = (1-t-w,\, t,\, w) \quad\text{for any}\quad t,w\in\mathbb{R}. \nonumber \]
In this case, we need two parameters \(t\) and \(w\) to describe all points on the plane.
Note that the parameters \(t,w\) allow us to use \(\mathbb{R}^2\) to label the points on the plane. However, this plane is not the same as the plane \(\mathbb{R}^2\text{:}\) indeed, every point on this plane has three coordinates, like the point \((0,0,1)\).
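The two-parameter description can be checked the same way as for a line. The Python sketch below evaluates \((1-t-w,\, t,\, w)\) on a small grid of parameter values and confirms that every resulting point satisfies \(x+y+z=1\). The function name `point_on_plane` and the sample values are illustrative assumptions.

```python
from itertools import product

def point_on_plane(t, w):
    """Parametric form of the plane x + y + z = 1 with parameters t and w."""
    return (1 - t - w, t, w)

# Try every pair (t, w) drawn from a few sample values.
for t, w in product([-1, 0, 2], repeat=2):
    x, y, z = point_on_plane(t, w)
    assert x + y + z == 1  # the point lies on the plane

print(point_on_plane(0, 1))  # (0, 0, 1)
```

Each pair \((t,w)\) in \(\mathbb{R}^2\) labels one point of the plane, but the point itself still has three coordinates.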
When there is a unique solution, as in Example \(\PageIndex{10}\), it is not necessary to use parameters to describe the solution set.