4.7: Comments and solutions
- Page ID
- 81422
\( \newcommand{\vecs}[1]{\overset { \scriptstyle \rightharpoonup} {\mathbf{#1}} } \)
\( \newcommand{\vecd}[1]{\overset{-\!-\!\rightharpoonup}{\vphantom{a}\smash {#1}}} \)
\( \newcommand{\id}{\mathrm{id}}\) \( \newcommand{\Span}{\mathrm{span}}\)
( \newcommand{\kernel}{\mathrm{null}\,}\) \( \newcommand{\range}{\mathrm{range}\,}\)
\( \newcommand{\RealPart}{\mathrm{Re}}\) \( \newcommand{\ImaginaryPart}{\mathrm{Im}}\)
\( \newcommand{\Argument}{\mathrm{Arg}}\) \( \newcommand{\norm}[1]{\| #1 \|}\)
\( \newcommand{\inner}[2]{\langle #1, #2 \rangle}\)
\( \newcommand{\Span}{\mathrm{span}}\)
\( \newcommand{\id}{\mathrm{id}}\)
\( \newcommand{\Span}{\mathrm{span}}\)
\( \newcommand{\kernel}{\mathrm{null}\,}\)
\( \newcommand{\range}{\mathrm{range}\,}\)
\( \newcommand{\RealPart}{\mathrm{Re}}\)
\( \newcommand{\ImaginaryPart}{\mathrm{Im}}\)
\( \newcommand{\Argument}{\mathrm{Arg}}\)
\( \newcommand{\norm}[1]{\| #1 \|}\)
\( \newcommand{\inner}[2]{\langle #1, #2 \rangle}\)
\( \newcommand{\Span}{\mathrm{span}}\) \( \newcommand{\AA}{\unicode[.8,0]{x212B}}\)
\( \newcommand{\vectorA}[1]{\vec{#1}} % arrow\)
\( \newcommand{\vectorAt}[1]{\vec{\text{#1}}} % arrow\)
\( \newcommand{\vectorB}[1]{\overset { \scriptstyle \rightharpoonup} {\mathbf{#1}} } \)
\( \newcommand{\vectorC}[1]{\textbf{#1}} \)
\( \newcommand{\vectorD}[1]{\overrightarrow{#1}} \)
\( \newcommand{\vectorDt}[1]{\overrightarrow{\text{#1}}} \)
\( \newcommand{\vectE}[1]{\overset{-\!-\!\rightharpoonup}{\vphantom{a}\smash{\mathbf {#1}}}} \)
\( \newcommand{\vecs}[1]{\overset { \scriptstyle \rightharpoonup} {\mathbf{#1}} } \)
\( \newcommand{\vecd}[1]{\overset{-\!-\!\rightharpoonup}{\vphantom{a}\smash {#1}}} \)
\(\newcommand{\avec}{\mathbf a}\) \(\newcommand{\bvec}{\mathbf b}\) \(\newcommand{\cvec}{\mathbf c}\) \(\newcommand{\dvec}{\mathbf d}\) \(\newcommand{\dtil}{\widetilde{\mathbf d}}\) \(\newcommand{\evec}{\mathbf e}\) \(\newcommand{\fvec}{\mathbf f}\) \(\newcommand{\nvec}{\mathbf n}\) \(\newcommand{\pvec}{\mathbf p}\) \(\newcommand{\qvec}{\mathbf q}\) \(\newcommand{\svec}{\mathbf s}\) \(\newcommand{\tvec}{\mathbf t}\) \(\newcommand{\uvec}{\mathbf u}\) \(\newcommand{\vvec}{\mathbf v}\) \(\newcommand{\wvec}{\mathbf w}\) \(\newcommand{\xvec}{\mathbf x}\) \(\newcommand{\yvec}{\mathbf y}\) \(\newcommand{\zvec}{\mathbf z}\) \(\newcommand{\rvec}{\mathbf r}\) \(\newcommand{\mvec}{\mathbf m}\) \(\newcommand{\zerovec}{\mathbf 0}\) \(\newcommand{\onevec}{\mathbf 1}\) \(\newcommand{\real}{\mathbb R}\) \(\newcommand{\twovec}[2]{\left[\begin{array}{r}#1 \\ #2 \end{array}\right]}\) \(\newcommand{\ctwovec}[2]{\left[\begin{array}{c}#1 \\ #2 \end{array}\right]}\) \(\newcommand{\threevec}[3]{\left[\begin{array}{r}#1 \\ #2 \\ #3 \end{array}\right]}\) \(\newcommand{\cthreevec}[3]{\left[\begin{array}{c}#1 \\ #2 \\ #3 \end{array}\right]}\) \(\newcommand{\fourvec}[4]{\left[\begin{array}{r}#1 \\ #2 \\ #3 \\ #4 \end{array}\right]}\) \(\newcommand{\cfourvec}[4]{\left[\begin{array}{c}#1 \\ #2 \\ #3 \\ #4 \end{array}\right]}\) \(\newcommand{\fivevec}[5]{\left[\begin{array}{r}#1 \\ #2 \\ #3 \\ #4 \\ #5 \\ \end{array}\right]}\) \(\newcommand{\cfivevec}[5]{\left[\begin{array}{c}#1 \\ #2 \\ #3 \\ #4 \\ #5 \\ \end{array}\right]}\) \(\newcommand{\mattwo}[4]{\left[\begin{array}{rr}#1 \amp #2 \\ #3 \amp #4 \\ \end{array}\right]}\) \(\newcommand{\laspan}[1]{\text{Span}\{#1\}}\) \(\newcommand{\bcal}{\cal B}\) \(\newcommand{\ccal}{\cal C}\) \(\newcommand{\scal}{\cal S}\) \(\newcommand{\wcal}{\cal W}\) \(\newcommand{\ecal}{\cal E}\) \(\newcommand{\coords}[2]{\left\{#1\right\}_{#2}}\) \(\newcommand{\gray}[1]{\color{gray}{#1}}\) \(\newcommand{\lgray}[1]{\color{lightgray}{#1}}\) \(\newcommand{\rank}{\operatorname{rank}}\) \(\newcommand{\row}{\text{Row}}\) \(\newcommand{\col}{\text{Col}}\) \(\renewcommand{\row}{\text{Row}}\) \(\newcommand{\nul}{\text{Nul}}\) \(\newcommand{\var}{\text{Var}}\) \(\newcommand{\corr}{\text{corr}}\) \(\newcommand{\len}[1]{\left|#1\right|}\) \(\newcommand{\bbar}{\overline{\bvec}}\) \(\newcommand{\bhat}{\widehat{\bvec}}\) \(\newcommand{\bperp}{\bvec^\perp}\) \(\newcommand{\xhat}{\widehat{\xvec}}\) \(\newcommand{\vhat}{\widehat{\vvec}}\) \(\newcommand{\uhat}{\widehat{\uvec}}\) \(\newcommand{\what}{\widehat{\wvec}}\) \(\newcommand{\Sighat}{\widehat{\Sigma}}\) \(\newcommand{\lt}{<}\) \(\newcommand{\gt}{>}\) \(\newcommand{\amp}{&}\) \(\definecolor{fillinmathshade}{gray}{0.9}\)92. Answer: Humour aside, this is a common situation.
We know , , , rather than the values of d, b, n.
The key is to exploit the symmetry in the given data, rather than solving blindly. Adding all three two-way totals gives whence . We can then subtract the given value of to get the value of .
93.
(a) Let the numbers at the three vertices be A, B, C. Adding shows that
a + b + c = 2 ( A + B + C )
so
etc.
(c) Note: We postpone the “solution” of part (b), and address part (c) first. Let the numbers at the five vertices be A, B, C, D, E. Adding shows that
so
etc.
(b) The second part is different. The four given edge-values do not determine the four unknown vertex-values. It may look as though four pieces of information should suffice to find four unknowns; but there is a catch: the sum of the numbers on the two opposite edges AB and CD is just the sum of the numbers at the four vertices, and so is equal to the sum of the numbers on the edges BC and DA. Hence one of the given edge-values is determined by the other three.
Note: This distinction between polygons with an odd and an even number of vertices would arise in exactly the same way if each edge was labelled with the average (“half the sum”) of the numbers at its two end vertices.
94. ; ; . Hence
so
So
etc.
95.
(a) Let
The shift, or vector, from (a, b) to (c, d) goes
“along c – a in the x-direction” and “up d – b in the y-direction”.
Draw the ordinate through Y and the abscissa through Z, to meet at P, so creating a right angled triangle with legs Y P of length and PZ of length . The midpoint of Y P clearly lies halfway along Y P at
and the midpoint of PZ clearly lies halfway up PZ at
Then and are both right-angled triangles and are congruent (by RHS congruence). Hence , so M is the midpoint of Y Z.
(b)
so vector
(c) Note: We use the result from part (b), but not the method from part (b). By part (b) applied to , PQ is half the length of AC and parallel to AC. By part (b) applied to , SR is half the length of AC and parallel to AC. Hence PQ is parallel to SR.
Similarly one can prove (applying part (b) twice – first to , and then to ) that PS is parallel to QR.
Hence PQRS is a parallelogram.
96.
(a) , so
Hence
(b)
so
Hence
but there is no obvious way to pin down (x + y + z).
In fact different quadrilaterals may give rise to the same four “midpoints”. (It is an interesting exercise to identify the family of quadrilaterals corresponding to a given set of four midpoints.)
(c) As in parts (a) and (b),
Hence
so
.
97.
(a)(i) As in Problems 93-95 we instinctively add to get
so
Hence
(ii) The same idea (replacing addition by multiplication) leads to
so
Note: Alternatively, we may notice that u, v, w are either all positive, or all negative. If we restrict in the first instance to purely positive solutions, then we may set u = 2x, v = 2y, w = 2z, translate (ii) into (i), and conclude that (x, y, z) = (1, 0, 2), so that (u, v, w) = (2, 1, 4). We must then remember the negative solution (u, v, w) = (−2, −1, −4).
(b) (i) As in (a)(i) we add to get 2(x + y + z) = 9, so . Hence
(ii) The same idea leads to
so
Hence
Either u, v, w are all positive, or all negative.
(iii) The same idea leads to
so
Either u, v, w are all positive, or all negative.
(iv) We could of course repeat the same method.
Or we could again look in the first instance for positive solutions, notice that 4 = 22, 8 = 23, 16 = 24, and take logs (to base 2). Then
so (from part (i)) any positive solution satisfies
so
We must then remember to include
98. The simplest idea is to take logs, and reduce the system to a familiar linear system:
Multiplying the first equation by c and subtracting it from the second equation multiplied by a gives:
Multiplying the first equation by d and subtracting b times the second equation gives:
If the numerators and denominators are expressed in determinant form, we get the 2 × 2 version of Cramer’s Rule. The original unknowns u, v can then be obtained by taking suitable powers.
What emerges looks interesting:
but it is not clear how it generalises.
99.
(a) x + y + z = 3 is the equation of a plane through the three points (3, 0, 0), (0, 3, 0), (0,0, 3).
x2 + y2 + z2 = c is the equation of a sphere, centered at the origin, with radius . The sphere misses the plane completely when c < 3, meets the plane in a single point when c = 3, and cuts the plane in a circle C when c > 3 (the circle lying in the positive octant provided c < 9).
If xy + yz + zx = b meets this intersection at all, then any permutation of the three coordinates x, y, z produces another point which also satisfies the other two equations (since they are both symmetrical). Hence for the system to have a unique solution, the circle C must contain a point with x = y = z. Hence c = 3, and b = 3, and the unique solution is
(b) We must have
is the equation of a plane through the three points (a, 0, 0), (0, a, 0), (0, 0, a).
is the equation of a sphere, centre the origin, with radius , which misses the 2 2 plane completely when , meets the plane in a single point when , and cuts the plane in a circle C when (the circle lying in the positive octant provided c < a2).
If
meets this intersection at all, then any permutation of the three coordinates x, y, z produces another point which also satisfies the other two equations (since they are both symmetrical). Hence for the system to have a unique solution, the circle C must contain a point with x = y = z. Hence that point is , so , and the unique solution is
100. is never negative. If
Note: We need to learn to see both x and − x as algebraic entities, with x as a placeholder (which may well be negative, in which case − x would be positive).
101.
(a) The interval [0.1, 0.2). We have marked exactly of the interval [0,1).
(b) This needs a little thought. First we mark the interval [0.1, 0.2), of length . Then we mark 9 smaller intervals
of total length . Then 92 smaller intervals
of total length . And so on.
It would seem that a vast number of points are left unm,a,rked - namely, every point whose decimal representation uses only 0s, 2s, 3s, 4s, 5s, 6s, 7s, 8s, and 9s. However, the total length of the marked intervals is given by adding:
That is an infinite geometric series with first term and common ratio , and hence with sum = 1. In other words, the total length of what remains unmarked is zero.
(c) (0, 1): every real number except 0 has an expansion in base 2 with a “1” in some position. So this time nothing is left unmarked, (except 0). Hence the complement of the set of marked points consists simply of one point, namely {0}. So it is not surprising that the total of all the marked intervals has length 1.
(d) First we mark the interval [0.1,0.2), of length . Then we mark 2 smaller intervals
of total length . Then 22 smaller intervals
of total length . And so on.
The set of marked points is the complement of the famous Cantor set (Georg Cantor (1845–1918)) and has total length
This is an infinite geometric series with first term and common ratio , and so has sum = 1.
Hence, the total length of what remains unmarked is zero.
Note: The set described in (d) leaves as its complement a collection of points – the Cantor set – which consists of the “endpoints” of the intervals that have been removed; these are points whose base 3 expansion involves only 0s and 2s. This complement:
(i) is “the same size” as the whole interval [0,1] (since if we interpret the 2s in the base 3 expansion as 1s, we get a correspondence between the set of “endpoints” and the set of all possible base 2 expansions for real numbers in [0,1]);
(ii) is “nowhere dense” (since every pair of points in the complement is separated by some interval)
(iii) has total length = 0.
102. (2, 3]. Each inequality implies all the ones before it. Hence the two which are true must be the first two. Hence
103. If , then we must solve ; so ; if , we must solve , so .
Note: The fact that denotes the positive value of the pair {x, − x} can be rephrased as: is equal to the distance from x to 0.
In the same way, denotes the positive member of the pair
so is equal to the distance from to 0 (i.e. the distance from x to 5). This is a very important way to think about expressions like .
104.
(a) . (Each inequality implies all those that go before it. So we need solutions to and , which satisfy
(b) (4, 6]. (Each inequality implies the one before it. To see this, think in terms of distances: we want points x whose distance from 1 is > 1, whose distance from 2 is > 2, etc.. So we need to find points x which solve the first two inequalities, but not the third. Points in the half line satisfy all seven inequalities, so we are left with (4, 6].)
105. . (We need all points x for which
“the distance from x to −1” plus “the distance from x to −2”
equals 2. This excludes all points between −2 and −1, for which the sum is equal to 1; for points between and the sum is < 2; for points in or the sum is > 2.)
106. , . (For solutions to exist, we must have b > 0. The solutions of the given inequality then consist of all x such that
“the distance from x to a is less than b”
that is, all x in the interval (a − b, a + b). Hence a − b = −1, a + b = 2.)
107.
(a) “The difference between the x- and y-coordinates is < 3”, means that the point (x, y) lies in the infinite strip between the lines x − y = −3 and x − y = 3.
(b) Shifting the origin of coordinates to (−5, 0) changes the x-coordinate to “X = x + 5” and leaves the y-coordinate unaffected (so Y = y). In this new frame we want “|X − Y| < 3”, so the required points lie in the infinite strip between the lines X − Y = −3 and X − Y = 3; that is, between the lines and .
(b) x > 0 and y > 0, or x < 0 and y < 0. (For any solution at all, we must have , which excludes points on the line x + y = 0. Divide both sides by and simplify to get
In other words:
.
If y > 0, then x + y > 0, so 2x + 2y > 2y, whence x > 0 (so “x > 0 and y > 0”).
If y < 0, then x + y < 0, so 2x + 2y < 2y, whence x < 0 (so “x < 0 and y < 0”).
If x > 0 and y > 0, or x < 0 and y < 0, then clearly < .)
108. Let
(i) Since x < y, it follows that
so ; moreover , so .
(ii) Since and , we can multiply both sides by bd to get ad < bc. Therefore
and
, and (since b, d, and b + d are all > 0, so we can divide the first inequality by b(b + d) and the second by d(b + d)).
109.
(a)
(b) (i) It is tempting simply to consider the decimals
in order to conclude that these fractions miss the first and last subinterval, and then fall one in each of the remaining subintervals. In preparation for part (ii), it is better to observe that
* and , so none of the 9ths land up in the first or last subintervals;
* then rewrite
and notice that
so that, for
hence exactly one 9th goes in each of the other eight subintervals.
(ii) Notice that
* and , so none of the nths land up in the first or last subintervals;
* then rewrite
and notice that
so, for
hence exactly one nth goes in each of the other n − 1 subintervals.
(iii) Suppose two (or more) fractions are inserted between and . Then these two fractions would have to be successive multiples of ; but then they would have a multiple of between them (by part (ii)), and this would be a term of the Farey series of order n sitting between and . Since there is no such term, at most one fraction can be inserted between and .
(c) Note 1: This problem was included because the idea of Farey series seems so simple, and their curious properties are so intriguing. While this remains true, it turns out that Farey series also have something different, and slightly unexpected, to teach us about “the essence of mathematics”. Part of us expects that simple-looking results should have short and accessible proofs - even though we know that Fermat’s Last Theorem shows otherwise. In the case of Farey series, the relevant properties can be proved in ways that should be accessible (in principle); but the proofs are not easy. So do not be upset if, after all your efforts, you land up trying to absorb the solution given here - and the underlying idea that
simple objects and “elementary” proofs can sometimes be more intricate than one anticipates.
Note 2: If
are consecutive terms in a Farey series, then “bc − ad” must be an integer > 0. The fact that this difference is always equal to 1 is easily checked in any particular case, but it is unclear exactly why this is necessarily true (rather than an accident) - or even how one would go about proving it. Every treatment of Farey series has to find its own way round this difficulty. We give the simplest proof we can (in the sense that it assumes no more than we have already used: a little about numbers and some algebra). But it is not at all “easy”. We indicate a different approach in the “Notes” at the end of part (d).
(i) [The fact that, except for the two end intervals, we have bd > n will be needed in the proof of part (c)(ii).]
We proceed by mathematical induction on n (the “order” of the Farey series) - a technique which we have already met in Chapter 2 (Problems 54-59, 76) and which will be addressed more fully in Chapter 6.
* When n = 1, the Farey series of order n is just:
and this subinterval is both the first and the last, so the claim is “vacuously true” (because there is “nothing to check”). When n = 2, the Farey series of order n is:
and again the only subintervals are the first and the last, so again there is nothing to check.
* We now suppose that we know the claim is true for the Farey series of order k, for some k > 1, and show that it must then also be true for the Farey series of order k + 1. (Since we know it is true for n = 2, it will then be true for n = 3; and once we know it is true for n = 3, it must then be true for n = 4; and so on.)
To show that the claim is true for the Farey series of order k + 1, we consider any adjacent pair of fractions
(other than the first pair and the last pair) in the Farey series of order k + 1.
Claim bd > k + 1.
Proof Note first that, since we are avoiding the two end subintervals, both b and d are > 1.
Suppose first that the pair are not adjacent in the previous Farey series of order k. Then at least one of the two fractions has been inserted in creating the Farey series of order k + 1, and so has denominator = k + 1. (The fractions inserted are precisely those with denominator “k + 1” which cannot be reduced by cancelling.) Hence the product
Thus we may assume that the pair a were already adjacent in the Farey series of order k. But then by our “induction hypothesis” (namely that the desired result is already known to be true for the Farey series of order k), we know that bd > k. If bd > k + 1, then the pair satisfies the required condition. Hence we only have to worry about the possibility that bd = k + 1. Suppose that bd = k + 1. Then the interval has length
for some positive integer r = bc − ad.
If r > 1, then the interval would have length > , so would not be successive terms in the series (for we would have inserted some additional term when moving from the Farey series of order k to the Farey series of order k + 1).
Hence we can be sure that r = 1, that the subinterval has length exactly . Now successive fractions with denominator k + 1 differ by exactly , so some fraction with denominator k + 1 must lie in this subinterval. Since no additional fraction is inserted between them in passing from the series of order k to the series of order k + 1, and must both be “cancelled versions” of two successive fractions with denominator k + 1. But then, by part (b)(ii), there would have to be a fraction with denominator k in the interval - which is not the case.
Therefore the possibility bd = k + 1 does not in fact occur.
So we can be sure that in every case, bd > k + 1.
Hence whenever the result is true for the Farey series of order k, it must then also be true for the Farey series of order k + 1.
It follows that the result is true for the Farey series of order n, for all
(ii) We proceed by induction on n.
* If are adjacent fractions in the Farey series of order 1, then and , so bc − ad = 1.
* Now suppose that, for some
Let be adjacent fractions in the Farey series of order k + 1.
If were already adjacent fractions in the Farey series of order k (i.e. if no fraction has been inserted between and in passing from the series of order k to the series of order k + 1), then we already know (by the induction hypothesis) that bc − ad = 1.
Thus we may concentrate on the case where are not adjacent fractions in the Farey series of order k. By (b)(iii), at most one fraction with denominator k + 1 is inserted between any two adjacent fractions in the Farey series of order k, so we have either
with being adjacent fractions in the Farey series of order k (so be − af = 1), or
with being adjacent fractions in the Farey series of order k (so fc − ed = 1). We consider the first of these possibilities (the second is entirely similar).
Suppose
with being adjacent fractions in the Farey series of order k. By part (i) we know that
Let bc − ad = r > 0, and ed − fc = s > 0.
Then sa + re = c, and sb + rf = d.
In particular, HCF(r,s) = 1 (since HCF(c,d) = 1).
Hence belongs to the family
Since everything is positive, easy algebra shows that
so every element of S lies between and .
As long as we choose x, y such that HCF(x,y) = 1, any common factor of xa + ye and xb + yf would also divide both
and
Hence
so each element of S is in lowest terms (i.e. no further cancelling is possible).
We have shown that “ belongs to the family S”, and that all elements of S fit between and ; which are adjacent fractions in the Farey series of order k. So none of the elements of S can have arisen before the series of order k + 1. But each fraction in S arises at some stage in a Farey series.
And the first to arise (because it has the smallest denominator) is ""
Hence
so r = s = 1, and bc − ad = 1 as required. QED
(d) Let
be three successive terms in any Farey sequence. By (c) we know that bc−ad = 1, and that de − cf = 1. In particular, bc - ad = de - cf, so
Note 1: It may not be clear why we are proving this result “again” - since it appeared in the final line of the solution to part (c). However, in part (c) the statement that
was arrived at within the induction step, and so was subject to other assumptions. In contrast, now that the result in part (c) has been clearly established, we can use it to prove part (d) without any hidden assumptions.
Note 2: If we represent each fraction in the Farey series of order n by the point (b, a), then each point lies in the right angled triangle joining (0, 0), (n, 0), and (n,n), and each fraction in the Farey series is equal to the gradient of the line, or vector, joining the origin to the integer lattice point (b,a). The ordering of the fractions in the Farey series corresponds to the sequence of increasing gradients, from up to . If and are adjacent fractions in some Farey series, then the result in (d) says that the next fraction to be inserted between them is corresponding to the vector sum of (b, a) and (f, e). And the result in (c) says that the area of the parallelogram with vertices (0, 0), (b, a), (f, e), (b + f, a + e) is equal to 1 (see Problem 57(b)). Hence the result in (c) reduces to the fact that
Theorem Any parallelogram, whose vertices are integer lattice points (i.e. points (b,a) where both coordinates are integers), and with no additional lattice points inside the parallelogram or on the four sides, has area 1.
110.
(a) Suppose that x satisfies . Then (or is not defined).
If x > 0, then x2 − 2x + 1 = (x − 1)2 which has no solutions.
in which case x satisfies , so every x < 0 is a solution of the original inequality
(b) Suppose x satisfies
(i) If x > 0, then x2 − x − 2 = (x − 2)(x + 1) < 0.
(and x > 0); hence
(ii) If x < 0, then , so .
, or
(c) Suppose x satisfies .
,so and all such x satisfy the original inequality.
111.
(a) If a + b = 5 and ab = 7, then a, b are solutions of
But the roots of this quadratic equation are
so a and b cannot be “positive reals”.
(b) We abbreviate the “arithmetic mean” by AM, the “geometric mean” by GM, the “harmonic mean” by HM, and the “quadratic mean” by QM.
so
therefore
and
Also
so
whence
Therefore
112. [This delightful problem was devised by Oleksiy Yevdokimov.] We need to find something which remains constant, or which does not increase, when we replace two terms a, b by .
Idea: If the two terms a, b were replaced each time by their sum a + b, then the sum of all the numbers in the list would be unchanged, so we could be sure that the final number after 199 such moves would have to be
This doesn’t work here. However, in the spirit of this section on inequalities, one may ask:
What happens to the sum of the squares of the terms in the list aftereach move?
When we move from one list to the next, only two terms are affected, and for these two terms, the previous sum of squares is replaced by . How does this affect the sum of all squares on the list?
We know that
So when we replace two terms a, b by , the sum of the squares of all the terms in the list never increases. Hence the final term is less than or equal to the square root of the initial sum of squares
113.
(a) (i) 3 = 22 - 1.
(ii) It seems hard to find another.
(b) (i) 2 = 12 + 1.
(ii) 5 = 22 + 1 (or 17 = 42 + 1; or 37 = 62 + 1; or 101 = 102 + 1; or …). In other words, there seem to be lots.
Note: At first sight primes of this form “keep on coming”. Given that we now know (see Problem 76) that the list of all prime numbers “goes on for ever”, it is natural to ask: Are there infinitely many prime numbers “one more than a square”? Or does the list run out?
This is one of the simplest questions one can ask to which the answer is not yet known!
(c) (i) 7 = 23 - 1.
(ii) It seems hard to find another.
(d) (i) 2 = 13 + 1.
(ii) It seems hard to find another.
Note: Parts (a), (c) and (d) should make one suspicious - provided one notices that:
(a) 63 = 7 × 9, 143 = 11 × 13, 323 = 17 × 19;
(c) 511 = 7 × 73, 1727 = 11 × 157;
(d) 217 = 7 × 31, 513 = 9 × 57, 1001 = 7 × 143.
This problem is so instructive that its solution is discussed in the main text following Problem 115.
114.
(Suppose
It is natural to try b = d = 1 in order to make the constant term bd = 1, and then to try c = −a (so that the coefficients of x3 and of x are both 0). It then remains to choose the value of a so that the total coefficient “2 − a2” of all terms in x2 is equal to 0: that is, ).
115.
(a)(i)
(ii)
(iii)
Note: The general factorisation
provides a fresh slant on the test for divisibility by 9 in base 10, or in general for divisibility by b − 1 in base b (see Problem 51):
“an integer written in base b is divisible by b − 1 precisely when its digit sum is divisible by b − 1”.
(b) (i)
(ii)
(iii)
116.
(a) Replace a by 1, b by r, and n by n + 1 in the answer to 115(a)(iii), to see that:
(b) Multiply the closed formula in (a) by “a” to see that:
117. When x = 40,
is not prime. So the sequence of prime outputs must stop some time before f (40). But it in fact keeps going as long as it possibly could, so that
are all prime. (This may explain Euler’s delight.)
Note: The links between polynomials with integer coefficients (even lowly quadratics) and prime numbers are still not fully understood. For example, you might like to look up Ulam’s spiral. (Ulam (1909–1984) plotted the positive integers in a square spiral and found the primes arranging themselves in curious patterns that we still do not fully understand.)
Interest in the connections between polynomials and primes was revived in the second half of the 20th century. It was eventually proved that there exists a polynomial in 10 variables, with integer coefficients, which takes both positive and negative values when the variables run through all possible non-negative integer values, but which does so in such a way that it generates all the primes as the set of positive outputs.
118.
(a) (i) For
to be prime, the smaller factor must be = 1, so a = 2.
If n is not prime, we can factorise n = rs, with r, s > 1. Then
Hence 2n 1 also factorises, so could not be prime. Hence n must be prime.
(ii) 22 − 1 = 3, 23 − 1 = 7, 25 − 1 = 31, 27 − 1 = 127 are all prime; 211 − 1 = 2047 = 23 × 89 is not.
Note: This is a simple example of the need to distinguish carefully between the statement
“if 2n − 1 is prime, then n is prime” (which is true),
and its converse
“if n is prime, then 2n − 1 is prime” (which is false).
(b)(i) Suppose that a > 1. Then an + 1 > 2; so for an + 1 to be prime, it must be odd, so a must be even.
If n has an odd factor m > 1, we can write n = km. Then
Since m is odd and > 1, we have
And since a > 1, neither factor = 1, so an + 1 can never be prime. Hence n can have no odd factor > 1, which is the same as saying that n = 2r must be a power of 2.
(ii) 21 + 1 = 3, 22 + 1 = 5, 24 + 1 = 17, 28 + 1 = 257, 216 + 1 = 65 537 are all prime. (The very next such expression
is not prime.)
Note: The sad tale of Fermat’s claim that “all Fermat numbers are prime” shows that mathematicians are not exempt from the obligation to distinguish carefully between a statement and its converse!
119.
(a) x2 − 3x + 2 = (x − 2)(x − 1) = 0 precisely when one of the brackets = 0; that is, x = 2, or x = 1.
(b) x2 − 1 = (x − 1)(x + 1) = 0 precisely when x = 1 or x = −1.
(c) x2 − 2x + 1 = (x − 1)2 = 0 precisely when x = 1 (a repeated root).
(d) requires us
− to complete the square
so
− or to use the quadratic formula:
(e) requires us
− to complete the square
so
− or to use the quadratic formula:
(f) x2 + 1 = 0 yields
(g) yields
120. . (There is no obvious magic method here. However, it should be natural to try to insert a term in q(x) to “resolve” the term in p(x); and the familiar cancelling of cross terms in (a + b)(a − b) should then suggest the possible benefit of trying )
Note: p(x)q(x) = x4 + 1 (see Problem 114).
121.
(a) Let the two unknown numbers be α and β Then , and . “The square of half the sum" .
Subtracting produces whose “square root” will be either , or − whichever is positive
Adding this to “half the sum” gives one root; subtracting gives the other root.
(b) Let the length of one side be x. We are told that x2 + bx = c.
“Take half of b, square it, and add the result to c”
translates as:
“Rewrite the equation as: .”
That is, we have “completed the square” . If we now take the (positive) square root and subtract , we get a single value for x, which determines the side length of my square as required.
If the same method is applied to the general quadratic equation
with the extra initial step of “multiply through by ”, we produce first
then
then
which leads to the familiar quadratic formula.
(c) See Problem 3(c)(iv). AD : CB = DX : BX, so x : 1 = 1 : (x − 1). Hence x2 − x − 1 = 0. If we use the quadratic formula derived in the answer to part (b) above, and realise that x > 1, then we obtain .
Note: The procedure given in (a) dates back to the ancient Babylonians (~ 1700 BC) and later to the ancient Greeks (~ 300 BC). Both cultures worked without algebra. The Babylonians gave their verbal procedures as recipes in words, in the context of particular examples. The Greeks expressed everything geometrically. In modern language, if we denote the unknown numbers by α and β, then
Being told the sum and product is therefore the same as being given the coefficients of a quadratic equation, and being asked to find the two roots.
Our method for factorizing a quadratic involves a mental process of ‘inverse arithmetic’, where we juggle possibilities in search of α and β, when all we know are the coefficients (that is, the sum α + β, and the product αβ).
The procedure in (b) also dates back to the ancient Babylonians, and is essentially our process of completing the square. It was given as a procedure, without our algebraic notation. The Babylonians seem not to have been hampered (as the Greeks were) by the fact that it makes no sense to add a length and an area! They worded things geometrically, but seem to have understood that they were really playing numerical games (an idea which European mathematicians found elusive right up to the time of Descartes (1590–1656)).
Similarly, the modern use of symbols - allowing one to represent either positive or negative quantities - was widely resisted right into the nineteenth century. What we would write as a single family of quadratic equations, ax2 + bx + c = 0, had to be split into separate cases where two positive quantities were equated. For example, the groundbreaking book Ars Magna in which Cardano (1501–1576) explained how to solve cubic and quartic equations begins with quadratics - where his procedure distinguishes four different cases: “squares equal to numbers”, “squares equal to things”, “squares and things equal to numbers”, “squares and numbers equal to things”.
122.
(a)(i) .
(ii) .
(iii) We rearrange
[Alternatively: , etc.]
(b) (i) [Cf 121(a).] .
Therefore
and
(ii)
and
(iii) .
Therefore
and
123.
(a)(i) and
and
Hence
(ii) 5 = 2 + 3, and 24 = 4 × 2 × 3;
Therefore
(which is easy to check).
(b) (i) Claim If
Proof and
and
(ii) Simplify and .
5 = 4 + 1 and 16 = 4 × 4 × 1, so .
Actually, there is a simpler solution
6 = 5 + 1 and 20 = 4 × 5 × 1, so .
124.
(a) Let . Then . Hence , so α satisfies the quadratic polynomial equation x2 − 2x − 1 = 0.
Note: Observe that the resulting polynomial is equal to
In other words, to rationalize the coefficients, we need a polynomial which has both and its “conjugate” as roots.
(b) Let . Then . Hence , so α satisfies the quadratic polynomial equation .
Note: Observe that the resulting polynomial is equal to
In other words, to rationalize the coefficients, we need a polynomial which has both and its “conjugate” as roots.
(c) Let . Then , so , and . Hence α satisfies the quartic polynomial equation .
Note: Observe that the resulting polynomial is equal to
In other words, the roots are: (as required), and also , and .
(d) Let . Then
so
and α satisfies the quartic polynomial equation
Note:
so the roots are:
125. A direct approach can be made to work in both cases (but see the Notes).
(a) Suppose to the contrary that , for some integers p, q with HCF(p, q) = 1. Then , so is rational, and we can write with HCF(r,s) = 1. But then 6s2 = r2 ; hence r = 2t must be even; so 3s2 = 2t2, but then s must be even - contradicting HCF(r,s) = 1. Hence cannot be rational.
Note: It is slightly easier to rewrite the initial equation in the form
before squaring to get
which would imply that is rational.
(b) Suppose to the contrary that , for some integers p, q with HCF(p,q) − 1. Then
so is rational. Squaring then gives that
is rational. Subtracting then shows that is rational, and we can proceed as in part (a) to obtain a contradiction. Hence cannot be rational.
Note: It is simpler to rewrite the original equation in the form
before squaring to obtain
whence is rational, and we may proceed as in part (a).
126.
(i) We just have to fill in the missing bits of the partial factorisation
To produce the required term x10 we first insert x7. This then creates an unwanted term “− x 7”, so we add +x4 to cancel this out. This in turn creates an unwanted term “− x 4”, so we add +x to cancel this out. Hence the quotient is x7 + x4 + x, and the remainder is “x + 1”:
Note: It is worth noting a short cut. The factorised term of the form (x3 − 1) (x7 + …) is equal to zero when x3 = 1.
So one way to get the remainder is to “treat x3 as if it were equal to 1”. Then
is just like 1 · x, and x10 + 1 behaves as if it were equal to remainder.
(ii)
so the remainder = x + 1.
Note: If we treat x2 “as if it were equal to 1”, then
behaves as if it were equal to 1 · x + 1
(iii) Apply the Euclidean algorithm to m and n in order to write m = qn + r, where
Then
So the remainder is xr + 1.
Note: If we treat xn - 1 as if were 0 - that is, if we treat xn as if it were equal to 1 − then
which behaves like .
127. Suppose , where deg(r(x)) < 2. Then
Now
so the remainder r(x) = 2.
Note: If x satisfies x2 + x + 1 = 0, then x3 − 1 = 0 and .
behaves just like 1671 + 1 = 2, so r(x) = 2.
128.
(a)
(b)
(Suppose that the quadratic equation
with real coefficients c, d, has x = a + ib as a root. Then take the complex conjugate of the equation p(x) = 0 to see that x = a − ib is also a rooti of
Therefore
so c = −2a, and d = a2 + b2.)
129. Let the two unknown numbers be α and β. Then 10 = α + β, and 40 = αβ, so α and β are roots of the quadratic equation x2 − 10x + 40 = 0. Hence
130.
(a) Applying a simple rearrangement:
(by the usual addition formula: Problem 35)
(b) By part (a),
Hence
Etc.
Note: This should really be presented as a “proof by mathematical induction”, where (having established the initial cases) we “suppose the result holds for powers n = 1, 2, 3,…, k”, and then conclude that
(c)
131.
(a) We factorise: x3 − 1 = (x − 1) (x2 + x + 1, so the roots are x = 1; and
that is, the other two roots are
and
(b) We factorise:
so the roots are x = 1, x = −1, x = i, x = −i.
(c) We factorise:
so the roots are
− x = 1, x = − 1, and
− four further values of x satisfying : that is,
and
and
and
(d) We factorise:
so the roots are
− x = 1, x = −1;
− x = i, x = − i, and
− the roots of and , which happen to be
and
and
and
132. [In Problem 114 you were left to work out the required factorisation with your bare hands - and a bit of inspired guesswork. The suggested approach here is more systematic.]
The roots of x4 + 1 = 0 are complex numbers whose fourth power is equal to that is,
and
and
and
The first two are complex conjugates and give rise to two linear factors whose product is ; the other two are complex conjugates and give rise to two linear factors whose product is . Hence
133.
(a) The roots of x5 − 1 = 0 are precisely the five complex numbers of the form
that is,
From Problem 3(c) we know that
(b) The linear factor is clearly (x − 1). Each quadratic factor arises as the product of two of two conjugate linear factors. We saw in Problem 128(b) that two linear factors corresponding to roots a + bi and a - bi produce the quadratic factor . Hence the two quadratic factors are:
(whose product is equal to x4 + x3 + x2 + x + 1).
134.
(a) Put a = 1, y = x + 1: then x3 + 3x2 − 4 = 0 becomes y3 − 3y = 2.
(b) Divide through by a (which we may assume is non zero, since otherwise it would not be a cubic equation), to obtain a cubic equation
If we now put , then y3 incorporates both the x3 and the x2 terms, and the equation reduces to:
135. Given the equation x3 + 3x2 − 4 = 0. Let y = x + 1.
(i) Then y3 = x3 + 3x2 + 3x + 1, so 0 = x3 + 3x2 − 4 = y3 − 3y − 2.
(ii) Set y = u + v and use the fact that
is an identity, and so holds for all u and v.
(iii) Solve “3uv = 3”, “u3 + v3 = 2”. Substitute from the first equation into the second to get the quadratic equation in : that is, , So u3 = 1.
(iv) Hence u = 1 is certainly a solution. (We know there are also complex cube roots of 1; these lead to the other two solutions of the original cubic, but to “solve the equation” it is enough to find one solution.) Hence v = 1, so y = u + v = 2, and x = 2.
136. The Euclidean algorithm for ordinary integers arises by repeating the division algorithm:
given integers , there exists unique integers q, r such that m = qn + r where0 ⩽ r < n .
Here q is the quotient (the integer part of the division ), and r is the remainder. If we then replace the initial pair (m, n) by the new pair (n, r) and repeat until we obtain the remainder 0, then the last non-zero remainder is equal to HCF(m,n) (see Problem 6). The same idea also works for polynomials with integer coefficients (see Problem 126).
We start by clarifying what we mean by divisibility for Gaussian integers. Given two Gaussian integers, m = a + bi and n = c + di, we say that n = c + di divides m = a + bi (exactly) precisely
when there exists some other Gaussian integer q = e + fi such that m = qn: that is, a + bi = (e + fi)(c + di).
For example, 2 + 3i divides −4 + 7i because (1 + 2i)(2 + 3i) = −4 + 7i.
If m = a + bi and n = c + di are any old Gaussian integers, then it will not in general be true that “n divides m”, but we can imitate the division algorithm. The important idea here when carrying out particular calculations is to realize that “divide by c + di” is the same as “multiply by ”
- first carry out the division
- then take the “nearest” Gaussian integer q = e + fi, and let the difference m − qn = r be the remainder.
As for ordinary integers, any Gaussian integer that is a “common factor of m and n” is then automatically a common factor of n and of r = m − qn, and conversely. That is, the common factors of m and n are precisely the same as the common factors of n and r. So we can repeat the process replacing m, n by n, r. Provided the “remainder” r is in some sense “smaller” than n, we can continue until we reach a stage where the remainder r = 0 - at which point, the last non-zero remainder is equal to the HCF(m,n) (that is, the Gaussian integer which is the HCF of the two initial Gaussian integers m, n).
The feature of the remainders that gets progressively smaller is their norm (see Problem 25, and Problem 54). As so often, this becomes clearer when we look at an example.
Let us try to find the HCF of the two Gaussian integers m = 14−42i and n = 4 −7i.
- First do the division
- What is meant by the nearest Gaussian integer may require an element of judgment; but it is clear that the answer is fairly close to 5 − i = q, where qn = 13 − 39i, with remainder r = m − qn = 1 − 3i.
- Now repeat the process with n, r:
- The nearest Gaussian integer is not well-defined, but the answer is fairly close to . So , with remainder .
- Now repeat the step with the pair r = 1 − 3i and , to discover that with remainder 0. Hence
H C F ( 14 − 42 i , 4 − 7 i ) .
Note: One way to picture the process is to learn to “see” the Gaussian integers geometrically. Every Gaussian integer (such as a + bi) can be written as an integer combination of the two basic Gaussian integers “1” and “i” - namely
Since 1 and i are both of length 1 and perpendicular to each other, this represents the set of all Gaussian integers as the dots in a “square dot lattice” generated by translations in the x- and y- directions of the basic unit square spanned by 0, 1, i, and 1 + i.
Any other given Gaussian integer, such as n = c + di, then generates a “stretched and rotated” square lattice, which consists of all “Gaussian multiples” of c + di - generated by the basic square which is spanned by
Every Gaussian integer (or rather the point, or dot, which corresponds to it) lies either on the boundary, or inside, one of these larger “stretched and rotated” squares: if the diagonal of one of these larger squares has length 2k, then any other Gaussian integer m = a + bi lies inside one of these larger squares, and so lies within distance k (that is, half a diagonal) of some (Gaussian) multiple qn of n = c + di. And the difference m − qn is precisely the required remainder r.
Extra: We interpret . Prove that
(where ≈ denotes “approximately equal to”).