8.6.1: Singular Value Decompositions


An Application to Linear Codes over Finite Fields

For centuries mankind has been using codes to transmit messages. In many cases, for example transmitting financial, medical, or military information, the message is disguised in such a way that it cannot be understood by an intruder who intercepts it, but can be easily “decoded” by the intended receiver. This subject is called cryptography and, while intriguing, is not our focus here. Instead, we investigate methods for detecting and correcting errors in the transmission of the message.

The stunning photos of the planet Saturn sent by the space probe are a very good example of how successful these methods can be. These messages are subject to “noise” such as solar interference which causes errors in the message. The signal is received on Earth with errors that must be detected and corrected before the high-quality pictures can be printed. This is done using error-correcting codes. To see how, we first discuss a system of adding and multiplying integers while ignoring multiples of a fixed integer.

Modular Arithmetic

We work in the set Z = {0, ±1, ±2, ±3, …} of integers, that is, the set of whole numbers. Everyone is familiar with the process of “long division” from arithmetic. For example, we can divide an integer a by 5 and leave a remainder “modulo 5” in the set {0, 1, 2, 3, 4}. As an illustration

19 = 3·5 + 4

so the remainder of 19 modulo 5 is 4. Similarly, the remainder of 137 modulo 5 is 2 because we have 137 = 27·5 + 2. This works even for negative integers: For example,

−17 = (−4)·5 + 3

so the remainder of −17 modulo 5 is 3.

This process is called the division algorithm. More formally, let n ≥ 2 denote an integer. Then every integer a can be written uniquely in the form

a = qn + r where q and r are integers and 0 ≤ r ≤ n − 1

Here q is called the quotient of a modulo n, and r is called the remainder of a modulo n. We refer to n as the modulus. Thus, if n = 6, the fact that 134 = 22·6 + 2 means that 134 has quotient 22 and remainder 2 modulo 6.
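The division algorithm is easy to compute by machine. A minimal sketch in Python: `divmod` floors the quotient, so the remainder always lies in {0, 1, …, n − 1} for n ≥ 2, even when a is negative.

```python
def division_algorithm(a, n):
    """Return (q, r) with a = q*n + r and 0 <= r <= n - 1 (n >= 2)."""
    q, r = divmod(a, n)  # Python's floor division keeps the remainder nonnegative
    return q, r

# The examples worked in the text:
assert division_algorithm(19, 5) == (3, 4)     # 19 = 3*5 + 4
assert division_algorithm(137, 5) == (27, 2)   # 137 = 27*5 + 2
assert division_algorithm(-17, 5) == (-4, 3)   # -17 = (-4)*5 + 3
assert division_algorithm(134, 6) == (22, 2)   # 134 = 22*6 + 2
```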

Our interest here is in the set of all possible remainders modulo n. This set is denoted

Zn = {0, 1, 2, 3, …, n − 1}

and is called the set of integers modulo n. Thus every integer is uniquely represented in Zn by its remainder modulo n.

We are going to show how to do arithmetic in Zn by adding and multiplying modulo n. That is, we add or multiply two numbers in Zn by calculating the usual sum or product in Z and taking the remainder modulo n. It is proved in books on abstract algebra that the usual laws of arithmetic hold in Zn for any modulus n ≥ 2. This seems remarkable until we remember that these laws are true for ordinary addition and multiplication and all we are doing is reducing modulo n.

To illustrate, consider the case n = 6, so that Z6 = {0, 1, 2, 3, 4, 5}. Then 2 + 5 = 1 in Z6 because 7 leaves a remainder of 1 when divided by 6. Similarly, 2·5 = 4 in Z6, while 3 + 5 = 2, and 3 + 3 = 0. In this way we can fill in the addition and multiplication tables for Z6; the result is:

Tables for Z6

\begin{array}{c|cccccc} + & 0 & 1 & 2 & 3 & 4 & 5 \\ \hline 0 & 0 & 1 & 2 & 3 & 4 & 5 \\ 1 & 1 & 2 & 3 & 4 & 5 & 0 \\ 2 & 2 & 3 & 4 & 5 & 0 & 1 \\ 3 & 3 & 4 & 5 & 0 & 1 & 2 \\ 4 & 4 & 5 & 0 & 1 & 2 & 3 \\ 5 & 5 & 0 & 1 & 2 & 3 & 4 \end{array} \quad\quad \begin{array}{c|cccccc} \times & 0 & 1 & 2 & 3 & 4 & 5 \\ \hline 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 1 & 0 & 1 & 2 & 3 & 4 & 5 \\ 2 & 0 & 2 & 4 & 0 & 2 & 4 \\ 3 & 0 & 3 & 0 & 3 & 0 & 3 \\ 4 & 0 & 4 & 2 & 0 & 4 & 2 \\ 5 & 0 & 5 & 4 & 3 & 2 & 1 \end{array} \nonumber

Calculations in Z6 are carried out much as in Z. As an illustration, consider the familiar “distributive law” a(b+c)=ab+ac from ordinary arithmetic. This holds for all a, b, and c in Z6; we verify a particular case:

3(5 + 4) = 3·5 + 3·4 in Z6

In fact, the left side is 3(5 + 4) = 3·3 = 3, and the right side is (3·5) + (3·4) = 3 + 0 = 3 too. Hence doing arithmetic in Z6 is familiar. However, there are differences. For example, 3·4 = 0 in Z6, in contrast to the fact that ab = 0 in Z can only happen when either a = 0 or b = 0. Similarly, 3² = 3 in Z6, unlike Z.
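The calculations in Z6 made above are easy to verify mechanically. A minimal sketch, reducing with `%` after each operation:

```python
def add_mod(a, b, n):
    """Sum in Z_n: add in Z, then take the remainder modulo n."""
    return (a + b) % n

def mul_mod(a, b, n):
    """Product in Z_n: multiply in Z, then take the remainder modulo n."""
    return (a * b) % n

n = 6
assert add_mod(2, 5, n) == 1 and mul_mod(2, 5, n) == 4   # 2 + 5 = 1 and 2*5 = 4 in Z6
# The distributive-law instance: 3(5 + 4) = 3*5 + 3*4 = 3 in Z6
assert mul_mod(3, add_mod(5, 4, n), n) == 3
assert add_mod(mul_mod(3, 5, n), mul_mod(3, 4, n), n) == 3
# Unlike Z, nonzero elements can multiply to zero: 3*4 = 0 in Z6
assert mul_mod(3, 4, n) == 0
```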

Note that we will make statements like −30 = 19 in Z7; it means that −30 and 19 leave the same remainder 5 when divided by 7, and so are equal in Z7 because they both equal 5. In general, if n ≥ 2 is any modulus, the operative fact is that

a = b in Zn if and only if a − b is a multiple of n

In this case we say that a and b are equal modulo n, and write a = b (mod n).

Arithmetic in Zn is, in a sense, simpler than that for the integers. For example, consider negatives. Given the element 8 in Z17, what is −8? The answer lies in the observation that 8 + 9 = 0 in Z17, so −8 = 9 (and −9 = 8). In the same way, finding negatives is not difficult in Zn for any modulus n.

Finite Fields

In our study of linear algebra so far the scalars have been real (possibly complex) numbers. The set R of real numbers has the property that it is closed under addition and multiplication, that the usual laws of arithmetic hold, and that every nonzero real number has an inverse in R. Such a system is called a field. Hence the real numbers R form a field, as does the set C of complex numbers. Another example is the set Q of all rational numbers (fractions); however the set Z of integers is not a field—for example, 2 has no inverse in the set Z because 2x=1 has no solution x in Z.

Our motivation for isolating the concept of a field is that nearly everything we have done remains valid if the scalars are restricted to some field: The gaussian algorithm can be used to solve systems of linear equations with coefficients in the field; a square matrix with entries from the field is invertible if and only if its determinant is nonzero; the matrix inversion algorithm works in the same way; and so on. The reason is that the field has all the properties used in the proofs of these results for the field R, so all the theorems remain valid.

It turns out that there are finite fields—that is, finite sets that satisfy the usual laws of arithmetic and in which every nonzero element a has an inverse, that is an element b in the field such that ab = 1. If n ≥ 2 is an integer, the modular system Zn certainly satisfies the basic laws of arithmetic, but it need not be a field. For example we have 2·3 = 0 in Z6, so 3 has no inverse in Z6 (if 3a = 1 then 2 = 2·1 = 2(3a) = (2·3)a = 0·a = 0 in Z6, a contradiction). The problem is that 6 = 2·3 can be properly factored in Z.

An integer p ≥ 2 is called a prime if p cannot be factored as p = ab where a and b are positive integers and neither a nor b equals 1. Thus the first few primes are 2, 3, 5, 7, 11, 13, 17, …. If n ≥ 2 is not a prime and n = ab where 2 ≤ a, b ≤ n − 1, then ab = 0 in Zn and it follows (as above in the case n = 6) that b cannot have an inverse in Zn, and hence that Zn is not a field. In other words, if Zn is a field, then n must be a prime. Surprisingly, the converse is true:

Theorem 026280. If p is a prime, then Zp is a field using addition and multiplication modulo p.

The proof can be found in books on abstract algebra.1 If p is a prime, the field Zp is called the field of integers modulo p.

For example, consider the case n=5. Then Z5={0,1,2,3,4} and the addition and multiplication tables are:

\begin{array}{c|ccccc} + & 0 & 1 & 2 & 3 & 4 \\ \hline 0 & 0 & 1 & 2 & 3 & 4 \\ 1 & 1 & 2 & 3 & 4 & 0 \\ 2 & 2 & 3 & 4 & 0 & 1 \\ 3 & 3 & 4 & 0 & 1 & 2 \\ 4 & 4 & 0 & 1 & 2 & 3 \end{array} \quad\quad \begin{array}{c|ccccc} \times & 0 & 1 & 2 & 3 & 4 \\ \hline 0 & 0 & 0 & 0 & 0 & 0 \\ 1 & 0 & 1 & 2 & 3 & 4 \\ 2 & 0 & 2 & 4 & 1 & 3 \\ 3 & 0 & 3 & 1 & 4 & 2 \\ 4 & 0 & 4 & 3 & 2 & 1 \end{array} \nonumber

Hence 1 and 4 are self-inverse in Z5, and 2 and 3 are inverses of each other, so Z5 is indeed a field. Here is another important example.

Example 026292. If p = 2, then Z2 = {0, 1} is a field with addition and multiplication modulo 2 given by the tables

\begin{array}{c|cc} + & 0 & 1 \\ \hline 0 & 0 & 1 \\ 1 & 1 & 0 \end{array} \quad \mbox{ and } \quad \begin{array}{c|cc} \times & 0 & 1 \\ \hline 0 & 0 & 0 \\ 1 & 0 & 1 \end{array} \nonumber

This is binary arithmetic, the basic algebra of computers.

While it is routine to find negatives of elements of Zp, it is a bit more difficult to find inverses in Zp. For example, how does one find 14^{-1} in Z17? Since we want 14^{-1}·14 = 1 in Z17, we are looking for an integer a with the property that a·14 = 1 modulo 17. Of course we can try all possibilities in Z17 (there are only 17 of them!), and the result is a = 11 (verify). However this method is of little use for large primes p, and it is a comfort to know that there is a systematic procedure (called the euclidean algorithm) for finding inverses in Zp for any prime p. Furthermore, this algorithm is easy to program for a computer. To illustrate the method, let us once again find the inverse of 14 in Z17.

Example 026308. Find the inverse of 14 in Z17.

The idea is to first divide p=17 by 14:

17 = 1·14 + 3

Now divide (the previous divisor) 14 by the new remainder 3 to get

14 = 4·3 + 2

and then divide (the previous divisor) 3 by the new remainder 2 to get

3 = 1·2 + 1

It is a theorem of number theory that, because 17 is a prime, this procedure will always lead to a remainder of 1. At this point we eliminate remainders in these equations from the bottom up:

\begin{aligned} 1 &= 3 - 1\cdot 2 && \text{since } 3 = 1\cdot 2 + 1 \\ &= 3 - 1(14 - 4\cdot 3) = 5\cdot 3 - 1\cdot 14 && \text{since } 2 = 14 - 4\cdot 3 \\ &= 5(17 - 1\cdot 14) - 1\cdot 14 = 5\cdot 17 - 6\cdot 14 && \text{since } 3 = 17 - 1\cdot 14 \end{aligned} \nonumber

Hence (−6)·14 = 1 in Z17, that is, 11·14 = 1 (because −6 = 11 in Z17). So 14^{-1} = 11 in Z17.
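The procedure in this example is easy to program. The sketch below is one standard rendering of the extended euclidean algorithm (the bookkeeping is a textbook recursion, not taken verbatim from the worked example); it recovers the same coefficient, −6 = 11 in Z17.

```python
def inverse_mod(a, p):
    """Inverse of a in Z_p (p prime, a not a multiple of p), via extended Euclid."""
    def egcd(x, y):
        # Returns (g, s, t) with s*x + t*y = g = gcd(x, y).
        if y == 0:
            return x, 1, 0
        g, s, t = egcd(y, x % y)
        return g, t, s - (x // y) * t
    g, s, _ = egcd(a % p, p)
    assert g == 1  # guaranteed when p is prime and a != 0 in Z_p
    return s % p

assert inverse_mod(14, 17) == 11          # the worked example: 14^{-1} = 11 in Z17
assert (11 * 14) % 17 == 1
assert inverse_mod(2, 7) == 4             # 2^{-1} = 4 in Z7, used in the next example
```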

As mentioned above, nearly everything we have done with matrices over the field of real numbers can be done in the same way for matrices with entries from Zp. We illustrate this with one example. Again the reader is referred to books on abstract algebra.

Example 026324. Determine if the matrix A = \left[ \begin{array}{cc} 1 & 4 \\ 6 & 5 \end{array}\right] over Z7 is invertible and, if so, find its inverse.

Working in Z7 we have det A = 1·5 − 6·4 = 5 − 3 = 2 ≠ 0 in Z7, so A is invertible. Hence Example [exa:004261] gives A^{-1} = 2^{-1}\left[ \begin{array}{rr} 5 & -4 \\ -6 & 1 \end{array}\right]. Note that 2^{-1} = 4 in Z7 (because 2·4 = 1 in Z7). Note also that −4 = 3 and −6 = 1 in Z7, so finally A^{-1} = 4\left[ \begin{array}{cc} 5 & 3 \\ 1 & 1 \end{array}\right] = \left[ \begin{array}{cc} 6 & 5 \\ 4 & 4 \end{array}\right]. The reader can verify that indeed \left[ \begin{array}{cc} 1 & 4 \\ 6 & 5 \end{array}\right]\left[ \begin{array}{cc} 6 & 5 \\ 4 & 4 \end{array}\right] = \left[ \begin{array}{cc} 1 & 0 \\ 0 & 1 \end{array}\right] in Z7.
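The final verification in this example can be delegated to a short computation modulo 7:

```python
p = 7
A    = [[1, 4], [6, 5]]
Ainv = [[6, 5], [4, 4]]          # the inverse found in the example

# det A = 1*5 - 6*4 = 2 != 0 in Z7, so A is invertible
assert (A[0][0] * A[1][1] - A[0][1] * A[1][0]) % p == 2

def matmul_mod(X, Y, p):
    """Matrix product with entries reduced modulo p."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y))) % p
             for j in range(len(Y[0]))] for i in range(len(X))]

assert matmul_mod(A, Ainv, p) == [[1, 0], [0, 1]]   # A * A^{-1} = I in Z7
```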

While we shall not use them, there are finite fields other than Zp for the various primes p. Surprisingly, for every prime p and every integer n ≥ 1, there exists a field with exactly p^n elements, and this field is unique.2 It is called the Galois field of order p^n, and is denoted GF(p^n).

Error Correcting Codes

Coding theory is concerned with the transmission of information over a channel that is affected by noise. The noise causes errors, so the aim of the theory is to find ways to detect such errors and correct at least some of them. General coding theory originated with the work of Claude Shannon (1916–2001) who showed that information can be transmitted at near optimal rates with arbitrarily small chance of error.

Let F denote a finite field and, if n ≥ 1, let

Fn denote the F-vector space of 1×n row matrices over F

with the usual componentwise addition and scalar multiplication. In this context, the rows in Fn are called words (or n-words) and, as the name implies, will be written as [a b c d]=abcd. The individual components of a word are called its digits. A nonempty subset C of Fn is called a code (or an n-code), and the elements in C are called code words. If F=Z2, these are called binary codes.

If a code word w is transmitted and an error occurs, the resulting word v is decoded as the code word “closest” to v in Fn. To make sense of what “closest” means, we need a distance function on Fn analogous to that in Rn (see Theorem [thm:014954]). The usual definition in Rn does not work in this situation. For example, if w=1111 in (Z2)4 then the square of the distance of w from 0 is

(1 − 0)² + (1 − 0)² + (1 − 0)² + (1 − 0)² = 1 + 1 + 1 + 1 = 0 in Z2

even though w ≠ 0.

However there is a satisfactory notion of distance in Fn due to Richard Hamming (1915–1998). Given a word w = a1a2⋯an in Fn, we first define the Hamming weight wt(w) to be the number of nonzero digits in w:

wt(w) = wt(a1a2⋯an) = |{i ∣ ai ≠ 0}|

Clearly, 0 ≤ wt(w) ≤ n for every word w in Fn. Given another word v = b1b2⋯bn in Fn, the Hamming distance d(v, w) between v and w is defined by

d(v, w) = wt(v − w) = |{i ∣ bi ≠ ai}|

In other words, d(v,w) is the number of places at which the digits of v and w differ. The next result justifies using the term distance for this function d.

Theorem 026377. Let u, v, and w denote words in Fn. Then:

  1. d(v, w) ≥ 0.
  2. d(v, w) = 0 if and only if v = w.
  3. d(v, w) = d(w, v).
  4. d(v, w) ≤ d(v, u) + d(u, w).

(1) and (3) are clear, and (2) follows because wt(v) = 0 if and only if v = 0. To prove (4), write x = v − u and y = u − w. Then (4) reads wt(x + y) ≤ wt(x) + wt(y). If x = a1a2⋯an and y = b1b2⋯bn, this follows because ai + bi ≠ 0 implies that either ai ≠ 0 or bi ≠ 0.
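Both definitions, and the triangle inequality just proved, can be checked computationally. A sketch over Z3 (the sample words are chosen here only for illustration):

```python
def wt(w):
    """Hamming weight: the number of nonzero digits of w."""
    return sum(1 for a in w if a != 0)

def d(v, w):
    """Hamming distance: the number of places where v and w differ."""
    return sum(1 for a, b in zip(v, w) if a != b)

v, u, w = (1, 1, 1, 1, 1, 1), (1, 1, 1, 2, 2, 2), (2, 2, 2, 1, 1, 1)  # words over Z3
assert d(v, w) == wt(tuple((a - b) % 3 for a, b in zip(v, w)))  # d(v, w) = wt(v - w)
assert d(v, w) <= d(v, u) + d(u, w)                             # triangle inequality
```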

Given a word w in Fn and a real number r>0, define the ball Br(w) of radius r (or simply the r-ball) about w as follows:

Br(w) = {x in Fn ∣ d(w, x) ≤ r}

Using this we can describe one of the most useful decoding methods.

Nearest Neighbour Decoding. Let C be an n-code, and suppose a word v is transmitted and w is received. Then w is decoded as the code word in C closest to it. (If there is a tie, choose arbitrarily.)

Using this method, we can describe how to construct a code C that can detect (or correct) t errors. Suppose a code word c is transmitted and a word w is received with s errors where 1 ≤ s ≤ t. Then s is the number of places at which the c- and w-digits differ, that is, s = d(c, w). Hence Bt(c) consists of all possible received words where at most t errors have occurred.

Assume first that C has the property that no code word lies in the t-ball of another code word. Because w is in Bt(c) and wc, this means that w is not a code word and the error has been detected. If we strengthen the assumption on C to require that the t-balls about code words are pairwise disjoint, then w belongs to a unique ball (the one about c), and so w will be correctly decoded as c.

To describe when this happens, let C be an n-code. The minimum distance d of C is defined to be the smallest distance between two distinct code words in C; that is,

d = min{d(v, w) ∣ v and w in C; v ≠ w}

Theorem 026416. Let C be an n-code with minimum distance d. Assume that nearest neighbour decoding is used. Then:

  1. If t<d, then C can detect t errors.
  2. If 2t<d, then C can correct t errors.
  1. Let c be a code word in C. If w is in Bt(c), then d(w, c) ≤ t < d by hypothesis. Thus the t-ball Bt(c) contains no other code word, so C can detect t errors by the preceding discussion.
  2. If 2t < d, it suffices (again by the preceding discussion) to show that the t-balls about distinct code words are pairwise disjoint. But if c ≠ c′ are code words in C and w is in Bt(c) ∩ Bt(c′), then Theorem [thm:026377] gives

    d(c, c′) ≤ d(c, w) + d(w, c′) ≤ t + t = 2t < d, a contradiction.

Example 026436. If F = Z3 = {0, 1, 2}, the 6-code {111111, 111222, 222111} has minimum distance 3 and so can detect 2 errors and correct 1 error.
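The minimum distance claimed here can be confirmed by examining all pairs of code words:

```python
from itertools import combinations

C = [(1, 1, 1, 1, 1, 1), (1, 1, 1, 2, 2, 2), (2, 2, 2, 1, 1, 1)]  # the 6-code over Z3

def dist(v, w):
    """Hamming distance: the number of positions where v and w differ."""
    return sum(a != b for a, b in zip(v, w))

assert min(dist(v, w) for v, w in combinations(C, 2)) == 3  # minimum distance is 3
```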

Let c be any word in Fn. A word w satisfies d(w,c)=r if and only if w and c differ in exactly r digits. If |F|=q, there are exactly \binom{n}{r}(q - 1)^r such words where \binom{n}{r} is the binomial coefficient. Indeed, choose the r places where they differ in \binom{n}{r} ways, and then fill those places in \mathbf{w} in (q - 1)^{r} ways. It follows that the number of words in the t-ball about \mathbf{c} is

|B_{t}(\mathbf{c})| = \textstyle \binom{n}{0} + \binom{n}{1}(q - 1) + \binom{n}{2}(q - 1)^2 + \dots + \binom{n}{t}(q - 1)^t = \sum_{i = 0}^{t} \binom{n}{i}(q - 1)^i \nonumber

This leads to a useful bound on the size of error-correcting codes.

Theorem 026447 (Hamming Bound). Let C be an n-code over a field F that can correct t errors using nearest neighbour decoding. If |F| = q, then

|C| \le \frac{q^n}{\sum_{i = 0}^{t} \binom{n}{i}(q - 1)^i} \nonumber

Write k = \sum_{i = 0}^{t} \binom{n}{i}(q - 1)^i. The t-balls centred at distinct code words each contain k words, and there are |C| of them. Moreover they are pairwise disjoint because the code corrects t errors (see the discussion preceding Theorem [thm:026416]). Hence they contain k \cdot |C| distinct words, and so k \cdot |C| \leq |F^{n}| = q^{n}, proving the theorem.

A code is called perfect if there is equality in the Hamming bound; equivalently, if every word in F^{n} lies in exactly one t-ball about a code word. For example, if F = \mathbb{Z}_2, n = 3, and t = 1, then q = 2 and \binom{3}{0} + \binom{3}{1} = 4, so the Hamming bound is \frac{2^3}{4} = 2. The 3-code C = \{000, 111\} has minimum distance 3 and so can correct 1 error by Theorem [thm:026416]. Hence C is perfect.
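Both the ball-size formula and the Hamming bound are simple sums; the sketch below reproduces the computation for the binary code {000, 111} with t = 1:

```python
from math import comb

def ball_size(n, t, q):
    """|B_t(c)| = sum_{i=0}^{t} C(n, i)(q - 1)^i for words of length n over a field of size q."""
    return sum(comb(n, i) * (q - 1) ** i for i in range(t + 1))

# Binary (q = 2), length n = 3, correcting t = 1 error:
assert ball_size(3, 1, 2) == 4            # C(3,0) + C(3,1) = 4
assert 2 ** 3 // ball_size(3, 1, 2) == 2  # Hamming bound: |C| <= 2, met by {000, 111}
```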

Linear Codes

Up to this point we have been regarding any nonempty subset of the F-vector space F^{n} as a code. However many important codes are actually subspaces. A subspace C \subseteq F^n of dimension k \geq 1 over F is called an (n, k)-linear code, or simply an (n, k)-code. We do not regard the zero subspace (that is, k = 0) as a code.

Example 026469. If F = \mathbb{Z}_2 and n \geq 2, the n-parity-check code is constructed as follows: An extra digit is added to each word in F^{n-1} to make the number of 1s in the resulting word even (we say such words have even parity). The resulting (n, n - 1)-code is linear because the sum of two words of even parity again has even parity.

Many of the properties of general codes take a simpler form for linear codes. The following result gives a much easier way to find the minimal distance of a linear code, and sharpens the results in Theorem [thm:026416].

Theorem 026475. Let C be an (n, k)-code with minimum distance d over a finite field F, and use nearest neighbour decoding.

  1. d = \func{min} \{ wt(\mathbf{w}) \mid \mathbf{0} \neq \mathbf{w} \in C\}.
  2. C can detect t \geq 1 errors if and only if t < d.
  3. C can correct t \geq 1 errors if and only if 2t < d.
  4. If C can correct t \geq 1 errors and |F| = q, then

    \textstyle \binom{n}{0} + \binom{n}{1}(q - 1) + \binom{n}{2}(q - 1)^2 + \dots + \binom{n}{t}(q - 1)^t \le q^{n-k} \nonumber

  1. Write d^{\prime} = \func{min}\{wt(\mathbf{w}) \mid \mathbf{0} \neq \mathbf{w} \mbox{ in } C\}. If \mathbf{v} \neq \mathbf{w} are words in C, then d(\mathbf{v}, \mathbf{w}) = wt(\mathbf{v} - \mathbf{w}) \geq d^{\prime} because \mathbf{v} - \mathbf{w} is in the subspace C. Hence d \geq d^{\prime}. Conversely, given \mathbf{w} \neq \mathbf{0} in C then, since \mathbf{0} is in C, we have wt(\mathbf{w}) = d(\mathbf{w}, \mathbf{0}) \geq d by the definition of d. Hence d^{\prime} \geq d and (1) is proved.
  2. Assume that C can detect t errors. Given \mathbf{w} \neq \mathbf{0} in C, the t-ball B_{t}(\mathbf{w}) about \mathbf{w} contains no other code word (see the discussion preceding Theorem [thm:026416]). In particular, it does not contain the code word \mathbf{0}, so t < d(\mathbf{w}, \mathbf{0}) = wt(\mathbf{w}). Hence t < d by (1). The converse is part of Theorem [thm:026416].
  3. We first establish a Claim: if \mathbf{c} is a word with 0 < wt(\mathbf{c}) \leq 2t, then B_{t}(\mathbf{0}) \cap B_{t}(\mathbf{c}) is not empty. Proof: If wt(\mathbf{c}) \leq t, then \mathbf{c} itself is in B_{t}(\mathbf{0}) \cap B_{t}(\mathbf{c}). So assume t < wt(\mathbf{c}) \leq 2t. Then \mathbf{c} has more than t nonzero digits, so we can form a new word \mathbf{w} by changing exactly t of these nonzero digits to zero. Then d(\mathbf{w}, \mathbf{c}) = t, so \mathbf{w} is in B_{t}(\mathbf{c}). But wt(\mathbf{w}) = wt(\mathbf{c}) - t \leq t, so \mathbf{w} is also in B_{t}(\mathbf{0}). Hence \mathbf{w} is in B_{t}(\mathbf{0}) \cap B_{t}(\mathbf{c}), proving the Claim.

    If C corrects t errors, the t-balls about code words are pairwise disjoint (see the discussion preceding Theorem [thm:026416]). Hence the claim shows that wt(\mathbf{c}) > 2t for all \mathbf{c} \neq \mathbf{0} in C, from which d > 2t by (1). The other inequality comes from Theorem [thm:026416].

  4. We have |C| = q^{k} because dim \;_{F} C = k, so this assertion restates Theorem [thm:026447].

Example 026513. If F = \mathbb{Z}_2, then

C = \{0000000, \ 0101010, \ 1010101, \ 1110000, \ 1011010, \ 0100101, \ 0001111, \ 1111111\} \nonumber

is a (7, 3)-code; in fact C = span \;\{0101010, 1010101, 1110000\}. The minimum distance for C is 3, the minimum weight of a nonzero word in C.
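The dimension and minimum distance claimed for this code can be confirmed by generating its span over Z2:

```python
from itertools import product

basis = ["0101010", "1010101", "1110000"]          # the spanning words from the example
rows = [[int(c) for c in s] for s in basis]

# All Z2-linear combinations a*r1 + b*r2 + c*r3 (mod 2):
span = {tuple((a * x + b * y + c * z) % 2 for x, y, z in zip(*rows))
        for a, b, c in product([0, 1], repeat=3)}

assert len(span) == 8                              # dimension 3, so 2^3 code words
assert min(sum(w) for w in span if any(w)) == 3    # minimum weight = minimum distance = 3
```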

Matrix Generators

Given a linear n-code C over a finite field F, the way encoding works in practice is as follows. A message stream is blocked off into segments of length k \leq n called messages. Each message \mathbf{u} in F^{k} is encoded as a code word, the code word is transmitted, the receiver decodes the received word as the nearest code word, and then re-creates the original message. A fast and convenient method is needed to encode the incoming messages, to decode the received word after transmission (with or without error), and finally to retrieve messages from code words. All this can be achieved for any linear code using matrix multiplication.

Let G denote a k \times n matrix over a finite field F, and encode each message \mathbf{u} in F^{k} as the word \mathbf{u}G in F^{n} using matrix multiplication (thinking of words as rows). This amounts to saying that the set of code words is the subspace C = \{\mathbf{u}G \mid \mathbf{u} \mbox{ in } F^{k}\} of F^{n}. This subspace need not have dimension k for every k \times n matrix G. But, if \{\mathbf{e}_{1}, \mathbf{e}_{2}, \dots, \mathbf{e}_{k}\} is the standard basis of F^{k}, then \mathbf{e}_{i}G is row i of G for each i and \{\mathbf{e}_{1}G, \mathbf{e}_{2}G, \dots, \mathbf{e}_{k}G\} spans C. Hence dim \; C = k if and only if the rows of G are independent in F^{n}, and these matrices turn out to be exactly the ones we need. For reference, we state their main properties in Lemma [lem:026536] below (see Theorem [thm:015711]).

Lemma 026536. The following are equivalent for a k \times n matrix G over a finite field F:

  1. rank \; G = k.
  2. The columns of G span F^{k}.
  3. The rows of G are independent in F^{n}.
  4. The system GX = B is consistent for every column B in F^{k}.
  5. GK = I_{k} for some n \times k matrix K.

(1) \Rightarrow (2). This is because dim \;(\func{col} G) = k by (1).

(2) \Rightarrow (4). G \left[ \begin{array}{ccc} x_{1} & \cdots & x_{n} \end{array} \right]^{T} = x_{1}\mathbf{c}_{1} + \cdots + x_{n}\mathbf{c}_{n} where \mathbf{c}_{j} is column j of G.

(4) \Rightarrow (5). G \left[ \begin{array}{ccc} \mathbf{k}_{1} & \cdots & \mathbf{k}_{k} \end{array} \right] = \left[ \begin{array}{ccc} G\mathbf{k}_{1} & \cdots & G\mathbf{k}_{k} \end{array} \right] for columns \mathbf{k}_{j}.

(5) \Rightarrow (3). If a_{1}R_{1} + \cdots + a_{k}R_{k} = 0 where R_{i} is row i of G, then \left[ \begin{array}{ccc} a_{1} & \cdots & a_{k} \end{array} \right] G = 0, so by (5), \left[ \begin{array}{ccc} a_{1} & \cdots & a_{k} \end{array} \right] = 0. Hence each a_{i} = 0, proving (3).

(3) \Rightarrow (1). rank \; G = dim \;(\func{row} G) = k by (3).

Note that Theorem [thm:015711] asserts that, over the real field \mathbb{R}, the properties in Lemma [lem:026536] hold if and only if GG^{T} is invertible. But this need not be true in general. For example, if F = \mathbb{Z}_2 and G = \left[ \begin{array}{rrrr} 1 & 0 & 1 & 0 \\ 0 & 1 & 0 & 1 \end{array}\right], then GG^{T} = 0. The reason is that the dot product \mathbf{w} \cdot \mathbf{w} can be zero for \mathbf{w} in F^{n} even if \mathbf{w} \neq \mathbf{0}. However, even though GG^{T} is not invertible, we do have GK = I_{2} for some 4 \times 2 matrix K over F as Lemma [lem:026536] asserts (in fact, K = \left[ \begin{array}{rrrr} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \end{array}\right]^T is one such matrix).
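This behaviour is easy to exhibit directly; the sketch below checks both facts for the 2 × 4 matrix G of this paragraph:

```python
G = [[1, 0, 1, 0],
     [0, 1, 0, 1]]

# GG^T = 0 over Z2, even though rank G = 2:
GGt = [[sum(a * b for a, b in zip(r1, r2)) % 2 for r2 in G] for r1 in G]
assert GGt == [[0, 0], [0, 0]]

# Yet GK = I_2 for the K given in the text (columns e_1, e_2):
K = [[1, 0], [0, 1], [0, 0], [0, 0]]
GK = [[sum(G[i][k] * K[k][j] for k in range(4)) % 2 for j in range(2)] for i in range(2)]
assert GK == [[1, 0], [0, 1]]
```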

Let C \subseteq F^n be an (n, k)-code over a finite field F. If \{\mathbf{w}_{1}, \dots, \mathbf{w}_{k}\} is a basis of C, let G = \left[ \begin{array}{c} \mathbf{w}_{1}\\ \vdots \\ \mathbf{w}_{k} \end{array}\right] be the k \times n matrix with the \mathbf{w}_{i} as its rows, and let \{\mathbf{e}_{1}, \dots, \mathbf{e}_{k}\} be the standard basis of F^{k} regarded as rows. Then \mathbf{w}_{i} = \mathbf{e}_{i}G for each i, so C = span \;\{\mathbf{w}_{1}, \dots, \mathbf{w}_{k}\} = span \;\{\mathbf{e}_{1}G, \dots, \mathbf{e}_{k}G\}. It follows (verify) that

C = \{\mathbf{u}G \mid \mathbf{u} \mbox{ in } F^k \} \nonumber

Because of this, the k \times n matrix G is called a generator of the code C, and G has rank k by Lemma [lem:026536] because its rows \mathbf{w}_{i} are independent.

In fact, every linear code C in F^{n} has a generator of a simple, convenient form. If G is a generator matrix for C, let R be the reduced row-echelon form of G. We claim that C is also generated by R. Since G \to R by row operations, Theorem [thm:005294] shows that these same row operations \left[ \begin{array}{cc} G & I_k \end{array} \right] \to \left[ \begin{array}{cc} R & W \end{array} \right], performed on \left[ \begin{array}{cc} G & I_k \end{array} \right], produce an invertible k \times k matrix W such that R = WG. Then C = \{\mathbf{u}R \mid \mathbf{u} \mbox{ in } F^{k}\}. [In fact, if \mathbf{u} is in F^{k}, then \mathbf{u}G = \mathbf{u}_{1}R where \mathbf{u}_{1} = \mathbf{u}W^{-1} is in F^{k}, and \mathbf{u}R = \mathbf{u}_{2}G where \mathbf{u}_{2} = \mathbf{u}W is in F^{k}]. Thus R is a generator of C, so we may assume that G is in reduced row-echelon form.

In that case, G has no row of zeros (since rank \; G = k) and so contains all the columns of I_{k}. Hence a series of column interchanges will carry G to the block form G^{\prime\prime} = \left[\begin{array}{cc} I_{k} & A \end{array}\right] for some k \times (n - k) matrix A. Hence the code C^{\prime\prime} = \{\mathbf{u}G^{\prime\prime} \mid \mathbf{u} \mbox{ in } F^k\} is essentially the same as C; the code words in C^{\prime\prime} are obtained from those in C by a series of column interchanges. Hence if C is a linear (n, k)-code, we may (and shall) assume that the generator matrix G has the form

G = \left[\begin{array}{cc} I_{k} & A \end{array}\right] \quad \mbox{ for some } \quad k \times (n - k) \mbox{ matrix } A \nonumber

Such a matrix is called a standard generator, or a systematic generator, for the code C. In this case, if \mathbf{u} is a message word in F^{k}, the first k digits of the encoded word \mathbf{u}G are just the first k digits of \mathbf{u}, so retrieval of \mathbf{u} from \mathbf{u}G is very simple indeed. The last n - k digits of \mathbf{u}G are called parity digits.
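Encoding with a standard generator amounts to appending parity digits to the message. The sketch below illustrates this with a hypothetical 3 × 2 block A of my own choosing (giving a (5, 3)-code over Z2); it is not a matrix from the text.

```python
def encode(u, A, p=2):
    """Encode message u (length k) with the standard generator G = [I_k | A]:
    the code word is u followed by the parity digits u*A (mod p)."""
    k, m = len(A), len(A[0])
    parity = [sum(u[i] * A[i][j] for i in range(k)) % p for j in range(m)]
    return list(u) + parity

# Hypothetical 3 x 2 block A for illustration:
A = [[1, 1],
     [1, 0],
     [0, 1]]

word = encode([1, 0, 1], A)
assert word[:3] == [1, 0, 1]        # the first k digits are the message itself
assert word == [1, 0, 1, 1, 0]      # parity digits: (1+0+0, 1+0+1) mod 2 = (1, 0)
```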

Parity-Check Matrices

We begin with an important theorem about matrices over a finite field.

Theorem 026632. Let F be a finite field, let G be a k \times n matrix of rank k, let H be an (n - k) \times n matrix of rank n - k, and let C = \{\mathbf{u}G \mid \mathbf{u} \mbox{ in } F^{k}\} and D = \{\mathbf{v}H \mid \mathbf{v} \mbox{ in } F^{n-k}\} be the codes they generate. Then the following conditions are equivalent:

  1. GH^{T} = 0.
  2. HG^{T} = 0.
  3. C = \{\mathbf{w} \mbox{ in } F^{n} \mid \mathbf{w}H^{T} = \mathbf{0}\}.
  4. D = \{\mathbf{w} \mbox{ in } F^{n} \mid \mathbf{w}G^{T} = \mathbf{0}\}.

First, (1) \Leftrightarrow (2) holds because HG^{T} and GH^{T} are transposes of each other.

(1) \Rightarrow (3) Consider the linear transformation T : F^{n} \to F^{n-k} defined by T(\mathbf{w}) = \mathbf{w}H^{T} for all \mathbf{w} in F^{n}. To prove (3) we must show that C = \text{ker} T. We have C \subseteq \text{ker} T by (1) because T(\mathbf{u}G) = \mathbf{u}GH^{T} = \mathbf{0} for all \mathbf{u} in F^{k}. Since dim \; C = rank \; G = k, it is enough (by Theorem [thm:019525]) to show dim \;(\text{ker} T) = k. However the dimension theorem (Theorem [thm:021499]) shows that dim \;(\text{ker} T) = n - dim \;(im \; T), so it is enough to show that dim \;(im \; T) = n - k. But if R_{1}, \dots, R_{n} are the rows of H^{T}, then block multiplication gives

im \; T = \{ \mathbf{w}H^T \mid \mathbf{w} \mbox{ in } F^n \} = span \;\{R_{1}, \dots, R_{n} \} = \func{row}(H^T) \nonumber

Hence dim \;(im \; T) = rank \; (H^{T}) = rank \; H = n - k, as required. This proves (3).

(3) \Rightarrow (1) If \mathbf{u} is in F^{k}, then \mathbf{u}G is in C so, by (3), \mathbf{u}(GH^{T}) = (\mathbf{u}G)H^{T} = \mathbf{0}. Since \mathbf{u} is arbitrary in F^{k}, it follows that GH^{T} = 0.

(2) \Leftrightarrow (4) The proof is analogous to (1) \Leftrightarrow (3).

The relationship between the codes C and D in Theorem [thm:026632] will be characterized in another way in the next subsection.

If C is an (n, k)-code, an (n - k) \times n matrix H is called a parity-check matrix for C if C = \{\mathbf{w} \mid \mathbf{w}H^{T} = \mathbf{0}\} as in Theorem [thm:026632]. Such matrices are easy to find for a given code C. If G = \left[ \begin{array}{cc} I_k & A \end{array} \right] is a standard generator for C where A is k \times (n - k), the (n - k) \times n matrix

H = \left[\begin{array}{cc} -A^T & I_{n-k} \end{array}\right] \nonumber

is a parity-check matrix for C. Indeed, rank \; H = n - k because the rows of H are independent (due to the presence of I_{n - k}), and

GH^T = \left[\begin{array}{cc} I_{k} & A \end{array}\right]\left[\begin{array}{c} -A \\ I_{n-k} \end{array}\right] = -A + A = 0 \nonumber

by block multiplication. Hence H is a parity-check matrix for C and we have C = \{\mathbf{w} \mbox{ in } F^{n} \mid \mathbf{w}H^{T} = \mathbf{0}\}. Since \mathbf{w}H^{T} and H\mathbf{w}^{T} are transposes of each other, this shows that C can be characterized as follows:

C = \{\mathbf{w} \mbox{ in } F^n \mid H \mathbf{w}^T = \mathbf{0} \} \nonumber

by Theorem [thm:026632].

This is useful in decoding. The reason is that decoding is done as follows: If a code word \mathbf{c} is transmitted and \mathbf{v} is received, then \mathbf{z} = \mathbf{v} - \mathbf{c} is called the error. Since H\mathbf{c}^{T} = \mathbf{0}, we have H\mathbf{z}^{T} = H\mathbf{v}^{T} and this word

\mathbf{s} = H\mathbf{z}^T = H\mathbf{v}^T \nonumber

is called the syndrome. The receiver knows \mathbf{v} and \mathbf{s} = H\mathbf{v}^{T}, and wants to recover \mathbf{c}. Since \mathbf{c} = \mathbf{v} - \mathbf{z}, it is enough to find \mathbf{z}. But the possibilities for \mathbf{z} are the solutions of the linear system

H\mathbf{z}^T = \mathbf{s} \nonumber

where \mathbf{s} is known. Now recall that Theorem [thm:002849] shows that these solutions have the form \mathbf{z} = \mathbf{x} + \mathbf{s} where \mathbf{x} is any solution of the homogeneous system H\mathbf{x}^{T} = \mathbf{0}, that is, \mathbf{x} is any word in C (by Lemma [lem:026536]). In other words, the errors \mathbf{z} are the elements of the set

C + \mathbf{s} = \{\mathbf{c} + \mathbf{s} \mid \mathbf{c} \mbox{ in } C\} \nonumber

The set C + \mathbf{s} is called a coset of C. Let |F| = q. Since |C + \mathbf{s}| = |C| = q^{k}, the search for \mathbf{z} is reduced from q^{n} possibilities in F^{n} to q^{k} possibilities in C + \mathbf{s}. This is called syndrome decoding, and various methods for improving efficiency and accuracy have been devised. The reader is referred to books on coding for more details.3
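Syndrome computation is a single matrix–vector product modulo 2. The sketch below uses a hypothetical (5, 3)-code with parity-check matrix H = [−A^T | I₂] for A = [[1,1],[1,0],[0,1]] (an example of my own choosing, not from the text): code words have zero syndrome, while a single error produces the corresponding column of H.

```python
def syndrome(H, v, p=2):
    """s = H v^T (mod p), computed row by row."""
    return [sum(h * x for h, x in zip(row, v)) % p for row in H]

# H = [-A^T | I_2] = [A^T | I_2] over Z2 for the hypothetical A above:
H = [[1, 1, 0, 1, 0],
     [1, 0, 1, 0, 1]]

c = [1, 0, 1, 1, 0]                   # a code word of this code
assert syndrome(H, c) == [0, 0]       # code words have zero syndrome

v = [0, 0, 1, 1, 0]                   # c with a single error in the first digit
assert syndrome(H, v) == [1, 1]       # nonzero syndrome; it equals the first column of H
```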

Orthogonal Codes

Let F be a finite field. Given two words \mathbf{v} = a_{1} a_{2} \cdots a_{n} and \mathbf{w} = b_{1} b_{2} \cdots b_{n} in F^{n}, the dot product \mathbf{v} \cdot \mathbf{w} is defined (as in \mathbb{R}^n) by

\mathbf{v}\bullet \mathbf{w} = a_{1}b_{1} + a_{2}b_{2} + \dots + a_{n}b_{n} \nonumber

Note that \mathbf{v} \cdot \mathbf{w} is an element of F, and it can be computed as a matrix product: \mathbf{v}\cdot\mathbf{w} = \mathbf{v}\mathbf{w}^{T}.

If C \subseteq F^{n} is an (n, k)-code, the orthogonal complement C^{\perp} is defined as in \mathbb{R}^n:

C^\perp = \{\mathbf{v} \mbox{ in } F^n \mid \mathbf{v}\bullet \mathbf{c} = 0 \mbox{ for all } \mathbf{c} \mbox{ in } C \} \nonumber

This is easily seen to be a subspace of F^{n}, and it turns out to be an (n, n - k)-code. This follows when F = \mathbb{R} because we showed (in the projection theorem) that n = dim \; U^{\perp} + dim \; U for any subspace U of \mathbb{R}^n. However the proofs break down for a finite field F because the dot product in F^{n} has the property that \mathbf{w} \cdot \mathbf{w} = 0 can happen even if \mathbf{w} \neq \mathbf{0}. Nonetheless, the result remains valid.

Theorem 026725. Let C be an (n, k)-code over a finite field F, let G = \left[ \begin{array}{cc} I_k & A \end{array} \right] be a standard generator for C where A is k \times (n - k), and write H = \left[ \begin{array}{cc} -A^{T} & I_{n - k} \end{array} \right] for the parity-check matrix. Then:

  1. H is a generator of C^{\perp}.
  2. dim \;(C^{\perp}) = n - k = rank \; H.
  3. C^{\perp\perp} = C and dim \;(C^{\perp}) + dim \; C = n.

As in Theorem [thm:026632], let D = \{\mathbf{v}H \mid \mathbf{v} \mbox{ in } F^{n - k}\} denote the code generated by H. Observe first that, for all \mathbf{w} in F^{n} and all \mathbf{u} in F^{k}, we have

\mathbf{w}\bullet (\mathbf{u}G) = \mathbf{w}(\mathbf{u}G)^T= \mathbf{w}(G^T\mathbf{u}^T) = (\mathbf{w}G^T)\bullet \mathbf{u} \nonumber

Since C = \{\mathbf{u}G \mid \mathbf{u} \mbox{ in } F^{k}\}, this shows that \mathbf{w} is in C^{\perp} if and only if (\mathbf{w}G^{T}) \cdot \mathbf{u} = 0 for all \mathbf{u} in F^{k}; if and only if \mathbf{w}G^{T} = \mathbf{0}; if and only if \mathbf{w} is in D (by Theorem [thm:026632]). Thus C^{\perp} = D and a similar argument shows that D^{\perp} = C.

  1. H generates C^{\perp} because C^{\perp} = D = \{\mathbf{v}H \mid \mathbf{v} \mbox{ in } F^{n - k} \}.
  2. This follows from (1) because, as we observed above, rank \; H = n - k.
  3. Since C^{\perp} = D and D^{\perp} = C, we have C^{\perp\perp} = (C^{\perp})^{\perp} = D^{\perp} = C. Finally the second equation in (3) restates (2) because dim \; C = k.

We note in passing that, if C is a subspace of \mathbb{R}^n, we have C + C^{\perp} = \mathbb{R}^n by the projection theorem (Theorem [thm:023885]), and C \cap C^{\perp} = \{\mathbf{0}\} because any vector \mathbf{x} in C \cap C^{\perp} satisfies \|\mathbf{x}\|^{2} = \mathbf{x} \cdot \mathbf{x} = 0. However, this fails in general. For example, if F = \mathbb{Z}_2 and C = span \;\{1010, 0101\} in F^{4} then C^{\perp} = C, so C + C^{\perp} = C = C \cap C^{\perp}.

We conclude with one more example. If F = \mathbb{Z}_2, consider the standard matrix G below, and the corresponding parity-check matrix H:

G = \left[ \begin{array}{rrrrrrr} 1 & 0 & 0 & 0 & 1 & 1 & 1\\ 0 & 1 & 0 & 0 & 1 & 1 & 0 \\ 0 & 0 & 1 & 0 & 1 & 0 & 1 \\ 0 & 0 & 0 & 1 & 0 & 1 & 1 \end{array}\right] \quad \mbox{ and } \quad H = \left[ \begin{array}{rrrrrrr} 1 & 1 & 1 & 0 & 1 & 0 & 0\\ 1 & 1 & 0 & 1 & 0 & 1 & 0 \\ 1 & 0 & 1 & 1 & 0 & 0 & 1 \end{array}\right] \nonumber

The code C = \{\mathbf{u}G \mid \mathbf{u} \mbox{ in } F^{4}\} generated by G has dimension k = 4, and is called the Hamming (7, 4)-code. The vectors in C are listed in the first table below. The dual code generated by H has dimension n - k = 3 and is listed in the second table.

\begin{array}{rrc|c} && \mathbf{u} & \mathbf{u}G \\ \cline{3-4} && 0000 & 0000000 \\ && 0001 & 0001011 \\ && 0010 & 0010101 \\ && 0011 & 0011110 \\ && 0100 & 0100110 \\ && 0101 & 0101101 \\ && 0110 & 0110011 \\ C: && 0111 & 0111000 \\ && 1000 & 1000111 \\ && 1001 & 1001100 \\ && 1010 & 1010010 \\ && 1011 & 1011001 \\ && 1100 & 1100001 \\ && 1101 & 1101010 \\ && 1110 & 1110100 \\ && 1111 & 1111111 \end{array} \quad \quad \begin{array}{rrc|c} && \mathbf{v} & \mathbf{v}H \\ \cline{3-4} && 000 & 0000000 \\ && 001 & 1011001 \\ && 010 & 1101010 \\ C^\perp: && 011 & 0110011 \\ && 100 & 1110100 \\ && 101 & 0101101 \\ && 110 & 0011110 \\ && 111 & 1000111 \end{array} \nonumber

Clearly each nonzero code word in C has weight at least 3, so C has minimum distance d = 3. Hence C can detect two errors and correct one error by Theorem [thm:026475]. The dual code has minimum distance 4 and so can detect 3 errors and correct 1 error.
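The minimum weights quoted above are easy to confirm by brute force. The following Python sketch enumerates both codes from G and H and reports the minimum nonzero weight of each:

```python
import itertools
import numpy as np

G = np.array([[1, 0, 0, 0, 1, 1, 1],
              [0, 1, 0, 0, 1, 1, 0],
              [0, 0, 1, 0, 1, 0, 1],
              [0, 0, 0, 1, 0, 1, 1]])
H = np.array([[1, 1, 1, 0, 1, 0, 0],
              [1, 1, 0, 1, 0, 1, 0],
              [1, 0, 1, 1, 0, 0, 1]])

# Rows of G are orthogonal to rows of H mod 2, so H generates the dual code
assert not (G @ H.T % 2).any()

C = [np.array(u) @ G % 2 for u in itertools.product([0, 1], repeat=4)]
D = [np.array(v) @ H % 2 for v in itertools.product([0, 1], repeat=3)]

def min_weight(code):
    # For a linear code, the minimum distance equals the minimum nonzero weight
    return min(int(w.sum()) for w in code if w.any())

print(min_weight(C), min_weight(D))  # 3 4
```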

Exercises

Find all a in \mathbb{Z}_{10} such that:

  1. a^{2} = a.
  2. a has an inverse (and find the inverse).
  3. a^{k} = 0 for some k \geq 1.
  4. a = 2^{k} for some k \geq 1.
  5. a = b^{2} for some b in \mathbb{Z}_{10}.
  2. 1^{-1} = 1, 9^{-1} = 9, 3^{-1} = 7, 7^{-1} = 3.
  4. 2^{1} = 2, 2^{2} = 4, 2^{3} = 8, 2^{4} = 16 = 6, 2^{5} = 12 = 2, 2^{6} = 2^{2}, \dots so a = 2^k if and only if a = 2, 4, 6, 8.

  1. Show that if 3a = 0 in \mathbb{Z}_{10}, then necessarily a = 0 in \mathbb{Z}_{10}.
  2. Show that 2a = 0 in \mathbb{Z}_{10} holds in \mathbb{Z}_{10} if and only if a = 0 or a = 5.
  2. If 2a = 0 in \mathbb{Z}_{10}, then 2a = 10k for some integer k. Thus a = 5k, so a = 0 or a = 5 in \mathbb{Z}_{10}.

Find the inverse of:

  1. 8 in \mathbb{Z}_{13}.
  2. 11 in \mathbb{Z}_{19}.

  2. 11^{-1} = 7 in \mathbb{Z}_{19}.
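Such inverses can be found by trial, or with Python's built-in three-argument pow, which accepts a negative exponent for modular inverses (Python 3.8+); a quick check:

```python
# Modular inverses via three-argument pow (Python 3.8+)
print(pow(8, -1, 13))   # 5, since 8 * 5 = 40 = 3(13) + 1
print(pow(11, -1, 19))  # 7, since 11 * 7 = 77 = 4(19) + 1
```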

If ab = 0 in a field F, show that either a = 0 or b = 0.

Show that the entries of the last column of the multiplication table of \mathbb{Z}_n are

0, n - 1, n - 2, \dots, 2, 1 \nonumber

in that order.

In each case show that the matrix A is invertible over the given field, and find A^{-1}.

  1. A = \left[ \begin{array}{rr} 1 & 4 \\ 2 & 1 \end{array}\right] over \mathbb{Z}_5.

  2. A = \left[ \begin{array}{rr} 5 & 6 \\ 4 & 3 \end{array}\right] over \mathbb{Z}_7.

  2. \det A = 15 - 24 = 1 + 4 = 5 \neq 0 in \mathbb{Z}_{7}, so A^{-1} exists. Since 5^{-1} = 3 in \mathbb{Z}_{7}, we have A^{-1} = 3\left[ \begin{array}{rr} 3 & -6 \\ -4 & 5 \end{array}\right] = 3\left[ \begin{array}{rr} 3 & 1 \\ 3 & 5 \end{array}\right] = \left[ \begin{array}{rr} 2 & 3 \\ 2 & 1 \end{array}\right].
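The adjugate computation used in this answer is easy to mechanize; here is a Python sketch that inverts A over \mathbb{Z}_7 in the same way and checks the result:

```python
import numpy as np

p = 7
A = np.array([[5, 6],
              [4, 3]])

det = int(A[0, 0] * A[1, 1] - A[0, 1] * A[1, 0]) % p      # 5 in Z_7
det_inv = pow(det, -1, p)                                  # 5^{-1} = 3 in Z_7
adj = np.array([[A[1, 1], -A[0, 1]],
                [-A[1, 0], A[0, 0]]])                      # adjugate of A
A_inv = det_inv * adj % p

print(A_inv)                                               # [[2 3], [2 1]]
assert not ((A @ A_inv - np.eye(2, dtype=int)) % p).any()  # A A^{-1} = I mod 7
```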

Consider the linear system \arraycolsep=1.5pt \begin{array}{rrrrrrr} 3x & + & y & + & 4z & = & 3 \\ 4x & + & 3y & + & z & = & 1 \end{array}. In each case solve the system by reducing the augmented matrix to reduced row-echelon form over the given field:

  1. \mathbb{Z}_5
  2. \mathbb{Z}_7

  2. We have 5 \cdot 3 = 1 in \mathbb{Z}_{7} so the reduction of the augmented matrix is:

    \begin{aligned} \left[ \begin{array}{rrrr} 3 & 1 & 4 & 3 \\ 4 & 3 & 1 & 1 \end{array}\right] & \rightarrow \left[ \begin{array}{rrrr} 1 & 5 & 6 & 1 \\ 4 & 3 & 1 & 1 \end{array}\right] \\ & \rightarrow \left[ \begin{array}{rrrr} 1 & 5 & 6 & 1 \\ 0 & 4 & 5 & 4 \end{array}\right] \\ & \rightarrow \left[ \begin{array}{rrrr} 1 & 5 & 6 & 1 \\ 0 & 1 & 3 & 1 \end{array}\right] \\ & \rightarrow \left[ \begin{array}{rrrr} 1 & 0 & 5 & 3 \\ 0 & 1 & 3 & 1 \end{array}\right].\end{aligned} \nonumber

    Hence x = 3 + 2t, y = 1 + 4t, z = t; t in \mathbb{Z}_{7}.
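Because \mathbb{Z}_7 is finite, the claimed solution set can be verified exhaustively; a Python sketch:

```python
# Check that x = 3 + 2t, y = 1 + 4t, z = t solves the system
#   3x + y + 4z = 3,  4x + 3y + z = 1
# over Z_7 for every t in Z_7 = {0, 1, ..., 6}.
p = 7
for t in range(p):
    x, y, z = (3 + 2 * t) % p, (1 + 4 * t) % p, t
    assert (3 * x + y + 4 * z) % p == 3
    assert (4 * x + 3 * y + z) % p == 1
print("all 7 parametrized solutions satisfy the system")
```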

Let K be a vector space over \mathbb{Z}_2 with basis \{1, t\}, so K = \{a + bt \mid a, b \mbox{ in } \mathbb{Z}_2\}. It is known that K becomes a field of four elements if we define t^{2} = 1 + t. Write down the multiplication table of K.
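As a computational aid (not required by the exercise), the table can be generated in Python by encoding a + bt as the pair (a, b) and reducing with the relation t^2 = 1 + t:

```python
# Elements of K are a + b*t with a, b in Z_2, encoded as pairs (a, b).
def mul(x, y):
    a, b = x
    c, d = y
    # (a + bt)(c + dt) = ac + (ad + bc) t + bd t^2, and t^2 = 1 + t,
    # so bd joins the constant part and the t-part alike.
    return ((a * c + b * d) % 2, (a * d + b * c + b * d) % 2)

elems = [(0, 0), (1, 0), (0, 1), (1, 1)]  # 0, 1, t, 1 + t
for x in elems:
    print([mul(x, y) for y in elems])     # one row of the multiplication table
```

For instance, t \cdot t comes out as (1, 1), that is t^2 = 1 + t, and t(1 + t) = t + t^2 = 1.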

Let K be a vector space over \mathbb{Z}_3 with basis \{1, t\}, so K = \{a + bt \mid a, b \mbox{ in } \mathbb{Z}_3\}. It is known that K becomes a field of nine elements if we define t^{2} = -1 in \mathbb{Z}_3. In each case find the inverse of the element x of K:

  1. x = 1 + 2t
  2. x = 1 + t

  2. (1 + t)^{-1} = 2 + t.

[ex:8_7_10] How many errors can be detected or corrected by each of the following binary linear codes?

  1. C = \{0000000, 0011110, 0100111, 0111001, \\ \hspace*{2em} 1001011, 1010101, 1101100, 1110010\}
  2. C = \{0000000000, 0010011111, 0101100111,\\ \hspace*{2em} 0111111000, 1001110001, 1011101110,\\ \hspace*{2em} 1100010110, 1110001001\}
  2. The minimum weight of C is 5, so it detects 4 errors and corrects 2 errors.
  1. If a binary linear (n, 2)-code corrects one error, show that n \geq 5. [Hint: Hamming bound.]
  2. Find a (5, 2)-code that corrects one error.
  2. \{00000, 01110, 10011, 11101\}.
  1. If a binary linear (n, 3)-code corrects two errors, show that n \geq 9. [Hint: Hamming bound.]
  2. If G = \left[ \begin{array}{rrrrrrrrrr} 1 & 0 & 0 & 1 & 1 & 1 & 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 1 & 1 & 0 & 0 & 1 & 1 & 0 \\ 0 & 0 & 1 & 1 & 0 & 1 & 0 & 1 & 1 & 1 \end{array}\right], show that the binary (10, 3)-code generated by G corrects two errors. [It can be shown that no binary (9, 3)-code corrects two errors.]

  2. The code is \{0000000000, 1001111000, 0101100110,
    0011010111, 1100011110, 1010101111,
    0110110001, 1111001001\}. This has minimum distance 5 and so corrects 2 errors.
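The claimed minimum distance is easy to confirm by listing all 2^3 = 8 code words generated by G; a Python sketch:

```python
import itertools
import numpy as np

G = np.array([[1, 0, 0, 1, 1, 1, 1, 0, 0, 0],
              [0, 1, 0, 1, 1, 0, 0, 1, 1, 0],
              [0, 0, 1, 1, 0, 1, 0, 1, 1, 1]])

code = [np.array(u) @ G % 2 for u in itertools.product([0, 1], repeat=3)]
# For a linear code, the minimum distance equals the minimum nonzero weight
d = min(int(w.sum()) for w in code if w.any())
print(d)  # 5, so the code corrects floor((5 - 1)/2) = 2 errors
```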

  1. Show that no binary linear (4, 2)-code can correct single errors.
  2. Find a binary linear (5, 2)-code that can correct one error.
  2. \{00000, 10110, 01101, 11011\} is a (5, 2)-code of minimum weight 3, so it corrects single errors.

Find the standard generator matrix G and the parity-check matrix H for each of the following systematic codes:

  1. \{00000, 11111\} over \mathbb{Z}_2.
  2. Any systematic (n, 1)-code where n \geq 2.
  3. The code in Exercise [ex:8_7_10](a).
  4. The code in Exercise [ex:8_7_10](b).
  2. G = \left[ \begin{array}{cc} 1 & \mathbf{u} \end{array} \right] for some row \mathbf{u} in F^{n-1}. H = \left[ \begin{array}{cc} -\mathbf{u}^T & I_{n-1} \end{array}\right].

Let \mathbf{c} be a word in F^{n}. Show that B_{t}(\mathbf{c}) = \mathbf{c} + B_{t}(\mathbf{0}), where we write

\mathbf{c} + B_{t}(\mathbf{0}) = \{\mathbf{c} + \mathbf{v} \mid \mathbf{v} \mbox{ in } B_{t}(\mathbf{0})\} \nonumber

If an (n, k)-code has two standard generator matrices G and G_{1}, show that G = G_{1}.

Let C be a binary linear n-code (over \mathbb{Z}_2). Show that either each word in C has even weight, or half the words in C have even weight and half have odd weight. [Hint: The dimension theorem.]


  1. See, for example, W. Keith Nicholson, Introduction to Abstract Algebra, 4th ed., (New York: Wiley, 2012).↩
  2. See, for example, W. K. Nicholson, Introduction to Abstract Algebra, 4th ed., (New York: Wiley, 2012).↩
  3. For an elementary introduction, see V. Pless, Introduction to the Theory of Error-Correcting Codes, 3rd ed., (New York: Wiley, 1998).↩
  4. If \mathbf{v} \cdot \mathbf{u} = 0 for every \mathbf{u} in F^{k}, then \mathbf{v} = \mathbf{0}—let \mathbf{u} range over the standard basis of F^{k}.↩

This page titled 8.6.1: Singular Value Decompositions is shared under a CC BY-NC-SA 4.0 license and was authored, remixed, and/or curated by W. Keith Nicholson (Lyryx Learning Inc.) via source content that was edited to the style and standards of the LibreTexts platform.
