9.6: Alphabets and words

Last updated
Save as PDF

Page ID: 90681

\( \newcommand{\vecs}[1]{\overset { \scriptstyle \rightharpoonup} {\mathbf{#1}} } \) \( \newcommand{\vecd}[1]{\overset{-\!-\!\rightharpoonup}{\vphantom{a}\smash {#1}}} \)\(\newcommand{\id}{\mathrm{id}}\) \( \newcommand{\Span}{\mathrm{span}}\) \( \newcommand{\kernel}{\mathrm{null}\,}\) \( \newcommand{\range}{\mathrm{range}\,}\) \( \newcommand{\RealPart}{\mathrm{Re}}\) \( \newcommand{\ImaginaryPart}{\mathrm{Im}}\) \( \newcommand{\Argument}{\mathrm{Arg}}\) \( \newcommand{\norm}[1]{\| #1 \|}\) \( \newcommand{\inner}[2]{\langle #1, #2 \rangle}\) \( \newcommand{\Span}{\mathrm{span}}\) \(\newcommand{\id}{\mathrm{id}}\) \( \newcommand{\Span}{\mathrm{span}}\) \( \newcommand{\kernel}{\mathrm{null}\,}\) \( \newcommand{\range}{\mathrm{range}\,}\) \( \newcommand{\RealPart}{\mathrm{Re}}\) \( \newcommand{\ImaginaryPart}{\mathrm{Im}}\) \( \newcommand{\Argument}{\mathrm{Arg}}\) \( \newcommand{\norm}[1]{\| #1 \|}\) \( \newcommand{\inner}[2]{\langle #1, #2 \rangle}\) \( \newcommand{\Span}{\mathrm{span}}\)\(\newcommand{\AA}{\unicode[.8,0]{x212B}}\)

Definition: Alphabet

any set can be considered an alphabet

Definition: Letters

the elements of an alphabet set

Definition: Word

a finite-length, ordered list of letters

Definition: \(\Sigma^{\ast}\)

the set of words using alphabet set \(\Sigma\)

Remark \(\PageIndex{1}\)

Even if the alphabet set \(\Sigma\) is the usual English-language alphabet, we do not restrict ourselves to actual English-language words — nonsense words are allowed.

Example \(\PageIndex{1}\): English is not a full set of words

Using \(\Sigma = \Sigma = \{a, b, \ldots, y, z\} \text{,}\) words

\begin{equation*} \mathrm{math}, \; \mathrm{qwerty}, \; \mathrm{aabbccddijzuuu} \end{equation*}
are examples of elements in \(\Sigma^{\ast}\text{.}\) So, ignoring punctuation, hyphenation, and capitalization, the English language is a proper subset of \(\Sigma^{\ast}\text{.}\)

Example \(\PageIndex{2}\): If digits are letters then numbers are words

Using alphabet \(\Sigma = \{ 0,\, 1,\, 2,\, \dotsc,\, 9 \}\text{,}\) then \(\mathbb{N} \subsetneqq \Sigma^{\ast}\text{.}\)

Checkpoint \(\PageIndex{1}\)

Why is \(\mathbb{N} \neq \Sigma^{\ast}\) in Example \(\PageIndex{2}\)?

In computing science, a certain set of words is of particular importance.

Definition: binary word

a word using alphabet \(\{0,1\}\)

Definition: binary string

synonym for binary word

Warning \(\PageIndex{1}\)

Order matters! For example, using the alphabet

\begin{equation*} \Sigma = \{a, b, \ldots, y, z\} \text{,} \end{equation*}

the words \(\mathrm{ab}\) and \(\mathrm{ba}\) are different words in \(\Sigma^{\ast}\text{.}\)

Definition: length (of a word)

given \(w \in \Sigma^{\ast}\text{,}\) the length of \(w\) is the number of elements from \(\Sigma\) used to form \(w\text{,}\) counting repetition

Definition: \(\vert w \vert\)

length of the word \(w \in \Sigma^{\ast}\)

Example \(\PageIndex{5}\)

Using alphabet \(\Sigma = \Sigma = \{a, b, \ldots, y, z\}\text{,}\) we have

\begin{align*} \vert \mathrm{qwerty} \vert & = 6 \text{,} & \vert \mathrm{aabab} \vert & = 5 \text{.} \end{align*}

The concept of length allows us to identify some special subsets and a special element of \(\Sigma^{\ast}\text{.}\)

Definition: \(\Sigma^{\ast}_n\)

for \(n \in \mathbb{N}\text{,}\) the subset of \(\Sigma\) consisting of all words of length \(n\)

Definition: empty word

given an alphabet \(\Sigma\text{,}\) we always consider \(\Sigma^{\ast}\) to contain a unique word of length \(0\)

Definition: \(Ø\)

the empty word