1.5: More than the sum of their parts
- Page ID
- 54890
\( \newcommand{\vecs}[1]{\overset { \scriptstyle \rightharpoonup} {\mathbf{#1}} } \)
\( \newcommand{\vecd}[1]{\overset{-\!-\!\rightharpoonup}{\vphantom{a}\smash {#1}}} \)
\( \newcommand{\id}{\mathrm{id}}\) \( \newcommand{\Span}{\mathrm{span}}\)
( \newcommand{\kernel}{\mathrm{null}\,}\) \( \newcommand{\range}{\mathrm{range}\,}\)
\( \newcommand{\RealPart}{\mathrm{Re}}\) \( \newcommand{\ImaginaryPart}{\mathrm{Im}}\)
\( \newcommand{\Argument}{\mathrm{Arg}}\) \( \newcommand{\norm}[1]{\| #1 \|}\)
\( \newcommand{\inner}[2]{\langle #1, #2 \rangle}\)
\( \newcommand{\Span}{\mathrm{span}}\)
\( \newcommand{\id}{\mathrm{id}}\)
\( \newcommand{\Span}{\mathrm{span}}\)
\( \newcommand{\kernel}{\mathrm{null}\,}\)
\( \newcommand{\range}{\mathrm{range}\,}\)
\( \newcommand{\RealPart}{\mathrm{Re}}\)
\( \newcommand{\ImaginaryPart}{\mathrm{Im}}\)
\( \newcommand{\Argument}{\mathrm{Arg}}\)
\( \newcommand{\norm}[1]{\| #1 \|}\)
\( \newcommand{\inner}[2]{\langle #1, #2 \rangle}\)
\( \newcommand{\Span}{\mathrm{span}}\) \( \newcommand{\AA}{\unicode[.8,0]{x212B}}\)
\( \newcommand{\vectorA}[1]{\vec{#1}} % arrow\)
\( \newcommand{\vectorAt}[1]{\vec{\text{#1}}} % arrow\)
\( \newcommand{\vectorB}[1]{\overset { \scriptstyle \rightharpoonup} {\mathbf{#1}} } \)
\( \newcommand{\vectorC}[1]{\textbf{#1}} \)
\( \newcommand{\vectorD}[1]{\overrightarrow{#1}} \)
\( \newcommand{\vectorDt}[1]{\overrightarrow{\text{#1}}} \)
\( \newcommand{\vectE}[1]{\overset{-\!-\!\rightharpoonup}{\vphantom{a}\smash{\mathbf {#1}}}} \)
\( \newcommand{\vecs}[1]{\overset { \scriptstyle \rightharpoonup} {\mathbf{#1}} } \)
\( \newcommand{\vecd}[1]{\overset{-\!-\!\rightharpoonup}{\vphantom{a}\smash {#1}}} \)
\(\newcommand{\avec}{\mathbf a}\) \(\newcommand{\bvec}{\mathbf b}\) \(\newcommand{\cvec}{\mathbf c}\) \(\newcommand{\dvec}{\mathbf d}\) \(\newcommand{\dtil}{\widetilde{\mathbf d}}\) \(\newcommand{\evec}{\mathbf e}\) \(\newcommand{\fvec}{\mathbf f}\) \(\newcommand{\nvec}{\mathbf n}\) \(\newcommand{\pvec}{\mathbf p}\) \(\newcommand{\qvec}{\mathbf q}\) \(\newcommand{\svec}{\mathbf s}\) \(\newcommand{\tvec}{\mathbf t}\) \(\newcommand{\uvec}{\mathbf u}\) \(\newcommand{\vvec}{\mathbf v}\) \(\newcommand{\wvec}{\mathbf w}\) \(\newcommand{\xvec}{\mathbf x}\) \(\newcommand{\yvec}{\mathbf y}\) \(\newcommand{\zvec}{\mathbf z}\) \(\newcommand{\rvec}{\mathbf r}\) \(\newcommand{\mvec}{\mathbf m}\) \(\newcommand{\zerovec}{\mathbf 0}\) \(\newcommand{\onevec}{\mathbf 1}\) \(\newcommand{\real}{\mathbb R}\) \(\newcommand{\twovec}[2]{\left[\begin{array}{r}#1 \\ #2 \end{array}\right]}\) \(\newcommand{\ctwovec}[2]{\left[\begin{array}{c}#1 \\ #2 \end{array}\right]}\) \(\newcommand{\threevec}[3]{\left[\begin{array}{r}#1 \\ #2 \\ #3 \end{array}\right]}\) \(\newcommand{\cthreevec}[3]{\left[\begin{array}{c}#1 \\ #2 \\ #3 \end{array}\right]}\) \(\newcommand{\fourvec}[4]{\left[\begin{array}{r}#1 \\ #2 \\ #3 \\ #4 \end{array}\right]}\) \(\newcommand{\cfourvec}[4]{\left[\begin{array}{c}#1 \\ #2 \\ #3 \\ #4 \end{array}\right]}\) \(\newcommand{\fivevec}[5]{\left[\begin{array}{r}#1 \\ #2 \\ #3 \\ #4 \\ #5 \\ \end{array}\right]}\) \(\newcommand{\cfivevec}[5]{\left[\begin{array}{c}#1 \\ #2 \\ #3 \\ #4 \\ #5 \\ \end{array}\right]}\) \(\newcommand{\mattwo}[4]{\left[\begin{array}{rr}#1 \amp #2 \\ #3 \amp #4 \\ \end{array}\right]}\) \(\newcommand{\laspan}[1]{\text{Span}\{#1\}}\) \(\newcommand{\bcal}{\cal B}\) \(\newcommand{\ccal}{\cal C}\) \(\newcommand{\scal}{\cal S}\) \(\newcommand{\wcal}{\cal W}\) \(\newcommand{\ecal}{\cal E}\) \(\newcommand{\coords}[2]{\left\{#1\right\}_{#2}}\) \(\newcommand{\gray}[1]{\color{gray}{#1}}\) \(\newcommand{\lgray}[1]{\color{lightgray}{#1}}\) \(\newcommand{\rank}{\operatorname{rank}}\) \(\newcommand{\row}{\text{Row}}\) \(\newcommand{\col}{\text{Col}}\) \(\renewcommand{\row}{\text{Row}}\) \(\newcommand{\nul}{\text{Nul}}\) \(\newcommand{\var}{\text{Var}}\) \(\newcommand{\corr}{\text{corr}}\) \(\newcommand{\len}[1]{\left|#1\right|}\) \(\newcommand{\bbar}{\overline{\bvec}}\) \(\newcommand{\bhat}{\widehat{\bvec}}\) \(\newcommand{\bperp}{\bvec^\perp}\) \(\newcommand{\xhat}{\widehat{\xvec}}\) \(\newcommand{\vhat}{\widehat{\vvec}}\) \(\newcommand{\uhat}{\widehat{\uvec}}\) \(\newcommand{\what}{\widehat{\wvec}}\) \(\newcommand{\Sighat}{\widehat{\Sigma}}\) \(\newcommand{\lt}{<}\) \(\newcommand{\gt}{>}\) \(\newcommand{\amp}{&}\) \(\definecolor{fillinmathshade}{gray}{0.9}\)We motivate this first chapter by noticing that while many real-world structures are compositional, the results of observing them are often not. The reason is that observation is inherently “lossy”: in order to extract information from something, one must drop the details. For example, one stores a real number by rounding it to some precision. But if the details are actually relevant in a given system operation, then the observed result of that operation will not be as expected. This is clear in the case of roundoff error, but it also shows up in non-numerical domains: observing a complex system is rarely enough to predict its behavior because the observation is lossy.
A central theme in category theory is the study of structures and structure-preserving maps. A map f : X → Y is a kind of observation of object X via a specified relationship it has with another object, Y. For example, think of X as the subject of an experiment and Y as a meter connected to X, which allows us to extract certain features of X by looking at the reaction of Y.
Asking which aspects of X one wants to preserve under the observation f becomes the question “what category are you working in?.” As an example, there are many functions f from \(\mathbb{R}\) to \(\mathbb{R}\), and we can think of them as observations: rather than view x “directly”, we only observe f (x). Out of all the functions f : \(\mathbb{R}\) →\(\mathbb{R}\), only some of them preserve the order of numbers, only some of them preserve the distance between numbers, only some of them preserve the sum of numbers, etc. Let’s check in with an exercise; a solution can be found in Chapter 1.
Some terminology: a function f : \(\mathbb{R}\) → \(\mathbb{R}\) is said to be
(a) order-preserving if x ≤ y implies f (x) ≤ f (y), for all x, y ∈ R;1
(b) metric-preserving if |x - y| = |f (x) - f (y)|;
(c) addition-preserving if f (x + y) = f (x) + f (y).
For each of the three properties defined above call it foo find an f that is foo preserving and an example of an f that is not foo preserving. ♦
In category theory we want to keep control over which aspects of our systems are being preserved under various observations. As we said above, the less structure is preserved by our observation of a system, the more “surprises” occur when we observe its operations. One might call these surprises generative effects.
In using category theory to explore generative effects, we follow the basic ideas from work by Adam [Ada17]. He goes much more deeply into the issue than we can here; see Section 1.5. But as mentioned above, we must also use this chapter to give an order-theoretic warm-up for the full-fledged category theory to come.
first look at generative effects
To explore the notion of a generative effect we need a sort of system, a sort of observation, and a system-level operation that is not preserved by the observation. Let’s start with a simple example.
A simple system. Consider three points; we’ll call them •, ◦ and ∗. In this example, a system will simply be a way of connecting these points together. We might think of our points as sites on a power grid, with a system describing connection by power lines, or as people susceptible to some disease, with a system describing interactions that can lead to contagion. As an abstract example of a system, there is a system where • and ◦ are connected, but neither are connected to ∗. We shall draw this like so:
Connections are symmetric, so if a is connected to b, then b is connected to a. Connections are also transitive, meaning that if a is connected to b, and b is connected to c, then a is connected to c; that is, all a, b, and c are connected. Friendship is not transitive my friend’s friend is not necessarily my friend but possible communication of a concept or a disease is.
Here we depict two more systems, one in which none of the points are connected, and one in which all three points are connected.
There are five systems in all, and we depict them just below.
Now that we have defined the sort of system we want to discuss, suppose that Alice is observing this system. Her observation of interest, which we call Φ, extracts a single feature from a system, namely whether the point • is connected to the point ∗; this is what she wants to know. Her observation of the system will be an assignment of either true or false; she assigns true if • is connected to ∗, and false otherwise. So Φ assigns the value true to the following two systems:
and Φ assigns the value false to the three remaining systems:
The last piece of setup is to give a sort of operation that Alice wants to perform on the systems themselves. It’s a very common operation one that will come up many times throughout the book called join. If the reader has been following the story arc, the expectation here is that Alice’s connectivity observation will not be compositional with respect to the operation of system joining; that is, there will be generative effects. Let’s see what this means.
Joining our simple systems. Joining two systems A and B is performed simply by combining their connections. That is, we shall say the join of systems A and B, denote it A \(\lor\) B, has a connection between points x and y if there are some points z1, . . . , zn such that each of the following are true in at least one of A or B: x is connected to z1, zi is connected to zi+1, and zn is connected to y. In a three-point system, the above definition is overkill, but we want to say something that works for systems with any number of elements. The high-level way to say it is “take the transitive closure of the union of the connections in A and B.” In our three-element system, it means for example that
and
What is the result of joining the following two systems?
- We are now ready to see the generative effect. We don’t want to build it up too much this example has been made as simple as possible but we will see that Alice’s observation fails to preserve the join operation. We’ve been denoting her observation measuring whether • and ∗ are connected by the symbol Φ; it returns a boolean result, either true or false.
We see above in Eq.(1.2) that \(\Phi\left(\odot^{\circ}\right)=\Phi\left(^{\circ} \%\right)=\text { false }\): in both cases •isnotconnected to ∗. On the other hand, when we join these two systems as in Eq. (1.3), we see that \(\Phi\left(\Omega^{\circ} \vee^{\circ} \theta\right)=\Phi\left(^{\circ}\right)=\text { true: }\) in the joined system, • is connected to ∗. The question that Alice is interested in, that of Φ, is inherently lossy with respect to join, and there is no way to fix it without a more detailed observation, one that includes not only ∗ and • but also ◦.
While this was a simple example, it should be noted that whether the potential for such effects exist—i.e. determining whether an observation is operation-preserving— can be incredibly important information to know. For example, Alice could be in charge of putting together the views of two local authorities regarding possible contagion between an infected person • and a vulnerable person ∗. Alice has noticed that if they separately extract information from their raw data and combine the results, it gives a different answer than if they combine their raw data and extract information from it.
Ordering systems
Category theory is all about organizing and layering structures. In this section we will explain how the operation of joining systems can be derived from a more basic structure: order. We will see that while joining is not preserved by Alice’s connectivity observation Φ, order is.
To begin, we note that the systems themselves are ordered in a hierarchy. Given systems A and B, we say that A ≤ B if, whenever x is connected to y in A, then x is connected to y in B. For example,
This notion of ≤ leads to the following diagram:
where an arrow from system A to system B means A ≤ B. Such diagrams are known as Hasse diagrams.
As we were saying above, the notion of join is derived from this order. Indeed for any two systems A and B in the Hasse diagram (1.5), the joined system A ∨ B is the smallest system that is bigger than both A and B. That is, A ≤ (A \(\lor\) B) and B ≤ (A \(\lor\) B), and for any C, if A ≤ C and B ≤ C then (A \(\lor\) B) ≤ C. Let’s walk through this with an exercise.
- Write down all the partitions of a two element set {•, ∗}, order them as above, and draw the Hasse diagram.
- Now do the same thing for a four element-set, say {1, 2, 3, 4}. There should be 15 partitions.
Choose any two systems in your 15-element Hasse diagram, call them A and B.
3. What is A ∨ B, using the definition given in the paragraph above Eq. (1.3)?
4. Is it true that A ≤ (A \(\lor\) B) and B ≤ (A \(\lor\) B)?
5. What are all the systems C for which both A ≤ C and B ≤ C.
6. Is it true that in each case, (A \(\lor\) B) ≤ C? ♦
The set \(\mathbb{B}\) = {true, false} of booleans also has an order, false ≤ true:
Thus false ≤ false, false ≤ true, and true ≤ true, but true ≠ false. In other words, A ≤ B if A implies B.\(^{2}\)
For any A, B in B, we can again write A \(\lor\) B to mean the least element that is greater than both A and B.
Using the order false ≤ true on \(\mathbb{B}\) = {true, false}, what is:
- true \(\lor\) false?
- false \(\lor\) true?
- true \(\lor\) true?
- false \(\lor\) false? ♦
Let’s return to our systems with •, ◦, and ∗, and Alice’s “• is connected to ∗” function, which we called Φ. It takes any such system and returns either true or false. Note that the map Φ preserves the ≤ order: if A ≤ B and there is a connection between • and∗ in A, then there is such a connection in B too. The possibility of a generative effect is captured in the inequality
\(\Phi(A) \vee \Phi(B) \leq \Phi(A \vee B)\). (1.8)
We saw on page 4 that this can be a strict inequality: we showed two systems A and B with Φ(A) = Φ(B) = false, so Φ(A) ∨ Φ(B) = false, but where Φ(A \(\lor\) B) = true. In this case, a generative effect exists.
These ideas capture the most basic ideas in category theory. Most directly, we have seen that the map Φ preserves some structure but not others: it preserves order but not join. In fact, we have seen here hints of more complex notions from category theory, without making them explicit; these include the notions of category, functor, colimit, and adjunction. In this chapter we will explore these ideas in the elementary setting of ordered sets.