3.4: Adjunctions and Data Migration
- Page ID
- 54906
\( \newcommand{\vecs}[1]{\overset { \scriptstyle \rightharpoonup} {\mathbf{#1}} } \)
\( \newcommand{\vecd}[1]{\overset{-\!-\!\rightharpoonup}{\vphantom{a}\smash {#1}}} \)
\( \newcommand{\dsum}{\displaystyle\sum\limits} \)
\( \newcommand{\dint}{\displaystyle\int\limits} \)
\( \newcommand{\dlim}{\displaystyle\lim\limits} \)
\( \newcommand{\id}{\mathrm{id}}\) \( \newcommand{\Span}{\mathrm{span}}\)
( \newcommand{\kernel}{\mathrm{null}\,}\) \( \newcommand{\range}{\mathrm{range}\,}\)
\( \newcommand{\RealPart}{\mathrm{Re}}\) \( \newcommand{\ImaginaryPart}{\mathrm{Im}}\)
\( \newcommand{\Argument}{\mathrm{Arg}}\) \( \newcommand{\norm}[1]{\| #1 \|}\)
\( \newcommand{\inner}[2]{\langle #1, #2 \rangle}\)
\( \newcommand{\Span}{\mathrm{span}}\)
\( \newcommand{\id}{\mathrm{id}}\)
\( \newcommand{\Span}{\mathrm{span}}\)
\( \newcommand{\kernel}{\mathrm{null}\,}\)
\( \newcommand{\range}{\mathrm{range}\,}\)
\( \newcommand{\RealPart}{\mathrm{Re}}\)
\( \newcommand{\ImaginaryPart}{\mathrm{Im}}\)
\( \newcommand{\Argument}{\mathrm{Arg}}\)
\( \newcommand{\norm}[1]{\| #1 \|}\)
\( \newcommand{\inner}[2]{\langle #1, #2 \rangle}\)
\( \newcommand{\Span}{\mathrm{span}}\) \( \newcommand{\AA}{\unicode[.8,0]{x212B}}\)
\( \newcommand{\vectorA}[1]{\vec{#1}} % arrow\)
\( \newcommand{\vectorAt}[1]{\vec{\text{#1}}} % arrow\)
\( \newcommand{\vectorB}[1]{\overset { \scriptstyle \rightharpoonup} {\mathbf{#1}} } \)
\( \newcommand{\vectorC}[1]{\textbf{#1}} \)
\( \newcommand{\vectorD}[1]{\overrightarrow{#1}} \)
\( \newcommand{\vectorDt}[1]{\overrightarrow{\text{#1}}} \)
\( \newcommand{\vectE}[1]{\overset{-\!-\!\rightharpoonup}{\vphantom{a}\smash{\mathbf {#1}}}} \)
\( \newcommand{\vecs}[1]{\overset { \scriptstyle \rightharpoonup} {\mathbf{#1}} } \)
\(\newcommand{\longvect}{\overrightarrow}\)
\( \newcommand{\vecd}[1]{\overset{-\!-\!\rightharpoonup}{\vphantom{a}\smash {#1}}} \)
\(\newcommand{\avec}{\mathbf a}\) \(\newcommand{\bvec}{\mathbf b}\) \(\newcommand{\cvec}{\mathbf c}\) \(\newcommand{\dvec}{\mathbf d}\) \(\newcommand{\dtil}{\widetilde{\mathbf d}}\) \(\newcommand{\evec}{\mathbf e}\) \(\newcommand{\fvec}{\mathbf f}\) \(\newcommand{\nvec}{\mathbf n}\) \(\newcommand{\pvec}{\mathbf p}\) \(\newcommand{\qvec}{\mathbf q}\) \(\newcommand{\svec}{\mathbf s}\) \(\newcommand{\tvec}{\mathbf t}\) \(\newcommand{\uvec}{\mathbf u}\) \(\newcommand{\vvec}{\mathbf v}\) \(\newcommand{\wvec}{\mathbf w}\) \(\newcommand{\xvec}{\mathbf x}\) \(\newcommand{\yvec}{\mathbf y}\) \(\newcommand{\zvec}{\mathbf z}\) \(\newcommand{\rvec}{\mathbf r}\) \(\newcommand{\mvec}{\mathbf m}\) \(\newcommand{\zerovec}{\mathbf 0}\) \(\newcommand{\onevec}{\mathbf 1}\) \(\newcommand{\real}{\mathbb R}\) \(\newcommand{\twovec}[2]{\left[\begin{array}{r}#1 \\ #2 \end{array}\right]}\) \(\newcommand{\ctwovec}[2]{\left[\begin{array}{c}#1 \\ #2 \end{array}\right]}\) \(\newcommand{\threevec}[3]{\left[\begin{array}{r}#1 \\ #2 \\ #3 \end{array}\right]}\) \(\newcommand{\cthreevec}[3]{\left[\begin{array}{c}#1 \\ #2 \\ #3 \end{array}\right]}\) \(\newcommand{\fourvec}[4]{\left[\begin{array}{r}#1 \\ #2 \\ #3 \\ #4 \end{array}\right]}\) \(\newcommand{\cfourvec}[4]{\left[\begin{array}{c}#1 \\ #2 \\ #3 \\ #4 \end{array}\right]}\) \(\newcommand{\fivevec}[5]{\left[\begin{array}{r}#1 \\ #2 \\ #3 \\ #4 \\ #5 \\ \end{array}\right]}\) \(\newcommand{\cfivevec}[5]{\left[\begin{array}{c}#1 \\ #2 \\ #3 \\ #4 \\ #5 \\ \end{array}\right]}\) \(\newcommand{\mattwo}[4]{\left[\begin{array}{rr}#1 \amp #2 \\ #3 \amp #4 \\ \end{array}\right]}\) \(\newcommand{\laspan}[1]{\text{Span}\{#1\}}\) \(\newcommand{\bcal}{\cal B}\) \(\newcommand{\ccal}{\cal C}\) \(\newcommand{\scal}{\cal S}\) \(\newcommand{\wcal}{\cal W}\) \(\newcommand{\ecal}{\cal E}\) \(\newcommand{\coords}[2]{\left\{#1\right\}_{#2}}\) \(\newcommand{\gray}[1]{\color{gray}{#1}}\) \(\newcommand{\lgray}[1]{\color{lightgray}{#1}}\) \(\newcommand{\rank}{\operatorname{rank}}\) \(\newcommand{\row}{\text{Row}}\) \(\newcommand{\col}{\text{Col}}\) \(\renewcommand{\row}{\text{Row}}\) \(\newcommand{\nul}{\text{Nul}}\) \(\newcommand{\var}{\text{Var}}\) \(\newcommand{\corr}{\text{corr}}\) \(\newcommand{\len}[1]{\left|#1\right|}\) \(\newcommand{\bbar}{\overline{\bvec}}\) \(\newcommand{\bhat}{\widehat{\bvec}}\) \(\newcommand{\bperp}{\bvec^\perp}\) \(\newcommand{\xhat}{\widehat{\xvec}}\) \(\newcommand{\vhat}{\widehat{\vvec}}\) \(\newcommand{\uhat}{\widehat{\uvec}}\) \(\newcommand{\what}{\widehat{\wvec}}\) \(\newcommand{\Sighat}{\widehat{\Sigma}}\) \(\newcommand{\lt}{<}\) \(\newcommand{\gt}{>}\) \(\newcommand{\amp}{&}\) \(\definecolor{fillinmathshade}{gray}{0.9}\)We have talked about how set-valued functors on a schema can be understood as filling that schema with data. But there are also functors between schemas. When the two sorts of functors are composed, data is migrated. This is the simplest form of data migration; more complex ways to migrate data come from using adjoints. All of the above is the subject of this section.
Pulling back data along a functor
To begin, we will migrate data between the graph-indexing schema Gr and the loop schema, which we call DDS, shown below

We begin by writing down a sample instance I : DDS → Set on this schema:

We call the schema DDS to stand for discrete dynamical system. Indeed, we may think of the data in the DDS-instance of Eq. (3.65) as listing the states and movements of a deterministic machine: at every point in time the machine is in one of the listed states, and given the machine in one of the states, in the next instant it moves to a uniquely determined next state.
Our goal is to migrate the data in Eq. (3.65) to data on Gr; this will give us the data of a graph and so allow us to visualise our machine.
We will use a functor connecting these schemas in order to move data between them. The reader can create any functor she likes, but we will use a specific functor F : Gr → DDS to migrate data in a way that makes sense to us, the authors. Here we draw F, using colors to hopefully aid understanding:

The functor F sends both objects of Gr to the ‘State’ object of DDS (as it must). On morphisms, it sends the ‘source’ morphism to the identity morphism on ‘State’, and the ‘target’ morphism to the morphism ‘next’.
A sample database instance on DDS was given in Eq. (3.65); recall this is a functor I : DDS → Set. So now we have two functors as follows:
\(\mathbf{G r} \stackrel{F}{\longrightarrow} \mathbf{D D S} \stackrel{I}{\longrightarrow} \text { Set. }\) (3.66)
Objects in Gr are sent by F to objects in DDS, which are sent by I to objects in Set, which are sets. Morphisms in Gr are sent by F to morphisms in DDS, which are sent by I to morphisms in Set, which are functions. This defines a composite functor F ; I : Gr → Set. Both F and I respect identities and composition, so F ; I does too. Thus we have obtained an instance on Gr, i.e. we have converted our discrete dynamical system from Eq. (3.65) into a graph! What graph is it?
For an instance on Gr, we need to fill an Arrow table and a Vertex table. Both of these are sent by F to State, so let’s fill both with the rows of State in Eq. (3.65). Similarly, since F sends ‘source’ to the identity and sends ‘target’ to ‘next’, we obtain the following tables:

Now that we have a graph, we can draw it.

Each arrow is labeled by its source vertex, as if to say, “What I do next is determined by what I am now.”
Consider the functor G : Gr → DDS given by sending ‘source’ to ‘next’ and sending ‘target’ to the identity on ‘State’. Migrate the same data, called I in Eq. (3.65), using the functor G. Write down the tables and draw the corresponding graph. ♦
We refer to the above procedure—basically just composing functors as in Eq. (3.66) as “pulling back data along a functor.” We just now pulled back data I along functor F.
Let C and D be categories and let F : C → D be a functor. For any set-valued functor I : D → Set, we refer to the composite functor F ; I : C → Set as the pullback of I along F.
Given a natural transformation α : I ⇒ J, there is a natural transformation \(α_{F}\) : F ; I ⇒ F ; J, whose component (F ; I)(c)→(F ; J)(c) for any c \(\in\) Ob(C) is given by (\(α_{F}\))\(_{c}\) := \(α_{Fc}\)

This uses the data of F to define a functor ∆F : D-Inst → C-Inst.
Note that the term pullback is also used for a certain sort of limit, for more details see Remark 3.100.
Adjunctions
In Section 1.4 we discussed Galois connections, which are adjunctions between pre- orders. Now that we’ve defined categories and functors, we can discuss adjunctions in general. The relevance to databases is that the data migration functor ∆ from Defi- nition 3.68 always has two adjoints of its own: a left adjoint which we denote Σ and a right adjoint which we denote Π.
Recall that an adjunction between preorders P and Q is a pair of monotone maps f : P → Q and g : Q → P that are almost inverses: we have
f(p) ≤ q if and only if p ≤ g(q). (3.69)
Recall from Section 3.2.3 that in a preorder P, a hom-set P(a, b) has one element when a ≤ b, and no elements otherwise. We can thus rephrase Eq. (3.69) as an isomorphism of sets Q( f (p), q) \(\cong\) P(p, g(q)): either both are one-element sets or both are 0-element sets. This suggests how to define adjunctions in the general case.
Let C and D be categories, and L: C → D and R: D → C be functors. We say that L is left adjoint to R (and that R is right adjoint to L) if, for any c \(\in\) C and d \(\in\) D, there is an isomorphism of hom-sets
\(\alpha_{c, d}: \mathcal{C}(c, R(d)) \stackrel{\cong}{\rightarrow} \mathcal{D}(L(c), d)\)
that is natural in c and d.\(^{8}\)
Given a morphism f : c → R(d) in C, its image g := \(α_{c,d}\)(f) is called its mate. Similarly, the mate of g : L(c) → d is f .
To denote an adjunction we write L ⊣ R, or in diagrams,

with the ⇒ in the direction of the left adjoint.
Recall that every preorder P can be regarded as a category. Galois connections between preorders and adjunctions between the corresponding categories are exactly the same thing.
Let B \(\in\) Ob(Set) be any set. There is an adjunction called ‘currying B,’ after the logician Haskell Curry:

Abstractly we write it as on the left, but what this means is that for any sets A, C, there is a natural isomorphism as on the right.
To explain this, we need to talk about exponential objects in Set. Suppose that B and C are sets. Then the set of functions B → C is also a set; let’s denote it CB. It’s written this way because if C has 10 elements and B has 3 elements then CB has 103 elements, and more generally for any two finite sets |\(C^{B}\)| = |C|\(_{|B|}\).
The idea of currying is that given sets A, B, and C, there is a one-to-one correspondence between functions (A × B) → C and functions A → \(C^{B}\). Intuitively, if I have a function f of two variables a, b, I can “put off” entering the second variable: if you give me just a, I’ll return a function B → C that’s waiting for the B input. This is the curried version of f . As one might guess, there is a formal connection between exponential objects and what we called hom-elements b -o c in Definition 2.79.
In Example 3.72, we discussed an adjunction between functors − × B and \((−)^{B}\). But we only said how these functors worked on objects: for an arbitrary set X, they return sets X × B and X\(^{B}\) respectively.

- Given a morphism f : X → Y, what morphism should − × B: X × B → Y × B return?
- Given a morphism f : X → Y, what morphism should (−)\(^{B}\) : \(^X_{B}\) → \(Y^{B}\) return?
- Consider the function +: \(\mathbb{N}\) × \(\mathbb{N}\) → \(\mathbb{N}\), which sends (a, b) → a + b. Currying +, we get a certain function p : \(\mathbb{N}\) → \(\mathbb{N}\)\(^{\mathbb{N}}\). What is p(3)? ♦
If you know some abstract algebra or topology, here are some other examples of adjunctions.
- Free constructions: given any set you get a free group, free monoid, free ring, free vector space, etc.; each of these is a left adjoint. The corresponding right adjoint takes a group, a monoid, a ring, a vector space etc. and forgets the algebraic structure to return the underlying set.
- Similarly,givenagraphyougetafreepreorderorafreecategory,aswediscussed in Section 3.2.3; each is a left adjoint. The corresponding right adjoint is the underlying graph of a preorder or of a category.
- Discrete things: given any set you get a discrete preorder, discrete graph, discrete metric space, discrete category, discrete topological space; each of these is a left adjoint. The corresponding right adjoint is again underlying set.
- Codiscrete things: given any set you get a codiscrete preorder, complete graph, codiscrete category, codiscrete topological space; each of these is a right adjoint. The corresponding left adjoint is the underlying set.
- Given a group, you can quotient by its commutator subgroup to get an abelian group; this is a left adjoint. The right adjoint is the inclusion of abelian groups into groups.
Left and right pushforward functors, \(\Sigma\) and Π
Given F: C → D, the data migration functor ∆F turns D-instances into C-instances. This functor has both a left and a right adjoint:

Using the names \(\Sigma\) and Π in this context is fairly standard in category theory. In the case of databases, they have the following helpful mnemonic:

Just like we used ∆F to pull back any discrete dynamical system along F : Gr → DDS and get a graph, the migration functors ΣF and ΠF can be used to turn any graph into a discrete dynamical system. That is, given an instance J : Gr → Set, we can get instances \(\Sigma\)F(J) and ΠF(J) on DDS. This, however, is quite technical, and we leave it to the adventurous reader to compute an example, with help perhaps from [Spi14a], which explores the definitions of Σ and Π in detail. A less technical shortcut is simply to code up the computation in the open-source FQL software.
To get the basic idea across without getting mired in technical details, here we shall instead discuss a very simple example. Recall the schemas from Eq. (3.5). We can set up a functor between them, the one sending black dots to black dots and white dots to white dots:

With this functor F in hand, we can transform any B-instance into an A-instance using ∆F. Whereas ∆ was interesting in the case of turning discrete dynamical systems into graphs in Section 3.4.1, it is not very interesting in this case. Indeed, it will just copy ∆ for duplicate—the rows in Airline seat into both Economy and First Class.
∆F has two adjoints, \(\Sigma\)F and ΠF, both of which transform any A-instance I into a B-instance. The functor \(\Sigma\)F does what one would most expect from reading the names on each object: it will put into Airline Seat the union of Economy and First Class:
\(\sigma\)F (I)(Airline Seat) = I (Economy) ⊔ I (First Class).
The functor ΠF puts into Airline Seat the set of those pairs (e, f) where e is an Economy seat, f is a First Class seat, and e and f have the same price and position.
In this particular example, one imagines that there should be no such seats in a valid instance I, in which case ΠF (I)(Airline Seat) would be empty. But in other uses of these same schemas, ΠF can be a useful operation. For example, in the schema A replace the label ‘Economy’ by ‘Rewards Program’, and in B replace ‘Airline Seat’ by ‘First Class Seats’. Then the operation ΠF finds those first class seats that are also rewards program seats. This operation is a kind of database query; querying is the operation that databases are built for.
The moral is that complex data migrations can be specified by constructing functors F between schemas and using the “induced” functors ∆F , \(\Sigma\)F , and ΠF . Indeed, in practice essentially all useful migrations can be built up from these. Hence the language of categories provides a framework for specifying and reasoning about data migrations.
Single set summaries of databases
To give a stronger idea of the flavor of \(\Sigma\) and Π, we consider another special case, namely where the target category D is equal to 1; see Exercise 3.12. In this case, there is exactly one functor C → 1 for any C; let’s denote it
!: C → 1. (3.75)
Describe the functor !: C → 1 from Eq. (3.75). Where does it send each object? What about each morphism? ♦
We want to consider the data migration functors \(\Sigma\)! : C-Inst → 1-Inst and Π! : C-Inst → 1-Inst. In Example 3.53, we saw that an instance on 1 is the same thing as a set. So let’s identify 1-Inst with Set, and hence discuss
\(\Sigma\)! : C-Inst → Set and Π! : C-Inst → Set.
Given any schema C and instance I : C → Set, we will get sets \(\Sigma\)!(I) and Π!(I). Thinking of these sets as database instances, each corresponds to a single one-column table a controlled vocabulary summarizing an entire database instance on the schema C.
Consider the following schema

Here’s a sample instance I : G → Set:

Note that G from Eq.(3.77) is isomorphic to the schema Gr. In Section 3.3.5 we saw that instances on Gr are graphs.
Draw the above instance I as a graph. ♦
Now we have a unique functor !: G → 1, and we want to say what \(\Sigma\)!(I) and Π!(I) give us as single-set summaries. First, \(\Sigma\)!(I) tells us all the emailing groups the “connected components” in I:

This form of summary, involving identifying entries into common groups, or quotients, is typical of \(\Sigma\)-operations. The functor Π!(I) lists the emails from I which were sent from a person to her- or him-self.

This is again a sort of query, selecting the entries that fit the criterion of self-to-self emails. Again, this is typical of Π-operations.
Where do these facts—that Π! and Σ! act the way we said—come from? Everything follows from the definition of adjoint functors (3.70): indeed we hope this, together with the examples given in Example 3.74, give the reader some idea of how general and useful adjunctions are, both in mathematics and in database theory.
One more point: while we will not spell out the details, we note that these operations are also examples of constructions known as colimits and limits in Set. We end this chapter with bonus material, exploring these key category theoretic constructions. The reader should keep in mind that, in general and not just for functors to 1, Σ-operations are built from colimits in Set, and Π-operations are built from limits in Set.

