4.2: Hamiltonian Circuits and the Traveling Salesman Problem
In the last section, we considered optimizing a walking route for a postal carrier. How is this different than the requirements of a package delivery driver? While the postal carrier needed to walk down every street (edge) to deliver the mail, the package delivery driver instead needs to visit every one of a set of delivery locations. Instead of looking for a circuit that covers every edge once, the package deliverer is interested in a circuit that visits every vertex once.
A Hamiltonian circuit is a circuit that visits every vertex once with no repeats. Being a circuit, it must start and end at the same vertex. A Hamiltonian path also visits every vertex once with no repeats, but does not have to start and end at the same vertex.
Hamiltonian circuits are named for William Rowan Hamilton, who studied them in the 1800s.
One Hamiltonian circuit is shown on the graph below. There are several other Hamiltonian circuits possible on this graph. Notice that the circuit only has to visit every vertex once; it does not need to use every edge.
Solution
This circuit could be notated by the sequence of vertices visited, starting and ending at the same vertex: ABFGCDHMLKJEA. Notice that the same circuit could be written in reverse order, or starting and ending at a different vertex.
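The observation that one circuit has several spellings can be checked mechanically. The following is a minimal Python sketch, not part of the original text; the helper names `canonical` and `same_circuit` are hypothetical, and it assumes every vertex has a single-character label. It rotates a circuit to start at its alphabetically first vertex and then keeps the smaller of the two travel directions.

```python
def canonical(circuit: str) -> str:
    """Return one fixed spelling for a circuit written as a string like 'ABDCA'."""
    if len(circuit) < 2 or circuit[0] != circuit[-1]:
        raise ValueError("a circuit must start and end at the same vertex")
    cycle = circuit[:-1]                  # drop the repeated closing vertex
    start = cycle.index(min(cycle))       # rotate so the smallest label is first
    forward = cycle[start:] + cycle[:start]
    backward = forward[0] + forward[:0:-1]  # same start, opposite direction
    best = min(forward, backward)
    return best + best[0]                 # close the circuit again


def same_circuit(a: str, b: str) -> bool:
    """True when two vertex sequences describe the same circuit."""
    return canonical(a) == canonical(b)


# The circuit from the text, written in reverse and from a different start.
print(same_circuit("ABFGCDHMLKJEA", "AEJKLMHDCGFBA"))  # True
print(same_circuit("ABFGCDHMLKJEA", "FGCDHMLKJEABF"))  # True
```

A normal form like this is one way to avoid counting the same circuit twice when listing circuits by hand or by program.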
Unlike with Euler circuits, there is no nice theorem that allows us to instantly determine whether or not a Hamiltonian circuit exists on any given graph.[1]
Does a Hamiltonian path or circuit exist on the graph below?
Solution
We can see that once we travel to vertex E there is no way to leave without returning to C, so there is no possibility of a Hamiltonian circuit. If we start at vertex E we can find several Hamiltonian paths, such as ECDAB and ECABD.
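For a small graph, existence can also be checked by simply trying every ordering of the vertices. The sketch below is illustrative only: the figure is not reproduced here, so the edge list `EDGES` is an assumption chosen to match the description above (vertex E touches only C), and the function names are hypothetical.

```python
from itertools import permutations

# Assumed edge list, consistent with the text but not taken from the figure.
EDGES = {("A", "B"), ("A", "C"), ("A", "D"), ("B", "D"), ("C", "D"), ("C", "E")}
VERTICES = sorted({v for edge in EDGES for v in edge})


def adjacent(u, v):
    """True when an edge joins u and v (edges are undirected)."""
    return (u, v) in EDGES or (v, u) in EDGES


def hamiltonian_paths(start):
    """Every ordering of all vertices, beginning at start, that follows edges."""
    others = [v for v in VERTICES if v != start]
    paths = []
    for order in permutations(others):
        route = (start,) + order
        if all(adjacent(a, b) for a, b in zip(route, route[1:])):
            paths.append("".join(route))
    return paths


def hamiltonian_circuits(start):
    """Hamiltonian paths from start whose last vertex connects back to start."""
    return [p + start for p in hamiltonian_paths(start) if adjacent(p[-1], start)]


print(hamiltonian_paths("E"))     # includes ECDAB and ECABD
print(hamiltonian_circuits("A"))  # [] because E has only one neighbor
```

Trying every ordering is only workable for very small graphs, which is the efficiency issue the rest of this section addresses.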
Depending on the problem being solved, sometimes weights are assigned to the edges. The weights could represent the distance between two locations, the travel time, or the travel cost.
With Hamiltonian circuits, our focus will not be on existence but on optimization: given a graph whose edges have weights, can we find the optimal Hamiltonian circuit, the one with the lowest total weight?
This problem is called the Traveling Salesman Problem (TSP) because the question can be framed like this: Suppose a salesman needs to give sales pitches in four cities. He looks up the airfares between each pair of cities and puts the costs in a graph. In what order should he travel to visit each city once and then return home at the lowest cost?
To answer this question of how to find the lowest cost Hamiltonian circuit, we will consider some possible approaches. The first option that might come to mind is to just try all different possible circuits.
Brute Force Algorithm
- List all possible Hamiltonian circuits.
- Find the weight of each circuit by adding the edge weights.
- Select the circuit with minimal total weight.
Apply the Brute force algorithm to find the minimum cost Hamiltonian circuit on the graph below.
Solution
To apply the Brute force algorithm, we list all possible Hamiltonian circuits and calculate their weight:
\(\begin{array}{|l|l|}
\hline \textbf { Circuit } & \textbf { Weight } \\
\hline \text { ABCDA } & 4+13+8+1=26 \\
\hline \text { ABDCA } & 4+9+8+2=23 \\
\hline \text { ACBDA } & 2+13+9+1=25 \\
\hline
\end{array}\)
Note: These are the unique circuits on this graph. All other possible circuits are the reverse of the listed ones or start at a different vertex, but result in the same weights.
From this we can see that the second circuit, ABDCA, is the optimal circuit.
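The same enumeration can be sketched in Python. This is an illustration rather than a prescribed implementation: the edge weights are read off the table above (AB = 4, AC = 2, AD = 1, BC = 13, BD = 9, CD = 8), and the names `WEIGHTS`, `circuit_weight`, and `brute_force` are hypothetical.

```python
from itertools import permutations

# Edge weights inferred from the circuit table in the worked example.
WEIGHTS = {
    frozenset("AB"): 4, frozenset("AC"): 2, frozenset("AD"): 1,
    frozenset("BC"): 13, frozenset("BD"): 9, frozenset("CD"): 8,
}


def circuit_weight(circuit):
    """Add the weights of consecutive edges in a circuit such as 'ABDCA'."""
    return sum(WEIGHTS[frozenset(pair)] for pair in zip(circuit, circuit[1:]))


def brute_force(start="A"):
    """Try every ordering of the other vertices and keep the cheapest circuit."""
    vertices = sorted({v for edge in WEIGHTS for v in edge})
    others = [v for v in vertices if v != start]
    candidates = (start + "".join(order) + start for order in permutations(others))
    return min(candidates, key=circuit_weight)


best = brute_force()
print(best, circuit_weight(best))  # ABDCA 23
```

Fixing the starting vertex removes the duplicates that differ only in where they start. Each circuit is still examined twice, once in each direction, which does not change the minimum.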
The Brute force algorithm is optimal; it will always produce the Hamiltonian circuit with minimum weight. Is it efficient? To answer that question, we need to consider how many Hamiltonian circuits a graph could have. For simplicity, let’s look at the worst-case possibility, where every vertex is connected to every other vertex. This is called a complete graph. In figure A, there are examples of complete graphs with different numbers of vertices.
Out of convenience, mathematicians sometimes use specific notation for complete graphs based on the number of vertices. A complete graph with \(n\) vertices can be represented by \(K_n\). For example, the graphs represented above are \(K_2\), \(K_3\), \(K_5\), and \(K_9\).
Suppose we had a complete graph with five vertices like the air travel graph above. From Seattle there are four cities we can visit first. From each of those, there are three choices. From each of those cities, there are two possible cities to visit next. There is then only one choice for the last city before returning home.
This can be shown visually:
Counting the number of routes, we can see there are \(4 \cdot 3 \cdot 2 \cdot 1=24\) routes. For six cities there would be \(5 \cdot 4 \cdot 3 \cdot 2 \cdot 1=120\) routes.
For \(n\) vertices in a complete graph, there will be \((n-1) !=(n-1)(n-2)(n-3) \cdots 3 \cdot 2 \cdot 1\) routes. Half of these are duplicates in reverse order, so there are \(\frac{(n-1) !}{2}\) unique circuits.
The exclamation symbol, !, is read “factorial” and is shorthand for the product shown.
How many circuits would a complete graph with 8 vertices have?
Solution
A complete graph with 8 vertices would have \((8-1) !=7 !=7 \cdot 6 \cdot 5 \cdot 4 \cdot 3 \cdot 2 \cdot 1=5040\) possible Hamiltonian circuits. Half of the circuits are duplicates of other circuits but in reverse order, leaving 2520 unique routes.
While this is a lot, it doesn’t seem unreasonably huge. But consider what happens as the number of cities increases:
\(\begin{array}{|l|l|}
\hline \textbf { Cities } & \textbf { Unique Hamiltonian Circuits } \\
\hline 9 & 8 ! / 2=20,160 \\
\hline 10 & 9 ! / 2=181,440 \\
\hline 11 & 10 ! / 2=1,814,400 \\
\hline 15 & 14 ! / 2=43,589,145,600 \\
\hline 20 & 19 ! / 2=60,822,550,204,416,000 \\
\hline
\end{array}\)
As you can see, the number of circuits grows extremely quickly. If a computer looked at one billion circuits a second, it would still take almost two years to examine all the possible circuits with only 20 cities! Certainly Brute Force is not an efficient algorithm.
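The growth can be reproduced with a few lines of Python. This is a small sketch under the text's own assumption of one billion circuits examined per second; the function name `unique_circuits` is hypothetical, and a 365-day year is assumed.

```python
from math import factorial


def unique_circuits(n):
    """Number of distinct Hamiltonian circuits in a complete graph, n >= 3."""
    return factorial(n - 1) // 2


SECONDS_PER_YEAR = 365 * 24 * 60 * 60
RATE = 10**9  # circuits examined per second, the rate assumed in the text

for n in (8, 10, 15, 20):
    count = unique_circuits(n)
    years = count / RATE / SECONDS_PER_YEAR
    print(f"{n:>2} cities: {count:,} circuits, about {years:.2g} years")
```

Under these assumptions, 20 cities works out to a little over 1.9 years, which matches the estimate of almost two years.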
Unfortunately, no one has yet found an efficient and optimal algorithm to solve the TSP, and it is very unlikely anyone ever will. Since it is not practical to use brute force to solve the problem, we turn instead to heuristic algorithms: efficient algorithms that give approximate solutions. In other words, heuristic algorithms are fast, but may or may not produce the optimal circuit.
Nearest Neighbor Algorithm (NNA)
- Select a starting point.
- Move to the nearest unvisited vertex (the edge with smallest weight).
- Repeat until every vertex has been visited, then return to the starting point to complete the circuit.
Consider our earlier graph, shown to the right.
Starting at vertex A, the nearest neighbor is vertex D with a weight of 1.
From D, the nearest neighbor is C, with a weight of 8.
From C, our only option is to move to vertex B, the only unvisited vertex, with a cost of 13.
From B we return to A with a weight of 4.
Solution
The resulting circuit is ADCBA with a total weight of \(1+8+13+4 = 26\).
We ended up finding the worst circuit in the graph! What happened? Unfortunately, while it is very easy to implement, the NNA is a greedy algorithm, meaning it only looks at the immediate decision without considering the consequences in the future. In this case, following the edge AD forced us to use the very expensive edge BC later.
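The greedy behavior is easy to see in code. Below is a minimal sketch of the nearest neighbor idea on the same four-vertex graph; the names are hypothetical, and ties are broken alphabetically, a choice the text does not specify.

```python
# Edge weights of the four-vertex example graph.
WEIGHTS = {
    frozenset("AB"): 4, frozenset("AC"): 2, frozenset("AD"): 1,
    frozenset("BC"): 13, frozenset("BD"): 9, frozenset("CD"): 8,
}


def weight(u, v):
    return WEIGHTS[frozenset((u, v))]


def nearest_neighbor(start, vertices="ABCD"):
    """Greedily walk to the closest unvisited vertex, then return to start."""
    circuit = [start]
    unvisited = set(vertices) - {start}
    while unvisited:
        here = circuit[-1]
        # sorted() makes ties resolve alphabetically (an assumption).
        nxt = min(sorted(unvisited), key=lambda v: weight(here, v))
        circuit.append(nxt)
        unvisited.remove(nxt)
    circuit.append(start)
    total = sum(weight(a, b) for a, b in zip(circuit, circuit[1:]))
    return "".join(circuit), total


print(nearest_neighbor("A"))  # ('ADCBA', 26)
```

Each step looks only at the current row of weights, so nothing in the loop can anticipate that taking AD leaves BC as the only way to finish.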
Consider again our salesman. Starting in Seattle, the nearest neighbor (cheapest flight) is to LA, at a cost of $70. From there:
Solution
- LA to Chicago: $100
- Chicago to Atlanta: $75
- Atlanta to Dallas: $85
- Dallas to Seattle: $120
Total cost: $450
In this case, nearest neighbor did find the optimal circuit.
Going back to our first example, how could we improve the outcome? One option would be to redo the nearest neighbor algorithm with a different starting point to see if the result changed. Since nearest neighbor is so fast, doing it several times isn’t a big deal.
Repeated Nearest Neighbor Algorithm (RNNA)
- Do the Nearest Neighbor Algorithm starting at each vertex.
- Choose the circuit produced with minimal total weight.
We will revisit the graph from Example 17.
Starting at vertex A resulted in a circuit with weight 26.
Starting at vertex B, the nearest neighbor circuit is BADCB with a weight of 4+1+8+13 = 26. This is the same circuit we found starting at vertex A. No better.
Starting at vertex C, the nearest neighbor circuit is CADBC with a weight of 2+1+9+13 = 25. Better!
Starting at vertex D, the nearest neighbor circuit is DACBD with a weight of 1+2+13+9 = 25. Notice that this is actually the same circuit we found starting at C, just written with a different starting vertex.
Solution
The RNNA was able to produce a slightly better circuit with a weight of 25, but still not the optimal circuit in this case. Notice that even though we found the circuit by starting at vertex C, we could still write the circuit starting at A: ADBCA or ACBDA.
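Repeating the nearest neighbor walk from every vertex takes only a small wrapper. The sketch below is one possible way to write it, with hypothetical names and alphabetical tie-breaking as an added assumption.

```python
WEIGHTS = {
    frozenset("AB"): 4, frozenset("AC"): 2, frozenset("AD"): 1,
    frozenset("BC"): 13, frozenset("BD"): 9, frozenset("CD"): 8,
}
VERTICES = "ABCD"


def weight(u, v):
    return WEIGHTS[frozenset((u, v))]


def nna(start):
    """One nearest neighbor circuit and its total weight."""
    circuit = [start]
    while len(circuit) < len(VERTICES):
        here = circuit[-1]
        unvisited = [v for v in VERTICES if v not in circuit]
        circuit.append(min(unvisited, key=lambda v: weight(here, v)))
    circuit.append(start)
    return "".join(circuit), sum(weight(a, b) for a, b in zip(circuit, circuit[1:]))


def rnna():
    """Run nna from every vertex and keep the cheapest result."""
    return min((nna(s) for s in VERTICES), key=lambda result: result[1])


for s in VERTICES:
    print(s, nna(s))
print("RNNA:", rnna())  # RNNA: ('CADBC', 25)
```

The repeated version costs one nearest neighbor run per vertex, so it stays fast, but it can only ever return a circuit that some greedy walk produces.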
The table below shows the time, in milliseconds, it takes to send a packet of data between computers on a network. If data needed to be sent in sequence to each computer, then notification needed to come back to the original computer, we would be solving the TSP. The computers are labeled A-F for convenience.
\(\begin{array}{|l|l|l|l|l|l|l|}
\hline & \mathrm{A} & \mathrm{B} & \mathrm{C} & \mathrm{D} & \mathrm{E} & \mathrm{F} \\
\hline \mathrm{A} & \_ \_ & 44 & 34 & 12 & 40 & 41 \\
\hline \mathrm{B} & 44 & \_ \_ & 31 & 43 & 24 & 50 \\
\hline \mathrm{C} & 34 & 31 & \_ \_ & 20 & 39 & 27 \\
\hline \mathrm{D} & 12 & 43 & 20 & \_ \_ & 11 & 17 \\
\hline \mathrm{E} & 40 & 24 & 39 & 11 & \_ \_ & 42 \\
\hline \mathrm{F} & 41 & 50 & 27 & 17 & 42 & \_ \_ \\
\hline
\end{array}\)
a. Find the circuit generated by the NNA starting at vertex B.
b. Find the circuit generated by the RNNA.
Answer
At each step, we look for the nearest location we haven’t already visited.
From B the nearest computer is E with time 24.
From E, the nearest computer is D with time 11.
From D the nearest is A with time 12.
From A the nearest is C with time 34.
From C, the only computer we haven’t visited is F with time 27.
From F, we return to B with time 50.
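The NNA circuit from B is BEDACFB with time 158 milliseconds.
b. Repeating the NNA from each of the other starting vertices gives:
Starting at A: ADEBCFA, time 146
Starting at C: CDEBAFC, time 167
Starting at D: DEBCFAD, time 146
Starting at E: EDACFBE, time 158
Starting at F: FDEBCAF, time 158
The RNNA found a circuit with time 146 milliseconds: ADEBCFA.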
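With the table stored as a matrix, both parts of this exercise can be checked by program. The following is a sketch with hypothetical names (`TIMES`, `nna`); the numbers are copied from the table, and the zeros on the diagonal are placeholders that are never read.

```python
LABELS = "ABCDEF"
# Packet times in milliseconds, row and column order A through F.
TIMES = [
    [0, 44, 34, 12, 40, 41],
    [44, 0, 31, 43, 24, 50],
    [34, 31, 0, 20, 39, 27],
    [12, 43, 20, 0, 11, 17],
    [40, 24, 39, 11, 0, 42],
    [41, 50, 27, 17, 42, 0],
]


def nna(start):
    """Nearest neighbor circuit from the given label, with its total time."""
    route = [LABELS.index(start)]
    while len(route) < len(LABELS):
        row = TIMES[route[-1]]
        unvisited = [j for j in range(len(LABELS)) if j not in route]
        route.append(min(unvisited, key=row.__getitem__))
    route.append(route[0])  # return to the starting computer
    total = sum(TIMES[a][b] for a, b in zip(route, route[1:]))
    return "".join(LABELS[i] for i in route), total


print(nna("B"))  # ('BEDACFB', 158)
best = min((nna(s) for s in LABELS), key=lambda result: result[1])
print(best)      # ('ADEBCFA', 146)
```

The starts at A and D tie at 146 milliseconds because they trace the same circuit; `min` simply reports the first one it meets.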
While certainly better than the basic NNA, unfortunately, the RNNA is still greedy and will produce very bad results for some graphs. As an alternative, our next approach will step back and look at the “big picture” – it will select first the edges that are shortest, and then fill in the gaps.
Sorted Edges Algorithm
1. Select the cheapest unused edge in the graph.
2. Repeat step 1, adding the cheapest unused edge to the circuit, unless:
a. adding the edge would create a circuit that doesn’t contain all vertices, or
b. adding the edge would give a vertex degree 3.
3. Repeat until a circuit containing all vertices is formed.
Using the four vertex graph from earlier, we can use the Sorted Edges algorithm.
Solution
The cheapest edge is AD, with a cost of 1. We highlight that edge to mark it selected.
The next shortest edge is AC, with a weight of 2, so we highlight that edge.
For the third edge, we’d like to add AB, but that would give vertex A degree 3, which is not allowed in a Hamiltonian circuit. The next shortest edge is CD, but that edge would create a circuit ACDA that does not include vertex B, so we reject that edge. The next shortest edge is BD, so we add that edge to the graph.
We then add the last edge to complete the circuit: ACBDA with weight 25.
Notice that the algorithm did not produce the optimal circuit in this case; the optimal circuit is ACDBA with weight 23.
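The two rejection rules translate directly into code. The sketch below is one possible implementation, not the text's own: it tracks vertex degrees in a dictionary and uses a simple union-find structure (an implementation choice added here) to detect when an edge would close a circuit too early. All names are hypothetical.

```python
WEIGHTS = {
    ("A", "B"): 4, ("A", "C"): 2, ("A", "D"): 1,
    ("B", "C"): 13, ("B", "D"): 9, ("C", "D"): 8,
}


def sorted_edges(weights):
    """Pick edges cheapest-first, skipping any that break the two rules."""
    vertices = sorted({v for edge in weights for v in edge})
    degree = {v: 0 for v in vertices}
    group = {v: v for v in vertices}  # union-find parent pointers

    def find(v):
        while group[v] != v:
            v = group[v]
        return v

    chosen = []
    for (u, v), w in sorted(weights.items(), key=lambda item: item[1]):
        if degree[u] == 2 or degree[v] == 2:
            continue  # would give a vertex degree 3
        if find(u) == find(v) and len(chosen) < len(vertices) - 1:
            continue  # would close a circuit that misses some vertices
        chosen.append((u, v, w))
        degree[u] += 1
        degree[v] += 1
        group[find(u)] = find(v)
        if len(chosen) == len(vertices):
            break  # the circuit is complete
    return chosen


def as_circuit(chosen, start):
    """Walk the chosen edges from start to spell out the circuit."""
    neighbors = {}
    for u, v, _ in chosen:
        neighbors.setdefault(u, []).append(v)
        neighbors.setdefault(v, []).append(u)
    circuit, prev, here = [start], None, start
    while True:
        nxt = next(n for n in neighbors[here] if n != prev)
        circuit.append(nxt)
        if nxt == start:
            return "".join(circuit)
        prev, here = here, nxt


edges = sorted_edges(WEIGHTS)
print(as_circuit(edges, "A"), sum(w for _, _, w in edges))  # ADBCA 25
```

The walk happens to spell the circuit as ADBCA, which is ACBDA traveled in the opposite direction.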
While the Sorted Edges algorithm overcomes some of the shortcomings of NNA, it is still only a heuristic algorithm, and does not guarantee the optimal circuit.
Your teacher’s band, Derivative Work, is doing a bar tour in Oregon. The driving distances are shown below. Plan an efficient route for your teacher to visit all the cities and return to the starting location. Use NNA starting at Portland, and then use Sorted Edges.
\( \begin{array}{|c|c|c|c|c|c|c|c|c|c|c|}
\hline & \text { Ashland } & \text { Astoria } & \text { Bend } & \text { Corvallis } & \text { Crater Lake } & \text { Eugene } & \text { Newport } & \text { Portland } & \text { Salem } & \text { Seaside } \\
\hline \text { Ashland } & \_ & 374 & 200 & 223 & 108 & 178 & 252 & 285 & 240 & 356 \\
\hline \text { Astoria } & 374 & \_ & 255 & 166 & 433 & 199 & 135 & 95 & 136 & 17 \\
\hline \text { Bend } & 200 & 255 & \_ & 128 & 277 & 128 & 180 & 160 & 131 & 247 \\
\hline \text { Corvallis } & 223 & 166 & 128 & \_ & 430 & 47 & 52 & 84 & 40 & 155 \\
\hline \text { Crater Lake } & 108 & 433 & 277 & 430 & \_ & 453 & 478 & 344 & 389 & 423 \\
\hline \text { Eugene } & 178 & 199 & 128 & 47 & 453 & \_ & 91 & 110 & 64 & 181 \\
\hline \text { Newport } & 252 & 135 & 180 & 52 & 478 & 91 & \_ & 114 & 83 & 117 \\
\hline \text { Portland } & 285 & 95 & 160 & 84 & 344 & 110 & 114 & \_ & 47 & 78 \\
\hline \text { Salem } & 240 & 136 & 131 & 40 & 389 & 64 & 83 & 47 & \_ & 118 \\
\hline \text { Seaside } & 356 & 17 & 247 & 155 & 423 & 181 & 117 & 78 & 118 & \_ \\
\hline
\end{array}\)
Solution
Using NNA with a large number of cities, you might find it helpful to mark off the cities as they’re visited to keep from accidentally visiting them again. Looking in the row for Portland, the smallest distance is 47, to Salem. Following that idea, our circuit will be:
\(\begin{array} {ll} \text{Portland to Salem} & 47 \\ \text{Salem to Corvallis} & 40 \\ \text{Corvallis to Eugene} & 47 \\ \text{Eugene to Newport} & 91 \\ \text{Newport to Seaside} & 117 \\ \text{Seaside to Astoria} & 17 \\ \text{Astoria to Bend} & 255 \\ \text{Bend to Ashland} & 200 \\ \text{Ashland to Crater Lake} & 108 \\ \text{Crater Lake to Portland} & 344 \\ \text{Total trip length: } & 1266\text{ miles} \end{array} \)
Using Sorted Edges, you might find it helpful to draw an empty graph, perhaps by drawing vertices in a circular pattern. Adding edges to the graph as you select them will help you visualize any circuits or vertices with degree 3.
We start adding the shortest edges:
\(\begin{array} {ll} \text{Seaside to Astoria} & 17\text{ miles} \\ \text{Corvallis to Salem} & 40\text{ miles} \\ \text{Portland to Salem} & 47\text{ miles} \\ \text{Corvallis to Eugene} & 47\text{ miles} \end{array} \)
The graph after adding these edges is shown to the right. The next shortest edge is from Corvallis to Newport at 52 miles, but adding that edge would give Corvallis degree 3.
Continuing on, we can skip over any edge that contains Salem or Corvallis, since they both already have degree 2.
\(\begin{array} {ll} \text{Portland to Seaside} & 78\text{ miles} \\ \text{Eugene to Newport} & 91\text{ miles} \\ \text{Portland to Astoria} & \text{(reject – closes circuit)} \\ \text{Ashland to Crater Lk} & 108\text{ miles} \end{array} \)
The graph after adding these edges is shown to the right. At this point, we can skip over any edge that contains Salem, Seaside, Eugene, Portland, or Corvallis since they already have degree 2.
\(\begin{array} {ll} \text{Newport to Astoria} & \text{(reject – closes circuit)} \\ \text{Newport to Bend} & 180\text{ miles} \\ \text{Bend to Ashland} & 200\text{ miles} \end{array} \)
At this point the only way to complete the circuit is to add:
Crater Lk to Astoria 433 miles
The final circuit, written to start at Portland, is:
Portland, Salem, Corvallis, Eugene, Newport, Bend, Ashland, Crater Lake, Astoria, Seaside, Portland.
Total trip length: 1241 miles.
While the Sorted Edges route is better than the NNA route, neither algorithm produced the optimal route. The following route can make the tour in 1069 miles:
Portland, Astoria, Seaside, Newport, Corvallis, Eugene, Ashland, Crater Lake, Bend, Salem, Portland
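The three totals quoted in this example can be confirmed by summing distances from the table. This is a small checking sketch with hypothetical names (`MILES`, `tour_length`); the matrix is copied from the table, with zeros as unused diagonal placeholders.

```python
CITIES = ["Ashland", "Astoria", "Bend", "Corvallis", "Crater Lake",
          "Eugene", "Newport", "Portland", "Salem", "Seaside"]
# Driving distances in miles, rows and columns in the order of CITIES.
MILES = [
    [0, 374, 200, 223, 108, 178, 252, 285, 240, 356],
    [374, 0, 255, 166, 433, 199, 135, 95, 136, 17],
    [200, 255, 0, 128, 277, 128, 180, 160, 131, 247],
    [223, 166, 128, 0, 430, 47, 52, 84, 40, 155],
    [108, 433, 277, 430, 0, 453, 478, 344, 389, 423],
    [178, 199, 128, 47, 453, 0, 91, 110, 64, 181],
    [252, 135, 180, 52, 478, 91, 0, 114, 83, 117],
    [285, 95, 160, 84, 344, 110, 114, 0, 47, 78],
    [240, 136, 131, 40, 389, 64, 83, 47, 0, 118],
    [356, 17, 247, 155, 423, 181, 117, 78, 118, 0],
]


def tour_length(tour):
    """Total miles for a tour that visits every city once and returns home."""
    stops = [CITIES.index(name) for name in tour]
    if stops[0] != stops[-1] or sorted(stops[:-1]) != list(range(len(CITIES))):
        raise ValueError("tour must visit every city once and return to its start")
    return sum(MILES[a][b] for a, b in zip(stops, stops[1:]))


NNA_TOUR = ["Portland", "Salem", "Corvallis", "Eugene", "Newport", "Seaside",
            "Astoria", "Bend", "Ashland", "Crater Lake", "Portland"]
SORTED_EDGES_TOUR = ["Portland", "Salem", "Corvallis", "Eugene", "Newport", "Bend",
                     "Ashland", "Crater Lake", "Astoria", "Seaside", "Portland"]
SHORTER_TOUR = ["Portland", "Astoria", "Seaside", "Newport", "Corvallis", "Eugene",
                "Ashland", "Crater Lake", "Bend", "Salem", "Portland"]

for tour in (NNA_TOUR, SORTED_EDGES_TOUR, SHORTER_TOUR):
    print(tour_length(tour))  # 1266, then 1241, then 1069
```

Checking a proposed tour is cheap even though finding the best one is not, which is why heuristic results are easy to compare against each other.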
Find the circuit produced by the Sorted Edges algorithm using the graph below.
Answer
AB: Add, cost 11
BG: Add, cost 13
AE: Add, cost 14
EF: Add, cost 15
EC: Skip (degree 3 at E)
FG: Skip (would create a circuit not including C)
BF, BC, AG, AC: Skip (would cause a vertex to have degree 3)
GC: Add, cost 36
CF: Add, cost 37, completes the circuit
Final circuit: ABGCFEA
[1] There are some theorems that can be used in specific circumstances, such as Dirac’s theorem, which says that a Hamiltonian circuit must exist on a simple graph with \(n \geq 3\) vertices if each vertex has degree \(n/2\) or greater.