Loading [MathJax]/jax/output/HTML-CSS/jax.js
Skip to main content
Library homepage
 

Text Color

Text Size

 

Margin Size

 

Font Type

Enable Dyslexic Font
Mathematics LibreTexts

12.1: Minimum Weight Spanning Trees

( \newcommand{\kernel}{\mathrm{null}\,}\)

In this section, we consider pairs (G,w) where G=(V,E) is a connected graph and w:EN0. For each edge eE, the quantity w(e) is called the weight of e. Given a set S of edges, we define the weight of S, denoted w(S), by setting w(S)=eSw(e). In particular, the weight of a spanning tree T is just the sum of the weights of the edges in T.

Weighted graphs arise in many contexts. One of the most natural is when the weights on the edges are distances or costs. For example, consider the weighted graph in Figure 12.1. Suppose the vertices represent nodes of a network and the edges represent the ability to establish direct physical connections between those nodes. The weights associated to the edges represent the cost (let's say in thousands of dollars) of building those connections. The company establishing the network among the nodes only cares that there is a way to get data between each pair of nodes. Any additional links would create redundancy in which they are not interested at this time. A spanning tree of the graph ensures that each node can communicate with each of the others and has no redundancy, since removing any edge disconnects it. Thus, to minimize the cost of building the network, we want to find a minimum weight (or cost) spanning tree.

Screen Shot 2022-03-10 at 10.17.47 PM.png
Figure 12.1. A weighted graph

To do this, this section considers the following problem:

Problem 12.2

Find a minimum weight spanning tree T of G.

To solve this problem, we will develop two efficient graph algorithms, each having certain computational advantages and disadvantages. Before developing the algorithms, we need to establish some preliminaries about spanning trees and forests.

12.1.1 Preliminaries

The following proposition about the number of components in a spanning forest of a graph G has an easy inductive proof. You are asked to provide it in the exercises.

Proposition 12.3

Let G=(V,E) be a graph on n vertices, and let H=(V,S) be a spanning forest. Then 0|S|n1. Furthermore, if |S|=nk, then H has k components. In particular, H is a spanning tree if and only if it contains n1 edges.

The following proposition establishes a way to take a spanning tree of a graph, remove an edge from it, and add an edge of the graph that is not in the spanning tree to create a new spanning tree. Effectively, the process exchanges two edges to form the new spanning tree, so we call this the exchange principle.

Proposition 12.4. Exchange Principle

Let T=(V,S) be spanning tree in a graph G, and let e=xy be an edge of G which does not belong to T. Then

  1. There is a unique path P=(x0,x1,x2,,xt) with (a) x=x0; (b) y=xt; and (c) xixi+1S for each i=0,1,2,,t1.
  2. For each i=0,1,2,,t1, let fi=xixi+1 and then set

Si={e}{gS:gfi},

i.e., we exchange edge fi for edge e. Then Ti=(V,Si) is a spanning tree of G.

Proof

For the first fact, it suffices to note that if there were more than one distinct path from x to y in T, we would be able to find a cycle in T. This is impossible since it is a tree. For the second, we refer to Figure 12.5. The black and green edges in the graph shown at the left represent the spanning tree T. Thus, f lies on the unique path from x to y in T and e=xy is an edge of G not in T. Adding e to T creates a graph with a unique cycle, since T had a unique path from x to y. Removing f (which could be any edge fi of the path, as stated in the proposition) destroys this cycle. Thus Ti is a connected acyclic subgraph of G with n1+11=n1 edges, so it is a spanning tree.

Screen Shot 2022-03-10 at 10.33.25 PM.png
Figure 12.5. The exchange principle

For both of the algorithms we develop, the argument to show that the algorithm is optimal rests on the following technical lemma. To avoid trivialities, we assume n3.

Lemma 12.6

Let F be a spanning forest of G and let C be a component of F. Also, let e=xy be an edge of minimum weight among all edges with one endpoint in C and the other not in C. Then among all spanning trees of G that contain the forest F, there is one of minimum weight that contains the edge e.

Proof

Let T=(V,S) be any spanning tree of minimum weight among all spanning trees that contain the forest F, and suppose that e=xy is not an edge in T. (If it were an edge in T, we would be done.) Then let P=(x0,x1,x2,,xt) be the unique path in T with (a) x=x0; (b) y=xt; and (c) xixi+1S for each i=0,1,2,,t1. Without loss of generality, we may assume that x=x0 is a vertex in C while y=xt does not belong to C. Then there is a least non-negative integer i for which xi is in C and xi+1 is not in C. It follows that xj is in C for all j with 0ji.

Let f=xixi+1. The edge e has minimum weight among all edges with one endpoint in C and the other not in C, so w(e)w(f). Now let Ti be the tree obtained by exchanging the edge f for edge e. It follows that w(Ti)=w(T)w(f)+w(e)w(T). Furthermore, Ti contains the spanning forest F as well as the edge e. It is therefore the minimum weight spanning tree we seek.

Discussion 12.7

Although Bob's combinatorial intuition has improved over the course he doesn't quite understand why we need special algorithms to find minimum weight spanning trees. He figures there can't be that many spanning trees, so he wants to just write them down. Alice groans as she senses that Bob must have been absent when the material from Section 5.6 was discussed. In that section, we learned that a graph on n vertices can have as many as nn2 spanning trees (or horrors, the instructor may have left it off the syllabus). Regardless, this exhaustive approach is already unusable when n=20. Dave mumbles something about being greedy and just adding the lightest edges one-by-one while never adding an edge that would make a cycle. Zori remembers a strategy like this working for finding the height of a poset, but she's worried about the nightmare situation that we learned about with using FirstFit to color graphs. Alice agrees that greedy algorithms have an inconsistent track record but suggests that Lemma 12.6 may be enough to get one to succeed here.

12.1.2 Kruskal's Algorithm

In this section, we develop one of the best known algorithms for finding a minimum weight spanning tree. It is known as Kruskal's Algorithm, although some prefer the descriptive label Avoid Cycles because of the way it builds the spanning tree.

To start Kruskal's algorithm, we sort the edges according to weight. To be more precise, let m denote the number of edges in G=(V,E). Then label the edges as e1,e2,e3,,em so that w(e1)w(e2)w(em). Any of the many available efficient sorting algorithms can be used to do this step.

Once the edges are sorted, Kruskal's algorithm proceeds to an initialization step and then inductively builds the spanning tree T=(V,S):

Algorithm 12.8. Kruskal's Algorithm

Initialization. Set S= and i=0

Inductive Step. While |S|<n1, let j be the least non-negative integer so that j>i and there are no cycles in S{ej}. Then (using pseudo-code) set

i=j and S=S{j}.

The correctness of Kruskal's Algorithm follows from an inductive argument. First, the set Sis initialized as the empty set, so there is certainly a minimum weight spanning tree containing all the edges in S. Now suppose that for some i with 0i<n, |S|=i and there is a minimum weight spanning tree containing all the edges in S. Let F be the spanning forest determined by the edges in S, and let C1,C2,,Cs be the components of F. For each k=1,2,,s, let fk be a minimum weight edge with one endpoint in Ck and the other not in Ck. Then the edge e added to S by Kruskal's Algorithm is just the edge {f1,f2,,fs} having minimum weight. Applying Lemma 12.6 and the inductive hypothesis, we know that there will still be a minimum weight spanning tree of G containing all the edges of S{e}.

Example 12.9. Kruskal's Algorithm

Let's see what Kruskal's algorithm does on the weighted graph in Figure 12.1. It first sorts all of the edges by weight. We won't reproduce the list here, since we won't need all of it. The edge of least weight is ck, which has weight 23. It continues adding the edge of least weight, adding ag,fg,fi,fj, and bj. However, after doing this, the edge of lowest weight is fb, which has weight 38. This edge cannot be added, as doing so would make fjb a cycle. Thus, the algorithm bypasses it and adds bc. Edge ai is next inspected, but it, too, would create a cycle and is eliminated from consideration. Then em is added, followed by dl. There are now two edges of weight 56 to be considered: al and dj. Our sorting algorithm has somehow decided one of them should appear first, so let's say it's dj. After adding dj, we cannot add al, as agfjdl would form a cycle. Edge dk is next considered, but it would also form a cycle. However, ek can be added. Edges km and dm are then bypassed. Finally, edge ch is added as the twelfth and final edge for this 13-vertex spanning tree. The full list of edges added (in order) is shown to the right. The total weight of this spanning tree is 504.

Screen Shot 2022-03-11 at 11.55.41 PM.png

12.1.3 Prim's Algorithm

We now develop Prim's Algorithm for finding a minimum weight spanning tree. This algorithm is also known by a more descriptive label: Build Tree. We begin by choosing a root vertex r. Again, the algorithm proceeds with an initialization step followed by a series of inductive steps.

Algorithm 12.10. Prim's Algorithm

Initialization. Set W={r} and S=

Inductive Step. While |W|<n, let e be an edge of minimum weight among all edges with one endpoint in W and the other not in W. If e=xy,xW and yW, update W and S by setting (using pseudo-code)

W=W{y} and S=S{e}.

The correctness of Prim's algorithm follows immediately from Lemma 12.6.

Example 12.11. Prim's Algorithm

Let's see what Prim's algorithm does on the weighted graph in Figure 12.1. We start with vertex a as the root vertex. The lightest edge connecting a (the only vertex in the tree so far) to the rest of the graph is ag. Next, fg is added. This is followed by fi,fj,bj, and bc. Next, the algorithm identifies ck as the lightest edge connecting {a,g,i,f,j,b,c} to the remaining vertices. Notice that this is considerably later than Kruskal's algorithm finds the same edge. The algorithm then determines that al and jd, both of weight 56 are the lightest edges connecting vertices in the tree to the other vertices. It picks arbitrarily, so let's say it takes al. It next finds dl, then ek, and then em. The final edge added is ch. The full list of edges added (in order) is shown to the right. The total weight of this spanning tree is 504. This (not surprisingly) the same weight we obtained using Kruskal's algorithm. However, notice that the spanning tree found is different, as this one contains al instead of dj. This is not an issue, of course, since in both cases an arbitrary choice between two edges of equal weight was made.

a g 25
f g 26
f i 29
f j 30
b j 34
b c 39
c k 23
a l 56
d l 55
e k 59
e m 49
c h 79

12.1.4 Comments on Efficiency

An implementation of Kruskal's algorithm seems to require that the edges be sorted. If the graph has n vertices and m edges, this requires mlogm operations just for the sort. But once the sort is done, the process takes only n1 steps—provided you keep track of the components as the spanning forest expands. Regardless, it is easy to see that at most O(n2logn) operations are required.

On the other hand, an implementation of Prim's algorithm requires the program to conveniently keep track of the edges incident with each vertex and always be able to identify the edge with least weight among subsets of these edges. In computer science, the data structure that enables this task to be carried out is called a heap.


This page titled 12.1: Minimum Weight Spanning Trees is shared under a CC BY-SA 4.0 license and was authored, remixed, and/or curated by Mitchel T. Keller & William T. Trotter via source content that was edited to the style and standards of the LibreTexts platform.

Support Center

How can we help?