How Kruskal's Algorithm Works
Sort edges by weight, then add each one unless it creates a cycle. Union-find with path compression in 46 lines.
What you will learn
- Why greedy edge selection by weight is sufficient to build a minimum spanning tree
- How sorting edges by their third tuple element sets up the greedy scan in one expression
- How the parent list encodes a disjoint-set forest and why parent[i] = i initializes each node as its own root
- How find_parent implements path compression by flattening the tree toward the root on every lookup
- How union-by-parent-reassignment merges two components and why parent[parent_a] = parent_b prevents cycles
- How the three doctests verify the MST edges in weight-ascending order across simple, tie-weight, and multi-edge graphs
Prerequisites
- Familiarity with Python sorted() and lambda key functions
- A basic mental model of a graph as nodes connected by edges with weights
- No prior knowledge of spanning trees or union-find required
The story behind Kruskal's algorithm
Joseph Kruskal published the algorithm in 1956 in a one-and-a-half page paper in the Proceedings of the American Mathematical Society, titled "On the shortest spanning subtree of a graph and the traveling salesman problem." The spanning tree part gave the MST algorithm; the second part of the title noted that TSP was still open. Kruskal was then a researcher at Bell Telephone Laboratories, and the brevity of the paper is striking: the algorithm is stated in two sentences and the correctness proof takes one paragraph based on the cut property of minimum spanning trees.
The algorithm's efficiency depends on the union-find data structure. Kruskal's original 1956 formulation used a simpler cycle-detection approach; the disjoint-set forest with union-by-rank and path compression, which brings the amortized time per operation close to O(1), was developed by Robert Tarjan in 1975. Together they give an overall complexity of O(E log E) dominated by the edge sort. Alternative MST algorithms include Prim's, which grows the tree from a single vertex using a priority queue (O(E log V)), and Borůvka's, the oldest of the three at 1926, which adds the cheapest outgoing edge from every component in parallel. Kruskal's greedy edge-selection approach generalizes to matroids, making it a foundational example in combinatorial optimization.
Signature, Doctest, and the MST Contract
graphs/minimum_spanning_tree_kruskal.py:1The function takes a node count and an edge list of (src, dst, weight) tuples. Three doctests verify MST edges are weight-ordered and cycle-free.
The signature is clean: a node count and a flat list of (src, dst, weight) tuples. There is no adjacency dict, no Graph class, no separate vertex list. The edge tuples are the entire graph representation. Three doctests pin the contract with increasing complexity. The first is a simple three-edge path graph where every edge is needed: all three appear in the MST. The second has five edges on four nodes with two weight-1 edges; the function must choose the two weight-1 and one weight-2 edges, skipping the weight-3 and weight-5 edges that would create cycles. The third adds a sixth edge with weight 1 between nodes 2 and 1, and again the function returns the three cheapest cycle-free edges. The returned edges appear in weight-ascending order because the algorithm processes edges from the sorted input and adds them in that order.
The function returns MST edges in the order they were added, which is weight-ascending after sorting. Three doctests verify cycle exclusion at increasing graph densities.
def kruskal(
num_nodes: int, edges: list[tuple[int, int, int]]
) -> list[tuple[int, int, int]]:
"""
>>> kruskal(4, [(0, 1, 3), (1, 2, 5), (2, 3, 1)])
[(2, 3, 1), (0, 1, 3), (1, 2, 5)]
>>> kruskal(4, [(0, 1, 3), (1, 2, 5), (2, 3, 1), (0, 2, 1), (0, 3, 2)])
[(2, 3, 1), (0, 2, 1), (0, 1, 3)]
>>> kruskal(4, [(0, 1, 3), (1, 2, 5), (2, 3, 1), (0, 2, 1), (0, 3, 2),
... (2, 1, 1)])
[(2, 3, 1), (0, 2, 1), (2, 1, 1)]
"""Edge Sort and Parent Array Initialization
graphs/minimum_spanning_tree_kruskal.py:15Sorting by the third tuple element establishes the greedy order. parent[i] = i initializes each node as its own disjoint set with no connections.
Two lines set up the entire algorithm. Line 15 sorts the edge list in ascending order by weight using a lambda that extracts the third element of each tuple. The sort is in-place on the local variable, so the input list is not mutated. After sorting, greedy selection by weight is correct because of the MST cut property: for any partition of the vertices into two groups, the minimum-weight edge crossing the partition must appear in at least one MST. Processing edges cheapest-first and skipping cycle-forming ones finds that edge at the right moment. Line 17 initializes the union-find structure: list(range(num_nodes)) creates a list where parent[i] = i for every node, encoding that each node is its own root and no two nodes are yet connected. The entire disjoint-set forest is this single list.
Sorting by weight is the one setup step that makes greedy selection correct. parent[i] = i means every node starts as its own component.
edges = sorted(edges, key=lambda edge: edge[2])
parent = list(range(num_nodes))find_parent: Path Compression
graphs/minimum_spanning_tree_kruskal.py:19find_parent follows parent pointers to the root and flattens every visited node directly to the root on the way back, amortizing future lookups.
find_parent is a closure over the parent list defined in the enclosing scope. It finds the root of node i's component by following parent pointers until it reaches a node that is its own parent (i == parent[i]). The path-compression step on line 21 is the optimization that makes union-find near-linear over many operations: after the recursive call returns the root, the current node's parent pointer is updated to point directly to the root. On the next call for any node in this path, the lookup reaches the root in one step instead of following the whole chain. Without path compression, a degenerate series of unions could produce a chain of length n, making each lookup O(n). With it, the amortized cost approaches O(alpha(n)), where alpha is the inverse Ackermann function, effectively constant for any practical n.
Path compression on line 21 flattens every node directly to its root after a lookup. Without it, repeated unions could build chains that make each lookup O(n).
def find_parent(i):
if i != parent[i]:
parent[i] = find_parent(parent[i])
return parent[i]Greedy Edge Selection and Union
graphs/minimum_spanning_tree_kruskal.py:24If both endpoints share a root, the edge creates a cycle and is skipped. Otherwise add it to the MST and merge the two components.
The greedy loop processes sorted edges and makes an accept-or-reject decision for each one. Lines 28 and 29 find the root of both endpoints. If parent_a == parent_b, both endpoints are in the same component; adding the edge would create a cycle, so the algorithm skips it. If the roots differ, the edge connects two separate components and is safe to add. Line 31 accumulates the total MST cost. Line 32 records the edge in the result list. Line 33 performs the union by setting parent[parent_a] = parent_b, making one root point to the other. This is union-by-root without rank tracking; pathological union sequences can produce unbalanced trees. Path compression in find_parent compensates by flattening trees dynamically.
Two roots differing means the edge is safe. One root points to the other after union. Path compression in find_parent compensates for the lack of union-by-rank here.
minimum_spanning_tree_cost = 0
minimum_spanning_tree = []
for edge in edges:
parent_a = find_parent(edge[0])
parent_b = find_parent(edge[1])
if parent_a != parent_b:
minimum_spanning_tree_cost += edge[2]
minimum_spanning_tree.append(edge)
parent[parent_a] = parent_b
return minimum_spanning_treeThe __main__ Block: Reading a Graph from stdin
graphs/minimum_spanning_tree_kruskal.py:38The __main__ block reads node and edge counts, then reads each edge as three space-separated integers, builds the list, and calls kruskal.
The # pragma: no cover comment on line 38 tells the coverage tool to skip the __main__ block during test collection. The block reads the node and edge counts on a single line, then reads each edge as three space-separated integers on its own line. Line 43 uses a generator expression with int(x) to convert the tokens rather than a list comprehension with explicit indexing, which is the style the repository favors for small, readable conversions. The result of kruskal on line 46 is discarded rather than printed, which matches the pattern from the bellman_ford __main__ block: the caller is expected to use print_distance or equivalent for output. Unlike graphs/bellman_ford.py, this file does not call doctest.testmod() in the __main__ block, so the three doctests in the function are only exercised by CI, not by direct invocation.
The result is discarded rather than printed. The pragma: no cover comment excludes the block from coverage measurement, which is necessary because stdin prompts cannot run under pytest.
if __name__ == "__main__": # pragma: no cover
num_nodes, num_edges = list(map(int, input().strip().split()))
edges = []
for _ in range(num_edges):
node1, node2, cost = (int(x) for x in input().strip().split())
edges.append((node1, node2, cost))
kruskal(num_nodes, edges)You've walked through 5 key areas of the The Algorithms - Python codebase.
Continue: How Fibonacci DP Works → Browse all projectsCreate code tours for your project
Intraview lets AI create interactive walkthroughs of any codebase. Install the free VS Code extension and generate your first tour in minutes.
Install Intraview Free