Graph Algorithms in TheAlgorithms/Python
Seven graph traversal and shortest-path strategies from BFS queues to Prim's greedy expansion.
What you will learn
- How BFS uses a FIFO queue to guarantee level-order traversal
- How DFS replaces the queue with a stack and pops from the end to go deep first
- How Dijkstra's min-heap always expands the currently cheapest known vertex
- How Bellman-Ford relaxes every edge in every round to handle negative weights
- How A* combines actual path cost with a heuristic estimate to steer toward a goal
- How Kruskal's union-find structure prevents cycles when building a minimum spanning tree
- How Prim's greedy key update grows a spanning tree one vertex at a time
Prerequisites
- Familiarity with Python classes and dictionaries
- Basic understanding of graph terminology: vertex, edge, adjacency
- Comfort with O(n log n) complexity notation
BFS: a FIFO queue guarantees level-order exploration
graphs/breadth_first_search.py:40BFS uses a FIFO queue so that every vertex at distance k is visited before any vertex at distance k+1.
BFS solves the reachability problem one distance layer at a time. The implementation marks start_vertex as visited and enqueues it before entering the loop, which is the correct order: marking on enqueue rather than on dequeue prevents a vertex from entering the queue twice in graphs with cycles. The Queue class from Python's standard library enforces FIFO ordering, so queue.get() always returns the vertex that has waited longest. Each dequeued vertex's neighbors are then checked: an unvisited neighbor is added to visited immediately and enqueued, so no neighbor is processed more than once. The doctest confirms that starting from vertex 2 in a four-vertex cycle reaches all vertices. For a deeper breakdown of this file, see the deeper tour.
BFS marks vertices visited at enqueue time, not dequeue time, which is the detail that prevents revisiting in cyclic graphs.
def bfs(self, start_vertex: int) -> set[int]:
"""
>>> g = Graph()
>>> g.add_edge(0, 1)
>>> g.add_edge(0, 1)
>>> g.add_edge(0, 2)
>>> g.add_edge(1, 2)
>>> g.add_edge(2, 0)
>>> g.add_edge(2, 3)
>>> g.add_edge(3, 3)
>>> sorted(g.bfs(2))
[0, 1, 2, 3]
"""
# initialize set for storing already visited vertices
visited = set()
# create a first in first out queue to store all the vertices for BFS
queue: Queue = Queue()
# mark the source node as visited and enqueue it
visited.add(start_vertex)
queue.put(start_vertex)
while not queue.empty():
vertex = queue.get()
# loop through all adjacent vertex and enqueue it if not yet visited
for adjacent_vertex in self.vertices[vertex]:
if adjacent_vertex not in visited:
queue.put(adjacent_vertex)
visited.add(adjacent_vertex)
return visitedDFS: a stack and reversed adjacency give depth-first order
graphs/depth_first_search.py:6DFS uses a LIFO stack so each newly discovered neighbor is visited before backtracking to previously discovered ones.
DFS and BFS share nearly the same loop structure. The critical difference is the data structure: BFS dequeues from the front; DFS pops from the end of stack. Calling stack.pop() retrieves the most recently appended element, so the traversal goes deep along one path before backtracking to explore siblings. The comment in the code names both differences explicitly: pop from the last position, and add neighbors to the stack without marking them as explored until they are actually popped. This non-recursive implementation avoids Python's default recursion limit. reversed(graph[v]) preserves the original neighbor order when visiting children, which matches the alphabetical order in the doctest. For a side-by-side comparison with BFS, see the deeper tour.
DFS differs from BFS in exactly one structural choice: stack.pop() instead of queue.get(), which changes the exploration order from wide to deep.
def depth_first_search(graph: dict, start: str) -> set[str]:
"""Depth First Search on Graph
:param graph: directed graph in dictionary format
:param start: starting vertex as a string
:returns: the trace of the search
>>> input_G = { "A": ["B", "C", "D"], "B": ["A", "D", "E"],
... "C": ["A", "F"], "D": ["B", "D"], "E": ["B", "F"],
... "F": ["C", "E", "G"], "G": ["F"] }
>>> output_G = list({'A', 'B', 'C', 'D', 'E', 'F', 'G'})
>>> all(x in output_G for x in list(depth_first_search(input_G, "A")))
True
>>> all(x in output_G for x in list(depth_first_search(input_G, "G")))
True
"""
explored, stack = set(start), [start]
while stack:
v = stack.pop()
explored.add(v)
# Differences from BFS:
# 1) pop last element instead of first one
# 2) add adjacent elements to stack without exploring them
for adj in reversed(graph[v]):
if adj not in explored:
stack.append(adj)
return exploredDijkstra: a min-heap always expands the cheapest known vertex
graphs/dijkstra.py:37Dijkstra's min-heap guarantees that the first time a vertex is popped it has been reached via the shortest path.
Dijkstra's correctness relies on one invariant: once a vertex is popped from the min-heap, no shorter path to it can exist. Because the heap always returns the minimum-cost pair, any later path to an already-visited vertex would cost at least as much. The visited check on line 52 handles a side effect of Python's heapq: the heap does not support in-place priority updates, so the same vertex can appear multiple times with different costs. Skipping already-visited vertices when they are popped discards the stale higher-cost entries. When u == end, the current cost is returned immediately because the heap guarantees this is the minimum. The doctest shows three different graph layouts all returning the correct minimum distances. The deeper tour walks through the full module including graph definitions.
Dijkstra's heap stores duplicate entries for the same vertex; the visited guard discards stale higher-cost copies when they surface.
def dijkstra(graph, start, end):
"""Return the cost of the shortest path between vertices start and end.
>>> dijkstra(G, "E", "C")
6
>>> dijkstra(G2, "E", "F")
3
>>> dijkstra(G3, "E", "F")
3
"""
heap = [(0, start)] # cost from start node,end node
visited = set()
while heap:
(cost, u) = heapq.heappop(heap)
if u in visited:
continue
visited.add(u)
if u == end:
return cost
for v, c in graph[u]:
if v in visited:
continue
next_item = cost + c
heapq.heappush(heap, (next_item, v))
return -1Bellman-Ford: relax every edge, detect negative cycles
graphs/bellman_ford.py:20Bellman-Ford relaxes all edges n-1 times so the shortest path to any vertex propagates across at most n-1 hops.
Dijkstra fails on negative-weight edges because its heap invariant breaks: popping the minimum-cost vertex no longer guarantees optimality when a later negative edge could reduce a path cost further. Bellman-Ford handles negative weights by abandoning the heap entirely and relaxing every edge in every round. A shortest path in a graph with n vertices can traverse at most n - 1 edges, so after n - 1 rounds every reachable vertex has its correct distance. The doctest uses a negative edge of weight -10 from vertex 2 to vertex 1, which causes vertex 1's distance to fall below the direct edge of weight 4 from vertex 0. The separate check_negative_cycle call performs one additional relaxation pass: if any distance still improves, a negative cycle exists and the function raises an exception. The deeper tour covers the full module.
Bellman-Ford trades the min-heap for brute-force edge relaxation, which is slower but correct on graphs with negative edge weights.
def bellman_ford(
graph: list[dict[str, int]], vertex_count: int, edge_count: int, src: int
) -> list[float]:
"""
Returns shortest paths from a vertex src to all
other vertices.
>>> edges = [(2, 1, -10), (3, 2, 3), (0, 3, 5), (0, 1, 4)]
>>> g = [{"src": s, "dst": d, "weight": w} for s, d, w in edges]
>>> bellman_ford(g, 4, 4, 0)
[0.0, -2.0, 8.0, 5.0]
>>> g = [{"src": s, "dst": d, "weight": w} for s, d, w in edges + [(1, 3, 5)]]
>>> bellman_ford(g, 4, 5, 0)
Traceback (most recent call last):
...
Exception: Negative cycle found
"""
distance = [float("inf")] * vertex_count
distance[src] = 0.0
for _ in range(vertex_count - 1):
for j in range(edge_count):
u, v, w = (graph[j][k] for k in ["src", "dst", "weight"])
if distance[u] != float("inf") and distance[u] + w < distance[v]:
distance[v] = distance[u] + w
negative_cycle_exists = check_negative_cycle(graph, distance, edge_count)
if negative_cycle_exists:
raise Exception("Negative cycle found")
return distanceA*: actual cost plus heuristic steers the search toward the goal
graphs/a_star.py:61A* expands the frontier cell with the lowest f = g + h, where g is actual path cost and h is a heuristic estimate to the goal.
The difference between plain Dijkstra and A-star is a single extra term in each cell's priority. Each entry in the cell list is [f, g, x, y] where f = g + heuristic[x][y]. The list is sorted ascending then reversed so the minimum-f cell is last; pop() removes it in O(1). This avoids a heap import at the cost of an O(n log n) sort per iteration, acceptable for small grids. When expanding the current cell, each cardinal neighbor is checked for grid bounds, the closed visited grid, and obstacle status. A valid neighbor gets cumulative cost g2 and heuristic-augmented score f2. The precomputed Manhattan-distance heuristic in the module's __main__ block is admissible for uniform move cost, guaranteeing the optimal path. See the deeper tour for the full module and heuristic construction.
The heuristic term f2 = g2 + heuristic[x2][y2] biases the frontier toward cells that are both cheap to reach and close to the goal.
while not found and not resign:
if len(cell) == 0:
raise ValueError("Algorithm is unable to find solution")
else: # to choose the least costliest action so as to move closer to the goal
cell.sort()
cell.reverse()
next_cell = cell.pop()
x = next_cell[2]
y = next_cell[3]
g = next_cell[1]
if x == goal[0] and y == goal[1]:
found = True
else:
for i in range(len(DIRECTIONS)): # to try out different valid actions
x2 = x + DIRECTIONS[i][0]
y2 = y + DIRECTIONS[i][1]
if (
x2 >= 0
and x2 < len(grid)
and y2 >= 0
and y2 < len(grid[0])
and closed[x2][y2] == 0
and grid[x2][y2] == 0
):
g2 = g + cost
f2 = g2 + heuristic[x2][y2]
cell.append([f2, g2, x2, y2])
closed[x2][y2] = 1
action[x2][y2] = iKruskal: sort edges, union-find prevents cycles
graphs/minimum_spanning_tree_kruskal.py:1Kruskal greedily picks the cheapest edge that connects two different components, using path-compressed union-find to detect cycles in O(alpha(n)) time.
Kruskal's correctness rests on the cut property: the cheapest edge crossing any cut of the graph belongs to some minimum spanning tree. Sorting all edges by weight and processing them cheapest-first ensures each accepted edge is always the lightest available bridge between two components. The union-find structure in parent tracks components: each node initially points to itself, and find_parent follows the chain to the root with path compression, flattening the tree so future lookups are near-constant. An edge is accepted only when its two endpoints have different roots, meaning they are in different components. Merging them links parent_a to parent_b. The doctests show that with five edges and four nodes, only three edges enter the MST, as expected for n-1 edges in a spanning tree. The deeper tour covers the full file.
Kruskal's path-compressed find_parent detects cycles without explicit cycle enumeration: two vertices in the same component share the same root.
def kruskal(
num_nodes: int, edges: list[tuple[int, int, int]]
) -> list[tuple[int, int, int]]:
"""
>>> kruskal(4, [(0, 1, 3), (1, 2, 5), (2, 3, 1)])
[(2, 3, 1), (0, 1, 3), (1, 2, 5)]
>>> kruskal(4, [(0, 1, 3), (1, 2, 5), (2, 3, 1), (0, 2, 1), (0, 3, 2)])
[(2, 3, 1), (0, 2, 1), (0, 1, 3)]
>>> kruskal(4, [(0, 1, 3), (1, 2, 5), (2, 3, 1), (0, 2, 1), (0, 3, 2),
... (2, 1, 1)])
[(2, 3, 1), (0, 2, 1), (2, 1, 1)]
"""
edges = sorted(edges, key=lambda edge: edge[2])
parent = list(range(num_nodes))
def find_parent(i):
if i != parent[i]:
parent[i] = find_parent(parent[i])
return parent[i]
minimum_spanning_tree_cost = 0
minimum_spanning_tree = []
for edge in edges:
parent_a = find_parent(edge[0])
parent_b = find_parent(edge[1])
if parent_a != parent_b:
minimum_spanning_tree_cost += edge[2]
minimum_spanning_tree.append(edge)
parent[parent_a] = parent_b
return minimum_spanning_treePrim: grow the spanning tree by the cheapest reachable edge
graphs/prim.py:56Prim maintains a key value per vertex representing the cheapest known edge from the current tree to that vertex, then always adds the vertex with the smallest key.
Prim and Kruskal both produce minimum spanning trees but grow the tree differently. Kruskal sorts all edges globally and picks the cheapest non-cycle-forming edge. Prim starts from a single root vertex and always adds the cheapest edge connecting the current tree to an unvisited vertex. Each Vertex stores a key field initialized to math.inf and a parent pointer pi. The root's key is set to 0 so it is chosen first by min(q). After adding vertex u, the algorithm scans neighbors: if a neighbor is unvisited and the edge to it is cheaper than its current key, both the key and parent pointer are updated. This greedy update means v.key always tracks the cheapest known bridge from the tree to v. The docstring reports O(mn) runtime because min(q) scans the full queue each iteration.
Prim's key field per vertex stores the cheapest known edge from the current tree to that vertex; min(q) picks the next addition in O(n) per step.
def prim(graph: list, root: Vertex) -> list:
"""Prim's Algorithm.
Runtime:
O(mn) with `m` edges and `n` vertices
Return:
List with the edges of a Minimum Spanning Tree
Usage:
prim(graph, graph[0])
"""
a = []
for u in graph:
u.key = math.inf
u.pi = None
root.key = 0
q = graph[:]
while q:
u = min(q)
q.remove(u)
for v in u.neighbors:
if (v in q) and (u.edges[v.id] < v.key):
v.pi = u
v.key = u.edges[v.id]
for i in range(1, len(graph)):
a.append((int(graph[i].id) + 1, int(graph[i].pi.id) + 1))
return aYou've walked through 7 key areas of the The Algorithms - Python codebase.
Continue: Backtracking Algorithms in TheAlgorithms/Python → Browse all projectsCreate code tours for your project
Intraview lets AI create interactive walkthroughs of any codebase. Install the free VS Code extension and generate your first tour in minutes.
Install Intraview Free