How BFS Works
Visit every neighbor before going deeper. The FIFO queue invariant that guarantees shortest-path distances on unweighted graphs.
What you will learn
- Why a FIFO queue enforces layer-by-layer expansion and guarantees shortest paths on unweighted graphs
- How the visited set prevents infinite loops when the graph contains cycles
- How Graph.add_edge builds an adjacency dict by appending to existing neighbor lists
- How the bfs method initializes with the start vertex and drains the queue until no unvisited neighbors remain
- Why the doctest builds a graph with a self-loop and duplicate edges to stress the visited-set contract
- How the __main__ block asserts the output rather than only printing it
Prerequisites
- Comfort reading Python classes and methods
- Familiarity with the concept of a queue (first in, first out)
- No prior knowledge of graph traversal required
The story behind breadth-first search
Konrad Zuse described a form of breadth-first traversal in 1945 in his manuscript on the Plankalkul programming language, but the manuscript was not published until 1972. Edward F. Moore independently described BFS in 1959 in a paper about finding the shortest path through a maze, framing it as a wave that expands outward from the source one step at a time. C. Y. Lee rediscovered the same idea in 1961 while solving circuit board wire routing problems.
The reason BFS guarantees shortest paths on unweighted graphs follows directly from the FIFO queue. Because the queue processes vertices in the order they are added, all vertices at distance d from the source are dequeued and processed before any vertex at distance d + 1. When a vertex is first dequeued, it has been reached by the shortest possible path, so no later path can improve on it. This property makes BFS the foundation of shortest-path algorithms in social network analysis (six degrees of separation), web crawlers, and peer-to-peer networks. For weighted graphs the guarantee breaks and Dijkstra's algorithm takes over.
Imports and Graph.__init__: The Adjacency Dict
graphs/breadth_first_search.py:1Graph stores adjacency as a dict of neighbor lists. The dict starts empty; vertices enter on first outgoing edge insertion.
The file's two imports signal the design. from __future__ import annotations enables deferred type-hint evaluation for the class that is about to be defined. from queue import Queue brings in Python's thread-safe FIFO queue, which the BFS method will use to hold the frontier. The Graph class stores its edges in self.vertices, a dict[int, list[int]]. Each key is a vertex number; its value is a list of neighbors. The dict starts empty. There is no pre-registration of vertices: a vertex enters the dict the first time an edge points away from it. Vertices that only receive edges and never send them will not appear as keys, which is why the BFS doctest adds duplicate edges and a self-loop (vertex 3 to vertex 3) to verify the visited set handles those cases without infinite looping.
Adjacency is a dict of lists. Vertices enter the dict lazily on first outgoing edge insertion. Vertices with only incoming edges never become keys.
#!/usr/bin/python
"""Author: OMKAR PATHAK"""
from __future__ import annotations
from queue import Queue
class Graph:
def __init__(self) -> None:
self.vertices: dict[int, list[int]] = {}Graph.add_edge: Lazy Vertex Registration
graphs/breadth_first_search.py:26add_edge either appends to an existing neighbor list or creates a new one-element list. No separate vertex registration step is needed.
Adding an edge is an if-else on whether the source vertex already has a neighbor list. If it does, the new neighbor appends to the existing list. If it does not, the method creates a new one-element list. The doctest verifies both branches: the first add_edge(0, 1) call creates vertex 0's list, and the duplicate add_edge(0, 1) call in the BFS doctest later tests that a second identical edge appends rather than overwrites. The BFS method must therefore handle duplicate neighbors in the adjacency list, which is why the visited set is checked before enqueuing rather than after dequeuing. The simplicity of this two-branch method is intentional: graph construction stays out of the search logic, and the search only reads self.vertices.
Two branches: append to an existing list, or create a new one. The BFS logic reads this dict; it does not depend on any insertion ordering guarantee.
def add_edge(self, from_vertex: int, to_vertex: int) -> None:
"""
adding the edge between two vertices
>>> g = Graph()
>>> g.print_graph()
>>> g.add_edge(0, 1)
>>> g.print_graph()
0 : 1
"""
if from_vertex in self.vertices:
self.vertices[from_vertex].append(to_vertex)
else:
self.vertices[from_vertex] = [to_vertex]bfs: Queue, visited Set, and the Traversal Loop
graphs/breadth_first_search.py:40BFS initializes visited with the start vertex, enqueues it, then drains the queue by dequeuing each vertex and enqueuing its unvisited neighbors.
The BFS traversal rests on three structures whose roles are distinct and non-interchangeable. The visited set on line 54 prevents a vertex from being processed more than once; it is checked before enqueuing, not after dequeuing, which is why the self-loop (vertex 3 to vertex 3) in the doctest does not cause an infinite loop. The queue on line 57 is Python's Queue, a FIFO structure; using a stack here would produce depth-first traversal instead. The self.vertices dict is the adjacency map the add_edge method built. The loop invariant is that every vertex in the queue has already been added to visited, so the adjacency scan only enqueues genuinely new vertices. The function returns the visited set rather than a list because the doctest uses sorted() to get deterministic output; traversal order depends on insertion order in the adjacency list, not alphabetical order.
Mark before enqueuing, not after dequeuing. The visited set and the queue work together: one prevents reprocessing, the other enforces layer-by-layer expansion.
def bfs(self, start_vertex: int) -> set[int]:
"""
>>> g = Graph()
>>> g.add_edge(0, 1)
>>> g.add_edge(0, 1)
>>> g.add_edge(0, 2)
>>> g.add_edge(1, 2)
>>> g.add_edge(2, 0)
>>> g.add_edge(2, 3)
>>> g.add_edge(3, 3)
>>> sorted(g.bfs(2))
[0, 1, 2, 3]
"""
# initialize set for storing already visited vertices
visited = set()
# create a first in first out queue to store all the vertices for BFS
queue: Queue = Queue()
# mark the source node as visited and enqueue it
visited.add(start_vertex)
queue.put(start_vertex)
while not queue.empty():
vertex = queue.get()
# loop through all adjacent vertex and enqueue it if not yet visited
for adjacent_vertex in self.vertices[vertex]:
if adjacent_vertex not in visited:
queue.put(adjacent_vertex)
visited.add(adjacent_vertex)
return visitedThe Doctest: Cycles, Duplicates, and a Self-Loop
graphs/breadth_first_search.py:41The doctest uses a duplicate edge, a back edge, and a self-loop to stress the visited-set contract, then verifies all four vertices are reached.
The doctest is short but each edge is chosen deliberately. add_edge(0, 1) appears twice to test that the adjacency list accumulates duplicates without error and that BFS still visits vertex 1 exactly once. add_edge(2, 0) creates a cycle back to vertex 0, which would loop forever if the visited set were not checked before enqueuing. add_edge(3, 3) creates a self-loop, which is the hardest case: vertex 3 has itself as a neighbor, so a traversal without a visited check would enqueue vertex 3 immediately after dequeuing it. The assertion uses sorted(g.bfs(2)) because bfs returns a set and set ordering is not guaranteed. Starting from vertex 2 rather than 0 shows that BFS does not require the lowest-numbered vertex as the source and still reaches every reachable vertex in a strongly connected subgraph.
The three deliberate edge types in the doctest (duplicate, cycle, self-loop) each stress a different correctness property of the visited-set check.
"""
>>> g = Graph()
>>> g.add_edge(0, 1)
>>> g.add_edge(0, 1)
>>> g.add_edge(0, 2)
>>> g.add_edge(1, 2)
>>> g.add_edge(2, 0)
>>> g.add_edge(2, 3)
>>> g.add_edge(3, 3)
>>> sorted(g.bfs(2))
[0, 1, 2, 3]
"""The __main__ Block: Building a Six-Edge Graph and Asserting the Result
graphs/breadth_first_search.py:74The __main__ block runs the doctests verbosely, builds a six-edge graph, prints its adjacency list, and asserts the BFS result rather than only printing it.
This block has two design choices worth calling out. testmod(verbose=True) on line 77 prints each doctest as it runs with its pass/fail result, making the file self-documenting when invoked directly. The last line uses assert rather than print, so a wrong BFS result raises a visible AssertionError rather than producing silent output. The print_graph() call on line 87 prints the adjacency list in a readable format; the four commented lines below it show exactly what the output looks like, which is a documentation convention the repository uses in several graph files. The graph built here is a subset of the one in the BFS doctest: it excludes the duplicate (0, 1) edge but keeps the cycle and the self-loop, so the assert is equivalent to verifying the same four-vertex reachability.
assert sorted(g.bfs(2)) == [0, 1, 2, 3] makes a wrong result visible as an AssertionError. Printing without asserting is a missed verification opportunity.
if __name__ == "__main__":
from doctest import testmod
testmod(verbose=True)
g = Graph()
g.add_edge(0, 1)
g.add_edge(0, 2)
g.add_edge(1, 2)
g.add_edge(2, 0)
g.add_edge(2, 3)
g.add_edge(3, 3)
g.print_graph()
# 0 : 1 -> 2
# 1 : 2
# 2 : 0 -> 3
# 3 : 3
assert sorted(g.bfs(2)) == [0, 1, 2, 3]Comparison: BFS Versus DFS in the graphs/ Directory
graphs/breadth_first_search.py:63Swap the FIFO Queue for a LIFO stack and the same BFS loop becomes DFS. The visited-set check is identical in both traversal families.
The BFS loop body is the place to see what separates BFS from DFS by reading the code rather than a diagram. The structure here is a FIFO drain: queue.get() removes the oldest entry, and queue.put() appends the newest. The effect is that vertices added during expansion of layer d wait behind all existing layer-d entries and are processed only after every layer-d vertex is done, giving the layer-by-layer guarantee. The DFS file at graphs/depth_first_search.py replaces Queue with a plain list used as a stack, calling stack.pop() to take the most recently added entry. Both share the same visited-set check; the single structural difference is FIFO versus LIFO. That one swap changes shortest-path guarantee to depth-first backtracking.
FIFO queue gives layer-by-layer expansion and shortest paths. Swap in a LIFO stack and the same loop skeleton becomes DFS. The visited-set check is identical in both.
while not queue.empty():
vertex = queue.get()
# loop through all adjacent vertex and enqueue it if not yet visited
for adjacent_vertex in self.vertices[vertex]:
if adjacent_vertex not in visited:
queue.put(adjacent_vertex)
visited.add(adjacent_vertex)
return visitedYou've walked through 6 key areas of the The Algorithms - Python codebase.
Continue: How DFS Works → Browse all projectsCreate code tours for your project
Intraview lets AI create interactive walkthroughs of any codebase. Install the free VS Code extension and generate your first tour in minutes.
Install Intraview Free