How Jump Search Works
Jump through sorted data in square-root-sized blocks, then linear-search the candidate block. O(sqrt(n)) with no recursion, no pivots.
What you will learn
- Why block_size = sqrt(n) is the optimal trade-off between jump count and linear scan length
- How the outer while loop advances prev and step by block_size until the target's block is found
- How the min(step, arr_size) guard prevents an out-of-range index when the last block is smaller than block_size
- How the inner while loop does a linear scan within the candidate block
- Why the algorithm requires a sorted array and fails silently on unsorted input
- How the Comparable Protocol and PEP 695 type-parameter syntax make the function generic over any ordered type
- Why jump search is faster than linear search but slower than binary search, and when to prefer it
Prerequisites
- Comfort reading Python while loops and the math module
- Familiarity with sorted arrays and index arithmetic
- No prior knowledge of jump search required
The story behind jump search
Jump search emerged from the search-algorithm literature in the 1970s and 1980s as a practical middle ground between linear and binary search. The core insight is that binary search's O(log n) complexity comes at a price: it requires random access to compute the midpoint at each step. In memory models where random access is expensive (magnetic tape, some disk structures, paged caches), jumping forward in fixed steps and then scanning backward is cheaper than bisecting, even though the asymptotic count of comparisons is higher.
The optimal block size is the square root of n, derived by minimizing the total comparison count. With block size b, the jump phase takes at most n / b jumps and the linear phase takes at most b comparisons. The total is n / b + b, which is minimized when b = sqrt(n), giving O(2 * sqrt(n)) = O(sqrt(n)) total comparisons. For n = 1,000,000, that is about 2,000 comparisons versus 20 for binary search, but jump search wins on hardware where each binary-search step costs a cache miss or a disk seek, while jump steps can be prefetched. Today's CPUs make random access cheap enough that binary search wins almost everywhere. Jump search remains a clear teaching example of how access cost changes algorithm choice.
Module Docstring and Wikipedia Reference
searches/jump_search.py:1The module docstring summarizes the two-phase search in two sentences and links to the Wikipedia article, following the source-citation rule.
The docstring describes the algorithm in algorithm terms, not in implementation terms. Lines 3 and 4 state the jump phase: iterate with steps of square root of n until the compared element exceeds the target. Lines 5 and 6 state the linear phase: scan forward until the target is found or confirmed absent. The Wikipedia URL on line 8 satisfies the contribution checklist's source-citation requirement. This level of documentation is more informative than many algorithm files in the repository, which name the algorithm but leave the mechanics to the code. A reader arriving here already knows what search algorithms are but may not know why jump search costs O(sqrt(n)); the docstring gives enough context to make the code legible without opening the Wikipedia article.
Two sentences describe both phases before any code. The Wikipedia URL satisfies the contribution checklist's source-citation requirement.
"""
Pure Python implementation of the jump search algorithm.
This algorithm iterates through a sorted collection with a step of n^(1/2),
until the element compared is bigger than the one searched.
It will then perform a linear search until it matches the wanted number.
If not found, it returns -1.
https://en.wikipedia.org/wiki/Jump_search
"""Imports and the Comparable Protocol
searches/jump_search.py:11The Comparable Protocol restricts T to types that support less-than, making jump_search generic over any ordered sequence.
Three imports set up the type system before the algorithm. math provides math.sqrt for the block size calculation. Sequence from collections.abc types the input as any object that supports indexed access and len(), which is broader than list: the function works on tuples, ranges, and custom containers. The Comparable Protocol on line 16 is the same pattern as sorts/insertion_sort.py and sorts/quick_sort.py: it declares a structural type that only requires __lt__, so the function handles strings, numbers, or any user-defined type that defines less-than. Positional-only parameter syntax (/) on line 17 is a Python 3.8+ feature that prevents callers from passing other as a keyword argument. The Protocol and Sequence imports together keep the function generic while keeping the type hints accurate.
The Comparable Protocol requires only __lt__, making jump search work on any ordered type. Sequence is broader than list: it accepts tuples and custom containers.
import math
from collections.abc import Sequence
from typing import Any, Protocol
class Comparable(Protocol):
def __lt__(self, other: Any, /) -> bool: ...Signature, Doctest, and Five Input Shapes
searches/jump_search.py:20PEP 695 type-parameter syntax binds T to Comparable. Five doctests cover an interior hit, negatives, a miss, a Fibonacci sequence, and strings.
The signature uses PEP 695 type-parameter syntax ([T: Comparable]), the same form as sorts/insertion_sort.py in this repository. Five doctests are more than most algorithm files carry, and each tests a distinct scenario. The first uses a six-element integer sequence and an interior hit. The second uses negative integers, which jump search handles because it uses < comparisons rather than index arithmetic. The third is a miss that returns -1. The fourth uses a 16-element Fibonacci sequence and searches for 55 at index 10, which requires at least four block jumps (block size 4 = floor of sqrt(16)) before the linear phase. The fifth uses strings, confirming the Comparable Protocol works for lexicographic order. Together these five examples exercise both phases and the miss branch.
Five doctests cover negatives, misses, a long Fibonacci sequence that forces multiple jumps, and strings. All pass through the Comparable Protocol's single __lt__ requirement.
def jump_search[T: Comparable](arr: Sequence[T], item: T) -> int:
"""
Python implementation of the jump search algorithm.
Return the index if the `item` is found, otherwise return -1.
Examples:
>>> jump_search([0, 1, 2, 3, 4, 5], 3)
3
>>> jump_search([-5, -2, -1], -1)
2
>>> jump_search([0, 5, 10, 20], 8)
-1
>>> jump_search([0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233, 377, 610], 55)
10
>>> jump_search(["aa", "bb", "cc", "dd", "ee", "ff"], "ee")
4
"""Jump Phase: Block Advancement Until Overshoot
searches/jump_search.py:38block_size = sqrt(n). The outer loop advances prev and step by block_size until the block's last element meets or passes the target.
The jump phase computes the block size on line 39 as the integer part of the square root of the array length. prev on line 41 is the start index of the current block; step on line 42 is the end index. The while condition on line 43 compares the last element of the current block against the target. min(step, arr_size) is the bounds guard: on the final block, step may exceed the last valid index, and min clamps it so the subtraction of 1 gives a valid index. If the block's last element is still less than the target, lines 44 and 45 advance both pointers by one block. The check on lines 46 and 47 catches the case where the target is larger than every element: when prev reaches or exceeds arr_size, there are no more blocks to jump and the function returns -1.
min(step, arr_size) - 1 is the safe last-block guard. When prev >= arr_size, the target exceeds every element and -1 is returned immediately.
arr_size = len(arr)
block_size = int(math.sqrt(arr_size))
prev = 0
step = block_size
while arr[min(step, arr_size) - 1] < item:
prev = step
step += block_size
if prev >= arr_size:
return -1Linear Phase and Final Match
searches/jump_search.py:49The inner while loop scans forward from prev within the candidate block. A step-boundary check exits on block exhaustion. An equality test confirms the match.
After the jump phase identifies the candidate block, the linear phase scans forward from prev one element at a time. Line 49 checks whether the current element is still below the target. If it is, line 50 advances prev. Line 51 checks whether the scan has reached the block boundary or the end of the array using the same min(step, arr_size) guard as the jump phase; if so, the element cannot be in this block and the function returns -1. Once the while exits, line 53 verifies the element at prev actually equals the target. That equality check is necessary because the while condition only checks less-than, not equality: the loop exits when arr[prev] is at least as large as item, which could mean an equal match or the first element that is strictly greater. The final return -1 on line 55 handles the strictly-greater case.
The linear phase exits when the current element is not less than the target. An explicit equality check on line 53 distinguishes a match from the first overshoot.
while arr[prev] < item:
prev += 1
if prev == min(step, arr_size):
return -1
if arr[prev] == item:
return prev
return -1You've walked through 5 key areas of the The Algorithms - Python codebase.
Continue: How BFS Works → Browse all projectsCreate code tours for your project
Intraview lets AI create interactive walkthroughs of any codebase. Install the free VS Code extension and generate your first tour in minutes.
Install Intraview Free