The Algorithms - Python Python TheAlgorithms/Python

How the Sieve of Eratosthenes Works

Mark every multiple of each found prime starting at its square. What remains is a list of primes, produced in O(n log log n).

5 stops ~14 min Verified 2026-05-05
What you will learn
  • Why starting the inner marking loop at p-squared rather than 2p skips already-marked composites
  • How a boolean array of size n+1 encodes the sieve state with one bit of information per candidate
  • Why the outer loop only needs to run up to the square root of n
  • How the second loop after the outer loop collects primes above the square root of n
  • What the six doctests cover and why prime_sieve(1) returns an empty list
  • Where O(n log log n) comes from and how the sieve compares to trial-division primality testing
Prerequisites
  • Comfort reading Python while loops and range expressions
  • Basic familiarity with the concept of prime numbers
  • No prior knowledge of the sieve required
The story behind the Sieve of Eratosthenes

Eratosthenes of Cyrene, born around 276 BCE, was a mathematician and the chief librarian of the Library of Alexandria. The sieve he described is one of the oldest algorithms still in use. The idea is to start with a list of all integers from 2 to n, then repeatedly cross out the multiples of each unmarked number. What is left after the process terminates is a complete list of primes up to n.

The algorithm's efficiency comes from starting the marking of multiples at p-squared rather than 2p: all smaller multiples have already been marked by earlier primes. That means the outer loop only needs to run up to the square root of n, because any composite number with a prime factor larger than its square root must have a corresponding factor smaller than its square root. The running time is O(n log log n), which is nearly linear and far faster than testing each candidate individually by trial division.

Modern applications of the sieve include generating prime tables for RSA key generation, solving Project Euler problems (problems 7, 10, 35, 37, 41, and many others cite it), and twin-prime research. Segmented sieve and wheel factorization are optimizations that extend the basic sieve to very large ranges while keeping memory use manageable. This implementation is the textbook form, prioritizing readability over memory efficiency.

1 / 5

Module Docstring, Attribution, and Imports

maths/sieve_of_eratosthenes.py:1

The module docstring names the algorithm, links an animation and Wikipedia reference, and credits the doctest contributor.

The contribution guide requires a docstring that explains the algorithm and links to a source. This file goes one step further: line 7 links to the Wikimedia animation, which is more useful for a learner than a text description, and line 8 points to the Wikipedia article. Lines 10 and 11 credit the person who wrote the doctests and note that a bug-finder was thanked separately, which is the kind of attribution that makes a community-maintained repo trustworthy rather than anonymous. from __future__ import annotations on line 14 enables the PEP 563 postponed evaluation of annotations, used here so the return type hint list[int] works across Python versions. The only import otherwise is math, needed for math.sqrt to compute the outer loop bound.

Key takeaway

Docstrings in this repo link to the source reference. The animation URL on line 7 is more useful for a learner than prose alone could be.

"""
Sieve of Eratosthones

The sieve of Eratosthenes is an algorithm used to find prime numbers, less than or
equal to a given value.
Illustration:
https://upload.wikimedia.org/wikipedia/commons/b/b9/Sieve_of_Eratosthenes_animation.gif
Reference: https://en.wikipedia.org/wiki/Sieve_of_Eratosthenes

doctest provider: Bruno Simas Hadlich (https://github.com/brunohadlich)
Also thanks to Dmitry (https://github.com/LizardWizzard) for finding the problem
"""

from __future__ import annotations

import math
2 / 5

Signature, Doctests, and Input Guard

maths/sieve_of_eratosthenes.py:19

prime_sieve validates that n is positive, then the docstring pins six examples including the edge cases prime_sieve(2) and prime_sieve(1).

Six doctests cover the meaningful contract boundaries. prime_sieve(50) and prime_sieve(25) verify the general case. prime_sieve(10) and prime_sieve(9) show that the result is the same for both values, which is correct: 9 is not prime and neither is 10, so both return the same list through 7. prime_sieve(2) covers the minimum prime, and prime_sieve(1) covers the edge case where no primes exist in range, returning an empty list rather than raising. The contribution guide requires that functions raise on invalid input, and the guard on line 37 does that for non-positive integers with a descriptive ValueError. The guard comes before the array allocation, so invalid input is rejected without wasting memory.

Key takeaway

The six doctests cover the general case, an off-by-one pair (9 and 10), the minimum prime (2), and the empty result (1). The ValueError guard rejects non-positive input.

def prime_sieve(num: int) -> list[int]:
    """
    Returns a list with all prime numbers up to n.

    >>> prime_sieve(50)
    [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47]
    >>> prime_sieve(25)
    [2, 3, 5, 7, 11, 13, 17, 19, 23]
    >>> prime_sieve(10)
    [2, 3, 5, 7]
    >>> prime_sieve(9)
    [2, 3, 5, 7]
    >>> prime_sieve(2)
    [2]
    >>> prime_sieve(1)
    []
    """

    if num <= 0:
        msg = f"{num}: Invalid input, please enter a positive integer."
        raise ValueError(msg)
3 / 5

Array Init and the Outer Prime Loop

maths/sieve_of_eratosthenes.py:41

A boolean array of size n+1 starts all True. The outer loop runs from 2 to sqrt(n). When sieve[start] is True, start is prime.

The sieve representation is as simple as possible: a list of True values with one entry per integer from 0 to n. Index i is True if i is still a candidate prime, False if it has been marked composite. Line 41 allocates n+1 slots so index n is valid. Line 44 computes int(math.sqrt(num)) as the outer loop's upper bound. A composite number c must have at least one prime factor at most equal to its square root: if all prime factors exceeded the square root, their product would exceed c. So every composite up to n is reachable by some prime at most equal to the square root. The outer loop therefore stops at that bound, which is the optimization that makes the sieve faster than trial division. Line 48 checks the sieve state before collecting the current prime.

Key takeaway

The outer loop stops at sqrt(n) because every composite up to n must have a prime factor at or below that bound. That is the sieve's core optimization.

    sieve = [True] * (num + 1)
    prime = []
    start = 2
    end = int(math.sqrt(num))

    while start <= end:
        # If start is a prime
        if sieve[start] is True:
            prime.append(start)
4 / 5

Inner Loop: Mark Multiples Starting at p-Squared

maths/sieve_of_eratosthenes.py:51

For each prime p, the inner loop marks composites starting at p-squared as False. Multiples below p-squared were already marked by earlier primes.

Starting the inner range at start * start on line 52 is the performance-critical choice. For prime p, any multiple smaller than p-squared is of the form p times q where q is less than p, which means it has already been marked composite when q's multiples were eliminated in an earlier outer iteration. Starting at p-squared avoids redundant work. The if sieve[i] is True guard on line 53 is technically unnecessary (setting False to False is harmless) but makes intent explicit. After the outer loop ends at the square root boundary, lines 58 through 60 collect any remaining True entries between the square root and n: these are primes above the square root that were never targeted by the inner marking loop but are also never composite with respect to any prime up to the square root.

Key takeaway

Starting at p-squared skips already-marked composites. The final loop gathers primes above sqrt(n) that the outer loop never directly processed.

            # Set multiples of start be False
            for i in range(start * start, num + 1, start):
                if sieve[i] is True:
                    sieve[i] = False

        start += 1

    for j in range(end + 1, num + 1):
        if sieve[j] is True:
            prime.append(j)

    return prime
5 / 5

The __main__ Guard: One-Line Driver

maths/sieve_of_eratosthenes.py:65

The __main__ block is a single line that reads n from stdin, calls prime_sieve, and prints the result.

A one-line driver is the minimal interactive shell for this algorithm. Line 66 chains four calls in one expression: input(...).strip() reads and trims a string from stdin, int(...) converts it to an integer, prime_sieve(...) computes the list, and print(...) writes it. The contribution guide requires .strip() on any input() call to prevent trailing whitespace from breaking the integer parse. There is no try/except here, which means a non-integer input raises an unhandled ValueError from int(). Compare with sorts/merge_sort.py, which wraps its driver in a try/except; the choice of whether to handle the error at this layer varies by contributor. For a single-function educational file, the raw crash on bad input is acceptable because the docstring already documents the ValueError path through the function itself.

Key takeaway

A one-line driver chains input, int conversion, the algorithm, and print. The .strip() is required by the contribution guide to handle trailing whitespace on any input() call.

if __name__ == "__main__":
    print(prime_sieve(int(input("Enter a positive integer: ").strip())))
Your codebase next

Create code tours for your project

Intraview lets AI create interactive walkthroughs of any codebase. Install the free VS Code extension and generate your first tour in minutes.

Install Intraview Free