How A* Search Works
Combine actual path cost with a heuristic estimate of remaining distance. The 1968 algorithm built for a robot, now in every map and game engine.
What you will learn
- How A* combines g (actual cost so far) and h (heuristic estimate to goal) into f to guide the frontier
- Why the heuristic must be admissible (never overestimate) for A* to guarantee an optimal path
- How the DIRECTIONS array encodes the four compass moves on a grid
- How the cell list acts as a priority queue sorted by f-cost on every iteration
- How the action grid records which direction reached each cell, enabling path reconstruction
- How the invpath list is built backward from the goal and reversed to yield the final path
Prerequisites
- Familiarity with Python lists and nested loops
- A basic mental model of graph traversal (BFS or Dijkstra)
- No prior knowledge of A* required
The story behind A* search
Peter Hart, Nils Nilsson, and Bertram Raphael at the Stanford Research Institute published A-star search in 1968 in a paper titled "A Formal Basis for the Heuristic Determination of Minimum Cost Paths." The algorithm was written to solve a navigation problem for Shakey, the world's first general-purpose mobile robot. Shakey needed to plan routes through a room full of obstacles without evaluating every possible path, and the two existing approaches were unsatisfying: Dijkstra's algorithm explored every reachable node regardless of direction, while greedy best-first search chased the heuristic without caring about how far it had already traveled.
A-star merged both ideas by tracking f = g + h, where g is the actual cost from the start and h is the heuristic estimate to the goal. The key insight is the admissibility requirement: if h never overestimates the true remaining cost, the algorithm finds the cheapest path. The Manhattan distance on a grid is admissible because it counts the minimum steps with no diagonal movement. Today A-star is the standard pathfinding algorithm in turn-based strategy games, GPS routing, and robotics motion planning.
DIRECTIONS: Encoding the Four Compass Moves
graphs/a_star.py:1DIRECTIONS encodes four compass moves as row/column deltas. Integer indices let the action grid record which direction reached each cell.
Before the search function, the file defines the four moves available on the grid. Each entry in DIRECTIONS is a [row_delta, col_delta] pair. Index 0 is left (subtract one from the row), index 1 is down (subtract from column), index 2 is right (add to row), index 3 is up (add to column). Storing moves as integer-indexed entries rather than named strings is the design choice that enables the action grid later in the algorithm: when the search records that direction 2 reached cell [x2, y2], the path-reconstruction step can recover the predecessor by computing x - DIRECTIONS[2][0]. Named string directions would require a reverse-lookup map; integer indices make the inverse free. The from __future__ import annotations import on line 1 enables PEP 563 deferred evaluation of type hints, which this file uses in the function signature.
Integer-indexed DIRECTIONS make the action grid self-inverting during path reconstruction. The predecessor of cell [x, y] is [x - DIRECTIONS[action[x][y]][0], y - DIRECTIONS[action[x][y]][1]].
from __future__ import annotations
DIRECTIONS = [
[-1, 0], # left
[0, -1], # down
[1, 0], # right
[0, 1], # up
]Function Signature and the Doctest Grid
graphs/a_star.py:12search takes a grid, start, goal, step cost, and a caller-supplied heuristic matrix. The doctest verifies path output on a 5x6 obstacle grid.
The function takes five arguments. grid is a 2-D list where 0 means free and 1 means obstacle. init and goal are [row, col] positions. cost is the uniform step cost, set to 1 in every doctest. heuristic is the most important input: a 2-D matrix precomputed by the caller that holds the estimated cost from each cell to the goal. The doctest uses the Manhattan distance (absolute row difference plus absolute column difference) and assigns a penalty of 99 to obstacle cells to prevent the search from routing through them. Separating the heuristic into an input rather than hardcoding it inside the function keeps the algorithm general: replace the Manhattan matrix with Euclidean distance or a custom cost map without touching the search logic.
The heuristic is a caller-supplied matrix, not a function hardcoded inside search. That separation keeps the algorithm general across different distance metrics.
# function to search the path
def search(
grid: list[list[int]],
init: list[int],
goal: list[int],
cost: int,
heuristic: list[list[int]],
) -> tuple[list[list[int]], list[list[int]]]:
"""
Search for a path on a grid avoiding obstacles.
>>> grid = [[0, 1, 0, 0, 0, 0],
... [0, 1, 0, 0, 0, 0],
... [0, 1, 0, 0, 0, 0],
... [0, 1, 0, 0, 1, 0],
... [0, 0, 0, 0, 1, 0]]
>>> init = [0, 0]
>>> goal = [len(grid) - 1, len(grid[0]) - 1]
>>> cost = 1
>>> heuristic = [[0] * len(grid[0]) for _ in range(len(grid))]
>>> heuristic = [[0 for row in range(len(grid[0]))] for col in range(len(grid))]
>>> for i in range(len(grid)):
... for j in range(len(grid[0])):
... heuristic[i][j] = abs(i - goal[0]) + abs(j - goal[1])
... if grid[i][j] == 1:
... heuristic[i][j] = 99
>>> path, action = search(grid, init, goal, cost, heuristic)
>>> path # doctest: +NORMALIZE_WHITESPACE
[[0, 0], [1, 0], [2, 0], [3, 0], [4, 0], [4, 1], [4, 2], [4, 3], [3, 3],
[2, 3], [2, 4], [2, 5], [3, 5], [4, 5]]
>>> action # doctest: +NORMALIZE_WHITESPACE
[[0, 0, 0, 0, 0, 0], [2, 0, 0, 0, 0, 0], [2, 0, 0, 0, 3, 3],
[2, 0, 0, 0, 0, 2], [2, 3, 3, 3, 0, 2]]
"""Initialization: closed, action, and the First Cell
graphs/a_star.py:44closed marks visited cells, action stores arrival direction, and cell is the frontier list of [f, g, row, col] entries sorted by f-cost.
Three data structures govern the search. closed is a grid of zeros that records which cells have already been expanded; writing 1 at init immediately marks the start as visited. action stores, for every cell, the index into DIRECTIONS that led to it from its predecessor. It starts all zeros and gets filled in as the frontier expands. The third structure, cell, is the open frontier. Each entry is [f, g, row, col], where f = g + heuristic[row][col] and g is the actual cost from start. The list starts with the starting position only. At every loop iteration the algorithm will sort this list and pop the entry with the smallest f, which is the A* priority. Using a plain list with sort is simpler to read than heapq; production implementations use a heap for O(log n) pops instead of O(n log n) sorts.
cell is the open frontier: a list of [f, g, row, col] entries. Sorting and popping gives the minimum-f cell, which is the A* expansion order.
closed = [
[0 for col in range(len(grid[0]))] for row in range(len(grid))
] # the reference grid
closed[init[0]][init[1]] = 1
action = [
[0 for col in range(len(grid[0]))] for row in range(len(grid))
] # the action grid
x = init[0]
y = init[1]
g = 0
f = g + heuristic[x][y] # cost from starting cell to destination cell
cell = [[f, g, x, y]]Main Loop: Expand the Minimum-f Cell
graphs/a_star.py:58Sort the frontier, pop the minimum-f cell, then expand valid unvisited neighbors by computing new g and f values and appending them.
The loop body has two phases. Phase one selects the next cell to expand: cell.sort() on line 65 orders by f ascending (since f is the first element of each sub-list), then cell.reverse() and cell.pop() together remove the minimum-f entry. Phase two expands neighbors: the four-direction loop on line 75 computes each candidate neighbor, skips it if out of bounds, already closed, or an obstacle, then computes the neighbor's g and f values, appends the new entry to the frontier, marks the cell closed, and writes the direction index into the action grid. Writing to action[x2][y2] on line 90 is the step that makes path reconstruction possible later. The found flag on line 72 exits the loop as soon as the goal cell is popped, which guarantees the popped path cost is optimal if the heuristic is admissible.
resign is initialized but never reassigned; only goal-capture ends the loop.
Sort by f, pop the minimum, expand valid neighbors, record the arrival direction in the action grid. Exit as soon as the goal is popped.
found = False # flag that is set when search is complete
resign = False # flag set if we can't find expand
while not found and not resign:
if len(cell) == 0:
raise ValueError("Algorithm is unable to find solution")
else: # to choose the least costliest action so as to move closer to the goal
cell.sort()
cell.reverse()
next_cell = cell.pop()
x = next_cell[2]
y = next_cell[3]
g = next_cell[1]
if x == goal[0] and y == goal[1]:
found = True
else:
for i in range(len(DIRECTIONS)): # to try out different valid actions
x2 = x + DIRECTIONS[i][0]
y2 = y + DIRECTIONS[i][1]
if (
x2 >= 0
and x2 < len(grid)
and y2 >= 0
and y2 < len(grid[0])
and closed[x2][y2] == 0
and grid[x2][y2] == 0
):
g2 = g + cost
f2 = g2 + heuristic[x2][y2]
cell.append([f2, g2, x2, y2])
closed[x2][y2] = 1
action[x2][y2] = iPath Reconstruction: Walk Backward Through the Action Grid
graphs/a_star.py:91Walk backward from goal to start using the action grid, subtract each recorded direction delta to reach the predecessor, then reverse for the final path.
The search loop fills the action grid but does not build the path during traversal. Path reconstruction starts at the goal and walks backward. At each step, action[x][y] holds the index of the direction that reached the current cell from its predecessor. Subtracting that direction's row and column deltas on lines 96 and 97 gives the predecessor coordinates. The loop continues until the start cell is reached, accumulating invpath in reverse order from goal to start. The final path block on lines 102 through 104 reverses the list by reading invpath backward. Returning both path and action lets the caller inspect the decision grid as well as the final route, which is what the doctest verifies by printing both outputs.
Action grid reconstruction: subtract the recorded direction delta at each cell to reach the predecessor. Reverse the collected cells to get a start-to-goal path.
invpath = []
x = goal[0]
y = goal[1]
invpath.append([x, y]) # we get the reverse path from here
while x != init[0] or y != init[1]:
x2 = x - DIRECTIONS[action[x][y]][0]
y2 = y - DIRECTIONS[action[x][y]][1]
x = x2
y = y2
invpath.append([x, y])
path = []
for i in range(len(invpath)):
path.append(invpath[len(invpath) - 1 - i])
return path, actionThe __main__ Block: A Concrete 5x6 Grid Example
graphs/a_star.py:108The __main__ block uses the same 5x6 grid as the doctest, fills the Manhattan heuristic matrix, and prints the action map and path.
The __main__ block matches the doctest grid so a reader can trace both at once. The 5x6 grid has obstacles at column 1 in rows 0 through 3, plus one obstacle at row 3 column 4. The Manhattan heuristic on lines 124 through 129 assigns each free cell the sum of absolute row and column differences from the goal at [4, 5], and assigns 99 to obstacle cells so the search routes around them. The comment on line 118 notes coordinates are in [y, x] rows-first format, matching the 2-D list indexing throughout the file. After the search, the block prints the action map (the direction index that reached each cell) followed by the path one cell per line. Reading those two outputs alongside the doctest grid is the direct way to trace how the algorithm navigated the obstacle column.
The __main__ grid matches the doctest. Reading the action map alongside the obstacle grid shows exactly which direction index A* chose at each step.
if __name__ == "__main__":
grid = [
[0, 1, 0, 0, 0, 0],
[0, 1, 0, 0, 0, 0], # 0 are free path whereas 1's are obstacles
[0, 1, 0, 0, 0, 0],
[0, 1, 0, 0, 1, 0],
[0, 0, 0, 0, 1, 0],
]
init = [0, 0]
# all coordinates are given in format [y,x]
goal = [len(grid) - 1, len(grid[0]) - 1]
cost = 1
# the cost map which pushes the path closer to the goal
heuristic = [[0 for row in range(len(grid[0]))] for col in range(len(grid))]
for i in range(len(grid)):
for j in range(len(grid[0])):
heuristic[i][j] = abs(i - goal[0]) + abs(j - goal[1])
if grid[i][j] == 1:
# added extra penalty in the heuristic map
heuristic[i][j] = 99
path, action = search(grid, init, goal, cost, heuristic)
print("ACTION MAP")
for i in range(len(action)):
print(action[i])
for i in range(len(path)):
print(path[i])You've walked through 6 key areas of the The Algorithms - Python codebase.
Continue: How Bellman-Ford Works → Browse all projectsCreate code tours for your project
Intraview lets AI create interactive walkthroughs of any codebase. Install the free VS Code extension and generate your first tour in minutes.
Install Intraview Free