The Concept
A binary heap is one of the most elegant data-structure tricks in computer science: it achieves O(log n) insert and delete-min (or delete-max) by exploiting a single structural guarantee — the tree is always complete.
A complete binary tree fills its levels left to right, with no gaps. That property lets you store the entire tree in a plain array without any pointers. If a node lives at zero-based index i, its left child sits at 2i + 1, its right child at 2i + 2, and its parent at (i - 1) / 2 (integer division). No next pointers, no memory allocations per node — just index arithmetic on a flat array.
Layered on top of that layout is the heap invariant:
- Min-heap: every parent is less than or equal to its children. The root is therefore always the global minimum.
- Max-heap: every parent is greater than or equal to its children. The root is always the global maximum.
Crucially, a heap is not fully sorted. Siblings have no ordering relationship with each other, and the second-smallest element can be anywhere in level 1 or level 2. The heap only promises that the root beats everyone — and it maintains that promise efficiently.
Core Operations
push (insert) — O(log n)
Append the new value at the first open slot at the bottom of the tree (the next position in the array). Then sift up: repeatedly compare the new node with its parent and swap if the invariant is violated. Because the tree has height ⌊log₂ n⌋, at most log n swaps are needed.
pop (extract-min / extract-max) — O(log n)
Remove the root (the answer you want). Replace it with the last element in the array, shrink the array by one, then sift down: swap the displaced root with its smaller child (for a min-heap) until the invariant is restored. Again, at most log n swaps.
peek — O(1)
Read heap[0]. The invariant guarantees it is always the min (or max). No work required.
build_heap from an unsorted array — O(n)
This one surprises people. Starting from the last internal node (index n//2 - 1) and working backwards to the root, call sift-down on each node. The total cost is O(n), not O(n log n), because most nodes are near the leaves and travel only a short distance. Python's heapq.heapify uses exactly this bottom-up approach.
Code Implementation
Complexity Analysis
| Operation | Time | Why |
|---|---|---|
| push | O(log n) | sift-up travels at most tree height |
| pop | O(log n) | sift-down travels at most tree height |
| peek | O(1) | invariant guarantees root is min/max |
| build_heap | O(n) | bottom-up; most nodes sift down very little |
| Space | O(n) | one array slot per element |
Interactive Lab
Watch the min-heap maintain its invariant as elements are pushed and popped. Notice that the array representation mirrors the tree layout exactly.
The max-heap works identically — only the comparison flips. Every parent is now the largest in its subtree, and the root holds the global maximum.
The bottom-up build starts from the last internal node and sifts each one down. Watch how the cost concentrates near the root, leaving the leaf-heavy lower levels nearly untouched — that's why the total cost is O(n).
Heap sort repeatedly pops the root of a max-heap to place the largest remaining element at the end — an in-place O(n log n) sort built entirely from heap operations.
Key Takeaways
- Array layout, no pointers. The complete-tree structure maps directly to an array via index arithmetic (
2i+1,2i+2,(i-1)//2). This makes heaps cache-friendly and avoids allocation overhead. - Heap-ordered, not fully sorted. Siblings are unordered. The only guarantee is parent vs. children. Think of it as a "partially sorted" structure optimised for one thing: fast access to the extreme value.
- O(1) peek, O(log n) push/pop, O(n) build. These bounds are tight. If you need the minimum (or maximum) repeatedly while also inserting new elements, a heap beats sorting the whole collection every time.
- Foundation for important algorithms. Heap sort runs in O(n log n) with O(1) extra space by building a max-heap in place. Dijkstra's shortest-path algorithm relies on a min-heap priority queue to always expand the nearest unvisited node next. Any "K largest / K smallest" query in a stream of data uses a heap of size K.
- Priority queues are heaps. Python's
heapq, Java'sPriorityQueue, and C++'spriority_queueare all backed by a binary heap. Understanding the heap invariant lets you reason about the performance guarantees those standard library types give you.