Heap Sort | Dheerendra Varma

Heap sort answers a natural question: what if we could repeatedly find and remove the maximum element from the array efficiently? With a plain array, finding the maximum takes O(n) per operation. With a sorted array, removal takes O(n) to shift. Neither is fast enough. The heap data structure gives us both operations in O(log n) — and it can be embedded directly inside the array we are sorting, requiring no extra memory.

The result is a sort that is O(n log n) in the best, average, and worst case, uses O(1) auxiliary space, and has no pathological inputs. It is slower than quick sort in practice due to poor cache locality, but its guarantees are stronger.

Step-by-Step Trace

Input: [12, 11, 13, 5, 6, 7]

Max-heap array layout

A max-heap is a complete binary tree stored in an array. For node at index i:

left  child  = 2*i + 1
right child  = 2*i + 2
parent        = (i - 1) / 2

Phase 1 — Build max-heap (Floyd's algorithm)

Start from the last non-leaf node (n/2 - 1) = 2 and sift down toward index 0.

Initial array: [12, 11, 13, 5, 6, 7]

As a tree:
           12  (idx 0)
          /  \
        11    13  (idx 1, 2)
       / \   /
      5   6 7   (idx 3, 4, 5)

--- sift-down from index 2 (value 13) ---
Children: left=5 (idx 5, value 7) — only one child here actually idx 5
  left child index = 2*2+1 = 5, value = 7
  right child index = 2*2+2 = 6, out of bounds
  13 > 7, no swap needed.

Tree after sift-down(2): unchanged
[12, 11, 13, 5, 6, 7]

--- sift-down from index 1 (value 11) ---
  left  child: idx 3, value 5
  right child: idx 4, value 6
  largest child: 6 at idx 4
  11 > 6, no swap needed.

Tree after sift-down(1): unchanged
[12, 11, 13, 5, 6, 7]

--- sift-down from index 0 (value 12) ---
  left  child: idx 1, value 11
  right child: idx 2, value 13
  largest child: 13 at idx 2
  12 < 13 → swap arr[0] ↔ arr[2]

[13, 11, 12, 5, 6, 7]

  Continue sifting 12 down from idx 2:
    left  child: idx 5, value 7
    right child: idx 6, out of bounds
    12 > 7, no swap needed.

Max-heap ready: [13, 11, 12, 5, 6, 7]

Tree:
           13
          /  \
        11    12
       / \   /
      5   6 7

Phase 2 — Extract max repeatedly (sort in place)

Each step: swap root (max) with last unsorted element, shrink heap boundary, sift root down.

heap_size=6
Step 1: swap arr[0]↔arr[5] → [7, 11, 12, 5, 6, 13]   (13 is placed)
        sift-down [7, 11, 12, 5, 6] from idx 0:
          children: 11(idx1), 12(idx2) → max is 12
          7 < 12 → swap idx0↔idx2 → [12, 11, 7, 5, 6, 13]
          continue from idx2: child 7(idx5) out of heap now — stop.
        array: [12, 11, 7, 5, 6 | 13]

heap_size=5
Step 2: swap arr[0]↔arr[4] → [6, 11, 7, 5, 12, 13]   (12 placed)
        sift-down [6, 11, 7, 5] from idx 0:
          children: 11(idx1), 7(idx2) → max is 11
          6 < 11 → swap idx0↔idx1 → [11, 6, 7, 5, 12, 13]
          continue from idx1: children 5(idx3), none(idx4 outside heap)
          6 > 5, no swap.
        array: [11, 6, 7, 5 | 12, 13]

heap_size=4
Step 3: swap arr[0]↔arr[3] → [5, 6, 7, 11, 12, 13]   (11 placed)
        sift-down [5, 6, 7] from idx 0:
          children: 6(idx1), 7(idx2) → max is 7
          5 < 7 → swap idx0↔idx2 → [7, 6, 5, 11, 12, 13]
          continue from idx2: no children in heap.
        array: [7, 6, 5 | 11, 12, 13]

heap_size=3
Step 4: swap arr[0]↔arr[2] → [5, 6, 7, 11, 12, 13]   (7 placed)
        sift-down [5, 6] from idx 0:
          child: 6(idx1)
          5 < 6 → swap → [6, 5, 7, 11, 12, 13]
        array: [6, 5 | 7, 11, 12, 13]

heap_size=2
Step 5: swap arr[0]↔arr[1] → [5, 6, 7, 11, 12, 13]   (6 placed)
        sift-down [5]: no children, done.
        array: [5 | 6, 7, 11, 12, 13]

heap_size=1 → done.

Result: [5, 6, 7, 11, 12, 13]

Naive Implementation

Build an explicit MaxHeap struct, push all elements in, then pop them out in descending order into a result vector.

struct MaxHeap {
    data: Vec<i32>,
}
 
impl MaxHeap {
    fn new() -> Self {
        MaxHeap { data: Vec::new() }
    }
 
    fn push(&mut self, val: i32) {
        self.data.push(val);
        // Bubble the new element up until the heap property is restored.
        let mut i = self.data.len() - 1;
        while i > 0 {
            let parent = (i - 1) / 2;
            if self.data[i] > self.data[parent] {
                self.data.swap(i, parent);
                i = parent;
            } else {
                break;
            }
        }
    }
 
    fn pop(&mut self) -> Option<i32> {
        if self.data.is_empty() {
            return None;
        }
        let n = self.data.len();
        self.data.swap(0, n - 1);
        let max = self.data.pop();
        // Sift the new root down to restore the heap property.
        self.sift_down(0, self.data.len());
        max
    }
 
    fn sift_down(&mut self, mut i: usize, heap_size: usize) {
        loop {
            let left  = 2 * i + 1;
            let right = 2 * i + 2;
            let mut largest = i;
 
            if left < heap_size && self.data[left] > self.data[largest] {
                largest = left;
            }
            if right < heap_size && self.data[right] > self.data[largest] {
                largest = right;
            }
 
            if largest == i {
                break; // heap property satisfied
            }
            self.data.swap(i, largest);
            i = largest;
        }
    }
}
 
fn heap_sort_naive(arr: &[i32]) -> Vec<i32> {
    // Phase 1: insert every element — O(n log n) via repeated push.
    let mut heap = MaxHeap::new();
    for &x in arr {
        heap.push(x);
    }
 
    // Phase 2: pop all elements — they come out in descending order.
    let mut result = Vec::with_capacity(arr.len());
    while let Some(val) = heap.pop() {
        result.push(val);
    }
 
    // Reverse to get ascending order.
    result.reverse();
    result
}
 
fn main() {
    let data = vec![12, 11, 13, 5, 6, 7];
    let sorted = heap_sort_naive(&data);
    println!("{:?}", sorted); // [5, 6, 7, 11, 12, 13]
}

The naive approach uses O(n) extra heap memory for the MaxHeap struct's Vec, and the build phase costs O(n log n) because each of the n push calls can bubble up O(log n) levels. The optimized version fixes both issues.

Optimized Implementation

In-place heap sort using Floyd's algorithm to build the max-heap in O(n), followed by n−1 swap-and-sift-down extractions — all within the original array.

fn heap_sort(arr: &mut [i32]) {
    let n = arr.len();
    if n <= 1 {
        return;
    }
 
    // Phase 1 — Build max-heap using Floyd's bottom-up algorithm.
    //
    // Leaves (indices n/2..n) are trivially valid heaps of size 1.
    // Work backwards from the last internal node (n/2 - 1) to the root,
    // sifting each node down. This is O(n) total, not O(n log n).
    for i in (0..n / 2).rev() {
        sift_down(arr, i, n);
    }
 
    // Phase 2 — Sort by repeatedly extracting the maximum.
    //
    // The maximum is always arr[0] (the heap root). Swap it with the last
    // element in the unsorted region, shrink the heap by 1, and sift the
    // new root down to restore the heap property.
    for heap_end in (1..n).rev() {
        arr.swap(0, heap_end);          // move max to its final position
        sift_down(arr, 0, heap_end);    // restore heap over arr[..heap_end]
    }
}
 
/// Sift node at index `i` downward within arr[..heap_size].
/// Swaps with the largest child until the max-heap property holds.
fn sift_down(arr: &mut [i32], mut i: usize, heap_size: usize) {
    loop {
        let left  = 2 * i + 1;
        let right = 2 * i + 2;
        let mut largest = i;
 
        if left  < heap_size && arr[left]  > arr[largest] { largest = left;  }
        if right < heap_size && arr[right] > arr[largest] { largest = right; }
 
        if largest == i {
            break; // subtree rooted at i satisfies max-heap property
        }
 
        arr.swap(i, largest);
        i = largest; // follow the swapped element down
    }
}
 
fn main() {
    let mut data = vec![12, 11, 13, 5, 6, 7];
    heap_sort(&mut data);
    println!("{:?}", data); // [5, 6, 7, 11, 12, 13]
 
    // Works correctly on edge cases.
    let mut empty: Vec<i32> = vec![];
    heap_sort(&mut empty);
 
    let mut one = vec![42];
    heap_sort(&mut one);
    assert_eq!(one, [42]);
 
    // All duplicates — heap property is maintained even when all values equal.
    let mut dups = vec![3i32; 1000];
    heap_sort(&mut dups);
    assert!(dups.iter().all(|&x| x == 3));
}

Key improvements over the naive version:

O(n) build phase — Floyd's algorithm starts from the leaves (which are already valid heaps) and works upward. The total sift-down work across all nodes sums to O(n) rather than O(n log n) (see time complexity section for the derivation).
O(1) auxiliary space — no separate data structure. The heap and the sorted region both live in the same array, separated only by the heap_end index.
No allocation — the function takes &mut [i32] and operates entirely in place.

Time Complexity

Phase 1 — Building the max-heap

A naive build (pushing n elements one by one) costs O(n log n). Floyd's algorithm is better: it processes only the n/2 internal nodes, and nodes near the bottom of the tree (the majority) have small subtrees to sift through.

Let h be the height of the heap (h = ⌊log₂ n⌋). At depth d from the root, there are at most 2ᵈ nodes, each needing to sift down at most h−d levels. The total work is:

Build cost = Σ_{d=0}^{h} 2^d · (h - d)

Let k = h - d  (number of sift-down levels):

           = Σ_{k=0}^{h} 2^{h-k} · k
           = 2^h · Σ_{k=0}^{h} k / 2^k

The sum Σ_{k=0}^{∞} k / 2^k converges to 2 (standard power series result).

Since 2^h ≤ n:

Build cost ≤ n · 2 = O(n)

Floyd's build is therefore O(n), which is optimal — you must read every element at least once.

Phase 2 — Extraction

Each of the n−1 extractions involves one swap (O(1)) and one sift-down (O(log n) at most, since the heap shrinks by one each time):

Extraction cost = (n - 1) · O(log n) = O(n log n)

Total: O(n) + O(n log n) = O(n log n)

This bound holds for all inputs — best, average, and worst case — because the heap structure is determined entirely by the array layout, not the values. There are no lucky or unlucky arrangements.

Recurrence perspective

Unlike merge sort and quick sort, heap sort does not have a clean divide-and-conquer recurrence. Its O(n log n) is derived directly from the work-per-level argument above rather than from a recurrence relation.

Space Complexity

Auxiliary space: O(1)

Heap sort is fully in-place. The sift_down function uses only a handful of local index variables (i, left, right, largest). The sorted region and the heap region both share the original array. No additional data structure of any size is allocated.

Call stack: O(1)

The sift_down function is iterative (the loop + break pattern), not recursive. The outer heap_sort function is also iterative. The call depth is constant regardless of input size.

This distinguishes heap sort from merge sort (O(n) auxiliary buffer) and the naive recursive quick sort (O(log n) stack frames on average). Heap sort achieves the best possible space bound for a comparison-based sort.

Property	Heap Sort
Best case time	O(n log n)
Average case time	O(n log n)
Worst case time	O(n log n)
Auxiliary space	O(1)
Call stack	O(1)
Stable	No
In-place	Yes