Binary Search | Dheerendra Varma

Binary search is the algorithm that makes sorted data worth having. Instead of scanning every element, it exploits the ordering to eliminate half the remaining candidates with each comparison. An array of one million elements requires at most 20 comparisons — the same number needed to locate one entry among every person on Earth using a binary halving strategy.

The prerequisite is strict: the array must be sorted. In exchange, you get a logarithmic speedup that compounds dramatically with scale.

The Core Intuition

Picture a dictionary. You want to find the word "marsh." You do not start at page 1 — you open to the middle. If you land on "q", you know "marsh" is in the first half. You open to the middle of the first half. You repeat until you land on "marsh" or determine it is not there.

Each opening of the dictionary cuts the search space in half. After one cut: n/2 pages remain. After two cuts: n/4. After k cuts: n/2ᵏ. When n/2ᵏ = 1, you have found the answer or confirmed the absence. Solving for k: k = log₂(n).

This is binary search. The formal requirements:

Sorted array — you need to know "everything to the left is smaller, everything to the right is larger."
Random access — you need to jump to the middle in O(1). Arrays satisfy this; linked lists do not.

Step-by-Step Trace

Array: [2, 5, 8, 12, 16, 23, 38, 56, 72, 91], target: 23

Indices: 0 1 2 3 4 5 6 7 8 9

Initial state: left=0, right=9
  [ 2] [ 5] [ 8] [12] [16] [23] [38] [56] [72] [91]
   L                   M                           R
   0                   4                           9

Step 1: mid = 0 + (9 - 0) / 2 = 4
  arr[4] = 16
  [ 2] [ 5] [ 8] [12] [16] [23] [38] [56] [72] [91]
   L                   ^                           R
                      mid=4  →  16 < 23, target is in right half
                               left = mid + 1 = 5

---

State: left=5, right=9
  [ 2] [ 5] [ 8] [12] [16] [23] [38] [56] [72] [91]
                            L         M           R
                            5         7           9

Step 2: mid = 5 + (9 - 5) / 2 = 7
  arr[7] = 56
  [ 2] [ 5] [ 8] [12] [16] [23] [38] [56] [72] [91]
                            L         ^           R
                                     mid=7  →  56 > 23, target is in left half
                                               right = mid - 1 = 6

---

State: left=5, right=6
  [ 2] [ 5] [ 8] [12] [16] [23] [38] [56] [72] [91]
                            L    M    R
                            5    5    6

Step 3: mid = 5 + (6 - 5) / 2 = 5
  arr[5] = 23
  [ 2] [ 5] [ 8] [12] [16] [23] [38] [56] [72] [91]
                            L    ^    R
                                mid=5  →  23 == 23, FOUND at index 5

Three steps to find the target in an array of 10. log₂(10) ≈ 3.32, so 3 steps is exactly what we expect.

Naive Implementation: Recursive Binary Search

fn binary_search_recursive(arr: &[i32], target: i32, left: usize, right: usize) -> i32 {
    if left > right {                              // search space is empty
        return -1;
    }
 
    let mid = left + (right - left) / 2;          // overflow-safe midpoint
 
    if arr[mid] == target {                        // found
        return mid as i32;
    } else if arr[mid] < target {                  // target is in the right half
        if mid + 1 > right {
            return -1;
        }
        binary_search_recursive(arr, target, mid + 1, right)
    } else {                                       // target is in the left half
        if mid == 0 {
            return -1;
        }
        binary_search_recursive(arr, target, left, mid - 1)
    }
}
 
fn main() {
    let arr = [2, 5, 8, 12, 16, 23, 38, 56, 72, 91];
    let n = arr.len();
 
    let idx = binary_search_recursive(&arr, 23, 0, n - 1);
    println!("Found 23 at index: {}", idx);        // 5
 
    let idx = binary_search_recursive(&arr, 99, 0, n - 1);
    println!("Found 99 at index: {}", idx);        // -1
}

Line-by-line explanation

if left > right The base case for absence. When left exceeds right, the search space is empty — we have narrowed down to nothing and the target is not present. Notice: this comparison uses usize (unsigned integers in Rust). If right were 0 and we computed right - 1, we would get usize::MAX due to underflow — a silent bug. The guards if mid == 0 and if mid + 1 > right handle the boundary cases explicitly.

let mid = left + (right - left) / 2 The overflow-safe midpoint. The naive (left + right) / 2 overflows when left + right > usize::MAX (or i32::MAX in signed contexts). Subtracting first ensures the intermediate value stays bounded. This is one of the most common bugs in binary search implementations — Jon Bentley famously noted that the published binary search in his book "Programming Pearls" had this bug for over 20 years.

binary_search_recursive(arr, target, mid + 1, right) When arr[mid] < target, the target must be to the right of mid (because the array is sorted). We exclude mid itself by starting from mid + 1.

binary_search_recursive(arr, target, left, mid - 1) When arr[mid] > target, the target must be to the left. We exclude mid by ending at mid - 1. The if mid == 0 guard prevents underflow on usize.

The recursion tree

For an array of size n, the recursive calls form a tree:

binary_search(arr, 23, 0, 9)          ← n=10
  binary_search(arr, 23, 5, 9)        ← n=5
    binary_search(arr, 23, 5, 6)      ← n=2
      binary_search(arr, 23, 5, 5)    ← n=1, found!

Each level halves the remaining range. Depth = log₂(n). The recursion never branches — at most one child call per node. Total calls ≤ log₂(n) + 1.

Optimized Implementation: Iterative Binary Search

The recursive version is elegant but uses O(log n) stack space. The iterative version achieves the same logic with O(1) space by replacing the call stack with explicit pointer variables.

fn binary_search_iterative(arr: &[i32], target: i32) -> i32 {
    if arr.is_empty() {
        return -1;
    }
 
    let mut left: usize = 0;
    let mut right: usize = arr.len() - 1;
 
    while left <= right {
        let mid = left + (right - left) / 2;    // overflow-safe mid
 
        if arr[mid] == target {
            return mid as i32;                  // found
        } else if arr[mid] < target {
            left = mid + 1;                     // eliminate left half
        } else {
            if mid == 0 {
                break;                          // prevent usize underflow
            }
            right = mid - 1;                   // eliminate right half
        }
    }
 
    -1                                          // not found
}
 
fn main() {
    let arr = [2, 5, 8, 12, 16, 23, 38, 56, 72, 91];
 
    println!("{}", binary_search_iterative(&arr, 23));   // 5
    println!("{}", binary_search_iterative(&arr, 2));    // 0 (first element)
    println!("{}", binary_search_iterative(&arr, 91));   // 9 (last element)
    println!("{}", binary_search_iterative(&arr, 99));   // -1 (absent)
}

What changed and why

Recursive → iterative: Instead of passing left and right as function arguments and making a new call each time, we mutate left and right directly in a while loop. The logic is identical — compare at mid, move left or right — but no new stack frame is created. Each iteration reuses the same activation record.

while left <= right: The loop continues as long as there is at least one element in the search space. When left > right, the space is empty.

left = mid + 1 vs right = mid - 1: After a comparison, we exclude mid itself from the next iteration — we already checked it, so including it would create an infinite loop (if mid never changes). Adding 1 or subtracting 1 shrinks the range.

The overflow-safe mid is even more critical here: In languages like Java (signed 32-bit ints), (left + right) / 2 overflows when both are large. In Rust with usize, the values cannot be negative, but can still exceed usize::MAX on a 32-bit target. left + (right - left) / 2 is always safe.

Bonus Variants: Finding Boundaries in Arrays with Duplicates

Standard binary search finds an occurrence of the target. When duplicates exist, it may land on any of them. Two variants find the leftmost or rightmost occurrence specifically.

Array with duplicates: [1, 3, 3, 3, 5, 7, 9]

first_occurrence(arr, 3) → 1
last_occurrence(arr, 3) → 3

first_occurrence — leftmost index

fn first_occurrence(arr: &[i32], target: i32) -> i32 {
    if arr.is_empty() {
        return -1;
    }
 
    let mut left: usize = 0;
    let mut right: usize = arr.len() - 1;
    let mut result: i32 = -1;              // tracks the best (leftmost) match found so far
 
    while left <= right {
        let mid = left + (right - left) / 2;
 
        if arr[mid] == target {
            result = mid as i32;           // record this match
            if mid == 0 {
                break;                     // cannot go further left
            }
            right = mid - 1;              // keep searching LEFT for an earlier occurrence
        } else if arr[mid] < target {
            left = mid + 1;
        } else {
            if mid == 0 {
                break;
            }
            right = mid - 1;
        }
    }
 
    result
}

The key difference: when arr[mid] == target, instead of returning immediately, we record the index in result and continue searching to the left (right = mid - 1). This keeps narrowing until the search space is empty, at which point result holds the leftmost index.

Array: [1, 3, 3, 3, 5, 7, 9], target = 3

Step 1: left=0, right=6, mid=3 → arr[3]=3 == 3
        result=3, continue left: right=2

Step 2: left=0, right=2, mid=1 → arr[1]=3 == 3
        result=1, continue left: right=0

Step 3: left=0, right=0, mid=0 → arr[0]=1 < 3
        left=1

Step 4: left=1 > right=0 → loop exits

Return result=1 ✓

last_occurrence — rightmost index

fn last_occurrence(arr: &[i32], target: i32) -> i32 {
    if arr.is_empty() {
        return -1;
    }
 
    let mut left: usize = 0;
    let mut right: usize = arr.len() - 1;
    let mut result: i32 = -1;             // tracks the best (rightmost) match found so far
 
    while left <= right {
        let mid = left + (right - left) / 2;
 
        if arr[mid] == target {
            result = mid as i32;          // record this match
            left = mid + 1;              // keep searching RIGHT for a later occurrence
        } else if arr[mid] < target {
            left = mid + 1;
        } else {
            if mid == 0 {
                break;
            }
            right = mid - 1;
        }
    }
 
    result
}

Symmetric to first_occurrence: when a match is found, move right (left = mid + 1) to search for later occurrences. result keeps the rightmost index seen so far.

Array: [1, 3, 3, 3, 5, 7, 9], target = 3

Step 1: left=0, right=6, mid=3 → arr[3]=3 == 3
        result=3, continue right: left=4

Step 2: left=4, right=6, mid=5 → arr[5]=7 > 3
        right=4

Step 3: left=4, right=4, mid=4 → arr[4]=5 > 3
        right=3

Step 4: left=4 > right=3 → loop exits

Return result=3 ✓

Count occurrences using both variants

fn count_occurrences(arr: &[i32], target: i32) -> i32 {
    let first = first_occurrence(arr, target);
    if first == -1 {
        return 0;
    }
    let last = last_occurrence(arr, target);
    last - first + 1
}
 
fn main() {
    let arr = [1, 3, 3, 3, 5, 7, 9];
    println!("{}", count_occurrences(&arr, 3));  // 3
    println!("{}", count_occurrences(&arr, 5));  // 1
    println!("{}", count_occurrences(&arr, 2));  // 0
}

Two binary searches, each O(log n), gives O(log n) total — versus O(n) for a linear count.

Time Complexity

Deriving O(log n) from first principles

After each comparison, the search space is cut in half. Start with n elements.

After step 0 (before any comparisons): n elements
After step 1: n/2 elements
After step 2: n/4 elements
After step 3: n/8 elements
...
After step k: n / 2^k elements

The algorithm terminates when the search space has at most 1 element:

n / 2^k = 1
n = 2^k
k = log₂(n)

So after at most log₂(n) steps (comparisons), binary search either finds the target or confirms its absence. This is the worst case.

For n = 10 (our example): log₂(10) ≈ 3.32, so at most 4 steps. We used exactly 3. For n = 1,000,000: log₂(1,000,000) ≈ 19.93, so at most 20 steps. For n = 1,000,000,000: log₂(10⁹) ≈ 29.9, so at most 30 steps.

This is the power of logarithms. Multiplying n by one million adds only 20 more steps.

Best Case — O(1)

The target is at the midpoint of the initial array.

arr = [2, 5, 8, 12, 16, 23, 38, 56, 72, 91], target = 16

Step 1: left=0, right=9, mid=4 → arr[4]=16 == 16 → return 4

One comparison, done. Probability of this: 1/n for a random target.

Average Case — O(log n)

For a uniformly random target position, the expected number of comparisons is:

E[comparisons] = sum over all positions i of (depth of i in the search tree) × (1/n)

The search tree for binary search is a complete binary tree of depth log₂(n). The sum of all depths in a complete binary tree of n nodes is approximately n log₂(n) (by the formula Σ floor(log₂(i)) for i=1 to n). Dividing by n: O(log n).

More intuitively: most elements live deep in the tree (close to the leaves), but the tree is shallow — maximum depth log₂(n). There is no configuration of the target that forces more than log₂(n) steps. Average depth ≈ log₂(n) - 1.

Worst Case — O(log n)

The target is absent, or it is at a leaf position in the search tree.

arr = [2, 5, 8, 12, 16, 23, 38, 56, 72, 91], target = 99

Step 1: left=0, right=9, mid=4 → arr[4]=16 < 99 → left=5
Step 2: left=5, right=9, mid=7 → arr[7]=56 < 99 → left=8
Step 3: left=8, right=9, mid=8 → arr[8]=72 < 99 → left=9
Step 4: left=9, right=9, mid=9 → arr[9]=91 < 99 → left=10
Step 5: left=10 > right=9 → loop exits, return -1

5 steps for n=10. ⌈log₂(10)⌉ = 4, plus one for the final empty-space check. Still O(log n).

Summary table

Case	Condition	Comparisons	Complexity
Best	Target at initial midpoint	1	O(1)
Average	Target at random position	~log₂(n) - 1	O(log n)
Worst	Target absent or at a leaf	⌈log₂(n)⌉	O(log n)

Unlike linear search, best and worst case both simplify to the same class (ignoring the O(1) best case): O(log n). This makes binary search highly predictable in practice.

Space Complexity

Iterative implementation — O(1)

The iterative version uses exactly three mutable variables beyond its input: left, right, and mid. These are single integers — they do not grow with n. No data structures, no allocations, no recursion.

Space: O(1).

Recursive implementation — O(log n)

Each recursive call creates a new stack frame. In Rust, a stack frame for binary_search_recursive contains:

arr: a fat pointer (address + length) — 2 × 8 = 16 bytes on 64-bit systems
target: 4 bytes
left, right: 8 bytes each on 64-bit
The return address and saved registers: ~16-32 bytes
Total: roughly 50-60 bytes per frame

The maximum recursion depth is ⌈log₂(n)⌉. For n = 1,000,000, that is 20 frames × ~60 bytes = ~1,200 bytes. Small in absolute terms, but the growth is O(log n).

binary_search_recursive(arr, 23, 0, 9)      ← frame 1
  binary_search_recursive(arr, 23, 5, 9)    ← frame 2
    binary_search_recursive(arr, 23, 5, 6)  ← frame 3
      binary_search_recursive(arr, 23, 5, 5) ← frame 4 (found, returns)
      ↑ frame 4 pops
    ↑ frame 3 pops
  ↑ frame 2 pops
↑ frame 1 returns result

At any point in time, only one path from root to current node is live on the stack. That path has length at most log₂(n).

Space: O(log n).

Iterative vs recursive: the tradeoff in binary search

Property	Recursive	Iterative
Stack space	O(log n)	O(1)
Readability	Close to mathematical definition	Slightly more verbose
Tail-call optimization	Rust does not guarantee TCO	N/A
Risk of stack overflow	Negligible (log n is tiny)	None

For binary search specifically, the O(log n) stack depth is so small that stack overflow is essentially impossible. Even for n = 2⁶⁴ ≈ 1.8 × 10¹⁹ (more elements than atoms you could store), the maximum depth would be 64 frames. The iterative form is still preferred in production code — not for safety reasons, but for clarity and the principle that O(1) space is always at least as good as O(log n).

Rust note: Rust does not perform tail-call optimization (TCO). The recursive call binary_search_recursive(arr, target, mid + 1, right) is a tail call — it is the last operation in the function — but Rust does not transform it into a jump. Each call still allocates a new frame. If you want guaranteed O(1) stack space, write the iterative form.

When to Use Binary Search

Binary search is the right choice when:

Data is sorted — this is non-negotiable. Sorting first costs O(n log n); if you search k times, the total is O(n log n + k log n). Linear search costs O(kn). Binary search wins when k > n / log n (roughly).
The array is static — if data changes frequently, maintaining sort order has cost. Consider sorted containers (BTreeMap in Rust) or re-sorting as needed.
Exact match or boundary queries — "does X exist?" "what is the leftmost position ≥ X?" "count occurrences of X?" All binary search variants.
Searching answer spaces — binary search is not just for arrays. Any monotone predicate over an ordered domain (integers, floats, time) can be binary searched. "Find the smallest k such that f(k) is true" when f is monotone — binary search on k.

The standard library version: Rust's slice::binary_search returns Result<usize, usize>:

Ok(i) — target found at index i.
Err(i) — target not found; i is the index where it would be inserted to maintain sorted order.

fn main() {
    let arr = [2, 5, 8, 12, 16, 23, 38, 56, 72, 91];
 
    match arr.binary_search(&23) {
        Ok(i)  => println!("Found at index {}", i),   // Found at index 5
        Err(i) => println!("Not found, insert at {}", i),
    }
 
    match arr.binary_search(&24) {
        Ok(i)  => println!("Found at index {}", i),
        Err(i) => println!("Not found, insert at {}", i),  // Not found, insert at 6
    }
}

The insertion index returned on Err is directly useful for maintaining sorted order on insertion — no extra search needed.