Fundamentals 18 min read

Binary Search: Fundamentals, Common Implementations, and Applications

Binary search is an efficient O(log n) algorithm for locating elements in sorted arrays or intervals, and this article explains its basic principles, recursive and iterative implementations, handling of edge cases, applications such as log file indexing, and extensions to rotated arrays and variable‑length data.

Rare Earth Juejin Tech Community

Feb 29, 2024

Binary Search: Fundamentals, Common Implementations, and Applications

Basic Introduction

Binary search (also known as half‑interval search) is a search algorithm used to find a specific element in a sorted array.

By definition, binary search requires the array (or interval) to be sorted; the input can also be given as a start and end index of a range.

"Its time complexity is O(log n), very efficient."

Basic Characteristics

The drawback is that the array or interval to be searched must be sorted.

If the array undergoes frequent insertions and deletions, the average complexity degrades to O(n).

Therefore, when the input array or interval is sorted and does not change frequently, and we need to locate an element that satisfies a condition, binary search is the optimal choice.

Problem‑Solving Idea

The general binary‑search solving steps are as follows:

Take the middle element of the sorted array or interval and check whether it satisfies the search condition; if it does, stop the search.

If the middle element does not satisfy the condition, decide which half (left or right) to continue searching based on the ordering of the array.

Repeat the process on the chosen half until the target is found or the interval becomes empty.

Binary Search Framework

int binarySearch(int[] nums, int target) {
    int left = 0, right = ...;
    while(...) {
        // calculate mid safely to avoid overflow: mid = left + (right - left) / 2
        int mid = (right + left) / 2;
        if (nums[mid] == target) {
            ...
        } else if (nums[mid] < target) {
            left = ...
        } else if (nums[mid] > target) {
            right = ...
        }
    }
    return ...;
}

Common Solutions

Recursive Solution

Assume we want to check whether the number 7 exists in the sorted array {1, 3, 4, 6, 7, 8, 10, 13, 14}.

The recursive template is as follows:

int binarySearch(int[] nums, int target, int low, int high) {
    // If the interval is illegal, return -1
    if (low > high) {
        return -1;
    }
    // Compute the middle index safely
    int middle = low + (high - low) / 2;
    // If the middle element is the target, return its index
    if (nums[middle] == target) {
        return middle;
    }
    // If target is smaller, search the left half
    if (target < nums[middle]) {
        return binarySearch(nums, target, low, middle - 1);
    } else {
        // Otherwise search the right half
        return binarySearch(nums, target, middle + 1, high);
    }
}

Note:

When calculating [low, middle - 1] and [middle + 1, high] you must use closed intervals because the middle element has already been examined.

For an odd‑length array the middle is the exact center; for an even‑length array the middle is the left‑center element (computed with low + (high - low) / 2).

The final time complexity remains O(log n).

Iterative (Non‑Recursive) Solution

The iterative template is as follows:

int binarySearch(int[] nums, int target, int low, int high) {
    while (low <= high) {
        int middle = low + (high - low) / 2;
        if (nums[middle] == target) {
            return middle;
        }
        if (target < nums[middle]) {
            high = middle - 1;
        } else {
            low = middle + 1;
        }
    }
    // Target not found
    return -1;
}

Why the while condition uses <= instead of <?

Answer: because high is initialized to nums.length - 1, the index of the last element, not to nums.length. Using [left, right] (both ends inclusive) matches the valid index range, whereas [left, right) would make right out of bounds.

Practical Application

Kafka is a widely used distributed message queue. All messages are stored as logs, and each topic consists of multiple partitions, each of which is an ordered sequence of log files.

To locate a message by its offset efficiently, an index file is built that maps offsets to file positions; binary search can then be used on the index because offsets are ordered.

00000000000000000000.log
00000000000000000000.index
00000000000000000000.timeindex
00000000000000000035.log
00000000000000000035.index
00000000000000000035.timeindex

.log

stores the actual message bodies. .index stores positions for fast lookup. .timeindex stores positions keyed by timestamps.

Example Analyses

Finding Exact Boundaries

LeetCode 34 asks for the first and last positions of a target value in a sorted array. The first occurrence is the lower bound, the last occurrence is the upper bound.

int searchLowerBound(int[] nums, int target, int low, int high) {
    if (low > high) return -1;
    int middle = low + (high - low) / 2;
    if (nums[middle] == target && (middle == 0 || nums[middle - 1] < target)) {
        return middle;
    }
    if (target <= nums[middle]) {
        return searchLowerBound(nums, target, low, middle - 1);
    } else {
        return searchLowerBound(nums, target, middle + 1, high);
    }
}

int searchUpperBound(int[] nums, int target, int low, int high) {
    if (low > high) return -1;
    int middle = low + (high - low) / 2;
    if (nums[middle] == target && (middle == nums.length - 1 || nums[middle + 1] > target)) {
        return middle;
    }
    if (target < nums[middle]) {
        return searchUpperBound(nums, target, low, middle - 1);
    } else {
        return searchUpperBound(nums, target, middle + 1, high);
    }
}

Finding a Fuzzy Boundary

To find the first element greater than a given value, the condition is that the element is larger than the target and either it is the first element or the previous element is ≤ the target.

Integer firstGreaterThan(int[] nums, int target, int low, int high) {
    if (low > high) return null;
    int middle = low + (high - low) / 2;
    if (nums[middle] > target && (middle == 0 || nums[middle - 1] <= target)) {
        return middle;
    }
    if (target < nums[middle]) {
        return firstGreaterThan(nums, target, low, middle - 1);
    } else {
        return firstGreaterThan(nums, target, middle + 1, high);
    }
}

Searching in a Rotated Sorted Array

LeetCode 33 requires searching a target in a rotated sorted array without duplicates.

int binarySearch(int[] nums, int target, int low, int high) {
    if (low > high) return -1;
    int middle = low + (high - low) / 2;
    if (nums[middle] == target) return middle;
    if (nums[low] <= nums[middle]) { // left half is sorted
        if (nums[low] <= target && target < nums[middle]) {
            return binarySearch(nums, target, low, middle - 1);
        }
        return binarySearch(nums, target, middle + 1, high);
    } else { // right half is sorted
        if (nums[middle] < target && target <= nums[high]) {
            return binarySearch(nums, target, middle + 1, high);
        }
        return binarySearch(nums, target, low, middle - 1);
    }
}

Searching an Unknown‑Length Log

When the length of a log file is unknown, we can first exponentially increase an upper bound until a null entry is encountered, then perform a standard binary search within that range.

// Find an upper bound where logs[high] becomes null
int getUpperBound(String[] logs, int high) {
    if (logs[high] == null) {
        return high;
    }
    return getUpperBound(logs, high * 2);
}

// Binary search for the first null entry (i.e., the log length)
int binarySearch(String[] logs, int low, int high) {
    if (low > high) return -1;
    int middle = low + (high - low) / 2;
    if (logs[middle] == null && logs[middle - 1] != null) {
        return middle;
    }
    if (logs[middle] == null) {
        return binarySearch(logs, low, middle - 1);
    } else {
        return binarySearch(logs, middle + 1, high);
    }
}

Conclusion

The article content has been collected on a personal website for easy reading: http://hardyfish.top/

If you found it useful, please like, view, and share it. Thank you!

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.

algorithm Recursion Binary Search Search iteration logarithmic complexity sorted-array

Written by

Rare Earth Juejin Tech Community

Juejin, a tech community that helps developers grow.

0 followers

Reader feedback

How this landed with the community

Rate this article

Was this worth your time?

Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.