Binary Search: Fundamentals, Common Implementations, and Applications
Binary search is an efficient O(log n) algorithm for locating elements in sorted arrays or intervals, and this article explains its basic principles, recursive and iterative implementations, handling of edge cases, applications such as log file indexing, and extensions to rotated arrays and variable‑length data.
Basic Introduction
Binary search (also known as half‑interval search) is a search algorithm used to find a specific element in a sorted array.
By definition, binary search requires the array (or interval) to be sorted; the input can also be given as a start and end index of a range.
"Its time complexity is O(log n), very efficient."
Basic Characteristics
The drawback is that the array or interval to be searched must be sorted.
If the array undergoes frequent insertions and deletions, the average complexity degrades to O(n).
Therefore, when the input array or interval is sorted and does not change frequently, and we need to locate an element that satisfies a condition, binary search is the optimal choice.
Problem‑Solving Idea
The general binary‑search solving steps are as follows:
Take the middle element of the sorted array or interval and check whether it satisfies the search condition; if it does, stop the search.
If the middle element does not satisfy the condition, decide which half (left or right) to continue searching based on the ordering of the array.
Repeat the process on the chosen half until the target is found or the interval becomes empty.
Binary Search Framework
int binarySearch(int[] nums, int target) {
int left = 0, right = ...;
while(...) {
// calculate mid safely to avoid overflow: mid = left + (right - left) / 2
int mid = (right + left) / 2;
if (nums[mid] == target) {
...
} else if (nums[mid] < target) {
left = ...
} else if (nums[mid] > target) {
right = ...
}
}
return ...;
}Common Solutions
Recursive Solution
Assume we want to check whether the number 7 exists in the sorted array {1, 3, 4, 6, 7, 8, 10, 13, 14}.
The recursive template is as follows:
int binarySearch(int[] nums, int target, int low, int high) {
// If the interval is illegal, return -1
if (low > high) {
return -1;
}
// Compute the middle index safely
int middle = low + (high - low) / 2;
// If the middle element is the target, return its index
if (nums[middle] == target) {
return middle;
}
// If target is smaller, search the left half
if (target < nums[middle]) {
return binarySearch(nums, target, low, middle - 1);
} else {
// Otherwise search the right half
return binarySearch(nums, target, middle + 1, high);
}
}Note:
When calculating [low, middle - 1] and [middle + 1, high] you must use closed intervals because the middle element has already been examined.
For an odd‑length array the middle is the exact center; for an even‑length array the middle is the left‑center element (computed with low + (high - low) / 2 ).
The final time complexity remains O(log n).
Iterative (Non‑Recursive) Solution
The iterative template is as follows:
int binarySearch(int[] nums, int target, int low, int high) {
while (low <= high) {
int middle = low + (high - low) / 2;
if (nums[middle] == target) {
return middle;
}
if (target < nums[middle]) {
high = middle - 1;
} else {
low = middle + 1;
}
}
// Target not found
return -1;
}Why the while condition uses <= instead of <?
Answer: because high is initialized to nums.length - 1 , the index of the last element, not to nums.length . Using [left, right] (both ends inclusive) matches the valid index range, whereas [left, right) would make right out of bounds.
Practical Application
Kafka is a widely used distributed message queue. All messages are stored as logs, and each topic consists of multiple partitions, each of which is an ordered sequence of log files.
To locate a message by its offset efficiently, an index file is built that maps offsets to file positions; binary search can then be used on the index because offsets are ordered.
00000000000000000000.log
00000000000000000000.index
00000000000000000000.timeindex
00000000000000000035.log
00000000000000000035.index
00000000000000000035.timeindex.log stores the actual message bodies.
.index stores positions for fast lookup.
.timeindex stores positions keyed by timestamps.
Example Analyses
Finding Exact Boundaries
LeetCode 34 asks for the first and last positions of a target value in a sorted array. The first occurrence is the lower bound, the last occurrence is the upper bound.
int searchLowerBound(int[] nums, int target, int low, int high) {
if (low > high) return -1;
int middle = low + (high - low) / 2;
if (nums[middle] == target && (middle == 0 || nums[middle - 1] < target)) {
return middle;
}
if (target <= nums[middle]) {
return searchLowerBound(nums, target, low, middle - 1);
} else {
return searchLowerBound(nums, target, middle + 1, high);
}
}
int searchUpperBound(int[] nums, int target, int low, int high) {
if (low > high) return -1;
int middle = low + (high - low) / 2;
if (nums[middle] == target && (middle == nums.length - 1 || nums[middle + 1] > target)) {
return middle;
}
if (target < nums[middle]) {
return searchUpperBound(nums, target, low, middle - 1);
} else {
return searchUpperBound(nums, target, middle + 1, high);
}
}Finding a Fuzzy Boundary
To find the first element greater than a given value, the condition is that the element is larger than the target and either it is the first element or the previous element is ≤ the target.
Integer firstGreaterThan(int[] nums, int target, int low, int high) {
if (low > high) return null;
int middle = low + (high - low) / 2;
if (nums[middle] > target && (middle == 0 || nums[middle - 1] <= target)) {
return middle;
}
if (target < nums[middle]) {
return firstGreaterThan(nums, target, low, middle - 1);
} else {
return firstGreaterThan(nums, target, middle + 1, high);
}
}Searching in a Rotated Sorted Array
LeetCode 33 requires searching a target in a rotated sorted array without duplicates.
int binarySearch(int[] nums, int target, int low, int high) {
if (low > high) return -1;
int middle = low + (high - low) / 2;
if (nums[middle] == target) return middle;
if (nums[low] <= nums[middle]) { // left half is sorted
if (nums[low] <= target && target < nums[middle]) {
return binarySearch(nums, target, low, middle - 1);
}
return binarySearch(nums, target, middle + 1, high);
} else { // right half is sorted
if (nums[middle] < target && target <= nums[high]) {
return binarySearch(nums, target, middle + 1, high);
}
return binarySearch(nums, target, low, middle - 1);
}
}Searching an Unknown‑Length Log
When the length of a log file is unknown, we can first exponentially increase an upper bound until a null entry is encountered, then perform a standard binary search within that range.
// Find an upper bound where logs[high] becomes null
int getUpperBound(String[] logs, int high) {
if (logs[high] == null) {
return high;
}
return getUpperBound(logs, high * 2);
}
// Binary search for the first null entry (i.e., the log length)
int binarySearch(String[] logs, int low, int high) {
if (low > high) return -1;
int middle = low + (high - low) / 2;
if (logs[middle] == null && logs[middle - 1] != null) {
return middle;
}
if (logs[middle] == null) {
return binarySearch(logs, low, middle - 1);
} else {
return binarySearch(logs, middle + 1, high);
}
}Conclusion
The article content has been collected on a personal website for easy reading: http://hardyfish.top/
If you found it useful, please like, view, and share it. Thank you!
Rare Earth Juejin Tech Community
Juejin, a tech community that helps developers grow.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.