Understanding GeoHash: Principles, Encoding Process, and Application in Ride‑Hailing
This article introduces the GeoHash algorithm, explains how latitude and longitude are recursively bisected into binary strings, compressed with Base32, and demonstrates its use for efficiently locating nearby drivers in ride‑hailing services while discussing precision trade‑offs and edge cases.
Background
When a passenger requests a ride, the system must quickly determine which drivers are geographically close. The naive approach stores each (longitude, latitude) pair in an array and scans the entire dataset, which becomes infeasible as the number of records grows.
The article explores how to optimize this lookup using the GeoHash algorithm.
GeoHash Basic Principles
GeoHash converts a coordinate into a binary string by repeatedly bisecting the longitude and latitude ranges. Each bisection yields a bit (1 if the value is greater than the midpoint, otherwise 0). The resulting bits for longitude and latitude are interleaved, forming a single binary sequence that is then encoded with Base32 for compact storage.
Example: for the coordinate (116.3111126, 40.085003), the longitude and latitude are each bisected 15 times, producing the binary strings 110100101011010 and 101110010000001 respectively. Interleaving these bits yields the combined binary string 111001110100100110001010001001 , which is split into 5‑bit groups and mapped to Base32 characters, resulting in a 6‑character GeoHash.
bit
left
mid
right
1
-180
0
180
1
0
90
180
0
90
135
180
1
90
112.5
135
0
112.5
123.75
135
0
112.5
118.125
123.75
1
112.5
115.3125
118.125
0
115.3125
116.71875
118.125
1
115.3125
116.015625
116.71875
0
116.015625
116.3671875
116.71875
1
116.015625
116.19140625
116.3671875
1
116.19140625
116.279296875
116.3671875
0
116.279296875
116.323242188
116.3671875
1
116.279296875
116.301269532
116.323242188
0
116.301269532
116.31225586
116.323242188
The same bisection process is applied to latitude, producing its own binary sequence (shown in the original article’s second table).
Applying GeoHash to Proximity Search
Because GeoHash strings share longer common prefixes when coordinates are closer, a simple prefix match can retrieve all points within a desired radius. This property enables fast identification of nearby drivers without scanning the entire dataset.
Remaining Challenges
GeoHash cells cover rectangular areas, so two points near the edge of adjacent cells may have different prefixes despite being physically closer than points sharing the same prefix. This edge‑case can lead to inaccurate proximity results, and further techniques are needed to mitigate it.
IT Architects Alliance
Discussion and exchange on system, internet, large‑scale distributed, high‑availability, and high‑performance architectures, as well as big data, machine learning, AI, and architecture adjustments with internet technologies. Includes real‑world large‑scale architecture case studies. Open to architects who have ideas and enjoy sharing.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.