
Evolution and Optimization of Numeric Indexing for Geolocation in Elasticsearch

This article reviews the evolution and optimization of Elasticsearch's numeric indexing for geolocation from 2015 to present, covering early string-based methods, KD‑Tree, Quadtree, and BKD‑tree implementations, and explains how these advances enable millisecond‑level POI searches using geo_distance queries.

Architecture Digest

Business Background

Location-based services (LBS) need fast "nearby POI" searches; Elasticsearch's geo_distance query answers them in milliseconds.

Background Knowledge

This section covers three basics: pinpointing a location by latitude and longitude, computing the distance between two points with the Haversine formula, and encoding coordinates as a compact, shareable string with Geohash.
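To make the Geohash idea concrete, here is a minimal sketch of the standard encoding: it repeatedly halves the longitude and latitude ranges, records one bit per halving, and emits a base-32 character for every five bits. The class and method names (GeohashSketch, encode) are illustrative assumptions, not part of Elasticsearch or Lucene.

```java
final class GeohashSketch {
    // Standard Geohash base-32 alphabet (digits plus letters, without a, i, l, o).
    private static final String BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz";

    /** Encodes a latitude/longitude pair (degrees) into a Geohash of the given length. */
    static String encode(double lat, double lon, int length) {
        double latLo = -90, latHi = 90, lonLo = -180, lonHi = 180;
        StringBuilder sb = new StringBuilder(length);
        boolean lonTurn = true; // bits alternate, starting with longitude
        int bits = 0;
        int current = 0;
        while (sb.length() < length) {
            if (lonTurn) {
                double mid = (lonLo + lonHi) / 2;
                if (lon >= mid) { current = (current << 1) | 1; lonLo = mid; }
                else            { current = current << 1;       lonHi = mid; }
            } else {
                double mid = (latLo + latHi) / 2;
                if (lat >= mid) { current = (current << 1) | 1; latLo = mid; }
                else            { current = current << 1;       latHi = mid; }
            }
            lonTurn = !lonTurn;
            if (++bits == 5) { // five bits make one base-32 character
                sb.append(BASE32.charAt(current));
                bits = 0;
                current = 0;
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // A longer hash is always an extension of the shorter one for the same point.
        System.out.println(encode(40.0, 116.0, 5));
        System.out.println(encode(40.0, 116.0, 8));
    }
}
```

Because each extra character only narrows the same cell further, two points whose hashes share a long prefix lie in the same small cell. The reverse does not always hold: points on opposite sides of a cell boundary can be close together yet share only a short prefix.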

GET /my_locations/_search
{
  "query": {
    "bool": {
      "must": { "match_all": {} },
      "filter": {
        "geo_distance": {
          "distance": "1km",
          "pin.location": { "lat": 40, "lon": 116 }
        }
      }
    }
  }
}
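The Haversine formula mentioned in this section can be sketched in a few lines. This is an illustrative implementation, not Elasticsearch's own code; the Earth radius constant is an assumed mean value, and the names HaversineSketch and distanceMeters are hypothetical.

```java
final class HaversineSketch {
    // Assumed mean Earth radius in meters; real implementations may use a slightly different value.
    static final double EARTH_RADIUS_M = 6_371_008.8;

    /** Great-circle distance in meters between two points given in degrees. */
    static double distanceMeters(double lat1, double lon1, double lat2, double lon2) {
        double phi1 = Math.toRadians(lat1);
        double phi2 = Math.toRadians(lat2);
        double sinHalfDLat = Math.sin(Math.toRadians(lat2 - lat1) / 2);
        double sinHalfDLon = Math.sin(Math.toRadians(lon2 - lon1) / 2);
        // a = sin^2(dLat/2) + cos(lat1) * cos(lat2) * sin^2(dLon/2)
        double a = sinHalfDLat * sinHalfDLat
                 + Math.cos(phi1) * Math.cos(phi2) * sinHalfDLon * sinHalfDLon;
        // Clamp to 1.0 to guard against rounding pushing sqrt(a) just above 1.
        return 2 * EARTH_RADIUS_M * Math.asin(Math.min(1.0, Math.sqrt(a)));
    }

    public static void main(String[] args) {
        // Distance from the query center (40, 116) to a point 0.01 degrees to the east.
        System.out.println(distanceMeters(40.0, 116.0, 40.0, 116.01));
    }
}
```

A point matches a 1km geo_distance filter when this distance to the query center is at most 1000 meters.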

Solution Evolution

Pre‑2.0 (String Simulation) – Elasticsearch relied on Lucene's inverted index, which stores terms rather than numbers, and simulated numeric range queries by matching term prefixes.
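The prefix idea can be illustrated with a simplified decimal sketch: values are zero-padded so that string order equals numeric order, and a range is covered by as few prefix terms as possible. This is an assumption-laden illustration, not Lucene's actual term encoding; the names PrefixRangeSketch, rangeToPrefixes, and matches are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class PrefixRangeSketch {
    /** Covers [lo, hi] (non-negative, at most `width` digits) with decimal prefix terms. */
    static List<String> rangeToPrefixes(int lo, int hi, int width) {
        List<String> terms = new ArrayList<>();
        long cur = lo;
        while (cur <= hi) {
            long block = 1; // number of values covered by the current term
            int dropped = 0; // trailing digits replaced by "any digit"
            // Grow the block while it stays aligned to a power of ten and inside the range.
            while (dropped < width && cur % (block * 10) == 0 && cur + block * 10 - 1 <= hi) {
                block *= 10;
                dropped++;
            }
            String padded = String.format("%0" + width + "d", cur);
            terms.add(padded.substring(0, width - dropped));
            cur += block;
        }
        return terms;
    }

    /** A value matches when its zero-padded form starts with any of the prefix terms. */
    static boolean matches(int value, List<String> terms, int width) {
        String padded = String.format("%0" + width + "d", value);
        for (String term : terms) {
            if (padded.startsWith(term)) return true;
        }
        return false;
    }

    public static void main(String[] args) {
        // The range 98..205 needs 9 terms instead of 108 exact values.
        System.out.println(rangeToPrefixes(98, 205, 3));
    }
}
```

The single term "1" stands for all values from 100 to 199, which is why a prefix scheme keeps the number of terms per range query small.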

Elasticsearch 2.0 – Implemented geo_distance with numeric range queries on separate lat and lon fields: it first computes a bounding rectangle around the query circle, then applies a Haversine filter to the candidates inside that rectangle.

public static DistanceBoundingCheck distanceBoundingCheck(double sourceLatitude, double sourceLongitude, double distance, DistanceUnit unit) { ... }
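The two-phase approach (cheap rectangle test first, exact distance second) can be sketched as follows. This is not the body of distanceBoundingCheck, which the source elides; it is a minimal illustration that assumes the circle touches neither a pole nor the antimeridian, and all names in it are hypothetical.

```java
final class BoundingBoxSketch {
    static final double EARTH_RADIUS_M = 6_371_008.8; // assumed mean Earth radius

    /** Returns {minLat, maxLat, minLon, maxLon} in degrees for a circle around (lat, lon). */
    static double[] boundingBox(double lat, double lon, double radiusMeters) {
        double angular = radiusMeters / EARTH_RADIUS_M; // radius as an angle in radians
        double dLat = Math.toDegrees(angular);
        // Longitude lines converge toward the poles, so the longitude span widens with latitude.
        double dLon = Math.toDegrees(Math.asin(Math.sin(angular) / Math.cos(Math.toRadians(lat))));
        return new double[] { lat - dLat, lat + dLat, lon - dLon, lon + dLon };
    }

    static double haversineMeters(double lat1, double lon1, double lat2, double lon2) {
        double s1 = Math.sin(Math.toRadians(lat2 - lat1) / 2);
        double s2 = Math.sin(Math.toRadians(lon2 - lon1) / 2);
        double a = s1 * s1 + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2)) * s2 * s2;
        return 2 * EARTH_RADIUS_M * Math.asin(Math.min(1.0, Math.sqrt(a)));
    }

    /** Phase 1: rectangle test (what a range query can do). Phase 2: exact distance check. */
    static boolean matches(double lat, double lon, double cLat, double cLon, double radiusMeters) {
        double[] box = boundingBox(cLat, cLon, radiusMeters);
        boolean inBox = lat >= box[0] && lat <= box[1] && lon >= box[2] && lon <= box[3];
        return inBox && haversineMeters(cLat, cLon, lat, lon) <= radiusMeters;
    }

    public static void main(String[] args) {
        double[] box = boundingBox(40.0, 116.0, 1000.0);
        System.out.println(java.util.Arrays.toString(box));
    }
}
```

Points near the corners of the rectangle pass the first phase but lie farther than the radius from the center, which is why the second phase is still required.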

Elasticsearch 2.2 – Added Quadtree‑based indexing, storing lat/lon as a single Morton‑encoded numeric field, enabling more efficient coarse filtering before precise distance checks.

double centerLon = 116.433322;
double centerLat = 39.900255;
double radiusMeters = 1000.0;
GeoRect geoRect = GeoUtils.circleToBBox(centerLon, centerLat, radiusMeters);
System.out.println(geoRect);
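The Morton encoding mentioned for 2.2 interleaves the bits of the quantized longitude and latitude into one number, so that a shared high-order prefix corresponds to a shared quadtree cell. The sketch below is illustrative only: the 32-bit quantization and the choice of which coordinate takes the even bit positions are assumptions, and MortonSketch is a hypothetical name rather than a Lucene class.

```java
final class MortonSketch {
    /** Maps value in [min, max] to an unsigned 32-bit integer held in a long. */
    static long quantize(double value, double min, double max) {
        long q = (long) ((value - min) / (max - min) * 4294967296.0); // scale by 2^32
        return Math.max(0L, Math.min(q, 0xFFFFFFFFL));
    }

    /** Interleaves two 32-bit values: lon bits on even positions, lat bits on odd positions. */
    static long interleave(long lonBits, long latBits) {
        long result = 0L;
        for (int i = 0; i < 32; i++) {
            result |= ((lonBits >>> i) & 1L) << (2 * i);
            result |= ((latBits >>> i) & 1L) << (2 * i + 1);
        }
        return result;
    }

    static long encode(double lat, double lon) {
        return interleave(quantize(lon, -180.0, 180.0), quantize(lat, -90.0, 90.0));
    }

    public static void main(String[] args) {
        long a = encode(39.900255, 116.433322);
        long b = encode(39.9003, 116.4334);
        // The number of shared leading bits indicates how deep the common quadtree cell is.
        System.out.println(Long.numberOfLeadingZeros(a ^ b));
    }
}
```

Each pair of bits in the result selects one of four quadrants at the next level of the quadtree, so a range over the encoded value corresponds to a set of cells that can be filtered coarsely before the exact distance check.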

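Elasticsearch 5.0+ – Switched to the BKD‑tree (block k‑d tree, a k‑d tree variant whose leaves are disk‑friendly blocks of points) for numeric and geo indexing, reducing memory usage and improving query speed. A query intersects its rectangle with BKD‑tree cells: cells entirely inside the rectangle are accepted wholesale, cells entirely outside are skipped, and only cells that cross the boundary have their points checked one by one.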

// Core query class
public class GeoPointDistanceQuery extends Query { ... }
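The cell-versus-query logic described for the BKD‑tree can be sketched with a toy in-memory tree. This is a simplified illustration under stated assumptions, not Lucene's implementation: the real structure is built in bulk and stored on disk, the leaf size here is tiny on purpose, and every name (BkdSketch, Node, relate, count) is hypothetical. The three relation names are loosely modeled on the inside/outside/crosses distinction.

```java
import java.util.Arrays;

final class BkdSketch {
    enum Relation { CELL_INSIDE_QUERY, CELL_OUTSIDE_QUERY, CELL_CROSSES_QUERY }

    static final int LEAF_SIZE = 4; // assumed, tiny for illustration

    /** Node of a toy block k-d tree; points are stored as {lat, lon}. */
    static final class Node {
        double minLat = Double.POSITIVE_INFINITY, maxLat = Double.NEGATIVE_INFINITY;
        double minLon = Double.POSITIVE_INFINITY, maxLon = Double.NEGATIVE_INFINITY;
        int count;        // number of points under this node
        double[][] block; // non-null only for leaves
        Node left, right;
    }

    /** Builds the tree over pts[from, to), splitting at the median and alternating dimensions. */
    static Node build(double[][] pts, int from, int to, int depth) {
        Node n = new Node();
        n.count = to - from;
        for (int i = from; i < to; i++) {
            n.minLat = Math.min(n.minLat, pts[i][0]);
            n.maxLat = Math.max(n.maxLat, pts[i][0]);
            n.minLon = Math.min(n.minLon, pts[i][1]);
            n.maxLon = Math.max(n.maxLon, pts[i][1]);
        }
        if (n.count <= LEAF_SIZE) {
            n.block = Arrays.copyOfRange(pts, from, to);
            return n;
        }
        final int dim = depth % 2;
        Arrays.sort(pts, from, to, (double[] a, double[] b) -> Double.compare(a[dim], b[dim]));
        int mid = (from + to) >>> 1;
        n.left = build(pts, from, mid, depth + 1);
        n.right = build(pts, mid, to, depth + 1);
        return n;
    }

    /** Query q is {minLat, maxLat, minLon, maxLon}. */
    static Relation relate(Node n, double[] q) {
        if (n.maxLat < q[0] || n.minLat > q[1] || n.maxLon < q[2] || n.minLon > q[3]) {
            return Relation.CELL_OUTSIDE_QUERY;
        }
        if (n.minLat >= q[0] && n.maxLat <= q[1] && n.minLon >= q[2] && n.maxLon <= q[3]) {
            return Relation.CELL_INSIDE_QUERY;
        }
        return Relation.CELL_CROSSES_QUERY;
    }

    /** Counts points inside the query rectangle, skipping or accepting whole cells when possible. */
    static int count(Node n, double[] q) {
        switch (relate(n, q)) {
            case CELL_OUTSIDE_QUERY:
                return 0;       // exclude the whole subtree
            case CELL_INSIDE_QUERY:
                return n.count; // include the whole subtree without visiting points
            default:
                if (n.block != null) {
                    int hits = 0;
                    for (double[] p : n.block) {
                        if (p[0] >= q[0] && p[0] <= q[1] && p[1] >= q[2] && p[1] <= q[3]) hits++;
                    }
                    return hits;
                }
                return count(n.left, q) + count(n.right, q);
        }
    }

    public static void main(String[] args) {
        double[][] pts = new double[100][];
        for (int i = 0; i < 100; i++) pts[i] = new double[] { i / 10, i % 10 };
        Node root = build(pts, 0, pts.length, 0);
        System.out.println(count(root, new double[] { 2, 5, 3, 7 }));
    }
}
```

Only leaves whose cell crosses the query boundary are scanned point by point; everything else is resolved at the cell level, which is where the speedup over per-point filtering comes from.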

The article concludes that these indexing advances have transformed Elasticsearch from a pure full‑text engine into a versatile analytics platform capable of handling high‑performance geospatial queries, and hints at future directions such as R‑Tree support for shape indexing.

References

https://www.elastic.co/cn/blog/lucene-points-6.0

https://www.cs.cmu.edu/~ckingsf/bioinfo-lectures/kdtrees.pdf

https://www.csee.usf.edu/~tuy/Literature/KDtree-CACM75.pdf
