
Elasticsearch Optimization and Index Splitting Strategies in the Youzan Search System

The Youzan search system applies Elasticsearch optimizations in its middleware layer: segment merging, larger index buffers, routing, and rollover reduce the number of index files and documents each query has to scan. Large indices are split into business-specific or hot-cold sub-indices, and asynchronous cross-datacenter replication with soft-delete versioning provides high availability.

Youzan Coder

Sep 14, 2018

The article discusses the architectural evolution of the Youzan search system and the need for middleware to meet increasing stability and performance requirements beyond basic Elasticsearch maintenance.

The article models Elasticsearch query cost as O(num_of_files * logN), where num_of_files is the number of index files (segments) a query touches and N is the amount of data to traverse. Two levers follow from this: (1) reduce the number of index files traversed and (2) reduce the total number of documents examined.
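To make the two levers concrete, the cost model can be evaluated for a few made-up cases. This is only a sketch of the formula as stated; the function name and all numbers are hypothetical, and the result is a unitless relative figure, not a latency prediction.

```python
import math


def relative_query_cost(num_of_files: int, n: int) -> float:
    """Unitless cost following the O(num_of_files * logN) model."""
    if num_of_files <= 0 or n <= 1:
        return 0.0
    return num_of_files * math.log2(n)


# Hypothetical case 1: same data volume, before and after merging segments.
before_merge = relative_query_cost(num_of_files=50, n=1_000_000)
after_merge = relative_query_cost(num_of_files=5, n=1_000_000)

# Hypothetical case 2: same segment count, data to traverse cut to a tenth.
smaller_n = relative_query_cost(num_of_files=50, n=100_000)

print(before_merge / after_merge)  # effect of 10x fewer files
print(before_merge / smaller_n)    # effect of 10x less data
```

Under this model the file count enters linearly and the data volume only logarithmically, which is one way to read why the article treats both reductions as separate goals.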

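To reduce the index file count, the article recommends using the optimize API (renamed _forcemerge in Elasticsearch 2.1) to force segment merges, and enlarging the indexing buffer and refresh_interval so that fewer small segments are created in the first place. Both techniques have trade-offs: a longer refresh interval delays the point at which new data becomes searchable, and forced merging mainly suits cold data sets, because an index that is still being written keeps producing new segments.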
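As a rough illustration of the two techniques, the requests below target the force merge endpoint (the current name of the optimize API) and the index settings endpoint. The helpers only build the HTTP method, path, and JSON body; the index names and the chosen values are hypothetical, and sending the request is left to whatever HTTP client is in use.

```python
from typing import Optional, Tuple

Request = Tuple[str, str, Optional[dict]]


def force_merge_request(index: str, max_num_segments: int = 1) -> Request:
    """Merge the segments of a (preferably cold, read-only) index."""
    path = f"/{index}/_forcemerge?max_num_segments={max_num_segments}"
    return ("POST", path, None)


def refresh_interval_request(index: str, interval: str = "30s") -> Request:
    """Refresh less often so that fewer small segments are created."""
    body = {"index": {"refresh_interval": interval}}
    return ("PUT", f"/{index}/_settings", body)


# Hypothetical index names: last month's index is cold, this month's is hot.
print(force_merge_request("orders_2018_08"))
print(refresh_interval_request("orders_2018_09", "30s"))
```

The size of the indexing buffer is a node-level setting (indices.memory.index_buffer_size) rather than a per-index one, so it is not covered by this sketch.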

To reduce the total document count, the article suggests minimizing document updates, using explicit _routing to direct queries to specific shards, and employing the rollover API for hot‑cold data isolation.

Because each update writes a new copy of the document and marks the old copy as deleted until a later merge removes it, reducing update frequency also lowers merge and refresh overhead.
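The effect of explicit routing can be sketched as follows. A routing value is hashed to a single primary shard, so a query that carries the same value only needs to visit that shard. The hash here is a stand-in (zlib.crc32), not the hash Elasticsearch uses internally, and the index name and routing value are hypothetical.

```python
import zlib
from typing import List, Optional


def shard_for(routing: str, num_primary_shards: int) -> int:
    """Map a routing value to one shard. crc32 is an illustrative stand-in."""
    return zlib.crc32(routing.encode("utf-8")) % num_primary_shards


def shards_to_search(num_primary_shards: int,
                     routing: Optional[str] = None) -> List[int]:
    """Without routing every shard is queried; with routing only one is."""
    if routing is None:
        return list(range(num_primary_shards))
    return [shard_for(routing, num_primary_shards)]


def search_path(index: str, routing: Optional[str] = None) -> str:
    """Search URL; 'routing' is the query-string parameter for routed search."""
    suffix = f"?routing={routing}" if routing is not None else ""
    return f"/{index}/_search{suffix}"


print(len(shards_to_search(5)))             # all shards
print(len(shards_to_search(5, "shop_42")))  # a single shard
print(search_path("goods", "shop_42"))
```

Documents must be indexed with the same routing value that queries later use, otherwise a routed query looks in the wrong shard.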

When a single large index stores massive data, the fixed number of shards can become a bottleneck. To improve horizontal scalability, the middleware performs index splitting: a large index is divided into multiple smaller indices based on business rules. Splitting is applied only when three conditions are met: (1) read/write operations always include a fixed condition, (2) the read/write dimension is unique, and (3) users do not need global search results. An example is shop‑specific product search, where each shop’s data can be routed to its own sub‑index.

Index splitting reduces the total document volume each query must scan, yielding noticeable performance gains. The article also describes a logical‑to‑physical re‑balancing step that hashes logical indices into a configurable number of physical indices, reducing data skew.
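A minimal sketch of the logical-to-physical mapping, assuming one logical index per shop and a stable hash to pick the physical index. The naming scheme, the use of SHA-256, and the slot count are assumptions made for illustration; the article does not specify them.

```python
import hashlib


def logical_index(base: str, shop_id: int) -> str:
    """One logical index per shop, e.g. 'goods_shop_42' (hypothetical naming)."""
    return f"{base}_shop_{shop_id}"


def physical_index(base: str, shop_id: int, num_physical: int) -> str:
    """Hash the logical index name into one of num_physical physical indices."""
    name = logical_index(base, shop_id)
    digest = hashlib.sha256(name.encode("utf-8")).hexdigest()
    slot = int(digest, 16) % num_physical
    return f"{base}_{slot}"


# Spread 1000 hypothetical shops over 8 physical indices and count them.
counts = {}
for shop_id in range(1000):
    target = physical_index("goods", shop_id, 8)
    counts[target] = counts.get(target, 0) + 1
print(sorted(counts.items()))
```

In this sketch several shops share one physical index, so queries would still need a filter on the shop identifier. A stable hash is used instead of Python's built-in hash() because the latter is randomized per process for strings.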

For scenarios where query dimensions are not unique, hot‑cold isolation is recommended. Time‑based data such as logs or orders can be split into hot and cold indices using Elasticsearch’s rollover API. The middleware abstracts the rollover logic so that users interact with a fixed logical index and only need to provide a time field.
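For reference, a rollover call takes a write alias and a set of conditions, and creates a new index once any condition is met. The helper below only builds the request; the alias name and threshold values are hypothetical, and how the Youzan middleware issues the call is not described in the article.

```python
from typing import Optional, Tuple


def rollover_request(write_alias: str,
                     max_age: str = "50d",
                     max_docs: Optional[int] = None) -> Tuple[str, str, dict]:
    """Build a rollover request for the index behind write_alias."""
    conditions = {"max_age": max_age}
    if max_docs is not None:
        conditions["max_docs"] = max_docs
    return ("POST", f"/{write_alias}/_rollover", {"conditions": conditions})


# Hypothetical alias that always points at the current hot index.
print(rollover_request("orders_write", max_age="50d", max_docs=100_000_000))
```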

Routing tables define time‑span based rules (e.g., early stages use a 50‑day span, later stages shorten to 10 days). Query time ranges are mapped to the appropriate sub‑indices, and write operations target the current active index. This approach supports multi‑dimensional queries while allowing flexible rule adjustments and fast deletion of expired data by dropping whole indices.
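One way such a routing table could work is sketched below: each rule has a start date and a span width, a timestamp maps to the bucket it falls into under the latest applicable rule, and a query range maps to every bucket it overlaps. The rule dates, the index naming scheme, and the base name "orders" are assumptions made for the example.

```python
from datetime import date, timedelta
from typing import List, NamedTuple


class SpanRule(NamedTuple):
    start: date     # first day the rule applies to
    span_days: int  # width of each sub-index in days


# Hypothetical table: 50-day spans at first, 10-day spans from 2018-04-11 on.
RULES = [SpanRule(date(2018, 1, 1), 50), SpanRule(date(2018, 4, 11), 10)]


def index_for(day: date, base: str = "orders", rules=RULES) -> str:
    """Sub-index holding documents whose time field falls on `day`."""
    applicable = [r for r in rules if r.start <= day]
    if not applicable:
        raise ValueError(f"no routing rule covers {day}")
    rule = max(applicable, key=lambda r: r.start)
    bucket = (day - rule.start).days // rule.span_days
    return f"{base}_{rule.start:%Y%m%d}_{bucket}"


def indices_for_range(first: date, last: date, base: str = "orders",
                      rules=RULES) -> List[str]:
    """All sub-indices a query over [first, last] has to search, in order."""
    names: List[str] = []
    day = first
    while day <= last:
        name = index_for(day, base, rules)
        if name not in names:
            names.append(name)
        day += timedelta(days=1)
    return names


print(index_for(date(2018, 4, 21)))  # where a write for that day would go
print(indices_for_range(date(2018, 4, 5), date(2018, 4, 22)))
```

Expired data is then removed by deleting whole sub-indices whose time span lies entirely in the past, rather than by deleting documents one at a time.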

High availability (HA) is addressed by cross-datacenter replication. Running a single Elasticsearch cluster directly across datacenters is avoided because of bandwidth concerns. Instead, replication is built asynchronously at the middleware level: for incremental sync the proxy publishes write operations to a message queue, and full copies use Elasticsearch's reindex API.

To maintain consistency during asynchronous replication, optimistic locking with version numbers is used. Physical deletions break version continuity, so the solution adopts soft deletes: delete operations are transformed into index operations containing a special flag, making the document invisible to normal searches while preserving version history.
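The interaction between versions and soft deletes can be shown with a small in-memory model of the replica side. This is not Elasticsearch code; the class, the "_deleted" flag name, and the message shape are hypothetical. A write is applied only if its version is newer than the stored one, and a delete is stored as a flagged document so its version stays available for later comparisons.

```python
from typing import Dict, List, Optional


class Replica:
    """In-memory stand-in for the replica index (illustrative only)."""

    def __init__(self) -> None:
        self.docs: Dict[str, dict] = {}

    def apply(self, doc_id: str, version: int, source: Optional[dict]) -> bool:
        """Apply one replicated operation; source=None means delete."""
        current = self.docs.get(doc_id)
        if current is not None and current["version"] >= version:
            return False  # stale or duplicate message, ignore it
        if source is None:
            body = {"_deleted": True}  # soft delete: keep a flagged document
        else:
            body = dict(source, _deleted=False)
        self.docs[doc_id] = {"version": version, "source": body}
        return True

    def search_visible(self) -> List[str]:
        """Normal searches filter out soft-deleted documents."""
        return sorted(doc_id for doc_id, doc in self.docs.items()
                      if not doc["source"]["_deleted"])


replica = Replica()
replica.apply("1", 1, {"title": "a"})         # create
replica.apply("1", 3, None)                   # delete arrives first
late = replica.apply("1", 2, {"title": "b"})  # older update arrives late
print(late, replica.search_visible())
```

With a physical delete, the replica would have no record of version 3 when the late version 2 update arrives, and the document would reappear. Keeping the flagged document lets the version check reject the stale message.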

An auxiliary data reconciliation system runs in real time to verify and repair master‑slave index consistency and to measure synchronization latency.
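The core comparison in such a reconciliation pass might look like the sketch below, which takes a document-id-to-version snapshot from each side and reports what needs repair. The function names, the snapshot format, and the lag calculation are assumptions; the article does not describe how the real system collects or compares its data.

```python
from typing import Dict, List


def reconcile(master: Dict[str, int], slave: Dict[str, int]) -> Dict[str, List[str]]:
    """Compare id -> version snapshots and list documents needing repair."""
    missing = sorted(k for k in master if k not in slave)
    stale = sorted(k for k in master if k in slave and slave[k] < master[k])
    extra = sorted(k for k in slave if k not in master)
    return {"missing": missing, "stale": stale, "extra": extra}


def sync_lag_seconds(master_write_ts: float, slave_apply_ts: float) -> float:
    """Replication latency for one document, never reported as negative."""
    return max(0.0, slave_apply_ts - master_write_ts)


# Hypothetical snapshots of the same key range on both sides.
master = {"1": 3, "2": 5, "3": 1}
slave = {"1": 3, "2": 4, "4": 1}
print(reconcile(master, slave))
print(sync_lag_seconds(100.0, 102.5))
```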

In summary, the article outlines the overall framework of the Youzan search system, covering index optimization, splitting, hot‑cold isolation, HA replication, and middleware‑level enhancements.

