How We Scaled JD’s UGC Platform with Elasticsearch: A Backend Architecture Deep Dive

This case study describes how JD’s "Browse" UGC project, delivered quickly through agile development, hit a query‑performance bottleneck as its data grew, and how the team restored fast, flexible search for both front‑end and operations users by introducing Elasticsearch, redesigning the query flow, and refactoring the storage component.

JD Retail Technology

Background

JD’s "Browse" project is a user‑generated content (UGC) platform launched last year. The team used agile development to deliver features quickly, meeting initial deadlines and ensuring stable online operation.

Emerging Problems

As user‑generated reviews grew from hundreds of thousands to millions, the associated images, videos, and product data reached tens of millions of records. The operations side began to see slow queries and failed keyword searches, because JED, the MySQL‑based elastic database, kills long‑running queries.

These issues were anticipated in the original design, but data growth far outpaced expectations, prompting a systematic optimization effort.

Current State Analysis

Data storage uses JED as the primary database, Hive for T+1 offline analysis, and JIMDB as a cache. Front‑end queries (by primary key, user PIN, or fixed attributes) work well with simple composite indexes, handling tens of millions of records.

However, the operations backend needs queries on arbitrary attribute combinations, joins, and fuzzy keyword searches. MySQL‑based JED struggles with such queries at this scale, and ShardingSphere’s sharding restricts which complex SQL statements are supported, so many of the product side’s query requirements could not be met.
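One way to see why sharding gets in the way: a sharded store can route a query to a single shard only when the query contains the sharding key. The sketch below assumes hash sharding on user PIN across four shards. The key, the shard count, and the function names are illustrative assumptions, not details of the JED or ShardingSphere setup described here.

```python
import zlib

SHARD_COUNT = 4  # assumed for illustration; the real shard count is not stated


def shard_for(user_pin):
    """Map a user PIN to a shard with a stable hash (crc32, not Python's salted hash())."""
    return zlib.crc32(user_pin.encode("utf-8")) % SHARD_COUNT


def shards_to_query(params):
    """Return the list of shards a query has to visit.

    With the sharding key present, one shard is enough. Without it, every
    shard must be queried and the partial results merged afterwards.
    """
    if params.get("user_pin"):
        return [shard_for(params["user_pin"])]
    return list(range(SHARD_COUNT))


front_end = shards_to_query({"user_pin": "alice"})   # a single shard
operations = shards_to_query({"keyword": "phone"})   # every shard
```

Under these assumptions, front‑end lookups by PIN stay on one shard, while an operations query on arbitrary attributes fans out to all of them. That fan‑out is typically where cross‑shard sorting, paging, and joins become expensive or unsupported.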

System Architecture

The project is divided into three subsystems:

SOA service layer – provides business‑related services.

Common capability component – offers storage and query services via JSF interfaces and JMQ messages.

CMS operations management system – gives operators management, review, and query capabilities.

The bottleneck resides in the database query stage: MySQL queries time out or run slowly, causing the CMS backend to fail.

Optimization Plan

After evaluating alternatives, the team introduced Elasticsearch as a dedicated query engine to satisfy keyword and multi‑condition searches, and to overcome JED’s limitations with sharding and complex SQL.

Key changes to the write path:

Modify the existing write API: after a successful JED insert, also write the record to Elasticsearch (a failed ES write does not block the transaction).

Add a scheduled task to sync yesterday’s incremental data to ES, ensuring eventual consistency.

After the API change, bulk‑migrate existing data from JED to ES to avoid gaps.
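The write‑path changes above can be sketched roughly as follows. This is a minimal illustration rather than the team's implementation: `InMemoryStore` is a hypothetical stand‑in for both JED and Elasticsearch, and the record fields (`id`, `created`, `text`) and function names are assumed.

```python
import logging
from datetime import date, timedelta

logger = logging.getLogger("ugc.write")


class InMemoryStore:
    """Hypothetical stand-in for JED or Elasticsearch: a dict keyed by record id."""

    def __init__(self, fail_writes=False):
        self.rows = {}
        self.fail_writes = fail_writes

    def upsert(self, record):
        if self.fail_writes:
            raise ConnectionError("store unavailable")
        self.rows[record["id"]] = dict(record)


def save_content(record, jed, es):
    """Write to JED first, then best-effort to ES.

    A JED failure propagates to the caller. An ES failure is only logged,
    because the scheduled sync is expected to repair the gap later.
    Returns True if the record also reached ES.
    """
    jed.upsert(record)
    try:
        es.upsert(record)
    except Exception as exc:
        logger.warning("ES write failed for id=%s: %s", record["id"], exc)
        return False
    return True


def sync_day(jed, es, day):
    """Re-index every JED row created on `day`; returns the number of rows copied.

    Upserts are idempotent, so rows that already reached ES are overwritten
    with identical content.
    """
    copied = 0
    for row in list(jed.rows.values()):
        if row["created"] == day:
            es.upsert(row)
            copied += 1
    return copied


# Demo: ES is down during the write, then the daily job closes the gap.
yesterday = date(2024, 1, 2) - timedelta(days=1)  # fixed date for a repeatable demo
jed, es = InMemoryStore(), InMemoryStore(fail_writes=True)
indexed = save_content({"id": 1, "created": yesterday, "text": "great phone"}, jed, es)
es.fail_writes = False
copied = sync_day(jed, es, yesterday)
```

Because the sync uses idempotent upserts, the same loop without the date filter could also serve as the one‑off bulk migration of existing data.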

The design treats ES as a query engine rather than a primary data store; strong consistency between the relational DB and ES is not required because the platform does not involve transactional flows.

Re‑architected Query Flow

Originally, the CMS backend sent query parameters to the common component, which built a dynamic SQL statement for MyBatis to execute against MySQL.
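To make the original flow concrete, here is a small sketch of building a parameterized statement from optional query parameters. It is written in Python against an in‑memory SQLite table instead of MyBatis and MySQL, and the table name, column names, and `build_sql` function are assumptions made for illustration.

```python
import sqlite3


def build_sql(params):
    """Build a parameterized SELECT from whichever filters are present."""
    clauses, args = [], []
    if params.get("status") is not None:
        clauses.append("status = ?")
        args.append(params["status"])
    if params.get("user_pin"):
        clauses.append("user_pin = ?")
        args.append(params["user_pin"])
    if params.get("keyword"):
        # The fuzzy keyword search becomes a LIKE with a leading wildcard.
        clauses.append("content LIKE ?")
        args.append("%" + params["keyword"] + "%")
    where = " AND ".join(clauses) if clauses else "1 = 1"
    return "SELECT id FROM ugc_content WHERE " + where + " ORDER BY id", args


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ugc_content "
    "(id INTEGER PRIMARY KEY, user_pin TEXT, status INTEGER, content TEXT)"
)
conn.executemany(
    "INSERT INTO ugc_content VALUES (?, ?, ?, ?)",
    [
        (1, "alice", 1, "great phone, fast delivery"),
        (2, "bob", 1, "battery drains quickly"),
        (3, "alice", 0, "phone arrived damaged"),
    ],
)

sql, args = build_sql({"status": 1, "keyword": "phone"})
ids = [row[0] for row in conn.execute(sql, args)]
```

The keyword branch is the weak spot: in MySQL, a `LIKE` pattern that starts with a wildcard cannot use an ordinary B‑tree index, so the cost of such a search grows with the size of the table.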

The new flow adds an ES query converter alongside the existing MySQL converter. The original query parameters remain unchanged; the converter translates them into Elasticsearch DSL using the High‑Level REST client, employing filter, must, and must_not clauses. A switch allows operators to choose between MySQL and ES at runtime.
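A converter of this kind might look roughly like the sketch below. The article's implementation uses the Java High‑Level REST client; this version instead builds the equivalent JSON request body as plain Python dicts. The parameter names (`status`, `user_pin`, `keyword`, `exclude_tags`), the field names, and the `search` routing function are all assumptions made for illustration.

```python
def to_es_query(params):
    """Translate the unchanged CMS query parameters into an Elasticsearch bool query."""
    filters, must, must_not = [], [], []

    # Exact-match conditions go into `filter`: no scoring, and cacheable by ES.
    if params.get("status") is not None:
        filters.append({"term": {"status": params["status"]}})
    if params.get("user_pin"):
        filters.append({"term": {"user_pin": params["user_pin"]}})

    # Full-text keyword search goes into `must`.
    if params.get("keyword"):
        must.append({"match": {"content": params["keyword"]}})

    # Exclusions go into `must_not`.
    for tag in params.get("exclude_tags", []):
        must_not.append({"term": {"tags": tag}})

    bool_query = {}
    if filters:
        bool_query["filter"] = filters
    if must:
        bool_query["must"] = must
    if must_not:
        bool_query["must_not"] = must_not

    if not bool_query:
        return {"query": {"match_all": {}}}
    return {"query": {"bool": bool_query}}


def search(params, use_es, es_backend, mysql_backend):
    """Route the same parameters to ES or MySQL depending on the runtime switch."""
    if use_es:
        return es_backend(to_es_query(params))
    return mysql_backend(params)


body = to_es_query({"status": 1, "keyword": "phone", "exclude_tags": ["spam"]})
```

Keeping the input parameters identical for both converters is what makes the switch cheap: callers do not change, and only the routing decision inside `search` differs.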

Results and Reflection

After the migration, keyword searches no longer depended on MySQL’s weak fuzzy‑search performance: queries became dramatically faster and the CMS failures were resolved. Elasticsearch also covered the join and complex‑SQL scenarios that were previously unsupported, laying a solid foundation for future scaling.

Reflection: Rapid delivery often forces architectural compromises, but core designs, such as the query‑parameter and converter framework, must be carefully crafted. Anticipating data growth and incorporating scalable search solutions early can prevent costly re‑engineering later.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.
