
Optimizing Query Performance for JD's BIP Procurement System with JED, JimKV, and Elasticsearch

This article details how JD's BIP procurement system tackled severe query-performance challenges. By segmenting order data, leveraging the JED distributed MySQL solution, introducing JimKV for hot-data caching, and offloading complex searches to Elasticsearch, the team dramatically reduced database load and sped up user-facing queries.

JD Tech

Through decades of growth, JD's BIP procurement system has reached unprecedented scale, exposing frequent database query-performance issues. This article presents a real-world optimization case aimed at improving user experience and reducing technical risk.

System Overview

The BIP system supports various order creation, approval, and feedback functions. It relies on JED (a MySQL‑based distributed database) and JimKV/JimDB (a distributed KV store) for storage.

Root Causes

Complex queries with many filter conditions that are difficult to change at the interface level.

Massive data volume due to continuous business growth.

Data model spanning 20+ tables, requiring extensive multi‑table joins.

Business & Technical Pain Points

Slow order‑list queries and exports, especially under data skew.

Inefficient query conditions (e.g., archive flag, warehouse receipt flag).

Numerous slow SQLs, large tables causing DDL bottlenecks, and scattered index fields.

Goals

Business: improve core query/export experience and streamline query conditions.

Technical: reduce slow SQLs, lower table sizes, and simplify the data model.

Solution Overview

1. Reduce query data volume by isolating "in‑stock" orders (≈8% of total) into a separate "in‑stock order" concept and storing them in JimKV for fast hot‑data access.

2. Enhance complex query capability by synchronizing non‑in‑stock orders to Elasticsearch, partitioning indices by month, and routing queries based on order ID and time range.

3. Implement a dynamic query routing layer that directs requests to the appropriate storage (JED or ES) based on user, latency requirements, and data scope.

4. Deploy ES master‑slave failover via DUCC switches and integrate order‑delay monitoring using existing business metrics.
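As a rough illustration of the routing layer in step 3, the decision logic might look like the sketch below. The `QueryRequest` shape, backend names, and the 31-day threshold are illustrative assumptions, not JD's actual code.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional, Set

# Hypothetical sketch of the dynamic query routing described above.
# Field names and thresholds are assumptions for illustration only.

@dataclass
class QueryRequest:
    user_id: str
    order_id: Optional[str]    # exact order-ID lookup if set
    date_from: Optional[date]
    date_to: Optional[date]
    in_stock_only: bool

def route(req: QueryRequest, skewed_users: Set[str]) -> str:
    """Pick a storage backend for a query, per the article's strategy."""
    if req.in_stock_only:
        return "JimKV"   # hot in-stock orders are cached in the KV store
    if req.order_id:
        return "JED"     # point lookup by order ID stays on distributed MySQL
    # Wide time ranges and known data-skewed users go to Elasticsearch,
    # whose monthly indices handle large scans better.
    wide_range = (req.date_from and req.date_to
                  and (req.date_to - req.date_from) > timedelta(days=31))
    if wide_range or req.user_id in skewed_users:
        return "ES"
    return "JED"
```

In this sketch the router prefers JED for cheap point lookups and narrow scans, and escalates to ES only when the query is likely to be expensive.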

Implementation Details

Mark in‑stock orders during creation; back‑fill historical data via offline tables.

Binlog listeners publish changes to JMQ, partitioned by order ID to avoid hot partitions.
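The partitioning idea can be sketched as follows: derive the JMQ partition from a hash of the order ID, so all binlog events for one order land in the same partition (preserving per-order ordering) while load spreads evenly across partitions. The hashing scheme and partition count here are assumptions, not JMQ's actual API.

```python
import hashlib

NUM_PARTITIONS = 32  # hypothetical partition count

def partition_for(order_id: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Map an order ID to a stable partition index.

    Uses a cryptographic hash rather than Python's built-in hash(),
    which is salted per process and would not be stable across runs.
    """
    digest = hashlib.md5(order_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % num_partitions
```

Because the mapping is deterministic, every change event for the same order is consumed in order, while distinct orders fan out across partitions and avoid a single hot partition.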

Incremental sync uses source binlog; full sync employs DTS for both new and historical tables.

Data validation compares table‑level totals then samples rows for consistency.
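The two-phase validation described above can be sketched like this: first compare table-level row counts, then spot-check a random sample of rows. `fetch_count` and `fetch_row` are hypothetical stand-ins for real source/target data accessors.

```python
import random

def validate(fetch_count, fetch_row, ids, sample_size=100, seed=42):
    """Two-phase consistency check: totals first, then sampled rows.

    fetch_count(side) -> int and fetch_row(side, order_id) -> row are
    illustrative callables; 'side' is "source" or "target".
    """
    # Phase 1: cheap table-level total comparison.
    src_total, dst_total = fetch_count("source"), fetch_count("target")
    if src_total != dst_total:
        return False, f"count mismatch: {src_total} vs {dst_total}"
    # Phase 2: deterministic random sample of individual rows.
    rng = random.Random(seed)
    sample = rng.sample(list(ids), min(sample_size, len(ids)))
    for oid in sample:
        if fetch_row("source", oid) != fetch_row("target", oid):
            return False, f"row mismatch for order {oid}"
    return True, "ok"
```

A fixed seed keeps the sample reproducible, so a reported mismatch can be re-examined against the same rows.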

Query Scheduling Strategy

Offline analysis, run on a T‑1 schedule, identifies data skew and pushes the results to JimDB. At runtime, the scheduler automatically selects the optimal cluster (JED or ES) based on query characteristics.
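One simple way the T‑1 skew analysis could work is flagging users whose order volume far exceeds the average, so the runtime scheduler can route their heavy queries to Elasticsearch. The threshold ratio below is an illustrative assumption.

```python
from collections import Counter
from typing import Iterable, Set, Tuple

def find_skewed_users(orders: Iterable[Tuple[str, str]],
                      threshold_ratio: float = 10.0) -> Set[str]:
    """Return users whose order count exceeds threshold_ratio x the average.

    'orders' is an iterable of (user_id, order_id) pairs, e.g. from a
    T-1 offline table. The ratio-based rule is a hypothetical heuristic.
    """
    counts = Counter(user for user, _ in orders)
    if not counts:
        return set()
    avg = sum(counts.values()) / len(counts)
    return {user for user, n in counts.items() if n > threshold_ratio * avg}
```

The resulting set would then be written to JimDB, where the online scheduler can consult it with a cheap key lookup on every request.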

Gray‑Release Process

Four‑step rollout: data module, pre‑release verification, user‑level gray, and external‑interface gray, ensuring seamless migration.

Results & Future Outlook

Stable migration with no online issues; order data volume reduced by 92%; query latency dramatically improved.

Future work includes deeper monitoring, further data‑model simplification, gradual decommission of non‑core storage clusters, and exploring full replacement of JED with ES for in‑stock orders.

Tags: Elasticsearch, Database Optimization, Query Performance, Distributed Storage, JD.com
Written by

JD Tech

Official JD technology sharing platform. All the cutting‑edge JD tech, innovative insights, and open‑source solutions you’re looking for, all in one place.
