
How JD Logistics Tackled Billion-Scale Data Challenges with Doris

This article details JD Logistics' journey from fragmented, massive‑scale data to a unified, real‑time analytics platform. It covers business needs, pain points, tool evaluation, the new Doris‑based architecture, table management, data import procedures, automation scripts, and the future roadmap for data engineering.

dbaplus Community

1. Business Scenario

JD Logistics operates a nationwide, integrated supply‑chain service that requires real‑time, multi‑dimensional data analysis across hundreds of warehouses and delivery points. The data team faces massive data volumes, inconsistent standards, duplicated efforts, slow response times, and a lack of unified data asset management.

Many: a massive volume of multi‑dimensional queries demands real‑time performance.

Scattered: data stored in disparate systems without standardization.

Heavy: repetitive reporting (daily, weekly, monthly) is inefficient.

Slow: diverse regional and product‑line data scenarios hinder rapid change.

Lacking: no unified data asset management for self‑service analytics.

Difficult: leadership struggles to obtain data, measure marketing ROI, and drive data‑centric decisions.

2. Current Needs

The ecosystem consists of:

Production system: supports daily business operations and generates raw production data.

Data warehouse: a strategic repository for analytical reporting and decision support.

Data mart: built on the warehouse and big‑data platform, serving various business groups (CFO, CMO, COO, Mobile, etc.).

Application system: products that leverage data to assist users in making better decisions.

3. Data Team Approach: Business‑Finance Data System

The team aims to bridge the natural gap between operational and financial data, standardizing metrics so that costs and revenues can be traced to each transaction, enabling fine‑grained, real‑time business‑finance analysis.

4. Problems Faced

4.1 Data Visualization

Data is exported to local machines about 3,000 times per week, and nothing can be traced after export. The short‑term fix adds a warning dialog and generates a record for each export; the long‑term plan focuses on user‑driven methodology, offline reporting, and self‑service exploration.

4.2 Permission Management

Analysis permissions are overly broad (e.g., analysts can access all tables), and metric permissions are scattered across systems, leading to chaos. Solutions include tightening BDP access based on business characteristics and centralizing metric control via a unified data API.

5. Tool Evaluation

An evaluation team compared internal JD Power (rapid iteration) with external BI tools (Tableau, Yonghong BI, etc.) across cost, maturity, usability, extensibility, and performance. Scores from business, product, and R&D stakeholders led to the final BI tool selection.

6. Solution Architecture

The existing stack (JD Power + Presto + BDP) suffered from resource contention and slow queries. The new architecture replaces Presto with Doris, which isolates resources, decouples BDP from reporting, and returns query results in seconds rather than tens of seconds. Reported benefits include:

Query latency reduced from >10 seconds to sub‑second.

Independent resource control and on‑demand optimization.

7. Doris Table Management

Common operations:

Create table

Add partition

Drop partition

Key notes: standardize partition rules, avoid creating excessive Rollups, and import data in small batches that run serially, so that imports do not compete for resources.
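The three operations above can be sketched as a small Python helper that generates Doris statements under one fixed partition rule. This is a minimal sketch, not the team's actual tooling: the monthly `pYYYYMM` naming rule, the table name, and the columns `dt`, `vender_id`, and `amount` are all assumptions chosen for illustration.

```python
from datetime import date
from typing import Tuple


def month_partition(day: date) -> Tuple[str, str]:
    """Return (partition_name, exclusive_upper_bound) for the month containing `day`.

    The pYYYYMM naming rule is an assumption, not taken from the article.
    """
    if day.month == 12:
        upper = date(day.year + 1, 1, 1)
    else:
        upper = date(day.year, day.month + 1, 1)
    return f"p{day:%Y%m}", upper.isoformat()


def create_table_sql(table: str, first_day: date, buckets: int = 10) -> str:
    """Build a CREATE TABLE statement with one initial range partition.

    The columns are hypothetical; real tables would carry their own schema.
    """
    name, upper = month_partition(first_day)
    return (
        f"CREATE TABLE {table} (\n"
        "    dt DATE,\n"
        "    vender_id BIGINT,\n"
        "    amount DECIMAL(18, 2) SUM\n"
        ")\n"
        "AGGREGATE KEY(dt, vender_id)\n"
        "PARTITION BY RANGE(dt) (\n"
        f'    PARTITION {name} VALUES LESS THAN ("{upper}")\n'
        ")\n"
        f"DISTRIBUTED BY HASH(vender_id) BUCKETS {buckets};"
    )


def add_partition_sql(table: str, day: date) -> str:
    """Build the ALTER TABLE statement that adds the month partition for `day`."""
    name, upper = month_partition(day)
    return f'ALTER TABLE {table} ADD PARTITION {name} VALUES LESS THAN ("{upper}");'


def drop_partition_sql(table: str, day: date) -> str:
    """Build the ALTER TABLE statement that drops the month partition for `day`."""
    name, _ = month_partition(day)
    return f"ALTER TABLE {table} DROP PARTITION {name};"


if __name__ == "__main__":
    print(create_table_sql("demo_tbl", date(2020, 11, 1)))
    print(add_partition_sql("demo_tbl", date(2020, 12, 1)))
    print(drop_partition_sql("demo_tbl", date(2020, 11, 1)))
```

Deriving both the partition name and its bound from one function is one way to keep partition rules consistent across create, add, and drop.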

8. Data Import from Hive to Doris (Broker Load)

Steps include converting Hive tables to Doris format, performing a Broker Load, and tracking load status.
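The conversion step can be illustrated with a type-mapping helper. The article does not describe the team's actual conversion rules, so the mapping below is an assumption: a plausible default for common Hive primitive types, with `string` mapped to a maximum-length `VARCHAR`. Complex Hive types are rejected rather than guessed, and DECIMAL precision limits depend on the Doris version in use.

```python
import re
from typing import List, Tuple

# Assumed default mapping for Hive primitive types; adjust per deployment.
HIVE_TO_DORIS = {
    "tinyint": "TINYINT",
    "smallint": "SMALLINT",
    "int": "INT",
    "bigint": "BIGINT",
    "float": "FLOAT",
    "double": "DOUBLE",
    "boolean": "BOOLEAN",
    "date": "DATE",
    "timestamp": "DATETIME",
    "string": "VARCHAR(65533)",
    "decimal": "DECIMAL(10,0)",  # Hive's default precision and scale
}

# Parameterized types keep their parameters: decimal(p,s), varchar(n), char(n).
_PARAM_TYPE = re.compile(r"(decimal|varchar|char)\(([\d,\s]+)\)")


def to_doris_type(hive_type: str) -> str:
    """Map one Hive column type to a Doris column type, or raise ValueError."""
    t = hive_type.strip().lower()
    m = _PARAM_TYPE.fullmatch(t)
    if m:
        params = m.group(2).replace(" ", "")
        return f"{m.group(1).upper()}({params})"
    try:
        return HIVE_TO_DORIS[t]
    except KeyError:
        raise ValueError(f"no mapping for Hive type: {hive_type!r}") from None


def column_ddl(columns: List[Tuple[str, str]]) -> str:
    """Render (name, hive_type) pairs as the column list of a Doris CREATE TABLE."""
    return ",\n".join(f"    {name} {to_doris_type(t)}" for name, t in columns)


if __name__ == "__main__":
    print(column_ddl([("dt", "date"), ("vender_id", "bigint"), ("amount", "decimal(18, 2)")]))
```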

Example load‑status query:

show load from jddl_test where label = 'app_ea_pal_vender_all_sum_m_20201101_183213_19688970430' \G
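Status tracking usually means repeating this query until the job reaches a terminal state. The sketch below polls for the two terminal Broker Load states, FINISHED and CANCELLED. The `fetch_state` callable is hypothetical: it stands in for whatever MySQL-protocol client runs the statement and returns the `State` column. The trailing `\G` is left out because it is a mysql command-line terminator rather than part of the SQL.

```python
import time
from typing import Callable, Optional

# Terminal states of a Doris load job; PENDING, ETL and LOADING are in-progress.
TERMINAL_STATES = {"FINISHED", "CANCELLED"}


def show_load_sql(db: str, label: str) -> str:
    """Build the status query for one load label."""
    return f"SHOW LOAD FROM {db} WHERE LABEL = '{label}'"


def wait_for_load(
    fetch_state: Callable[[str], Optional[str]],
    db: str,
    label: str,
    poll_seconds: float = 10.0,
    max_polls: int = 360,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll until the load job finishes or is cancelled, then return its state.

    `fetch_state` is a hypothetical hook: it receives the SQL text and returns
    the job's State value, or None if the job is not visible yet.
    """
    sql = show_load_sql(db, label)
    for _ in range(max_polls):
        state = fetch_state(sql)
        if state in TERMINAL_STATES:
            return state
        sleep(poll_seconds)
    raise TimeoutError(f"load {label} did not finish after {max_polls} polls")


if __name__ == "__main__":
    # Simulated state sequence in place of a real database connection.
    states = iter(["PENDING", "LOADING", "FINISHED"])
    print(wait_for_load(lambda sql: next(states), "jddl_test", "demo_label", sleep=lambda s: None))
```

Passing the query function and the sleep function as parameters keeps the loop testable without a live Doris cluster.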

Important parameters:

LABEL: identifies the import batch for later status queries.

max_filter_ratio: maximum allowed error rate (e.g., 0.2 for 20%).

timeout: load job timeout in seconds (default 86,400 s).
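To show where these three parameters sit in a Broker Load statement, here is a hedged Python sketch that assembles one. The label pattern (table name, timestamp, random 11-digit suffix) is inferred from the example label above and is an assumption, as are the broker name, HDFS path, file format, and function names. Broker authentication properties are omitted.

```python
import random
from datetime import datetime
from typing import Optional


def make_label(table: str, now: Optional[datetime] = None) -> str:
    """Build a unique label: <table>_<YYYYMMDD>_<HHMMSS>_<11 random digits>.

    The pattern is an assumption inferred from the article's example label.
    """
    now = now or datetime.now()
    suffix = random.randrange(10**10, 10**11)
    return f"{table}_{now:%Y%m%d_%H%M%S}_{suffix}"


def broker_load_sql(
    db: str,
    table: str,
    label: str,
    hdfs_path: str,
    broker: str,
    file_format: str = "orc",
    max_filter_ratio: float = 0.2,
    timeout: int = 86400,
) -> str:
    """Assemble a Broker Load statement carrying the three parameters above."""
    if not 0.0 <= max_filter_ratio <= 1.0:
        raise ValueError("max_filter_ratio must be between 0 and 1")
    if timeout <= 0:
        raise ValueError("timeout must be a positive number of seconds")
    return (
        f"LOAD LABEL {db}.{label}\n"
        "(\n"
        f'    DATA INFILE("{hdfs_path}")\n'
        f"    INTO TABLE {table}\n"
        f'    FORMAT AS "{file_format}"\n'
        ")\n"
        f"WITH BROKER {broker}\n"
        "PROPERTIES\n"
        "(\n"
        f'    "timeout" = "{timeout}",\n'
        f'    "max_filter_ratio" = "{max_filter_ratio}"\n'
        ");"
    )


if __name__ == "__main__":
    label = make_label("demo_tbl")
    # Path and broker name below are placeholders, not real endpoints.
    print(broker_load_sql("jddl_test", "demo_tbl", label,
                          "hdfs://namenode:8020/path/to/demo_tbl/*", "demo_broker"))
```

Generating the label in code and keeping it alongside the job makes the later SHOW LOAD lookup straightforward.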

9. Automated Data Push

Command‑line options for the automation script:

-t table name (required).

-c column list (optional, defaults to all columns).

-n number of days of data to push (default 1 day).

-e end date for data extraction (default yesterday).

-d Doris operation: db_reset (rebuild table), db_drop (delete table), db_create (create or show table DDL).
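The options above map naturally onto an argument parser. The following is a minimal sketch assuming a Python script; the original script's language, its date format (taken here as YYYY-MM-DD), and the exact meaning of the day window (taken here as the N days ending on the end date, inclusive) are not stated in the article.

```python
import argparse
from datetime import date, datetime, timedelta
from typing import List, Optional


def parse_day(text: str) -> date:
    """Parse a YYYY-MM-DD date (the format is an assumption)."""
    return datetime.strptime(text, "%Y-%m-%d").date()


def build_parser() -> argparse.ArgumentParser:
    """Declare the five options listed above."""
    p = argparse.ArgumentParser(description="Push Hive data to Doris (sketch).")
    p.add_argument("-t", dest="table", required=True, help="table name")
    p.add_argument("-c", dest="columns", default=None,
                   help="comma-separated column list (default: all columns)")
    p.add_argument("-n", dest="days", type=int, default=1,
                   help="number of days of data to push")
    p.add_argument("-e", dest="end_date", type=parse_day, default=None,
                   help="end date, YYYY-MM-DD (default: yesterday)")
    p.add_argument("-d", dest="doris_op", default=None,
                   choices=["db_reset", "db_drop", "db_create"],
                   help="Doris table operation")
    return p


def date_window(days: int, end_date: Optional[date] = None,
                today: Optional[date] = None) -> List[date]:
    """Return the `days` dates ending at `end_date` (default: yesterday), oldest first."""
    if days < 1:
        raise ValueError("days must be at least 1")
    today = today or date.today()
    end = end_date or today - timedelta(days=1)
    return [end - timedelta(days=i) for i in range(days - 1, -1, -1)]


if __name__ == "__main__":
    args = build_parser().parse_args(["-t", "demo_tbl", "-n", "3", "-e", "2020-11-03"])
    print(args.table, [d.isoformat() for d in date_window(args.days, args.end_date)])
```

Returning the window oldest first fits the earlier advice to import in small serial batches, since each day can be pushed as its own load.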

Note: Different database characteristics create integration bottlenecks; new technologies must be introduced with thorough pre‑planning.

10. Automated Reporting

By connecting JD Power to Doris, business users can configure data sources and build analysis reports within ten minutes, achieving a one‑stop platform for data preparation, report generation, and interactive analysis across PC, iPhone, iPad, and Android.

11. Future Plans

11.1 Offline Data Technology Upgrade

BDP will continue to evolve with optimized underlying models, lifecycle management for data tables, and smarter scheduling (Hive/Spark) to balance resource utilization.

11.2 Business‑Driven Technical Iteration

As the business matures, finer‑grained operations demand a unified, systematic, clear, and flexible data layer that supports ad‑hoc queries and multi‑dimensional OLAP analysis.

11.3 Team Building

Focus areas include methodology development, technical skill enhancement for end‑to‑end offline‑real‑time pipelines, project management, fine‑grained data permissions, and talent pipeline construction.

Business scenario diagram
Current needs diagram
Data team architecture
Tool matrix
Data engine evolution
Resource isolation diagram
Rollup impact