Building JD's Enterprise-wide Big Data Platform: Architecture, Stages, and Challenges
This article summarizes Bao Yongjun’s presentation on JD.com’s end‑to‑end big data platform, covering its strategic value, industry trends, architectural design, development phases from scale‑out to intelligent real‑time processing, and future directions for a cloud‑native, AI‑driven data ecosystem.
This article is based on Bao Yongjun’s talk at the 2020 DAMS China Data Intelligence Management Summit.
Speaker Introduction
Bao Yongjun is currently the head of JD’s Data Infrastructure Platform, Advertising Quality, and Recommendation R&D departments, a member of JD’s Technical Committee, and chair of the JD Retail Data Algorithm Committee. He leads the construction of JD’s big‑data platform infrastructure, AI algorithm platform, and recommendation systems, with extensive experience in large‑scale data architecture and platform development.
In his talk, Bao emphasizes the value of data and shares JD’s journey in building an enterprise‑wide big data platform.
1. The Value of Data
Data, like oil, profoundly transforms the information society and creates increasing value for industries and society.
A Forrester report cited in the talk found that data‑driven companies grow their business 2.4× faster than laggards. Technology firms also now dominate the top ten companies by global market capitalization.
Most of these tech companies create business value by continuously producing, processing, consuming, and reshaping data.
Domestic Big Data Industry Trends
China’s government has included big data in its work reports for six consecutive years; the market is projected to reach 1.57 trillion CNY in 2023, and big‑data‑related technologies have penetrated all industries, driving deep digital transformation.
2. Industry Big Data Platform Status
1) Development Stages
From a technical perspective, big‑data platforms are still in an exploratory, early‑stage phase.
Data‑mid‑platform concepts are gaining attention, but successful enterprise applications are mostly limited to leading internet and innovative companies.
2) Architecture
There is no unified industry standard yet. The talk presented a typical big‑data platform architecture as a diagram, which is not reproduced here.
The ecosystem is complex, involving many heterogeneous products and rapidly evolving technologies, which raises technical barriers and decision‑making risks for enterprises.
3) Construction Challenges
Three challenges stand out:
Data is growing so quickly that simply adding physical resources yields diminishing returns.
Traditional rule‑based analysis cannot meet the business need for precise data mining.
The post‑COVID "new infrastructure" push is driving massive demand for data, but how to transfer the internet industry's digital‑transformation experience to other industries is still being explored.
3. JD’s Enterprise‑wide Big Data Platform Journey
1) Overall Situation
The platform now runs tens of thousands of servers, processes millions of daily tasks, and stores exabytes of data, supporting JD’s e‑commerce, finance, logistics, health, and other complex business scenarios.
2) Development Phases
The platform evolved through five stages:
Scale‑out Phase
To handle explosive data growth, JD separated compute and storage, adopted custom hardware, and implemented erasure coding for efficient, high‑compression storage.
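To make the storage argument concrete, here is a minimal sketch of the idea behind erasure coding. It uses a single XOR parity block, which is the simplest possible code and not necessarily what JD deployed (the talk summary does not name the scheme; production systems commonly use Reed-Solomon codes). The function names and the 6 data + 3 parity layout in the overhead comparison are illustrative assumptions.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def encode(data_blocks):
    """Return the data blocks followed by one XOR parity block."""
    return list(data_blocks) + [xor_blocks(data_blocks)]


def recover(stripe, lost_index):
    """Rebuild the single missing block by XOR-ing all surviving blocks."""
    survivors = [b for i, b in enumerate(stripe) if i != lost_index]
    return xor_blocks(survivors)


def storage_overhead(data_units, parity_units):
    """Raw bytes stored per byte of user data."""
    return (data_units + parity_units) / data_units


# Three data blocks plus one parity block; pretend block 1 was lost.
stripe = encode([b"jd-0", b"jd-1", b"jd-2"])
rebuilt = recover(stripe, lost_index=1)

replication_cost = storage_overhead(1, 2)  # one copy plus two extra replicas
erasure_cost = storage_overhead(6, 3)      # hypothetical 6 data + 3 parity layout
```

Under these assumptions, three-way replication stores three raw bytes per user byte while the 6+3 layout stores 1.5, which is why erasure coding is attractive once adding hardware stops paying off. XOR parity only survives one lost block per stripe; tolerating several failures needs a stronger code.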
Systematic Phase
Business diversification introduced challenges such as data silos, data decay, governance difficulties, heterogeneous sources, and rapid requirement expansion.
JD built a standardized, manageable, maintainable, and reproducible data‑mid‑platform to address these issues across finance, logistics, e‑commerce, insurance, and health.
Real‑time Phase
High‑throughput, low‑latency processing became essential for making decisions within seconds, especially during peak sales events such as 618 and Double‑11.
Easy Realtime Platform
The platform offers high availability, container‑based cloud‑native elastic scheduling, self‑healing, and a one‑stop cloud‑code development environment, enabling non‑technical business users to develop SQL‑based real‑time decision logic.
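The kind of rule a business user might express in SQL on such a platform, for example "count orders per category in each 10‑second window and flag categories that reach a threshold", can be sketched in plain Python. This is not the Easy Realtime API, which the talk summary does not describe; the event fields, window size, and threshold are all hypothetical.

```python
from collections import defaultdict

WINDOW_SECONDS = 10   # hypothetical tumbling-window size
ALERT_THRESHOLD = 3   # hypothetical orders-per-window limit


def tumbling_counts(events, window_seconds=WINDOW_SECONDS):
    """Count events per (window_start, category).

    Roughly what a streaming SQL query of the form
    SELECT window_start, category, COUNT(*) ... GROUP BY window_start, category
    would compute.
    """
    counts = defaultdict(int)
    for timestamp, category in events:
        # Align each event to the start of its fixed-size window.
        window_start = timestamp - (timestamp % window_seconds)
        counts[(window_start, category)] += 1
    return dict(counts)


def hot_categories(counts, threshold=ALERT_THRESHOLD):
    """Return the (window_start, category) keys whose count reaches the threshold."""
    return sorted(key for key, n in counts.items() if n >= threshold)


# Hypothetical (timestamp_in_seconds, category) order events.
events = [
    (100, "phones"), (102, "phones"), (104, "books"),
    (107, "phones"), (111, "books"), (113, "phones"),
]
counts = tumbling_counts(events)
alerts = hot_categories(counts)
```

The sketch processes a finished list for clarity. A real streaming engine would update the counts incrementally as events arrive and would also have to handle late or out‑of‑order events.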
Intelligent Phase
JD aims to evolve from statistical analysis to AI‑driven, precise, and deep data understanding, creating a comprehensive data‑algorithm platform.
Key intelligent challenges include massive machine‑learning compute, secure cross‑business data fusion (addressed by a federated‑learning exchange platform), and multimodal graph computation (solved by the Galileo graph engine).
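As a rough illustration of how federated learning lets business units train a shared model without pooling raw data, the sketch below runs rounds of federated averaging (FedAvg) on a one‑parameter linear model. This is the textbook technique, not JD's federated‑learning exchange platform, whose protocol the summary does not describe; the function names and the datasets are made up for the example.

```python
def local_update(weight, samples, learning_rate=0.1):
    """One gradient-descent step for y = weight * x on a party's private samples."""
    gradient = sum(2 * (weight * x - y) * x for x, y in samples) / len(samples)
    return weight - learning_rate * gradient


def federated_average(updates):
    """Combine local weights, weighting each party by its sample count."""
    total = sum(n for _, n in updates)
    return sum(w * n for w, n in updates) / total


def federated_round(global_weight, parties):
    """Each party trains locally; only weights and sample counts are shared."""
    updates = [(local_update(global_weight, data), len(data)) for data in parties]
    return federated_average(updates)


# Hypothetical private datasets, both consistent with y = 2x.
retail_data = [(1.0, 2.0), (2.0, 4.0)]
finance_data = [(3.0, 6.0)]

weight = 0.0
for _ in range(50):
    weight = federated_round(weight, [retail_data, finance_data])
```

Only the locally updated weight and a sample count leave each party, so the coordinator never sees the underlying records. Production systems typically add secure aggregation or encryption on top of this exchange, which the sketch omits.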
9N Commercial Analytics & Business Intelligence Platform
The platform integrates core algorithm engines, including a federated‑learning engine, with a cloud‑native resource manager, powering JD’s retail, health, and finance businesses and supporting the transition to intelligent, automated advertising.
Commercialization Phase
JD has built a full‑domain big data platform that supports multiple sectors, shares commonalities with mainstream data‑mid‑platforms, and aims to externalize its technology as PaaS/SaaS solutions.
4. Future Directions
JD envisions a cloud‑native, AI‑enhanced intelligent data platform that continuously upgrades technology, deepens AI integration, and shares its experience with the industry through collaborative PaaS/SaaS ecosystems.
JD Retail Technology
Official platform of JD Retail Technology, delivering insightful R&D news and a deep look into the lives and work of technologists.
