Suning’s Big Data Platform Evolution: From SAP BW to Real‑Time Streaming
This article chronicles Suning’s journey from early SAP‑based data warehouses to a modern, open‑source big data platform featuring real‑time collection, Hadoop‑Hive offline processing, Storm‑based streaming, and a visual development environment, highlighting how each layer addresses growing data volume, variety, and business demands.
In the previous issue we introduced Suning's data center; this article details the evolution of Suning's big data platform architecture.
Suning recognized the value of data early: it built an SAP-based BW data warehouse in 2007 and an Oracle-based EDM warehouse in 2010, with a classic snowflake schema, BO reporting, DS ETL, and a self-developed scheduler. As the business grew, report usage surged and permission control became more complex, which led to domain-based report systems and data marts.
By 2014, data volume and variety were growing rapidly and semi-structured and unstructured data had become common, so Suning built a new big data platform on open-source technologies to address the resulting storage and computation challenges. The platform consists of the following parts:
1. Data collection platform: a real-time acquisition and distribution system that raises data capture from daily to second-level granularity. It handles heterogeneous sources, uses distributed message queues, and maintains high consistency. It ingests about 1 billion records per day, or roughly 11,600 records per second on average.
2. Data processing platform: comprises the offline and streaming engines described in the next two items, with the aim of "store enough, compute fast".
3. Offline computing platform: built on Hadoop + Hive, processing tens of terabytes daily to support customer segmentation, precision marketing, offline model training, and analytical reporting.
4. Streaming computing platform: based on Storm and named Libra, it offers a standardized SQL interface and delivers billions of real-time calculations per day for live-streaming rooms, personalized recommendation, precision ad delivery, and user experience management.
5. Data development platform: provides a visual development environment with scheduling, permission management, metadata management, and task monitoring, handling over 100,000 tasks daily.
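To make the streaming item more concrete, the sketch below shows the kind of computation a SQL-style streaming aggregation typically performs: counting events per key in fixed one-second windows. It is only an illustration in plain Python. The article does not describe Libra's SQL dialect or internals, so the query in the comment, the function name `tumbling_window_counts`, and the sample events are all hypothetical.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Tuple

# Hypothetical query this sketch mimics (NOT Libra's actual syntax):
#   SELECT page, COUNT(*) FROM clicks GROUP BY page, one-second window
Event = Tuple[float, str]  # (timestamp in seconds, key such as a page id)


def tumbling_window_counts(
    events: Iterable[Event], window_seconds: float = 1.0
) -> Dict[int, Counter]:
    """Count events per key in fixed, non-overlapping time windows."""
    windows: Dict[int, Counter] = defaultdict(Counter)
    for timestamp, key in events:
        # Each event belongs to exactly one window, chosen by its timestamp.
        window_index = int(timestamp // window_seconds)
        windows[window_index][key] += 1
    return dict(windows)


if __name__ == "__main__":
    # Made-up sample events, for illustration only.
    sample = [(0.1, "home"), (0.7, "cart"), (0.9, "home"), (1.2, "home"), (2.5, "cart")]
    for index, counts in sorted(tumbling_window_counts(sample).items()):
        print(index, dict(counts))
```

A real streaming engine would process events incrementally and emit each window's result as the window closes. This sketch processes a finite list only to show the grouping logic.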
The overall goal of the platform is to “collect comprehensively, transmit quickly, store sufficiently, compute rapidly, and develop conveniently”. Thanks to this robust architecture, Suning’s business data can be effectively analyzed and its value fully realized.
Suning Technology
Official Suning Technology account. Explains cutting-edge retail technology and shares Suning's tech practices.