
Building JD's OLAP System: From Data Ingestion to Management and Future Plans

This article explains how JD.com designs and evolves its OLAP platform, covering data sources, ingestion, storage, and real‑time and offline processing. It also walks through key challenges such as timeliness, high throughput, and consistency, along with the solutions implemented to support massive e‑commerce analytics.

DataFunSummit

Nov 8, 2021

The presentation introduces JD's end‑to‑end OLAP architecture, starting from business requirements such as order data, user click/search behavior, advertising, and monitoring metrics.

Data Ingestion : JD integrates diverse sources (local files, HDFS, Kafka/MQ) and formats (CSV, JSON, AVRO, PARQUET, BINLOG). A unified import service abstracts these sources, allowing analysts to configure topics, target destinations, and schemas via a visual UI.
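As a rough illustration of what such a unified import configuration might capture, the sketch below models one import job and validates it before submission. The source and format names come from the list above; the class name, field names, and validation rules are hypothetical and not JD's actual service.

```python
from dataclasses import dataclass, field

# Source and format names follow the article; all identifiers are hypothetical.
SUPPORTED_SOURCES = {"local_file", "hdfs", "kafka", "mq"}
SUPPORTED_FORMATS = {"csv", "json", "avro", "parquet", "binlog"}


@dataclass
class ImportJob:
    source_type: str       # where the data comes from
    source_location: str   # file path, HDFS path, or topic name
    data_format: str       # serialisation format of the incoming data
    target_table: str      # destination table in the OLAP cluster
    schema: dict = field(default_factory=dict)  # column name -> type name

    def validate(self):
        """Return a list of human-readable problems; an empty list means valid."""
        problems = []
        if self.source_type not in SUPPORTED_SOURCES:
            problems.append(f"unsupported source: {self.source_type}")
        if self.data_format not in SUPPORTED_FORMATS:
            problems.append(f"unsupported format: {self.data_format}")
        if not self.schema:
            problems.append("schema must define at least one column")
        return problems


job = ImportJob("kafka", "orders_topic", "json", "orders", {"order_id": "BIGINT"})
print(job.validate())
```

Keeping the source, format, target, and schema in one declarative object is what lets a visual UI generate the same job definition that a script would.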

Timeliness : Real‑time clusters handle frequently updated data that must be queryable quickly, while offline clusters run batch jobs whose latency requirements are less strict. Physically isolating the two clusters prevents them from interfering with each other and lets resources be allocated to suit each workload.

Updates & Deletions : Updates are performed by overwriting full records (e.g., order status changes), and deletions are achieved by dropping partitions or using versioned data to mask old records.
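The versioned approach can be sketched as follows: every write appends a full record with a version number, and reads keep only the newest version per key. This is a minimal sketch, assuming a monotonically increasing version and a `None` payload as the delete marker; the `Row` type and `latest_visible` function are hypothetical names, not part of any engine's API.

```python
from typing import NamedTuple, Optional


class Row(NamedTuple):
    key: str                 # primary key, e.g. an order id
    version: int             # higher version wins
    payload: Optional[dict]  # full record; None marks a deletion (assumption)


def latest_visible(rows):
    """Return the newest payload per key, hiding keys whose newest row is a delete marker."""
    newest = {}
    for row in rows:
        current = newest.get(row.key)
        if current is None or row.version > current.version:
            newest[row.key] = row
    return {k: r.payload for k, r in newest.items() if r.payload is not None}


rows = [
    Row("o1", 1, {"status": "created"}),
    Row("o1", 2, {"status": "paid"}),   # full-record overwrite of o1
    Row("o2", 1, {"status": "created"}),
    Row("o2", 2, None),                 # o2 is masked by a newer delete marker
]
print(latest_visible(rows))
```

Old versions stay on disk until a background merge or a partition drop removes them, which is why this style of update suits append-oriented column stores.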

High Throughput : JD deploys 10 GbE networks, SSDs for real‑time workloads, and HDDs for offline jobs, so that the system can ingest petabyte‑scale data daily.

Storage : A distributed column‑store with compression (e.g., Snappy) handles massive datasets, while multi‑replica strategies and RAID provide fault tolerance.

Consistency : Distributed coordination (Zookeeper) combined with local transaction mechanisms guarantees data consistency across nodes.

Query Performance : Data is partitioned (often by date) and sharded so that a query touches only the relevant data; pre‑aggregation, materialised views, and various indexes (hash, B‑tree, inverted) further speed up queries.
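To make the effect of date partitioning and pre‑aggregation concrete, here is a small in-memory sketch. It assumes rows are dictionaries with hypothetical `day`, `category`, and `amount` fields; real engines do the same pruning and rollup at the storage layer.

```python
from collections import defaultdict
from datetime import date


def partition_by_date(rows):
    """Group rows into partitions keyed by their event date."""
    partitions = defaultdict(list)
    for row in rows:
        partitions[row["day"]].append(row)
    return dict(partitions)


def query_total(partitions, start, end, category):
    """Scan only partitions whose date lies in [start, end] (partition pruning)."""
    scanned = [day for day in partitions if start <= day <= end]
    total = sum(
        row["amount"]
        for day in scanned
        for row in partitions[day]
        if row["category"] == category
    )
    return total, len(scanned)


def pre_aggregate(partitions):
    """Build a rollup of total amount per (day, category)."""
    rollup = defaultdict(int)
    for day, rows in partitions.items():
        for row in rows:
            rollup[(day, row["category"])] += row["amount"]
    return dict(rollup)


def total_from_rollup(rollup, start, end, category):
    """Answer the same query from the much smaller rollup."""
    return sum(
        amount
        for (day, cat), amount in rollup.items()
        if cat == category and start <= day <= end
    )
```

With the rollup in place, a dashboard query reads one value per day and category instead of every raw row, at the cost of maintaining the rollup on ingest.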

Usability : The OLAP service supports JDBC/ODBC and a graphical query interface, enabling analysts without deep SQL knowledge to run queries directly.

QPS Optimisation : Partition caches, result caches, and increased replica counts improve query throughput, while scaling hardware further boosts performance.
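A result cache of the kind mentioned here might look like the sketch below: identical query text is answered from memory until the entry expires. The class name, the LRU eviction policy, and the time-to-live default are assumptions chosen for illustration, not details from the talk.

```python
import time
from collections import OrderedDict


class ResultCache:
    """A small LRU cache with per-entry expiry, keyed by query text."""

    def __init__(self, max_entries=1000, ttl_seconds=60.0, clock=time.monotonic):
        self._entries = OrderedDict()   # key -> (expires_at, result)
        self._max_entries = max_entries
        self._ttl = ttl_seconds
        self._clock = clock             # injectable so expiry can be tested

    def get_or_compute(self, sql, run_query):
        key = sql.strip()               # ignore leading/trailing whitespace only
        now = self._clock()
        hit = self._entries.get(key)
        if hit is not None and hit[0] > now:
            self._entries.move_to_end(key)      # mark as recently used
            return hit[1]
        result = run_query(sql)                 # cache miss or expired entry
        self._entries[key] = (now + self._ttl, result)
        self._entries.move_to_end(key)
        while len(self._entries) > self._max_entries:
            self._entries.popitem(last=False)   # evict least recently used
        return result
```

A short time-to-live is a common compromise for OLAP dashboards: repeated queries from many viewers hit the cache, while freshly ingested data becomes visible once the entry expires.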

Management : Monitoring, alerting, automated node black‑listing, and scripted node replacement reduce operational overhead and accelerate failure recovery.
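Automated black‑listing can be sketched as a counter of consecutive failed health checks per node. The threshold, the class name, and the rule that a single healthy check restores a node are all assumptions for illustration; the talk does not describe JD's actual policy.

```python
class NodeHealthTracker:
    """Blacklist a node after N consecutive failed health checks (hypothetical policy)."""

    def __init__(self, failure_threshold=3):
        self._threshold = failure_threshold
        self._consecutive_failures = {}
        self.blacklist = set()

    def record(self, node, healthy):
        """Record one health-check result for a node."""
        if healthy:
            # A healthy check resets the counter and restores the node.
            self._consecutive_failures[node] = 0
            self.blacklist.discard(node)
            return
        count = self._consecutive_failures.get(node, 0) + 1
        self._consecutive_failures[node] = count
        if count >= self._threshold:
            self.blacklist.add(node)

    def routable(self, nodes):
        """Return the nodes that queries may still be routed to, in input order."""
        return [n for n in nodes if n not in self.blacklist]
```

Requiring several consecutive failures avoids removing a node because of one slow check, while the blacklist keeps queries away from a failing node until it is repaired or replaced.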

Evolution Timeline :

1.0 – Simple order analytics using relational databases (Oracle/MySQL).

2.0 – Expansion to logistics, supply‑chain, and payment data; data volume grows to TB/PB, prompting a shift to Hive/Spark offline warehouses.

3.0 – Real‑time analytics with unified OLAP services built on Doris and ClickHouse, supporting both batch and streaming workloads.

Future Plans :

Platform optimisation: dynamic scaling for ClickHouse and intelligent operations automation.

Query speed: smarter caching and automated index creation based on query patterns.

Further integration of real‑time caching in Doris.

The Q&A section addresses engine choices (Druid, ClickHouse vs. Doris), selection criteria, and JD's current data‑ingestion pathways for ClickHouse.

