Design and Implementation of JD Mini‑Program Custom Data Analysis Service
This article presents the technical solution and key processes of JD's mini‑program custom data analysis service, covering business background, ClickHouse‑based storage design, real‑time processing pipelines, dynamic rule parsing, table architecture, monitoring mechanisms, and future outlook for large‑scale data analytics.
Business Background
With the rapid growth of the mobile internet, mini‑programs have become a new retail carrier and generate massive, diverse data. JD's mini‑program data center initially collected data through SDKs, but this approach suffered from incomplete data types, a lack of industry‑level insights, and inflexible statistics.
Technical Selection
After evaluating candidate solutions, the team chose ClickHouse for its columnar storage, data compression, MPP architecture, and variety of table engines, which together meet the requirements for massive data storage, high‑performance queries, and low operational cost.
Overall Architecture
The custom data analysis system consists of three layers: custom data reporting, data processing, and data storage. Data is reported via HTTP gateways and client SDKs, routed to real‑time or offline warehouses, processed by Flink or MapReduce, and finally persisted in ClickHouse.
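To make the hand-off between the reporting and storage layers concrete, here is a minimal sketch of the kind of flattening a processing job might perform before writing to ClickHouse. It is written as a plain Python function rather than a Flink operator, and the field names (`app_id`, `ts`, `report_date`, `content`) are assumptions chosen for illustration, not the article's actual schema.

```python
import json
from datetime import datetime, timezone

# Fields assumed to be promoted to dedicated columns (hypothetical names).
_RESERVED = ("app_id", "ts")


def to_row(event: dict) -> dict:
    """Flatten one reported event into a storage row.

    Reserved fields become columns; every other key is treated as a
    user-defined attribute and packed into a JSON string.
    """
    app_id = event.get("app_id")
    ts = event.get("ts")  # assumed to be epoch seconds
    if not app_id or ts is None:
        raise ValueError("event must carry app_id and ts")

    report_date = datetime.fromtimestamp(ts, tz=timezone.utc).date().isoformat()
    custom = {k: v for k, v in event.items() if k not in _RESERVED}
    return {
        "report_date": report_date,
        "app_id": app_id,
        # Compact, key-sorted JSON keeps identical payloads byte-identical.
        "content": json.dumps(custom, sort_keys=True, separators=(",", ":")),
    }
```

Keeping the custom attributes in a single JSON string means a new attribute reported by a mini‑program needs no schema change in the storage layer.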
Process Design
The rule engine stores user‑defined reporting attributes, generates dynamic SQL scripts from the custom rules, and dispatches them to ClickHouse for execution. It then caches the results with an expiration time so that repeated queries are served without hitting ClickHouse again.
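The flow described above (rule to SQL, execute, cache with expiry) can be sketched as follows. This is an illustrative outline under stated assumptions, not the service's implementation: `Rule`, `build_sql`, `TTLCache`, `run_rule`, and the table name `custom_events_all` are hypothetical, and `execute` stands in for whatever ClickHouse client the service actually uses.

```python
import re
import time
from dataclasses import dataclass
from typing import Callable, Dict, Tuple

_SAFE = re.compile(r"[A-Za-z0-9_]+")      # allowed identifier characters
_DATE = re.compile(r"\d{4}-\d{2}-\d{2}")  # 'YYYY-MM-DD'


@dataclass(frozen=True)
class Rule:
    app_id: str       # mini-program identifier
    attribute: str    # key inside the JSON `content` column
    report_date: str  # partition to scan


def build_sql(rule: Rule, table: str = "custom_events_all") -> str:
    """Render a counting query for one user-defined attribute."""
    for value in (rule.app_id, rule.attribute, table):
        if not _SAFE.fullmatch(value):
            raise ValueError(f"unsafe identifier: {value!r}")
    if not _DATE.fullmatch(rule.report_date):
        raise ValueError(f"bad date: {rule.report_date!r}")
    return (
        f"SELECT JSONExtractString(content, '{rule.attribute}') AS value, "
        f"count() AS cnt FROM {table} "
        f"WHERE report_date = '{rule.report_date}' "
        f"AND app_id = '{rule.app_id}' "
        "GROUP BY value ORDER BY cnt DESC"
    )


class TTLCache:
    """In-memory cache whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._items: Dict[str, Tuple[float, object]] = {}

    def get(self, key: str):
        entry = self._items.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._items[key]  # drop the stale entry
            return None
        return value

    def put(self, key: str, value: object) -> None:
        self._items[key] = (self._clock() + self._ttl, value)


def run_rule(rule: Rule, execute: Callable[[str], list], cache: TTLCache) -> list:
    """Serve from cache when possible; otherwise query and cache the rows."""
    sql = build_sql(rule)
    cached = cache.get(sql)
    if cached is not None:
        return cached
    rows = execute(sql)
    cache.put(sql, rows)
    return rows
```

Using the generated SQL text as the cache key means two rules that render to the same query share one cached result. The identifier check is a deliberately strict guard, since user-defined attribute names are interpolated into the query string.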
Table Design
ClickHouse tables are divided into local and distributed tables. Local tables store the actual data, while distributed tables hold no data themselves and act as logical views that route queries to the local tables on each shard. The table schemas use shard and replica settings for high availability, are partitioned by report date and sorted by AppID, and use an appropriate table engine (e.g., ReplicatedMergeTree). The content column stores JSON‑encoded custom fields, so individual attributes can be extracted flexibly at query time with JSONExtractString.
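A sketch of what the local/distributed pair might look like, rendered from Python so the two statements stay in sync. The column set, the `_local`/`_all` suffixes, and the ZooKeeper path layout are assumptions for illustration; only the engines (`ReplicatedMergeTree`, `Distributed`), partitioning by report date, and sorting by AppID come from the description above.

```python
def local_table_ddl(db: str, table: str, cluster: str) -> str:
    """DDL for the per-shard table that physically stores the rows."""
    return (
        f"CREATE TABLE IF NOT EXISTS {db}.{table}_local ON CLUSTER {cluster}\n"
        "(\n"
        "    report_date Date,\n"
        "    app_id      String,\n"
        "    content     String  -- JSON-encoded custom fields\n"
        ")\n"
        # {shard} and {replica} are ClickHouse macros expanded per server.
        "ENGINE = ReplicatedMergeTree("
        f"'/clickhouse/tables/{{shard}}/{db}/{table}_local', '{{replica}}')\n"
        "PARTITION BY report_date\n"
        "ORDER BY app_id"
    )


def distributed_table_ddl(db: str, table: str, cluster: str) -> str:
    """DDL for the logical table that fans queries out to every shard."""
    return (
        f"CREATE TABLE IF NOT EXISTS {db}.{table}_all ON CLUSTER {cluster}\n"
        f"AS {db}.{table}_local\n"
        f"ENGINE = Distributed({cluster}, {db}, {table}_local, rand())"
    )


if __name__ == "__main__":
    print(local_table_ddl("mp", "custom_events", "mp_cluster"))
    print()
    print(distributed_table_ddl("mp", "custom_events", "mp_cluster"))
```

Here `rand()` is used as the sharding key to spread inserts evenly; a deployment that wants all rows of one mini‑program on the same shard might shard on a hash of the AppID instead.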
Monitoring
Comprehensive monitoring covers MQ, Flink, ClickHouse, and the mini‑program runtime using Grafana dashboards for CPU, memory, I/O, and custom alert rules, ensuring system stability and rapid issue detection.
Summary and Outlook
The service greatly improves data completeness and analytical capability for JD mini‑programs. It initially supports 50+ core mini‑programs and enables fine‑grained merchant operations. Future work aims to build intelligent data analysis models to further improve operational efficiency and business productivity.
JD Retail Technology
Official platform of JD Retail Technology, delivering insightful R&D news and a deep look into the lives and work of technologists.