
How JD Retail’s Data Platform Boosts Efficiency with Unified Modeling and AI‑Driven Insights

This article details JD Retail’s end‑to‑end data platform, covering data asset certification, 5W2H modeling, unified query DSL, intelligent acceleration, robust governance, visualization components, low‑code orchestration, and large‑model AI applications that together reduce query latency, cut development costs, and empower analysts across the retail business.

Data Thinking Notes

Data Asset Chapter – Asset Certification and Governance

The retail data landscape includes millions of tables, many temporary or invalid, making model discovery and usage difficult for analysts. To improve retrieval efficiency, reduce storage‑compute costs, and ensure trustworthy data, the team focused on asset certification, standardizing assets, and retiring redundant ones, covering core domains such as transactions, users, traffic, marketing, and finance.

Data Modeling Methodology

Three stages—conceptual, logical, and physical models—define a standard description of assets. The conceptual model captures business entities and relationships, the logical model refines it per business process, and the physical model materializes it with storage details, update frequency, and granularity.

Example: the table adm_d04_trade_std_ord_det_snapshot holds order data as daily incremental snapshots, keyed by the combination ord_type + sale_ord_det_id.
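A minimal sketch of how a daily increment could be folded into a snapshot keyed by ord_type + sale_ord_det_id. The source does not describe the merge logic; the "latest row wins" rule, the status column, and the sample values below are assumptions made for illustration only.

```python
from typing import Dict, Iterable, Tuple

Row = Dict[str, object]
Key = Tuple[object, object]


def composite_key(row: Row) -> Key:
    # Key columns come from the example above: ord_type + sale_ord_det_id.
    return (row["ord_type"], row["sale_ord_det_id"])


def apply_increment(snapshot: Dict[Key, Row], increment: Iterable[Row]) -> Dict[Key, Row]:
    """Return a new snapshot in which increment rows replace rows with the same key."""
    merged = dict(snapshot)  # copy so yesterday's snapshot stays untouched
    for row in increment:
        merged[composite_key(row)] = row  # assumption: the latest row wins
    return merged


# Hypothetical sample data; "status" is not a column named in the source.
yesterday = {
    ("normal", 1001): {"ord_type": "normal", "sale_ord_det_id": 1001, "status": "paid"},
}
today_increment = [
    {"ord_type": "normal", "sale_ord_det_id": 1001, "status": "shipped"},
    {"ord_type": "presale", "sale_ord_det_id": 1001, "status": "paid"},
]
today = apply_increment(yesterday, today_increment)
```

The two rows with sale_ord_det_id 1001 stay distinct because ord_type is part of the key, which is the point of a composite key.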

Data Capability Chapter – Indicator Middle‑Platform Practice

Challenges include scattered metric definitions, resource gaps, and difficulty sharing indicators across teams. The solution provides full‑stack indicator management, a native topology of indicators, a rule engine that unifies formulas, anomaly detection, and intelligent acceleration via logical wide tables.

Unified DSL example for a query:

{
  "indicators": ["ge_deal_standard_deal_ord_amt"],
  "attributes": ["shop"],
  "criteria": {
    "criterions": [
      {"propertyName": "main_brand", "value": "8557", "type": "string", "op": "="},
      {"propertyName": "dt", "value": "2023-12-21", "type": "string", "op": "="}
    ],
    "orders": [{"ascending": false, "propertyName": "ge_deal_standard_deal_ord_amt"}],
    "maxResults": 5,
    "firstResult": 0,
    "group": ["shop"]
  }
}
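To make the DSL's semantics concrete, here is a hedged sketch of how such a request might be rendered as parameterized SQL. The function dsl_to_sql, the table name hypothetical_wide_table, the operator whitelist, and the choice of SUM as the aggregation are all assumptions; the source does not show how the platform translates the DSL.

```python
from typing import Any, Dict, List, Tuple

ALLOWED_OPS = {"=", "!=", ">", ">=", "<", "<="}  # assumed operator whitelist


def dsl_to_sql(dsl: Dict[str, Any], table: str) -> Tuple[str, List[Any]]:
    """Render a DSL request as parameterized SQL plus its bind values."""
    criteria = dsl.get("criteria", {})
    group_cols = criteria.get("group", [])

    # Assumption: each indicator is a SUM over a column with the same name.
    select_parts = list(group_cols)
    select_parts += [f"SUM({name}) AS {name}" for name in dsl["indicators"]]
    sql = f"SELECT {', '.join(select_parts)} FROM {table}"

    params: List[Any] = []
    where_parts: List[str] = []
    for criterion in criteria.get("criterions", []):
        if criterion["op"] not in ALLOWED_OPS:
            raise ValueError(f"unsupported operator: {criterion['op']}")
        where_parts.append(f"{criterion['propertyName']} {criterion['op']} ?")
        params.append(criterion["value"])
    if where_parts:
        sql += " WHERE " + " AND ".join(where_parts)

    if group_cols:
        sql += " GROUP BY " + ", ".join(group_cols)

    order_parts = [
        f"{o['propertyName']} {'ASC' if o['ascending'] else 'DESC'}"
        for o in criteria.get("orders", [])
    ]
    if order_parts:
        sql += " ORDER BY " + ", ".join(order_parts)

    if "maxResults" in criteria:
        limit = int(criteria["maxResults"])
        offset = int(criteria.get("firstResult", 0))
        sql += f" LIMIT {limit} OFFSET {offset}"
    return sql, params


request = {
    "indicators": ["ge_deal_standard_deal_ord_amt"],
    "attributes": ["shop"],
    "criteria": {
        "criterions": [
            {"propertyName": "main_brand", "value": "8557", "type": "string", "op": "="},
            {"propertyName": "dt", "value": "2023-12-21", "type": "string", "op": "="},
        ],
        "orders": [{"ascending": False, "propertyName": "ge_deal_standard_deal_ord_amt"}],
        "maxResults": 5,
        "firstResult": 0,
        "group": ["shop"],
    },
}
sql, params = dsl_to_sql(request, "hypothetical_wide_table")
```

Read this way, the request asks for the top five shops by deal amount for one brand on one day. Column names are interpolated directly here for brevity; a real translator would validate them against the indicator catalog.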

Query planning splits tasks semantically and by engine, enabling parallel execution, dynamic load‑aware routing, and multi‑level caching (JIMDB + local cache) to cut redundant scans by two‑thirds and improve TP99 latency.
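The multi-level caching idea can be sketched as a read-through lookup: local memory first, then a shared cache, then the query engine. This is an illustration under assumptions, not the platform's implementation: the class TwoLevelCache is hypothetical, a plain dict stands in for the shared JIMDB tier, and expiry and eviction are omitted.

```python
from typing import Callable, Dict


class TwoLevelCache:
    """Read-through cache: local memory, then a shared tier, then the loader."""

    def __init__(self, remote: Dict[str, object], loader: Callable[[str], object]) -> None:
        self.local: Dict[str, object] = {}
        self.remote = remote  # stand-in for a shared cache such as JIMDB
        self.loader = loader  # stand-in for the expensive engine scan
        self.loads = 0        # how many times this node hit the engine

    def get(self, key: str) -> object:
        if key in self.local:
            return self.local[key]
        if key in self.remote:
            value = self.remote[key]
        else:
            value = self.loader(key)
            self.loads += 1
            self.remote[key] = value  # publish the result for other nodes
        self.local[key] = value
        return value


scans = []


def slow_scan(key: str) -> str:
    scans.append(key)  # record every real scan
    return f"result-for-{key}"


shared = {}
node_a = TwoLevelCache(shared, slow_scan)
node_b = TwoLevelCache(shared, slow_scan)
first = node_a.get("q1")   # misses both tiers and scans the engine
second = node_a.get("q1")  # served from node A's local cache
third = node_b.get("q1")   # served from the shared tier, no new scan
```

In this sketch only the first of the three identical requests reaches the engine, which is how a shared tier plus a local tier removes redundant scans across service nodes.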

Data Intelligence Chapter – Large‑Model‑Based Smart Applications

Natural‑language queries are parsed into five elements (indicator, aggregation, filter, sorting/paging, dimensions) and mapped to the DSL. A knowledge graph aligns business terms with data services, using NER, normalization, and similarity matching to resolve ambiguous entities. Prompt engineering, local LLM fine‑tuning, and continuous evaluation ensure high accuracy.
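A rough sketch of the mapping step, assuming the model has already extracted the five elements. ParsedQuery, resolve_indicator, and the TERM_TO_INDICATOR dictionary are hypothetical names; the dictionary stands in for the knowledge graph, and the standard library's difflib stands in for the similarity matcher. NER and the LLM call itself are out of scope here.

```python
import difflib
from dataclasses import dataclass, field
from typing import Any, Dict, List

# Hypothetical term dictionary standing in for the knowledge graph.
TERM_TO_INDICATOR = {"deal amount": "ge_deal_standard_deal_ord_amt"}


def resolve_indicator(term: str) -> str:
    """Normalize a business term, then fall back to fuzzy matching."""
    normalized = " ".join(term.lower().split())
    if normalized in TERM_TO_INDICATOR:
        return TERM_TO_INDICATOR[normalized]
    close = difflib.get_close_matches(normalized, list(TERM_TO_INDICATOR), n=1, cutoff=0.8)
    if not close:
        raise ValueError(f"unknown business term: {term!r}")
    return TERM_TO_INDICATOR[close[0]]


@dataclass
class ParsedQuery:
    indicator_term: str                                    # element 1: indicator
    aggregation: str = "sum"                               # element 2: aggregation
    filters: Dict[str, str] = field(default_factory=dict)  # element 3: filter
    descending: bool = True                                # element 4: sorting
    limit: int = 10                                        # element 4: paging
    offset: int = 0
    dimensions: List[str] = field(default_factory=list)    # element 5: dimensions

    def to_dsl(self) -> Dict[str, Any]:
        indicator = resolve_indicator(self.indicator_term)
        # The DSL example has no aggregation field, so it is not emitted here.
        return {
            "indicators": [indicator],
            "attributes": list(self.dimensions),
            "criteria": {
                "criterions": [
                    {"propertyName": name, "value": value, "type": "string", "op": "="}
                    for name, value in self.filters.items()
                ],
                "orders": [{"ascending": not self.descending, "propertyName": indicator}],
                "maxResults": self.limit,
                "firstResult": self.offset,
                "group": list(self.dimensions),
            },
        }


parsed = ParsedQuery(
    indicator_term="Deal  Amount",
    filters={"dt": "2023-12-21"},
    dimensions=["shop"],
    limit=5,
)
dsl = parsed.to_dsl()
```

Keeping the intermediate ParsedQuery separate from the DSL means the model only has to fill a small, checkable structure, while term resolution stays deterministic.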

Evaluation combines synthetic and human‑generated samples, automated batch comparison, and user feedback loops (like/dislike) to improve stability and relevance.
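Automated batch comparison could look like the following sketch, which checks each generated DSL against an expected one and reports accuracy plus the failing questions. The function names and the sample questions are hypothetical, and treating filter order as irrelevant is an assumption about what counts as an equivalent query.

```python
import copy
import json
from typing import Any, Dict, List, Tuple

Sample = Tuple[str, Dict[str, Any], Dict[str, Any]]  # (question, expected, predicted)


def canonical(dsl: Dict[str, Any]) -> str:
    """Serialize a DSL request so that key order and filter order do not matter."""
    normalized = copy.deepcopy(dsl)
    criteria = normalized.get("criteria", {})
    if "criterions" in criteria:
        criteria["criterions"] = sorted(
            criteria["criterions"], key=lambda c: json.dumps(c, sort_keys=True)
        )
    return json.dumps(normalized, sort_keys=True)


def batch_compare(samples: List[Sample]) -> Tuple[float, List[str]]:
    """Return (accuracy, questions whose predicted DSL differs from the expected one)."""
    failures = [
        question
        for question, expected, predicted in samples
        if canonical(expected) != canonical(predicted)
    ]
    accuracy = 1.0 - len(failures) / len(samples) if samples else 0.0
    return accuracy, failures


f_brand = {"propertyName": "main_brand", "value": "8557", "op": "="}
f_date = {"propertyName": "dt", "value": "2023-12-21", "op": "="}
samples = [
    # Same filters in a different order: counted as a match.
    ("top shops for brand 8557",
     {"indicators": ["amt"], "criteria": {"criterions": [f_brand, f_date]}},
     {"indicators": ["amt"], "criteria": {"criterions": [f_date, f_brand]}}),
    # Date filter missing from the prediction: counted as a failure.
    ("deal amount on 2023-12-21",
     {"indicators": ["amt"], "criteria": {"criterions": [f_date]}},
     {"indicators": ["amt"], "criteria": {"criterions": []}}),
]
accuracy, failures = batch_compare(samples)
```

The list of failing questions is what feeds back into prompt or fine-tuning changes; the accuracy number alone only shows whether a release regressed.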

Data Visualization Chapter – Visualization Tools

Advanced visualization components built on graphic‑grammar theory (DATA, TRANS, SCALE, COORD, ELEMENT, GUIDE) support DuPont analysis, anomaly grids, cross‑analysis tables, and automated reporting. Components are configurable via low‑code, supporting PC and mobile, with SVG/D3 rendering and interactive features such as zoom, collapse, and drill‑down.
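As a toy illustration of the graphic-grammar stages, the sketch below runs DATA through a TRANS step (aggregation), a SCALE step (linear mapping to pixels), and an ELEMENT step (bar descriptions). All function names and sample rows are hypothetical, COORD and GUIDE are omitted, and the platform's components render with SVG/D3 rather than Python.

```python
from collections import defaultdict
from typing import Callable, Dict, List


def trans_sum(rows: List[Dict[str, object]], key: str, value: str) -> Dict[str, float]:
    """TRANS: aggregate raw rows into one total per category."""
    totals: Dict[str, float] = defaultdict(float)
    for row in rows:
        totals[str(row[key])] += float(row[value])
    return dict(totals)


def linear_scale(domain_max: float, range_max: float) -> Callable[[float], float]:
    """SCALE: map a data value in [0, domain_max] to pixels in [0, range_max]."""
    def scale(v: float) -> float:
        return 0.0 if domain_max == 0 else v / domain_max * range_max
    return scale


def bar_elements(totals: Dict[str, float], scale: Callable[[float], float]) -> List[Dict[str, object]]:
    """ELEMENT: describe one bar per category, in label order."""
    return [{"label": label, "height": scale(totals[label])} for label in sorted(totals)]


# DATA: hypothetical rows, not taken from the source.
rows = [
    {"shop": "A", "amt": 10},
    {"shop": "B", "amt": 20},
    {"shop": "A", "amt": 30},
]
totals = trans_sum(rows, key="shop", value="amt")
bars = bar_elements(totals, linear_scale(max(totals.values()), 100.0))
```

Because each stage is a separate function, swapping the ELEMENT step (bars for lines, say) leaves the data and scale logic unchanged, which is what makes such components configurable through low-code.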

Low‑Code Orchestration

A custom MVC‑based state management framework (JMT) and visual orchestration system enable complex page layouts, component linking, and data‑set pipelines. Code generation and injection allow rapid customization, while micro‑frontend deployment ensures multi‑platform delivery (PC, mobile, H5) with unified security (token, cookie) and gateway protections.

Data Push

Email push generates board snapshots (image, HTML, PDF) via a Node service that drives a headless browser to capture Canvas output, handling low‑performance devices by extracting raw Canvas data and encoding it without heavy native graphics libraries.

Business Impact

The platform supports over 4 billion daily data calls, 8,000+ retail indicators, and 22 data products. Average delivery time fell from 3 days to 0.8 days, which the source reports as a 70% efficiency gain (the drop in elapsed time is (3 − 0.8) / 3 ≈ 73%). AI‑driven analysis and low‑code acceleration enable rapid report generation during major promotions, while mobile low‑code apps deliver real‑time insights to field users.

(Source: JD Retail Technology)
