How JD Retail Tackles Data Governance Challenges to Boost Efficiency
JD Retail faces growing data volume, redundant models, and rising storage and compute consumption. In response, it has adopted a comprehensive data‑governance strategy that defines standards, streamlines the architecture, isolates development from production, and optimizes compute and storage costs, with the goal of more efficient, secure, and agile data operations across the enterprise.
Data Management Challenges
JD Retail confronts three related data‑management challenges. Continuous data growth produces many inefficient and redundant data models, which raises maintenance cost and degrades data quality. Shared accounts for data management and development cause change‑management problems. A growing table count and storage footprint drive up compute and storage consumption.
Key Pain Points
Weak Asset Awareness – difficulty locating assets among hundreds of thousands of models, many temporary or duplicate tables; low confidence in using data.
Inflexible Data Architecture – dimensions tightly coupled to pre‑computed data, costly iteration, long delivery cycles, and resource‑heavy materialized wide tables.
Development Quality and Safety Risks – uncontrolled table structure changes, operational risks from parameter mismatches, and development tasks that write directly to production tables.
Rising IT Resource Costs – ever‑increasing table numbers and storage, low utilization due to invalid or duplicate tables, and high compute consumption.
Data Governance Framework
To address these issues, JD Retail proposes a comprehensive data‑governance framework covering four areas: data standards, agile architecture, isolated development, and compute‑storage optimization.
1. Data‑Standard Governance
JD Retail has defined a unified data‑language standard describing model elements such as business domain, subject, attributes, update frequency, and granularity. High‑quality, high‑value models are certified and cataloged, while low‑quality models are decommissioned to free resources. Systematic standard elements improve metadata registration and support downstream intelligent modeling.
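As a rough illustration of what registering a model against such a standard could look like, the sketch below records the elements named above (business domain, subject, attributes, update frequency, granularity) and checks them before a model is cataloged. The class name, field names, and the list of allowed frequencies are assumptions made for this example; the article does not describe JD Retail's actual schema.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Hypothetical vocabulary; the article does not list the real allowed values.
ALLOWED_FREQUENCIES = {"realtime", "hourly", "daily", "weekly", "monthly"}


@dataclass(frozen=True)
class ModelStandard:
    """Standard elements registered for one data model (illustrative only)."""
    table_name: str
    business_domain: str
    subject: str
    granularity: str
    update_frequency: str
    attributes: Tuple[str, ...] = ()


def validate(model: ModelStandard) -> List[str]:
    """Return a list of standard violations; an empty list means the model passes."""
    problems = []
    for field_name in ("business_domain", "subject", "granularity"):
        if not getattr(model, field_name).strip():
            problems.append(f"missing {field_name}")
    if model.update_frequency not in ALLOWED_FREQUENCIES:
        problems.append(f"unknown update_frequency: {model.update_frequency!r}")
    if not model.attributes:
        problems.append("no attributes registered")
    return problems
```

A model with an empty violation list would be a candidate for certification, while one with many violations would be a candidate for cleanup or decommissioning.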
2. Architecture Governance
The focus is on making the architecture more agile. Logical virtual tables replace physical wide tables, abstracting data models into dimensions and metrics. Automated materialization, driven by history‑based (HBO), cost‑based (CBO), and rule‑based (RBO) optimization, decides which query paths to pre‑materialize, reducing manual effort and IT cost. JD Retail is also exploring lake‑warehouse integration with incremental updates and unified stream‑batch processing.
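The history-based part of automated materialization (HBO in the text above) can be sketched as counting which dimension and metric combinations were queried most often and materializing only those. This is a simplified illustration, not JD Retail's algorithm: the function name, the `min_hits` threshold, and the `budget` cap are all hypothetical, and a real system would also weigh compute and storage cost.

```python
from collections import Counter


def pick_materializations(query_log, min_hits=3, budget=2):
    """Choose which (dimensions, metric) paths to pre-materialize.

    query_log: iterable of (dimensions, metric) pairs, one per historical
    query against the logical table. Dimension order is ignored.
    """
    hits = Counter((frozenset(dims), metric) for dims, metric in query_log)
    # Keep only paths that were queried often enough to be worth storing.
    frequent = [(key, n) for key, n in hits.items() if n >= min_hits]
    # Most frequent first; ties broken by name so the result is deterministic.
    frequent.sort(key=lambda item: (-item[1], sorted(item[0][0]), item[0][1]))
    return [(sorted(dims), metric) for (dims, metric), _ in frequent[:budget]]


log = (
    [(("day", "brand"), "gmv")] * 5
    + [(("brand", "day"), "gmv")] * 2
    + [(("day",), "orders")] * 3
    + [(("sku",), "gmv")]
)
chosen = pick_materializations(log)
```

Queries that miss a materialized path would still be answered from the logical table's underlying data, only more slowly.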
3. Development Governance
Development‑production isolation separates accounts, tables, and queues, ensuring secure data production.
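A minimal sketch of such an isolation rule follows: each account belongs to one environment, and it may only write to tables and submit to queues of that same environment. The `dev_` schema prefix and the queue names are assumed conventions for the example, not details given in the article.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Account:
    name: str
    environment: str  # "dev" or "prod"


def table_environment(table: str) -> str:
    """Infer a table's environment from its schema name.

    Assumed convention: development schemas are prefixed with "dev_".
    """
    schema = table.split(".", 1)[0]
    return "dev" if schema.startswith("dev_") else "prod"


def can_write(account: Account, table: str) -> bool:
    """An account may only write to tables in its own environment."""
    return account.environment == table_environment(table)


def queue_for(account: Account) -> str:
    """Route jobs to a per-environment queue (hypothetical queue names)."""
    return f"queue_{account.environment}"


dev = Account("etl_dev", "dev")
prod = Account("etl_prod", "prod")
```

With a rule like this, a development task that tries to write directly to a production table is rejected before it runs, which addresses one of the risks listed under Key Pain Points.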
4. Resource Governance
Storage governance includes table lifecycle management, identification of invalid or near‑duplicate tables, and data compression. Compute governance targets idle tasks, low‑utilization jobs, and frequently failing jobs, and it optimizes operators and engines, including scheduling work off‑peak (peak shaving).
By mining active metadata, JD Retail builds governance models and visual dashboards that provide actionable recommendations. The program exceeded its targets in 2023 and has become a sustainable, ongoing governance operation.
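Two of the storage checks above can be approximated from metadata alone. The sketch below flags tables that have not been read recently and pairs of tables whose column sets overlap heavily. The 90-day idle window, the 0.8 Jaccard threshold, and the metadata layout are assumptions chosen for illustration; the article does not state the rules JD Retail uses.

```python
from datetime import date, timedelta
from itertools import combinations


def stale_tables(last_read, today, max_idle_days=90):
    """Return tables whose last read is older than the idle window."""
    cutoff = today - timedelta(days=max_idle_days)
    return sorted(name for name, read_on in last_read.items() if read_on < cutoff)


def jaccard(a, b):
    """Jaccard similarity of two column collections (0.0 if both are empty)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def similar_tables(columns, threshold=0.8):
    """Return (table, table, score) for pairs with heavily overlapping columns."""
    pairs = []
    for left, right in combinations(sorted(columns), 2):
        score = jaccard(columns[left], columns[right])
        if score >= threshold:
            pairs.append((left, right, round(score, 2)))
    return pairs


last_read = {"orders_v1": date(2023, 1, 1), "orders_v2": date(2023, 12, 1)}
columns = {
    "orders_v1": ["id", "sku", "amt", "dt"],
    "orders_v2": ["id", "sku", "amt", "dt", "channel"],
    "users": ["id", "name"],
}
idle = stale_tables(last_read, today=date(2023, 12, 31))
dupes = similar_tables(columns)
```

Column overlap is only a first-pass signal; confirming that two tables are true duplicates would also require comparing lineage or data content.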
DataFunTalk
Dedicated to sharing and discussing big data and AI technology applications, aiming to empower a million data scientists. Regularly hosts live tech talks and curates articles on big data, recommendation/search algorithms, advertising algorithms, NLP, intelligent risk control, autonomous driving, and machine learning/deep learning.
