Tagged articles
548 articles
vivo Internet Technology
Apr 20, 2022 · Big Data

Implementing Field Lineage in Spark SQL: A Technical Deep Dive

The article details how to add field‑lineage tracking to Spark SQL by creating a custom SparkSessionExtension that injects a check‑analysis rule and a parser, which capture INSERT statements, analyze the physical plan, and generate a JSON mapping of source‑to‑target fields for data governance.

Data Governance · Data Quality · Field Lineage
9 min read
Alibaba Cloud Developer
Apr 7, 2022 · Big Data

How Alibaba’s Big Data Model Governance Boosted Efficiency and Cut Costs

This article details Alibaba's large‑scale data model governance initiative, analyzing current data issues, presenting a comprehensive solution—including model digitization, public model sinking, productization, daily governance, and search‑enhancement—and outlining achieved results and future plans to further improve data quality, reuse, and operational efficiency.

Data Governance · DataWorks · Model Scoring
12 min read
dbaplus Community
Mar 15, 2022 · Big Data

How to Build a Real‑Time Data Warehouse with Flink SQL: Architecture, Implementation, and Governance

This article explains the challenges of early real‑time data pipelines, introduces a layered real‑time warehouse architecture, provides step‑by‑step Flink SQL code for building a demo warehouse, and covers comprehensive data governance, quality metrics, lifecycle management, and naming conventions for production‑grade big‑data systems.

Data Governance · Data Quality · Flink SQL
60 min read
BaiPing Technology
Mar 14, 2022 · Big Data

Mastering DataWorks & MaxCompute: A Complete Guide to Big Data Architecture and Governance

DataWorks, Alibaba Cloud’s comprehensive PaaS platform, combined with the serverless MaxCompute data warehouse, offers an integrated solution for data integration, development, quality, and services, while detailed naming and layer conventions ensure scalable, maintainable big‑data architectures and effective governance across ODS, CDM, DWD, DWS, and ADS layers.

Big Data · Data Governance · DataWorks
8 min read
政采云技术
Feb 8, 2022 · Industry Insights

Unlocking Enterprise Value with a Data Middle Platform: Architecture & Indicators

This article traces the evolution from traditional data warehouses to modern data lakes and data middle platforms, explains why siloed data development hampers efficiency, and details the architecture and indicator‑library design used by Zhengcaiyun to achieve unified, reusable data services.

Big Data · Data Governance · Data Lakehouse
14 min read
DevOps
Jan 26, 2022 · R&D Management

Digital R&D Management Capability Building for Financial Organizations

This article outlines the comprehensive architecture and key points for building digital R&D management capabilities in financial organizations, reviewing historical challenges, identifying four major pain points, proposing an overall framework, detailing twelve essential capabilities, and offering principles for effective implementation.

Data Governance · Digital Transformation · Financial Industry
20 min read
DataFunTalk
Jan 24, 2022 · Big Data

MobTech Data Governance and Security Practices: Architecture, Implementation, and Financial Industry Use Cases

This article presents MobTech’s comprehensive data governance and security practices, covering the necessity of governance, its benefits, a full‑chain governance framework, specific challenges in the financial sector, the evolution of their integrated architecture, and detailed implementations of security, model, asset, monitoring, and quality management systems.

Data Governance · Data Quality · financial technology
21 min read
DataFunSummit
Jan 23, 2022 · Big Data

MobTech's Integrated Data Governance Practices and Architecture

This article presents MobTech's comprehensive data governance and security practices, covering the necessity of governance, challenges in large‑scale data environments, the full‑link governance chain, modular architecture, and specific implementations for financial risk‑control scenarios.

Big Data · Data Architecture · Data Governance
19 min read
Big Data Technology & Architecture
Jan 18, 2022 · Big Data

Data Warehouse Data Quality Measurement Standards

The article outlines four key dimensions for evaluating data warehouse data quality—correctness, completeness, timeliness, and consistency—explains common consistency issues such as differing metric values across models, cross‑dimensional aggregations, and real‑time versus batch calculations, and proposes organizational and review mechanisms to mitigate these problems.

Big Data · Consistency · Data Governance
9 min read
21CTO
Jan 13, 2022 · Fundamentals

How to Achieve Data Maturity: Turning Data into a Strategic Product

The article explains why data maturity is essential for modern enterprises, defines its three pillars (people, tools, and readiness), shows how treating data as a product follows the same principles as building great products, and outlines the four S's (Speed, Scale, Simplicity, SQL) that guide a mature data ecosystem.

Big Data · Data Governance · Data Product
6 min read
21CTO
Jan 8, 2022 · Big Data

How Amazon’s Intelligent Lakehouse Redefines Big Data Architecture

The article examines Amazon’s Intelligent Lakehouse architecture, tracing its evolution from early data‑lake‑warehouse integrations to a modern, serverless, secure, and AI‑enhanced platform that unifies data storage, governance, and analytics to lower big‑data costs and boost agility.

Big Data · Data Governance · Data Lake
12 min read
Volcano Engine Developer Services
Jan 4, 2022 · Big Data

How ByteDance Scales EB-Level Data: Architecture, BP Model & Real-Time Insights

ByteDance’s data platform, built over seven years, now handles exabyte-scale data and over 100 million TPS, using a hybrid “middle‑platform + Business Partner” model, custom engines like ClickHouse/ByteHouse, agile governance, and a suite of products to support internal and external businesses, illustrating large-scale big-data engineering practices.

Big Data · ByteDance · Data Governance
22 min read
Big Data Technology & Architecture
Jan 4, 2022 · Big Data

Big Data Mastery Roadmap: Learning Path, Resources, Future Trends and Interview Guidance

This comprehensive guide outlines a step‑by‑step learning roadmap for aspiring big data professionals, covering fundamentals, programming languages, Linux, databases, distributed theory, networking, offline and real‑time computing, data governance, warehouses, toolchains, video/book recommendations, future industry trends, interview tips, and community resources.

Big Data · Data Governance · Distributed Systems
42 min read
dbaplus Community
Dec 22, 2021 · Fundamentals

How Xiaomi Built a Scalable Metadata Platform for Data Governance

This article details Xiaomi's end‑to‑end metadata platform, covering its three‑layer architecture, the evolution of full‑domain metadata, real‑time lineage, precise measurement, and how these capabilities enable data map, governance, cost control, and quality improvements for future business empowerment.

Data Governance · Data Quality · Xiaomi
20 min read
DataFunSummit
Dec 22, 2021 · Big Data

Data Governance Practices and Experiences at NetEase Cloud Music

This article details NetEase Cloud Music's comprehensive data governance journey, covering data warehouse architecture, data standards, event tracking (埋点) governance, asset lifecycle management, and future automation plans, illustrating how systematic governance improves data quality, cost efficiency, and business insight.

Big Data · Data Governance · data-warehouse
21 min read
Architects Research Society
Dec 21, 2021 · Fundamentals

Next-Generation Master Data Management (MDM): Architecture, Business Value, and Technical Challenges

This article explains master data management concepts, regulatory drivers, business benefits, key technical challenges, architectural trends such as graph databases and machine learning, and highlights leading vendors, providing a comprehensive overview for enterprises seeking modern MDM solutions.

Analytics · Big Data · Data Governance
9 min read
Architects Research Society
Dec 20, 2021 · Fundamentals

Common Misconceptions About Master Data Management (MDM)

The article explains common misconceptions about Master Data Management, emphasizing its enterprise-wide scope, the importance of data quality, governance, workflow, real‑time integration, and the need for organizational change management, while warning against treating MDM as a simple project.

Data Governance · Data Quality · MDM
8 min read
Baidu Geek Talk
Dec 20, 2021 · Mobile Development

Master Data Management: Concepts, Architecture, and Practical Implementation in Baidu Smart Mini Programs

The article outlines master data management concepts and maturity levels, then details Baidu Smart Mini Program’s practical architecture—spanning analysis, domain‑driven design, high‑availability services, transaction handling, caching, real‑time sync, and governance—that eliminates data silos, ensures consistency, and supports over 9,000 QPS with 99.99% SLA.

Baidu Mini Programs · Data Governance · Master Data Management
16 min read
Ctrip Technology
Dec 16, 2021 · Big Data

Data Standard Management Practices in Ctrip Vacation Data Governance

This article outlines Ctrip Vacation's data standard management approach, covering why standards are needed, the three‑element framework of scope, tools, and policies, and detailed practices for data integration, production change handling, metadata governance, portal dashboard standardization, and self‑service query templating.

Big Data · Data Governance · Data Integration
12 min read
DataFunSummit
Dec 14, 2021 · Big Data

Data Map: Background, Definition, and Youzan’s Practical Implementation

This article introduces the concept of a data map, explains its background and goals, describes Youzan’s end‑to‑end data‑map practice—including full data lineage, search, management, link analysis, impact estimation, and optimization—and concludes with a summary and future outlook.

Big Data · Data Governance · Data Lineage
16 min read
DataFunTalk
Dec 10, 2021 · Big Data

Building and Evolving NetEase Yanxuan Real-Time Computing Platform: Architecture, SQLization, Serviceization, and Data Governance

This article details NetEase Yanxuan's real-time computing platform development from 2017 to present, covering its architecture, Flink‑SQL development environment, service‑oriented deployment, resource optimization, cloud‑native migration, comprehensive data governance, and future plans for stream‑batch integration and intelligent job diagnostics.

Big Data · Cloud Native · Data Governance
14 min read
DataFunSummit
Dec 10, 2021 · Big Data

Real‑Time Platform Construction at NetEase Yanxuan: Architecture, SQL‑Based Streaming, Serviceization, and Data Governance

This article details NetEase Yanxuan's evolution of a real‑time data platform from 2017 to present, covering background, current scale, layered architecture, Flink‑SQL development IDE, service‑oriented task execution, resource‑optimizing deployment modes, cloud‑native migration, comprehensive data governance, and future batch‑stream integration plans.

Big Data · Cloud Native · Data Governance
15 min read
IT Architects Alliance
Dec 8, 2021 · Industry Insights

6 Proven Strategies to Modernize Your Cloud Data Warehouse

This article outlines six practical strategies—identifying bottlenecks, empowering data engineers, adopting distributed management, creating data contracts, embracing diverse perspectives, and streamlining workflows—to help organizations leverage cloud data warehouses more efficiently and drive better business intelligence outcomes.

Business Intelligence · Data Governance · cloud computing
8 min read
Big Data Technology & Architecture
Nov 28, 2021 · Big Data

OneData Methodology: Building a Unified Data Warehouse Architecture and Governance Framework

This article presents the OneData methodology for designing, standardizing, and governing a data warehouse, detailing background challenges, goals, industry references, core concepts, unified business and design consolidation, data modeling layers, naming conventions, data quality controls, and the resulting operational improvements and business value.

Big Data · Data Governance · Onedata
20 min read
DataFunTalk
Nov 27, 2021 · Big Data

iQIYI Data Middle Platform: Architecture, Data Governance Practices, and Future Plans

The article details iQIYI’s data middle platform architecture and its comprehensive data governance practices, covering platform overview, data flow, unified standards, metadata management, production quality assurance, and future AI‑driven enhancements, illustrating how centralized data services improve reliability, efficiency, and security.

Big Data · Data Governance · Data Quality
27 min read
AntTech
Nov 26, 2021 · Information Security

Achieving “Computable but Not Identifiable”: Balancing Personal Data Protection and Industry Development with Trusted Computing

The article examines how the Personal Information Protection Law creates a new authorization framework and introduces the “computable but not identifiable” concept, arguing that trusted‑computing technologies and controlled environments can reconcile strict privacy safeguards with the data‑driven needs of AI and other industries.

Data Governance · artificial intelligence · data anonymization
10 min read
Baidu Geek Talk
Nov 24, 2021 · Big Data

Building Big Data Infrastructure at Baidu Aifanfan: Architecture Practices and Lessons Learned

At Baidu Aifanfan, the data team built a unified real‑time and offline big‑data platform—leveraging Watt, Bigpipe, Fengge, AFS and Palo within Lambda/Kappa patterns and a fast‑slow parallel rollout—that cut OLAP query latency from 18 minutes to under 15 seconds, enabled self‑service analytics, and standardized metrics across 15 agile teams.

Apache Doris · Big Data Architecture · Data Governance
23 min read
21CTO
Nov 23, 2021 · Information Security

Dynamic Data Security: Unlocking Data Value and Protecting Privacy in Banking

In a recent statement, ICBC’s CTO emphasizes that data, as a crucial production factor, derives its core value during use, urging dynamic data security and personal information protection, cross‑institution collaboration, regulated data markets, and safe cross‑border flows to support a healthy digital economy.

Data Governance · Privacy Computing · cross‑border data
3 min read
DataFunTalk
Oct 30, 2021 · Big Data

Product Practice of Data Governance Tools at NetEase: Review, Pain Points, Strategy, and Future Planning

The presentation at the DataFun Summit detailed NetEase's data‑governance tool practice, reviewing past initiatives, current challenges, comprehensive product strategies, and future roadmap to improve compute and storage efficiency, cost quantification, and systematic governance across business lines.

Big Data · Data Governance · Data Lifecycle
13 min read
DataFunTalk
Oct 27, 2021 · Big Data

Data Value System and Cockpit Construction: A Case Study from CITIC Bank

This article explains how CITIC Bank's software development center built a data value system and management cockpit, detailing business objectives, overall architecture, digital management methodology, implementation steps, and real‑world usage to support the bank's digital transformation.

Big Data · Data Governance · Digital Transformation
16 min read
DataFunSummit
Oct 26, 2021 · Big Data

Data Value System and Cockpit Construction: A Case Study from CITIC Bank

This article presents a comprehensive overview of CITIC Bank's data value system and cockpit construction, detailing business objectives, overall planning, digital management framework, methodology, implementation cases, and current usage, illustrating how data-driven analytics support the bank's digital transformation.

Big Data · Data Cockpit · Data Governance
17 min read
High Availability Architecture
Oct 25, 2021 · Big Data

iQIYI Data Governance Practices: Event Tracking (Pingback) Governance and Application

The article details iQIYI's comprehensive data governance initiative for event tracking (Pingback), covering definitions, timing, quality requirements, governance challenges, standardized specifications, coordinate management, testing and gray‑release processes, upgrade workflows, and data security measures that together reduced event volume by 40% and cut resource consumption in half.

Analytics · Big Data · Data Governance
16 min read
DataFunTalk
Oct 25, 2021 · Big Data

Building a Multi‑Dimensional Analysis System at Baixin Bank: Practices and Insights

This article details Baixin Bank’s multi‑dimensional analysis framework, covering the bank’s business model, data accuracy, completeness and usability requirements, the design of indicator and analysis systems, ladder‑style service concepts, user‑product‑enterprise scenario modeling, and the implementation of self‑service data products and governance processes.

BI · Banking · Data Governance
20 min read
iQIYI Technical Product Team
Oct 15, 2021 · Industry Insights

How iQIYI Streamlined Event Tracking: A Deep Dive into Data Governance

This article details iQIYI's comprehensive data‑governance practice for event tracking, covering the definition of pingback, the need for governance, the governance framework, coordinate management, gray‑data handling, and the upgrade process that reduced tracking volume by 40% while cutting resource consumption in half.

Analytics · Big Data · Data Governance
17 min read
iQIYI Technical Product Team
Oct 9, 2021 · Big Data

iQIYI Data Quality Monitoring: Exploration and Practice

At iTech Salon, iQIYI’s Peng Tao outlined a three‑layer data‑quality monitoring framework—pingback, middle, and business report layers—detailing anomaly‑detection techniques such as thresholds, statistical, correlation and Prophet forecasting, and announced future plans for intelligent rule generation and automated attribution to pinpoint root causes.

Data Governance · Data Quality · rule engine
11 min read
IT Architects Alliance
Sep 12, 2021 · Industry Insights

Data Warehouse vs. Database: Core Differences and Building a Data Platform

This article explains what a data warehouse is, contrasts it with traditional databases, outlines how to design and build a data warehouse—including model selection, topic domain division, bus matrix, layered architecture, and data governance—then expands to the concept of a data middle platform and its distinction from data lakes and big‑data platforms.

Big Data · Data Governance · Data Platform
18 min read
WecTeam
Sep 10, 2021 · Mobile Development

Boost Build Speed 35%: Swift‑ObjC Mixed Compilation & ByteDance Data Governance

This week’s WecTeam Front‑end Weekly spotlights two technical deep‑dives: a Swift‑Objective‑C mixed‑compilation technique that slashes build times by 35%, and ByteDance’s large‑scale data‑tracking governance framework that underpins its trillion‑plus real‑time analytics pipeline.

ByteDance · Compilation Optimization · Data Governance
2 min read
DevOps
Sep 6, 2021 · Operations

Huawei's Digital Transformation Practice: Management, Process, and Technology Evolution

This article presents Huawei's extensive digital transformation journey, detailing the continuous management system reforms, strategic shifts across multiple industries, data governance challenges, and practical initiatives such as cloud platforms, intelligent supply chains, and customer‑centric digital experiences that together illustrate how large enterprises can achieve sustainable growth through digitalization.

Data Governance · Enterprise Management · cloud computing
33 min read
dbaplus Community
Aug 31, 2021 · Big Data

How Meituan Waimai Built and Evolved Its Massive Data Warehouse from V1 to V3

This article details Meituan Waimai's data warehouse evolution—covering business context, four‑layer architecture, Spark‑based ETL, successive V1.0, V2.0, and V3.0 redesigns, data governance practices, resource‑optimization tactics, security measures, and future road‑maps—illustrated with diagrams and concrete technical choices.

Data Governance · ETL · Resource Optimization
24 min read
DataFunTalk
Aug 30, 2021 · Fundamentals

20 Practical Strategies for Effective Data Governance

Effective data governance hinges on leadership commitment, clear policies, skilled teams, and integration into business processes, and this article outlines twenty actionable strategies—from securing executive support and embedding rules in systems to fostering data quality, visualization, and sustainable operations—to guide organizations toward successful governance.

Data Governance · Data Quality · Leadership
8 min read
DataFunSummit
Aug 22, 2021 · Big Data

Evolution and Optimization of Meituan Waimai Offline Data Warehouse: Architecture, ETL, Modeling, Governance, and Future Plans

This article details the historical development, architectural layers, ETL migration to Spark, data modeling standards, governance processes, resource optimization, security measures, and future roadmap of Meituan Waimai's offline data warehouse, illustrating how the team addressed scalability and efficiency challenges.

Big Data · Data Governance · ETL
21 min read
21CTO
Aug 17, 2021 · Fundamentals

How Traditional Enterprises Can Master Digital Transformation: A 14th Five-Year Blueprint

The article explains that digital transformation for traditional companies requires a mindset shift beyond mere analysis of status and environment, and offers a PPT detailing the 14th Five-Year Plan, transformation roadmap, data governance, and AI application to guide enterprises through this strategic overhaul.

14th Five-Year Plan · AI · Data Governance
2 min read
Volcano Engine Developer Services
Aug 3, 2021 · Big Data

Inside ByteDance’s Traffic Platform: Powering Trillions of Real‑Time Events

This article, compiled from a Volcano Engine meetup, explains how ByteDance’s unified traffic platform designs, governs, and processes massive event‑tracking data in real time, covering embedding content solutions, link architecture, dynamic processing engines, and data‑governance practices that support trillions of daily events.

Big Data · Data Governance · Real-time Processing
16 min read
IT Architects Alliance
Jul 31, 2021 · Big Data

Alibaba's Data Platform Evolution: Four Stages, Core Challenges, and Future Trends

The article outlines Alibaba's twelve‑year journey building a data middle‑platform, detailing four development stages, the four major technical challenges faced, and emerging trends such as lake‑warehouse integration, autonomous data‑warehouse operation, natural‑language query, and AI‑driven data engineering.

Alibaba · Data Governance · Data Middle Platform
17 min read
ITPUB
Jul 7, 2021 · Big Data

How NetEase Cloud Music Scaled Its Data Warehouse for Billion‑User Traffic

This article details NetEase Cloud Music's journey of redesigning its data warehouse and governance processes to support over a billion monthly active users, covering pain points, standardization, shared services, self‑service tools, and the resulting improvements in data quality, latency, and operational efficiency.

Analytics · Data Governance · Data Platform
19 min read
Architect
Jul 1, 2021 · Big Data

Data Governance Practices at Meituan Hotel Travel Platform

This article presents a comprehensive case study of Meituan's hotel‑travel data governance, covering the background, challenges, strategic goals, standardized processes, technical systems, cost and security optimizations, measurable outcomes, and future plans for automated governance.

Big Data · Cost Optimization · Data Governance
29 min read
360 Tech Engineering
Jun 25, 2021 · Big Data

Introducing ULTRON: A Real‑Time Data Warehouse Platform Powered by FlinkSQL

ULTRON is a one‑stop real‑time data‑warehouse development platform built on FlinkSQL that unifies data integration, asset management, cluster deployment, modeling, ETL, OLAP analysis and governance, addressing the limitations of traditional batch‑oriented warehouses and simplifying streaming data workflows for developers.

Data Governance · FlinkSQL · Streaming
13 min read
ITFLY8 Architecture Home
Jun 21, 2021 · Big Data

What Is a Big Data Platform and How to Design Its Architecture?

This article explains what a big data platform is, outlines its seven‑component overall architecture, details the technical stack from data sources to applications, and describes the key subsystems such as catalog management, data integration, governance, storage, processing, sharing, development, and analysis.

Data Governance · Data Integration · Distributed Systems
11 min read
58 Tech
Jun 9, 2021 · Big Data

Designing and Implementing a Unified Data Metric System for 58 Commercial Data Team

This article explains how 58's commercial data team built a comprehensive data metric system—from identifying common metric definition issues to establishing a domain‑driven hierarchy, distinguishing atomic and derived metrics, implementing a unified metric management platform, and providing APIs and examples for querying and visualizing metrics.

Big Data · Data Governance · Java
17 min read
Big Data Technology & Architecture
Jun 6, 2021 · Big Data

Understanding Data Warehouses: Concepts, Architecture, Modeling, and Governance

This article provides a comprehensive overview of data warehouses, explaining their purpose, differences from databases, OLTP vs OLAP, traditional versus internet data warehouse models, layered architecture, modeling theories, metric dictionaries, date dimensions, naming conventions, data governance, and incremental synchronization techniques with practical SQL examples.

Big Data · Data Governance · ETL
24 min read
ITFLY8 Architecture Home
Jun 1, 2021 · Fundamentals

How Huawei Built a Comprehensive Data Governance Framework for Digital Transformation

Huawei’s 2017 digital‑transformation vision led to a five‑step data‑governance blueprint that evolved through two phases, defining a detailed data‑classification framework, structured and unstructured data management methods, metadata governance, and compliance‑driven external data handling to support enterprise‑wide intelligent operations.

Data Governance · data classification · metadata
20 min read
Efficient Ops
May 30, 2021 · Operations

How Intelligent Operations Are Redefining IT Management – Key Takeaways from the 2021 GOPS Conference

The 2021 GOPS Global Operations Conference in Shenzhen highlighted the shift toward intelligent, AI‑driven IT operations, presenting practical solutions, a three‑principle six‑step framework, and four core capabilities that help enterprises digitize, govern, and automate their operational data for higher efficiency.

Data Governance · IT Operations · Intelligent Operations
0 likes · 7 min read
IT Architects Alliance
May 25, 2021 · Big Data

How Modern Data Middle Platforms Power Real‑Time and Offline Analytics

This article provides a comprehensive technical overview of data middle platforms, covering data aggregation, offline and real‑time development, smart operations, data asset management, governance, service layers, platform implementations, warehouse layering, and key differences between offline and real‑time data warehouses.

Big Data · Data Governance · Data Platform
0 likes · 26 min read
Architects Research Society
May 23, 2021 · Big Data

Data Architecture Trends: From Chaos to an Organized Era – Insights from Anthony J. Algmin

The article reviews Anthony J. Algmin’s reflections on past data‑architecture predictions, current hot topics such as cloud, AI/ML, data governance, and real‑time analytics, and forecasts future trends including metadata management, blockchain, and the evolving role of data architects within enterprises.

Big Data · Data Architecture · Data Governance
0 likes · 13 min read
Programmer DD
May 22, 2021 · Big Data

What Is a Data Lake? Origins, Architecture, and How It Powers Modern Big Data

This article explains the concept of a data lake—its origin in 2011, how it differs from traditional databases and data warehouses, its core characteristics such as raw data storage, on‑demand computing, and schema‑on‑read, as well as its advantages, challenges, architectural components, and future outlook within the big‑data ecosystem.

Big Data · Data Architecture · Data Governance
0 likes · 20 min read
Big Data Technology & Architecture
May 19, 2021 · Big Data

Comprehensive Guide to Data Governance: Metadata, Data Quality, Standards, and Asset Management

This article provides an extensive overview of data governance in the big‑data era, covering common pitfalls, the role of metadata, data quality management, data standardization, and data asset management, and offers practical recommendations for organizations to implement effective governance practices.

Big Data · Data Asset Management · Data Governance
0 likes · 42 min read
Big Data Technology & Architecture
May 15, 2021 · Big Data

One‑Stop Big Data Platform Construction: Practices from WeBank, Beike, and iQIYI

This article shares practical notes on building a one‑stop big data platform, outlining essential functions such as data extraction, cleaning, storage, analysis, governance, and security, and presents implementation case studies from WeBank, Beike, and iQIYI to illustrate real‑world architectures and solutions.

Big Data · Case Study · Data Governance
0 likes · 8 min read
Big Data Technology & Architecture
May 11, 2021 · Big Data

Data Quality: Dimensions, Rules, and Constraints

The article explains the importance of data quality in the big data era, defines key quality dimensions such as completeness, uniqueness, validity, consistency, accuracy, timeliness, and credibility, and details how each dimension can be measured and enforced through specific constraints and validation rules.

Big Data · Consistency · Data Governance
0 likes · 9 min read
Architecture Digest
May 7, 2021 · Big Data

Comprehensive Overview of Data Middle Platform Architecture and Practices

This article provides a detailed introduction to data middle platform concepts, covering data aggregation, ingestion tools, offline and real‑time development, data governance, service layers, monitoring, and deployment patterns, illustrating how enterprises build unified data ecosystems across various industries.

Big Data · Data Governance · Data Platform
0 likes · 25 min read
Big Data Technology & Architecture
Apr 22, 2021 · Big Data

Debunking Common Misconceptions About Data Lakes

This article debunks eight common misconceptions about data lakes, explains why they are not mutually exclusive with data warehouses, clarifies that they are not limited to Hadoop or raw data only, and provides practical tips for building flexible, secure, and business‑driven data lake solutions.

Analytics · Big Data · Cloud Services
0 likes · 21 min read
Meituan Technology Team
Apr 15, 2021 · Big Data

Data Governance Practices at Meituan Hotel & Travel Platform

Meituan’s hotel‑travel platform tackled mounting data‑quality, cost, efficiency, and security issues by establishing a full‑link governance framework (standardized processes, a Data Management Committee, and unified “One Model, One Logic, One Service, One Portal” systems). The framework cut per‑unit costs by about 40%, raised engineer productivity by more than 60%, eliminated major security incidents, and set the stage for autonomous, AI‑driven data governance.

Big Data · Data Governance · Data Quality
0 likes · 32 min read
Efficient Ops
Mar 31, 2021 · Operations

Uncovering Digital Risks in DevOps: Safeguarding Your Digital Transformation

This article examines how result‑oriented DevOps drives digital transformation while exposing digital risks—from missing high‑level test scenarios and broken security data links to insufficient user‑experience foresight—and outlines strategies for data governance, risk mitigation, and effective decision‑support across the enterprise.

Data Governance · DevOps · Digital Transformation
0 likes · 12 min read
dbaplus Community
Mar 17, 2021 · Big Data

How We Cut PBs of Waste and Optimized HDFS with Tiered Storage and Cloud Migration

This article details a three‑part technical talk covering cost governance for offline Hadoop clusters, a large‑scale data‑center migration with architecture upgrades, and a tiered storage strategy that uses EC and COS to reduce storage costs and improve performance in a cloud‑native big‑data environment.

Big Data Migration · COS · Cloud Native
0 likes · 10 min read
Baidu Intelligent Testing
Mar 10, 2021 · Artificial Intelligence

End-to-End Consistency Assurance for Click‑Through Rate Models: Methodology, Implementation, and Reporting

This article presents a comprehensive model quality assurance framework for click‑through‑rate (CTR) prediction, detailing the challenges of data and logic inconsistency, defining consistency goals, describing a full‑stack verification pipeline—including online data capture, offline sample alignment, multi‑stage q‑value comparison, and automated reporting—and sharing practical deployment experiences and results.

CTR · Data Governance · machine learning
0 likes · 19 min read
Suning Technology
Mar 3, 2021 · Big Data

How Can China Build a Secure, Free Data Sharing Ecosystem?

The article examines China's push for free public data sharing, highlighting policy directives, the need for top‑level design, security standards, and education to create a unified, safe data‑governance framework that fuels the digital economy.

Big Data · Data Governance · Digital Economy
0 likes · 6 min read
DataFunTalk
Feb 23, 2021 · Big Data

Meituan Hotel & Travel Data Governance: Journey, Practices, and Future Directions

This article outlines Meituan's hotel‑travel data governance evolution, describing the key quality, cost, security, standardization and efficiency challenges faced as the business scaled, and detailing the organizational, technical, metric, service and product‑entry solutions implemented to achieve systematic, measurable, and automated data governance.

Big Data · Data Governance · data security
0 likes · 19 min read
Yanxuan Tech Team
Feb 5, 2021 · Big Data

How NetEase Yanxuan Built a Robust Data Task Governance System in 2020

This article details NetEase Yanxuan's 2020 initiative to improve data task governance, describing identified pain points, the pre‑mid‑post framework for model, baseline, and incident handling, and the resulting products, processes, and future plans for a more reliable data warehouse.

Baseline Management · Data Governance · Data Quality
0 likes · 27 min read

NetEase Yanxuan Data Task Governance Practice: Pre‑, In‑, and Post‑Operation Strategies

NetEase Yanxuan tackled data‑task governance by establishing pre‑operation guarantees, baseline‑driven in‑operation controls, and post‑operation interventions, delivering stable task output, reduced alarms, lineage awareness, rapid incident recovery, and reusable best‑practice products that earned the 2020 Technology Sharing Co‑building Award.

Baseline Management · Big Data · Data Governance
0 likes · 25 min read
21CTO
Jan 25, 2021 · Big Data

Understanding Data Lakes vs. Data Warehouses: A Complete Guide

This article provides a comprehensive overview of data lakes and data warehouses, explaining their definitions, architectures, differences, and practical use cases, while also covering related concepts such as OLTP/OLAP, ETL processes, data governance, and modern lakehouse solutions.

Data Governance · Data Lake · data-warehouse
0 likes · 95 min read
Youzan Coder
Jan 20, 2021 · Information Security

How Youzan Built a Scalable Big Data Security Framework for Privacy Protection

This article details Youzan's end‑to‑end big data security architecture, covering data lifecycle protection, classification, access control, auditing, backup, privacy safeguards, sensitive data detection, masking strategies, and compliance processes to ensure secure and compliant data handling across the platform.

Data Governance · Sensitive Data Detection · big data security
0 likes · 18 min read
Architects Research Society
Jan 13, 2021 · Fundamentals

Master Data Management (MDM): Concepts, Business Value, Technical Challenges, and Architectural Considerations

The article explains master data management (MDM) as a framework for creating a single, reliable source of truth, outlines its growing business relevance, discusses key technical challenges such as data governance and scalability, and explores next‑generation architectures involving graph databases, big data, and machine learning.

Big Data · Data Governance · Graph Database
0 likes · 10 min read
ITPUB
Jan 12, 2021 · Databases

What the Latest DTCC Conference Reveals About the Future of Databases

The DTCC conference recap explores emerging data trends, multi‑model databases, governance frameworks, architecture migrations, NewSQL and MySQL high‑availability, distributed transaction challenges, AI‑driven operations, data middle‑platform debates, cloud‑native storage‑compute separation, and comprehensive data security across the full data lifecycle.

Data Governance · Distributed Systems · cloud computing
0 likes · 19 min read
Architects Research Society
Jan 11, 2021 · Fundamentals

Top Reasons Why MDM Implementations Fail

This article examines the common pitfalls that cause Master Data Management (MDM) projects to fail, including underestimating effort, insufficient resources, overly ambitious scope, lack of data governance, excessive rules, and inadequate executive support, offering practical insights for successful implementation.

Data Governance · Data Quality · Enterprise Data
0 likes · 12 min read
Youzan Coder
Dec 25, 2020 · Big Data

Metadata Governance and Collection in a Data Asset Platform

The platform implements comprehensive metadata governance by extracting, standardizing, and ingesting basic, trend, resource, lineage, and task metadata from offline and real‑time systems via a Kafka‑based SDK, enabling unified storage, monitoring, alerts, and future automation to improve data asset visibility and quality.

Big Data · Data Governance · SDK
0 likes · 18 min read
ITFLY8 Architecture Home
Dec 18, 2020 · Big Data

Unlocking the Data Middle Platform: From Ingestion to Real‑Time Analytics

This article provides a comprehensive overview of data middle platform concepts, covering data aggregation, collection tools, development modules, job scheduling, baseline control, heterogeneous storage, permission management, real‑time and offline processing, governance, services, and implementation details for building robust big‑data solutions.

Data Governance · Data Platform · ETL
0 likes · 25 min read
Alibaba Cloud Developer
Dec 7, 2020 · Big Data

How to Build a New‑Retail Data Middle Platform with DataWorks

This article explains how new‑retail companies can design and implement a data middle platform using Alibaba Cloud's DataWorks, covering business model analysis, technical architecture, layer‑by‑layer data modeling, governance, security, and the concrete benefits of turning raw data into actionable business insights.

Big Data Architecture · Data Governance · Data Middle Platform
0 likes · 28 min read
DataFunTalk
Nov 24, 2020 · Artificial Intelligence

Building Next‑Generation Data Intelligence Infrastructure with Knowledge Graphs: From New Infrastructure to Cognitive AI Platforms

This presentation explains how knowledge graphs serve as the foundation for new‑infrastructure initiatives, detailing the evolution of AI from perception to cognition, the role of big‑data centers, DIKW modeling, intelligent data governance, and the construction of a cognitive AI middle‑platform for industry applications.

AI Infrastructure · Big Data · Data Governance
0 likes · 18 min read
Beike Product & Technology
Nov 13, 2020 · Big Data

Beike One‑Stop Big Data Development Platform: Architecture, Evolution, and Future Outlook

The article summarizes Beike's one‑stop big data development platform, describing its data business background, the evolution from a simple Hadoop‑Kafka‑Hive stack to a metadata‑driven, asset‑oriented platform, and outlines current capabilities in data management, integration, scheduling, quality, openness, and future plans.

Big Data · Data Governance · Data Platform
0 likes · 11 min read
DataFunTalk
Oct 7, 2020 · Big Data

Yanxuan Data Warehouse: Architecture, Standards, and Evaluation Framework

This article outlines the Yanxuan data warehouse’s layered architecture, the offline and real‑time development platforms, the comprehensive standards for metric definition, model design, and SQL development, and proposes a six‑dimensional evaluation system covering data norms, security, quality, stability, continuous improvement, and development efficiency.

Big Data · Data Governance · SQL Standards
0 likes · 12 min read