Tagged articles
255 articles
Big Data Technology & Architecture
May 19, 2021 · Big Data

Comprehensive Guide to Data Governance: Metadata, Data Quality, Standards, and Asset Management

This article provides an extensive overview of data governance in the big‑data era, covering common pitfalls, the role of metadata, data quality management, data standardization, and data asset management, and offers practical recommendations for organizations to implement effective governance practices.

Big Data · Data Asset Management · Data Governance
0 likes · 42 min read
Big Data Technology & Architecture
May 11, 2021 · Big Data

Data Quality: Dimensions, Rules, and Constraints

The article explains the importance of data quality in the big data era, defines key quality dimensions such as completeness, uniqueness, validity, consistency, accuracy, timeliness, and credibility, and details how each dimension can be measured and enforced through specific constraints and validation rules.

Big Data · Consistency · Data Governance
0 likes · 9 min read
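
To make the idea of measurable quality dimensions concrete, here is a minimal sketch of three of them (completeness, uniqueness, validity) computed as simple ratios. The function names, the sample `orders` records, and the email pattern are illustrative assumptions, not rules taken from the article.

```python
import re

def _non_empty(rows, field):
    """Collect values of `field` that are neither missing nor empty strings."""
    return [r[field] for r in rows if r.get(field) not in (None, "")]

def completeness(rows, field):
    """Share of rows where `field` is present and non-empty."""
    if not rows:
        return 1.0
    return len(_non_empty(rows, field)) / len(rows)

def uniqueness(rows, field):
    """Share of non-empty values of `field` that are distinct."""
    values = _non_empty(rows, field)
    if not values:
        return 1.0
    return len(set(values)) / len(values)

def validity(rows, field, pattern):
    """Share of non-empty values of `field` that fully match `pattern`."""
    values = _non_empty(rows, field)
    if not values:
        return 1.0
    regex = re.compile(pattern)
    return sum(1 for v in values if regex.fullmatch(str(v))) / len(values)

# Hypothetical sample records, for illustration only.
orders = [
    {"order_id": "A1", "email": "a@example.com"},
    {"order_id": "A2", "email": ""},
    {"order_id": "A2", "email": "not-an-email"},
    {"order_id": "A3", "email": "c@example.com"},
]

EMAIL_PATTERN = r"[^@\s]+@[^@\s]+\.[^@\s]+"  # deliberately loose, assumed pattern

email_completeness = completeness(orders, "email")
order_id_uniqueness = uniqueness(orders, "order_id")
email_validity = validity(orders, "email", EMAIL_PATTERN)
```

A rule then becomes a threshold on one of these ratios, for example requiring `order_id_uniqueness == 1.0` before a table is published.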
Meituan Technology Team
Apr 15, 2021 · Big Data

Data Governance Practices at Meituan Hotel & Travel Platform

Meituan’s hotel‑travel platform tackled escalating data‑quality, cost, efficiency, and security issues by establishing a full‑link governance framework: standardized processes, a Data Management Committee, and unified “One Model, One Logic, One Service, One Portal” systems. The framework cut per‑unit costs by about 40%, raised engineer productivity by more than 60%, eliminated major security incidents, and set the stage for autonomous, AI‑driven data governance.

Big Data · Data Governance · Data Quality
0 likes · 32 min read
DataFunTalk
Apr 14, 2021 · Big Data

Beike's Data Development Platform: Evolution, Architecture, and Future Outlook

The talk by Beike senior engineer Yang Zongqiang details the evolution of the company's data development platform, covering background, three architecture upgrades, platform features such as metadata management, data integration, scheduling, quality assurance, and future directions for building an enterprise‑grade big‑data system.

Data Platform · Data Quality · metadata
0 likes · 21 min read
HelloTech
Mar 26, 2021 · Big Data

Data Quality and Interface Semantic Monitoring for Algorithm Testing Platform

The article describes how algorithm testing teams tackled data‑quality and interface‑semantic monitoring problems by building a unified business monitoring platform that checks table, storage and service consistency, validates response semantics, and, through dashboards, alerts and correction tools, quickly identified dozens of offline and online issues, guiding future reliability enhancements.

AI · Big Data · Data Quality
0 likes · 26 min read
DataFunTalk
Mar 6, 2021 · Big Data

Youzan Data Governance: Quality Assurance, Cost Management, and Operational Practices

This article explains Youzan's data governance framework, covering the definition of data governance, the company's asset‑centric approach, quantitative quality scoring, cost‑based pricing formulas, billing and allocation mechanisms, continuous operational improvements, and the measurable outcomes achieved.

Cost Optimization · Data Platform · Data Quality
0 likes · 17 min read
Yanxuan Tech Team
Feb 5, 2021 · Big Data

How NetEase Yanxuan Built a Robust Data Task Governance System in 2020

This article details NetEase Yanxuan's 2020 initiative to improve data task governance, describing the pain points identified, a three-phase (before, during, and after) framework covering model, baseline, and incident handling, and the resulting products, processes, and future plans for a more reliable data warehouse.

Baseline Management · Data Governance · Data Quality
0 likes · 27 min read
Architects Research Society
Jan 11, 2021 · Fundamentals

Top Reasons Why MDM Implementations Fail

This article examines the common pitfalls that cause Master Data Management (MDM) projects to fail, including underestimating effort, insufficient resources, overly ambitious scope, lack of data governance, excessive rules, and inadequate executive support, offering practical insights for successful implementation.

Data Governance · Data Quality · Enterprise Data
0 likes · 12 min read
DataFunTalk
Jan 9, 2021 · Big Data

Building a Traffic and Event‑Tracking System at NetEase Yanxuan: Tagging, Management, Attribution, and Quality Assurance

This article details how NetEase Yanxuan designed and implemented a comprehensive traffic system—including event‑tagging methods, a top‑down management framework, data‑quality controls, testing strategies, and attribution models—to turn fragmented user behavior into actionable e‑commerce insights.

Data Quality · e-commerce analytics · event tagging
0 likes · 18 min read
Xianyu Technology
Jan 8, 2021 · Mobile Development

Data Quality Assurance Solution for Mobile App Tracking Points

The document proposes a data‑quality assurance framework for mobile‑app tracking points that automatically collects client‑side data, generates validation rules from historical samples, and runs automated tests on over 100 critical points—cutting manual verification from half a day to minutes and using tools such as Frida and AOP to detect missing or altered tracking data.

Automated Testing · Data Quality · Technical Solution
0 likes · 7 min read
DataFunSummit
Nov 17, 2020 · Big Data

Sohu Intelligent Media Data Warehouse Architecture and Technical Practices

This article presents Sohu Intelligent Media's data warehouse construction practice, covering fundamental concepts, batch and real‑time processing, OLAP theory, multidimensional modeling, workflow management, data quality, metadata lineage, and security, with a focus on Apache Doris and a Lambda‑style architecture.

Apache Doris · Batch Processing · Data Quality
0 likes · 18 min read
NetEase Yanxuan Technology Product Team
Oct 23, 2020 · Industry Insights

How NetEase Yanxuan Built a Scalable Data Product System: Lessons & Practices

This article details NetEase Yanxuan's four‑stage journey—from establishing a business‑centric BI platform to ensuring data quality, empowering CXOs with mobile dashboards, and delivering scenario‑specific data products—highlighting the challenges faced, technical solutions implemented, and key takeaways for building enterprise data products.

BI platform · Data Product · Data Quality
0 likes · 18 min read
IT Architects Alliance
Sep 29, 2020 · Big Data

How Qualitis Ensures High‑Availability Data Quality Monitoring on Big Data Platforms

Qualitis is a big‑data‑platform‑based data‑quality‑management service that defines, detects, and reports data‑set quality issues, featuring idempotent backend services, load‑balanced high‑availability, Zookeeper‑coordinated process synchronization, thread‑pool throttling, and clearly separated internal and external APIs.

Architecture · Big Data · Data Quality
0 likes · 6 min read
JD Retail Technology
Sep 28, 2020 · Artificial Intelligence

Why AI Testing Is Still Painful and How to Solve It

The talk explores the current pain points of AI testing, outlines data‑quality analysis methods, highlights critical ETL and model‑testing considerations, and shares practical case studies and platform designs to improve machine‑learning quality assurance.

AI testing · Data Quality · ETL
0 likes · 5 min read
StarRing Big Data Open Lab
Aug 24, 2020 · Big Data

How to Master Data Quality Management in the Big Data Era

This article explores the concept of data quality, identifies ten common root causes, presents a comprehensive data quality management framework, outlines evaluation methods and key dimensions, and discusses future challenges and tools for improving data quality in large‑scale data environments.

Data Governance · Data Management · Data Quality
0 likes · 16 min read
Big Data Technology Architecture
Jun 29, 2020 · Big Data

Real‑time Data Warehouse Construction: Goals, Architecture, and Best Practices with Apache Flink

This article summarizes the objectives, design principles, application scenarios, layer‑by‑layer construction methods, quality assurance mechanisms, and supporting tools for building a real‑time data warehouse using Apache Flink, providing practical guidance for data engineers and architects.

Apache Flink · Data Quality · Flink
0 likes · 24 min read
Architects Research Society
Jun 16, 2020 · Information Security

Information Governance: Roles, Responsibilities, and Key Processes

Information governance is a program that ensures enterprise data accuracy, completeness, consistency, accessibility, and security by establishing business‑driven roles such as a data governance committee, data stewards, and data custodians, and by defining key responsibilities, processes, and metrics for data quality, privacy, and compliance.

Data Governance · Data Quality · Enterprise Data Management
0 likes · 11 min read
Architects Research Society
Jun 15, 2020 · Databases

Overview of Data Modeling, Architecture, Master Data Management, Metadata, and Data Quality

This article explains the concepts of data modeling and architecture, including logical data, process, and rule modeling, various data model types, master data management principles, metadata categories, and data quality management practices, highlighting their roles in enterprise information systems.

Data Quality · Master Data Management · data modeling
0 likes · 9 min read
TAL Education Technology
Jun 11, 2020 · Big Data

Data Quality Monitoring: Standards, Practices, and Technical Solutions

This article outlines the importance of data quality in the big‑data era, defines evaluation criteria such as integrity, accuracy, consistency and timeliness, describes daily monitoring and reconciliation processes, and proposes technical solutions and challenges for building a comprehensive data‑quality monitoring platform.

Data Governance · Data Quality · Operations
0 likes · 7 min read
58 Tech
Jun 10, 2020 · Big Data

Real‑time Data Warehouse Practices at 58 Tongcheng Bao: From Spark Streaming 1.0 to Flink‑based 2.0

This article details the evolution of 58 Tongcheng Bao's real‑time data warehouse, describing the initial Spark‑Streaming architecture, its limitations, and the redesign using Flink with a layered ODS‑DWD‑DWS‑APP model, data‑quality monitoring, join techniques, and the resulting improvements in latency and accuracy.

Big Data · Data Quality · Flink
0 likes · 9 min read
Big Data Technology & Architecture
May 24, 2020 · Big Data

Data Governance Core Areas and Practices for Banking

The article provides a comprehensive overview of banking data governance, covering core domains such as data models, metadata, standards, quality, lifecycle, distribution, exchange, security, and services, and explains how big‑data techniques can improve risk control, product innovation, and operational efficiency.

Banking · Data Quality · data security
0 likes · 16 min read
Alibaba Cloud Developer
Mar 18, 2020 · Big Data

How Alibaba Youku Guarantees Real‑Time Data Quality for Massive Video Search

Amid the pandemic‑driven surge in online video demand, Alibaba Youku built a comprehensive real‑time data quality assurance system—covering data content, consistency, correctness, availability, timeliness, performance testing, and automated intervention—to ensure that billions of video search results are delivered accurately and efficiently.

Data Quality · testing
0 likes · 15 min read
Full-Stack Internet Architecture
Mar 17, 2020 · Fundamentals

A Real‑Life Example of User Profiling to Boost Sales

This article uses a vivid kite‑selling story to illustrate how user profiling, data tagging, and recommendation tactics can be combined to increase transaction volume, improve average order value, and avoid common pitfalls such as unclear goals, poor data quality, and unvalidated tags.

Data Quality · Marketing Strategy · data analysis
0 likes · 9 min read
58 Tech
Feb 10, 2020 · Big Data

Construction and Practice of a Site-wide User Behavior Data Warehouse at 58.com

This article systematically describes the challenges, design principles, modeling methods, layered architecture, implementation steps, and standards used in building a comprehensive user behavior data warehouse for 58.com, highlighting practical experiences and future improvement directions.

Big Data · Data Quality · ETL
0 likes · 11 min read
58 Tech
Oct 21, 2019 · Big Data

Improving Information Exposure Measurement: Visible Ad Metrics and Data Processing Practices at 58 Platform

To address inaccuracies in traditional information exposure metrics, this article proposes adopting advertising visibility standards—defining visible exposure by pixel and time thresholds, implementing client-side logging, unique TID tracking, and ETL pipelines—to provide more reliable data for product strategy and user behavior analysis.

Big Data · Data Quality · ad visibility
0 likes · 8 min read
Youzan Coder
Aug 23, 2019 · Big Data

How to Build a Robust Event Logging Quality System with Real‑Time Validation

This article outlines common event‑logging quality problems, a systematic registration and real‑time validation framework built on Flink, configurable rule syntax, explainable results, continuous monitoring, targeted optimizations, and an evaluation model that together form a comprehensive quality‑center for big‑data platforms.

Big Data · Data Quality · Flink
0 likes · 11 min read
Alibaba Cloud Developer
Jul 30, 2019 · Big Data

How to Build a Systematic Data Quality Model for Big Data Testing

This article presents a comprehensive data quality model derived from ISO 9126, maps its characteristics to data testing, outlines practical testing methods and tool requirements, and demonstrates how to integrate quality checks into the data development lifecycle for reliable, efficient big‑data pipelines.

Data Quality · Data Reliability · ISO 9126
0 likes · 28 min read
Suning Technology
Jul 3, 2019 · Artificial Intelligence

Debunking Common AI Myths: What Every Business Should Know

This article dispels five widespread AI misconceptions—from believing AI works like the human brain to thinking it is bias‑free—while offering practical guidance on recognizing AI limits, improving data quality, managing risks, and applying AI responsibly across industries.

AI · Business strategy · Data Quality
0 likes · 13 min read
DataFunTalk
Jul 1, 2019 · Artificial Intelligence

Data-Driven Foundations for Building Recommendation Systems

The article explains how data serves as a critical asset for recommendation systems, outlining the necessary steps from understanding business problems and data dimensions to collection, cleaning, integration, and analysis, while distinguishing explicit and implicit user feedback and emphasizing data quality, timeliness, and relevance.

Data Quality · ETL · data collection
0 likes · 11 min read
Ctrip Technology
Apr 11, 2019 · Artificial Intelligence

An Overview of Anomaly Detection Methods and Their Applications

This article introduces the concept of anomaly detection, outlines common application scenarios such as ELT pipelines, feature engineering, A/B testing, and fraud detection, and reviews various detection methods—including statistical models, machine learning, rule‑based logic, and density‑based techniques—while discussing practical implementation considerations.

Data Quality · Time Series · anomaly detection
0 likes · 12 min read
AntTech
Feb 27, 2019 · Big Data

Ant Financial Data Governance: Practices and Challenges in Data Quality Management

The article details Ant Financial’s comprehensive data quality governance framework, covering its architecture, challenges, implementation strategies, and real‑world case studies, illustrating how the company integrates data monitoring, AI‑driven self‑healing, and rigorous release controls to ensure high‑quality data across its platform.

Ant Financial · Big Data · Data Governance
0 likes · 17 min read
Beike Product & Technology
Dec 6, 2018 · Artificial Intelligence

Real Estate Rental Platform: True Listing Model and Credit System Construction

This presentation details how Beike Rental leverages big data and machine‑learning techniques to detect non‑authentic listings, build a four‑criterion true‑listing model, develop pricing and image‑analysis models, and design a merchant credit scoring system that improves service quality and market efficiency.

Credit Scoring · Data Quality · Image Analysis
0 likes · 27 min read
Efficient Ops
Aug 15, 2018 · Operations

Why Most CMDB Projects Fail and How Huawei Made It Work

This article analyzes the common reasons CMDB initiatives collapse, shares Huawei's three‑phase journey from inception to value creation, and distills practical lessons on data consumption, accuracy, automation, visualization, and organizational execution for successful configuration management.

CMDB · Configuration Management · Data Quality
0 likes · 27 min read
Architects' Tech Alliance
May 19, 2018 · Industry Insights

Which Five Emerging Tech Trends Will Redefine Enterprises in 2018?

Accenture's 2018 Technology Vision outlines five pivotal trends—Citizen AI, Extended Reality, True Data, Large‑Scale Collaboration, and the Intelligent Internet—explaining how they reshape business models, customer relationships, and the underlying infrastructure needed for future growth.

Artificial Intelligence · Data Quality · Enterprise Collaboration
0 likes · 6 min read
dbaplus Community
Apr 3, 2018 · Big Data

How Meituan Built DataMan: A Scalable Data Quality Monitoring Platform for Big Data

This article details Meituan's DataMan platform, describing the background of data quality challenges, the eight-step PDCA-driven solution, architectural design, technical stack, monitoring standards, and the resulting improvements in data governance and operational efficiency across their massive data warehouse ecosystem.

Big Data · Data Governance · Data Quality
0 likes · 20 min read
Meituan Technology Team
Mar 22, 2018 · Big Data

DataMan: A Data Quality Governance Platform for Meituan's Big Data Ecosystem

Meituan’s DataMan platform provides a unified, closed‑loop data‑quality governance solution that collects demand, refines rules, executes monitoring across offline and real‑time jobs, tracks issues, and builds a knowledge base, improving completeness, accuracy, consistency, and timeliness while optimizing storage, reducing fault resolution time, and supporting data‑driven decisions.

Data Governance · Data Quality · data-warehouse
0 likes · 17 min read
Liulishuo Tech Team
Oct 22, 2017 · Big Data

Data-CI: A SQL-Based Data Unit Testing Framework for ETL

The article introduces data-ci, a SQL‑driven unit testing framework that lets engineers write, organize, and automate data validation tests for ETL pipelines, providing assertions, failure callbacks, coverage reporting, and CI integration to improve data quality and reliability.

Big Data · Data Quality · ETL
0 likes · 9 min read
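
The general idea of SQL-driven data tests can be sketched as follows. This is not data-ci's actual API: the `run_sql_tests` helper, the `daily_orders` table, and the convention that a test passes when its query returns no rows are all assumptions chosen for illustration, using Python's standard `sqlite3` module.

```python
import sqlite3

def run_sql_tests(conn, tests):
    """Run each named query; a test passes when it returns no rows.

    Returns a dict mapping each failing test name to its offending rows.
    """
    failures = {}
    for name, query in tests.items():
        bad_rows = conn.execute(query).fetchall()
        if bad_rows:
            failures[name] = bad_rows
    return failures

# Hypothetical ETL output table with one duplicate key and one bad amount.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE daily_orders (order_id TEXT, amount REAL);
    INSERT INTO daily_orders VALUES ('A1', 10.0), ('A2', -5.0), ('A2', 7.5);
""")

# Each test selects the rows that violate an expectation.
tests = {
    "order_id_is_unique":
        "SELECT order_id FROM daily_orders GROUP BY order_id HAVING COUNT(*) > 1",
    "amount_is_non_negative":
        "SELECT order_id, amount FROM daily_orders WHERE amount < 0",
    "order_id_is_not_null":
        "SELECT * FROM daily_orders WHERE order_id IS NULL",
}

failures = run_sql_tests(conn, tests)
for name, rows in failures.items():
    print(f"FAILED {name}: {rows}")
```

Writing each test as a query for violating rows keeps the failure report useful: the rows returned are exactly the records that need attention, and a CI job can fail the build whenever `failures` is non-empty.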
ITPUB
Sep 30, 2017 · Big Data

Designing Scalable Open‑Source ETL Systems: Lessons from Baidu Waimai

This talk details Baidu Waimai's end‑to‑end ETL design, covering demand sources, data flow patterns, multi‑stage system evolution, storage choices, scheduling architecture, configuration‑driven processing, quality monitoring, and how data lineage enables transparent, self‑service data delivery.

Big Data · Data Quality · ETL
0 likes · 25 min read
JD Retail Technology
Sep 13, 2017 · Artificial Intelligence

Machine Learning Applications for Product Data Quality and Knowledge Graph Construction at JD.com

At the 2nd China Big Data International Summit 2017, JD’s chief architect presented how machine‑learning techniques are applied across e‑commerce to improve product data quality, ensure compliance, resolve image‑text mismatches, automate category identification, restructure titles, and build a multi‑dimensional product knowledge graph.

Artificial Intelligence · Data Quality · knowledge graph
0 likes · 9 min read
Baidu Waimai Technology Team
Mar 21, 2017 · Backend Development

Automated Testing Framework for Baidu Waimai User Profiling Using Asynchronous Coroutines

This article describes how Baidu Waimai’s user‑profile offline data system was equipped with a highly automated, coroutine‑based testing framework that dramatically improves field‑value accuracy verification, test coverage, and execution efficiency across strategy, ES, and API layers.

Automated Testing · Backend · Baidu Waimai
0 likes · 9 min read
Baidu Intelligent Testing
Jul 28, 2016 · Operations

Ensuring Store Data Quality in O2O Products: Processes and Rules

This article outlines the importance of store data in O2O products and presents a comprehensive workflow—including single‑attribute rules, multi‑attribute cross‑validation, and auxiliary checks—to detect and remediate low‑quality or erroneous store information, thereby improving user experience.

Data Quality · O2O · Operations
0 likes · 8 min read
Baidu Intelligent Testing
Jul 5, 2016 · Operations

O2O Data Quality Assurance Process for Online Movie Seat Selection

The article outlines a comprehensive O2O data quality assurance workflow for online movie seat selection, detailing background challenges, a three‑stage process, evaluation metrics, and a concrete case study that demonstrates how real‑time data monitoring and issue handling improve user experience.

Data Quality · O2O · Operations
0 likes · 6 min read