Tagged articles
184 articles
Architects Research Society
May 24, 2022 · Big Data

Understanding Data Fabric: Key Pillars for Data & Analytics Leaders

The article explains the emerging concept of Data Fabric (data weaving), its design principles, how it integrates metadata, knowledge graphs, and AI/ML to automate data integration across hybrid and multi‑cloud environments, and outlines four essential pillars that leaders must master to deliver business value.

AI/ML · Data Fabric · Knowledge Graph
0 likes · 8 min read
Bilibili Tech
May 24, 2022 · Big Data

Metadata Infrastructure and Governance in Bilibili Data Platform

Bilibili’s data platform consolidates scattered metadata into a unified URN‑based model stored across TiDB, Elasticsearch, and HugeGraph, offering batch‑pull and embedded collection, flexible SQL‑like queries, comprehensive lineage mapping, and powering data‑map, lineage‑map, and impact‑analysis tools while planning expanded quality assurance and self‑service dictionaries.

Data Governance · Data Lineage · Data Platform
0 likes · 21 min read
Liangxu Linux
Apr 28, 2022 · Fundamentals

What Is an Inode? Understanding Linux File Metadata and Links

This article explains the concept of inodes in Unix/Linux filesystems, detailing their structure, stored metadata, size calculations, inode numbers, directory handling, hard and soft links, and the special behaviors that arise from separating file names from inode identifiers.

Filesystem · Hard Link · inode
0 likes · 9 min read
Python Programming Learning Circle
Apr 8, 2022 · Fundamentals

A Comprehensive Guide to Python Decorators and Aspect-Oriented Programming (AOP)

This article explains the concept of Aspect‑Oriented Programming (AOP) and demonstrates how Python decorators—both function‑based and class‑based—can be used to implement AOP features such as pre‑ and post‑execution logic, handling arguments, preserving metadata with functools.wraps, and stacking multiple decorators.

Aspect Oriented Programming · Python · aop
0 likes · 18 min read
dbaplus Community
Dec 22, 2021 · Fundamentals

How Xiaomi Built a Scalable Metadata Platform for Data Governance

This article details Xiaomi's end‑to‑end metadata platform, covering its three‑layer architecture, the evolution of full‑domain metadata, real‑time lineage, precise measurement, and how these capabilities enable data map, governance, cost control, and quality improvements for future business empowerment.

Data Governance · Data Quality · Xiaomi
0 likes · 20 min read
Top Architect
Dec 13, 2021 · Big Data

Design and Implementation of BanYu's Big Data Access Control System

This article describes the evolution from an unsecured data warehouse to a comprehensive big‑data access control system at BanYu, detailing the background, data access methods, design goals, authentication and authorization mechanisms, policy configuration, integration with Metabase, and the overall workflow that balances security with efficiency.

Big Data · Hive · LDAP
0 likes · 15 min read
DataFunTalk
Dec 9, 2021 · Big Data

Mobile Cloud LakeHouse: Cloud‑Native Big Data Analytics Architecture and Practices

This article introduces the cloud‑native LakeHouse solution from China Mobile Cloud, covering its lake‑warehouse integration concept, overall architecture, core functions such as storage‑compute separation, one‑click data ingestion, intelligent metadata discovery, serverless execution, JDBC support, incremental updates, and typical application scenarios in public and private clouds.

Big Data · Cloud Native · Data Integration
0 likes · 17 min read
DataFunTalk
Nov 27, 2021 · Big Data

iQIYI Data Middle Platform: Architecture, Data Governance Practices, and Future Plans

The article details iQIYI’s data middle platform architecture and its comprehensive data governance practices, covering platform overview, data flow, unified standards, metadata management, production quality assurance, and future AI‑driven enhancements, illustrating how centralized data services improve reliability, efficiency, and security.

Big Data · Data Governance · Data Quality
0 likes · 27 min read
Big Data Technology & Architecture
Nov 10, 2021 · Fundamentals

What Is Data Governance and How to Implement It: Concepts, Goals, Methodology, Tools, and Case Studies

This article explains data governance—from its definition and why it’s needed, to its goals, core components, PDCA‑based methodology, essential tools, and real‑world implementations at Meituan and Ant Financial—providing a comprehensive guide for organizations seeking to manage and leverage their data assets effectively.

data security · metadata
0 likes · 25 min read
Aikesheng Open Source Community
Oct 14, 2021 · Databases

Understanding reload @@config_all vs reload @@metadata in Dble and How to Resolve Metadata Issues

This article explains the differences between the Dble commands reload @@config_all and reload @@metadata, analyzes why metadata errors occur after table changes without configuration updates, and provides step‑by‑step guidance to synchronize metadata correctly using these commands and check full @@metadata.

metadata · reload
0 likes · 7 min read
Architects' Tech Alliance
Sep 11, 2021 · Big Data

Understanding Data Warehouses: Definitions, Differences, Architecture, Modeling, and Best Practices

This article explains what a data warehouse is, contrasts it with traditional databases, outlines how to design and build a warehouse—including model selection, subject‑area definition, bus matrix, layering, and data quality—while also covering related concepts such as data middle platforms, data lakes, metadata, and modeling techniques.

Big Data · Data Quality · Data Warehouse
0 likes · 16 min read
Ctrip Technology
Sep 9, 2021 · Big Data

Building Data Lineage at Ctrip: Architecture, Implementation, and Real‑World Applications

This article describes how Ctrip built a data lineage system for its big data platform, covering the concept of data lineage, collection methods, open‑source tools such as Apache Atlas and DataHub, the in‑house table‑level and field‑level solutions, implementation details for Hive, Spark and Presto, storage in JanusGraph, and practical applications in data governance, metadata management, scheduling and sensitivity labeling.

Big Data · Hive · JanusGraph
0 likes · 16 min read
IT Architects Alliance
Sep 1, 2021 · Big Data

Understanding Data Middle Platform Architecture and Its Core Components

The article explains the concept of a data middle platform, describing its architecture, the essential big‑data foundation, metadata management, data service components such as BI and tag systems, and how these layers together enable unified data access, governance, and business intelligence across enterprises.

Business Intelligence · Data Architecture · Tag Management
0 likes · 14 min read
Efficient Ops
Aug 31, 2021 · Cloud Computing

Why Object Storage Is the New Backbone of Cloud Data Management

This article explains how object storage emerged as a cloud-native solution that surpasses traditional DAS, SAN, and NAS architectures by offering virtually unlimited capacity, robust metadata handling, and simple RESTful APIs for modern applications and large‑scale data workloads.

Data Architecture · Scalability · cloud storage
0 likes · 11 min read
Alibaba Cloud Developer
Aug 23, 2021 · Databases

How MySQL 8.0’s Data Dictionary Eliminates Metadata Redundancy and Boosts Performance

MySQL 8.0 replaces duplicated server‑level and engine‑level metadata with a unified data dictionary stored in InnoDB, introduces a two‑level cache (local and shared) built on templated hash maps, and provides atomic DDL operations, dramatically improving metadata consistency, performance, and management simplicity.

DDL · cache architecture · data dictionary
0 likes · 21 min read
IT Architects Alliance
Aug 9, 2021 · Big Data

Data Warehouse Architecture Overview: Layers, Sources, Modeling, Storage, and Management

This article explains the logical layered architecture of modern data warehouses, covering data sources, ODS, DW/DWS layers, collection, storage on HDFS, synchronization tools, dimensional modeling (star, snowflake, constellation), metadata management, and task scheduling and monitoring, highlighting best practices for scalable big‑data solutions.

Data Warehouse · ETL · metadata
0 likes · 12 min read
Top Architect
Aug 3, 2021 · Fundamentals

Design and Considerations of Distributed File Systems

This article provides a comprehensive overview of distributed file systems, covering their historical evolution, essential requirements such as POSIX compliance, persistence, scalability, and security, and comparing centralized (e.g., GFS) and decentralized (e.g., Ceph) architectures along with strategies for high availability, performance optimization, and data consistency.

Consistency · Distributed File System · Scalability
0 likes · 19 min read
Tencent Cloud Middleware
Jun 28, 2021 · Big Data

Getting Started with Kafka’s New KRaft Mode: A Step‑by‑Step Guide

This article introduces Apache Kafka’s KRaft (Kafka Raft) mode, explains its architectural differences from ZooKeeper‑based deployments, details essential configuration parameters, and provides a complete step‑by‑step procedure—including commands and utility tools—to set up and operate a KRaft cluster.

Configuration · Deployment · Distributed Systems
0 likes · 14 min read
ITFLY8 Architecture Home
Jun 1, 2021 · Fundamentals

How Huawei Built a Comprehensive Data Governance Framework for Digital Transformation

Huawei’s 2017 digital‑transformation vision led to a five‑step data‑governance blueprint that evolved through two phases, defining a detailed data‑classification framework, structured and unstructured data management methods, metadata governance, and compliance‑driven external data handling to support enterprise‑wide intelligent operations.

Data Governance · data classification · metadata
0 likes · 20 min read
58 Tech
May 19, 2021 · Mobile Development

Hooking Swift Functions by Modifying the Virtual Table (VTable)

This article explains a novel Swift hooking technique that modifies the virtual function table (VTable) to replace method implementations, detailing Swift's runtime structures such as TypeContext, Metadata, OverrideTable, and providing concrete ARM64 assembly and Swift code examples.

Hooking · Runtime · Swift
0 likes · 19 min read
Big Data Technology & Architecture
May 19, 2021 · Big Data

Comprehensive Guide to Data Governance: Metadata, Data Quality, Standards, and Asset Management

This article provides an extensive overview of data governance in the big‑data era, covering common pitfalls, the role of metadata, data quality management, data standardization, and data asset management, and offers practical recommendations for organizations to implement effective governance practices.

Big Data · Data Asset Management · Data Governance
0 likes · 42 min read
Meituan Technology Team
May 6, 2021 · Backend Development

GraphQL‑Based BFF Architecture with Metadata‑Driven Data Aggregation

The article describes a backend‑for‑frontend architecture that pushes GraphQL into the BFF layer, separates fetch and display units, drives execution with metadata, unifies query models, applies caching and parallel processing optimizations, and demonstrates over 50 % logic reuse and doubled development efficiency in production.

BFF · Backend Development · GraphQL
0 likes · 35 min read
DataFunTalk
Apr 30, 2021 · Cloud Native

JuiceFS: A Cloud‑Native Distributed File System for Big Data and AI Workloads

This article presents JuiceFS, an open‑source cloud‑native distributed file system that addresses the limitations of object storage for big‑data and AI workloads by providing strong consistency, high‑performance metadata, multi‑protocol support, small‑file management, and deep Kubernetes integration.

Cloud Native · Distributed File System · artificial intelligence
0 likes · 13 min read
Big Data Technology Architecture
Apr 19, 2021 · Big Data

Reframing Apache Hudi as a Data Lake Platform: Vision, Capabilities, and Future Directions

Apache Hudi is being re‑positioned from a simple table format to a full‑featured data lake platform, offering transactional storage, MVCC concurrency, metadata services, Deltastreamer ingestion, and plans for cache and timeline metadata services, aligning its vision with modern lakehouse architectures.

Apache Hudi · Transactional Storage · metadata
0 likes · 5 min read
DataFunTalk
Apr 14, 2021 · Big Data

Beike's Data Development Platform: Evolution, Architecture, and Future Outlook

The talk by Beike senior engineer Yang Zongqiang details the evolution of the company's data development platform, covering background, three architecture upgrades, platform features such as metadata management, data integration, scheduling, quality assurance, and future directions for building an enterprise‑grade big‑data system.

Data Platform · Data Quality · metadata
0 likes · 21 min read
Big Data Technology Architecture
Apr 5, 2021 · Big Data

Understanding Apache Iceberg: Table Format Architecture, Comparison with Hive Metastore, and Business Benefits

This article introduces Apache Iceberg as an open table format for massive analytic datasets, explains its underlying concepts such as schema, partitioning, statistics, and read/write APIs, compares it with Hive Metastore, outlines its ACID commit process, highlights the performance and operational advantages for big‑data workloads, and previews upcoming community features.

ACID · Apache Iceberg · Parquet
0 likes · 19 min read
21CTO
Feb 14, 2021 · Cloud Computing

How Metadata‑Driven Multi‑Tenant Architecture Powers Scalable SaaS Platforms

This article explains how a metadata‑driven multi‑tenant data model decouples logical and physical schemas, enabling rapid SaaS product rollout, seamless scaling, fine‑grained customization, and zero‑downtime schema changes while ensuring data isolation, security, and high performance across millions of tenants.

SaaS · metadata · multi-tenant
0 likes · 45 min read
DataFunTalk
Feb 2, 2021 · Big Data

Metadata Management: Concepts, Architecture, and Applications in Data Warehousing

This article explains the fundamentals and value of metadata, describes a comprehensive metadata management system and its layered architecture, outlines key technologies such as automatic SQL metadata extraction, and showcases practical applications like metadata query, impact analysis, data lineage, and business‑driven data needs within modern data warehouses.

Data Lineage · Data Warehouse · SQL parsing
0 likes · 17 min read
DataFunTalk
Jan 29, 2021 · Artificial Intelligence

Content Embedding Practices and Challenges at Hulu

This article presents Hulu's multi‑layered approach to content understanding and embedding, describing tag‑based graph embeddings, metadata‑BERT enhancements, multimodal video/audio feature aggregation, and various applications such as similarity search, ranking, cold‑start retrieval, and collection modeling, while also discussing current limitations and open research questions.

Hulu · Recommendation Systems · content embedding
0 likes · 12 min read
360 Smart Cloud
Jan 28, 2021 · Big Data

Overview of the Qirin Big Data Platform: Architecture, Modules, and Capabilities

The article provides a comprehensive overview of the Qirin big‑data platform, detailing its architecture, core modules such as resource management, metadata, data ingestion, task development, interactive query, and self‑service analysis, and outlines future development plans for the system.

Data Platform · Resource Management · data ingestion
0 likes · 12 min read
DataFunTalk
Jan 21, 2021 · Big Data

Kuaishou Metadata Platform: Evolution, Architecture, and Application Scenarios

This article introduces the development history, current architecture, abstraction methods, and key application scenarios of Kuaishou's metadata platform, highlighting challenges such as heterogeneous data integration, large-scale asset management, and the platform's role in data search, lineage, governance, and future enhancements.

Data Lineage · Kuaishou · Search
0 likes · 16 min read
360 Tech Engineering
Jan 7, 2021 · Big Data

Overview of the Qirin Big Data Platform Architecture and Core Modules

The article introduces the Qirin big data platform—a one‑stop solution covering resource management, metadata, data ingestion, task development, interactive querying, and self‑service analysis—detailing its modular architecture, typical processing workflow, and future development plans for enterprise‑wide data services.

Big Data · Data Platform · Resource Management
0 likes · 11 min read
Youzan Coder
Dec 25, 2020 · Big Data

Metadata Governance and Collection in a Data Asset Platform

The platform implements comprehensive metadata governance by extracting, standardizing, and ingesting basic, trend, resource, lineage, and task metadata from offline and real‑time systems via a Kafka‑based SDK, enabling unified storage, monitoring, alerts, and future automation to improve data asset visibility and quality.

Big Data · Data Governance · SDK
0 likes · 18 min read
DataFunTalk
Dec 19, 2020 · Big Data

Evolution of iQIYI Data Warehouse from 1.0 to 2.0: Architecture, Modeling Practices, and Future Directions

This article details iQIYI's transition from a fragmented Data Warehouse 1.0 to a unified, standardized Data Warehouse 2.0, covering layered architecture, dimension and metric design, modeling workflows, metadata management, data lineage, and upcoming intelligent and automated data platform initiatives.

Data Lineage · Data Warehouse · data modeling
0 likes · 25 min read
Sohu Tech Products
Dec 2, 2020 · Big Data

Optimizing Hive SQL Lineage Parsing: Techniques, Implementation, and Practical Insights

This article presents a comprehensive overview of Hive SQL lineage parsing, detailing the challenges of data provenance in large‑scale data warehouses, introducing ANTLR‑based parsing techniques, and describing a series of optimizations—including AST pruning, CTE handling, UDF registration, and metadata service integration—to improve both table‑level and column‑level lineage extraction and visualization.

ANTLR · Data Warehouse · Hive
0 likes · 18 min read
MaGe Linux Operations
Dec 1, 2020 · Fundamentals

Why Journaling Keeps File Systems Safe: Write-Ahead Logging Explained

File systems risk data corruption during power loss or crashes because writes are not atomic, so journaling—recording intended operations in a write‑ahead log before committing them—ensures metadata and user data consistency, with variations like data journaling and ordered (metadata) journaling improving performance and reliability.

Write-Ahead Logging · data integrity · journaling
0 likes · 6 min read
Big Data Technology Architecture
Nov 21, 2020 · Big Data

Multi-Engine Support and Future Directions of Alibaba Cloud Data Lake Building Service

The article explains how Alibaba Cloud's Data Lake Building Service enables fine‑grained lake management by integrating multiple compute engines—including EMR, MaxCompute, Blink, Hologres, PAI, and open‑source Hive, Spark, and Presto—through unified metadata and OSS storage, while outlining current features, special format support, and planned future enhancements.

Alibaba Cloud · EMR · OSS
0 likes · 9 min read
iQIYI Technical Product Team
Nov 13, 2020 · Big Data

Evolution of iQIYI Data Warehouse from 1.0 to 2.0: Architecture, Modeling, Metadata, and Data Lineage

The talk chronicles iQIYI’s shift from a fragmented five‑layer Data Warehouse 1.0 to a unified 2.0 architecture featuring a central Dimension Layer, business‑focused data marts, and subject‑oriented warehouses, while detailing platform services, rigorous metadata management, lineage tracking, and future goals of intelligent, automated, service‑oriented, model‑driven data governance.

Data Lineage · data modeling · iQIYI
0 likes · 23 min read
Efficient Ops
Nov 4, 2020 · Fundamentals

How Journal File Systems Prevent Data Corruption After Crashes

Journal file systems use write‑ahead logging to record each write operation as a transaction, ensuring that after power loss or crashes the system can replay logs and maintain metadata and user‑data consistency, avoiding corruption and space waste through techniques like data, ordered, and metadata journaling.

Data Consistency · Write-Ahead Logging · file system
0 likes · 8 min read
JavaEdge
Sep 15, 2020 · Backend Development

How Kafka Uses ZooKeeper for Metadata Management and Client Coordination

This article explains how Kafka relies on ZooKeeper to store cluster metadata, detailing the ZK node hierarchy, the process by which clients locate brokers, the broker‑side handling of metadata requests, and recommended practices for large‑scale deployments.

Backend Development · Kafka · metadata
0 likes · 8 min read
Youku Technology
Aug 17, 2020 · Backend Development

Improving Development Efficiency with a Metadata Center: Architecture, Implementation, and Performance

The Metadata Center, built by Alibaba Entertainment, streamlines development by offering a searchable Data Source Plaza and a configurable Custom Interface engine that abstracts service calls, adds unified monitoring and circuit‑breaker safeguards, and leverages optimized scripting, cutting upfront coding effort and accelerating feature delivery across Youku applications.

Groovy · Service Integration · circuit breaker
0 likes · 12 min read
Efficient Ops
Aug 5, 2020 · Cloud Computing

Why Object Storage Is the Next Big Thing in Cloud Computing

This article explains the fundamentals of object storage, compares it with block and file storage, outlines its architecture, components, advantages, use cases, and limitations, showing why it has become the dominant storage model in modern cloud environments.

Data Architecture · Scalability · cloud storage
0 likes · 11 min read
Big Data Technology & Architecture
Jul 12, 2020 · Big Data

Common Metadata Management Patterns in Storage Systems

This article explains why metadata management is crucial for storage systems and reviews four typical approaches—initial external‑DB storage, in‑memory loading, partitioned services with a proxy layer, and tiered caching/persistence—illustrated with diagrams and real‑world examples.

Storage Systems · metadata · tiered architecture
0 likes · 5 min read
Big Data Technology & Architecture
Jul 11, 2020 · Big Data

Alluxio Tiered Metadata Management and Asynchronous Cache Eviction Implementation

The article explains Alluxio's tiered metadata management architecture, describing how the system separates hot and cold metadata into cached and persisted layers, and details the custom asynchronous eviction thread and cache implementation that replace Guava cache for efficient large‑scale metadata handling.

Alluxio · Cache · distributed storage
0 likes · 15 min read
Architects Research Society
Jun 15, 2020 · Databases

Overview of Data Modeling, Architecture, Master Data Management, Metadata, and Data Quality

This article explains the concepts of data modeling and architecture, including logical data, process, and rule modeling, various data model types, master data management principles, metadata categories, and data quality management practices, highlighting their roles in enterprise information systems.

Data Quality · Master Data Management · data modeling
0 likes · 9 min read
Big Data Technology & Architecture
May 24, 2020 · Big Data

Data Governance Core Areas and Practices for Banking

The article provides a comprehensive overview of banking data governance, covering core domains such as data models, metadata, standards, quality, lifecycle, distribution, exchange, security, and services, and explains how big‑data techniques can improve risk control, product innovation, and operational efficiency.

Banking · Data Quality · data security
0 likes · 16 min read
Youzan Coder
Mar 18, 2020 · Big Data

The Evolution of Youzan’s Data Warehouse in a Big Data Environment

The article traces Youzan’s data warehouse from its chaotic early days lacking structure, through a 2016 Airflow‑driven construction phase that introduced layered ODS/DW/Data Mart architecture and naming standards, to a mature stage focused on efficiency, security, SparkSQL, dimensional modeling, metadata, and ongoing real‑time and governance challenges.

Airflow · Big Data · Data Governance
0 likes · 20 min read
Meituan Technology Team
Mar 12, 2020 · Big Data

Data Governance Practices in Meituan Delivery: Architecture, Standards, and Security

Meituan Delivery’s data‑governance framework combines a four‑layer warehouse architecture with comprehensive business, technical, security, and resource‑management standards, continuous metadata and security controls, and tools such as Wherehows and QuickSight, delivering standardized, secure, and easily shareable data while guiding future optimization and emerging‑technology adoption.

Big Data · Data Architecture · Data Governance
0 likes · 27 min read
ITPUB
Jan 10, 2020 · Fundamentals

Understanding Inodes: How Unix/Linux Stores File Metadata

This article explains Unix/Linux inodes—the metadata structures that store file information—covering their purpose, contents, size considerations, inode numbers, directory handling, hard and soft links, and special inode-related operations, with practical command examples and visual illustrations.

Hard Link · Linux · Unix
0 likes · 10 min read
vivo Internet Technology
Dec 18, 2019 · Big Data

Comprehensive Overview of Big Data Architecture, Lambda/Kappa Models, and End-to-End Data Platform Design

The article surveys modern big‑data architecture, contrasting Lambda and Kappa models, highlights common governance and integration pain points, and proposes an end‑to‑end platform featuring unified metadata, stream‑batch processing, one‑click ingestion, standardized modeling, intelligent query abstraction, and a comprehensive development IDE.

Big Data · Data Platform · ETL
0 likes · 13 min read
Programmer DD
Nov 7, 2019 · Backend Development

Master Spring Boot Configuration Processor to Generate Accurate Metadata

This tutorial explains how to use Spring Boot's Configuration Processor to generate JSON metadata for configuration properties, covering dependency setup, Java bean definitions, property files, tests, and how the resulting metadata improves IDE auto‑completion and documentation.

Configuration Processor · ConfigurationProperties · Java
0 likes · 10 min read
DevOps Cloud Academy
Aug 11, 2019 · Big Data

Overview of MFS Distributed File System Architecture Similar to GoogleFS

The article explains the MFS distributed file system, detailing its four components—Master, Metalogger, Chunkserver, and Client—along with hardware recommendations, metadata handling, replication strategies, and FUSE‑based client mounting, providing a comprehensive guide to building a GoogleFS‑like storage cluster.

Big Data · Distributed File System · MFS
0 likes · 5 min read
21CTO
Jul 2, 2019 · Backend Development

Designing a Scalable Feed Stream System for Billions of Users

This article explains how to design a high‑performance feed‑stream architecture—including product definition, data modeling, storage choices, synchronization modes, metadata handling, commenting, likes, sorting, search, and deletion—so that a system can support tens of millions to billions of users while remaining reliable and scalable.

Scalability · Search · Synchronization
0 likes · 21 min read
Tencent Database Technology
Mar 12, 2019 · Databases

Understanding MySQL 8.0 Data Dictionary, Atomic DDL, and Persistent Autoincrement

This article explains the evolution of MySQL's data dictionary from pre‑8.0 scattered metadata to the unified InnoDB dictionary tables in MySQL 8.0, covering storage structures, dictionary caching, information_schema changes, serialized dictionary information (SDI), atomic DDL mechanisms, persistent autoincrement handling, upgrade considerations, and provides practical code examples.

Atomic DDL · InnoDB · SQL
0 likes · 22 min read
Efficient Ops
Mar 3, 2019 · Fundamentals

How Journal File Systems Prevent Data Loss After Crashes

Journaling file systems protect against data corruption caused by power loss or crashes by first recording each write operation as a transaction in a dedicated log, then applying the changes to the file system only after the log entry is safely stored, so the log can be replayed after a crash to restore consistency.

Data Consistency · Write-Ahead Logging · file system
0 likes · 6 min read
Java Backend Technology
Feb 27, 2019 · Backend Development

Revamping Dubbo Service Governance: Inside the New Dubbo Admin 0.1

Dubbo Admin 0.1, a freshly refactored standalone project, replaces the old Webx backend with Spring Boot, adopts Vue and Vuetify for the UI, integrates Swagger, and introduces updated configuration, tag routing, application‑level service governance, and metadata‑driven testing to fully support Dubbo 2.7 features.

Configuration · Dubbo · Spring Boot
0 likes · 8 min read
Architects' Tech Alliance
Oct 15, 2018 · Databases

Data Sharding in Distributed Systems: Partitioning Strategies, Metadata Management, and Consistency Mechanisms

The article explains how distributed storage systems solve the fundamental problems of data sharding and redundancy by describing three sharding methods (hash, consistent‑hash, and range‑based), the criteria for choosing a shard key, the role of metadata servers, and consistency techniques such as leasing, all illustrated with concrete examples and code snippets.

Distributed Systems · consistent hashing · databases
0 likes · 25 min read
dbaplus Community
Sep 27, 2018 · Databases

How We Built an Automated DBA Platform: Architecture, Design, and Lessons

This article outlines the journey of a financial services company from manual DBA tasks through tool‑assisted operations to a fully automated platform, detailing the platform’s technical stack, functional modules, metadata design principles, evolving SQL audit workflow, and future directions for intelligent database operations.

Automation · DBA · SQL Auditing
0 likes · 17 min read
Youzan Coder
Sep 15, 2018 · Big Data

How Data Empowers Operations: Insights from Youzan & NetEase’s Big Data Summit

On September 15, Youzan’s big-data team and NetEase YouShu hosted a technical sharing session titled “The Road to Data-Driven Operations,” where speakers covered the evolution of Youzan’s data warehouse metadata system, the architecture of its big-data development platform, and the application of functional programming in visual data analysis, highlighting current trends and future directions.

Data Warehouse · Data visualization · Operations
0 likes · 4 min read
ITPUB
Sep 30, 2017 · Big Data

Designing Scalable Open‑Source ETL Systems: Lessons from Baidu Waimai

This talk details Baidu Waimai's end‑to‑end ETL design, covering demand sources, data flow patterns, multi‑stage system evolution, storage choices, scheduling architecture, configuration‑driven processing, quality monitoring, and how data lineage enables transparent, self‑service data delivery.

Big Data · Data Quality · Data Warehouse
0 likes · 25 min read
ITPUB
Sep 29, 2017 · Big Data

Designing an Open ETL System: Baidu Waimai’s Scalable Data Pipeline Practices

In this talk, a Baidu Waimai engineer explains the motivations, requirements, and architectural choices behind their open‑source ETL platform, covering data flow patterns, logical mappings, storage options, scheduling, metadata management, and quality monitoring to achieve scalable, transparent, and explainable data delivery.

Big Data · ETL · Scheduling
0 likes · 26 min read
21CTO
Aug 4, 2017 · Artificial Intelligence

AI Behind Hulu's Video Recommendations: From Collaborative Filtering to Neural Nets

In this talk, Hulu’s research director Zhou Hanning explains the key factors influencing recommendation system performance, describes optimization goals, explores collaborative filtering, matrix factorization, and neural‑network approaches—including metadata‑driven transfer learning and cold‑start solutions for live streaming—and shares practical AI implementations that improve user experience and engagement.

AI · Recommendation Systems · Video Streaming
0 likes · 10 min read
Architect
Apr 19, 2017 · Fundamentals

Analysis of Ceph Bluestore Storage Engine Architecture

This article examines Ceph's BlueStore storage engine, describing its raw‑device management, metadata structures, write and read I/O processing, allocation strategies, and clone handling, highlighting how it reduces write amplification and optimizes for SSDs compared to the older FileStore.

BlueStore · Ceph · I/O Architecture
0 likes · 9 min read
Architecture Digest
Apr 15, 2016 · Operations

Google's Midas Package Manager (MPM): Architecture, Build Process, and Security Features

The article explains Google’s internal Midas Package Manager (MPM), detailing its build definition files, immutable and mutable metadata stored in Bigtable, distributed replication via Colossus, client‑side pull and P2P copying, as well as its access‑control, encryption, and signing mechanisms that enable massive, conflict‑free software deployment at Google’s scale.

Google · MPM · metadata
0 likes · 11 min read
dbaplus Community
Dec 16, 2015 · Databases

Understanding Oracle ASM Metadata: Files, AU, Disk Headers, and Recovery

This article explains Oracle ASM metadata concepts, including ASM file types, allocation units, physical and virtual metadata structures, disk header composition, and practical methods for querying, validating, backing up, and restoring ASM metadata using tools like KFOD, KFED, and X$KFFXP.

ASM · Allocation Unit · Backup
0 likes · 14 min read

Challenges and Strategies for Metadata Protection in Cloud Object Storage

This article examines the problems and trade‑offs of metadata protection in cloud object storage, highlighting the importance of reliability, availability, and consistency, and comparing primary‑secondary and multi‑replica models, including the WRN algorithm and practical operational considerations.

Consistency · Distributed Systems · Operations
0 likes · 28 min read