Tagged articles
212 articles
DataFunTalk
May 6, 2026 · Big Data

How Xiaohongshu Evolved Its Data Architecture for the Big AI Data Era

The article details Xiaohongshu's four‑stage data‑platform evolution—from a simple ClickHouse ad‑hoc setup to a Lambda‑based 2.0 design and finally a lakehouse‑driven 3.0 architecture—highlighting the adoption of general incremental compute, cost‑reduction to one‑third, performance gains of up to ten‑fold, and the SPOT standards that guide the new system.

Big Data · Data Architecture · Flink
0 likes · 21 min read
DataFunTalk
Apr 29, 2026 · Big Data

How Xiaohongshu Revamped Its Data Architecture for the Big AI Data Era

Xiaohongshu transformed its data platform from a simple ClickHouse‑based analytics stack to a unified lakehouse with generic incremental compute, cutting architecture complexity, resource cost, and development effort each to roughly one‑third while supporting petabyte‑scale, sub‑second queries across its 350 million‑user app.

Big Data · Data Architecture · Flink
0 likes · 22 min read
DataFunTalk
Apr 22, 2026 · Industry Insights

How Xiaohongshu Cut Data Platform Costs by Two‑Thirds with Incremental Computing

This article details Xiaohongshu's journey from a ClickHouse‑based batch analytics stack to a unified lakehouse architecture powered by generic incremental computing, showing how the company reduced architecture complexity, resource consumption and development effort each to roughly one‑third while supporting trillions of daily events with sub‑10‑second query latency.

Big Data · Data Architecture · Lakehouse
0 likes · 24 min read
JD Tech
Apr 16, 2026 · Industry Insights

How JD Revolutionized Coupon Search with a Stream‑Batch Unified Architecture

This article analyzes JD's end‑to‑end upgrade of its retail coupon search infrastructure, detailing the business drivers, data‑skew challenges, the shift from dual KV and batch pipelines to a unified stream‑batch model built on Apache Doris, and the resulting performance, resource and stability gains across multiple scenarios.

Apache Doris · Batch Processing · Coupon Search
0 likes · 12 min read
DataFunTalk
Apr 16, 2026 · Big Data

How Xiaohongshu Cut Data Architecture Costs by Two‑Thirds with Incremental Computing

This article details Xiaohongshu's data platform evolution from a simple ClickHouse‑based ad‑hoc system to a Lambda‑style architecture and finally a lakehouse solution, highlighting how the adoption of a new incremental computing model reduced architectural complexity, resource consumption and development effort each to roughly one‑third while delivering sub‑second query performance on petabyte‑scale data.

Big Data · Data Architecture · Lakehouse
0 likes · 21 min read
DataFunTalk
Apr 10, 2026 · Big Data

How Xiaohongshu Cut Data Architecture Costs by Two‑Thirds with Incremental Computing

This article analyzes Xiaohongshu's data platform evolution—from a simple ClickHouse‑based analytics layer to a Lambda architecture and finally a lakehouse design—highlighting how adopting a new incremental computing model reduced architecture complexity, resource consumption, and development effort each to roughly one‑third while delivering sub‑second query performance on petabyte‑scale data.

Big Data · Data Architecture · Lakehouse
0 likes · 22 min read
AI Info Trend
Apr 8, 2026 · Artificial Intelligence

Why Strong Data Foundations Are Crucial for Scaling Agentic AI

A McKinsey report reveals that while two‑thirds of enterprises have tried agentic AI, less than 10% achieve scalable value, and robust, modern data architectures—built on seven concrete principles and a four‑step implementation plan—are the decisive factor.

AI scaling · Agentic AI · Data Architecture
0 likes · 7 min read
AI Large-Model Wave and Transformation Guide
Apr 7, 2026 · Artificial Intelligence

How a Four‑Layer Pyramid Powers AI‑Driven Autonomous Ship Navigation

The article dissects the four‑layer data architecture that transforms raw maritime charts into structured knowledge, enables large models to reason about navigation scenarios, and combines algorithmic route screening with LLM‑based refinement to deliver safe, compliant, and efficient autonomous ship routing.

AI · Data Architecture · Maritime
0 likes · 12 min read
DataFunSummit
Mar 27, 2026 · Industry Insights

Why Traditional Data Platforms Fail and How Ontology Delivers Triple‑Digit ROI

The article examines costly data platform failures, such as a $40 million payroll system collapse and the Healthcare.gov outage, highlighting why traditional data middle platforms become data swamps, then explains how Palantir’s ontology approach, with its three‑layer semantic, dynamics, and decision architecture, can turn data into actionable insights and achieve triple‑digit ROI.

Data Architecture · Data Platform · Ontology
0 likes · 4 min read
DataFunSummit
Mar 26, 2026 · Industry Insights

Why Traditional Data Platforms Fail and How Ontology Drives Triple‑Digit ROI

The article analyzes costly data‑platform failures—such as a $40 million school‑district payroll system and a collapsed Healthcare.gov launch—identifies the root cause as ineffective data middle platforms, and explains how Palantir’s ontology‑based three‑layer architecture (semantic, dynamics, decision) transforms raw data into automated business actions, delivering measurable ROI across multiple industries.

Data Architecture · Data Platform · Decision automation
0 likes · 5 min read
vivo Internet Technology
Mar 25, 2026 · Industry Insights

How Vivo Scaled Marketing Automation with Presto, Bitmap, and StarRocks

This case study details how Vivo’s marketing automation platform evolved its data‑driven architecture—from a Presto‑based wide‑table design, through a Bitmap optimization, to a StarRocks migration—addressing performance bottlenecks, reducing resource costs, and enhancing data security.

Big Data · Bitmap · Data Architecture
0 likes · 11 min read
DataFunSummit
Mar 19, 2026 · Industry Insights

Why Traditional Data Platforms Fail and How Ontology Delivers Triple‑Digit ROI

The article examines why costly traditional data middle platforms often become data swamps, contrasts them with Palantir's ontology‑based approach that acts like a navigation system, and outlines a three‑layer architecture that turns data into automated business actions, delivering multi‑hundred‑percent ROI.

Business Intelligence · Data Architecture · Digital Twin
0 likes · 4 min read
StarRocks
Mar 5, 2026 · Big Data

How Fanatics Scaled to PB‑Level Data with StarRocks & Apache Iceberg Lakehouse

Fanatics unified its fragmented data stack by building a StarRocks‑powered Lakehouse on Apache Iceberg, replacing Redshift, Snowflake, Athena, and Druid, which cut costs by up to 95%, delivered sub‑second dashboard queries on petabyte‑scale data, and enabled real‑time and historical analytics on a single platform.

Apache Iceberg · Data Architecture · Fanatics
0 likes · 10 min read
StarRocks
Jan 22, 2026 · Big Data

How Paimon + StarRocks Accelerates Double‑11 OLAP Queries by 80% Refresh Speed

This article explains how Taotian Group unified real‑time and offline data using Paimon as lake storage and StarRocks for high‑performance OLAP, eliminating costly sync pipelines, cutting refresh time by about 80%, saving nearly ten million yuan annually, and detailing the architecture, cluster safeguards, configuration tweaks, monitoring, and future roadmap for large‑scale promotional events.

Big Data · Data Architecture · OLAP
0 likes · 24 min read
iQIYI Technical Product Team
Jan 8, 2026 · Big Data

How iQIYI Cut Stream Data Costs by 70%: From Private‑Cloud Kafka to AutoMQ

This article details iQIYI's evolution from a tightly coupled private‑cloud Kafka setup to a cloud‑native AutoMQ architecture, describing the challenges of scaling, the development of the Stream platform and Stream‑SDK, the migration to hybrid and public‑cloud Kafka, and the resulting cost and elasticity improvements.

AutoMQ · Data Architecture · Kafka
0 likes · 12 min read
dbaplus Community
Nov 13, 2025 · Databases

Why Modern DBAs Must Evolve into Data‑Business Architects (DBA²)

The article explores the challenges faced by traditional DBAs, redefines the DBA role as a blend of data, business, administration and architecture, presents multiple real‑world cases, and shows how AI and a DBA² mindset can dramatically improve efficiency and career prospects.

AI integration · Business strategy · Career Development
0 likes · 19 min read
DataFunSummit
Nov 10, 2025 · Big Data

How Xiaohongshu Cut Data Architecture Costs by One‑Third with Incremental Computing

This article explains how Xiaohongshu, a lifestyle community with over 350 million monthly users, transformed its data platform from a traditional Lambda architecture to a next‑generation incremental computing model, reducing architectural complexity, resource consumption and development effort each by roughly two‑thirds while supporting massive real‑time and offline data demands.

AI · Big Data · Data Architecture
0 likes · 6 min read
Instant Consumer Technology Team
Oct 29, 2025 · Big Data

Revolutionizing Feature Engineering with Distributed Tech & Configurable Services

Facing PB‑scale user behavior data and millions of feature dimensions, the platform transformed its search, advertising, and recommendation pipelines by adopting a distributed, configurable‑service architecture that delivers high‑throughput streaming, elastic storage, rapid feature iteration, and robust fault‑tolerance for AI‑driven personalization.

Big Data · Data Architecture · Distributed Systems
0 likes · 17 min read
DataFunTalk
Oct 18, 2025 · Big Data

Inside Ant Group’s Big Data Governance: Key Practices and Insights

This article shares Ant Group’s practical experience in large-scale data governance, outlining four main topics—overall governance overview, data quality management, data storage-processing governance, and future considerations—while emphasizing the five critical aspects of architecture, security, compliance, quality, and value that drive effective big-data operations.

Data Architecture · Data Governance · Data Quality
0 likes · 4 min read
DataFunSummit
Oct 11, 2025 · Big Data

What Small Banks Can Learn from Cutting-Edge Data Governance Practices

This article shares a data‑governance roadmap for small and medium banks, covering industry pain points, high‑quality data sets, a three‑step governance path, data standards, metadata management, master‑data strategy, business data modeling, a hybrid Greenplum‑Hadoop platform, quality monitoring, and a maturity assessment framework.

Banking · Big Data · Data Architecture
0 likes · 21 min read
DataFunTalk
Oct 6, 2025 · Big Data

What Ant Group Learned: 5 Pillars of Effective Data Governance

Ant Group shares its practical experience in big data governance, outlining five key focus areas—architecture, security, compliance, quality, and value—through four structured sections and detailed discussions on data quality and storage governance, while also exploring future challenges and the economics of data.

Ant Group · Big Data · Data Architecture
0 likes · 4 min read
IT Architects Alliance
Sep 21, 2025 · Big Data

From Data Warehouses to Lakehouses: Why Data Architecture Keeps Evolving

This article traces the three‑generation evolution of data architecture—from the structured‑data era of data warehouses, through the flexible, multi‑format data lake, to the unified lakehouse model—explaining the drivers, benefits, challenges, and future trends shaping modern data platforms.

Data Architecture · Data Lake · Lakehouse
0 likes · 11 min read
Efficient Ops
Sep 18, 2025 · Artificial Intelligence

How ICBC Revolutionized Credit‑Card Risk Management with AI‑Driven Data Architecture

ICBC’s Software Development Center built an AI‑powered, multi‑layer data platform and decision engine that enables real‑time, precise risk monitoring and automated response for credit‑card operations, dramatically improving detection speed, coverage, and warning quality while supporting a full‑process intelligent risk‑control loop.

AI risk management · Data Architecture · Digital Transformation
0 likes · 11 min read
Big Data Tech Team
Sep 17, 2025 · Big Data

How to Build a Scalable Tag System for Recommendation Engines

This article explains why a robust tag system is essential for recommendation and mining strategies, outlines the hierarchy of entity, concept, and theme tags, and provides practical principles, architecture, and step‑by‑step methods for constructing and managing tags in large‑scale data platforms.

Big Data · Data Architecture · data labeling
0 likes · 14 min read
DataFunTalk
Aug 28, 2025 · Big Data

How JD Retail Tackles Data Governance Challenges to Boost Efficiency

JD Retail faces growing data volume, redundant models, and resource‑intensive storage, prompting a comprehensive data‑governance strategy that defines standards, streamlines architecture, isolates development, and optimizes compute and storage costs, ultimately enabling more efficient, secure, and agile data operations across the enterprise.

Big Data · Data Architecture · Data Governance
0 likes · 8 min read
DataFunTalk
Aug 27, 2025 · Big Data

How JD Retail Overcomes Data Governance Challenges to Boost Efficiency

JD Retail confronts growing data volume, redundant models, shared account risks, and rising storage costs, and responds with a comprehensive data governance framework that standardizes data, streamlines architecture, isolates development, and optimizes resources to achieve efficient, secure, and cost‑effective data operations.

Big Data · Data Architecture · Data Governance
0 likes · 8 min read
StarRocks
Aug 19, 2025 · Big Data

How Joydata Scaled to 150 Billion Daily Events with StarRocks: A Data Architecture Journey

Facing daily data growth from millions to 150 billion records, Joydata‑U transformed its analytics platform through three architectural stages—Hadoop, Hadoop + Trino, and finally StarRocks—introducing resource isolation, Flat JSON acceleration, and Bitmap indexing to cut query latency by up to seven times and achieve sub‑2‑minute data freshness across BI, ad‑tech, game analytics, and CRM workloads.

Bitmap Index · Data Architecture · Flat JSON
0 likes · 12 min read
DataFunSummit
Jul 20, 2025 · Big Data

How Beike Scaled to 600 PB: The Evolution of a Data‑Fusion Architecture

This article details Beike's data‑fusion architecture evolution, covering industry trends, multi‑stage Hadoop upgrades, storage cost optimization with erasure coding, remote shuffle integration, GPU‑centric training stability, and future hybrid‑cloud strategies, while also sharing organizational and operational lessons learned.

AI · Data Architecture · Hadoop
0 likes · 16 min read
DataFunTalk
Jul 13, 2025 · Big Data

Unlock Real-Time Multidimensional Insights with Cloud Lakehouse Technology

This guide presents a series of expert case studies and insights on how Cloud Lakehouse solutions enable real‑time, fully managed multidimensional data analysis, improve user data experiences, balance performance and cost, and power large‑scale IoT and big‑data platforms across industries.

Data Architecture · IoT · Lakehouse
0 likes · 2 min read
DataFunTalk
Jul 9, 2025 · Big Data

How Lakehouse Is Transforming Real‑Time Multi‑Dimensional Analytics

This article compiles a series of expert case studies and insights on real‑time intelligent fully‑managed Lakehouse technology, illustrating how companies such as SalesEasy, Chang’an Auto, Kuaishou, Tencent, and JD.com leverage lakehouse architectures to achieve advanced multi‑dimensional analytics, cost‑performance balance, and effective data governance in the digital economy.

Case Studies · Data Architecture · Data Governance
0 likes · 2 min read
ITFLY8 Architecture Home
Jun 13, 2025 · Artificial Intelligence

Designing AI-Ready Data Architecture: Key Features and Future Trends

AI-era data architecture must handle massive, multimodal datasets with real-time processing, prioritize data quality over quantity, support scalability, provenance, and native ML/AI integration, while addressing governance, security, and ethical challenges through emerging technologies like data fabric, mesh, and federated learning.

AI · Big Data · Data Architecture
0 likes · 6 min read
Data Thinking Notes
Dec 29, 2024 · Information Security

A Complete Blueprint for Enterprise Digital Transformation Architecture

This article presents a comprehensive visual guide to enterprise digital transformation, covering overall digital planning, application system architecture, data architecture, information security architecture, and digital organization and governance, illustrating each layer with detailed diagrams to aid strategic implementation.

Data Architecture · Digital Transformation · Information Security
0 likes · 5 min read
DataFunSummit
Dec 29, 2024 · Big Data

Ant Group Data Architecture Practice and Upgrade Strategy

This article shares Ant Group's practical experience and evolution of data architecture, covering theoretical foundations, analysis of current challenges, proposed upgrade solutions based on domain‑driven design, and future outlook through complex network theory to improve scalability, governance, and resilience of large‑scale data systems.

Ant Group · Data Architecture · Domain-Driven Design
0 likes · 19 min read
DataFunSummit
Dec 21, 2024 · Big Data

Big Data Implementation Practices and Architecture in a Foreign Bank

This article shares a foreign bank's big data implementation journey, covering background and goals, overall planning and architecture, practical insights, phased rollout, data governance, security, and Q&A, illustrating how a unified data platform, storage‑compute separation, and AI‑driven tools drive business innovation.

AI · Banking · Data Architecture
0 likes · 19 min read
58 Tech
Dec 19, 2024 · Big Data

Architecture Evolution and Implementation of the Intelligent Acceleration Engine in the 58 Big Data Platform

The article details the background, architectural analysis, multi‑tenant redesign, engine selection enhancements, compatibility adaptations, stability fixes, containerized deployment, performance optimizations, and measurable business outcomes of the Intelligent Acceleration Engine upgrade using Apache Kyuubi and StarRocks within the 58 big data platform.

Apache Kyuubi · Big Data · Containerization
0 likes · 12 min read
AntData
Nov 18, 2024 · Databases

Modern Data Paradigms: From Relational Databases to Vector Retrieval and AI

This article surveys the evolution of modern data technologies—from the 4V characteristics of big data and the limitations of traditional relational databases, through the rise of NoSQL and polyglot persistence, to embedding‑driven vector search, hybrid retrieval and RAG, illustrating how each paradigm frees applications from data constraints.

Artificial Intelligence · Big Data · Data Architecture
0 likes · 30 min read
Bilibili Tech
Oct 25, 2024 · Big Data

DataFunSummit2024: Next-Generation Data Architecture Technology Summit

DataFunSummit2024, co-hosted by Bilibili, convenes industry experts, scholars, and enterprise leaders across six forums to discuss next‑generation data architecture, showcasing Bilibili’s Iceberg‑based stream‑batch innovations, AI‑BI analytics, NoETL practices, and emerging alternatives to Lambda architecture.

AI+BI · Big Data · Data Architecture
0 likes · 3 min read
Big Data Technology & Architecture
Oct 22, 2024 · Big Data

Key Frameworks and Characteristics of Lakehouse Architecture: A Ground‑Level Perspective

This article reviews the emerging lakehouse architecture, outlines its core frameworks such as Hudi, Iceberg, Paimon, Flink, and Doris, discusses their storage‑compute separation, read‑write optimizations, and highlights how companies of different sizes adopt these technologies based on cost, efficiency, and specific business scenarios.

Data Architecture · Flink · Lakehouse
0 likes · 6 min read
DataFunTalk
Oct 3, 2024 · Big Data

Data Lake Technology Maturity Curve: Architecture, Design Principles, Core Functions, and Open‑Source Solutions

Amid growing data demands, this article explains the data lake technology maturity curve, detailing lake‑warehouse architectural patterns, design principles, core functionalities, and the four leading open‑source solutions (Hudi, Iceberg, Delta Lake, Paimon) to guide enterprises in building flexible, scalable, and governed data platforms.

Big Data · Data Architecture · Data Lake
0 likes · 10 min read
JD Tech
Sep 28, 2024 · Big Data

From Early Coding to Big Data Architecture: A Personal Journey Through Data Platforms, Cloud Migration, and System Design

The article chronicles the author’s 30‑year programming career, detailing early experiences, the evolution from JavaScript projects to large‑scale big‑data architectures, cloud migration, business‑agnostic framework design, interactive analytics, and reflections on becoming an independent software architect.

Big Data · Data Architecture · career journey
0 likes · 24 min read
DataFunSummit
Sep 7, 2024 · Big Data

Observations on the Third Evolution of Data Infrastructure and the Next‑Generation Data Platform Architecture

This article reviews the current state of data platforms, analyzes the third wave of data infrastructure evolution driven by databases, big data and generative AI, proposes next‑generation lakehouse and cloud‑native architectural directions, and outlines future trends and unresolved challenges for AI‑centric data platforms.

Cloud Native · Data Architecture
0 likes · 21 min read
Data Thinking Notes
Jul 29, 2024 · Big Data

What Is a Data Middle Platform and How Does It Transform Enterprise Data Management?

This article explains the concept, design principles, and core components of a data middle platform, detailing its overall, functional, layered, logical, and data architectures, as well as the specific platforms for data collection, processing, organization, governance, quality, sharing, and visualization, illustrated with diagrams.

Big Data · Data Architecture · Data Governance
0 likes · 27 min read
Data Thinking Notes
Jul 22, 2024 · Fundamentals

Why Data Architecture Governance Is the Key to Successful Digital Transformation

Data architecture governance, encompassing standards, security, modeling, quality, and lifecycle management, is essential for digital transformation in fast‑growing industries like express delivery, and this article outlines current challenges, traditional approaches, and a practical, phased methodology with platform support to implement effective governance.

Data Architecture · Data Governance · Digital Transformation
0 likes · 12 min read
DataFunTalk
Jul 17, 2024 · Databases

From DIKW to Distributed Data Warebase: Letting Data Emerge as Intelligence

This article explores the DIKW hierarchy, explains how data evolves into information, knowledge, and wisdom, examines traditional data models and products, critiques existing multi‑system architectures, and proposes a new distributed Data Warebase that unifies structured, semi‑structured, and vectorized knowledge to enable intelligent data-driven applications.

DIKW · Data Architecture · Data Systems
0 likes · 24 min read
DataFunSummit
Jul 2, 2024 · Cloud Computing

Global Perspective on Multi-Cloud Data Architecture

The forum presents a series of technical talks on multi‑cloud data architecture, covering Xiaomi’s lake‑warehouse practice, cross‑border e‑commerce data platforms, Alluxio‑based machine‑learning acceleration, Qichacha’s cost‑effective data solutions, and Kuaishou’s Flink on Kubernetes migration, highlighting strategies, implementations, and audience benefits.

Big Data · Data Architecture · Data Platform
0 likes · 8 min read
DevOps
Jun 20, 2024 · Fundamentals

Understanding the 4A Enterprise Architecture: Business, Technology, Application, and Data Architecture

This article explains the TOGAF framework and its 4A architecture model—Business, Technology, Application, and Data Architecture—detailing each domain's definition, purpose, value, and step‑by‑step guidance for creating enterprise architecture diagrams to align strategy, processes, and technology.

Data Architecture · TOGAF · Technology Architecture
0 likes · 8 min read
Code Ape Tech Column
May 28, 2024 · Databases

Query Separation: A Practical Approach to Optimizing Large Table Reads

This article explains the concept of query separation, outlines its suitable scenarios, compares implementation methods such as synchronous, asynchronous, and binlog approaches, discusses storage system choices like MongoDB, HBase, and Elasticsearch, and addresses consistency and operational challenges when decoupling read workloads from write workloads.

Data Architecture · Elasticsearch · MQ
0 likes · 8 min read
DataFunTalk
May 1, 2024 · Artificial Intelligence

A Comprehensive Guide to Vector Database Architecture and Application Scenarios

This article provides a detailed overview of vector database structures, their evolution, enterprise challenges, functional features, future trends, and key use cases, illustrating how they serve as the memory engine for large AI models and support multimodal data processing in modern data architectures.

AI · Data Architecture · Enterprise
0 likes · 11 min read
DataFunTalk
Mar 29, 2024 · Big Data

Best Practices for Building an International Ride‑Hailing Data Metric System at Didi

This article outlines Didi’s best‑practice approach to constructing a global ride‑hailing data metric system, covering business scenarios, metric categories, pain points such as definition and technical challenges, and a comprehensive solution involving organizational structure, processes, model design, tooling, timezone handling, and governance.

Data Architecture · Data Governance · Didi
0 likes · 13 min read
Sohu Tech Products
Mar 13, 2024 · Databases

DingoDB Multi-Modal Vector Database: Design Philosophy, Architecture and Applications

DingoDB is a multi‑modal vector database that unifies storage and analysis of structured, semi‑structured and unstructured data through a Raft‑based distributed architecture, offering MySQL‑compatible SQL, high‑performance APIs, automatic sharding, real‑time index optimization, and hybrid scalar‑vector queries for enterprise knowledge bases, LLM memory, and real‑time decision‑making.

Data Architecture · DingoDB · LLM applications
0 likes · 11 min read
DataFunSummit
Feb 19, 2024 · Big Data

Yipay Data Warehouse Construction and Data Governance Practices

This presentation by senior data warehouse engineer Huang Luo details Yipay's end‑to‑end data warehouse build, covering background challenges, governance framework, platform development, layered architecture, naming standards, monitoring, and future plans, offering practical insights for data engineers, architects, and business stakeholders.

Big Data · Data Architecture · Data Quality
0 likes · 14 min read
vivo Internet Technology
Jan 24, 2024 · Big Data

Evolution of Vivo's Trillions-Scale Data Architecture: Dual-Active Real-Time and Offline Computing

Vivo’s trillion‑scale data platform evolved into a dual‑active real‑time and offline architecture that leverages multi‑datacenter clusters, Kafka/Pulsar caching, a unified sorting layer, HBase‑backed dimension tables, and micro‑batch Spark jobs to deliver low‑cost, high‑performance processing, 99.9% availability, and 99.9995% data‑integrity.

Data Architecture · HBase · Offline Computing
0 likes · 16 min read
Architects Research Society
Jan 15, 2024 · Fundamentals

Evolution of Software Architecture Styles and Domains

This article outlines the evolution of software architecture styles, describing various architectural domains and sub‑domains—from web and mobile applications to integration, data, and analytics architectures—and their typical implementations, illustrated with a detailed classification table.

Analytics · Data Architecture · Domain Architecture
0 likes · 8 min read
Alibaba Cloud Big Data AI Platform
Jan 10, 2024 · Big Data

CaoCao Mobility's Real‑Time Data Warehouse: Hologres + Flink

This article details how CaoCao Mobility transformed its ride‑hailing platform by replacing a traditional Lambda architecture with an enterprise‑grade real‑time data warehouse built on Hologres and Flink, covering business motivations, architectural design, component capabilities, performance optimizations, operational safeguards, and future roadmap.

Data Architecture · Flink · Hologres
0 likes · 19 min read
Data Thinking Notes
Jan 9, 2024 · Fundamentals

How to Build an Effective Enterprise Data Governance Framework

Designing a data governance system involves defining data architecture, standards, and master data, documenting mechanisms, aligning them with actual business and analytical data processes, forming joint teams for specialized governance activities, and institutionalizing these practices through formal processes to sustain continuous data capability across the enterprise.

Data Architecture · Data Governance · Enterprise Data
0 likes · 3 min read
Architects Research Society
Jan 2, 2024 · Big Data

Understanding Data Lakes: Concepts, Benefits, Challenges, and Comparison with Data Warehouses

This article explains what a data lake is, its origins, key characteristics such as collecting all data, enabling diverse user access, and flexible processing, compares it with traditional data warehouses, discusses cost advantages, potential pitfalls like data swamps, and outlines best‑practice considerations for enterprise adoption.

Analytics · Data Architecture · Data Lake
0 likes · 10 min read
DataFunTalk
Dec 18, 2023 · Big Data

Unified Data Architecture: Balancing Freshness, Cost, and Performance with Incremental Computing

The article explains why unified data architecture is essential to avoid duplication and inefficiency, discusses differing performance trade‑offs among batch, streaming, and interactive analytics, introduces an incremental computation model that unifies these modes, and invites readers to a Dec 19, 2023 technical sharing event.

Batch Processing · Big Data · Data Architecture
0 likes · 3 min read
Architects Research Society
Nov 26, 2023 · Big Data

Data Lake vs Data Warehouse: Key Differences and How to Choose

Data lakes and data warehouses serve different purposes in big‑data architectures; this article explains their definitions, core attributes, five major distinctions—including data retention, type support, user coverage, adaptability, and insight speed—and offers guidance on selecting or combining the two approaches.

Analytics · Data Architecture · Data Lake
0 likes · 12 min read
DataFunSummit
Oct 22, 2023 · Big Data

How Kuaishou E‑commerce Leverages OLAP and a Unified Data Architecture to Solve Business Data Challenges

This article explains how Kuaishou's e‑commerce team built a unified OLAP‑based data platform—covering data ingestion, consistent dimensional and fact layers, metric management, and real‑time services—to address rapid growth, metric inconsistency, and operational inefficiencies across multiple business scenarios.

Big Data · Data Architecture · E‑commerce
0 likes · 20 min read
DataFunSummit
Oct 18, 2023 · Big Data

Kuaishou Data Lake Construction with Apache Hudi: Architecture, Challenges, and Solutions

This article explains why Kuaishou built a data lake, outlines the shortcomings of its previous Lambda architecture, describes the adoption of Apache Hudi for unified batch‑stream processing, and details the five major technical challenges and the corresponding solutions implemented to improve performance, consistency, and operational reliability.

Apache Hudi · Big Data · Data Architecture
0 likes · 17 min read
DataFunSummit
Aug 5, 2023 · Big Data

Manbang Group's Real-Time Computing, Data Architecture, and Product Practices

Manbang Group shares its practical experiences and insights on real-time computing, multi‑cloud platform architecture, data warehousing with Flink and Holo, real‑time decision and feature platforms, and future plans for scaling these systems to support logistics and recommendation algorithms.

Cloud Native · Data Architecture · Flink
0 likes · 16 min read
Alibaba Cloud Developer
Jul 31, 2023 · Big Data

From BI to Kappa: How Data Architecture Evolved in the Big Data Era

This article traces the evolution of data architecture from early BI systems through traditional big‑data stacks, streaming, Lambda and Kappa designs, and explains how a unified stream‑batch model simplifies development while keeping logic consistent across data‑analysis and pipeline applications.

BI systems · Big Data · Data Architecture
0 likes · 16 min read
Big Data Technology & Architecture
Jul 8, 2023 · Big Data

The Pros and Cons of the Middle‑Platform Model in Large Enterprises: A Data Department Perspective

This article examines the middle‑platform concept in large companies, especially data departments, outlining its benefits such as resource centralisation, reduced duplication and experience sharing, while also highlighting drawbacks like slow response, excessive authority, and the need to bring platform capabilities closer to business units.

Data Architecture · Enterprise · middle platform
0 likes · 7 min read
Big Data Technology Architecture
Jun 20, 2023 · Big Data

Data Platform Evolution and the Future of Snowflake in China: Insights from Industry Leaders

The panel discusses the three‑stage evolution of data platforms, compares US and Chinese market dynamics, evaluates Snowflake’s success factors, and outlines the criteria and opportunities for a China‑specific Snowflake‑like solution, while also sharing investment perspectives on data‑driven startups.

Data Architecture · Data Platforms · Market Trends
0 likes · 24 min read
Data Thinking Notes
Jun 18, 2023 · Big Data

Data Lake vs Data Warehouse: Uncover the Real Differences

This article explores the evolving concept of data lakes, compares them with traditional data warehouses across storage, modeling, tooling, and user roles, and examines the emerging lake‑warehouse integration, highlighting why both remain essential in modern big‑data architectures.

Big Data · Data Architecture · Data Lake
0 likes · 12 min read
Top Architect
May 4, 2023 · Big Data

Data Middle Platform: General Architecture and Core Components

The article explains the concept, benefits, and detailed modular architecture of a data middle platform, covering data storage, acquisition, processing, governance, security, and operation frameworks, and illustrates how enterprises can build and evolve such platforms to turn data into valuable services.

Big Data · Data Architecture · Data Governance
0 likes · 19 min read
DataFunTalk
Apr 21, 2023 · Fundamentals

Data Architecture and Data Modeling Overview, Solutions, and Enterprise Case Studies

This article explains data architecture and data modeling fundamentals, presents DAMA DMBOK concepts, outlines four practical solutions for model design, standard management, automated change control, and business mapping, and shares an enterprise manufacturing case study with Q&A on governance and efficiency.

Data Architecture · Enterprise Data · data modeling
0 likes · 21 min read
DataFunTalk
Apr 6, 2023 · Product Management

Designing and Implementing a Channel Data Product for Growth

This article explains why channel data products are needed, outlines the stages of growth business, describes how to design and implement a channel data product—including its architecture, database schema, and operational workflow—and concludes with a practical summary and Q&A.

Business Intelligence · Data Analytics · Data Architecture
0 likes · 11 min read
DataFunTalk
Mar 15, 2023 · Big Data

Evolution of Next‑Generation Cloud Data Platform Architecture

This technical presentation reviews the historical development of big data platforms, outlines the four generations of cloud data platform architectures, details the modern cloud‑native stack—including unified metadata, scheduling, and integration systems—and showcases a real‑world industrial manufacturing case with a Q&A session.

Cloud Data Platform · Data Architecture · Scheduling
0 likes · 23 min read
DataFunSummit
Mar 9, 2023 · Big Data

Designing Efficient and Agile Real-Time Big Data Analytics Platforms for Enterprises

The article explains how enterprises can build a comprehensive big data analytics platform—covering data collection, storage, computation, and decision layers—by clarifying business scenarios, choosing appropriate on‑premise or cloud deployment, selecting suitable architectures such as Lambda/Kappa, and addressing component choices and emerging technical trends.

Big Data · Data Architecture · Real-time analytics
0 likes · 9 min read
macrozheng
Feb 28, 2023 · Big Data

How Tencent Music Scaled Its Content Data Platform with Apache Doris: From ClickHouse to 4.0 Architecture

This article details the evolution of Tencent Music's content data platform from version 1.0 to 4.0, describing the migration from ClickHouse to Apache Doris, the introduction of a semantic layer, optimization of data ingestion, query performance, and cost reduction strategies that dramatically improved data timeliness, operational efficiency, and storage costs.

Apache Doris · Big Data · Data Architecture
0 likes · 23 min read
iQIYI Technical Product Team
Feb 3, 2023 · Big Data

Data Lake Concepts, Benefits, and Iceberg‑Based Implementations at iQIYI

iQIYI’s data lake combines public‑cloud and private storage with Apache Iceberg’s snapshot‑based table format to enable near‑real‑time, unified batch‑and‑stream analytics, reducing costs, simplifying architecture, and improving data freshness across use cases such as log collection, audit, pingback, and member order processing.

Apache Iceberg · Data Architecture · Data Lake
0 likes · 25 min read
dbaplus Community
Dec 13, 2022 · Big Data

How ClickHouse Powers Real-Time Self-Service Analytics at Scale

Facing massive daily data volumes and complex, ad‑hoc analytical needs, Zhuanzhuan’s engineering team evaluated multiple OLAP engines and chose ClickHouse, then built a four‑layer self‑service analytics platform; the article details the architecture, use cases, performance tuning, large‑scale joins, and future roadmap challenges.

Big Data · Data Architecture · OLAP
0 likes · 14 min read
ITPUB
Dec 10, 2022 · Big Data

How ClickHouse Powers Real-Time Self-Service Analytics at Scale

This article examines why ClickHouse was chosen as the OLAP engine for a massive self‑service analytics platform, describes the system architecture, shares concrete memory and performance tuning parameters, and outlines current challenges and future roadmap for large‑scale real‑time data analysis.

Big Data · Data Architecture · OLAP
0 likes · 14 min read
Zhuanzhuan Tech
Dec 7, 2022 · Databases

ClickHouse in Self‑Service Analytics: OLAP Selection, Platform Architecture, Optimization Practices, and Future Outlook

This article examines the selection of ClickHouse as the OLAP engine for a self‑service analytics platform, describes the platform’s architecture, details memory and performance tuning techniques, discusses large‑scale join handling, and outlines current challenges and future development directions for ClickHouse.

Data Architecture · OLAP · Self-Service Analytics
0 likes · 12 min read
Architects Research Society
Nov 18, 2022 · Databases

Formal Naming of Data Schemas, Structures, and Models: Distinctions and Hierarchies

The article explains how to formally name data schemas, structures, and models, clarifies the differences between internal, external, logical, and physical schemas, and proposes a systematic hierarchy—including strategic and tactical schemas—to improve data architecture and reduce confusion in enterprise data design.

Data Architecture · Database design · Information Systems
0 likes · 11 min read
DevOps Cloud Academy
Nov 5, 2022 · Fundamentals

Understanding Data Architecture: Definitions, Problems Solved, Core Components, and Future Trends

This article explains what data architecture is, why it is essential for linking business and technology, outlines its main components such as data models, data flows, value streams and standards, and discusses emerging trends toward service‑oriented, consumption‑focused data architectures.

Data Architecture · Data Governance · Data Management
0 likes · 9 min read
DataFunTalk
Oct 17, 2022 · Big Data

How Data Empowers the Fast‑Moving Consumer Goods Industry: Baicaowei’s End‑to‑End Data Platform Evolution

This article details Baicaowei’s journey from a Hadoop‑based data platform to a modern StarRocks‑driven architecture, illustrating how digitalization, evolving business needs, and streamlined data pipelines empower the fast‑moving consumer goods sector through efficient data collection, modeling, and analytics.

Big Data · Data Architecture · Digital Transformation
0 likes · 10 min read
Alibaba Cloud Big Data AI Platform
Sep 13, 2022 · Big Data

From Hadoop to Cloud‑Native: The Evolution of Data Lakes and Modern Architecture

This article traces the history of data lakes from their 2010 inception with Hadoop through cloud‑native object storage, lakehouse formats like Delta Lake, and Alibaba Cloud's multi‑layer solution, outlining key architectural stages and practical construction challenges for enterprise‑grade implementations.

Alibaba Cloud · Big Data · Cloud Native
0 likes · 9 min read
Efficient Ops
Aug 14, 2022 · Databases

How TDengine 3.0 Redefines Cloud‑Native Time‑Series Databases

The TDengine Developer Conference in Beijing unveiled the open‑source, cloud‑native TDengine 3.0, detailing its revolutionary architecture that tackles high‑cardinality challenges, introduces Raft‑based distribution, and showcases real‑world IoT and IT‑operations case studies where enterprises dramatically improved performance and reduced costs.

Data Architecture · IoT · TDengine
0 likes · 11 min read
GuanYuan Data Tech Team
Aug 4, 2022 · Cloud Native

What Is a Cloud‑Native Data Platform? Architecture, Components, and Best Practices

This article explores the evolution and architecture of cloud‑native data platforms, covering their historical roots, modern components such as storage layers, ingestion, processing, metadata, and consumption, and offers practical guidance on selecting tools, designing pipelines, and implementing best‑practice strategies for scalable, flexible data infrastructure.

Data Architecture · big-data · cloud-native
0 likes · 41 min read
High Availability Architecture
Jul 7, 2022 · Big Data

Interview with Tencent Cloud’s Zhang Zhigang on Lakehouse Architecture and Cloud‑Native Integration

In this interview, Tencent Cloud expert Zhang Zhigang explains the fundamentals and key technologies of lakehouse architecture, discusses how cloud‑native practices enhance its performance and operability, and offers practical advice for big‑data professionals ahead of the 2022 GIAC Global Internet Architecture Conference in Shenzhen.

Cloud Native · Data Architecture · Lakehouse
0 likes · 10 min read
Baidu Geek Talk
Jul 1, 2022 · Big Data

Evolution of Data Platform Technology: From Data Warehouse to Lakehouse Architecture

The article traces the evolution of data platforms from early data warehouses—using schema‑on‑write, columnar storage, and MPP engines—to data lakes that retain raw data with schema‑on‑read, and finally to lakehouse architectures that merge storage and compute, offering unified metadata, versioning, and support for BI, big‑data, AI, and HPC workloads.

Data Architecture · Lakehouse · OLAP
0 likes · 25 min read
Architects Research Society
Jun 25, 2022 · Fundamentals

Data Architecture: Definition, Goals, Principles, Components, and Best Practices

This article explains data architecture as the transformation of business needs into data and system requirements, outlines its objectives, core principles, essential components, the relationship with data modeling, relevant frameworks, and modern best‑practice guidelines for building scalable, cloud‑native, AI‑enabled architectures.

AI integration · Cloud Native · Data Architecture
0 likes · 10 min read