Tagged articles
8 articles
DataFunSummit
May 20, 2026 · Big Data

How Kuaishou’s Real‑Time Data Lake Boosts AI and BI Architecture

The article explains how Kuaishou partnered with Apache Hudi to overhaul its ODS‑based data lake, addressing latency, storage cost, and complexity for AI and BI workloads. It details the evolution from MySQL‑to‑Hive to MySQL‑to‑Hudi 1.0 and 2.0, the resulting performance gains and cost savings, and the future roadmap.

AI, BI, Big Data
20 min read
DataFunSummit
May 17, 2024 · Big Data

Comprehensive Hudi Real-Time Data Lake Ingestion Solutions

This article presents a complete guide to Hudi-based real-time data lake ingestion, covering overall data integration architecture, batch and streaming ingestion strategies, advanced table design, and practical recommendations for handling challenges such as deduplication, latency, partitioning, and performance optimization.

Batch Processing, Big Data, Data Lake
12 min read
DataFunTalk
Oct 28, 2023 · Big Data

Data Lake Architecture, Ingestion Options, Real-time Optimization, and Query Practices

This article presents a comprehensive overview of a unified data lake architecture, evaluates three ingestion solutions, details real‑time ingestion optimizations for Flink‑Hudi pipelines, and describes how Kyuubi enables unified query access across multiple engines, offering practical guidance for large‑scale data processing.

Big Data, Data Lake, Flink
14 min read
Huawei Cloud Developer Alliance
Aug 4, 2023 · Databases

How GaussDB(DWS) HStore Tables Enable Real‑Time Ingestion and Lightning‑Fast Queries

Discover how GaussDB(DWS) HStore tables combine columnar storage with Delta capabilities to support high‑concurrency real‑time data ingestion, upsert operations, and ultra‑fast analytical queries, while offering full ACID consistency, strong compression, and practical configuration tips for developers.

Columnar, Data Warehousing, Database Storage
8 min read
Volcano Engine Developer Services
Mar 29, 2023 · Backend Development

How ByteHouse Achieves High‑Availability Real‑Time Data Ingestion with HaKafka

ByteHouse evolved its real‑time import pipeline from a community ClickHouse architecture to a custom HaKafka engine and a cloud‑native design, addressing node failures, read‑write conflicts, scaling costs, and latency by introducing two‑level concurrency, memory tables, exactly‑once semantics, and robust fault‑tolerance.

Distributed Systems, Kafka, Real-time Ingestion
15 min read
DataFunTalk
Mar 29, 2023 · Big Data

Evolution of ByteHouse Real‑Time Ingestion: From Internal Demands to a Cloud‑Native Architecture

This article details the motivation, architectural evolution, and technical implementation of ByteHouse's real‑time ingestion pipeline, covering internal business requirements, distributed‑system challenges, the custom HaKafka engine, memory‑table optimizations, and the transition to a cloud‑native design that delivers high availability, low latency, and exactly‑once semantics.

ByteHouse, Kafka, Real-time Ingestion
13 min read
DataFunTalk
Nov 3, 2020 · Big Data

Xiaomi Growth Analytics System: Architecture Evolution and Doris Optimization

The article details Xiaomi's growth analytics platform evolution from a Lambda architecture using SparkSQL, Kudu, and HDFS to a streamlined MPP solution with Apache Doris, covering performance gains, real‑time data ingestion, query tuning, and operational improvements for large‑scale analytics.

Apache Doris, OLAP, Real-time Ingestion
20 min read