Tagged articles
201 articles
Page 1 of 3
DataFunSummit
DataFunSummit
May 20, 2026 · Databases

Apache Doris 4.1: A Unified Data Store and Retrieval Engine for AI & Search

Apache Doris 4.1 introduces a systematic evolution for AI and search workloads, adding low‑cost massive vector storage, unified structured, full‑text and vector search, 100 MB JSON document support, Segment V3 metadata decoupling, sparse column optimizations, lakehouse lifecycle management, and a suite of performance‑boosting features such as aggregate push‑down, condition cache, and spill‑to‑disk, all backed by detailed benchmark results.

AIApache DorisLakehouse
0 likes · 30 min read
Apache Doris 4.1: A Unified Data Store and Retrieval Engine for AI & Search
DataFunSummit
DataFunSummit
May 14, 2026 · Big Data

How Gravitino, Daft, and Lance Enable Secure, AI‑Driven Multimodal Lakehouse

The article examines the challenges of multimodal data in modern lakehouses and presents a three‑tool stack—Gravitino, Daft, and Lance—that provides unified metadata, distributed multimodal compute, and high‑performance storage, while detailing security governance, integration paths, and future directions.

DaftGravitinoLakehouse
0 likes · 11 min read
How Gravitino, Daft, and Lance Enable Secure, AI‑Driven Multimodal Lakehouse
DataFunTalk
DataFunTalk
May 11, 2026 · Big Data

How Xiaohongshu Re‑engineered Its Data Architecture for the Big AI Data Era

Xiaohongshu transformed its data platform from a simple ClickHouse‑based ad‑hoc analysis to a Lambda‑style architecture and finally to a lakehouse built on Iceberg, StarRocks, Flink and Spark, cutting architecture complexity, resource and development costs by two‑thirds while supporting trillions of daily events with sub‑second query latency.

Big DataClickHouseFlink
0 likes · 22 min read
How Xiaohongshu Re‑engineered Its Data Architecture for the Big AI Data Era
DataFunTalk
DataFunTalk
May 6, 2026 · Big Data

How Xiaohongshu Evolved Its Data Architecture for the Big AI Data Era

The article details Xiaohongshu's four‑stage data‑platform evolution—from a simple ClickHouse ad‑hoc setup to a Lambda‑based 2.0 design and finally a lakehouse‑driven 3.0 architecture—highlighting the adoption of general incremental compute, cost‑reduction to one‑third, performance gains of up to ten‑fold, and the SPOT standards that guide the new system.

Big DataClickHouseData Architecture
0 likes · 21 min read
How Xiaohongshu Evolved Its Data Architecture for the Big AI Data Era
DataFunTalk
DataFunTalk
Apr 29, 2026 · Big Data

How Xiaohongshu Revamped Its Data Architecture for the Big AI Data Era

Xiaohongshu transformed its data platform from a simple ClickHouse‑based analytics stack to a unified lakehouse with generic incremental compute, cutting architecture complexity, resource cost, and development effort by roughly one‑third while supporting petabyte‑scale, sub‑second queries across its 350 million‑user app.

Big DataClickHouseData Architecture
0 likes · 22 min read
How Xiaohongshu Revamped Its Data Architecture for the Big AI Data Era
DataFunTalk
DataFunTalk
Apr 22, 2026 · Industry Insights

How Xiaohongshu Cut Data Platform Costs by Two‑Thirds with Incremental Computing

This article details Xiaohongshu's journey from a ClickHouse‑based batch analytics stack to a unified lakehouse architecture powered by generic incremental computing, showing how the company reduced architecture complexity, resource consumption and development effort each to roughly one‑third while supporting trillions of daily events with sub‑10‑second query latency.

Big DataData ArchitectureLakehouse
0 likes · 24 min read
How Xiaohongshu Cut Data Platform Costs by Two‑Thirds with Incremental Computing
StarRocks
StarRocks
Apr 16, 2026 · Databases

Why Traditional Databases Stall AI Agents—and How StarRocks Overcomes the Bottleneck

Traditional databases were built for low‑frequency, human‑driven queries, but AI agents generate dozens of concurrent, sub‑second queries that expose architectural limits, and StarRocks addresses these challenges with self‑healing optimization, real‑time data pipelines, extreme concurrency handling, and seamless lakehouse access.

Database ConcurrencyLakehouseReal-time analytics
0 likes · 13 min read
Why Traditional Databases Stall AI Agents—and How StarRocks Overcomes the Bottleneck
DataFunTalk
DataFunTalk
Apr 16, 2026 · Big Data

How Xiaohongshu Cut Data Architecture Costs by Two‑Thirds with Incremental Computing

This article details Xiaohongshu's data platform evolution from a simple ClickHouse‑based ad‑hoc system to a Lambda‑style architecture and finally a lakehouse solution, highlighting how the adoption of a new incremental computing model reduced architectural complexity, resource consumption and development effort each to roughly one‑third while delivering sub‑second query performance on petabyte‑scale data.

Big DataData ArchitectureLakehouse
0 likes · 21 min read
How Xiaohongshu Cut Data Architecture Costs by Two‑Thirds with Incremental Computing
DataFunTalk
DataFunTalk
Apr 10, 2026 · Big Data

How Xiaohongshu Cut Data Architecture Costs by Two‑Thirds with Incremental Computing

This article analyzes Xiaohongshu's data platform evolution—from a simple ClickHouse‑based analytics layer to a Lambda architecture and finally a lakehouse design—highlighting how adopting a new incremental computing model reduced architecture complexity, resource consumption, and development effort each to roughly one‑third while delivering sub‑second query performance on petabyte‑scale data.

Big DataData ArchitectureLakehouse
0 likes · 22 min read
How Xiaohongshu Cut Data Architecture Costs by Two‑Thirds with Incremental Computing
Baidu Geek Talk
Baidu Geek Talk
Mar 23, 2026 · Databases

How Baidu’s MEG Platform Revamped ClickHouse with a Lakehouse Architecture

This article analyzes the challenges of scaling ClickHouse within Baidu’s MEG data platform and details a lake‑house solution that decouples storage and compute, integrates a meta‑service for transparent data access, optimizes query performance through caching, data roll‑up and layout tuning, and introduces a unified query gateway that gracefully falls back to Spark for complex workloads.

ClickHouseData PlatformLakehouse
0 likes · 25 min read
How Baidu’s MEG Platform Revamped ClickHouse with a Lakehouse Architecture
StarRocks
StarRocks
Mar 5, 2026 · Big Data

How Fanatics Scaled to PB‑Level Data with StarRocks & Apache Iceberg Lakehouse

Fanatics unified its fragmented data stack by building a StarRocks‑powered Lakehouse on Apache Iceberg, replacing Redshift, Snowflake, Athena, and Druid, which cut costs by up to 95%, delivered sub‑second dashboard queries on petabyte‑scale data, and enabled real‑time and historical analytics on a single platform.

Apache IcebergData ArchitectureFanatics
0 likes · 10 min read
How Fanatics Scaled to PB‑Level Data with StarRocks & Apache Iceberg Lakehouse
StarRocks
StarRocks
Feb 11, 2026 · Big Data

How StarRocks and Apache Paimon Build a True Lakehouse Native Engine

This article details the deep integration of StarRocks with Apache Paimon, describing the unified architecture, version evolution, performance enhancements, time‑travel queries, native readers/writers, distributed planning, and future roadmap for achieving lakehouse‑native analytics at scale.

Apache PaimonData LakeLakehouse
0 likes · 10 min read
How StarRocks and Apache Paimon Build a True Lakehouse Native Engine
Alibaba Cloud Big Data AI Platform
Alibaba Cloud Big Data AI Platform
Feb 2, 2026 · Big Data

Real‑Time Analytics with Alibaba Cloud Serverless Spark & Paimon for Taobao Flash Sale

This article details how Alibaba Cloud EMR Serverless Spark combined with the Paimon lakehouse framework enables Taobao Flash Sale’s retail data team to achieve low‑latency, high‑throughput real‑time analytics, batch processing, and feature generation, outlining architecture evolution, performance gains, and practical Spark tuning techniques.

Big DataLakehousePaimon
0 likes · 18 min read
Real‑Time Analytics with Alibaba Cloud Serverless Spark & Paimon for Taobao Flash Sale
Alibaba Cloud Big Data AI Platform
Alibaba Cloud Big Data AI Platform
Feb 2, 2026 · Big Data

How We Built a Scalable Lakehouse Architecture with StarRocks, Paimon, and Flink

This article details the evolution of a data warehouse at RenliJia from a MaxCompute‑centric setup to a modern lakehouse using StarRocks, Paimon, Flink, and Fluss, describing design goals, technical evaluations, implementation steps for offline, OLAP, and real‑time workloads, and the challenges and future plans that emerged.

Big DataData WarehouseFlink
0 likes · 25 min read
How We Built a Scalable Lakehouse Architecture with StarRocks, Paimon, and Flink
StarRocks
StarRocks
Jan 15, 2026 · Artificial Intelligence

How AI‑First Lakehouse Redefines Data Platforms for Multimodal Analytics

The article outlines the evolution from traditional OLAP to an AI‑first Lakehouse, detailing unified multimodal storage, CPU/GPU heterogeneous scheduling, native vector search, in‑database AI inference, agent‑centric execution, and self‑evolving platform capabilities that together reshape modern data analytics.

AIAgent ArchitectureBig Data
0 likes · 11 min read
How AI‑First Lakehouse Redefines Data Platforms for Multimodal Analytics
Alibaba Cloud Big Data AI Platform
Alibaba Cloud Big Data AI Platform
Jan 8, 2026 · Big Data

How Gaode Maps Built a Real‑Time Lakehouse for Billion‑Scale Trajectory Data

This article details Gaode Maps' end‑to‑end lakehouse solution for massive, high‑frequency trajectory data, covering the challenges of real‑time visibility, query performance, and storage cost, and explaining how a hot‑warm‑cold tiering architecture built on Apache Flink, Paimon, StarRocks, Redis and Lindorm delivers millisecond‑level queries while cutting storage expenses.

Apache FlinkApache PaimonData Tiering
0 likes · 19 min read
How Gaode Maps Built a Real‑Time Lakehouse for Billion‑Scale Trajectory Data
StarRocks
StarRocks
Jan 7, 2026 · Big Data

How Gaode Maps Built a Real‑Time Lakehouse for Billion‑Scale Trajectory Data

This article details Gaode Maps' end‑to‑end lakehouse solution for handling high‑frequency, high‑volume trajectory data, covering the challenges of real‑time visibility, multi‑scenario queries, storage cost, and data silos, and describing the layered storage architecture, performance validation, and future expansion plans.

Apache FlinkData TieringLakehouse
0 likes · 21 min read
How Gaode Maps Built a Real‑Time Lakehouse for Billion‑Scale Trajectory Data
AsiaInfo Technology: New Tech Exploration
AsiaInfo Technology: New Tech Exploration
Jan 6, 2026 · Industry Insights

Apache Paimon: Boosting Real-Time Data Lakes for Fraud Detection & Manufacturing

This article examines Apache Paimon’s innovative lakehouse architecture, detailing its LSM‑Tree storage, flexible merge engine, and multi‑engine integration, and showcases two real‑world deployments—an operator’s real‑time fraud‑prevention system and a manufacturing firm’s unified data platform—highlighting performance gains and cost reductions.

Apache PaimonBig DataCase Study
0 likes · 15 min read
Apache Paimon: Boosting Real-Time Data Lakes for Fraud Detection & Manufacturing
Alibaba Cloud Big Data AI Platform
Alibaba Cloud Big Data AI Platform
Dec 30, 2025 · Big Data

How StarRocks and Apache Paimon Unite to Build a True Lakehouse Native Engine

StarRocks and Apache Paimon have been progressively integrated across multiple releases, enabling a unified lakehouse architecture that supports multi-source federated analysis, time-travel queries, native readers/writers, distributed planning, and advanced profiling, while delivering performance gains that bring Paimon query speed on par with native StarRocks tables.

Apache PaimonData IntegrationLakehouse
0 likes · 9 min read
How StarRocks and Apache Paimon Unite to Build a True Lakehouse Native Engine
Alibaba Cloud Big Data AI Platform
Alibaba Cloud Big Data AI Platform
Dec 24, 2025 · Big Data

How Paimon’s Column‑Separation Architecture Powers Real‑Time Multi‑Modal Lakehouse for AI

This article explains the challenges of frequent column changes in AI feature engineering, introduces Paimon’s column‑separation storage with a global continuous Row ID, details its Blob data type for efficient multi‑modal handling, and outlines production results and future roadmap for building an AI‑native data lakehouse.

Apache PaimonBig DataBlob
0 likes · 11 min read
How Paimon’s Column‑Separation Architecture Powers Real‑Time Multi‑Modal Lakehouse for AI
StarRocks
StarRocks
Dec 18, 2025 · Databases

How Fresha Scaled Real‑Time Analytics with StarRocks: A Deep Dive into Their Hybrid Architecture

Facing Postgres overload and costly Snowflake queries, Fresha rebuilt its analytics platform by introducing StarRocks as a unified SQL entry point, combining federated lakehouse queries with high‑performance internal tables, which reduced homepage query latency to around 200 ms and achieved minute‑level data freshness across real‑time, historical, and search workloads.

Compute-Storage SeparationHybrid ArchitectureLakehouse
0 likes · 20 min read
How Fresha Scaled Real‑Time Analytics with StarRocks: A Deep Dive into Their Hybrid Architecture
DataFunSummit
DataFunSummit
Dec 1, 2025 · Big Data

7 Cutting-Edge Data Engineering Practices Shaping AI-Driven Data Lakes

This article collection showcases seven advanced data engineering solutions—from Tencent Cloud's Iceberg batch‑stream integration and Apache Gravitino metadata lineage to Xiaohongshu's Lakehouse evolution and multimodal AI data lake implementations—highlighting architectural innovations, performance optimizations, and real‑world deployment insights for modern big‑data platforms.

Apache GravitinoApache IcebergBatch-Stream Integration
0 likes · 7 min read
7 Cutting-Edge Data Engineering Practices Shaping AI-Driven Data Lakes
Big Data Technology & Architecture
Big Data Technology & Architecture
Nov 28, 2025 · Big Data

What’s New in Apache Paimon 2025? Core Performance, AI Integration & Real‑Time Lakehouse Updates

The 2025 Apache Paimon release brings major performance boosts, AI‑centric multimodal storage, deeper streaming‑batch integration, and broader engine compatibility, detailing query and write optimizations, memory management tweaks, and a unified lake format for structured and unstructured data.

AI integrationApache PaimonBig Data
0 likes · 6 min read
What’s New in Apache Paimon 2025? Core Performance, AI Integration & Real‑Time Lakehouse Updates
Alibaba Cloud Developer
Alibaba Cloud Developer
Nov 20, 2025 · Big Data

Mastering Large‑Scale Data Migration: Challenges, Strategies and Real‑World Solutions

This article explains why data migration is the essential first step for cloud modernization, outlines the technical challenges of moving terabytes to petabytes, compares physical and logical migration methods, and presents practical solutions and real‑world case studies across Hive, cloud warehouses, lake‑house formats and analytic databases.

Big DataData MigrationETL
0 likes · 56 min read
Mastering Large‑Scale Data Migration: Challenges, Strategies and Real‑World Solutions
Big Data Technology & Architecture
Big Data Technology & Architecture
Oct 13, 2025 · Databases

Apache Doris 3.1 Unveiled: Variant, Index, and Lakehouse Boosts

The Apache Doris 3.1 release strengthens lake‑house capabilities with major upgrades to the VARIANT data type, vertical compaction, inverted index storage, new tokenizers, enhanced materialized view support for Iceberg/Paimon/Hudi, and numerous query‑performance optimizations such as faster partition pruning and dynamic partition clipping, offering smoother handling of thousands of columns and large‑scale semi‑structured data.

Apache DorisLakehouseVARIANT
0 likes · 8 min read
Apache Doris 3.1 Unveiled: Variant, Index, and Lakehouse Boosts
DataFunSummit
DataFunSummit
Oct 4, 2025 · Big Data

How Xiaohongshu Cut Data Architecture Costs by Two‑Thirds with Incremental Computing

This article details how Xiaohongshu's massive user‑generated data platform evolved from a simple ClickHouse‑based architecture to a multi‑stage lakehouse design, adopting a new incremental computing model that reduced architecture complexity, resource and development costs by one‑third while boosting query performance for petabyte‑scale data.

Lakehouseincremental computing
0 likes · 19 min read
How Xiaohongshu Cut Data Architecture Costs by Two‑Thirds with Incremental Computing
IT Architects Alliance
IT Architects Alliance
Sep 21, 2025 · Big Data

From Data Warehouses to Lakehouses: Why Data Architecture Keeps Evolving

This article traces the three‑generation evolution of data architecture—from the structured‑data era of data warehouses, through the flexible, multi‑format data lake, to the unified lakehouse model—explaining the drivers, benefits, challenges, and future trends shaping modern data platforms.

Data ArchitectureData LakeData Warehouse
0 likes · 11 min read
From Data Warehouses to Lakehouses: Why Data Architecture Keeps Evolving
High Availability Architecture
High Availability Architecture
Sep 10, 2025 · Big Data

How Ctrip Business Travel Built a Near‑Real‑Time Lakehouse with Flink CDC & Paimon

This article details Ctrip Business Travel’s implementation of a near‑real‑time data warehouse using Flink CDC and the Paimon lakehouse engine, covering order wide‑table construction, ticket refund alerts, ad attribution, batch‑stream integration, and practical lessons on Partial Update, Aggregation, and Tag‑based incremental processing.

?=Batch-Stream IntegrationFlink
0 likes · 17 min read
How Ctrip Business Travel Built a Near‑Real‑Time Lakehouse with Flink CDC & Paimon
Alibaba Cloud Big Data AI Platform
Alibaba Cloud Big Data AI Platform
Sep 8, 2025 · Big Data

How Ele.me Revolutionized Real‑Time Data Warehousing with Flink‑Paimon Lakehouse

In this detailed case study, Alibaba’s Ele.me team explains how they evolved from siloed, chimney‑style real‑time warehouses to a unified Flink‑Paimon lakehouse, highlighting the three development stages, technology evaluations, the Alake platform’s one‑stop capabilities, production results, and future directions such as Fluss and AI integration.

AlakeFlinkLakehouse
0 likes · 17 min read
How Ele.me Revolutionized Real‑Time Data Warehousing with Flink‑Paimon Lakehouse
Alibaba Cloud Big Data AI Platform
Alibaba Cloud Big Data AI Platform
Sep 5, 2025 · Big Data

How StarRocks + Paimon Powered Real‑Time Analytics for Alibaba’s Taobao Flash Sale

Facing minute‑level decision demands and billions of marketing events during Taobao's Flash Sale, the Ele.me data team built a real‑time lakehouse with StarRocks and Paimon, leveraging asynchronous materialized views, RoaringBitmap de‑duplication, and resource isolation to achieve sub‑second query latency, lower storage costs, and stable high‑concurrency.

LakehouseMaterialized ViewsPaimon
0 likes · 25 min read
How StarRocks + Paimon Powered Real‑Time Analytics for Alibaba’s Taobao Flash Sale
StarRocks
StarRocks
Sep 2, 2025 · Big Data

How StarRocks + Paimon Powered Real‑Time Analytics for Alibaba’s Flash Sale

Faced with billions of marketing events and minute‑level decision requirements during Taobao's flash‑sale campaign, the e‑commerce data team built a real‑time lakehouse using StarRocks and Paimon, leveraged asynchronous materialized views and RoaringBitmap deduplication, and achieved sub‑second query latency, massive cost savings, and stable high‑concurrency performance.

Big DataLakehouseMaterialized Views
0 likes · 26 min read
How StarRocks + Paimon Powered Real‑Time Analytics for Alibaba’s Flash Sale
Ctrip Technology
Ctrip Technology
Sep 2, 2025 · Big Data

How Ctrip Built a Near‑Real‑Time Lakehouse with Flink & Paimon

This article details Ctrip Business Travel’s implementation of a near‑real‑time data warehouse and lakehouse using Flink CDC and Apache Paimon, covering order wide‑table construction, automated ticket reminders, ad attribution, batch‑stream integration, and lessons on Partial Update, Aggregation, and Tag‑based incremental processing.

Batch-Stream IntegrationFlinkLakehouse
0 likes · 17 min read
How Ctrip Built a Near‑Real‑Time Lakehouse with Flink & Paimon
StarRocks
StarRocks
Aug 19, 2025 · Big Data

How Joydata Scaled to 150 Billion Daily Events with StarRocks: A Data Architecture Journey

Facing daily data growth from millions to 150 billion records, Joydata‑U transformed its analytics platform through three architectural stages—Hadoop, Hadoop + Trino, and finally StarRocks—introducing resource isolation, Flat JSON acceleration, and Bitmap indexing to cut query latency by up to seven times and achieve sub‑2‑minute data freshness across BI, ad‑tech, game analytics, and CRM workloads.

Bitmap IndexData ArchitectureFlat JSON
0 likes · 12 min read
How Joydata Scaled to 150 Billion Daily Events with StarRocks: A Data Architecture Journey
Big Data Technology Tribe
Big Data Technology Tribe
Aug 12, 2025 · Databases

Why Lakehouse Architecture Is Redefining Modern Data Platforms

This article explains the evolution from traditional data warehouses and data lakes to the unified Lakehouse architecture, detailing its design, benefits, challenges, and research directions for delivering high‑performance SQL and advanced analytics on open‑format storage.

Big DataData LakeData Warehouse
0 likes · 20 min read
Why Lakehouse Architecture Is Redefining Modern Data Platforms
58 Tech
58 Tech
Aug 7, 2025 · Big Data

Transform Real‑Time Data Warehousing with Paimon: From Flink ROW_NUMBER to Streaming Lakehouse

This article details how a real‑time data warehouse built on Flink, Kafka, HBase and MySQL was redesigned using Paimon to eliminate costly deduplication, handle out‑of‑order events, enable streaming reads, simplify aggregation, replace multiple lookup sources, and achieve faster, more reliable batch repairs, resulting in major resource and operational gains.

Data WarehouseFlinkLakehouse
0 likes · 24 min read
Transform Real‑Time Data Warehousing with Paimon: From Flink ROW_NUMBER to Streaming Lakehouse
StarRocks
StarRocks
Jul 23, 2025 · Big Data

How StarRocks Powers Intelligent BI with AI‑Native Lakehouse Architecture

This article explores the evolution of business intelligence toward intelligent BI, detailing traditional BI limitations, agile BI improvements, and how StarRocks' MPP lakehouse engine combined with large language models enables natural‑language analytics, real‑time performance, AI‑driven insights, and scalable enterprise deployments.

AI integrationIntelligent BILakehouse
0 likes · 19 min read
How StarRocks Powers Intelligent BI with AI‑Native Lakehouse Architecture
DataFunSummit
DataFunSummit
Jul 18, 2025 · Databases

Boosting ClickHouse on WeChat: Performance Tools, Lakehouse Hacks & AI

This article explores how ClickHouse is deployed across WeChat for real‑time analytics, introduces a suite of performance‑monitoring tools, details lakehouse read and bitmap optimizations, and describes the integration of AI‑driven vector search, showcasing substantial speedups and scalability improvements.

AIBig DataClickHouse
0 likes · 12 min read
Boosting ClickHouse on WeChat: Performance Tools, Lakehouse Hacks & AI
DataFunSummit
DataFunSummit
Jul 18, 2025 · Big Data

Data Lake & Lakehouse Innovations: Real-Time Analytics and Industry Case Studies

This article presents a curated collection of cutting‑edge data lake and lakehouse case studies—including real‑time analytics, cloud‑native architectures, industry implementations from sales platforms to automotive IoT, and the latest advancements in open‑source projects—offering insights into modern big‑data strategies and governance.

Big DataData LakeLakehouse
0 likes · 2 min read
Data Lake & Lakehouse Innovations: Real-Time Analytics and Industry Case Studies
DataFunTalk
DataFunTalk
Jul 13, 2025 · Big Data

Unlock Real-Time Multidimensional Insights with Cloud Lakehouse Technology

This guide presents a series of expert case studies and insights on how Cloud Lakehouse solutions enable real‑time, fully managed multidimensional data analysis, improve user data experiences, balance performance and cost, and power large‑scale IoT and big‑data platforms across industries.

Data ArchitectureIoTLakehouse
0 likes · 2 min read
Unlock Real-Time Multidimensional Insights with Cloud Lakehouse Technology
DataFunTalk
DataFunTalk
Jul 9, 2025 · Big Data

How Lakehouse Is Transforming Real‑Time Multi‑Dimensional Analytics

This article compiles a series of expert case studies and insights on real‑time intelligent fully‑managed Lakehouse technology, illustrating how companies such as SalesEasy, Chang’an Auto, Kuaishou, Tencent, and JD.com leverage lakehouse architectures to achieve advanced multi‑dimensional analytics, cost‑performance balance, and effective data governance in the digital economy.

Case StudiesData ArchitectureData Governance
0 likes · 2 min read
How Lakehouse Is Transforming Real‑Time Multi‑Dimensional Analytics
DataFunTalk
DataFunTalk
Jul 8, 2025 · Big Data

Explore Cutting-Edge Lakehouse Solutions: Real-Time Analytics & Data Governance

This guide presents a curated collection of case studies and insights on cloud-native Lakehouse architectures, real‑time analytics, data‑driven user experiences, and data governance, showcasing implementations from companies like SalesEasy, Changan Auto, TikTok, Tencent, JD.com, and more.

Case StudiesData GovernanceLakehouse
0 likes · 2 min read
Explore Cutting-Edge Lakehouse Solutions: Real-Time Analytics & Data Governance
DataFunTalk
DataFunTalk
Jul 7, 2025 · Big Data

Unlock Real-Time Analytics with Cloud Lakehouse: A Complete Guide

This article presents a curated list of sessions covering cloud Lakehouse technology for real-time, multidimensional data analysis, including case studies from SalesEasy, Changan Auto, Tencent, and JD, as well as discussions on data lake adoption, streaming lake Paimon, and the relevance of metadata‑driven data governance in the digital economy.

Big DataCase StudyData Governance
0 likes · 2 min read
Unlock Real-Time Analytics with Cloud Lakehouse: A Complete Guide
DataFunTalk
DataFunTalk
Jul 6, 2025 · Big Data

How Cloud Lakehouse Is Redefining Real-Time Multi-Dimensional Data Analytics

This article presents a curated list of case studies and insights on cloud Lakehouse technology, covering real-time intelligent analytics, data architecture simplification, IoT big‑data platforms, integrated data platforms, and the evolving role of metadata‑driven data governance in the digital economy.

Big DataCase StudiesData Governance
0 likes · 2 min read
How Cloud Lakehouse Is Redefining Real-Time Multi-Dimensional Data Analytics
DataFunSummit
DataFunSummit
Jun 22, 2025 · Databases

Unlocking Apache Doris: How Lakehouse Integration Supercharges Data Analytics

This article walks through Apache Doris’s lakehouse‑in‑one architecture, explains its core value and paradigm, details the system’s components and use cases, examines technical challenges such as file‑format diversity and I/O stability, and presents a suite of optimizations—from predicate push‑down and partition pruning to metadata caching and dynamic scheduling—that dramatically improve query performance and resource utilization, while also outlining future roadmap plans.

Apache DorisBig DataData Warehouse
0 likes · 22 min read
Unlocking Apache Doris: How Lakehouse Integration Supercharges Data Analytics
DataFunSummit
DataFunSummit
Jun 3, 2025 · Big Data

BiFang: A Unified Lake‑Stream Storage Engine for Real‑Time and Batch Data Processing

BiFang is a lake‑stream integrated storage engine that merges Apache Pulsar message‑queue capabilities with Iceberg data‑lake features, providing a single unified data store with full‑incremental queries, sub‑second visibility, exactly‑once semantics, and seamless integration with Flink, Spark, and StarRocks for both real‑time analytics and batch processing.

Apache IcebergApache PulsarLakehouse
0 likes · 13 min read
BiFang: A Unified Lake‑Stream Storage Engine for Real‑Time and Batch Data Processing
Big Data Technology & Architecture
Big Data Technology & Architecture
Apr 29, 2025 · Big Data

Big Data Interview Preparation: Data Governance, Iceberg Metadata, Lakehouse Best Practices, and Xiaohongshu HR Updates

The article reports Xiaohongshu’s cancellation of the big‑small week schedule and non‑compete clause, then provides a collection of big‑data interview questions—including data governance, Iceberg metadata management, and lakehouse production best practices—along with concise answers and resources for candidates.

Data GovernanceIcebergLakehouse
0 likes · 7 min read
Big Data Interview Preparation: Data Governance, Iceberg Metadata, Lakehouse Best Practices, and Xiaohongshu HR Updates
Big Data Tech Team
Big Data Tech Team
Mar 25, 2025 · Big Data

How Apache Paimon Transforms Real‑Time Lakehouse Architecture

This article analyzes the limitations of a traditional Flink + Talos + Iceberg real‑time lakehouse, introduces Apache Paimon's lakehouse table format and LSM storage, and demonstrates three practical use cases—partial‑update widening, streaming upsert, and lookup join—showing cost, stability, and performance improvements while outlining future roadmap.

Apache PaimonFlinkLakehouse
0 likes · 16 min read
How Apache Paimon Transforms Real‑Time Lakehouse Architecture
Big Data Technology & Architecture
Big Data Technology & Architecture
Mar 17, 2025 · Big Data

Lakehouse Implementations at Leading Companies: Challenges, Solutions, and Benefits

This article reviews how major tech firms such as Alibaba, Tencent, ByteDance, and Kuaishou tackled lakehouse challenges—including architecture fragmentation, cost, scalability, and complex multimodal data—by adopting real‑time lakehouse solutions like Flink + Paimon, Iceberg + StarRocks, Hudi + LAS, and Doris + Alluxio, and outlines the resulting performance and cost gains.

FlinkLakehousePaimon
0 likes · 9 min read
Lakehouse Implementations at Leading Companies: Challenges, Solutions, and Benefits
Alimama Tech
Alimama Tech
Feb 21, 2025 · Industry Insights

How Paimon + Dolphin Transform Alibaba’s Brand Data Warehouse for Real‑Time Insights

This article analyzes the challenges of Alibaba Mama's brand advertising data warehouse built on a Lambda architecture, introduces Apache Paimon lake storage and Dolphin OLAP engine as a unified lakehouse solution, details implementation steps, performance gains, and business benefits across multiple advertising scenarios.

Big DataData WarehouseDolphin
0 likes · 15 min read
How Paimon + Dolphin Transform Alibaba’s Brand Data Warehouse for Real‑Time Insights
StarRocks
StarRocks
Feb 20, 2025 · Big Data

How RedBI Boosted Query Speed 3× with StarRocks & Iceberg Lakehouse

The article details how Xiaohongshu's RedBI self‑service analytics platform transformed its architecture by integrating StarRocks and Iceberg, replacing ClickHouse‑based storage with Parquet, introducing DataCache, Z‑Order sorting and intelligent key selection, achieving a three‑fold P90 query speed improvement, sub‑10‑second latency, and halving storage consumption.

DataCacheIcebergLakehouse
0 likes · 19 min read
How RedBI Boosted Query Speed 3× with StarRocks & Iceberg Lakehouse
DataFunSummit
DataFunSummit
Jan 14, 2025 · Big Data

Tencent Real-Time Lakehouse Intelligent Optimization Practice

This presentation details Tencent's real‑time lakehouse architecture and the four key topics—lakehouse design, intelligent optimization services, scenario‑driven capabilities, and future outlook—covering components such as Spark, Flink, Iceberg, Auto‑Optimize Service, indexing, clustering, AutoEngine, and PyIceberg implementations.

Auto OptimizeBig DataFlink
0 likes · 12 min read
Tencent Real-Time Lakehouse Intelligent Optimization Practice
Alibaba Cloud Big Data AI Platform
Alibaba Cloud Big Data AI Platform
Jan 14, 2025 · Big Data

How Fluss Unifies Lake and Stream for Real‑Time Analytics: Architecture, Benefits, and Future Roadmap

This article summarizes a talk by Alibaba Cloud senior engineer and Flink Committer Luo Yuxia on the challenges of separating lake and stream storage, introduces the Fluss lake‑stream unified architecture, explains its technical benefits such as second‑level data freshness, unified metadata, efficient changelog generation, and outlines future plans for broader ecosystem integration.

Data LakeFlinkFluss
0 likes · 13 min read
How Fluss Unifies Lake and Stream for Real‑Time Analytics: Architecture, Benefits, and Future Roadmap
DataFunSummit
DataFunSummit
Jan 3, 2025 · Big Data

Tencent Real‑Time Lakehouse Intelligent Optimization Practices

This article presents Tencent's end‑to‑end real‑time lakehouse architecture, detailing its three‑layer design, the Auto Optimize Service modules such as compaction, indexing, clustering and engine acceleration, as well as scenario‑driven capabilities like multi‑stream joins, primary‑key tables, in‑place migration and PyIceberg support, and concludes with future optimization directions.

Big DataFlinkIceberg
0 likes · 11 min read
Tencent Real‑Time Lakehouse Intelligent Optimization Practices
StarRocks
StarRocks
Jan 2, 2025 · Big Data

StarRocks Compute‑Storage Separation Cuts Costs 40% and Boosts Efficiency 20% at DMALL

DMALL upgraded its big‑data platform by adopting StarRocks 3.x with compute‑storage separation, lakehouse external tables, and Kubernetes deployment, achieving 20% higher compute utilization, 40% lower storage cost, faster cluster provisioning, and notable improvements in development and operations efficiency.

Big DataCompute-Storage SeparationKubernetes
0 likes · 25 min read
StarRocks Compute‑Storage Separation Cuts Costs 40% and Boosts Efficiency 20% at DMALL
DataFunSummit
DataFunSummit
Dec 27, 2024 · Big Data

Tencent Real-time Lakehouse Intelligent Optimization Practice

This presentation describes Tencent's real-time lakehouse architecture, including data lake compute, management, and storage layers, and details the intelligent optimization services—such as compaction, indexing, clustering, and auto-engine—designed to improve query performance, storage cost, and operational efficiency for large-scale data processing.

AutoEngineFlinkIceberg
0 likes · 11 min read
Tencent Real-time Lakehouse Intelligent Optimization Practice
StarRocks
StarRocks
Dec 2, 2024 · Big Data

How Paimon Revamps Lakehouse Management and Supercharges Queries with StarRocks

This article details Tongcheng Travel's migration from Hive/Kudu/Hudi to Paimon for lakehouse integration, highlighting a 30% resource reduction, three‑fold write speed gains, significant query acceleration via StarRocks, the end‑to‑end architecture across ODS‑DWD‑DWS‑ADS layers, and future roadmap plans.

Big DataFlinkLakehouse
0 likes · 18 min read
How Paimon Revamps Lakehouse Management and Supercharges Queries with StarRocks
DataFunSummit
DataFunSummit
Dec 2, 2024 · Big Data

Gravitino Powers TBDS Product Architecture Upgrade with a Unified Metadata Lake

This article explains how Tencent Cloud's TBDS platform evolves its architecture by adopting Apache Gravitino as a unified metadata lake, detailing the challenges of legacy versus new lakehouse designs, storage and compute separation, unified data access, permission management, and the resulting benefits for big‑data and AI workloads.

Big DataGravitinoLakehouse
0 likes · 15 min read
Gravitino Powers TBDS Product Architecture Upgrade with a Unified Metadata Lake
Big Data Technology & Architecture
Big Data Technology & Architecture
Nov 1, 2024 · Big Data

Real‑Time Lakehouse Architecture at Ximalaya Live: Leveraging Flink, Paimon, and StarRocks

This article details Ximalaya Live's transition from an offline‑centric data warehouse to a real‑time lakehouse using Flink, Paimon, and StarRocks, covering business background, architectural challenges, technology evaluation, implementation steps, encountered issues, performance gains, and future expansion plans.

FlinkLakehousePaimon
0 likes · 12 min read
Real‑Time Lakehouse Architecture at Ximalaya Live: Leveraging Flink, Paimon, and StarRocks
Alibaba Cloud Big Data AI Platform
Alibaba Cloud Big Data AI Platform
Oct 31, 2024 · Big Data

How EMR Serverless Spark Powers the Next‑Gen Lakehouse Era

This article traces the evolution of data platforms, explains the rise of lakehouse architecture, and details how Alibaba Cloud's EMR Serverless Spark delivers one‑stop development, high performance, and full ecosystem compatibility, illustrated with real‑world case studies from Midea and Eagle Network.

AIBig DataData Platform
0 likes · 16 min read
How EMR Serverless Spark Powers the Next‑Gen Lakehouse Era
Alibaba Cloud Big Data AI Platform
Alibaba Cloud Big Data AI Platform
Oct 25, 2024 · Big Data

How Real-Time Flink Powers Automotive Big Data: Architecture & Case Studies

This article, based on Alibaba Cloud expert Li Lubing’s presentation, examines the rapid growth of China’s new energy vehicle market, outlines typical automotive big‑data architectures, compares Lambda and real‑time lakehouse solutions built with Flink and Apache Paimon, and showcases real‑world customer deployments.

Big DataFlinkLakehouse
0 likes · 18 min read
How Real-Time Flink Powers Automotive Big Data: Architecture & Case Studies
Big Data Technology & Architecture
Big Data Technology & Architecture
Oct 22, 2024 · Big Data

Key Frameworks and Characteristics of Lakehouse Architecture: A Ground‑Level Perspective

This article reviews the emerging lakehouse architecture, outlines its core frameworks such as Hudi, Iceberg, Paimon, Flink, and Doris, discusses their storage‑compute separation, read‑write optimizations, and highlights how companies of different sizes adopt these technologies based on cost, efficiency, and specific business scenarios.

Data ArchitectureFlinkLakehouse
0 likes · 6 min read
Key Frameworks and Characteristics of Lakehouse Architecture: A Ground‑Level Perspective
StarRocks
StarRocks
Oct 16, 2024 · Big Data

How to Build a High‑Performance Lakehouse with StarRocks and Apache Hive

This guide walks through the core concepts of Apache Hive, its architecture and key features, then shows how to integrate Hive with StarRocks via the Hive Catalog, construct ODS/DWD/DWS/ADS tables, enable DataCache, use materialized views, and handle automatic partition detection for fast lakehouse analytics.

Apache HiveBig DataDataCache
0 likes · 17 min read
How to Build a High‑Performance Lakehouse with StarRocks and Apache Hive
DataFunTalk
DataFunTalk
Oct 3, 2024 · Big Data

Data Lake Technology Maturity Curve: Architecture, Design Principles, Core Functions, and Open‑Source Solutions

Amid growing data demands, this article explains the data lake technology maturity curve, detailing lake‑warehouse architectural patterns, design principles, core functionalities, and the four leading open‑source solutions (Hudi, Iceberg, Delta Lake, Paimon) to guide enterprises in building flexible, scalable, and governed data platforms.

Big DataData ArchitectureData Lake
0 likes · 10 min read
Data Lake Technology Maturity Curve: Architecture, Design Principles, Core Functions, and Open‑Source Solutions
DataFunTalk
DataFunTalk
Sep 24, 2024 · Big Data

Data Lake Technology Maturity Curve: Architecture Modes, Design Principles, Core Functions, and Applications

This article explains the rapid growth of data-driven businesses, the challenges of traditional data warehouses, and how modern data lake technologies such as Delta Lake, Hudi, Iceberg, and Paimon form a maturity curve that guides enterprises in architecture choices, design principles, core capabilities, and practical applications.

Big DataData LakeDelta Lake
0 likes · 12 min read
Data Lake Technology Maturity Curve: Architecture Modes, Design Principles, Core Functions, and Applications
Sohu Tech Products
Sohu Tech Products
Sep 11, 2024 · Big Data

Tencent Real-time Lakehouse Intelligent Optimization Practice

Tencent’s real‑time lakehouse combines Spark, Flink, StarRocks and Presto compute layers with Iceberg‑based management and HDFS/COS storage, and its Intelligent Optimize Service—comprising Compaction, Expiration, Cleaning, Clustering, Index and Auto‑Engine modules—automatically reduces merge time, improves query performance, enables secondary indexing, and dynamically routes hot partitions, while future plans target cold/hot separation, materialized view acceleration, and AI‑driven optimizations.

Big DataLakehousePyIceberg
0 likes · 12 min read
Tencent Real-time Lakehouse Intelligent Optimization Practice
ITPUB
ITPUB
Sep 11, 2024 · Big Data

Is Storage‑Compute Separation the Future? Unpacking the Lakehouse Debate

The article examines the concepts of storage‑compute separation and the lake‑warehouse (lakehouse) model, tracing their evolution from physical Hadoop clusters to containerized compute and object storage, and argues that true separation requires MPP systems to adopt open standards, effectively merging lake and warehouse architectures.

Big Data ArchitectureHadoopLakehouse
0 likes · 7 min read
Is Storage‑Compute Separation the Future? Unpacking the Lakehouse Debate
StarRocks
StarRocks
Sep 5, 2024 · Big Data

Accelerate Lakehouse Queries: A Hands‑On Guide to StarRocks + Apache Iceberg

This tutorial walks you through the fundamentals of Apache Iceberg, its architecture and key features, explains why it’s advantageous for lakehouse workloads, and provides a step‑by‑step Docker‑Compose setup to integrate Iceberg with StarRocks for fast, ACID‑compliant analytics on real‑world taxi data.

Apache IcebergDockerLakehouse
0 likes · 15 min read
Accelerate Lakehouse Queries: A Hands‑On Guide to StarRocks + Apache Iceberg
DataFunSummit
DataFunSummit
Aug 26, 2024 · Big Data

Building a Doris‑Based Lakehouse Integrated Analytics System at Kuaishou

This article presents Kuaishou's experience of designing and implementing a Doris‑driven lakehouse integrated analytics system, covering the current OLAP landscape, challenges of data duplication and governance, the new architecture with caching and auto‑materialization, implementation details, performance impact, and future work.

Auto MaterializationBig DataData Warehouse
0 likes · 24 min read
Building a Doris‑Based Lakehouse Integrated Analytics System at Kuaishou
StarRocks
StarRocks
Aug 14, 2024 · Big Data

Mastering StarRocks & Apache Paimon: A Fast‑Track Lakehouse Guide

This guide provides a comprehensive overview of Apache Paimon’s architecture, key features, and advantages, explains how to integrate it with StarRocks for real‑time lakehouse analytics, and walks through a complete quick‑start setup including component installation, Flink and Kafka deployment, data ingestion, table creation, and query execution with time‑travel support.

Apache PaimonFlinkKafka
0 likes · 18 min read
Mastering StarRocks & Apache Paimon: A Fast‑Track Lakehouse Guide
DataFunSummit
DataFunSummit
Aug 6, 2024 · Big Data

Implementing a Multi‑Tenant Lakehouse Data Platform for Real‑Time Analytics at a SaaS CRM Company

This article details how a SaaS CRM provider built a cloud‑native Lakehouse platform to support multi‑tenant real‑time analytics, describing data challenges, metadata‑driven architecture, virtual database design, query optimization, BI integration, AI readiness, migration steps, and the resulting performance and scalability gains.

Big DataData PlatformLakehouse
0 likes · 19 min read
Implementing a Multi‑Tenant Lakehouse Data Platform for Real‑Time Analytics at a SaaS CRM Company
Wukong Talks Architecture
Wukong Talks Architecture
Aug 6, 2024 · Databases

Migrating Tencent Music's Data Infrastructure from ClickHouse and Druid to StarRocks: Strategy, Implementation, and Best Practices

This article details how Tencent Music’s data‑infrastructure team migrated thousands of ClickHouse and Druid nodes to a StarRocks compute‑storage‑separated lakehouse, achieving 40‑50% cost reduction while maintaining query performance, and shares the technical challenges, solutions, and best‑practice recommendations gathered during the process.

ClickHouseCost reductionData Migration
0 likes · 19 min read
Migrating Tencent Music's Data Infrastructure from ClickHouse and Druid to StarRocks: Strategy, Implementation, and Best Practices
Alibaba Cloud Big Data AI Platform
Alibaba Cloud Big Data AI Platform
Jul 22, 2024 · Databases

Why StarRocks Is Redefining Fast Unified OLAP Analytics

StarRocks combines vectorized execution, a new cost‑based optimizer, materialized views, a real‑time storage engine, pipeline execution, and distributed joins to deliver a unified, high‑performance OLAP solution that supports both traditional and lakehouse analytics while reducing operational complexity.

CBOLakehouseOLAP
0 likes · 14 min read
Why StarRocks Is Redefining Fast Unified OLAP Analytics
DataFunSummit
DataFunSummit
Jul 12, 2024 · Big Data

Data Lake Development Trends, Architecture, Integration, Lakehouse Core Capabilities, and Open Design

This article examines the current evolution of data lakes, detailing their overall architecture, batch and real‑time integration methods, Lakehouse core functionalities such as enhanced DML, schema evolution, ACID support, and open‑design principles that enable multi‑cloud deployment and seamless interaction with diverse compute engines.

Batch ProcessingBig Data ArchitectureData Lake
0 likes · 12 min read
Data Lake Development Trends, Architecture, Integration, Lakehouse Core Capabilities, and Open Design
StarRocks
StarRocks
Jul 2, 2024 · Big Data

What’s New in StarRocks 3.3? Deep Dive into Lakehouse‑Optimized Performance and Features

StarRocks 3.3 introduces a comprehensive set of enhancements—including maturity levels, ARM‑optimized performance, advanced caching, materialized‑view rewrites, storage optimizations, and expanded lakehouse ecosystem support—that together boost stability, query speed, and usability for large‑scale analytics workloads.

Big DataLakehouseStarRocks
0 likes · 15 min read
What’s New in StarRocks 3.3? Deep Dive into Lakehouse‑Optimized Performance and Features
DataFunTalk
DataFunTalk
Jul 1, 2024 · Big Data

DataFunCon2024 Beijing: Real‑Time Lakehouse and Big Data Sessions

The DataFunCon2024 Beijing conference on July 5‑6 showcases a series of technical talks about real‑time lakehouse architectures, big‑data analytics, and cloud‑native data warehouses, offering practitioners insights into Apache Paimon, SelectDB, and Doris implementations for faster, more agile data processing.

Apache PaimonLakehouseSelectDB
0 likes · 8 min read
DataFunCon2024 Beijing: Real‑Time Lakehouse and Big Data Sessions
DataFunTalk
DataFunTalk
Jun 10, 2024 · Big Data

Data Lake Development Trends, Architecture, Integration, and Lakehouse Core Capabilities

This article reviews the latest developments in data lakes, including trend analysis, overall architecture, data integration methods, Lakehouse core capabilities, open design principles, stream‑batch unified processing, real‑time OLAP, and lake‑internal warehousing, highlighting how these advances reduce complexity and cost while improving data sharing and performance.

Big Data ArchitectureLakehouseReal-time OLAP
0 likes · 14 min read
Data Lake Development Trends, Architecture, Integration, and Lakehouse Core Capabilities
DataFunTalk
DataFunTalk
Jun 9, 2024 · Big Data

Optimizing ClickHouse Performance in WeChat: Observation Tools, Lakehouse Reading, Bitmap Acceleration, and AI Integration

This article details how the WeChat team leverages ClickHouse at massive scale, introduces a suite of performance observation tools, describes lakehouse reading and bitmap optimizations, and explains the integration of AI workloads, demonstrating overall query speedups of up to tenfold across diverse scenarios.

Big DataBitmapClickHouse
0 likes · 10 min read
Optimizing ClickHouse Performance in WeChat: Observation Tools, Lakehouse Reading, Bitmap Acceleration, and AI Integration
Alibaba Cloud Big Data AI Platform
Alibaba Cloud Big Data AI Platform
Jun 6, 2024 · Databases

How StarRocks Redefines Lakehouse Architecture with Ultra-Fast Unified Analytics

StarRocks combines extreme query speed and a unified architecture to deliver a lakehouse solution that separates storage and compute, supports multi‑warehouse resource isolation, offers Trino compatibility, materialized‑view acceleration, and cost‑effective scaling, making it suitable for real‑time analytics, data‑lake queries, and traditional OLAP workloads.

Big DataLakehouseReal-time analytics
0 likes · 23 min read
How StarRocks Redefines Lakehouse Architecture with Ultra-Fast Unified Analytics
StarRocks
StarRocks
May 14, 2024 · Artificial Intelligence

How Tencent Games Boosted AI‑Generated SQL Accuracy to 89% with a Lakehouse Architecture

Tencent Games tackled the low accuracy of AI‑generated SQL in production by combining large language models with a StarRocks lake‑warehouse, introducing a semantic layer, async materialized views, and an agent‑based multi‑intelligence framework, ultimately raising one‑shot SQL correctness to 89% and cutting delivery time from 2 hours to 0.33 hours.

AILLMLakehouse
0 likes · 13 min read
How Tencent Games Boosted AI‑Generated SQL Accuracy to 89% with a Lakehouse Architecture
DataFunSummit
DataFunSummit
May 12, 2024 · Big Data

Practice of Lakehouse‑Integrated Data Platform Architecture in the Financial Innovation Sector

This article presents the evolution of data platform architectures, the specific challenges of financial‑sector information‑technology innovation, and the design, core components, deployment path, and real‑world case studies of the cloud‑native lakehouse solution DataCyber developed by Shuxin Network.

Big DataData PlatformFinancial Innovation
0 likes · 21 min read
Practice of Lakehouse‑Integrated Data Platform Architecture in the Financial Innovation Sector
DataFunSummit
DataFunSummit
May 5, 2024 · Big Data

Alluxio in Lakehouse Architecture: Benefits, Challenges, and Real‑World Use Cases

This article explains how Alluxio enables a unified lake‑warehouse architecture by decoupling compute and storage, outlines its core capabilities, evaluates the cost‑saving and performance benefits, discusses the technical challenges, and presents several practical deployment scenarios in finance and AI workloads.

AlluxioBig DataData Orchestration
0 likes · 15 min read
Alluxio in Lakehouse Architecture: Benefits, Challenges, and Real‑World Use Cases
DataFunSummit
DataFunSummit
Apr 25, 2024 · Big Data

Paimon Project Overview: Recent Developments, Core Capabilities, and Future Roadmap

This article presents a comprehensive overview of the Apache‑incubated Paimon project, covering its evolution from Flink Table Store, the current features of primary‑key and log tables, management tools such as snapshots, tags and branches, performance optimizations for Flink and Spark, and a detailed roadmap of upcoming functionalities.

Big DataData ManagementFlink
0 likes · 23 min read
Paimon Project Overview: Recent Developments, Core Capabilities, and Future Roadmap
DataFunTalk
DataFunTalk
Apr 23, 2024 · Big Data

Apache Paimon Graduates to Top‑Level Project – Milestones, Core Capabilities, and Community Highlights

Apache Paimon, originally launched as Flink Table Store, has graduated to an Apache Top‑Level Project after a year of incubation, showcasing real‑time lakehouse capabilities, extensive ecosystem integration, and strong adoption by major enterprises, marking a significant milestone for streaming and batch data processing.

Apache PaimonBig DataLakehouse
0 likes · 9 min read
Apache Paimon Graduates to Top‑Level Project – Milestones, Core Capabilities, and Community Highlights
DataFunTalk
DataFunTalk
Apr 20, 2024 · Big Data

Tencent Video Metrics Middle Platform and Lakehouse Integration: Architecture, Governance, and Practices

This article details Tencent Video’s data business, describing the design and implementation of its metrics middle platform and lake‑warehouse integration, covering architecture, governance, consistency, timeliness, usability, cost optimization, and future plans, with insights into technology choices such as Iceberg, StarRocks, and MQL.

Big DataData GovernanceLakehouse
0 likes · 18 min read
Tencent Video Metrics Middle Platform and Lakehouse Integration: Architecture, Governance, and Practices