Big Data Technology Architecture
Author

Exploring Open Source Big Data and AI Technologies

290 articles · 0 likes · 602 views · 0 comments
Recent Articles

Big Data Technology Architecture
Jul 8, 2025 · Big Data

Why Fluss Is the Next Big Leap in Real‑Time Stream Storage

Fluss, an open‑source next‑generation stream storage engine donated by Alibaba, has entered the Apache Software Foundation Incubator. It offers columnar streaming, real‑time updates, and unified stream and lakehouse storage, and the article highlights its performance metrics and a growing global developer community.

Apache Incubator · Flink Integration · Fluss
0 likes · 7 min read
Big Data Technology Architecture
Mar 25, 2025 · Big Data

Kafka 4.0 Release: KRaft Architecture, Consumer Group Optimizations, and New Queue Features

Kafka 4.0 is a milestone release that removes ZooKeeper in favor of KRaft, Kafka's built‑in Raft‑based metadata quorum, improving scalability and performance. It also introduces a server‑side consumer‑group protocol, adds queue‑style consumption through share groups, and updates the Java requirements and documentation.

Distributed Streaming · Java11 · KRaft
0 likes · 6 min read
Big Data Technology Architecture
Mar 8, 2025 · Artificial Intelligence

Understanding General, Intelligent, and Super Computing: Concepts, Processor Types, and Application Scenarios

This article explains the three main types of computing power—general (通算), intelligent (智算), and supercomputing (超算)—detailing their definitions, typical processor architectures, and real‑world application scenarios across everyday office tasks, AI workloads, and large‑scale scientific research.

AI computing · Intelligent Computing · Supercomputing
0 likes · 8 min read
Big Data Technology Architecture
Mar 1, 2025 · Big Data

Core Principles and Practical Guide to Flink CDC

This article explains CDC fundamentals, details Flink CDC's architecture and advantages, provides setup steps, code examples for SQL and DataStream APIs, discusses performance tuning, consistency, common issues, and typical real‑time data integration scenarios.

CDC · Change Data Capture · Debezium
0 likes · 7 min read
Big Data Technology Architecture
Feb 8, 2025 · Big Data

How AI Can Accelerate Data Engineering: Practical DeepSeek Use Cases and Tips

This article shows how AI tools like DeepSeek can dramatically speed up data‑engineering tasks—such as fixing long‑running SQL queries, building real‑time data pipelines with Flink, and deciphering legacy stored procedures—while offering concrete prompts, real‑world case studies, and five time‑saving techniques.

Automation · DeepSeek · SQL Optimization
0 likes · 6 min read
Big Data Technology Architecture
Feb 7, 2025 · Artificial Intelligence

How to Build a DeepSeek AI Assistant on DingTalk

This guide explains why DeepSeek is a valuable AI assistant, outlines the challenges of high demand, and provides step‑by‑step instructions for creating, configuring, testing, and publishing a DeepSeek AI assistant within the DingTalk platform to ensure stable access.

AI Assistant · DeepSeek · DingTalk
0 likes · 4 min read
Big Data Technology Architecture
Nov 29, 2023 · Big Data

Building Real-Time Wide Tables with Partial-Update Using Apache Paimon for NetEase News Recommendation

The article describes how NetEase News' recommendation team replaced a slow, batch‑oriented data‑warehouse pipeline with a Flink‑based, Apache Paimon real‑time wide‑table solution that supports partial updates, reduces latency from hours to minutes, and lowers processing costs while handling both deduplication and non‑deduplication recommendation scenarios.

Apache Paimon · Flink · Partial Update
0 likes · 8 min read
Big Data Technology Architecture
Nov 14, 2023 · Big Data

Open Source Big Data Platform 3.0: Streaming Lakehouse, Serverless Architecture, and AI Integration

The talk outlines the evolution of Alibaba Cloud's open‑source big data platform from Hadoop‑based EMR to a 3.0 architecture featuring a streaming lakehouse, full serverless compute and storage, AI‑driven operations, and upcoming vector search services, highlighting technical motivations, challenges, and product releases.

Streaming · big data · cloud-native
0 likes · 14 min read