Big Data Technology & Architecture

Wang Zhiwu is a big data expert dedicated to sharing big data technology.

Recent Articles

Aug 20, 2024 · Big Data

Practical Insights on Using Apache Paimon for Real-World Data Lake Scenarios

This article shares a personal, experience‑driven overview of Apache Paimon, highlighting its design simplicity, key capabilities such as schema evolution, stream‑batch unified processing, primary‑key support, and closed‑loop data handling, while discussing when its features are appropriate for production environments.

Apache Paimon · Batch Processing · Streaming
Aug 5, 2024 · Big Data

Key Features of Apache Flink 1.20: Materialized Tables, DISTRIBUTED BY, and State/Checkpoint Optimizations

The article reviews Apache Flink 1.20, highlighting the new Materialized Table concept, the DISTRIBUTED BY support for load‑balanced storage and join performance, and state/checkpoint file merging improvements, while providing code examples and practical insights for users.

Apache Flink · Checkpoint Optimization · Distributed By
Jul 26, 2024 · Databases

Apache Doris Architecture and Common Q&A: Read/Write Flow, Replication Consistency, Storage, and High Availability

This article provides a comprehensive overview of Apache Doris, explaining its frontend and backend nodes, storage structures such as tablets, rowsets, and segments, replication mechanisms, partitioning versus bucketing, indexing types, compaction processes, and high‑availability strategies through a detailed Q&A format.

Apache Doris · Database Architecture · Storage Engine
Jul 25, 2024 · Big Data

Fundamental Concepts and File Layout of Paimon: Snapshots, Partitions, Buckets, Consistency, and Compaction

This article explains Paimon's core concepts—including snapshots, partitions, buckets, consistency guarantees, file layout, LSM‑tree organization, and compaction strategies—while also covering table management tasks such as snapshot expiration, rollback, partition expiration, and small‑file mitigation techniques.

Buckets · LSM‑Tree · Paimon
Jul 3, 2024 · Databases

Optimizing High-Concurrency Point Queries in Doris with Row Store, Short Query Path, and PreparedStatement

This guide explains how to enable row store, configure the short query path, and use PreparedStatement in Doris to reduce I/O and CPU overhead for high‑concurrency primary‑key point queries, including DDL examples, JDBC usage, row cache settings, performance tips, and verification methods.

PreparedStatement · Row Store · SQL
Jul 1, 2024 · Big Data

Applying Data Lake (Hudi) at Kuaishou: Architecture Evolution, Use Cases, and Practice

This article details Kuaishou's journey of adopting the Hudi data lake, covering business challenges, migration from Hive to Hudi, architectural redesign, promotion strategy, real‑world use cases such as CDC sync and batch‑stream integration, and key lessons learned for large‑scale data engineering.

Big Data Architecture · Data Warehouse · Hudi
Jun 24, 2024 · Big Data

How to Address Data Inconsistency and Validation Challenges Between Data and Algorithm Teams

This article discusses practical strategies for data and algorithm teams to handle real‑time data inconsistencies, validation difficulties, and communication gaps by emphasizing clear scope definition, realistic technical assessments, proactive risk identification, and the importance of specialized, well‑qualified talent.

algorithm collaboration · real-time data