Tagged articles
47 articles
21CTO
Apr 20, 2025 · Databases

Choosing the Right Database: B‑Tree vs LSM‑Tree – A Story‑Driven Deep Dive

This article walks you through the inner workings of modern databases—explaining storage engines, query parsing, execution, and transaction models—while comparing classic B‑Tree structures with newer LSM‑Tree designs to help you decide whether SQL or NoSQL best fits your performance and consistency needs.

B+Tree · LSM‑Tree · Storage Engines
0 likes · 8 min read
Big Data Technology & Architecture
Jan 2, 2025 · Big Data

Apache Paimon: Core Capabilities, Table Types, LSM Tree, Buckets, Merge Engines, and Operational Details

This article provides a comprehensive overview of Apache Paimon, covering its real‑time lake ingestion, unified stream‑batch processing, table types (primary‑key and append‑only), LSM‑tree storage, bucket mechanisms, merge‑engine options, compaction strategies, concurrency control, consumption methods, tag management, data cleanup, and system tables for big‑data workloads.

Apache Paimon · Big Data · Flink
0 likes · 25 min read
G7 EasyFlow Tech Circle
Sep 24, 2024 · Databases

Why Modern Databases Prefer LSM Trees Over B‑Trees: Hardware, Workloads, and More

Modern databases have largely shifted from B‑tree based storage to LSM‑tree engines because of SSD hardware characteristics, high‑write workloads, concurrency advantages, simpler implementation, and evolving application demands. The article also touches on Paxos/Raft consensus, common database jargon, and performance optimizations.

Database Jargon · Database Storage · LSM‑Tree
0 likes · 11 min read
Big Data Technology & Architecture
Jul 25, 2024 · Big Data

Fundamental Concepts and File Layout of Paimon: Snapshots, Partitions, Buckets, Consistency, and Compaction

This article explains Paimon's core concepts—including snapshots, partitions, buckets, consistency guarantees, file layout, LSM‑tree organization, and compaction strategies—while also covering table management tasks such as snapshot expiration, rollback, partition expiration, and small‑file mitigation techniques.

Big Data · Buckets · LSM‑Tree
0 likes · 12 min read
DataFunTalk
Apr 25, 2024 · Big Data

Apache Hudi 1.0: Design Reconsiderations and Key New Features

This article provides a comprehensive overview of Apache Hudi 1.0, detailing its architectural redesign, five major development directions, and the most important new capabilities such as LSM‑tree timeline, function indexes, file‑group readers/writers, partial updates, and non‑blocking concurrency control, along with performance evaluations and resource links.

Apache Hudi · Big Data · Function Index
0 likes · 14 min read
Huolala Tech
Dec 27, 2023 · Big Data

How HBase Compaction Tuning Boosts Performance at Scale

This article explains LSM‑Tree based HBase compaction concepts, compares Minor and Major compactions, and shares practical tuning steps—including disabling automatic major compactions, controlling merge size, leveraging off‑peak windows, and improving merge efficiency—to reduce I/O, CPU usage, and latency in production environments.

Big Data · Database Optimization · HBase
0 likes · 11 min read
AntTech
Dec 15, 2023 · Databases

CStore: A Native Graph Storage Engine for Large-Scale Graph Analysis

CStore is a native graph storage engine, written in Rust, for large‑scale graph analysis. It supports petabyte‑level data with array‑plus‑linked‑list storage, multi‑level indexing, and efficient compaction; the article also provides detailed build instructions and a roadmap for open‑source development.

CStore · LSM‑Tree · Rust
0 likes · 12 min read
Sohu Tech Products
Dec 13, 2023 · Databases

Fundamentals of RocksDB and Its Application in Vivo Message Push System

The article explains RocksDB’s LSM‑based architecture, column‑family isolation, and snapshot features, and shows how Vivo’s VPUSH mapping service uses these capabilities to store billions of registerId‑to‑ClientId mappings with high‑concurrency, low‑cost, fault‑tolerant performance across multiple replicated servers.

Column Family · LSM‑Tree · Message Push
0 likes · 24 min read
Big Data Technology & Architecture
Dec 8, 2023 · Big Data

Comprehensive Guide to Apache Paimon and Advanced Flink Integration

This article provides an in‑depth overview of Apache Paimon as a streaming lakehouse, explains its core features, file layout, consistency guarantees, and offers detailed guidance on integrating and tuning Paimon with Apache Flink for both write and read performance, multi‑writer concurrency, table management, and bucket rescaling.

Apache Paimon · Big Data · Data Lake
0 likes · 23 min read
vivo Internet Technology
Dec 6, 2023 · Databases

RocksDB Fundamentals and Its Application in Vivo Message Push System

The article explains RocksDB’s LSM‑based architecture, column‑family isolation, and snapshot features, and shows how Vivo’s VPUSH MappingTransformServer uses these capabilities with C++ code to store billions of registerId‑to‑ClientId mappings across multiple replicated servers for high‑concurrency, low‑latency, and fast service expansion.

Column Family · LSM‑Tree · Message Push
0 likes · 25 min read
Aikesheng Open Source Community
May 15, 2023 · Databases

Performance Degradation After Data Updates in OceanBase and Its Optimization Techniques

The article investigates why pure‑read QPS drops significantly after bulk updates in OceanBase, reproduces the issue with a sysbench workload, analyses flame‑graph and SQL audit data, explains the LSM‑Tree read‑amplification mechanism, and proposes practical mitigation steps such as major freeze, plan binding, index creation, and the queuing‑table feature.

LSM‑Tree · Major Freeze · OceanBase
0 likes · 16 min read
Aikesheng Open Source Community
Mar 9, 2023 · Databases

In‑Depth Exploration of OceanBase Hierarchical Dump and Compaction Mechanisms

This article explains the LSM‑Tree foundation of OceanBase, details its tiered and leveled compaction strategies, and presents two experiments that observe Mini and Minor compactions under different configuration parameters, revealing how minor freeze and trigger settings affect data movement between L0 and L1 layers.

Database Storage · LSM‑Tree · Mini Compaction
0 likes · 13 min read
dbaplus Community
Oct 29, 2022 · Databases

B‑Tree vs LSM‑Tree: Which Storage Engine Fits Your Database Workload?

This article examines the fundamental differences between B‑Tree and LSM‑Tree storage structures in distributed and relational databases, detailing their write and read paths, performance trade‑offs, update handling, lock conflicts, and high‑availability considerations to help engineers choose the right engine for their workloads.

B-Tree · Database Storage · LSM‑Tree
0 likes · 25 min read
DaTaobao Tech
Oct 19, 2022 · Databases

Overview of LSM‑Tree Architecture and Its Use in Modern Databases

An LSM‑Tree buffers writes in an in‑memory MemTable, then flushes them to disk as sorted SSTables. Bloom filters and indexes speed up reads, and periodic compactions merge files. Modern systems such as LevelDB, HBase, and ClickHouse adopt this design to achieve high write throughput, at the cost of slower point and range queries and occasional compaction overhead.

ClickHouse · HBase · LSM‑Tree
0 likes · 11 min read
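The write and read paths summarized in this entry can be illustrated with a toy model. The sketch below is not taken from the article or from any real engine: the class name `TinyLSM`, the `memtable_limit` parameter, and the use of plain Python lists as stand-ins for on-disk SSTables are all assumptions made for illustration, and Bloom filters are left out.

```python
import bisect


class TinyLSM:
    """Toy LSM-tree: a dict MemTable plus a list of sorted, immutable runs."""

    def __init__(self, memtable_limit=2):
        self.memtable = {}                    # newest writes, held in memory
        self.sstables = []                    # sorted runs, oldest first
        self.memtable_limit = memtable_limit  # hypothetical flush threshold

    def put(self, key, value):
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        # Write the MemTable out as one sorted run (a stand-in for an SSTable).
        if self.memtable:
            self.sstables.append(sorted(self.memtable.items()))
            self.memtable = {}

    def get(self, key):
        if key in self.memtable:
            return self.memtable[key]
        for run in reversed(self.sstables):   # newest run first
            i = bisect.bisect_left(run, (key,))
            if i < len(run) and run[i][0] == key:
                return run[i][1]
        return None

    def compact(self):
        # Merge all runs into one; later (newer) runs overwrite older values.
        merged = {}
        for run in self.sstables:
            merged.update(run)
        self.sstables = [sorted(merged.items())] if merged else []
```

A read may have to probe several runs before it finds a key, which is the read cost the summary mentions; calling `compact()` collapses the runs so later reads probe fewer of them.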
DataFunTalk
Oct 19, 2022 · Big Data

Understanding Flink Table Store: Design, Usage, and Roadmap

This article introduces Flink Table Store, an Apache Flink subproject that provides a unified stream‑batch storage layer with SQL‑based table APIs for both real‑time and offline data needs, and covers its design goals, usage patterns, architectural layers, implementation choices, and upcoming roadmap.

Flink · LSM‑Tree · Streaming
0 likes · 14 min read
Big Data Technology & Architecture
May 12, 2022 · Databases

Understanding B+ Trees and Log‑Structured Merge (LSM) Trees and Their Use in HBase

This article explains the fundamentals of B+ trees, introduces log‑structured merge (LSM) trees as a modern alternative for write‑intensive workloads, and demonstrates how HBase leverages LSM trees—including MemStore, HFile, compaction, and Bloom filters—to achieve efficient storage and retrieval in NoSQL environments.

B+Tree · HBase · LSM‑Tree
0 likes · 7 min read
NetEase Cloud Music Tech Team
Mar 16, 2022 · Databases

RDB: Cloud Music's Customized Algorithm Feature KV Storage System Based on RocksDB

To meet Cloud Music’s massive algorithm‑feature KV storage needs, the team built RDB—a RocksDB‑based engine within Tair—adding bulk‑load, dual‑version imports, KV‑separation, in‑place sequence appends and protobuf field updates, cutting storage cost, write amplification and latency while scaling to billions of records and millions of QPS.

Algorithm Features · KV Separation · KV storage
0 likes · 16 min read
IT Xianyu
Oct 14, 2021 · Databases

Comparing MySQL and HBase: Architecture, Engine, and Application Scenarios

This article compares MySQL and HBase by examining their architectural designs, storage engines, data access patterns, and ecosystem features, highlighting the strengths and trade‑offs of each system and outlining the scenarios where HBase is a suitable complement to MySQL.

B+Tree · Big Data · HBase
0 likes · 5 min read
Qingyun Technology Community
Sep 14, 2021 · Fundamentals

How KVSSD Integrates LSM Trees and Flash Translation to Slash Write Amplification

This article reviews the KVSSD paper presented at DATE 2018, explaining how close integration of LSM trees with the flash translation layer reduces write amplification, outlines the design optimizations such as K2P mapping, remapping compaction, hot‑cold separation, and discusses performance results and industry progress.

KVSSD · LSM‑Tree · NVMe
0 likes · 15 min read
DataFunTalk
Jul 20, 2021 · Databases

Time‑Series Database Series: Trends, Design Principles, and Comparative Analysis of OpenTSDB, InfluxDB, and Apache IoTDB

This article explores the evolution and current landscape of time‑series databases, detailing design principles, storage structures such as B‑Tree, B+Tree, and LSM‑Tree, and providing an in‑depth comparison of OpenTSDB, InfluxDB, and the emerging Apache IoTDB, while also discussing practical deployment considerations and industry use cases.

Apache IoTDB · B+Tree · InfluxDB
0 likes · 38 min read
DataFunTalk
Apr 19, 2021 · Databases

Current Trends, Core Technologies, and Challenges of Time Series Databases

This article reviews the rapid growth of global data, examines the evolving landscape and classification of time‑series databases, analyzes storage engine designs such as B‑Tree versus LSM‑Tree, discusses query optimization and real‑time analytics, and outlines practical application scenarios in IoT and industrial settings.

IoT · LSM‑Tree · Time Series Database
0 likes · 19 min read
NetEase Media Technology Team
Dec 23, 2020 · Databases

Practical Experience of MyRocks in NetEase Media Business

Since 2019 NetEase Media has migrated several recommendation and account services from RDS to MyRocks, cutting disk usage by up to 68 % and halving response times while handling 40‑50 k QPS write‑heavy workloads, though the engine lacks partitioning, online DDL, and certain index types, requiring careful workload assessment.

Database Optimization · LSM‑Tree · MyRocks
0 likes · 12 min read
Tencent Cloud Developer
Dec 10, 2020 · Databases

Understanding LevelDB Architecture, Read/Write Flow, and Compaction Process

LevelDB buffers writes in an in‑memory Memtable, which is frozen into an immutable Memtable and then flushed to disk‑based SSTables. Writes are logged, then batched and applied through a writer queue; reads check the Memtable, then the immutable Memtable, then the SSTables; and background compactions merge tables to improve read performance and reclaim space.

Database Internals · LSM‑Tree · LevelDB
0 likes · 16 min read
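The read order described in this entry (Memtable, then immutable Memtable, then SSTables) can be sketched in a few lines. This is an illustration only, not LevelDB's API: the function `lookup`, the `TOMBSTONE` marker for deleted keys, and the use of plain dicts for every layer are assumptions introduced here.

```python
TOMBSTONE = object()  # hypothetical marker recording that a key was deleted


def lookup(key, memtable, immutable_memtable, sstables):
    """Return the newest value for key, or None if it is absent or deleted.

    Layers are searched from newest to oldest; sstables is ordered newest
    first, and every layer is modelled as a plain dict for simplicity.
    """
    for layer in (memtable, immutable_memtable, *sstables):
        if key in layer:
            value = layer[key]
            # The first hit wins, so a newer tombstone hides older values.
            return None if value is TOMBSTONE else value
    return None
```

Stopping at the first layer that contains the key is what lets newer writes and deletions shadow older data until compaction physically removes it.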
Alibaba Cloud Developer
Mar 27, 2020 · Databases

How OceanBase Delivers Cloud‑Native Distributed Relational Database Performance and Availability

This article explains OceanBase's public‑cloud deployment, its unique architecture without a central controller, horizontal scaling via partition groups, LSM‑Tree storage design, advanced SQL engine features, ACID‑plus‑Availability guarantees, and real‑world performance records, illustrating why it suits high‑availability financial workloads.

Cloud Native · LSM‑Tree · OceanBase
0 likes · 12 min read
AntTech
Oct 7, 2019 · Databases

OceanBase Storage Architecture and Optimizations for TPC‑C Benchmark

This article explains how OceanBase’s distributed, shared‑nothing architecture, with dual data replicas, Paxos‑based consistency, online compression, and resource‑isolated compaction, enables it to achieve top TPC‑C performance while addressing storage costs and CPU usage.

LSM‑Tree · OceanBase · Online Compression
0 likes · 10 min read
Big Data Technology Architecture
May 8, 2019 · Databases

Understanding HBase Scan Process and Its Performance Compared to Parquet and Kudu

The article explains why HBase read operations are complex due to its LSM‑Tree storage and multi‑version design, details the step‑by‑step Scan workflow, discusses the reasons for its multi‑request architecture, compares scan performance with Parquet and Kudu, and offers recommendations for large‑scale data scanning.

HBase · LSM‑Tree · SCAN
0 likes · 7 min read
Qunar Tech Salon
Apr 18, 2018 · Databases

FPGA-Accelerated X-Engine Storage Engine for High‑Performance OLTP

This article presents the design, implementation, and evaluation of X‑Engine, a next‑generation LSM‑Tree based storage engine that offloads compaction to FPGA, achieving up to 50% KV‑interface and 40% SQL‑interface performance gains for write‑intensive OLTP workloads.

FPGA · LSM‑Tree · Storage Engine
0 likes · 19 min read
Alibaba Cloud Developer
Apr 9, 2018 · Databases

How FPGA Acceleration Supercharges X-Engine’s Compaction for 10× MySQL Performance

This article introduces Alibaba’s X‑Engine storage engine, the foundation of the next‑generation distributed database X‑DB, and explains how FPGA‑accelerated compaction and asynchronous scheduling dramatically improve write‑intensive OLTP performance, reduce CPU contention, and achieve up to 50 % throughput gains while maintaining fault tolerance.

FPGA · Hardware acceleration · LSM‑Tree
0 likes · 21 min read
Architecture Digest
Mar 22, 2018 · Fundamentals

Recent Trends and Hot Topics in Storage Technologies: Open‑Channel SSD, NVM, Learned Indexes, LSM‑Tree Optimizations, and Crash Consistency

This article surveys current storage research and industry developments, covering Open‑Channel SSD architecture, non‑volatile memory programming models, machine‑learning‑driven learned indexes, LSM‑Tree performance improvements, crash‑consistency verification, and recent virtualization and container‑storage advances.

LSM‑Tree · Learned Index · NVM
0 likes · 18 min read
Taobao Frontend Technology
Jul 6, 2017 · Databases

Understanding LevelDB: Architecture, Interfaces, and New Features

LevelDB, Google's high‑performance key‑value store built on LSM trees, uses an in‑memory skip list, immutable memtables, and SSTable files organized into levels and merged by compaction. The article covers its interfaces for creation, reads, writes, and snapshots, along with new features such as fuzzy search and JSON storage, all explained with diagrams.

Database Architecture · LSM‑Tree · LevelDB
0 likes · 11 min read