Tagged articles

Compaction

69 articles · Page 1 of 1

Jun 21, 2026 · Databases

Why Deleting 1 Million Vectors in Milvus Doesn't Shrink Disk Space: A Deep Dive into 11 CompactionTypes

When Milvus appears to keep disk usage unchanged after deleting a million vectors, the cause is not a bug but a sophisticated compaction system that splits the single compact() API into eleven enum values, six independent policies, and seven special handling paths that together manage different kinds of data waste and ensure safe, incremental reclamation.

Clustering · Compaction · DataCoord

0 likes · 23 min read

AI Engineer Programming

Jun 3, 2026 · Artificial Intelligence

Production-Grade Agent Memory: Compaction, Decay, and the Observation Engine

The article presents a comprehensive architecture for production‑grade autonomous agents, detailing failure modes, four distinct memory types, a nightly observation engine that turns patterns into procedural rules, tier‑aware decay scoring, context budgeting, GDPR‑compliant deletion, and a step‑by‑step maintenance pipeline.

Agent Memory · Compaction · GDPR compliance

0 likes · 31 min read

Big Data Technology & Architecture

May 26, 2026 · Big Data

Advanced Paimon Production Issues: 10 Rare Compaction‑Related Problems and Fixes

This article enumerates ten uncommon, compaction‑related problems encountered in large‑scale Paimon deployments, explains their root causes—such as RPC timeouts, snapshot expiration, file corruption, and write conflicts—and provides concrete configuration tweaks and operational steps to resolve each issue.

Big Data · Compaction · Flink

0 likes · 9 min read

Shi's AI Notebook

May 18, 2026 · Artificial Intelligence

Anthropic’s Practical Approach to Context Engineering for AI Agents

The article explains how Anthropic engineers treat the limited token budget of large language models as a finite resource, detailing static configuration, runtime retrieval, and long‑task strategies such as compaction, structured notes, and sub‑agent architectures to build reliable, efficient AI agents.

AI Agents · Anthropic · Compaction

0 likes · 18 min read

James' Growth Diary

May 3, 2026 · Artificial Intelligence

How Claude Code Handles max_output_tokens and Model Downgrade to Keep Agents Running

The article explains Claude Code's multi‑level fault‑tolerance for max_output_tokens errors, detailing dynamic token allocation, automatic model downgrade, environment‑variable controls, StopFailure hooks, and their coordination with compaction to prevent agents from getting stuck during long‑running tasks.

AI Agent · Claude Code · Compaction

0 likes · 13 min read

AI Tech Publishing

Apr 27, 2026 · Artificial Intelligence

Context Window Strategies in Agent Harnesses: Pi, OpenClaw, Claude Code, Letta, Alyx

The article analyzes how five Agent Harness frameworks—Pi, OpenClaw, Claude Code, Letta, and Alyx—handle context windows, file pagination, tool result limits, session pruning, and sub‑agent isolation, revealing convergent design patterns that treat the context as a managed memory system.

Agent Harness · Compaction · Context Management

0 likes · 21 min read

ByteDance SE Lab

Apr 17, 2026 · Industry Insights

How DisCoGC Cuts Storage Costs by 20%: A Deep Dive into ByteStore’s New GC Paradigm

This article analyzes the DisCoGC algorithm introduced by ByteDance, explaining how its discard‑centric garbage collection eliminates the write‑amplification vs. space‑amplification trade‑off in log‑structured storage, details the engineering challenges of multi‑layer deployment, and presents production results showing up to 20% TCO reduction without impacting latency.

Compaction · Distributed storage · Garbage Collection

0 likes · 19 min read

Architect

Apr 16, 2026 · Artificial Intelligence

Mastering Claude Code: Session Management Strategies for 1M Context Windows

This article analyzes Anthropic's Claude Code session‑management features, explaining how context rot limits effective token usage, what the 1 M‑token window actually stores, and when to use the five built‑in actions—Continue, /rewind, /clear, Compact and Subagent—to keep long‑running AI tasks reliable and efficient.

AI Agents · Claude Code · Compaction

0 likes · 18 min read

AI Tech Publishing

Mar 18, 2026 · Artificial Intelligence

How Context Engineering Turns AI Agents from ‘Usable’ to ‘Highly Effective’

The article explains how organizing the prompt, tool schemas, dialogue history, and retrieved documents—collectively the context window—affects an AI agent’s decisions, introduces the concepts of Lost‑in‑the‑Middle, Thinking Tokens, tool‑response caching, compaction versus SubAgent strategies, and shows a step‑by‑step evolution that raised accuracy from 60 % to over 95 %.

AI Agents · Compaction · LLM

0 likes · 9 min read

Tencent Cloud Developer

Mar 4, 2026 · Artificial Intelligence

How OpenClaw Uses a Multi‑Layer Defense System to Prevent LLM Context Overflow

The article provides a detailed technical walkthrough of OpenClaw's three‑stage context‑management framework—including pre‑emptive pruning, LLM‑driven compaction, and overflow‑recovery truncation—showing how each layer protects long‑running AI agent sessions from exceeding token windows while preserving essential information.

Cache Optimization · Compaction · Context Management

0 likes · 27 min read

Architect

Feb 27, 2026 · Artificial Intelligence

Turning AI Agents into Deliverable Workflows: Skills, Shell, and Compaction Explained

The article explains why writing code alone does not guarantee delivery, outlines three core challenges for long‑running agents—process reuse, execution, and context continuity—and presents a practical framework of Skills, Shell, and Compaction together with ten actionable recommendations, security guidelines, and implementation steps for teams.

AI Agents · Compaction · Skills

0 likes · 18 min read

High Availability Architecture

Feb 13, 2026 · Artificial Intelligence

Why OpenAI’s Skills, Shell, and Compaction Are Redefining AI Agent Engineering

The article explains OpenAI’s new agent primitives—Skills, a hosted Shell environment, and server‑side Compaction—detailing how they enable long‑running, reliable AI agents, provides practical design patterns and tips, and compares this approach with the open‑source OpenClaw framework.

AI · Agents · Compaction

0 likes · 17 min read

StarRocks

Dec 11, 2025 · Databases

How StarRocks Redesigns Bulk Import to Cut Small Files and Boost Throughput

This article explains how StarRocks mitigates the hidden risks of massive one‑time data imports in a storage‑compute separated architecture by redesigning the write path to spill to local disk, merge centrally, and write to object storage, resulting in fewer small files, higher write throughput, and more stable query performance.

Bulk Import · Compaction · Data Engineering

0 likes · 12 min read

Full-Stack DevOps & Kubernetes

May 28, 2025 · Operations

How to Fix etcd “NOSPACE” Errors in Kubernetes Clusters

When a Kubernetes cluster’s etcd reaches its default 2 GB quota, it triggers a “NOSPACE” alarm that blocks all write operations, causing critical services to fail; this guide explains the root cause, how to diagnose the issue with etcdctl, and step‑by‑step remediation including compaction, defragmentation, and quota expansion.

Compaction · Etcd · NOSPACE

0 likes · 7 min read

DeWu Technology

Mar 3, 2025 · Databases

Implementing an LSM‑Tree in Zig: Core Components, Write/Read Logic, and Compaction

The article walks through a complete Zig implementation of an LSM‑Tree, detailing its in‑memory skip‑list MemTable, immutable SSTable blocks with compression and Bloom filters, write‑ahead logging, iterator hierarchy for reads, and multi‑level compaction logic that merges and rewrites SSTables.

Compaction · Iterators · LSM‑Tree

0 likes · 42 min read

DataFunSummit

Dec 27, 2024 · Big Data

Tencent Real-time Lakehouse Intelligent Optimization Practice

This presentation describes Tencent's real-time lakehouse architecture, including data lake compute, management, and storage layers, and details the intelligent optimization services—such as compaction, indexing, clustering, and auto-engine—designed to improve query performance, storage cost, and operational efficiency for large-scale data processing.

AutoEngine · Compaction · Flink

0 likes · 11 min read

Tencent Advertising Technology

Dec 6, 2024 · Big Data

Building a High‑Performance Advertising Feature Data Lake with Apache Iceberg at Tencent

Tencent's advertising team replaced a traditional HDFS‑Hive warehouse with an Apache Iceberg‑based data lake, adding primary‑key tables, multi‑stream merging, adaptive compaction, and Spark SPJ optimizations to achieve minute‑level feature update latency, 10× back‑fill speed, and up to 60% storage savings.

Big Data · CDC · Compaction

0 likes · 25 min read

Big Data Technology & Architecture

Oct 31, 2024 · Big Data

Understanding Paimon's Changelog Producer: Four Modes and Their Trade‑offs

The article explains Paimon's changelog‑producer capability, detailing its purpose, storage format, and the four generation modes—None, Input, Lookup, and Full Compaction—while comparing their costs, implementation details, and suitability for different data sources such as CDC.

@Lookup · Big Data · Compaction

0 likes · 16 min read

Big Data Technology & Architecture

Oct 28, 2024 · Big Data

Key Considerations for Using Paimon Primary Key Tables

This article explains the characteristics of Paimon primary key tables, covering bucket selection, cross‑partition update issues, recommended record‑level expiration settings, and two approaches to handle file compaction, including configuration tweaks and dedicated compaction tasks.

Big Data · Bucket · Compaction

0 likes · 6 min read

Linux Kernel Journey

Oct 17, 2024 · Fundamentals

Inside Linux Memory Compaction: A Source‑Code Walkthrough of Memory Management

The article explains how Linux manages memory page watermarks, when the allocator falls back to kswapd, and the exact conditions that trigger direct compaction via __alloc_pages_direct_compact, then walks through the core compaction functions—try_to_compact_pages, compact_zone_order, compact_zone, and the page‑migration helpers—illustrated with flow diagrams and real kernel code.

C# · Compaction · Linux

0 likes · 37 min read

Aikesheng Open Source Community

Oct 15, 2024 · Databases

Troubleshooting Compaction Stuck Issue in OceanBase: Diagnosis and Resolution

This article details a step‑by‑step investigation of a compaction‑stuck problem in OceanBase, covering background, environment setup, view and log analysis, root‑cause identification related to clock drift, and the corrective actions taken to restore normal merging.

Compaction · OceanBase · SQL

0 likes · 13 min read

Linux Kernel Journey

Sep 12, 2024 · Fundamentals

Understanding Linux Memory Allocation: Fast Path vs. Slow Path in the Source Code

This article dissects the Linux kernel's page allocation mechanisms, explaining how alloc_pages() follows a fast‑path using low watermarks and falls back to a slow‑path that triggers kswapd, direct reclaim, and compaction, while also detailing the corresponding page‑freeing functions and their internal data structures.

Compaction · Linux kernel · buddy allocator

0 likes · 30 min read

Sohu Tech Products

Sep 11, 2024 · Big Data

Tencent Real-time Lakehouse Intelligent Optimization Practice

Tencent’s real‑time lakehouse combines Spark, Flink, StarRocks and Presto compute layers with Iceberg‑based management and HDFS/COS storage, and its Intelligent Optimize Service—comprising Compaction, Expiration, Cleaning, Clustering, Index and Auto‑Engine modules—automatically reduces merge time, improves query performance, enables secondary indexing, and dynamically routes hot partitions, while future plans target cold/hot separation, materialized view acceleration, and AI‑driven optimizations.

Big Data · Clustering · Compaction

0 likes · 12 min read

DataFunSummit

Aug 19, 2024 · Big Data

Apache Hudi from Zero to One: Introduction to Table Services – Compaction, Cleaning, and Indexing (Part 5)

This article introduces Apache Hudi's table services, explaining the concepts, execution modes, and detailed workflows of compaction, cleaning, and indexing, and how they optimize storage layout and read/write performance in large‑scale data lake environments.

Apache Hudi · Big Data · Cleaning

0 likes · 8 min read

Big Data Technology & Architecture

Aug 12, 2024 · Big Data

Best Practices for Apache Doris Compaction in Production Environments

This article outlines practical production‑level optimizations for Apache Doris compaction, covering vertical, segment, and single‑replica compaction methods, compaction policies, concurrency controls, and data‑ingestion tuning to improve import speed and query performance in OLAP workloads.

Apache Doris · Big Data · Compaction

0 likes · 9 min read

Big Data Technology & Architecture

Jul 26, 2024 · Databases

Apache Doris Architecture and Common Q&A: Read/Write Flow, Replication Consistency, Storage, and High Availability

This article provides a comprehensive overview of Apache Doris, explaining its frontend and backend nodes, storage structures such as tablets, rowsets, and segments, replication mechanisms, partitioning versus bucketing, indexing types, compaction processes, and high‑availability strategies through a detailed Q&A format.

Apache Doris · Big Data · Compaction

0 likes · 22 min read

Big Data Technology & Architecture

Jul 25, 2024 · Big Data

Fundamental Concepts and File Layout of Paimon: Snapshots, Partitions, Buckets, Consistency, and Compaction

This article explains Paimon's core concepts—including snapshots, partitions, buckets, consistency guarantees, file layout, LSM‑tree organization, and compaction strategies—while also covering table management tasks such as snapshot expiration, rollback, partition expiration, and small‑file mitigation techniques.

Big Data · Buckets · Compaction

0 likes · 12 min read

Top Architecture Tech Stack

Jul 16, 2024 · Databases

Understanding LSM-Tree Architecture and Its Applications in Big Data Systems

The article explains the Log-Structured Merge-Tree (LSM) architecture, its core components, advantages and disadvantages, and demonstrates how it is employed in big‑data platforms such as HBase and Apache Druid to achieve high‑throughput writes and scalable query processing.

Compaction · Databases · LSM‑Tree

0 likes · 7 min read

StarRocks

Jun 18, 2024 · Databases

How StarRocks Compaction Boosts Query Performance: Mechanics, Tuning, and Best Practices

This article explains StarRocks' compaction process that merges multiple data versions into larger files to reduce I/O, details the scheduler and executor roles, shows how to monitor and control compaction via SQL commands, and provides tuning parameters and best‑practice recommendations for optimal performance.

Compaction · Data Management · Performance Tuning

0 likes · 21 min read

Big Data Technology & Architecture

Feb 18, 2024 · Big Data

Understanding Apache Paimon Table Modes and Their Use Cases

Apache Paimon provides multiple table modes—including primary key tables with fixed or dynamic buckets, Append scalable and queue tables—each with specific configurations, compaction behavior, and suitable scenarios, and the article explains their structures, performance considerations, and how to use them with Flink.

Apache Paimon · Append Table · Big Data

0 likes · 12 min read

Cognitive Technology Team

Jan 21, 2024 · Databases

Understanding LSM-Tree (Log-Structured Merge Tree) and Its Storage Mechanisms

This article explains the Log-Structured Merge Tree (LSM-Tree) architecture, describing its immutable storage design, the roles of WAL, MemTable, ImmuTable, and SSTable, and detailing the write workflow, compaction process, and the associated read, space, and write amplification challenges.

Compaction · Databases · LSM‑Tree

0 likes · 7 min read

Huolala Tech

Dec 27, 2023 · Big Data

How HBase Compaction Tuning Boosts Performance at Scale

This article explains LSM‑Tree based HBase compaction concepts, compares Minor and Major compactions, and shares practical tuning steps—including disabling automatic major compactions, controlling merge size, leveraging off‑peak windows, and improving merge efficiency—to reduce I/O, CPU usage, and latency in production environments.

Big Data · Compaction · HBase

0 likes · 11 min read

Deepin Linux

Dec 9, 2023 · Fundamentals

Linux Page Reclaim Mechanism and Memory Compaction: Detailed Source Code Analysis

This article explains the Linux page‑reclaim mechanism, its goals, common techniques, the allocation paths, LRU data structures, and provides an in‑depth walkthrough of the kernel source code for slow‑path reclaim, direct reclaim, and memory compaction, including all relevant functions and code snippets.

Compaction · Linux · page reclaim

0 likes · 80 min read

Alibaba Cloud Native

Dec 6, 2023 · Cloud Native

How RocketMQ Implements Random Indexing for Cloud‑Native Storage

This article explains RocketMQ's random indexing mechanism, detailing its on‑disk three‑segment hash table structure, the compact format conversion process, multi‑threaded write and query workflows, layered system design, crash‑recovery strategy, and comparisons with RocksDB and InnoDB storage engines.

Compaction · Message Indexing · RocketMQ

0 likes · 16 min read

Aikesheng Open Source Community

Nov 8, 2023 · Databases

Analyzing OceanBase Freeze Dump Process via Log Parsing

This article explains how to parse OceanBase logs to trace the tenant freeze dump workflow, detailing the roles and log sequences of the freeze check thread, LSFreeze, Flush, DagScheduler, and MiniMerge threads, and illustrating each step with actual log excerpts and code snippets.

Compaction · DAG · Freeze Process

0 likes · 16 min read

DataFunTalk

Aug 30, 2023 · Big Data

Design and Implementation of Baidu Cloud Block Storage EC System for Large‑Scale Data

This article presents Baidu Cloud's block storage architecture, comparing replication and erasure‑coding fault‑tolerance methods, detailing the challenges of applying EC to mutable block data, and describing a two‑layer append‑engine solution with selective 3‑replica caching, cost‑benefit compaction, and performance optimizations for low‑cost, high‑throughput storage.

Big Data · Compaction · append engine

0 likes · 14 min read

Sohu Tech Products

Aug 16, 2023 · Big Data

Understanding HBase Compaction: Principles, Process, Throttling Strategies and Real‑World Optimizations

This article explains HBase’s LSM‑Tree compaction fundamentals—including minor and major compaction triggers, file‑selection policies, dynamic throughput throttling, and practical tuning examples that show how adjusting size limits, thread pools, and off‑peak settings can dramatically improve read latency and cluster stability.

Big Data · Compaction · HBase

0 likes · 35 min read

vivo Internet Technology

Jul 26, 2023 · Big Data

Understanding HBase Compaction: Principles, Process, Throttling Strategies, and Optimization Cases

Understanding HBase compaction involves knowing its minor and major merge types, trigger mechanisms, file‑selection policies such as RatioBased and Exploring, throttling controls based on file count, and practical tuning of key parameters to avoid latency spikes, as illustrated by real‑world production cases.

Big Data · Compaction · HBase

0 likes · 36 min read

Huolala Tech

May 25, 2023 · Big Data

How Huolala Solved HBase Bulkload Challenges: A Practical Guide

This article details Huolala’s experience building a unified Hive‑to‑HBase pipeline, addressing low development efficiency, lack of monitoring, and HBase instability by evaluating two architectures, implementing a generic Transform tool, optimizing compaction and DistCp, and establishing stability and data‑validation mechanisms.

Compaction · Distcp · HBase

0 likes · 12 min read

Aikesheng Open Source Community

Mar 9, 2023 · Databases

In‑Depth Exploration of OceanBase Hierarchical Dump and Compaction Mechanisms

This article explains the LSM‑Tree foundation of OceanBase, details its tiered and leveled compaction strategies, and presents two experiments that observe Mini and Minor compactions under different configuration parameters, revealing how minor freeze and trigger settings affect data movement between L0 and L1 layers.

Compaction · Database Storage · LSM‑Tree

0 likes · 13 min read

DataFunSummit

Feb 28, 2023 · Big Data

Iceberg Technology Overview and Its Application at Xiaomi: Practices, Stream‑Batch Integration, and Future Plans

This article introduces the Iceberg table format, explains its core architecture and advantages such as transactionality, implicit partitioning and row‑level updates, details Xiaomi's practical deployments—including CDC pipelines, partition strategies, compaction services, and stream‑batch integration—and outlines future development directions.

Compaction · Data Lake · Flink

0 likes · 20 min read

Tencent Cloud Middleware

Sep 8, 2022 · Operations

Understanding Apache BookKeeper GC: Mechanisms, Triggers, and Code Walkthrough

This article explains how Apache BookKeeper's garbage collection works, detailing minor and major GC triggers, compression strategies, entry log size management, usage calculations, and provides a step‑by‑step code analysis of the GarbageCollectorThread implementation.

Apache BookKeeper · Compaction · EntryLog

0 likes · 19 min read

Big Data Technology Architecture

Aug 13, 2022 · Big Data

Apache Doris at Xiaomi: Architecture Evolution, Performance Optimizations, and Production Practices

This article details Xiaomi's three‑year journey of adopting Apache Doris across dozens of internal services, describing the transition from a Spark‑SQL‑based Lambda architecture to a unified MPP database, performance benchmarks, data ingestion pipelines, compaction tuning, two‑phase commit, single‑replica writes, monitoring, and community contributions.

Apache Doris · Compaction · Data Warehouse

0 likes · 19 min read

Big Data Technology & Architecture

Jun 29, 2022 · Databases

Understanding Doris Compaction Mechanism and Optimization Strategies

This article explains Doris's compaction mechanism, covering its producer‑consumer architecture, tablet scoring, permission control, cumulative and base compaction processes, parameter tuning, monitoring metrics, and manual compaction commands to help optimize performance and resource usage.

Compaction · Doris · bigdata

0 likes · 38 min read

StarRocks

May 12, 2022 · Databases

How StarRocks’ Primary Key Model Delivers 3‑5× Faster Real‑Time Queries

This article explains the design and implementation of StarRocks 2.x Primary Key tables, covering real‑time update mechanisms, write and commit workflows, in‑memory primary indexing, compaction, read‑path optimizations, performance benchmarks, and upcoming features such as partial and conditional updates.

Compaction · Indexing · OLAP

0 likes · 19 min read

NetEase Cloud Music Tech Team

Mar 16, 2022 · Databases

RDB: Cloud Music's Customized Algorithm Feature KV Storage System Based on RocksDB

To meet Cloud Music’s massive algorithm‑feature KV storage needs, the team built RDB—a RocksDB‑based engine within Tair—adding bulk‑load, dual‑version imports, KV‑separation, in‑place sequence appends and protobuf field updates, cutting storage cost, write amplification and latency while scaling to billions of records and millions of QPS.

Algorithm Features · Compaction · KV Separation

0 likes · 16 min read

DataFunTalk

Feb 25, 2022 · Big Data

Tencent's Application of Apache Iceberg for Real‑Time Data Lake Ingestion, Governance, and Query Optimization

This article explains how Tencent leverages Apache Iceberg together with Flink to build a real‑time data lake pipeline, covering data ingestion, Iceberg's snapshot‑based read/write model, compaction and governance services, Z‑order based query optimization, performance results, and future roadmap.

Apache Iceberg · Big Data · Compaction

0 likes · 24 min read

NiuNiu MaTe

Dec 29, 2021 · Databases

Why LSM Tree Powers Modern Key‑Value Stores: Design, Write Path & Compaction

This article explains the fundamentals of Log‑Structured Merge (LSM) Trees, covering their write‑ahead log, memtable, SSTable architecture, compaction processes, read/write optimizations, and popular open‑source implementations such as LevelDB, RocksDB, and GoLevelDB.

Compaction · Key-Value Store · LSM‑Tree

0 likes · 15 min read

Tencent Cloud Developer

Dec 10, 2020 · Databases

Understanding LevelDB Architecture, Read/Write Flow, and Compaction Process

LevelDB stores data using an in‑memory Memtable that flushes to immutable tables and disk‑based SSTables. Writes are logged, then batched and applied through a writer queue; reads check the Memtable, then the immutable Memtable, then the SSTables; and background compactions merge tables to improve read performance and reclaim space.

Compaction · Database Internals · LSM‑Tree

0 likes · 16 min read

360 Zhihui Cloud Developer

Nov 4, 2020 · Databases

How to Safely Drop Massive Data in TiDB Without Causing Write Stall

This article explains why dropping large amounts of data in a TiDB cluster can trigger compaction flow‑control, leading to write stalls and QPS jitter, and provides step‑by‑step troubleshooting, configuration tweaks, and best‑practice recommendations to resolve the issue.

Compaction · Configuration · Region Merge

0 likes · 20 min read

Big Data Technology & Architecture

Aug 27, 2020 · Big Data

HBase Architecture, Components, and Operations Overview

This article provides a comprehensive overview of Apache HBase’s architecture, detailing its core components such as RegionServer, HMaster, ZooKeeper, WAL, MemStore, and HFiles, and explains key processes including read/write paths, compaction, region splitting, load balancing, and recovery mechanisms.

Big Data · Compaction · Database Architecture

0 likes · 17 min read

Big Data Technology & Architecture

Jun 10, 2020 · Databases

Understanding HBase Compaction: Types, Triggers, Algorithms, and Impact on Read/Write Performance

This article explains HBase compaction—a key operation in the Log‑Structured Merge‑Tree model—covering minor and major compaction differences, trigger conditions, configuration parameters, selection algorithms, thread‑pool handling, and the effects on read and write performance in a big‑data database environment.

Compaction · HBase · LSM

0 likes · 10 min read

Big Data Technology Architecture

Jun 2, 2020 · Databases

JVM Tuning, Region Split, BlockCache, and Compaction Strategies for HBase

This article explains how to configure JVM memory, choose appropriate garbage‑collector settings, tune HBase region split policies, optimize BlockCache implementations, and select suitable compaction strategies to improve HBase performance on clusters of various sizes.

BlockCache · Compaction · Database Performance

0 likes · 20 min read

Big Data Technology Architecture

May 22, 2020 · Databases

HBase Compaction Types and Parameter Tuning Guide

This article explains how HBase uses WAL and MemStore to create HFiles, describes the two compaction types (Minor and Major), and provides detailed recommendations for tuning key compaction-related configuration parameters to improve query performance and reduce HDFS impact.

Compaction · Databases · HBase

0 likes · 4 min read

Tencent Database Technology

Apr 24, 2020 · Databases

Log-Structured Merge Trees: Overview, History, Modern Design, Optimizations, and Concurrency Control

This article explains the principles, evolution, modern structures, compaction strategies, optimization techniques such as Bloom filters and partitioning, and concurrency and recovery mechanisms of Log-Structured Merge (LSM) trees, which are widely used in contemporary NoSQL storage systems.

Compaction · LSM‑Tree · NoSQL

0 likes · 12 min read

Big Data Technology & Architecture

Mar 30, 2020 · Databases

HBase Optimization: JVM Tuning, Region Split Policies, BlockCache, and Compaction Strategies

This guide explains how to optimize HBase performance by adjusting JVM memory settings, selecting appropriate garbage collectors, configuring MSLAB and in‑memory compaction, choosing region split policies, tuning BlockCache implementations, and applying suitable compaction policies for different workloads.

Big Data · BlockCache · Compaction

0 likes · 18 min read

DataFunTalk

Mar 24, 2020 · Databases

ByteDance’s Enhancements to RocksDB: LazyBuffer, Adaptive Map, KV Separation, Multi‑Index, Extreme Compression, and New Hardware Support

This article describes ByteDance’s extensive improvements to the RocksDB storage engine—including LazyBuffer, Adaptive Map‑based lazy compaction, KV separation, adaptive multi‑index support, extreme compression techniques, and hardware acceleration—to reduce amplification, improve performance, and lower costs for large‑scale database workloads.

Compaction · Indexing · KV Separation

0 likes · 14 min read

Big Data Technology Architecture

Mar 2, 2020 · Databases

Understanding HBase Flush and Compaction Mechanisms and Their Configuration Parameters

This article explains the core mechanisms of HBase—Flush and Compaction—detailing why they are needed, the conditions that trigger Flush, the types and triggers of Compaction, and provides practical recommendations for tuning the most important configuration parameters to improve write and read performance.

Compaction · Configuration · Flush

0 likes · 11 min read

dbaplus Community

Feb 4, 2020 · Databases

Understanding Cassandra’s Row‑Oriented Storage, Write Path, and Consistency

This article explains Cassandra’s row‑oriented storage model, the multi‑step write and read processes, how tombstones and compaction manage data growth, and the impact of its distributed architecture on high availability, fault tolerance, and configurable consistency levels.

Cassandra · Compaction · Consistency Levels

0 likes · 25 min read

Big Data Technology & Architecture

Jan 7, 2020 · Big Data

Why Small Files Are a Problem in Big Data and How Delta Lake Compaction Solves It

This article examines the root causes and performance impact of massive small-file proliferation in traditional data warehouses, explains why HDFS metadata limits scalability, and details how Delta Lake’s custom compaction process can safely merge these files for append-only tables without disrupting reads or writes.

Compaction · Delta Lake · HDFS

0 likes · 5 min read

58 Tech

Dec 2, 2019 · Databases

Optimizing RocksDB Compaction Rate Limiting to Reduce IO Spikes in WTable

This article analyzes RocksDB's compaction rate‑limiting source code and presents practical tuning methods—both fixed and auto‑tuned—to mitigate IO spikes in the distributed KV store WTable, improving real‑time read/write latency and stability.

Compaction · IO optimization · RocksDB

0 likes · 7 min read

Big Data Technology Architecture

Aug 16, 2019 · Big Data

In‑Depth Overview of HBase Architecture

This article provides a comprehensive, illustrated explanation of Apache HBase's architecture, covering its master‑slave components, region management, Zookeeper coordination, data flow for reads and writes, storage structures, compaction processes, fault recovery, and the system's strengths and limitations within the Hadoop ecosystem.

Compaction · HBase · Hadoop

0 likes · 21 min read

Alibaba Cloud Developer

Jun 24, 2019 · Databases

How X‑Engine Redefines LSM Storage for High‑Performance E‑Commerce

This article provides an in‑depth technical overview of Alibaba's X‑Engine storage engine, explaining its LSM‑based architecture, compaction optimizations, transaction pipeline, caching strategies, and how these innovations enable low‑cost, high‑throughput OLTP for large‑scale e‑commerce workloads.

Compaction · LSM · PolarDB-X

0 likes · 18 min read

Big Data Technology Architecture

May 27, 2019 · Databases

Understanding HBase Compaction: Types, Triggers, Parameters, and Performance Impact

This article explains HBase's compaction mechanism, covering why it is needed, the differences between minor and major compaction, the conditions that trigger compaction, key configuration parameters, thread‑pool handling, compaction policies, and how compaction influences read and write performance in a large‑scale NoSQL database.

Compaction · Databases · HBase

0 likes · 12 min read

Big Data Technology Architecture

May 21, 2019 · Databases

Postmortem Analysis of a 10‑Node HBase Cluster Outage and Mitigation Measures

This article presents a detailed post‑mortem of a 10‑node HBase cluster failure caused by excessive region count and memstore pressure, analyzes HDFS and datanode log errors, and outlines configuration adjustments and operational recommendations that restored the service and prevented future outages.

Cluster Outage · Compaction · HBase

0 likes · 16 min read

Qunar Tech Salon

Apr 18, 2018 · Databases

FPGA-Accelerated X-Engine Storage Engine for High‑Performance OLTP

This article presents the design, implementation, and evaluation of X‑Engine, a next‑generation LSM‑Tree based storage engine that offloads compaction to FPGA, achieving up to 50% KV‑interface and 40% SQL‑interface performance gains for write‑intensive OLTP workloads.

Compaction · FPGA · LSM‑Tree

0 likes · 19 min read

Alibaba Cloud Developer

Apr 9, 2018 · Databases

How FPGA Acceleration Supercharges X-Engine’s Compaction for 10× MySQL Performance

This article introduces Alibaba’s X‑Engine storage engine, the foundation of the next‑generation distributed database X‑DB, and explains how FPGA‑accelerated compaction and asynchronous scheduling dramatically improve write‑intensive OLTP performance, reduce CPU contention, and achieve up to 50% throughput gains while maintaining fault tolerance.

Compaction · FPGA · LSM‑Tree

0 likes · 21 min read

Taobao Frontend Technology

Jul 6, 2017 · Databases

Understanding LevelDB: Architecture, Interfaces, and New Features

LevelDB, Google's high-performance key‑value store built on LSM trees, uses an in‑memory skip‑list, immutable memtables, and sstable files organized in multi‑level compaction, offering interfaces for creation, reads, writes, snapshots, and new features like fuzzy search and JSON storage, all explained with diagrams.

Compaction · Database Architecture · Key-Value Store

0 likes · 11 min read

Architect

May 30, 2016 · Backend Development

Backend Log Management Threads, Log Cleaning, and Compaction in Distributed Kafka Systems

This article explains how Kafka's LogManager loads existing logs and manages background threads for flushing, checkpointing, cleaning, and compaction, and details the code implementations and strategies for log retention, segment cleanup, and log compaction in a distributed storage environment.

Backend Development · Compaction · Log Cleaning

0 likes · 15 min read
