Design and Practice of Using HBase for Massive TMP Monitoring Data Storage
This article analyzes the limitations of the original TMP monitoring storage architecture, evaluates OpenTSDB's shortcomings at large scale, and details the design, implementation, and performance tuning of a custom HBase‑based solution that achieves 3‑5× higher write throughput for the more than a trillion monitoring data points collected each day.
Background
In recent years, open‑source big‑data processing systems have matured, providing off‑the‑shelf solutions for many scenarios, much as MySQL did for relational storage. The company operates hundreds of thousands of servers, and the TMP system collects over 1.2 trillion monitoring data points daily at one‑minute granularity. This article examines the problems of the existing storage architecture and describes the evolution from an OpenTSDB trial to a custom HBase storage solution for massive TMP monitoring data.
Current TMP Storage Architecture Analysis
TMP data is reported by agents, routed by a collector using MySQL‑based index and routing rules, and sent to data nodes (Datacache). Data nodes buffer data in memory and periodically dump it to the file system. The design is simple and supports latest‑data caching, distributed storage, and horizontal scaling, but it suffers from several issues:
Cache process failures cause in‑memory data loss, requiring recovery from agents or peer clusters.
Disk or machine failures lead to loss of persisted files, needing manual cluster switch‑over.
The fixed data format and preallocated space prevent finer monitoring granularity; empty data points still occupy storage, and compression is not supported.
Metadata such as index and routing rules depend on external DBs, making system availability vulnerable.
Advantages of the HBase Storage Engine
HBase, a distributed column‑oriented store in the Hadoop ecosystem based on the Bigtable model and an LSM‑Tree engine, is widely used for large‑scale time‑series data. Its benefits include high reliability and availability (writes are logged to the HLog on HDFS, which keeps three replicas by default), high write performance thanks to the LSM‑Tree design, natural horizontal scalability, and a sparse columnar layout in which empty columns occupy no storage at all.
OpenTSDB Attempt and Bottleneck Analysis
Initially, the team tried OpenTSDB, an HBase‑based open‑source time‑series database. However, once write throughput reached roughly 700k ops/s, the HBase cluster became overloaded and unresponsive. The main bottlenecks identified were:
Metric and tag translation via the tsdb‑uid table introduced heavy CPU overhead.
Append‑only writes and the original compaction design caused severe performance degradation.
All data stored in a single table hindered time‑based maintenance, region control, and hotspot avoidance.
Consequently, the decision was made to use HBase directly.
TMP Monitoring Storage Design Practice
The design combines proven HBase practices with improvements inspired by OpenTSDB.
Region Pre‑splitting
To avoid hotspot regions after table creation, each daily table is pre‑split into 100 regions using 99 one‑byte split keys ranging from 0x01 to 0x63. Together with rowkey salting, this distributes writes evenly across RegionServers.
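The split‑key generation above can be sketched in a few lines of plain Java; 99 one‑byte split points partition the keyspace into 100 regions, matching the 0x01–0x63 range described in the article. The class and method names here are illustrative, and the resulting array is what would be passed to the HBase `Admin.createTable(tableDescriptor, splitKeys)` call.

```java
// Sketch of the pre-split scheme described above: 99 one-byte split keys
// (0x01 .. 0x63) divide the salted keyspace into 100 regions.
public class PreSplit {
    // Builds the split-key array intended for Admin.createTable(desc, splits).
    static byte[][] buildSplitKeys() {
        byte[][] splits = new byte[99][];
        for (int i = 0; i < 99; i++) {
            splits[i] = new byte[] { (byte) (i + 1) }; // keys 0x01 .. 0x63
        }
        return splits;
    }

    public static void main(String[] args) {
        byte[][] splits = buildSplitKeys();
        System.out.println(splits.length + " split points -> "
                + (splits.length + 1) + " regions");
    }
}
```

Note that n split points always yield n + 1 regions, which is why 99 keys produce the 100 regions mentioned above.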
Rowkey and Column Design
Rowkey format: salt(1 byte) + serverID(4 bytes) + timestamp(4 bytes) + metricID(4 bytes). The salt, derived from a hash of the server ID, spreads rows across the pre‑split regions. Placing the server ID early in the key makes per‑server metric queries efficient. The timestamp acts as a time‑base, and the metric ID identifies the specific monitoring indicator. A single column family with a one‑byte name is used to minimize Memstore overhead. Column qualifiers store time offsets, which combine with the time‑base to form each data point's precise timestamp.
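A minimal sketch of this 13‑byte rowkey layout, assuming the salt is the server‑ID hash taken modulo the 100 salt buckets (the article does not spell out the exact hash, so that detail is an assumption):

```java
import java.nio.ByteBuffer;

// Sketch of the rowkey layout described above:
// salt(1B) + serverID(4B) + timestamp(4B, time-base) + metricID(4B) = 13 bytes.
public class RowKey {
    // Assumption: salt buckets match the 100 pre-split regions.
    static final int SALT_BUCKETS = 100;

    static byte[] build(int serverId, int timeBase, int metricId) {
        // Assumption: salt = hash(serverID) mod bucket count.
        byte salt = (byte) Math.floorMod(Integer.hashCode(serverId), SALT_BUCKETS);
        return ByteBuffer.allocate(13)
                .put(salt)
                .putInt(serverId)
                .putInt(timeBase)   // timestamp rounded down to the time-base boundary
                .putInt(metricId)
                .array();
    }
}
```

Because the salt is a deterministic function of the server ID, all rows for one server land in a predictable region, so per‑server scans stay cheap while writes remain spread across the cluster.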
Column‑based Compaction
In HBase, every cell physically stores its full rowkey and column family, so a row with many columns repeats that prefix many times. Inspired by OpenTSDB, columns sharing the same time‑base are merged into a single column, drastically reducing this redundancy. A daily full‑table scan performs the merge, achieving up to 90% space reduction.
HBase Performance Tuning
Key tuning points include:
Increasing the RegionServer heap and adjusting the Memstore/BlockCache ratio (e.g., 0.5 of heap for Memstore and 0.3 for BlockCache) to favor write‑heavy workloads.
Enabling Snappy compression to reduce disk I/O.
Raising the number of compaction threads (e.g., setting hbase.regionserver.thread.compaction.small and hbase.regionserver.thread.compaction.large to 5) to better utilize CPU.
Optimizing GC parameters to avoid stop‑the‑world pauses (refer to HBase CMS GC best practices).
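The tuning knobs above map to an hbase-site.xml fragment like the following; the values are the article's examples, not universal recommendations, and Snappy compression is set per column family at table creation rather than in this file.

```xml
<!-- Sketch of the write-heavy tuning described above (example values). -->
<property>
  <name>hbase.regionserver.global.memstore.size</name>
  <value>0.5</value> <!-- 50% of heap for Memstores, favoring writes -->
</property>
<property>
  <name>hfile.block.cache.size</name>
  <value>0.3</value> <!-- 30% of heap for the read BlockCache -->
</property>
<property>
  <name>hbase.regionserver.thread.compaction.small</name>
  <value>5</value>
</property>
<property>
  <name>hbase.regionserver.thread.compaction.large</name>
  <value>5</value>
</property>
```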
Summary
After applying the above design and optimizations, the HBase‑based TMP monitoring storage achieves 3‑5× higher performance than the OpenTSDB approach, with peak write rates of 4 million ops/s across eight RegionServers. The system is now in production, and future work includes adding a buffering layer for pre‑compaction to further boost performance.