HBase Optimization Practice in Vivo's Unified Content Platform
vivo's unified content platform replaced its unwieldy 60 TB MongoDB store with HBase, then upgraded the cluster from 1.2 to 2.4.8, introduced table-specific connection pools, column-only reads, throttled compaction, and multi-version cells. Together these changes cut response times from seconds to under ten milliseconds, boosted read/write performance, and dramatically lowered operational costs.
This article introduces HBase optimization practices implemented in vivo's unified content platform, which handles core functions including content review, content understanding, content creation, and content distribution.
Business Background: As a content middleware platform, it stores massive amounts of text, image, and video content daily, alongside derived data such as classification labels and review information. Read and write traffic is heavy, serving the video business and pan-information flow services.
Problems with the Previous MongoDB Solution: Core data exceeded 20 TB, with total storage beyond 60 TB, and MongoDB's storage architecture could not meet scalability requirements. High query traffic from smart push, pan-information flow, and video recommendation systems demanded consistently high performance. Routine maintenance required switching MongoDB primary-replica nodes and rebuilding instances, driving up operational costs.
HBase Selection Reasons: HBase is a wide-column Key/Value store with millisecond-level read/write performance. Built on Hadoop's HDFS, it offers high scalability and fault tolerance through HDFS block replication. Each region is served by a single RegionServer, giving strongly consistent reads and writes, and the write-ahead log (WAL) guarantees durability. Additionally, HBase keeps multiple versions per cell, allowing flexible version control.
Optimization Practices:
4.1 Cluster Upgrade: Upgraded from HBase 1.2 to 2.4.8 to resolve issues such as frequent Region-In-Transition (RIT) problems, request latency spikes, slow table creation/deletion, meta table instability, and slow node failure recovery. After the upgrade, response times that had occasionally spiked past 10 s settled consistently below 10 ms.
4.2 Connection Pool and Pre-warming: Created connection pools for different tables using Apache Commons Pool's GenericObjectPool. This provides resource isolation between tables, connection reuse to reduce network overhead, and traffic smoothing to handle sudden spikes. Implemented pre-loading during application startup to avoid performance degradation from mass connection creation.
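The per-table pooling and pre-warming idea can be sketched with the JDK alone. This is a simplified stand-in for the article's GenericObjectPool setup, and `TableClient` is a hypothetical placeholder for a real HBase table connection, not an actual HBase class:

```java
import java.util.Map;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// One bounded pool per table: tables are isolated from each other,
// connections are reused, and the bound smooths sudden traffic spikes.
public class TablePools {
    private final Map<String, BlockingQueue<TableClient>> pools = new ConcurrentHashMap<>();

    // Pre-warm: create every connection for a table at application startup,
    // so the first burst of traffic does not pay connection-setup cost.
    public void preload(String table, int size, Supplier<TableClient> factory) {
        BlockingQueue<TableClient> q = new ArrayBlockingQueue<>(size);
        for (int i = 0; i < size; i++) q.offer(factory.get());
        pools.put(table, q);
    }

    // Returns null when the pool is exhausted instead of opening
    // unbounded new connections.
    public TableClient borrow(String table) {
        return pools.get(table).poll();
    }

    public void release(String table, TableClient c) {
        pools.get(table).offer(c);
    }

    public int available(String table) {
        return pools.get(table).size();
    }

    // Hypothetical stand-in for a real table connection object.
    public static class TableClient {}
}
```

In production, GenericObjectPool adds what this sketch omits: eviction of idle connections, validation on borrow, and configurable blocking behavior.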
4.3 Column-based Reading: Used HBase's Get class methods (addFamily, addColumn, setTimeRange, setMaxVersions) to fetch only the required columns instead of every field. For tables with hundreds of fields or large vector fields, this cut more than half of the unnecessary fields from each response and improved average response time.
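The payoff can be shown with a stdlib-only sketch. The Map-based "row" here is a hypothetical stand-in for an HBase Result; in the real client the projection is pushed down to the server via Get.addColumn, so the dropped fields never cross the network at all:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Project only the requested qualifiers out of a wide row,
// instead of returning every field (including large vectors).
public class ColumnProjection {
    public static Map<String, String> project(Map<String, String> row, String... qualifiers) {
        Map<String, String> out = new LinkedHashMap<>();
        for (String q : qualifiers) {
            String v = row.get(q);
            if (v != null) out.put(q, v);
        }
        return out;
    }
}
```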
4.4 Compact Optimization: Configured compaction throughput parameters (hbase.hstore.compaction.throughput.higher.bound and lower.bound) to throttle compaction operations. Major compactions are only executed during off-peak hours. This reduced compaction duration by over 70% while maintaining read performance.
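A hypothetical hbase-site.xml fragment shows the shape of such a configuration. The byte-per-second values are illustrative, not the article's actual settings; one common way to keep major compactions to off-peak hours is to disable the periodic trigger and schedule major_compact externally:

```xml
<!-- Illustrative values; tune to the cluster's actual I/O capacity. -->
<property>
  <name>hbase.hstore.compaction.throughput.higher.bound</name>
  <value>104857600</value> <!-- cap: 100 MB/s under compaction pressure -->
</property>
<property>
  <name>hbase.hstore.compaction.throughput.lower.bound</name>
  <value>52428800</value>  <!-- floor: 50 MB/s when pressure is low -->
</property>
<property>
  <name>hbase.hregion.majorcompaction</name>
  <value>0</value>         <!-- disable periodic major compaction; trigger off-peak instead -->
</property>
```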
4.5 Field-level Version Management: Explored HBase's multi-version capability to store multiple versions of cell values. Can configure version retention by count or time dimension. Useful for scenarios requiring temporal data retrieval and can ensure consumption order in asynchronous update scenarios via message timestamps as version numbers.
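The version semantics can be sketched with a TreeMap keyed by timestamp. This simulates HBase's per-cell version map rather than using the client API; in the real client the message timestamp would be supplied explicitly when writing (e.g. via the timestamp argument of Put's addColumn), so a stale message can never overwrite a newer version:

```java
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

// One cell holding up to maxVersions values keyed by timestamp;
// a read returns the value with the highest timestamp, mirroring
// HBase's multi-version behavior with count-based retention.
public class VersionedCell {
    private final NavigableMap<Long, String> versions = new TreeMap<>();
    private final int maxVersions;

    public VersionedCell(int maxVersions) { this.maxVersions = maxVersions; }

    public void put(long timestamp, String value) {
        versions.put(timestamp, value);
        while (versions.size() > maxVersions) versions.pollFirstEntry(); // evict oldest
    }

    public String latest() {
        Map.Entry<Long, String> e = versions.lastEntry();
        return e == null ? null : e.getValue();
    }

    public int versionCount() { return versions.size(); }
}
```

Because the read always resolves to the highest timestamp, asynchronous consumers can apply update messages in any arrival order and still converge on the newest value.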
Results: After optimization, both read and write performance significantly improved, ensuring business stability while greatly reducing operational costs.
vivo Internet Technology
Sharing practical vivo Internet technology insights and salon events, plus the latest industry news and hot conferences.