How Vivo Scaled Hive Metastore Using TiDB: A Deep Dive into Big Data Metadata
This article recounts Vivo’s journey to horizontally scale its Hive Metastore service by evaluating MySQL sharding, the open‑source Waggle‑Dance gateway, and ultimately selecting TiDB, detailing the migration process, configuration tweaks, performance benchmarks, encountered issues such as primary‑key conflicts, index choices, memory spikes, and the solutions implemented to ensure stable, high‑performance metadata storage for massive data volumes.
Background
Hive Metastore Service (HMS) stores all metadata required by data‑warehouse engines such as Hive, Spark, and Presto. Vivo’s offline data‑warehouse runs on CDH 5.14.4, using HMS version 1.1.0‑cdh5.14.4 backed by MySQL. Rapid business growth pushed metadata volume to billions of rows (e.g., PARTITION_PARAMS 810 M, PARTITION_KEY_VALS 350 M, PARTITIONS 140 M), causing severe performance bottlenecks and time‑outs during high‑concurrency queries.
Horizontal Scaling Options
The team investigated two main directions: (1) sharding the existing MySQL backend, and (2) replacing MySQL with a high‑performance distributed engine. For MySQL sharding, the open‑source Waggle‑Dance gateway (originating from Hotels.com) provides a cross‑cluster Metastore proxy that aggregates multiple MySQL instances. For the distributed engine route, TiDB—an open‑source HTAP database from PingCAP—was evaluated and tested against Hive 2.x.
Waggle‑Dance
Waggle‑Dance acts as a unified Metastore entry point, routing client Thrift requests to underlying Metastore services while hiding the physical distribution. Its key advantages include:
Clients can keep using the standard Metastore API and switch between multiple Waggle‑Dance instances when one becomes unavailable.
Stateless design allows rapid scaling: new instances can be launched in seconds to absorb peak load.
Extensible gateway features such as authentication and firewall filtering can be added as needed.
TiDB
TiDB is a distributed relational database that supports HTAP workloads, horizontal scaling, high availability, and MySQL 5.7 compatibility. TiDB 4.x offers significant performance and stability improvements that meet HMS query‑latency requirements. The proposed TiDB‑based architecture replaces MySQL entirely while keeping HMS unchanged.
Advantages of TiDB over Waggle‑Dance include:
No changes required on the HMS side; only the underlying storage engine is swapped.
Data integrity is preserved without needing to split data across multiple instances.
No additional supporting services are required beyond TiDB itself.
Comparison of TiDB and Waggle‑Dance
A multi‑dimensional comparison (performance, scalability, operational complexity, hardware cost, etc.) showed TiDB to be superior, leading to its selection for production rollout.
TiDB Migration Procedure
Because TiDB cannot operate in a dual‑master configuration with MySQL, the HMS service must be fully stopped before switching the storage connection string to TiDB. To enable a safe rollback, a data‑sync pipeline was built to keep MySQL and TiDB consistent before the cut‑over.
During the switch, only the primary MySQL instance remains writable; all TiDB nodes are read‑only until the HMS nodes are stopped, the MySQL binlog and TiDB source are synchronized to the same timestamp, and then TiDB is promoted to writable. HMS instances are then restarted with the new TiDB connection string.
TiDB Configuration Tweaks
Set tidb_skip_isolation_level_check=1 to avoid MetaException on HMS startup.
Set tidb_txn_mode=pessimistic to strengthen transaction consistency.
Adjust transaction size limit to 3 GB (tunable per workload).
Increase connection limit to a maximum of 3000 (adjustable).
When Sentry is enabled, ensure that NOTIFICATION_ID in Sentry metadata does not lag behind NEXT_EVENT_ID in HMS’s NOTIFICATION_SEQUENCE table; otherwise table‑creation or partition‑creation may time out.
Performance Results
HQL compatibility reached 100% with TiDB.
Average query‑type latency improved by ~15%; table‑creation time reduced by 80%.
Disk usage stayed below 10%; CPU average ~12% with IO wait <0.025%.
One‑click horizontal scaling distributes query load evenly across nodes.
TiDB Binlog provides reliable data backup and rollback capability.
High availability achieved via LVS or HAProxy load‑balancing.
API latency statistics before and after migration are shown in the following charts:
Problems Encountered and Solutions
4.1 Primary‑Key Conflict When Rolling Back to MySQL
After three‑fold data growth on TiDB, rolling back to MySQL triggered duplicate‑key errors because TiDB’s auto‑increment IDs are allocated in blocks per node, causing gaps and non‑monotonic global IDs. The SEQUENCE_TABLE in Metastore no longer reflected the true maximum ID.
Fix: Manually set the values in SEQUENCE_TABLE to the current maximum primary‑key values of the corresponding tables before switching back.
4.2 Index Choice for PARTITION_KEY_VALS
Slow‑query analysis on MySQL revealed that queries joining multiple PARTITION_KEY_VALS tables performed full scans due to missing indexes, leading to >100 s latency under concurrency.
Adding an index on PARTITION_KEY_VAL dramatically improved performance for high‑concurrency workloads (average <500 ms). However, for typical daily‑partition queries (≈260 k rows per day), the same index caused the optimizer to scan ~1.6 M rows, which is inefficient.
Conclusion: The index is beneficial for specific high‑selectivity queries but should not be enabled globally in production.
4.3 TiDB Memory Spikes
During early production, TiDB memory would suddenly surge when users analyzed large slow‑query logs via the Dashboard, sometimes leading to out‑of‑memory crashes. Canceling the analysis could also cause memory leaks.
Mitigation:
Upgrade to machines with larger RAM.
Raise the slow‑query threshold to 3 s to reduce log volume.
Periodically move slow‑query logs to backup storage.
4.4 locate() Function Causing Full Table Scan on TiKV
Some JDO‑based HMS queries used locate() on PART_NAME, which prevents index usage in TiDB, forcing a full table scan on TiKV and overloading the KV layer.
Solution: Rewrite the condition using LIKE, which enables the UNIQUEPARTITION index on PARTITIONS to filter data early, halving TiKV CPU usage and reducing network traffic from ~1 GB to ~200 MB.
4.5 Optimizing get_all_functions API
The original JDO implementation fetched all UDFs by joining FUNCS, FUNC_RU, and DBS, resulting in 40‑90 s latency. Switching to a direct SQL query reduced the response to under 1 s.
select FUNCS.FUNC_NAME,
DBS.NAME,
FUNCS.CLASS_NAME,
FUNCS.OWNER_NAME,
FUNCS.OWNER_TYPE,
FUNCS.CREATE_TIME,
FUNCS.FUNC_TYPE,
FUNC_RU.RESOURCE_URI,
FUNC_RU.RESOURCE_TYPE
from FUNCS
left join FUNC_RU on FUNCS.FUNC_ID = FUNC_RU.FUNC_ID
left join DBS on FUNCS.DB_ID = DBS.DB_ID;Conclusion
Since July 2021, Vivo has migrated HMS from MySQL to TiDB after extensive compatibility and performance testing. Early production issues—memory spikes from slow‑query analysis, primary‑key conflicts, and sub‑optimal index choices—were resolved through configuration changes, query rewrites, and operational safeguards. Post‑migration metrics show TiDB memory usage below 10%, TiKV CPU stable under 30 cores, and API latency meeting business SLAs. The architecture will continue to support the growing data platform, with plans to evaluate newer TiDB releases and further HMS optimizations.
Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
ITPUB
Official ITPUB account sharing technical insights, community news, and exciting events.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
