MySQL vs HBase: Architectural, Engine, and Use‑Case Differences Explained
This article compares MySQL and HBase across architecture, engine design, indexing structures, and feature sets such as TTL and multi‑versioning, highlighting how MySQL excels in low‑latency online transactions while HBase offers distributed scalability and write‑optimized storage for big‑data scenarios.
MySQL + HBase are two commonly used databases in daily applications, solving online transaction problems and massive storage for big‑data scenarios respectively.
Differences from an Architectural Perspective
Compared with MySQL, HBase’s architectural characteristics:
Fully distributed (data sharding, self‑recovery on failures)
Built on HDFS (storage‑compute separation)
Capability differences observed from architecture:
MySQL: simple operations (few components), low latency (short access path)
HBase: good scalability, built‑in fault tolerance and data redundancy
Differences from Engine Structure
Compared with MySQL, HBase’s internal engine characteristics:
HBase has no native SQL engine (cannot use SQL directly, uses API); cloud‑enhanced HBase (Lindorm) and open‑source Phoenix provide SQL capability
HBase uses LSM (Log‑Structured Merge) tree, while InnoDB uses B+ tree
Capability differences seen from engine structure (B+Tree vs LSM Tree):
MySQL: read/write balanced, suffers from space fragmentation
HBase: write‑oriented, compact storage without waste, I/O amplification, strong data import capability
Understanding LSM Tree and B+ Tree
The goal is to reduce disk I/O.
Index: a data structure that facilitates data lookup.
Hash indexes are not suitable for range queries, so tree structures are used.
B+ Tree
Disk reads are page‑oriented, so a balanced multi‑way search tree is used.
Non‑leaf nodes store indexes, leaf nodes store data.
Non‑leaf nodes can hold more indexes, reducing tree height.
Leaf nodes are linked, benefiting range queries.
Leaf‑to‑root distance is roughly equal, giving stable lookup efficiency.
Data insertion may split leaf nodes, causing logically consecutive data to be stored in different physical blocks, which degrades range‑query performance.
LSM Tree
LSM (Log‑Structured Merge) is used by LevelDB, RocksDB, HBase, Cassandra, etc.
Both HDD and SSD have higher sequential read/write speeds than random; logging writes sequentially.
Components include WAL, memtable, SSTable.
Optimized for writes, not reads: first search memtable, then all SSTable files on disk.
Compaction reduces the number of SSTable files, alleviates read amplification, and Bloom filters can accelerate lookups.
Compaction strategies: STCS (Size‑Tiered) addresses space and read amplification; LCS (Leveled) addresses write amplification.
Compaction introduces write amplification; KV separation can mitigate it when values are large.
When write operations dominate reads, LSM trees perform better because inserts avoid frequent B+Tree node splits, turning many random single‑page writes into fewer multi‑page sequential writes, greatly improving write throughput at the cost of some read performance.
Data Access
Similarity: data is logically organized as tables, and applications perform CRUD operations.
Differences: MySQL’s SQL features are richer with stronger transaction capabilities; HBase can be accessed via API for flexible, high‑performance operations or via Phoenix for standard SQL, but only supports single‑row transactions.
HBase’s distinctive features – TTL
HBase’s distinctive features – Multi‑Version
HBase’s distinctive features – Column Families
HBase’s distinctive features – MOB
Differences from Ecosystem Perspective
MySQL satisfies online application storage needs and is generally sufficient.
Big‑data ecosystem: components for storage, compute, and management in big‑data scenarios.
MySQL can generally meet online application storage alone or with a few components (e.g., cache, sharding middleware).
HBase usually requires many big‑data components, making architecture design and implementation more challenging.
Summary
Which storage scenarios are suitable for HBase?
HBase is not a replacement for MySQL; it is a natural extension when business scale and scenario expand beyond MySQL.
Author: Zhuang Xiaoyan Source: blog.csdn.net/weixin_41605937/article/details/110933984
Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
Java Interview Crash Guide
Dedicated to sharing Java interview Q&A; follow and reply "java" to receive a free premium Java interview guide.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
