Databases 9 min read

MySQL vs HBase: Architectural, Engine, and Use‑Case Differences Explained

This article compares MySQL and HBase across architecture, engine design, indexing structures, and feature sets such as TTL and multi‑versioning, highlighting how MySQL excels in low‑latency online transactions while HBase offers distributed scalability and write‑optimized storage for big‑data scenarios.

Java Interview Crash Guide

Sep 26, 2021

MySQL vs HBase: Architectural, Engine, and Use‑Case Differences Explained

MySQL + HBase are two commonly used databases in daily applications, solving online transaction problems and massive storage for big‑data scenarios respectively.

Differences from an Architectural Perspective

Compared with MySQL, HBase’s architectural characteristics:

Fully distributed (data sharding, self‑recovery on failures)

Built on HDFS (storage‑compute separation)

Capability differences observed from architecture:

MySQL: simple operations (few components), low latency (short access path)

HBase: good scalability, built‑in fault tolerance and data redundancy

Differences from Engine Structure

Compared with MySQL, HBase’s internal engine characteristics:

HBase has no native SQL engine (cannot use SQL directly, uses API); cloud‑enhanced HBase (Lindorm) and open‑source Phoenix provide SQL capability

HBase uses LSM (Log‑Structured Merge) tree, while InnoDB uses B+ tree

Capability differences seen from engine structure (B+Tree vs LSM Tree):

MySQL: read/write balanced, suffers from space fragmentation

HBase: write‑oriented, compact storage without waste, I/O amplification, strong data import capability

Understanding LSM Tree and B+ Tree

The goal is to reduce disk I/O.

Index: a data structure that facilitates data lookup.

Hash indexes are not suitable for range queries, so tree structures are used.

B+ Tree

Disk reads are page‑oriented, so a balanced multi‑way search tree is used.

Non‑leaf nodes store indexes, leaf nodes store data.

Non‑leaf nodes can hold more indexes, reducing tree height.

Leaf nodes are linked, benefiting range queries.

Leaf‑to‑root distance is roughly equal, giving stable lookup efficiency.

Data insertion may split leaf nodes, causing logically consecutive data to be stored in different physical blocks, which degrades range‑query performance.

LSM Tree

LSM (Log‑Structured Merge) is used by LevelDB, RocksDB, HBase, Cassandra, etc.

Both HDD and SSD have higher sequential read/write speeds than random; logging writes sequentially.

Components include WAL, memtable, SSTable.

Optimized for writes, not reads: first search memtable, then all SSTable files on disk.

Compaction reduces the number of SSTable files, alleviates read amplification, and Bloom filters can accelerate lookups.

Compaction strategies: STCS (Size‑Tiered) addresses space and read amplification; LCS (Leveled) addresses write amplification.

Compaction introduces write amplification; KV separation can mitigate it when values are large.

When write operations dominate reads, LSM trees perform better because inserts avoid frequent B+Tree node splits, turning many random single‑page writes into fewer multi‑page sequential writes, greatly improving write throughput at the cost of some read performance.

Data Access

Similarity: data is logically organized as tables, and applications perform CRUD operations.

Differences: MySQL’s SQL features are richer with stronger transaction capabilities; HBase can be accessed via API for flexible, high‑performance operations or via Phoenix for standard SQL, but only supports single‑row transactions.

HBase’s distinctive features – TTL

HBase’s distinctive features – Multi‑Version

HBase’s distinctive features – Column Families

HBase’s distinctive features – MOB

Differences from Ecosystem Perspective

MySQL satisfies online application storage needs and is generally sufficient.

Big‑data ecosystem: components for storage, compute, and management in big‑data scenarios.

MySQL can generally meet online application storage alone or with a few components (e.g., cache, sharding middleware).

HBase usually requires many big‑data components, making architecture design and implementation more challenging.

Summary

Which storage scenarios are suitable for HBase?

HBase is not a replacement for MySQL; it is a natural extension when business scale and scenario expand beyond MySQL.

Author: Zhuang Xiaoyan Source: blog.csdn.net/weixin_41605937/article/details/110933984

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.

Big Data LSM‑Tree Database Architecture mysql HBase B+Tree

Written by

Java Interview Crash Guide

Dedicated to sharing Java interview Q&A; follow and reply "java" to receive a free premium Java interview guide.

0 followers

Reader feedback

How this landed with the community

Rate this article

Was this worth your time?

Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.