
OceanBase Storage Architecture and Optimizations for TPC‑C Benchmark

This article explains how OceanBase’s distributed, shared‑nothing architecture, with dual data replicas, Paxos‑based consistency, online compression, and resource‑isolated compaction, enables it to achieve top TPC‑C performance while addressing storage costs and CPU usage.

AntTech

Ant Financial’s self‑developed database OceanBase recently topped the TPC‑C benchmark, attracting wide industry attention. To explain the technical achievements behind the result, OceanBase core engineers published a five‑part series; this is the fifth and final article.

The TPC‑C specification ties transaction throughput (tpmC) to the number of warehouses, each roughly 70 MB of data. A system achieving 1.5 million tpmC corresponds to about 120,000 warehouses, or ~8.4 TB of raw data, and storage consumption dominates the test cost.
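The sizing above can be checked with back‑of‑the‑envelope arithmetic. This sketch assumes the TPC‑C rule that sustained throughput may not exceed roughly 12.86 tpmC per warehouse (the spec's ceiling; consult the specification for the exact scaling rules), so a target tpmC implies a minimum warehouse count:

```python
import math

TPMC_PER_WAREHOUSE = 12.86   # approximate spec ceiling per warehouse
WAREHOUSE_MB = 70            # raw data per warehouse, per the article

def min_warehouses(target_tpmc: float) -> int:
    """Smallest warehouse count allowed for a given throughput target."""
    return math.ceil(target_tpmc / TPMC_PER_WAREHOUSE)

def raw_data_tb(warehouses: int) -> float:
    """Raw (uncompressed, single-copy) data volume in decimal TB."""
    return warehouses * WAREHOUSE_MB / 1_000_000

print(min_warehouses(1_500_000))   # 116641 — the article rounds to ~120,000
print(raw_data_tb(120_000))        # 8.4 TB of raw data
```

At ~120,000 warehouses the raw single-copy footprint comes to the article's 8.4 TB figure, before replication is taken into account.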

OceanBase, the first shared‑nothing database to lead TPC‑C, stores two data replicas and three log replicas. It uses Paxos for strong consistency, defining three replica types: F (data + log, read/write), D (data + log, read‑only), and L (log only). The test deploys an FDL configuration (one of each), doubling data‑replica storage but ensuring fault tolerance.
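The replica roles can be sketched as follows. This is an illustration of the FDL layout described above, not OceanBase source code: all three roles persist the Paxos log and therefore vote on commit, while only F serves reads and writes and only F and D keep a full data copy.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Replica:
    role: str          # "F" (full), "D" (data), or "L" (log-only)
    alive: bool = True

    @property
    def has_data(self) -> bool:
        return self.role in ("F", "D")   # L keeps no data copy

    @property
    def has_log(self) -> bool:
        return True                      # F, D, and L all persist the log

def can_commit(replicas) -> bool:
    """A log entry commits once a majority of log replicas ack it."""
    voters = [r for r in replicas if r.has_log]
    acks = sum(r.alive for r in voters)
    return acks > len(voters) // 2

group = [Replica("F"), Replica("D"), Replica("L")]
print(can_commit(group))                                            # True
print(can_commit([Replica("F"), Replica("D"),
                  Replica("L", alive=False)]))                      # True: 2 of 3
```

With three log replicas, the majority is two, so the group tolerates the loss of any single replica while keeping two full data copies in the healthy case.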

To mitigate the extra storage, OceanBase applies online compression, reducing a 70 MB warehouse to about 50 MB. The benchmark runs on 204 i2 ECS instances, each with a 1 TB log disk and ~13 TB data disk, balancing resource usage while keeping storage costs low.
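The effect of compression on the two-data-replica footprint is simple arithmetic (figures from the article; the calculation is illustrative, not the published test accounting):

```python
WAREHOUSES = 120_000
RAW_MB, COMPRESSED_MB = 70, 50   # per-warehouse size before/after compression
DATA_REPLICAS = 2                # F and D each keep a data copy

raw_tb = WAREHOUSES * RAW_MB * DATA_REPLICAS / 1_000_000
compressed_tb = WAREHOUSES * COMPRESSED_MB * DATA_REPLICAS / 1_000_000
print(raw_tb, compressed_tb)   # 16.8 TB uncompressed vs 12.0 TB compressed
```

Compression thus recovers roughly the storage overhead that the second data replica adds, which is what makes the fault-tolerant layout cost-competitive.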

Maintaining a smooth performance curve (jitter < 2 %) over the 8‑hour test is challenging because LSM‑Tree compaction is resource‑intensive. OceanBase introduced scheduling and isolation techniques that keep jitter under 0.5 %.

The storage engine adopts layered SSTable compaction with flexible policies to limit the number of files, balancing write and read performance. Resource isolation separates CPU, memory, disk I/O, and network I/O for foreground queries and background tasks, using dedicated thread pools and I/O throttling.
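One common way to implement the I/O throttling mentioned above is a token bucket that caps background-compaction bandwidth so foreground queries keep theirs. The class below is a minimal sketch of that technique, not OceanBase's actual isolation machinery; the rate and burst parameters are hypothetical tuning knobs.

```python
import time

class TokenBucket:
    """Rate-limit background I/O: tokens are bytes of write budget."""

    def __init__(self, rate_bytes_per_s: float, burst_bytes: float):
        self.rate = rate_bytes_per_s
        self.capacity = burst_bytes
        self.tokens = burst_bytes
        self.last = time.monotonic()

    def _refill(self) -> None:
        now = time.monotonic()
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now

    def throttle(self, nbytes: int) -> None:
        """Block until nbytes of budget is available, then spend it."""
        while True:
            self._refill()
            if self.tokens >= nbytes:
                self.tokens -= nbytes
                return
            time.sleep((nbytes - self.tokens) / self.rate)

# A compaction worker would wrap each write, e.g.:
#   bucket = TokenBucket(rate_bytes_per_s=50 * 2**20, burst_bytes=8 * 2**20)
#   bucket.throttle(len(chunk)); data_disk.write(chunk)
```

Because the bucket only delays background writes, foreground query latency stays flat even while compaction runs at full tilt, which is the behavior behind the sub‑0.5 % jitter figure.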

Compared with centralized databases (e.g., Oracle), OceanBase’s software‑level fault tolerance incurs higher CPU usage due to multiple replicas and online compression, but overall hardware cost accounts for only ~18 % of total cost, offering better cost‑effectiveness.

Future work focuses on improving single‑node storage performance to close the gap with Oracle/DB2 and enhancing OLAP capabilities so that large analytical queries can run directly on compressed data.

Author: Zhao Yuzhong, Senior Technical Expert in the OceanBase team, responsible for storage engine development.

Tags: LSM Tree · Storage Engine · Distributed Database · OceanBase · TPC-C · Online Compression