ByteDance’s NoSQL Strategy: Powering Billions of Requests with KV, Graph & More

ByteDance’s NoSQL ecosystem, spanning KV stores like ABase, document databases, columnar systems, and a custom distributed graph database, underpins over 90% of its online services, handling tens of thousands of instances and billions of daily requests, while embracing BASE principles and cloud‑native scalability.

Volcano Engine Developer Services

NoSQL Application Status

NoSQL refers to databases that prioritize the BASE principles—Basically Available, Soft State, and Eventually Consistent—over the strict consistency of traditional relational databases.

Basically Available: Allows partial availability during failures to keep core functions running, e.g., browsing products even if payment fails.

Soft State: Accepts intermediate states that do not affect overall availability, such as pending payments or in‑progress data synchronization.

Eventually Consistent: Guarantees that, once updates stop arriving, all replicas converge to the same state after some time.

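BASE builds on the CAP theorem: when a network partition occurs, a system cannot be both strongly consistent and fully available, and BASE systems choose to sacrifice strong consistency for higher availability.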
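The three BASE properties can be illustrated with a toy model. The sketch below is not taken from any ByteDance system: the `Replica` class, the `anti_entropy` function, and the last‑writer‑wins versioning are all assumptions chosen to keep the example short. It shows a write accepted by one replica, a stale read from another (soft state), and convergence after a background sync (eventual consistency).

```python
class Replica:
    """A toy replica holding key -> (version, value). Hypothetical, for illustration."""

    def __init__(self, name):
        self.name = name
        self.store = {}

    def write(self, key, version, value):
        # Last-writer-wins: keep whichever entry carries the higher version.
        current = self.store.get(key)
        if current is None or version > current[0]:
            self.store[key] = (version, value)

    def read(self, key):
        entry = self.store.get(key)
        return None if entry is None else entry[1]


def anti_entropy(a, b):
    """Exchange entries in both directions so the two replicas converge."""
    for key, (version, value) in list(a.store.items()):
        b.write(key, version, value)
    for key, (version, value) in list(b.store.items()):
        a.write(key, version, value)


r1, r2 = Replica("r1"), Replica("r2")
r1.write("cart:42", 1, ["book"])         # accepted locally; r2 has not seen it yet
stale = r2.read("cart:42")               # soft state: r2 still returns nothing
r2.write("cart:42", 2, ["book", "pen"])  # a later write lands on the other replica
anti_entropy(r1, r2)                     # background synchronization
converged = r1.read("cart:42") == r2.read("cart:42")
```

Both replicas stay writable throughout, which is the availability side of the trade‑off; the price is the window in which `stale` differs from the latest value.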

NoSQL can be categorized into:

KV stores: e.g., Redis.

Document databases: e.g., MongoDB.

Columnar stores: e.g., HBase.

Emerging databases: graph, time‑series, etc.

Within ByteDance, more than 90% of online services rely on NoSQL systems, which span tens of thousands of instances and over 100,000 physical servers.

ByteDance NoSQL Latest Practices

ByteDance’s data can be grouped into three types: user relationships, content (videos, articles, ads), and the connections between users and content (comments, likes, shares). These form a graph structure.

Self‑Developed Distributed Graph Database

To support online social‑graph operations, ByteDance built ByteGraph, a distributed graph database. It supports directed property graphs and Gremlin queries, and it offers massive storage and throughput: up to trillions of edges per cluster and millions of QPS for multi‑degree reads and writes. Hotspot nodes can handle over 10K QPS with millisecond latency.

ByteGraph is used in scenarios such as fraud detection (replacing HBase + compute engine) and recommendation models.
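To make "directed property graph" and "multi‑degree read" concrete, here is a minimal in‑memory sketch. It is not ByteGraph's API or storage layout; the `PropertyGraph` class and its method names are hypothetical. The two‑hop lookup corresponds roughly to chaining two `out('follows')` steps in a Gremlin traversal.

```python
from collections import defaultdict


class PropertyGraph:
    """Toy directed property graph: labeled edges that carry key/value properties."""

    def __init__(self):
        # out_edges[src][label] -> list of (dst, properties)
        self.out_edges = defaultdict(lambda: defaultdict(list))

    def add_edge(self, src, label, dst, **props):
        self.out_edges[src][label].append((dst, props))

    def out(self, vertex, label):
        """One-degree read: direct neighbors along edges with the given label."""
        return [dst for dst, _ in self.out_edges[vertex][label]]

    def two_hop(self, vertex, label):
        """Two-degree read, excluding the start vertex and its direct neighbors."""
        first = set(self.out(vertex, label))
        second = set()
        for neighbor in first:
            second.update(self.out(neighbor, label))
        return second - first - {vertex}


g = PropertyGraph()
g.add_edge("alice", "follows", "bob", since=2020)
g.add_edge("alice", "follows", "dave", since=2022)
g.add_edge("bob", "follows", "carol", since=2021)
g.add_edge("bob", "follows", "alice", since=2021)
g.add_edge("dave", "follows", "carol", since=2023)

suggestions = g.two_hop("alice", "follows")  # friend-of-friend candidates
```

A query like `two_hop` is the shape behind "people you may know" features; in a distributed store each hop may touch different shards, which is why multi‑degree throughput is called out as a headline number.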

Current usage statistics include:

2000+ internal services

1000+ graph database clusters

1000+ daily graph computation tasks

Over 10,000 servers

Graph Computing System

Graph computing extends graph databases to run algorithms like PageRank on large‑scale data. Traditional batch systems (MapReduce, Spark) struggle with graph data due to heavy shuffle operations. ByteDance therefore introduced a dedicated graph computing system that supports trillion‑scale graphs, dynamic high‑throughput training and inference, and hybrid memory/SSD storage with fault tolerance.
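For reference, PageRank itself fits in a few lines on a single machine. The sketch below is a plain power‑iteration version with assumed defaults (damping 0.85, 50 iterations), not ByteDance's implementation. Note that every iteration sends a contribution along every edge; in a MapReduce or Spark job that per‑iteration exchange becomes a shuffle, which is the cost the paragraph above refers to.

```python
def pagerank(edges, damping=0.85, iterations=50):
    """Power-iteration PageRank over a list of (src, dst) edges."""
    nodes = sorted({n for edge in edges for n in edge})
    out_links = {n: [] for n in nodes}
    for src, dst in edges:
        out_links[src].append(dst)

    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        # Rank held by nodes without out-links is spread evenly over all nodes.
        dangling = sum(rank[n] for n in nodes if not out_links[n])
        base = (1.0 - damping) / len(nodes) + damping * dangling / len(nodes)
        new_rank = {n: base for n in nodes}
        for src in nodes:
            for dst in out_links[src]:
                # One "message" per edge per iteration.
                new_rank[dst] += damping * rank[src] / len(out_links[src])
        rank = new_rank
    return rank


ranks = pagerank([("a", "b"), ("b", "c"), ("c", "a"), ("d", "a")])
top = max(ranks, key=ranks.get)
```

In this small graph `a` receives links from both `c` and `d`, so it ends up with the highest score, while `d`, which nobody links to, keeps only the baseline share.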

A one‑stop graph data analysis and management platform integrates graph computing and training capabilities for internal business scenarios such as risk control, e‑commerce, search, and recommendation.

ABase – ByteDance’s KV Storage Service

ABase is a self‑developed KV storage service offering high capacity, high throughput, high availability, multi‑region support, low latency, ease of use, and cost efficiency. To meet growing demands for availability, cross‑region sync, disaster recovery, and resource optimization, ByteDance introduced a second‑generation leaderless architecture.

The leaderless design eliminates master‑node failover latency and mitigates slow‑node impact, achieving millisecond‑level latency even under node failures.
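Why a leaderless design avoids failover pauses can be sketched with a Dynamo‑style quorum. This is an assumption for illustration only: the source does not say that ABase uses quorum reads and writes, and `Node`, `quorum_write`, and `quorum_read` are hypothetical names. With N = 3 replicas, W = 2, and R = 2, any single node can be down or slow without blocking a request, and no election is needed.

```python
class Node:
    """Toy replica; `up=False` simulates a failed or unresponsive node."""

    def __init__(self, name, up=True):
        self.name = name
        self.up = up
        self.data = {}

    def put(self, key, ts, value):
        if not self.up:
            return False
        if key not in self.data or ts > self.data[key][0]:
            self.data[key] = (ts, value)
        return True

    def get(self, key):
        return self.data.get(key) if self.up else None


def quorum_write(nodes, key, ts, value, w):
    """Send the write to every replica; succeed once w of them acknowledge."""
    acks = sum(1 for n in nodes if n.put(key, ts, value))
    return acks >= w


def quorum_read(nodes, key, r):
    """Read from reachable replicas and return the value with the newest timestamp."""
    replies = [n.get(key) for n in nodes if n.up]
    if len(replies) < r:
        raise RuntimeError("not enough replicas reachable")
    found = [reply for reply in replies if reply is not None]
    return max(found, key=lambda entry: entry[0])[1] if found else None


nodes = [Node("a", up=False), Node("b"), Node("c")]
write_ok = quorum_write(nodes, "k", 1, "v1", w=2)  # succeeds although "a" is down

nodes[0].up = True    # "a" comes back without the write
nodes[1].up = False   # a different node fails
value = quorum_read(nodes, "k", r=2)               # still sees "v1" via "c"
```

Because W + R > N, every read quorum overlaps every write quorum in at least one replica, so the read above finds the value even though the set of healthy nodes changed between the two calls.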

Key innovations include:

Fast consensus algorithm with limited write flow and out‑of‑order sync.

Hybrid logical clock (HLC) timestamps embedded in keys for natural conflict resolution.

Dual‑engine architecture: multi‑version data stored in a log engine, while single‑version writes go to a KV engine, keeping most queries as point lookups.
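The HLC item in the list above can be sketched as follows. This follows the commonly published hybrid logical clock update rules rather than ABase's actual code, which the source does not show; the `resolve` helper, the dictionary layout of a version, and the use of a node ID as a tie‑breaker are assumptions. The point is that timestamps stay close to wall‑clock time yet remain strictly ordered across nodes whose clocks disagree, so the highest timestamp can simply win.

```python
class HybridLogicalClock:
    def __init__(self, physical_clock):
        self.physical_clock = physical_clock  # callable returning integer milliseconds
        self.l = 0  # highest physical time observed so far
        self.c = 0  # logical counter that breaks ties within the same l

    def now(self):
        """Timestamp for a local event or an outgoing message."""
        pt = self.physical_clock()
        if pt > self.l:
            self.l, self.c = pt, 0
        else:
            self.c += 1
        return (self.l, self.c)

    def update(self, remote):
        """Merge a timestamp received from another node."""
        pt = self.physical_clock()
        rl, rc = remote
        new_l = max(self.l, rl, pt)
        if new_l == self.l and new_l == rl:
            self.c = max(self.c, rc) + 1
        elif new_l == self.l:
            self.c += 1
        elif new_l == rl:
            self.c = rc + 1
        else:
            self.c = 0
        self.l = new_l
        return (self.l, self.c)


def resolve(versions):
    """Pick the winning version: highest (timestamp, node id)."""
    return max(versions, key=lambda v: (v["ts"], v["node"]))


a = HybridLogicalClock(lambda: 100)
b = HybridLogicalClock(lambda: 90)   # b's wall clock lags by 10 ms
ta = a.now()                         # (100, 0)
tb = b.update(ta)                    # (100, 1): ordered after ta despite the lag
tb2 = b.now()                        # (100, 2)

winner = resolve([
    {"ts": ta, "node": "a", "value": "old"},
    {"ts": tb2, "node": "b", "value": "new"},
])
```

If such a timestamp is appended to the user key, versions of the same key sort in causal order inside the storage engine, which is one plausible reading of "natural conflict resolution".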

After optimization, ABase now offers extremely high availability, global deployment with CRDT support for complex data structures, high‑performance architecture (RunToComplete, KV separation, in‑memory indexing, FIFO log), and serverless storage with multi‑tenant QoS and fine‑grained load balancing.
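The source does not say which CRDTs ABase supports, so the following is only a generic illustration of the idea: a grow‑only counter (G‑Counter), one of the simplest CRDTs. The class and the region names are hypothetical. Each region increments its own slot, and merging takes the per‑slot maximum, so merges can arrive in any order, or more than once, and all regions still agree.

```python
class GCounter:
    """Grow-only counter CRDT: one slot per replica, merged by element-wise max."""

    def __init__(self, replica_id):
        self.replica_id = replica_id
        self.counts = {}

    def increment(self, amount=1):
        self.counts[self.replica_id] = self.counts.get(self.replica_id, 0) + amount

    def value(self):
        return sum(self.counts.values())

    def merge(self, other):
        # Commutative, associative, and idempotent, so replay and reordering are safe.
        for rid, n in other.counts.items():
            self.counts[rid] = max(self.counts.get(rid, 0), n)


east = GCounter("east")
west = GCounter("west")
east.increment(3)   # likes recorded in one region
west.increment(2)   # concurrent likes recorded in another

east.merge(west)
west.merge(east)
west.merge(east)    # a duplicate sync message changes nothing
```

After the merges both regions report the same total without any coordination at write time, which is what makes CRDTs attractive for multi‑region deployments.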

Currently, over 5,000 business lines use ABase, with more than 50,000 servers, handling hundreds of billions of requests and petabyte‑scale data across 30+ regions.

NoSQL Future Development Trends

Looking ahead, NoSQL will evolve toward ultra‑high‑performance KV systems (e.g., Redis) and massive‑scale platforms (e.g., ByteGraph, ABase). Key directions include leveraging cloud‑native and serverless capabilities for elasticity and cost efficiency, enhancing data value and sharing for analytics and AI, unifying storage and computation for diverse schemas, combining hardware and software innovations, and standardizing product interfaces to strengthen both the Redis and SQL ecosystems.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.
