Databases 18 min read

Inside ByteGraph: How ByteDance Scales Distributed Graph Databases with Index and Execution Optimizations

This article summarizes ByteDance engineer Chen Chao's DTCC 2022 talk on ByteGraph, covering its purpose, Gremlin query interface, three‑layer architecture, indexing strategies, distributed transaction handling, performance optimizations such as adaptive throttling and write‑amplification reduction, and integration with offline data pipelines.

ITPUB
ITPUB
ITPUB
Inside ByteGraph: How ByteDance Scales Distributed Graph Databases with Index and Execution Optimizations

ByteGraph Overview

ByteGraph is ByteDance's self‑developed distributed graph database storage system supporting the Gremlin query language. It models user, content, and user‑content relationship data, offering high throughput (up to tens of millions of QPS), low latency, and eventual consistency. Deployed in over 1,000 clusters worldwide, it powers services such as Toutiao, Douyin, Xigua, and knowledge‑graph applications.

Gremlin Query Interface

ByteGraph implements about 80 % of the Gremlin language. Example queries include counting likes received by authors followed by a user, ranking followed authors by fan count, and time‑based ranking of followed authors. ByteGraph adds a withTable() clause to enable cross‑table Gremlin queries, allowing queries that span multiple tables or clusters.

System Architecture

The system is divided into three independent layers—query engine, storage engine, and disk storage—each horizontally scalable. The query engine, written in Go, manages user sessions, proxy services, Gremlin parsing, distributed execution planning, and optimization. It employs rule‑based (RBO) and cost‑based (CBO) optimizers and a push‑mode pipeline executor that supports both row‑ and column‑oriented execution.

The storage engine, implemented in C++, stores graph data as partitioned B‑trees. Each partition (or "slice") represents a sub‑graph and is designed to minimize read/write amplification. Write‑ahead logging (WAL) guarantees durability, and a custom LRU cache provides fast access while keeping memory usage bounded. The storage layer also handles distributed transactions using a two‑phase‑commit protocol, offering read‑committed isolation and eventual consistency.

Indexing Strategies

ByteGraph provides two kinds of indexes. Local (or "partial") indexes are built per vertex and edge type on edge attributes, accelerating edge filtering and attribute sorting. Global indexes map a specific attribute value to all vertex IDs that possess it, enabling fast attribute‑based lookups. Indexes can be constructed synchronously—updating metadata immediately on insert, delete, or update—or lazily, based on query cost analysis to save resources. Both local and global indexes are stored as B+‑trees sharing an OP log to avoid distributed transaction overhead.

Performance Optimizations

Adaptive throttling separates queries into light‑weight and heavy‑weight pools, preventing super‑node hotspots from saturating CPU resources.

Write‑amplification reduction uses a delta‑page design inspired by BW‑Tree: new writes go to a single delta page, which is merged into the base page once a threshold is reached, keeping read amplification around 2×.

Columnar execution and chunk‑based data transfer double query performance for certain workloads.

Offline Integration

ByteGraph supports bulk import from external sources such as MySQL, Hive, Redis, and HBase. For small datasets, RPC writes achieve up to 1 M QPS; for large datasets, MapReduce pipelines can ingest up to 500 B edges per hour directly into the underlying KV store. Real‑time ingestion is provided via SDKs that consume Kafka streams, producing daily snapshots that can be exported to Hive or used in offline graph computation pipelines.

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Sign in to view source
Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactadmin@besthub.devand we will review it promptly.

graph databasedistributed storageGremlinByteGraph
ITPUB
Written by

ITPUB

Official ITPUB account sharing technical insights, community news, and exciting events.

0 followers
Reader feedback

How this landed with the community

Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.