
Design and Implementation of DGraph: A High‑Performance Recommendation Engine

DGraph, a C++ recommendation engine launched in 2022 for Dewu (得物), combines an index layer with a service layer. It uses lock‑free RCU structures, a custom mmap‑based D‑Allocator, RoaringBitmap inverted indexes, and a multi‑operator fusion scheduler to deliver high‑performance, eventually consistent, scalable recommendations.

DeWu Technology

DGraph is a C++ project launched in late 2022 to build an efficient and easy‑to‑use recommendation engine for the rapidly growing business of Dewu (得物). It addresses the challenges of multi‑table data, frequent updates, and queries that span many tables.

The system is divided into an index layer and a service layer. The index layer provides CRUD operations for indexes, while the service layer hosts the Graph operator framework, external services, query parsing, output encoding, sorting, and other business‑oriented modules.

Index management is abstracted into five modules (Reader, Writer, Compaction, LifeCycle, and Schema), so that adding a new index type only requires implementing these five classes.
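To make the five-module split concrete, here is a minimal sketch of what such interfaces could look like. The five class names come from the article; every method name, the key and value types, and the ToyKvIndex class are assumptions made for illustration and are not DGraph's actual API.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical interfaces: only the five names are taken from the article.
struct Schema     { virtual ~Schema() = default;     virtual std::string describe() const = 0; };
struct Reader     { virtual ~Reader() = default;     virtual std::optional<std::string> get(std::uint64_t key) const = 0; };
struct Writer     { virtual ~Writer() = default;     virtual void put(std::uint64_t key, std::string value) = 0;
                                                     virtual void remove(std::uint64_t key) = 0; };
struct Compaction { virtual ~Compaction() = default; virtual void compact() = 0; };
struct LifeCycle  { virtual ~LifeCycle() = default;  virtual void load() = 0;
                                                     virtual void unload() = 0; };

// A toy index type that plugs into all five modules.
class ToyKvIndex final : public Schema, public Reader, public Writer,
                         public Compaction, public LifeCycle {
public:
    std::string describe() const override { return "uint64 -> string"; }

    std::optional<std::string> get(std::uint64_t key) const override {
        auto it = entries_.find(key);
        if (it == entries_.end()) return std::nullopt;
        return it->second;  // an empty optional is a tombstone, i.e. "deleted"
    }

    void put(std::uint64_t key, std::string value) override { entries_[key] = std::move(value); }

    // Deletes only write a tombstone; compact() reclaims the slot later.
    void remove(std::uint64_t key) override { entries_[key] = std::nullopt; }

    void compact() override {
        for (auto it = entries_.begin(); it != entries_.end();) {
            if (!it->second) it = entries_.erase(it); else ++it;
        }
    }

    void load() override { loaded_ = true; }
    void unload() override { entries_.clear(); loaded_ = false; }

    std::size_t physicalSize() const { return entries_.size(); }
    bool loaded() const { return loaded_; }

private:
    std::unordered_map<std::uint64_t, std::optional<std::string>> entries_;
    bool loaded_ = false;
};
```

The point of the split is that the rest of the engine can hold a Reader or a Compaction pointer without knowing which index type sits behind it.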

DGraph adopts eventual consistency. In a cluster of N engines, each engine updates data independently, allowing millisecond‑level divergence that does not affect the business because the data eventually converges.

To meet the high read‑performance demands of recommendation scenarios, DGraph uses lock‑free RCU data structures (single‑writer, multi‑reader) instead of traditional locking.
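The single‑writer, multi‑reader pattern can be sketched as copy, update, publish. The class below is a hypothetical illustration, not DGraph's structure: RcuTable and its methods are invented names, the whole map is copied on every update, and old versions are simply kept alive until the table is destroyed instead of being reclaimed after a grace period as a real RCU implementation would do.

```cpp
#include <atomic>
#include <memory>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical single-writer, multi-reader table (illustration only).
class RcuTable {
public:
    using Map = std::unordered_map<std::string, int>;

    RcuTable() {
        versions_.push_back(std::make_unique<Map>());
        current_.store(versions_.back().get(), std::memory_order_release);
    }

    // Read side: one atomic load, no lock. The returned snapshot is never
    // modified after publication, so readers can use it freely.
    const Map* snapshot() const { return current_.load(std::memory_order_acquire); }

    // Write side: must only ever be called from the single writer thread.
    void upsert(const std::string& key, int value) {
        // 1. Copy the current version.
        auto next = std::make_unique<Map>(*current_.load(std::memory_order_relaxed));
        // 2. Update the private copy.
        (*next)[key] = value;
        // 3. Publish it. Readers that already hold the old pointer keep
        //    reading the old version, which stays alive in versions_.
        const Map* published = next.get();
        versions_.push_back(std::move(next));
        current_.store(published, std::memory_order_release);
    }

private:
    std::atomic<const Map*> current_{nullptr};
    // Simplification: retired versions are never freed before destruction.
    std::vector<std::unique_ptr<Map>> versions_;
};
```

A reader that took a snapshot before an update continues to see the old data, which is the same short-lived divergence that eventual consistency tolerates at cluster level.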

The custom memory allocator, D‑Allocator, is based on mmap and allocates 128 MB–1 GB blocks. It manages up to 96 TB of address space per cluster with fixed‑address mapping using a keyId (e.g., 0x0000100000000000 + keyId * 100GB).
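The address formula can be written out as a small sketch. The base address and the 100GB stride come from the article; reading "100GB" as 100 GiB, the function names regionBase and mapBlock, and the choice to pass the address as a hint instead of forcing it with MAP_FIXED are assumptions made here, not details of D‑Allocator itself.

```cpp
#include <cstddef>
#include <cstdint>
#include <sys/mman.h>

constexpr std::uint64_t kBaseAddress = 0x0000100000000000ULL;  // from the article
constexpr std::uint64_t kGiB = 1ULL << 30;
constexpr std::uint64_t kRegionBytes = 100 * kGiB;  // assumption: "100GB" means 100 GiB

// Fixed start address of the region that belongs to keyId.
constexpr std::uint64_t regionBase(std::uint64_t keyId) {
    return kBaseAddress + keyId * kRegionBytes;
}

// Hypothetical helper: request an anonymous block at the start of a region.
// The address is only a hint here, so an existing mapping is never overwritten;
// the caller has to check that the returned pointer equals the requested one.
void* mapBlock(std::uint64_t keyId, std::size_t bytes) {
    void* want = reinterpret_cast<void*>(regionBase(keyId));
    void* got = mmap(want, bytes, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return got == MAP_FAILED ? nullptr : got;
}
```

Because each keyId always maps to the same virtual address, raw pointers stored inside an index presumably remain valid when the same data is mapped again, which fits the article's point about loading binary indexes through mmap.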

KV/KVV indexes are implemented as a dense hash map where each bucket stores the first KVPair and conflict information; incremental parts use RcuHashMap<Key, RcuDoc> built on D‑Allocator.
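One way to read "each bucket stores the first KVPair and conflict information" is sketched below: the first pair lives inline in the bucket array, and the bucket carries an index into a shared overflow array for colliding keys. The layout, the DenseKvIndex name, and the fixed 64-bit key and value types are assumptions for illustration; the article does not describe the real bucket format.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <optional>
#include <vector>

// Hypothetical dense hash index: first pair inline, collisions chained by index.
class DenseKvIndex {
public:
    explicit DenseKvIndex(std::size_t bucketCount) : buckets_(bucketCount) {}

    void put(std::uint64_t key, std::uint64_t value) {
        Bucket& b = buckets_[slot(key)];
        if (!b.used) {  // common case: the bucket is empty, store the pair inline
            b.key = key; b.value = value; b.used = true;
            return;
        }
        if (b.key == key) { b.value = value; return; }
        for (std::int32_t i = b.next; i != kNone; i = overflow_[i].next) {
            if (overflow_[i].key == key) { overflow_[i].value = value; return; }
        }
        // New colliding key: prepend it to this bucket's overflow chain.
        overflow_.push_back(Overflow{key, value, b.next});
        b.next = static_cast<std::int32_t>(overflow_.size() - 1);
    }

    std::optional<std::uint64_t> get(std::uint64_t key) const {
        const Bucket& b = buckets_[slot(key)];
        if (!b.used) return std::nullopt;
        if (b.key == key) return b.value;  // hit without leaving the bucket
        for (std::int32_t i = b.next; i != kNone; i = overflow_[i].next) {
            if (overflow_[i].key == key) return overflow_[i].value;
        }
        return std::nullopt;
    }

private:
    static constexpr std::int32_t kNone = -1;

    struct Bucket {
        std::uint64_t key = 0;
        std::uint64_t value = 0;
        std::int32_t next = kNone;  // "conflict information": head of the chain
        bool used = false;
    };
    struct Overflow {
        std::uint64_t key;
        std::uint64_t value;
        std::int32_t next;
    };

    std::size_t slot(std::uint64_t key) const {
        return std::hash<std::uint64_t>{}(key) % buckets_.size();
    }

    std::vector<Bucket> buckets_;
    std::vector<Overflow> overflow_;
};
```

With a well-sized bucket array most lookups touch a single cache line, and only colliding keys pay for the extra hop into the overflow array.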

The inverted index is built on RoaringBitmap with a two‑level scheme (the high 16 bits of a document ID select the primary index entry, the low 16 bits are stored in secondary containers), providing efficient storage for both sparse and dense postings.
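The two‑level split can be illustrated with a toy structure. This is not the RoaringBitmap library and not DGraph's code: TinyRoaring is a hypothetical class that keeps sparse containers as sorted arrays and switches to a 65,536‑bit bitmap once a container holds more than 4,096 values, which mirrors the array/bitmap container idea of Roaring (run containers are left out).

```cpp
#include <algorithm>
#include <bitset>
#include <cstddef>
#include <cstdint>
#include <map>
#include <memory>
#include <vector>

// Toy two-level bitmap over 32-bit document IDs (illustration only).
class TinyRoaring {
public:
    void add(std::uint32_t docId) {
        const auto high = static_cast<std::uint16_t>(docId >> 16);     // primary key
        const auto low  = static_cast<std::uint16_t>(docId & 0xFFFF);  // value in container
        Container& c = containers_[high];
        if (c.dense) { c.dense->set(low); return; }
        auto it = std::lower_bound(c.sparse.begin(), c.sparse.end(), low);
        if (it != c.sparse.end() && *it == low) return;  // already present
        c.sparse.insert(it, low);
        if (c.sparse.size() > kArrayLimit) {
            // Too many values for a sorted array: switch to a fixed 8 KiB bitmap.
            c.dense = std::make_unique<std::bitset<65536>>();
            for (std::uint16_t v : c.sparse) c.dense->set(v);
            c.sparse.clear();
            c.sparse.shrink_to_fit();
        }
    }

    bool contains(std::uint32_t docId) const {
        auto it = containers_.find(static_cast<std::uint16_t>(docId >> 16));
        if (it == containers_.end()) return false;
        const auto low = static_cast<std::uint16_t>(docId & 0xFFFF);
        const Container& c = it->second;
        if (c.dense) return c.dense->test(low);
        return std::binary_search(c.sparse.begin(), c.sparse.end(), low);
    }

    // True if the container for this high key has been converted to a bitmap.
    bool isDense(std::uint16_t high) const {
        auto it = containers_.find(high);
        return it != containers_.end() && it->second.dense != nullptr;
    }

private:
    static constexpr std::size_t kArrayLimit = 4096;

    struct Container {
        std::vector<std::uint16_t> sparse;            // sorted low 16 bits
        std::unique_ptr<std::bitset<65536>> dense;    // set once the array grows too large
    };

    std::map<std::uint16_t, Container> containers_;  // keyed by high 16 bits
};
```

A sorted array of 4,096 16‑bit values takes 8 KiB, the same as the bitmap, so beyond that size the bitmap is never larger; that is why sparse and dense posting lists can both be stored compactly.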

Embedding indexes leverage K‑means clustering; each centroid creates a RoaringBitmap that can be combined with textual filters during vector retrieval.
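A rough sketch of how a per‑centroid posting set can be combined with a filter at query time follows. Sorted vectors of document IDs stand in for RoaringBitmaps, only the single nearest centroid is probed, and the names Cluster, squaredDistance, and filteredCandidates are hypothetical; the article does not say how many centroids DGraph probes or how it scores candidates afterwards.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <iterator>
#include <limits>
#include <vector>

using Vec = std::vector<float>;
using Postings = std::vector<std::uint32_t>;  // sorted doc IDs; stand-in for a RoaringBitmap

struct Cluster {
    Vec centroid;   // produced offline by K-means
    Postings docs;  // documents assigned to this centroid
};

// Assumes both vectors have the same dimension.
float squaredDistance(const Vec& a, const Vec& b) {
    float sum = 0.0f;
    for (std::size_t i = 0; i < a.size(); ++i) {
        const float d = a[i] - b[i];
        sum += d * d;
    }
    return sum;
}

// Documents of the nearest cluster that also satisfy the textual filter.
Postings filteredCandidates(const std::vector<Cluster>& clusters,
                            const Vec& query,
                            const Postings& filter) {
    const Cluster* best = nullptr;
    float bestDist = std::numeric_limits<float>::max();
    for (const Cluster& c : clusters) {
        const float d = squaredDistance(c.centroid, query);
        if (d < bestDist) { bestDist = d; best = &c; }
    }
    Postings out;
    if (best != nullptr) {
        // Bitmap AND, expressed here as an intersection of sorted ID lists.
        std::set_intersection(best->docs.begin(), best->docs.end(),
                              filter.begin(), filter.end(),
                              std::back_inserter(out));
    }
    return out;
}
```

Because the filter is applied as a set intersection before any per-document distance is computed, documents that fail the textual condition never reach the vector scoring step.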

The operator scheduling framework evolved from a simple query/completion engine to a multi‑operator fusion scheduler with node‑driven and thread‑sticky scheduling, columnar storage, and virtual columns, significantly reducing latency.
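Node‑driven scheduling can be pictured as each operator counting its unfinished inputs and becoming runnable the moment that count reaches zero. The sketch below is single‑threaded and hypothetical (OpGraph, OpNode, and the operator names in the usage note are invented); it shows neither operator fusion nor thread stickiness, only the dependency‑driven ordering.

```cpp
#include <cstddef>
#include <functional>
#include <queue>
#include <string>
#include <utility>
#include <vector>

// Hypothetical operator node in a query DAG.
struct OpNode {
    std::string name;
    std::function<void()> run;
    std::vector<std::size_t> successors;
    std::size_t pending;  // number of predecessors that have not finished yet
};

class OpGraph {
public:
    std::size_t addNode(std::string name, std::function<void()> run) {
        nodes_.push_back(OpNode{std::move(name), std::move(run), {}, 0});
        return nodes_.size() - 1;
    }

    // Declares that `to` consumes the output of `from`.
    void addEdge(std::size_t from, std::size_t to) {
        nodes_[from].successors.push_back(to);
        ++nodes_[to].pending;
    }

    // Runs every operator once its inputs are done; returns the execution order.
    // Nodes that sit on a cycle never become ready and are skipped.
    std::vector<std::string> execute() {
        std::vector<std::size_t> pending(nodes_.size());
        std::queue<std::size_t> ready;
        for (std::size_t i = 0; i < nodes_.size(); ++i) {
            pending[i] = nodes_[i].pending;
            if (pending[i] == 0) ready.push(i);
        }
        std::vector<std::string> order;
        while (!ready.empty()) {
            const std::size_t id = ready.front();
            ready.pop();
            nodes_[id].run();
            order.push_back(nodes_[id].name);
            // The finished node drives its successors: whoever just lost
            // its last pending input becomes runnable.
            for (std::size_t next : nodes_[id].successors) {
                if (--pending[next] == 0) ready.push(next);
            }
        }
        return order;
    }

private:
    std::vector<OpNode> nodes_;
};
```

For example, two recall operators feeding a merge operator that feeds a rank operator would run both recalls first, then merge, then rank. In a multi‑threaded version the ready queue would be where a scheduler decides which thread picks up the next operator.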

Overall, DGraph has delivered improvements in algorithm metrics, stability, and cost for recommendation workloads. The article shares practical lessons on data validation, flexible APIs (SQL‑like or DAG), and the importance of binary mmap‑loaded indexes for fast recovery.

Tags: Distributed Systems, Memory Management, Indexing, C++, RCU, Recommendation Engine