High‑Availability Architecture of TuGraph‑DB: Design, Planning, and Deployment
This article explains the high‑availability architecture of TuGraph‑DB, covering the concept of HA, the Raft consensus algorithm, cluster design, server and client mechanisms, snapshot handling, and the future roadmap, including witness nodes and on‑demand snapshots.
It opens by defining high availability (HA), explaining why it matters for online services, and introducing the industry‑standard metrics: availability expressed in "nines" (e.g., four nines, or 99.99%), the recovery time objective (RTO), and the recovery point objective (RPO).
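To make the "nines" metric concrete, the following minimal sketch converts an availability level into a yearly downtime budget. It assumes a 365‑day year, and the function name is illustrative only; it is not part of TuGraph‑DB.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # assumption: non-leap year


def allowed_downtime_minutes(nines: int) -> float:
    """Return the yearly downtime budget, in minutes, for N nines of availability."""
    unavailability = 10 ** (-nines)  # e.g. 4 nines -> 0.0001
    return SECONDS_PER_YEAR * unavailability / 60


if __name__ == "__main__":
    for n in (2, 3, 4, 5):
        print(f"{n} nines: {allowed_downtime_minutes(n):.2f} minutes/year")
```

Under these assumptions, four nines leaves a budget of roughly 52.6 minutes of downtime per year, which is why manual failover is rarely enough at that level.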
It describes common HA patterns, focusing on the primary‑backup replication model and its limitations, then presents TuGraph‑DB’s HA solution, which is based on the Raft consensus algorithm. The highlighted Raft properties are: a single leader serves requests, log replication provides strong consistency, the protocol's safety guarantees hold, and the cluster stays available as long as a majority of nodes is alive, that is, when fewer than half of the nodes fail.
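The fault‑tolerance rule can be stated as simple arithmetic. The sketch below shows the standard Raft majority calculation; the function names are illustrative and do not come from TuGraph‑DB.

```python
def quorum_size(cluster_size: int) -> int:
    """Smallest number of nodes that forms a strict majority."""
    if cluster_size < 1:
        raise ValueError("cluster_size must be at least 1")
    return cluster_size // 2 + 1


def tolerated_failures(cluster_size: int) -> int:
    """Number of nodes that may fail while a majority still survives."""
    return cluster_size - quorum_size(cluster_size)


if __name__ == "__main__":
    for n in (3, 4, 5):
        print(n, quorum_size(n), tolerated_failures(n))
```

Note that a 4‑node cluster tolerates only one failure, the same as a 3‑node cluster, which is why odd cluster sizes are the usual choice.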
The document details the Raft‑driven cluster design: enabling HA with the enable_ha and ha_conf parameters, automatic leader election, request routing for reads and writes, log‑driven write consistency, and the use of snapshots to accelerate follower synchronization without blocking MVCC reads.
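Log‑driven write consistency rests on the rule that a log entry counts as committed once a majority of nodes has stored it. The following is a simplified sketch of how a Raft leader could derive that commit point from the highest log index replicated on each node. It is a generic illustration, not TuGraph‑DB's implementation, and it omits the additional Raft check that the entry belongs to the leader's current term.

```python
from typing import Sequence


def majority_commit_index(match_index: Sequence[int]) -> int:
    """Highest log index stored on a majority of nodes.

    match_index holds one entry per node, including the leader's own
    last log index (an assumption of this sketch).
    """
    if not match_index:
        raise ValueError("match_index must not be empty")
    ordered = sorted(match_index, reverse=True)
    quorum = len(ordered) // 2 + 1
    # The quorum-th largest value is present on at least `quorum` nodes.
    return ordered[quorum - 1]


if __name__ == "__main__":
    # Leader at index 10, followers at 8 and 5: index 8 is on 2 of 3 nodes.
    print(majority_commit_index([10, 8, 5]))
```

A follower that lags far behind this commit point is the case where installing a snapshot is cheaper than replaying the log entry by entry.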
Client‑side design is also covered, emphasizing the need for connections to all cluster nodes, automatic leader discovery, and load‑balancing of read requests across followers to improve performance.
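The client‑side routing idea can be sketched as follows. The `Node` and `HaClientRouter` classes and the addresses are hypothetical and exist only for illustration; they are not the TuGraph‑DB SDK API. The sketch assumes the client already knows which node is the leader, sends writes there, and rotates reads over the followers.

```python
from dataclasses import dataclass
from itertools import cycle
from typing import List


@dataclass
class Node:
    address: str
    is_leader: bool = False


class HaClientRouter:
    """Hypothetical router: writes go to the leader, reads rotate over followers."""

    def __init__(self, nodes: List[Node]) -> None:
        self.nodes = nodes
        self.refresh()

    def refresh(self) -> None:
        """Re-read the roles, e.g. after an election changed the leader."""
        leaders = [n for n in self.nodes if n.is_leader]
        if len(leaders) != 1:
            raise RuntimeError("expected exactly one leader")
        self.leader = leaders[0]
        followers = [n for n in self.nodes if not n.is_leader]
        # With no followers, reads fall back to the leader.
        self._read_targets = cycle(followers or [self.leader])

    def route(self, is_write: bool) -> Node:
        return self.leader if is_write else next(self._read_targets)


if __name__ == "__main__":
    router = HaClientRouter([
        Node("10.0.0.1:7070", is_leader=True),
        Node("10.0.0.2:7070"),
        Node("10.0.0.3:7070"),
    ])
    print(router.route(is_write=True).address)
    print([router.route(is_write=False).address for _ in range(3)])
```

Round‑robin is only one possible policy; a real client might weight followers by latency or replication lag instead.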
Future plans include introducing a Witness role for small‑scale deployments, on‑demand snapshot generation to reduce storage overhead, and expanding the HA toolchain (e.g., cluster‑wide online import, snapshot utilities, and richer client APIs).
References to detailed deployment guides and SDKs for C++, Java, and Python are provided, along with visual diagrams illustrating the architecture.
DataFunTalk
Dedicated to sharing and discussing big data and AI technology applications, aiming to empower a million data scientists. Regularly hosts live tech talks and curates articles on big data, recommendation/search algorithms, advertising algorithms, NLP, intelligent risk control, autonomous driving, and machine learning/deep learning.