
Pegasus: Design Overview, New Features, Ecosystem, and Community Development

This article introduces Pegasus, a distributed key‑value storage system, covering its background, architecture, data model, and dual‑WAL design; performance benchmarks; recent features such as hot backup, bulk load, access control, and partition split; and the surrounding ecosystem tools and open‑source community initiatives.


Pegasus is a distributed key‑value store designed to address the limitations of existing storage systems like Redis and HBase, offering strong consistency, low latency, and efficient resource usage.

Background: Redis delivers high performance but lacks strong consistency, while HBase guarantees consistency at the cost of higher latency; Pegasus was created to combine the strengths of both.

System Architecture: The cluster consists of a Meta server for metadata management and configuration changes, Replica servers that store user data in RocksDB and ensure strong consistency via the PacificA protocol, and ZooKeeper for leader election and cluster metadata. Clients first retrieve routing information from the Meta server and then interact directly with Replica servers.

Data Model: Pegasus uses a two‑level index with a hashkey for sharding and a sortkey stored in lexical order, enabling atomic batch reads and writes within the same hashkey.
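As an illustration, the two-level index can be modeled in a few lines of Python. The partition count, hash function, and class names below are assumptions made for the sketch, not Pegasus internals:

```python
import zlib

PARTITION_COUNT = 8  # illustrative; a real table fixes this at creation time

def partition_of(hashkey: bytes) -> int:
    # every row sharing a hashkey maps to the same partition
    return zlib.crc32(hashkey) % PARTITION_COUNT

class ToyTable:
    """Toy model of the (hashkey, sortkey) two-level index."""
    def __init__(self):
        self.partitions = [dict() for _ in range(PARTITION_COUNT)]

    def set(self, hashkey: bytes, sortkey: bytes, value: bytes):
        self.partitions[partition_of(hashkey)][(hashkey, sortkey)] = value

    def multi_get(self, hashkey: bytes):
        # all sortkeys of one hashkey live in a single partition, which is
        # what makes a batch read/write over them atomic; results come back
        # in lexicographic sortkey order
        part = self.partitions[partition_of(hashkey)]
        return sorted((sk, v) for (hk, sk), v in part.items() if hk == hashkey)
```

Because `multi_get` only ever touches one partition, batching by hashkey never crosses a shard boundary, which is the property the real system exploits for atomic batch operations.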

Dual‑WAL Design: To reduce write latency, Pegasus appends a shared write‑ahead log (WAL) for all shards onto a dedicated disk, keeping log writes sequential, while data is written to RocksDB; in addition, each shard keeps a private log used for load balancing and fault recovery. Together these mitigate long‑tail latency.
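A minimal sketch of that write path, assuming simplified in-memory structures (the names and layout here are illustrative, not the actual Pegasus implementation):

```python
class DualWALReplica:
    """Toy model of the dual-WAL write path on one Replica server."""
    def __init__(self):
        self.shared_log = []    # one sequential log for all shards, on a dedicated disk
        self.private_logs = {}  # per-shard log, flushed asynchronously
        self.memtables = {}     # stands in for each shard's RocksDB memtable

    def write(self, shard: str, key: str, value: str):
        # 1) synchronous append to the shared log: a single sequential
        #    writer on its own disk keeps write latency low and stable
        self.shared_log.append((shard, key, value))
        # 2) apply the mutation to the shard's in-memory state
        self.memtables.setdefault(shard, {})[key] = value
        # 3) asynchronously copy into the shard's private log, so recovery
        #    or migration of one shard replays only that shard's records
        self.private_logs.setdefault(shard, []).append((key, value))

    def recover_shard(self, shard: str) -> dict:
        # replay only the failed shard's private log, not the whole shared log
        state = {}
        for key, value in self.private_logs.get(shard, []):
            state[key] = value
        return state
```

The design choice this illustrates: the shared log optimizes the latency-critical synchronous append, while the private logs keep per-shard recovery and rebalancing cheap.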

Performance Testing: Benchmark results of version 2.2.0 under YCSB workloads demonstrate the system’s high throughput and low latency.

New Features:

Hot backup for online cross‑region replication, supporting both single‑master and multi‑master modes, and enabling seamless data migration and disaster recovery.

Bulk Load for fast offline data ingestion using Pegasus‑Spark to generate SST files that are directly loaded into Replica servers.
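The offline step can be sketched as: bucket rows by target partition, then sort each bucket by (hashkey, sortkey) so the server can ingest each file without re-sorting. The partition count and hash function below are assumptions for illustration; in reality Pegasus‑Spark emits RocksDB SST files:

```python
import zlib

PARTITION_COUNT = 4  # illustrative

def generate_sorted_runs(records):
    """records: iterable of (hashkey, sortkey, value) byte tuples.
    Returns one sorted run per partition, mimicking the per-partition
    SST files generated offline for direct ingestion."""
    buckets = [[] for _ in range(PARTITION_COUNT)]
    for hashkey, sortkey, value in records:
        idx = zlib.crc32(hashkey) % PARTITION_COUNT
        buckets[idx].append((hashkey, sortkey, value))
    # each run is sorted by (hashkey, sortkey), the on-disk key order
    return [sorted(b) for b in buckets]
```

Because sorting and partitioning happen offline in the batch job, the Replica servers only perform a cheap file ingestion instead of absorbing a write storm through the normal path.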

Access Control based on Kerberos authentication and table‑level white‑list permissions.

Partition Split that doubles the number of hash partitions by asynchronously copying data from parent to child partitions before making the split visible to clients.
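The doubling trick rests on simple modular arithmetic: if a key's partition index is h mod n before the split, then after the count doubles it becomes h mod 2n, which is either the same index (the parent keeps it) or index + n (the new child takes it). A minimal sketch of that invariant:

```python
def partition_index(key_hash: int, partition_count: int) -> int:
    # partition assignment by hash, as in the data-model section
    return key_hash % partition_count

def split_preserves_locality(key_hash: int, old_count: int) -> bool:
    # after doubling, a key stays on its parent partition i or moves to
    # the child partition i + old_count; no other partition is touched,
    # which is why parent-to-child copying can run pairwise and
    # asynchronously before the split becomes visible to clients
    before = partition_index(key_hash, old_count)
    after = partition_index(key_hash, 2 * old_count)
    return after in (before, before + old_count)
```

This is why a split never requires a global reshuffle: each parent only ever hands data to its single child.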

Ecosystem Tools:

Pegasus‑Spark: a connector that allows Spark jobs to read Pegasus snapshots and perform bulk data loading via generated SST files.

Meta Proxy: a proxy layer that hides Meta server addresses, providing a unified entry point and transparent failover for hot‑backup clusters.

Disk Data Migration Tool: an online balancer that redistributes shards across disks to alleviate storage bottlenecks without disrupting service.

Community Development: Since its inception in 2015, Pegasus has been open‑sourced (version 1.7.0 in 2017) and entered the Apache Incubator in 2020. The community encourages contributions to client tools, shells, and other utilities, and plans future enhancements such as periodic bulk imports, hot‑backup optimizations, hotspot detection, throttling, tracing, and support for dual‑replica architectures.

For more information and to participate, visit the GitHub repository https://github.com/apache/incubator-pegasus and follow the official Apache Pegasus channels.

Tags: Performance, Big Data, open-source, Distributed Storage, Key-Value Store, Pegasus
Written by DataFunSummit, the official account of the DataFun community, dedicated to sharing big data and AI industry summit news and speaker talks, with regular downloadable resource packs.
