Leveraging Non‑Volatile Memory to Enhance Redis Persistence Performance
This article examines how integrating byte‑addressable non‑volatile memory (NVM) with Redis can resolve the performance‑reliability trade‑off of AOF persistence: data safety close to that of the always mode, throughput close to that of the everysec mode, and a much shorter recovery time.
Background: Redis is a lightweight, high‑performance open‑source key‑value store used for caching and persistent storage, offering configurable options for read/write performance, cache capacity, and data reliability.
Redis supports RDB and AOF persistence; version 4.0 added a hybrid RDB‑AOF mode. AOF flushing can be configured as always (flush to disk on every write: high safety but low performance) or everysec (buffer writes and flush about once per second: high performance, but roughly the last second of writes can be lost on a crash), forcing a trade‑off between speed and durability.
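The trade‑off described above can be made concrete with a minimal sketch. The `AppendLog` class is hypothetical and is not Redis code; it only mimics the two policies by calling `os.fsync` either on every append or at most once per second. Real Redis performs the everysec flush on a background thread, which this toy version does not model.

```python
import os
import time


class AppendLog:
    """Toy append-only log (hypothetical; not Redis's implementation)."""

    def __init__(self, path, policy="everysec"):
        if policy not in ("always", "everysec"):
            raise ValueError("policy must be 'always' or 'everysec'")
        self.policy = policy
        self.fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
        self.last_sync = time.monotonic()
        self.fsync_calls = 0  # counts how often we paid the flush cost

    def append(self, record: bytes) -> None:
        os.write(self.fd, record)  # lands in the OS page cache first
        now = time.monotonic()
        # "always": flush on every write. "everysec": flush at most once a second.
        if self.policy == "always" or now - self.last_sync >= 1.0:
            os.fsync(self.fd)
            self.fsync_calls += 1
            self.last_sync = now

    def close(self) -> None:
        os.fsync(self.fd)
        os.close(self.fd)
```

With the always policy every append pays for a flush; with everysec a burst of appends inside one second pays for none, which is where both the speed‑up and the exposure to data loss come from.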
High‑safety scenarios often require application‑level coordination, adding system complexity, while failover introduces cache warm‑up latency. The large performance gap between DRAM and SSD motivates the use of emerging Non‑Volatile Memory (NVM) technologies.
NVM products provide a DIMM‑style interface, retain data without power, offer higher capacity and lower cost than DRAM, and deliver byte‑addressable access with speeds far exceeding traditional SSDs, though they exhibit read/write and sequential/random asymmetry.
Typical NVM use cases include persistent memory for highly consistent storage, in‑memory databases as an "in‑place" data space, and system‑log volumes for checkpointing in HPC environments.
Redis was redesigned and customized around NVM’s byte‑addressability, persistence, and performance; the test results reported below address each of the challenges identified above.
Performance analysis: The always mode guarantees durability by flushing on every write but degrades performance; everysec improves QPS by flushing less often but risks data loss. NVM‑based persistence is designed to avoid this trade‑off, as illustrated in the data flow diagram.
The AOF file is placed on an NVM‑aware filesystem (e.g., EXT4 DAX) and memory‑mapped into user space, turning AOF accesses into lightweight load/store operations; persistence is achieved simply by a cache‑flush, bypassing the traditional block‑device stack.
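A minimal sketch of the memory‑mapped log idea follows. The `MappedLog` class, its `capacity` parameter, and the in‑memory `tail` offset are assumptions made for illustration. Python's `mmap.flush` issues an `msync`, which stands in here for the CPU cache‑flush a real DAX implementation would use (for example through PMDK's `pmem_persist`); on an ordinary filesystem this code still goes through the page cache.

```python
import mmap
import os


class MappedLog:
    """Hypothetical append log backed by a memory-mapped file."""

    def __init__(self, path, capacity=1 << 20):
        self.fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
        os.ftruncate(self.fd, capacity)  # preallocate the mapped region
        self.mm = mmap.mmap(self.fd, capacity)
        self.tail = 0  # next free byte offset (kept only in memory here)

    def append(self, record: bytes) -> None:
        end = self.tail + len(record)
        if end > len(self.mm):
            raise MemoryError("log region full")
        self.mm[self.tail:end] = record  # a plain store into the mapping
        # Flush only the touched range; the start offset must be page-aligned.
        start = self.tail - (self.tail % mmap.PAGESIZE)
        self.mm.flush(start, end - start)
        self.tail = end

    def read(self, offset: int, length: int) -> bytes:
        return self.mm[offset:offset + length]  # a plain load

    def close(self) -> None:
        self.mm.close()
        os.close(self.fd)
```

A production design would also need to persist the tail offset (or make records self‑describing) so that the log can be scanned after a crash; that part is omitted here.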
To mitigate AOF growth, a background thread replays AOF commands into NVM‑resident KV structures and then deletes the original file, eliminating space waste.
Replay is performed asynchronously: writes initially target DRAM to preserve client QPS and are later replayed to NVM. DRAM thus serves as a write cache, while NVM acts as a read cache, leveraging NVM’s superior read performance while avoiding its write limitations.
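The asynchronous replay described above can be sketched as a toy model. Everything here is an assumption for illustration: `ReplayStore` is a hypothetical name, a Python dict stands in for the NVM‑resident KV structures, and a `queue.Queue` stands in for the AOF (consuming an entry plays the role of reclaiming log space). Sequence numbers ensure that a DRAM entry is evicted only after its own write has reached the NVM tier.

```python
import queue
import threading


class ReplayStore:
    """Toy model: DRAM takes writes, a background thread replays them to 'NVM'."""

    def __init__(self):
        self.dram = {}            # key -> (seq, value); volatile write cache
        self.nvm = {}             # stand-in for NVM-resident KV structures
        self.log = queue.Queue()  # stand-in for the AOF on NVM
        self.seq = 0
        self.lock = threading.Lock()
        self.worker = threading.Thread(target=self._replay, daemon=True)
        self.worker.start()

    def set(self, key, value):
        with self.lock:
            self.seq += 1
            self.dram[key] = (self.seq, value)       # fast path for the client
            self.log.put((self.seq, key, value))     # queued for later replay

    def get(self, key):
        with self.lock:
            entry = self.dram.get(key)
            if entry is not None:
                return entry[1]                      # newest value, not yet replayed
            return self.nvm.get(key)                 # otherwise read from 'NVM'

    def _replay(self):
        while True:
            item = self.log.get()
            if item is None:                         # shutdown sentinel
                self.log.task_done()
                return
            seq, key, value = item
            with self.lock:
                self.nvm[key] = value
                # Evict from DRAM only if no newer write to this key exists.
                if self.dram.get(key, (None,))[0] == seq:
                    del self.dram[key]
            self.log.task_done()

    def drain(self):
        self.log.join()                              # wait until the log is empty

    def close(self):
        self.log.put(None)
        self.worker.join()
```

In this sketch reads always see the latest write, because a key stays in the DRAM tier until its most recent log entry has been replayed.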
Benchmarks on two 96‑core/384 GB servers show SET operation throughput comparable to everysec mode while retaining always‑mode safety.
During restart, native Redis needs ~53 seconds to load a 10 GB RDB, during which the service is unavailable. The NVM‑based solution can serve read‑only requests within 1 second and full read/write service within 35 seconds, dramatically reducing recovery time.
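The article does not spell out how the two recovery phases work. One plausible reading, sketched below purely as an assumption, is that the NVM‑resident KV data is usable immediately after restart (the read‑only phase), and writes are enabled once any log entries that had not yet been replayed are applied. `RecoveringStore` and its method names are hypothetical.

```python
class RecoveringStore:
    """Hypothetical two-phase restart: read-only first, then read/write."""

    def __init__(self, nvm_kv, pending_log):
        self.kv = nvm_kv                  # survived the restart; readable at once
        self.pending = list(pending_log)  # (key, value) entries not yet replayed
        self.writable = False

    def get(self, key):
        # Allowed in both phases; may be stale until replay completes.
        return self.kv.get(key)

    def set(self, key, value):
        if not self.writable:
            raise RuntimeError("read-only: replay still in progress")
        self.kv[key] = value

    def finish_replay(self):
        # Apply the remaining log entries in order, then open for writes.
        for key, value in self.pending:
            self.kv[key] = value
        self.pending.clear()
        self.writable = True
```

Under this reading, the gap between the two figures quoted above would correspond to the time spent in `finish_replay`.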
Native Redis forks a child process to persist the entire DB, causing performance jitter even for minimal changes. The NVM approach provides continuous incremental persistence, offering smoother operation for online services.
NVM’s higher storage density and lower cost enable larger data capacities. A data swap strategy between non‑volatile NVM and volatile DRAM was designed to increase storage capacity without compromising Redis’s baseline performance.
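The swap policy itself is not described in the article. As an illustration only, the sketch below assumes a least‑recently‑used (LRU) policy: hot keys live in a bounded DRAM tier, and the coldest key is swapped out to the larger NVM tier when DRAM is full. `TieredStore` and `dram_capacity` are hypothetical names, and plain Python containers stand in for both tiers.

```python
from collections import OrderedDict


class TieredStore:
    """Hypothetical DRAM/NVM tiering with LRU swap-out (an assumed policy)."""

    def __init__(self, dram_capacity):
        self.dram_capacity = dram_capacity
        self.dram = OrderedDict()  # hot keys, least recently used first
        self.nvm = {}              # stand-in for the larger NVM tier

    def set(self, key, value):
        self.nvm.pop(key, None)    # keep a single copy of each key
        self.dram[key] = value
        self.dram.move_to_end(key)
        self._evict()

    def get(self, key):
        if key in self.dram:
            self.dram.move_to_end(key)       # mark as recently used
            return self.dram[key]
        if key in self.nvm:
            value = self.nvm.pop(key)        # swap in from the NVM tier
            self.dram[key] = value
            self._evict()
            return value
        return None

    def _evict(self):
        while len(self.dram) > self.dram_capacity:
            cold_key, cold_value = self.dram.popitem(last=False)
            self.nvm[cold_key] = cold_value  # swap out the coldest key
```

For example, with `dram_capacity=2`, setting keys `a`, `b`, and `c` in that order leaves `b` and `c` in DRAM and moves `a` to the NVM tier; a later `get("a")` swaps it back in and pushes `b` out.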
Conclusion: By fully exploiting NVM’s byte‑addressability and persistence, and using DRAM as a cache to offset NVM’s write asymmetry, the solution achieves high reliability, high performance, and low cost for Redis. Future work will address replay performance under heavy writes and improve failover handling.
Architects' Tech Alliance
Sharing project experiences, insights into cutting-edge architectures, focusing on cloud computing, microservices, big data, hyper-convergence, storage, data protection, artificial intelligence, industry practices and solutions.