What Is Redis? Exploring Its Data Structures, Persistence, and High‑Availability Features
This comprehensive guide explains Redis as an in‑memory key‑value NoSQL database, covering its core data structures, common use cases, performance advantages, persistence mechanisms, replication, Sentinel, clustering, cache design patterns, and operational best practices for handling large keys, memory limits, and distributed locks.
Fundamentals
1. What is Redis?
Redis is a key‑value based NoSQL database. Unlike simple key‑value stores, Redis values support strings, hashes, lists, sets, sorted sets, bitmaps, HyperLogLog, and GEO data structures, enabling many application scenarios. All data is stored in memory, providing excellent read/write performance, and can be persisted to disk via snapshots and logs to prevent data loss on power failures. Additional features include key expiration, publish/subscribe, transactions, pipelines, and Lua scripting.
2. What can Redis be used for?
Common uses include caching, counters, leaderboards, social networks, message queues, and distributed locks. In an e‑commerce user service, Redis can store tokens, login‑failure counters, address caches, and implement distributed locks.
Cache: reduce data source pressure and improve response speed.
Counter: high‑performance counting for views, likes, etc.
Leaderboard: use sorted sets (lists suffice only for simple "latest N" feeds).
Social network: likes, followers, push, refresh.
Message queue: publish/subscribe and blocking queues.
Distributed lock: synchronize access in distributed environments.
3. What data structures does Redis provide?
Redis offers five basic data structures:
String: the basic type; stores text, numbers, or binary data up to 512 MB. Typical uses: cache, counting, shared session, rate limiting.
Hash: a map of fields to values. Typical uses: cache user info, cache objects.
List: ordered collection of strings that can act as a stack or a queue. Typical uses: message queue, article list.
Set: unordered collection of unique strings. Typical uses: tags, common interests.
Sorted set: a set ordered by a score. Typical uses: like statistics, user rankings.
4. Why is Redis fast?
Completely memory‑based operations.
Single‑threaded execution avoids context switches and race conditions.
Non‑blocking I/O multiplexing.
Implemented in C with optimized data structures.
5. What is I/O multiplexing?
It allows a single thread to monitor multiple sockets simultaneously, using mechanisms such as select, poll, and epoll. This reduces blocking and enables event‑driven (reactor) models.
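The idea can be seen in a few lines of Python: one thread registers several sockets with the OS multiplexer (`selectors.DefaultSelector` picks epoll/kqueue/select automatically) and only services the ones that are ready. The in-process socket pairs below are a stand-in for client connections, not part of the original text.

```python
import selectors
import socket

# One thread watches several sockets; only ready ones are touched.
sel = selectors.DefaultSelector()

# Two in-process socket pairs stand in for client connections.
pairs = [socket.socketpair() for _ in range(2)]
for recv_side, _ in pairs:
    recv_side.setblocking(False)
    sel.register(recv_side, selectors.EVENT_READ)

# Only the second "client" sends data.
pairs[1][1].send(b"PING")

# select() returns only the sockets that actually have data pending.
ready = sel.select(timeout=1)
messages = [key.fileobj.recv(64) for key, _ in ready]
print(messages)  # [b'PING'] — the idle socket never blocks the thread

for a, b in pairs:
    a.close()
    b.close()
```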
6. Why did early Redis use a single thread?
Because Redis is memory‑bound, CPU bottlenecks are rare; the main limitation is memory size or network bandwidth. Multiple instances can be launched on one machine to utilize more CPU cores.
7. How does Redis 6.0 use multiple threads?
Redis 6.0 introduces multithreading for network I/O and protocol parsing, while command execution remains single‑threaded, improving overall performance.
Persistence
8. What persistence methods does Redis have?
Redis supports RDB (snapshot) and AOF (append‑only file) persistence.
RDB
Creates a compact binary snapshot of the dataset. Triggered manually with SAVE (blocks the server while saving) or BGSAVE (forks a child process), or automatically via `save` rules in the configuration. Recovery is fast, but a snapshot is point‑in‑time, so writes since the last snapshot can be lost.
AOF
Appends every write command to a log. Provides real‑time durability and can be configured for different fsync policies (always, everysec, no). AOF files are larger and slower to recover than RDB.
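For reference, the directives controlling both mechanisms live in redis.conf; the thresholds below are the historical defaults, shown as a sample rather than a recommendation:

```conf
# RDB: snapshot if ≥1 write in 900s, ≥10 in 300s, or ≥10000 in 60s
save 900 1
save 300 10
save 60 10000
dbfilename dump.rdb

# AOF: log every write, fsync once per second (the usual trade-off)
appendonly yes
appendfsync everysec
```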
9. Pros and cons of RDB vs AOF
RDB pros: compact file, good disaster recovery, fast load.
RDB cons: lower real‑time durability, compatibility issues across versions.
AOF pros: near‑real‑time durability; a file truncated by a crash can be repaired with `redis-check-aof --fix`.
AOF cons: larger files, slower recovery, higher memory usage during rewrite.
10. How to choose between RDB and AOF?
For maximum data safety, enable both (AOF takes precedence on restart). If a few minutes of data loss is acceptable, use only RDB. For most production workloads, using both combines fast recovery with durability.
High Availability
13. What is master‑slave replication?
One Redis server (the master) replicates its data to one or more slaves. Replication is unidirectional and asynchronous; synchronization after a (re)connect can be either full or partial (PSYNC).
14. Common replication topologies
One master, one slave.
One master, multiple slaves (star topology).
Tree topology where slaves can act as masters for other slaves.
15. Replication workflow
Save master info (IP, port).
Slave connects to master.
Ping exchange.
Authentication if required.
Full data sync (RDB transfer).
Continuous command propagation.
16. Full vs partial sync
Full sync transfers the entire dataset (expensive). Partial sync uses PSYNC with offsets to transfer only missing data after a temporary disconnection.
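The partial-sync decision can be modeled with a toy replication backlog. The class below is a hypothetical simplification (real Redis also checks replication IDs, not just offsets): the master keeps only the last `capacity` bytes of its command stream, and a reconnecting replica gets a partial sync only if its offset still falls inside that window.

```python
class ReplicationBacklog:
    """Toy model of the master's replication backlog used by PSYNC."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.buffer = b""
        self.master_offset = 0  # total bytes ever propagated

    def feed(self, data):
        # Keep only the most recent `capacity` bytes.
        self.buffer = (self.buffer + data)[-self.capacity:]
        self.master_offset += len(data)

    def psync(self, replica_offset):
        missing = self.master_offset - replica_offset
        if 0 <= missing <= len(self.buffer):
            # +CONTINUE: send only what the replica missed
            return ("partial", self.buffer[len(self.buffer) - missing:])
        return ("full", None)  # +FULLRESYNC: transfer a fresh RDB snapshot

backlog = ReplicationBacklog(capacity=8)
backlog.feed(b"SET a 1;")            # replica was connected up to here
backlog.feed(b"SET b 2;")            # replica missed this while disconnected
kind, data = backlog.psync(replica_offset=8)
print(kind, data)  # partial b'SET b 2;'
```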
18. What is Redis Sentinel?
Sentinel monitors masters and slaves, performs automatic failover, provides configuration discovery to clients, and sends notifications. It relies on periodic monitoring, subjective/objective down detection, leader election (Raft‑based), and a defined sequence of failover steps.
22. What is Redis Cluster?
Cluster provides data partitioning (16384 slots) and high availability. Slots are assigned to nodes; each node holds a subset of slots. Automatic failover works similarly to Sentinel but involves all nodes.
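Slot assignment is deterministic: `HASH_SLOT = CRC16(key) mod 16384`, where CRC16 is the XMODEM/CCITT variant, and a `{...}` hash tag restricts hashing to the tagged substring so related keys land on the same node. A sketch of that calculation:

```python
def crc16(data: bytes) -> int:
    # CRC16-CCITT (XMODEM): poly 0x1021, init 0, MSB-first, as in the
    # Redis Cluster specification.
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc

def key_slot(key: str) -> int:
    # Hash tags: only the part between the first non-empty {...} is hashed.
    start = key.find("{")
    if start != -1:
        end = key.find("}", start + 1)
        if end != -1 and end != start + 1:
            key = key[start + 1:end]
    return crc16(key.encode()) % 16384

# Both keys hash the tag "user1000", so they share a slot (and a node):
print(key_slot("{user1000}.following") == key_slot("{user1000}.followers"))
```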
Cache Design
26. Cache breakdown, penetration, and avalanche
Breakdown: hot key expires, causing a surge of DB reads. Penetration: requests for non‑existent data bypass cache, hitting DB. Avalanche: many keys expire simultaneously, overwhelming DB.
27. Bloom filter
Uses multiple hash functions to set bits in a bitmap, allowing fast existence checks with a false‑positive probability.
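A minimal sketch of that mechanism (the sizing and the SHA-256-based hash family are illustrative choices, not what RedisBloom uses internally): k hash functions set k bits per key, membership checks all k bits, and "absent" answers are always correct while "present" answers can be false positives.

```python
import hashlib

class BloomFilter:
    """Minimal Bloom filter: k hashes set k bits per key."""

    def __init__(self, size_bits=1024, num_hashes=4):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(size_bits // 8 + 1)

    def _positions(self, key: str):
        # Derive k independent positions by salting one hash function.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, key: str):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key: str) -> bool:
        # False only if some bit is unset -> the key was never added.
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(key))

bf = BloomFilter()
bf.add("user:42")
print(bf.might_contain("user:42"))   # True
print(bf.might_contain("user:999"))  # almost certainly False
```

Used in front of the database, the filter rejects lookups for keys that were never stored, which is exactly the cache-penetration defense described above.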
28. Ensuring cache‑DB consistency
Common strategies: delete cache after DB write, write‑behind, delayed double delete, message‑queue‑driven invalidation, and setting reasonable TTLs.
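The delayed-double-delete strategy can be sketched with plain dicts standing in for the database and Redis (the key name and the 50 ms delay are illustrative): delete the cache, write the database, then delete the cache again after a short delay so that a stale value re-cached by a concurrent reader is also evicted.

```python
import threading

db = {}     # stands in for the primary database
cache = {}  # stands in for Redis

def update_with_delayed_double_delete(key, value, delay=0.05):
    """1) delete cache  2) write DB  3) delete cache again after `delay`."""
    cache.pop(key, None)        # first delete
    db[key] = value             # write the source of truth

    def second_delete():
        cache.pop(key, None)    # evicts any stale re-cached value

    timer = threading.Timer(delay, second_delete)
    timer.start()
    return timer

cache["stock"] = 10
timer = update_with_delayed_double_delete("stock", 9)
cache["stock"] = 10             # a racing reader re-caches the stale value
timer.join()                    # wait for the second delete to fire
print(cache.get("stock"))  # None — the stale entry is gone
```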
29. Consistency between local and distributed cache
Use Redis Pub/Sub or reliable message queues to broadcast cache‑eviction events to all application nodes, or rely on short TTLs.
30. Handling hot keys
Monitor hot keys via client, proxy, or server, then split them across nodes, use secondary caches, or apply rate limiting.
31. Cache warm‑up
Pre‑load data via manual scripts, application startup, or scheduled jobs.
32. Hot key reconstruction
Use mutex locks, never‑expire logical timestamps, or background rebuild threads to avoid thundering‑herd rebuilds.
33. The "bottomless pit" problem
When many keys are spread across many nodes, batch operations require multiple network round‑trips, degrading performance. Optimizations include command reduction, connection pooling, and NIO.
Operations
34. Out‑of‑memory handling
Increase maxmemory, adjust eviction policy, or scale horizontally with clustering.
35. Expiration policies
Lazy deletion (on access) and periodic random sampling deletion.
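The two policies above combine naturally, as this toy model shows (real Redis tracks expirations in a dedicated dict and tunes the sampling loop, which this sketch omits): a read lazily purges an expired key it touches, while a periodic cycle checks only a random sample rather than the whole keyspace.

```python
import random
import time

class ExpiringDict:
    """Toy model of Redis expiration: lazy + periodic sampled deletion."""

    def __init__(self):
        self.data = {}  # key -> (value, expire_at or None)

    def set(self, key, value, ttl=None):
        expire_at = time.monotonic() + ttl if ttl is not None else None
        self.data[key] = (value, expire_at)

    def get(self, key):
        item = self.data.get(key)
        if item is None:
            return None
        value, expire_at = item
        if expire_at is not None and time.monotonic() >= expire_at:
            del self.data[key]   # lazy deletion: purge on access
            return None
        return value

    def active_expire_cycle(self, sample_size=20):
        # Periodic deletion: inspect a random sample, not every key.
        keys = random.sample(list(self.data), min(sample_size, len(self.data)))
        now = time.monotonic()
        for key in keys:
            _, expire_at = self.data[key]
            if expire_at is not None and now >= expire_at:
                del self.data[key]

d = ExpiringDict()
d.set("session:1", "alice", ttl=0.01)
d.set("config", "v1")          # no TTL
time.sleep(0.02)
print(d.get("session:1"))  # None (expired and lazily deleted)
print(d.get("config"))     # v1
```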
36. Memory eviction policies
noeviction, volatile‑lru, allkeys‑lru, allkeys‑random, volatile‑random, volatile‑ttl.
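To make the `allkeys-lru` idea concrete, here is a sketch using an exact LRU over a fixed number of entries; note real Redis bounds bytes (`maxmemory`), not entry counts, and uses an approximate sampled LRU rather than a precise list.

```python
from collections import OrderedDict

class LRUCache:
    """allkeys-lru sketch: at capacity, evict the least recently used key."""

    def __init__(self, max_entries):
        self.max_entries = max_entries
        self.data = OrderedDict()

    def get(self, key):
        if key not in self.data:
            return None
        self.data.move_to_end(key)   # mark as most recently used
        return self.data[key]

    def set(self, key, value):
        if key in self.data:
            self.data.move_to_end(key)
        self.data[key] = value
        if len(self.data) > self.max_entries:
            evicted, _ = self.data.popitem(last=False)  # drop LRU key
            return evicted

cache = LRUCache(max_entries=2)
cache.set("a", 1)
cache.set("b", 2)
cache.get("a")               # "a" becomes the most recently used
evicted = cache.set("c", 3)  # capacity exceeded
print(evicted)  # b
```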
37. Blocking issues and solutions
Identify slow commands (slowlog), avoid O(N) operations on large data, monitor CPU usage, and tune persistence (fork, AOF fsync, Transparent HugePages).
38. Large‑key problems
Detect with `redis-cli --bigkeys` or redis-rdb-tools. Delete with UNLINK (non‑blocking) or incrementally via SCAN‑family iteration (HSCAN/SSCAN, removing a few fields per pass). Compress or shard large values.
39. Common performance tips
Avoid persistence on masters; use slaves for AOF.
Place master and slaves in the same LAN.
Use chained replication (master←slave1←slave2…) to reduce replication pressure on the master.
Application
40. Implementing an async queue with Redis
Use LIST with LPUSH / BRPOP for blocking consumption, or Pub/Sub for fan‑out (non‑reliable).
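The LPUSH/BRPOP pattern maps onto any blocking queue; this in-process analogue (using Python's `queue.Queue` rather than Redis, and a `None` sentinel of my own choosing to stop the worker) shows the shape of producer and consumer, with `get()` blocking the way BRPOP does.

```python
import queue
import threading

tasks = queue.Queue()   # stands in for a Redis LIST
results = []

def consumer():
    while True:
        task = tasks.get()       # blocks until a task arrives, like BRPOP
        if task is None:         # sentinel: shut the worker down
            break
        results.append(f"done:{task}")

worker = threading.Thread(target=consumer)
worker.start()
tasks.put("send-email")          # LPUSH equivalent
tasks.put("resize-image")
tasks.put(None)
worker.join()
print(results)  # ['done:send-email', 'done:resize-image']
```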
41. Delayed queue
Use ZSET with timestamps as scores; periodically query ZRANGEBYSCORE for ready tasks.
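The same logic in miniature, with a dict playing the role of the ZSET (member → score) and the key names purely illustrative: `push` is ZADD with a ready-at timestamp as the score, and `pop_ready` is ZRANGEBYSCORE from -inf to now followed by ZREM.

```python
import time

class DelayQueue:
    """ZSET delayed-queue sketch: score = timestamp when the task is due."""

    def __init__(self):
        self.zset = {}  # member -> score (ready-at time)

    def push(self, task, delay):
        self.zset[task] = time.monotonic() + delay   # ZADD

    def pop_ready(self):
        now = time.monotonic()
        # ZRANGEBYSCORE key -inf now, ordered by score:
        ready = sorted((t for t, s in self.zset.items() if s <= now),
                       key=lambda t: self.zset[t])
        for t in ready:
            del self.zset[t]                          # ZREM
        return ready

q = DelayQueue()
q.push("order:timeout:1", delay=0.01)
q.push("order:timeout:2", delay=10)
time.sleep(0.02)
print(q.pop_ready())  # ['order:timeout:1'] — the second task is not due yet
```

In production the poll loop runs on a schedule, and the fetch-and-remove step should be atomic (e.g. a Lua script) so two workers cannot claim the same task.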
42. Transactions
Redis supports transactions via MULTI/EXEC: commands are queued until EXEC, then executed sequentially with no interleaving from other clients. There is no rollback; a command that fails at runtime does not abort the remaining commands.
43. Lua scripting
Lua scripts run atomically, can combine multiple commands, reduce network overhead, and create custom commands.
44. Pipelining
Send multiple commands without waiting for replies, reducing RTT and context switches.
45. Distributed lock
Basic lock with SETNX; set the value and expiration atomically with SET key value NX PX <milliseconds> to avoid orphaned locks; or use the Redisson library for a robust implementation.
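The essential lock semantics can be exercised against an in-memory stand-in (the `FakeRedis` class, key name, and TTL below are all illustrative; against real Redis the check-token-then-DEL release must run as one Lua script to stay atomic): acquire with a unique token and a TTL, and let only the holder of that token release.

```python
import time
import uuid

class FakeRedis:
    """In-memory stand-in for SET key value NX PX and token-checked DEL."""

    def __init__(self):
        self.store = {}  # key -> (value, expire_at)

    def set_nx_px(self, key, value, px_ms):
        item = self.store.get(key)
        if item and item[1] > time.monotonic():
            return False                       # lock held and not expired
        self.store[key] = (value, time.monotonic() + px_ms / 1000)
        return True

    def release(self, key, token):
        # Only the owner may delete (atomic in real Redis via Lua).
        item = self.store.get(key)
        if item and item[0] == token:
            del self.store[key]
            return True
        return False

r = FakeRedis()
token = str(uuid.uuid4())        # unique value identifies the lock owner
ok = r.set_nx_px("lock:order", token, px_ms=100)        # acquired
blocked = r.set_nx_px("lock:order", "other", px_ms=100) # contender fails
stolen = r.release("lock:order", "other")               # non-owner: refused
released = r.release("lock:order", token)               # owner: succeeds
print(ok, blocked, stolen, released)  # True False False True
```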
Underlying Structures
46. Core data structures
SDS (dynamic strings), linked list, hash table (dict), skiplist, intset, ziplist.
47. Advantages of SDS over C strings
Stores length for O(1) size, automatic allocation, binary safety, and reduces reallocations.
48. Dictionary implementation and rehashing
Hash table with chaining, two tables during rehash, incremental rehash to avoid long pauses.
49. Skiplist implementation
Multi‑level forward pointers, random level generation, spans for rank calculation, used for sorted sets.
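A stripped-down skiplist makes those mechanics visible. This sketch keeps the random level generation (with the same 0.25 promotion probability Redis uses) and multi-level forward pointers, but omits the spans and backward pointers real Redis maintains for rank queries.

```python
import random

class Node:
    def __init__(self, score, member, level):
        self.score = score
        self.member = member
        self.forward = [None] * level  # one forward pointer per level

class SkipList:
    MAX_LEVEL = 16
    P = 0.25  # promotion probability, as in Redis

    def __init__(self):
        self.head = Node(float("-inf"), None, self.MAX_LEVEL)
        self.level = 1

    def _random_level(self):
        level = 1
        while random.random() < self.P and level < self.MAX_LEVEL:
            level += 1
        return level

    def insert(self, score, member):
        # Walk down from the top level, recording the rightmost node
        # before the insertion point at each level.
        update = [self.head] * self.MAX_LEVEL
        node = self.head
        for i in range(self.level - 1, -1, -1):
            while node.forward[i] and node.forward[i].score < score:
                node = node.forward[i]
            update[i] = node
        level = self._random_level()
        self.level = max(self.level, level)
        new = Node(score, member, level)
        for i in range(level):
            new.forward[i] = update[i].forward[i]
            update[i].forward[i] = new

    def members_in_order(self):
        node, out = self.head.forward[0], []
        while node:                      # level 0 links every node in order
            out.append(node.member)
            node = node.forward[0]
        return out

sl = SkipList()
for score, member in [(300, "carol"), (100, "alice"), (200, "bob")]:
    sl.insert(score, member)
print(sl.members_in_order())  # ['alice', 'bob', 'carol']
```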
50. Ziplist (compressed list)
Memory‑efficient sequential structure storing strings or integers, with a header recording total bytes, tail offset, and entry count. Listpack replaces it starting in Redis 7.0.
51. Quicklist
Combines linked list of ziplist nodes, reducing memory overhead and fragmentation for LIST type.
Other Questions
52. Finding keys with a known prefix among 100 million keys
Use SCAN with a match pattern to iterate without blocking; avoid KEYS on production.
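The contract SCAN offers can be shown with a toy cursor (real SCAN walks hash-table buckets and may return duplicates under rehashing, which this list-based model glosses over): each call returns a small batch plus a cursor, and a cursor of 0 means the iteration is complete, so no single call blocks the server.

```python
def scan(all_keys, cursor, match_prefix, count=2):
    """Toy SCAN: at most `count` keys per call; cursor 0 means done."""
    batch = all_keys[cursor:cursor + count]
    next_cursor = cursor + count if cursor + count < len(all_keys) else 0
    # MATCH filters the batch after it is fetched, just as real SCAN does,
    # so some calls can legitimately return an empty result set.
    return next_cursor, [k for k in batch if k.startswith(match_prefix)]

keys = ["user:1", "order:7", "user:2", "user:3", "cart:9"]
cursor, found = 0, []
while True:
    cursor, batch = scan(keys, cursor, "user:")
    found.extend(batch)
    if cursor == 0:
        break
print(found)  # ['user:1', 'user:2', 'user:3']
```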
Sanyou's Java Diary
Passionate about technology, though not great at solving problems; eager to share, never tire of learning!