
What Is Redis? Exploring Its Data Structures, Persistence, and High‑Availability Features

This comprehensive guide explains Redis as an in‑memory key‑value NoSQL database, covering its core data structures, common use cases, performance advantages, persistence mechanisms, replication, Sentinel, clustering, cache design patterns, and operational best practices for handling large keys, memory limits, and distributed locks.

Sanyou's Java Diary

Fundamentals

1. What is Redis?

Redis is a key-value NoSQL database. Unlike simple key-value stores, Redis values can be strings, hashes, lists, sets, sorted sets, bitmaps, HyperLogLog, and GEO structures, which opens up many application scenarios. All data is held in memory, which gives excellent read/write performance, and it can be persisted to disk through snapshots (RDB) and append-only logs (AOF) to limit data loss after a crash or power failure. Additional features include key expiration, publish/subscribe, transactions, pipelines, and Lua scripting.

2. What can Redis be used for?

Common uses include caching, counters, leaderboards, social networks, message queues, and distributed locks. In an e‑commerce user service, Redis can store tokens, login‑failure counters, address caches, and implement distributed locks.

Cache: reduce data source pressure and improve response speed.

Counter: high‑performance counting for views, likes, etc.

Leaderboard: use lists and sorted sets.

Social network: likes, followers, push, refresh.

Message queue: publish/subscribe and blocking queues.

Distributed lock: synchronize access in distributed environments.

3. What data structures does Redis provide?

Redis offers five basic data structures:

String : basic type, can store text, numbers, binary data up to 512 MB. Typical uses: cache, counting, shared session, rate limiting.

Hash : a map of fields to values. Typical uses: cache user info, cache objects.

List : ordered collection of strings, can act as stack or queue. Typical uses: message queue, article list.

Set : unordered collection of unique strings. Typical uses: tags, common interests.

Sorted set : ordered by a score. Typical uses: like statistics, user ranking.

4. Why is Redis fast?

Completely memory‑based operations.

Single‑threaded execution avoids context switches and race conditions.

Non‑blocking I/O multiplexing.

Implemented in C with optimized data structures.

5. What is I/O multiplexing?

It allows a single thread to monitor multiple sockets simultaneously, using mechanisms such as select, poll, and epoll. This reduces blocking and enables event‑driven (reactor) models.
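
As a rough illustration of the idea (not Redis's own event loop), the sketch below uses Python's standard `selectors` module, which picks epoll, kqueue, or select depending on the platform. The three "clients" are local socket pairs invented for the example; one thread waits on all of them with a single call.

```python
import selectors
import socket

def serve_ready_sockets(pairs):
    """Read one message from each socket using a single thread."""
    sel = selectors.DefaultSelector()  # epoll/kqueue/select, chosen per platform
    for name, sock in pairs.items():
        sock.setblocking(False)
        sel.register(sock, selectors.EVENT_READ, data=name)

    received = {}
    while len(received) < len(pairs):
        # One call waits on every registered socket at once.
        for key, _events in sel.select(timeout=1.0):
            received[key.data] = key.fileobj.recv(1024)
            sel.unregister(key.fileobj)
    sel.close()
    return received

# Hypothetical client connections, modelled with local socket pairs.
clients, servers = {}, {}
for name in ("a", "b", "c"):
    client_end, server_end = socket.socketpair()
    clients[name], servers[name] = client_end, server_end
    client_end.sendall(f"PING from {name}".encode())

result = serve_ready_sockets(servers)
for s in list(clients.values()) + list(servers.values()):
    s.close()
```

The thread only touches sockets that the selector reports as readable, so it never blocks on an idle connection.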

6. Why did early Redis use a single thread?

Because Redis works on in-memory data, the CPU is rarely the bottleneck; the limiting factor is usually memory size or network bandwidth. To use more CPU cores, several Redis instances can be run on one machine.

7. How does Redis 6.0 use multiple threads?

Redis 6.0 introduces multithreading for network I/O and protocol parsing, while command execution remains single‑threaded, improving overall performance.

Persistence

8. What persistence methods does Redis have?

Redis supports RDB (snapshot) and AOF (append‑only file) persistence.

RDB

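Creates a compact binary snapshot of the dataset. Triggered manually (SAVE, which blocks the server, or BGSAVE, which forks a child process) or automatically based on configuration. Loading a snapshot is fast, but writes made after the last snapshot are lost if the server crashes.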

AOF

Appends every write command to a log. Durability depends on the fsync policy (always, everysec, no); with everysec, roughly the last second of writes can be lost on a crash. AOF files are larger than RDB files and slower to load on recovery.
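
To make the append-then-replay idea concrete, here is a toy model, not Redis's actual file format (real AOF files store commands in the Redis protocol). `TinyAofStore` and its plain-text log are invented for this sketch; the `fsync_always` flag only loosely mirrors the `always` policy.

```python
import os
import tempfile

class TinyAofStore:
    """Toy key-value store that logs every write before applying it."""

    def __init__(self, path, fsync_always=False):
        self.path = path
        self.fsync_always = fsync_always
        self.data = {}
        self._replay()
        self.log = open(path, "a", encoding="utf-8")

    def _replay(self):
        # Recovery: re-execute every logged command in order.
        if not os.path.exists(self.path):
            return
        with open(self.path, encoding="utf-8") as f:
            for line in f:
                self._apply(line.rstrip("\n").split(" ", 2))

    def _apply(self, parts):
        if parts[0] == "SET":
            self.data[parts[1]] = parts[2]
        elif parts[0] == "DEL":
            self.data.pop(parts[1], None)

    def execute(self, *parts):
        self.log.write(" ".join(parts) + "\n")
        self.log.flush()
        if self.fsync_always:
            os.fsync(self.log.fileno())  # force the write to disk every time
        self._apply(list(parts))

    def close(self):
        self.log.close()

path = os.path.join(tempfile.mkdtemp(), "toy.aof")
store = TinyAofStore(path, fsync_always=True)
store.execute("SET", "user:1", "alice")
store.execute("SET", "user:2", "bob")
store.execute("DEL", "user:2")
store.close()

recovered = TinyAofStore(path)  # simulated restart: state is rebuilt from the log
recovered_data = dict(recovered.data)
recovered.close()
```

Replaying every command is why recovery from a long log is slower than loading a snapshot, and why the log is periodically rewritten.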

9. Pros and cons of RDB vs AOF

RDB pros: compact file, good disaster recovery, fast load.

RDB cons: writes since the last snapshot can be lost, and RDB files written by a newer Redis version may not load on an older one.

AOF pros: real‑time durability, can recover from partial writes.

AOF cons: larger files, slower recovery, higher memory usage during rewrite.

10. How to choose between RDB and AOF?

For maximum data safety, enable both (AOF takes precedence on restart). If a few minutes of data loss is acceptable, use only RDB. For most production workloads, using both combines fast recovery with durability.

High Availability

13. What is master‑slave replication?

One Redis server (the master) replicates its data to one or more slaves. Replication is unidirectional, from master to slave, and asynchronous. A slave catches up through either a full or a partial resynchronization.

14. Common replication topologies

One master, one slave.

One master, multiple slaves (star topology).

Tree topology where slaves can act as masters for other slaves.

15. Replication workflow

Save master info (IP, port).

Slave connects to master.

Ping exchange.

Authentication if required.

Full data sync (RDB transfer).

Continuous command propagation.

16. Full vs partial sync

Full sync transfers the entire dataset (expensive). Partial sync uses PSYNC with offsets to transfer only missing data after a temporary disconnection.
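
The decision between the two can be modelled with a bounded backlog and an offset, as in the sketch below. This is a simplification under stated assumptions: `ToyMaster` is invented for illustration, offsets count commands here (Redis counts bytes), and the replication ID check that Redis also performs is left out.

```python
from collections import deque

class ToyMaster:
    def __init__(self, backlog_size):
        self.offset = 0                            # commands propagated so far
        self.backlog = deque(maxlen=backlog_size)  # recent (offset, command) pairs
        self.dataset = {}

    def write(self, key, value):
        self.dataset[key] = value
        self.offset += 1
        self.backlog.append((self.offset, (key, value)))

    def sync(self, replica_offset):
        """Return ("partial", missing_commands) or ("full", snapshot)."""
        oldest = self.backlog[0][0] if self.backlog else self.offset + 1
        next_needed = replica_offset + 1
        if oldest <= next_needed <= self.offset + 1:
            missing = [cmd for off, cmd in self.backlog if off > replica_offset]
            return "partial", missing
        # The commands the replica needs have already left the backlog.
        return "full", dict(self.dataset)

master = ToyMaster(backlog_size=3)
for i in range(1, 6):
    master.write(f"k{i}", i)

short_outage = master.sync(replica_offset=4)  # missed only the last command
long_outage = master.sync(replica_offset=1)   # needs offset 2, which is gone
```

A larger backlog tolerates longer disconnections before a replica has to fall back to a full sync.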

18. What is Redis Sentinel?

Sentinel monitors masters and slaves, performs automatic failover, provides configuration to clients, and sends notifications. It relies on periodic monitoring, subjective and objective down detection, a Raft-like leader election among the Sentinels, and a failover procedure run by the elected leader.

22. What is Redis Cluster?

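Cluster provides data partitioning (16384 slots) and high availability. Slots are assigned to nodes, and each master holds a subset of the slots. Automatic failover resembles Sentinel's, except that failure detection and voting are carried out by the cluster's own master nodes rather than by separate Sentinel processes.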
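
For reference, the Redis Cluster specification maps a key to a slot as CRC16(key) mod 16384, using the XMODEM variant of CRC16, and hashes only the hash tag when the key contains a non-empty `{...}` section. The sketch below reimplements that rule in plain Python; the function names are made up for this example.

```python
def crc16_xmodem(data: bytes) -> int:
    """CRC16 with polynomial 0x1021 and initial value 0 (XMODEM variant)."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc

def key_slot(key: str) -> int:
    raw = key.encode()
    start = raw.find(b"{")
    if start != -1:
        end = raw.find(b"}", start + 1)
        if end > start + 1:  # a closing brace exists and the tag is non-empty
            raw = raw[start + 1:end]
    return crc16_xmodem(raw) % 16384

# Keys that share a hash tag land in the same slot, so multi-key
# commands on them can run on a single node.
same_slot = key_slot("user:{42}:profile") == key_slot("user:{42}:orders")
```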

Cache Design

26. Cache breakdown, penetration, and avalanche

Breakdown: a hot key expires, causing a surge of DB reads.

Penetration: requests for non-existent data bypass the cache and hit the DB.

Avalanche: many keys expire simultaneously, overwhelming the DB.

27. Bloom filter

Uses multiple hash functions to set bits in a bitmap, allowing fast existence checks with a false‑positive probability.
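
A minimal sketch of the structure follows. The sizes and the way the hash functions are derived (SHA-256 with a seed prefix) are choices made for this example, not a tuned or production configuration.

```python
import hashlib

class BloomFilter:
    def __init__(self, size_bits=1024, num_hashes=3):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((size_bits + 7) // 8)

    def _positions(self, item: str):
        # Derive several hash functions by prefixing a seed to the item.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item: str) -> bool:
        # False means "definitely absent"; True means "possibly present".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )

known_ids = BloomFilter()
for user_id in ("1001", "1002", "1003"):
    known_ids.add(user_id)
```

Placed in front of the cache, a filter like this lets requests for IDs that are definitely absent be rejected before they reach the database, which is how it helps against cache penetration.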

28. Ensuring cache‑DB consistency

Common strategies: delete cache after DB write, write‑behind, delayed double delete, message‑queue‑driven invalidation, and setting reasonable TTLs.
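
The first strategy, deleting the cache entry after the DB write, can be sketched with two dictionaries standing in for the database and Redis. `CacheAsideRepo` is a hypothetical name, and the sketch ignores concurrency, which is where delayed double delete and TTLs come in.

```python
class CacheAsideRepo:
    def __init__(self, db, cache):
        self.db = db        # stand-in for the database
        self.cache = cache  # stand-in for Redis
        self.db_reads = 0

    def get(self, key):
        if key in self.cache:
            return self.cache[key]
        self.db_reads += 1
        value = self.db.get(key)
        if value is not None:
            self.cache[key] = value  # populate the cache on a miss
        return value

    def update(self, key, value):
        self.db[key] = value       # 1. write the source of truth first
        self.cache.pop(key, None)  # 2. invalidate rather than overwrite

repo = CacheAsideRepo(db={"user:1": "alice"}, cache={})
first = repo.get("user:1")   # miss: read from DB, fill cache
repo.update("user:1", "alicia")
second = repo.get("user:1")  # miss again: the stale entry was deleted
```

Deleting instead of updating the cached value means the next reader reloads from the database, so two concurrent writers cannot leave their values in the cache in the wrong order.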

29. Consistency between local and distributed cache

Use Redis Pub/Sub or reliable message queues to broadcast cache‑eviction events to all application nodes, or rely on short TTLs.

30. Handling hot keys

Monitor hot keys via client, proxy, or server, then split them across nodes, use secondary caches, or apply rate limiting.

31. Cache warm‑up

Pre‑load data via manual scripts, application startup, or scheduled jobs.

32. Hot key reconstruction

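Use a mutex so that only one caller rebuilds the entry, use logical expiration (the key has no TTL and an expiry timestamp is stored with the value), or rebuild in a background thread, so that a thundering herd of callers does not rebuild the same key at once.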
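
The mutex approach can be sketched inside a single process as below. This is only an analogy: across several application nodes the lock would have to be a distributed one, and `SingleFlightCache` and `slow_loader` are names invented for the example.

```python
import threading
import time

class SingleFlightCache:
    def __init__(self, loader):
        self._loader = loader
        self._cache = {}
        self._lock = threading.Lock()  # one global lock keeps the sketch short
        self.loads = 0

    def get(self, key):
        value = self._cache.get(key)
        if value is not None:
            return value
        with self._lock:
            # Double-check: another thread may have rebuilt it while we waited.
            value = self._cache.get(key)
            if value is None:
                self.loads += 1
                value = self._loader(key)
                self._cache[key] = value
            return value

def slow_loader(key):
    time.sleep(0.05)  # stands in for an expensive DB query
    return f"value-for-{key}"

cache = SingleFlightCache(slow_loader)
results = []
threads = [
    threading.Thread(target=lambda: results.append(cache.get("hot")))
    for _ in range(8)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Eight concurrent readers trigger a single load; the other seven wait briefly and then read the rebuilt value.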

33. The "bottomless pit" problem

When many keys are spread across many nodes, batch operations require multiple network round‑trips, degrading performance. Optimizations include command reduction, connection pooling, and NIO.

Operations

34. Out‑of‑memory handling

Increase maxmemory, adjust eviction policy, or scale horizontally with clustering.

35. Expiration policies

Lazy deletion (on access) and periodic random sampling deletion.
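
Both mechanisms can be modelled in a few lines. The class below is a toy with an injectable clock; in Redis the periodic cycle also repeats when a large share of the sampled keys turn out to be expired, which this sketch leaves out.

```python
import random

class ExpiringStore:
    def __init__(self, clock):
        self._clock = clock
        self._data = {}     # key -> value
        self._expires = {}  # key -> absolute expiry time

    def set(self, key, value, ttl=None):
        self._data[key] = value
        if ttl is None:
            self._expires.pop(key, None)
        else:
            self._expires[key] = self._clock() + ttl

    def _expired(self, key):
        deadline = self._expires.get(key)
        return deadline is not None and deadline <= self._clock()

    def _delete(self, key):
        self._data.pop(key, None)
        self._expires.pop(key, None)

    def get(self, key):
        # Lazy deletion: the key is only checked when someone asks for it.
        if self._expired(key):
            self._delete(key)
            return None
        return self._data.get(key)

    def active_expire_cycle(self, sample_size=20):
        # Periodic deletion: test a random sample of keys that carry a TTL.
        candidates = list(self._expires)
        sample = random.sample(candidates, min(sample_size, len(candidates)))
        removed = 0
        for key in sample:
            if self._expired(key):
                self._delete(key)
                removed += 1
        return removed

    def __len__(self):
        return len(self._data)
```

Lazy deletion alone would leave never-read keys in memory forever, which is why the sampling pass exists alongside it.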

36. Memory eviction policies

noeviction, volatile-lru, allkeys-lru, allkeys-random, volatile-random, volatile-ttl. Redis 4.0 added two LFU-based policies, volatile-lfu and allkeys-lfu.
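
As an illustration of what allkeys-lru aims for, here is an exact LRU cache bounded by key count. Two simplifications are assumed: Redis bounds memory in bytes (maxmemory) rather than by number of keys, and it approximates LRU by sampling a few keys instead of keeping a full ordering.

```python
from collections import OrderedDict

class LruCache:
    def __init__(self, max_keys):
        self.max_keys = max_keys
        self._items = OrderedDict()  # least recently used first

    def get(self, key):
        if key not in self._items:
            return None
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def set(self, key, value):
        self._items[key] = value
        self._items.move_to_end(key)
        evicted = []
        while len(self._items) > self.max_keys:
            old_key, _ = self._items.popitem(last=False)
            evicted.append(old_key)
        return evicted

cache_lru = LruCache(max_keys=2)
cache_lru.set("a", 1)
cache_lru.set("b", 2)
cache_lru.get("a")                # "a" is now fresher than "b"
evicted = cache_lru.set("c", 3)   # over the limit: the oldest key goes
```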

37. Blocking issues and solutions

Identify slow commands (slowlog), avoid O(N) operations on large data, monitor CPU usage, and tune persistence (fork, AOF fsync, Transparent HugePages).

38. Large‑key problems

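Detect with redis-cli --bigkeys or redis-rdb-tools. Delete with UNLINK (reclaims memory in a background thread, available since Redis 4.0), or shrink large collections step by step with HSCAN, SSCAN, or ZSCAN before deleting them. Compress or shard large values.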

39. Common performance tips

Avoid persistence on masters; use slaves for AOF.

Place master and slaves in the same LAN.

Use chain replication (master←slave1←slave2…) for easier failover.

Application

40. Implementing an async queue with Redis

Use a LIST with LPUSH / BRPOP for blocking consumption, or Pub/Sub for fan-out. Pub/Sub does not store messages, so subscribers that are offline miss them.

41. Delayed queue

Use ZSET with timestamps as scores; periodically query ZRANGEBYSCORE for ready tasks.
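
The polling loop can be sketched against an in-memory stand-in for a sorted set, so that it runs without a server. `ToySortedSet` mimics only the semantics of ZADD, ZRANGEBYSCORE, and ZREM that the pattern needs; its method signatures are simplified and are not those of any client library.

```python
class ToySortedSet:
    """In-memory stand-in for one Redis sorted set (member -> score)."""

    def __init__(self):
        self._scores = {}

    def zadd(self, member, score):
        self._scores[member] = score

    def zrangebyscore(self, low, high):
        hits = [(s, m) for m, s in self._scores.items() if low <= s <= high]
        return [m for _s, m in sorted(hits)]

    def zrem(self, member):
        # Like ZREM, report how many members were actually removed.
        return 1 if self._scores.pop(member, None) is not None else 0

def schedule(zset, task_id, run_at):
    zset.zadd(task_id, run_at)  # score = timestamp at which the task is due

def poll_ready(zset, now):
    ready = []
    for task_id in zset.zrangebyscore(float("-inf"), now):
        # Only the poller whose removal succeeds owns the task, so several
        # workers polling the same set do not run a task twice.
        if zset.zrem(task_id) == 1:
            ready.append(task_id)
    return ready

queue = ToySortedSet()
schedule(queue, "t1", run_at=100)
schedule(queue, "t2", run_at=200)
schedule(queue, "t3", run_at=150)
due_at_160 = poll_ready(queue, now=160)
due_again = poll_ready(queue, now=160)
due_at_250 = poll_ready(queue, now=250)
```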

42. Transactions

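Redis supports transactions through MULTI/EXEC: commands issued after MULTI are queued and then executed as one uninterrupted batch on EXEC. There is no rollback; if one command fails at runtime, the others in the batch still take effect.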

43. Lua scripting

Lua scripts run atomically, can combine multiple commands, reduce network overhead, and create custom commands.

44. Pipelining

Send multiple commands without waiting for each reply, which cuts the number of network round trips and the system calls needed to read and write the socket.

45. Distributed lock

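A basic lock uses SETNX. To avoid a lock that is never released, set the value and the expiration atomically with SET key value NX EX seconds, or use the Redisson library for a more robust implementation.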
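
The sketch below models the lock's semantics in memory so that it runs without a server. `ToyRedis` is a stand-in that imitates only SET with NX and EX plus a compare-and-delete. The random token per holder is an addition not mentioned above: it keeps a client whose lock already expired from deleting someone else's lock. On a real server the compare-and-delete step is commonly done in a Lua script so that it stays atomic.

```python
import uuid

class ToyRedis:
    """Models only SET key value NX EX ttl and a compare-and-delete."""

    def __init__(self, clock):
        self._clock = clock
        self._data = {}  # key -> (value, expires_at)

    def _live(self, key):
        entry = self._data.get(key)
        if entry is not None and entry[1] <= self._clock():
            del self._data[key]  # the key has expired
            entry = None
        return entry

    def set_nx_ex(self, key, value, ttl):
        if self._live(key) is not None:
            return False  # NX: refuse when the key already exists
        self._data[key] = (value, self._clock() + ttl)
        return True

    def delete_if_equals(self, key, value):
        entry = self._live(key)
        if entry is not None and entry[0] == value:
            del self._data[key]
            return True
        return False

def acquire(store, name, ttl):
    token = uuid.uuid4().hex  # identifies this particular holder
    return token if store.set_nx_ex(name, token, ttl) else None

def release(store, name, token):
    return store.delete_if_equals(name, token)
```

The expiration bounds how long a crashed holder can block others, and the token check makes a late release by a stale holder a no-op.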

Underlying Structures

46. Core data structures

SDS (dynamic strings), linked list, hash table (dict), skiplist, intset, ziplist.

47. Advantages of SDS over C strings

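SDS stores its length, so reading the size is O(1); it grows its buffer automatically, which prevents overflows; it is binary safe because it does not rely on a terminating NUL byte; and it preallocates spare space to reduce the number of reallocations.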

48. Dictionary implementation and rehashing

Hash table with chaining, two tables during rehash, incremental rehash to avoid long pauses.
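
A toy version of the scheme is sketched below, under simplifying assumptions: `ToyDict` is an invented name, the table doubles when the number of entries reaches the number of buckets, and each operation migrates one non-empty bucket. Redis's dict differs in details such as sizing rules and how many empty buckets it will skip per step.

```python
class ToyDict:
    def __init__(self, size=4):
        self.tables = [[[] for _ in range(size)], None]  # ht[0], ht[1]
        self.rehash_idx = -1  # -1 means no rehash in progress
        self.used = 0

    def _step(self):
        """Migrate one non-empty bucket from ht[0] to ht[1]."""
        if self.rehash_idx == -1:
            return
        old, new = self.tables
        while self.rehash_idx < len(old) and not old[self.rehash_idx]:
            self.rehash_idx += 1  # skip empty buckets
        if self.rehash_idx < len(old):
            for key, value in old[self.rehash_idx]:
                new[hash(key) % len(new)].append((key, value))
            old[self.rehash_idx] = []
            self.rehash_idx += 1
        if self.rehash_idx >= len(old):
            self.tables = [new, None]  # the new table becomes ht[0]
            self.rehash_idx = -1

    def _find(self, key):
        # During a rehash the key may live in either table.
        for table in self.tables:
            if table is None:
                continue
            bucket = table[hash(key) % len(table)]
            for i, (k, _v) in enumerate(bucket):
                if k == key:
                    return bucket, i
        return None, -1

    def set(self, key, value):
        self._step()
        bucket, i = self._find(key)
        if bucket is not None:
            bucket[i] = (key, value)
            return
        if self.rehash_idx == -1 and self.used >= len(self.tables[0]):
            self.tables[1] = [[] for _ in range(len(self.tables[0]) * 2)]
            self.rehash_idx = 0
        # While rehashing, new keys go straight into the new table.
        target = self.tables[1] if self.rehash_idx != -1 else self.tables[0]
        target[hash(key) % len(target)].append((key, value))
        self.used += 1

    def get(self, key):
        self._step()
        bucket, i = self._find(key)
        return bucket[i][1] if bucket is not None else None
```

Because every read and write pays for one small migration step, no single operation has to move the whole table.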

49. Skiplist implementation

Multi‑level forward pointers, random level generation, spans for rank calculation, used for sorted sets.

50. Ziplist (compressed list)

A memory-efficient structure laid out in one contiguous block. Its header records the total byte length, the offset of the tail entry, and the entry count; each entry holds a string or an integer.

51. Quicklist

A doubly linked list whose nodes are ziplists. It backs the LIST type and reduces per-element pointer overhead and memory fragmentation compared with a plain linked list.

Other Questions

52. Finding keys with a known prefix among 100 million keys

Use SCAN with a MATCH pattern to iterate in small batches without blocking the server; avoid KEYS in production, because it walks the whole keyspace in a single blocking call.
