Unlocking Redis Hash Internals: How Dict and Listpack Boost Performance
This article explains the inner workings of Redis Hashes, covering the dict and listpack storage mechanisms, memory‑optimizing conditions, rehashing strategies, and the underlying C structures that enable fast O(1) field lookups and efficient memory usage.
1. What Is a Redis Hash?
Redis Hash (a hash table) stores field‑value pairs similar to Python dictionaries or Java HashMaps, allowing O(1) lookup, update, or deletion of a field.
The underlying dict consists of an array + linked list where each array slot is a hash bucket ; collisions are resolved with chaining (linked lists).
2. Underlying Storage Strategies
Redis Hashes can be stored using two structures:
dict structure.
listpack (formerly ziplist before Redis 7.0).
By default, dict is used, with each field‑value pair forming a dictEntry node.
listpack is chosen only when both conditions are met:
Both field and value string sizes are smaller than the hash-max-listpack-value configuration (default 64 bytes).
The number of field‑value pairs is less than hash-max-listpack-entries (default 512).
When inserting or updating data, Redis calls hashTypeConvertListpack() in t_hash.c to decide whether to switch the underlying structure. The structure never reverts from dict to listpack.
Using listpack reduces memory consumption significantly while keeping performance comparable for small datasets.
An abstract hashTypeIterator hides the underlying storage differences.
3. Dict Structure Details
struct dict {
dictType *type;
dictEntry **ht_table[2];
unsigned long ht_used[2];
long rehashidx;
int16_t pauserehash;
signed char ht_size_exp[2];
}; dictType *type: function pointers for custom key/value handling. dictEntry **ht_table[2]: two arrays of pointers to dictEntry nodes (the two hash tables used during rehash). ht_used[2]: number of slots used in each table. rehashidx: index of the bucket currently being rehashed; -1 means no rehash. pauserehash: indicates whether rehash is paused (>0) or errored (<0).
The dictEntry node stores the actual key‑value pair:
typedef struct dictEntry {
void *key;
union {
void *val;
uint64_t u64;
int64_t s64;
double d;
} v;
struct dictEntry *next;
} dictEntry; *keypoints to the key (an SDS string). v is a union that holds the value in one of several forms to save memory. *next links entries that share the same bucket (hash collisions) using chaining.
MySQL: “Why does ht_table[2] store two pointers to hash tables? Isn’t one enough?”
Redis normally reads/writes using ht_table[0]. When the hash table grows, a new ht_table[1] is created and entries are gradually moved.
4. Expansion, Contraction, and Load Factor
Redis expands or contracts the hash table based on the load factor ( number of dictEntry nodes / number of buckets). Expansion occurs when no BGSAVE or BGREWRITEAOF is running and the load factor ≥ 1, or when those commands are running and the load factor ≥ 5.
During expansion, a new table twice the size of the current one is allocated ( ht_table[1]), and entries are rehashed into it. After migration, ht_table[0] points to the new table and the old one is freed.
5. Progressive Rehash
To avoid blocking the main thread, Redis performs incremental rehashing:
Set rehashidx to 0 to start.
On each client operation, if rehash is active, move the bucket at rehashidx from ht_table[0] to ht_table[1] and increment rehashidx.
When all buckets are moved, set rehashidx to -1 to finish.
MySQL: “During rehash, do delete/find/update/add operations need to check both tables?”
Deletes, finds, and updates may need to look in both tables; adds are performed only in the new table.
MySQL: “If traffic is low, will both tables stay active for a long time?”
Redis registers a periodic serverCron event that assists rehash and performs other maintenance tasks such as key expiration, monitoring, statistics updates, and handling client timeouts.
Expire keys.
Monitor server status.
Update statistics.
Progressive rehash.
Trigger BGSAVE/AOF rewrite and stop child processes.
Handle client timeouts.
Thus, Redis’s hash implementation balances high performance with memory efficiency, using dict for general cases and listpack for small, dense datasets, while employing incremental rehash to maintain responsiveness.
Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
Su San Talks Tech
Su San, former staff at several leading tech companies, is a top creator on Juejin and a premium creator on CSDN, and runs the free coding practice site www.susan.net.cn.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
