Unlocking Redis Hash Internals: How Dict and Listpack Boost Performance

This article explains the inner workings of Redis Hashes, covering the dict and listpack storage mechanisms, memory‑optimizing conditions, rehashing strategies, and the underlying C structures that enable fast O(1) field lookups and efficient memory usage.

Su San Talks Tech
Su San Talks Tech
Su San Talks Tech
Unlocking Redis Hash Internals: How Dict and Listpack Boost Performance

1. What Is a Redis Hash?

Redis Hash (a hash table) stores field‑value pairs similar to Python dictionaries or Java HashMaps, allowing O(1) lookup, update, or deletion of a field.

The underlying dict consists of an array + linked list where each array slot is a hash bucket ; collisions are resolved with chaining (linked lists).

2. Underlying Storage Strategies

Redis Hashes can be stored using two structures:

dict structure.

listpack (formerly ziplist before Redis 7.0).

By default, dict is used, with each field‑value pair forming a dictEntry node.

listpack is chosen only when both conditions are met:

Both field and value string sizes are smaller than the hash-max-listpack-value configuration (default 64 bytes).

The number of field‑value pairs is less than hash-max-listpack-entries (default 512).

When inserting or updating data, Redis calls hashTypeConvertListpack() in t_hash.c to decide whether to switch the underlying structure. The structure never reverts from dict to listpack.

Using listpack reduces memory consumption significantly while keeping performance comparable for small datasets.

An abstract hashTypeIterator hides the underlying storage differences.

3. Dict Structure Details

struct dict {
    dictType *type;
    dictEntry **ht_table[2];
    unsigned long ht_used[2];
    long rehashidx;
    int16_t pauserehash;
    signed char ht_size_exp[2];
};
dictType *type

: function pointers for custom key/value handling. dictEntry **ht_table[2]: two arrays of pointers to dictEntry nodes (the two hash tables used during rehash). ht_used[2]: number of slots used in each table. rehashidx: index of the bucket currently being rehashed; -1 means no rehash. pauserehash: indicates whether rehash is paused (>0) or errored (<0).

The dictEntry node stores the actual key‑value pair:

typedef struct dictEntry {
    void *key;
    union {
        void *val;
        uint64_t u64;
        int64_t s64;
        double d;
    } v;
    struct dictEntry *next;
} dictEntry;
*key

points to the key (an SDS string). v is a union that holds the value in one of several forms to save memory. *next links entries that share the same bucket (hash collisions) using chaining.

MySQL: “Why does ht_table[2] store two pointers to hash tables? Isn’t one enough?”

Redis normally reads/writes using ht_table[0]. When the hash table grows, a new ht_table[1] is created and entries are gradually moved.

4. Expansion, Contraction, and Load Factor

Redis expands or contracts the hash table based on the load factor ( number of dictEntry nodes / number of buckets). Expansion occurs when no BGSAVE or BGREWRITEAOF is running and the load factor ≥ 1, or when those commands are running and the load factor ≥ 5.

During expansion, a new table twice the size of the current one is allocated ( ht_table[1]), and entries are rehashed into it. After migration, ht_table[0] points to the new table and the old one is freed.

5. Progressive Rehash

To avoid blocking the main thread, Redis performs incremental rehashing:

Set rehashidx to 0 to start.

On each client operation, if rehash is active, move the bucket at rehashidx from ht_table[0] to ht_table[1] and increment rehashidx.

When all buckets are moved, set rehashidx to -1 to finish.

MySQL: “During rehash, do delete/find/update/add operations need to check both tables?”

Deletes, finds, and updates may need to look in both tables; adds are performed only in the new table.

MySQL: “If traffic is low, will both tables stay active for a long time?”

Redis registers a periodic serverCron event that assists rehash and performs other maintenance tasks such as key expiration, monitoring, statistics updates, and handling client timeouts.

Expire keys.

Monitor server status.

Update statistics.

Progressive rehash.

Trigger BGSAVE/AOF rewrite and stop child processes.

Handle client timeouts.

Thus, Redis’s hash implementation balances high performance with memory efficiency, using dict for general cases and listpack for small, dense datasets, while employing incremental rehash to maintain responsiveness.

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Sign in to view source
Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactadmin@besthub.devand we will review it promptly.

BackendMemory OptimizationredisCData StructureHashRehash
Su San Talks Tech
Written by

Su San Talks Tech

Su San, former staff at several leading tech companies, is a top creator on Juejin and a premium creator on CSDN, and runs the free coding practice site www.susan.net.cn.

0 followers
Reader feedback

How this landed with the community

Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.