Databases 26 min read

Understanding and Optimizing Redis Rehash Mechanism and Scan Behavior

The article details how Redis’s incremental rehashing can double memory usage and trigger massive key eviction and scan inconsistencies in large‑scale clusters, explains the underlying dictionary structures, demonstrates the cursor bug when tables shrink, and presents a memory‑guard and scan‑cursor patch that resolves both problems.

Meituan Technology Team

Jul 26, 2018

Understanding and Optimizing Redis Rehash Mechanism and Scan Behavior

Squirrel is Meituan's in‑house cache system built on top of Redis Cluster. It has been in production since 2015, serving more than 60 TB of data and handling trillions of requests per day. During large‑scale operation the team discovered several pitfalls of Redis, especially related to the Rehash mechanism, which can cause massive key eviction and inconsistencies between master and slave nodes.

Root cause analysis : When a Redis instance reaches its maximum memory and triggers eviction, the slave has one less repl‑backlog buffer than the master. The team initially looked for external factors (client connections, network jitter, etc.) but found none. By digging into the Redis source code they identified the Rehash process as the culprit.

Redis Rehash internals : Redis stores key‑value pairs in a dictionary ( dict) which is implemented as a hash table ( dictht). Each dictionary contains two hash tables ( ht[0] and ht[1]) – the second one is used during incremental rehashing. The relevant structures are:

/* hash table structure definition */
typedef struct dictht {
    dictEntry **table;   // array of buckets
    unsigned long size;  // number of buckets
    unsigned long sizemask; // size‑1, used for indexing
    unsigned long used; // number of entries
} dictht;

/* bucket (hash entry) */
typedef struct dictEntry {
    void *key;
    union {
        void *val;
        uint64_t u64;
        int64_t s64;
        double d;
    } v;
    struct dictEntry *next; // linked‑list for collisions
} dictEntry;

/* dictionary structure */
typedef struct dict {
    dictType *type;
    void *privdata;
    dictht ht[2];
    long rehashidx; // -1 means no rehash in progress
    int iterators;
} dict;

During normal operation only ht[0] is used. When the load factor exceeds a threshold, dictExpand() creates ht[1] and starts moving entries from ht[0] to ht[1] incrementally (the Rehash process). The expansion size is the smallest power‑of‑two greater than the required size, starting from the default DICT_HT_INITIAL_SIZE = 4. Memory consumption of the new table can be calculated as new_size * sizeof(dictEntry*) (8 bytes per pointer on a 64‑bit OS).

The article provides a table mapping ht[0].size to the memory allocated for ht[1] during expansion (e.g., size 4 → 64 bytes, size 8 → 128 bytes, …).

Impact on eviction : When the dictionary expands, both master and slave experience a burst of key eviction because the rehashing process temporarily doubles the memory footprint and forces the eviction policy to run on both sides, leading to master/slave divergence.

Scan command pitfalls : Redis SCAN is used to iterate over keys without blocking the server. Its cursor‑based design works fine when the dictionary is stable, but during Rehash (especially when the hash table shrinks) it can miss keys or return duplicates. The four dictionary states are:

Stable size (no resize)

Resize completed – table grew

Resize completed – table shrank

Rehashing – incremental grow or shrink

When the dictionary shrinks, the high‑order bits used by the cursor may skip buckets in the larger table, causing some keys to be invisible to SCAN. The article demonstrates this with concrete cursor values (e.g., cursor 20 during a shrink from size 32 to 8).

To fix the issue the team modified the scan implementation to always advance the cursor using the high‑order bits of the larger table, ensuring every bucket is visited regardless of growth or shrinkage. The patched code looks like:

unsigned long dictScan(dict *d, unsigned long v, dictScanFunction *fn,
                         dictScanBucketFunction *bucketfn, void *privdata) {
    dictht *t0, *t1;
    const dictEntry *de, *next;
    unsigned long m0, m1;

    if (dictSize(d) == 0) return 0;

    if (!dictIsRehashing(d)) {
        t0 = &(d->ht[0]);
        m0 = t0->sizemask;
        if (bucketfn) bucketfn(privdata, &t0->table[v & m0]);
        de = t0->table[v & m0];
        while (de) {
            next = de->next;
            fn(privdata, de);
            de = next;
        }
        v |= ~m0;
        v = rev(v); v++; v = rev(v);
    } else {
        t0 = &d->ht[0];
        t1 = &d->ht[1];
        if (t0->size > t1->size) { t0 = &d->ht[1]; t1 = &d->ht[0]; }
        m0 = t0->sizemask; m1 = t1->sizemask;
        if (bucketfn) bucketfn(privdata, &t0->table[v & m0]);
        de = t0->table[v & m0];
        while (de) { next = de->next; fn(privdata, de); de = next; }
        do {
            if (bucketfn) bucketfn(privdata, &t1->table[v & m1]);
            de = t1->table[v & m1];
            while (de) { next = de->next; fn(privdata, de); de = next; }
            v |= ~m1; v = rev(v); v++; v = rev(v);
        } while (v & (m0 ^ m1));
    }
    return v;
}

The patch was submitted as PR Fix dictScan(): It can't scan all buckets when dict is shrinking and has been merged upstream.

Conclusion : The article explains two major pitfalls caused by Redis Rehash – massive eviction under full‑capacity conditions and incomplete key cleanup when using SCAN. By adding a memory‑availability guard before triggering a resize and by fixing the scan cursor logic, both issues are mitigated. The findings are valuable for anyone operating large Redis clusters or developing Redis‑based services.

Note: The source code examples are based on Redis 3.2.8.

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.

performance Memory Management Redis C#hash table Rehash SCAN

Written by

Meituan Technology Team

Over 10,000 engineers powering China’s leading lifestyle services e‑commerce platform. Supporting hundreds of millions of consumers, millions of merchants across 2,000+ industries. This is the public channel for the tech teams behind Meituan, Dianping, Meituan Waimai, Meituan Select, and related services.

0 followers

Reader feedback

How this landed with the community

Rate this article

Was this worth your time?

Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.