Understanding Redis String Implementation: SDS, Encoding Optimizations, and redisObject
This article explains why Redis uses a custom SDS string structure instead of native C strings, describes the SDS layout, expansion rules, redisObject metadata, and the three string encodings (int, embstr, raw) that optimize performance and memory usage.
Preface
Redis is the most frequently used cache system, and strings are its most common data structure. This article explores whether you truly understand Redis strings.
Studying source code is the best way to learn from masters; let's examine how Redis implements strings at the low level.
Whether you are a fresh graduate preparing for interviews or an experienced engineer, we will revisit Redis strings from the underlying data perspective.
1. Why Redis does not use native C strings?
The article explains that Redis does not rely on C's native null‑terminated strings because operations like strlen would be O(n) and inefficient for long strings, and append would require copying due to immutable C strings, leading to high CPU usage.
Additionally, native C strings are binary‑unsafe because they stop at the first 0x00 byte, which would truncate binary data.
In summary, native C strings cannot meet Redis's performance and functionality requirements, so a custom string type was designed.
2. SDS String
Redis uses a custom structure called SDS (Simple Dynamic String). The structure is defined as:
<span><span style="color: rgb(0, 0, 136)">struct</span><span style="color: rgb(0, 0, 0)"> SDS</span><span style="color: rgb(102, 102, 0)"><</span><span style="color: rgb(0, 0, 0)">T</span><span style="color: rgb(102, 102, 0)">></span><span style="color: rgb(102, 102, 0)">{</span></span>
<span><span style="color: rgb(0, 0, 0)"> T capacity</span><span style="color: rgb(102, 102, 0)">;</span><span style="color: rgb(136, 0, 0)">// array capacity</span></span>
<span><span style="color: rgb(0, 0, 0)"> T len</span><span style="color: rgb(102, 102, 0)">;</span><span style="color: rgb(136, 0, 0)">// actual length</span></span>
<span><span style="color: rgb(0, 0, 136)">byte</span><span style="color: rgb(0, 0, 0)"> flags</span><span style="color: rgb(102, 102, 0)">;</span><span style="color: rgb(136, 0, 0)">// flag bits</span></span>
<span><span style="color: rgb(0, 0, 136)">byte</span><span style="color: rgb(102, 102, 0)">[]</span><span style="color: rgb(0, 0, 0)"> content</span><span style="color: rgb(102, 102, 0)">;</span><span style="color: rgb(136, 0, 0)">// actual data, ends with a 0x00 terminator</span></span>
<span><span style="color: rgb(102, 102, 0)}">}</span></span>The SDS design solves several problems:
O(1) length retrieval : a stored len field avoids O(n) scans.
Reduced reallocations on append : a redundant‑space strategy minimizes frequent expansions.
Binary‑safe terminator : a dedicated terminator byte allows binary data to be stored safely.
Flag field : indicates which header type (8, 16, 32, or 64‑bit) is used, allowing the structure size to adapt to the string length.
3. Expansion Mechanism
When an SDS has no spare space, Redis expands it. The rules are:
If the new length is less than 1 MiB, the buffer size is doubled.
If the new length exceeds 1 MiB, the buffer grows by an additional 1 MiB.
The relevant C code implements this logic:
<span><span style="color: rgb(0, 0, 0)"> sds sdsMakeRoomFor</span><span style="color: rgb(102, 102, 0)">(</span><span style="color: rgb(0, 0, 0)">sds s</span><span style="color: rgb(102, 102, 0)">,</span><span style="color: rgb(102, 0, 102)">size_t</span><span style="color: rgb(0, 0, 0)"> addlen</span><span style="color: rgb(102, 102, 0)">)</span><span style="color: rgb(102, 102, 0)">{</span></span>
<span><span style="color: rgb(0, 0, 136)">void</span><span style="color: rgb(102, 102, 0)">*</span><span style="color: rgb(0, 0, 0)">sh</span><span style="color: rgb(102, 102, 0)">,</span><span style="color: rgb(102, 102, 0)">*</span><span style="color: rgb(0, 0, 0)">newsh</span><span style="color: rgb(102, 102, 0)">;</span></span>
<span><span style="color: rgb(136, 0, 0)">// remaining capacity</span></span>
<span><span style="color: rgb(102, 0, 102)">size_t</span><span style="color: rgb(0, 0, 0)"> avail </span><span style="color: rgb(102, 102, 0)">=</span><span style="color: rgb(0, 0, 0)"> sdsavail</span><span style="color: rgb(102, 102, 0)">(</span><span style="color: rgb(0, 0, 0)">s</span><span style="color: rgb(102, 102, 0)">);</span></span>
<span><span style="color: rgb(102, 0, 102)">size_t</span><span style="color: rgb(0, 0, 0)"> len</span><span style="color: rgb(102, 102, 0)">,</span><span style="color: rgb(0, 0, 0)"> newlen</span><span style="color: rgb(102, 102, 0)">;</span></span>
<span><span style="color: rgb(0, 0, 136)">char</span><span style="color: rgb(0, 0, 0)"> type</span><span style="color: rgb(102, 102, 0)">,</span><span style="color: rgb(0, 0, 0)"> oldtype </span><span style="color: rgb(102, 102, 0)">=</span><span style="color: rgb(0, 0, 0)"> s</span><span style="color: rgb(102, 102, 0)">[-</span><span style="color: rgb(0, 102, 102)">1</span><span style="color: rgb(102, 102, 0)">]</span><span style="color: rgb(102, 102, 0)">&</span><span style="color: rgb(0, 0, 0)"> SDS_TYPE_MASK</span><span style="color: rgb(102, 102, 0)">;</span></span>
<span><span style="color: rgb(0, 0, 136)">int</span><span style="color: rgb(0, 0, 0)"> hdrlen</span><span style="color: rgb(102, 102, 0)">;</span></span>
<span><span style="color: rgb(136, 0, 0)">// if current capacity is enough, no need to expand</span></span>
<span><span style="color: rgb(0, 0, 136)">if</span><span style="color: rgb(102, 102, 0)">(</span><span style="color: rgb(0, 0, 0)">avail </span><span style="color: rgb(102, 102, 0)">>=</span><span style="color: rgb(0, 0, 0)"> addlen</span><span style="color: rgb(102, 102, 0)">)</span><span style="color: rgb(0, 0, 136)">return</span><span style="color: rgb(0, 0, 0)"> s</span><span style="color: rgb(102, 102, 0)">;</span></span>
<span><span style="color: rgb(136, 0, 0)">// current length</span></span>
<span><span style="color: rgb(0, 0, 0)"> len </span><span style="color: rgb(102, 102, 0)">=</span><span style="color: rgb(0, 0, 0)"> sdslen</span><span style="color: rgb(102, 102, 0)">(</span><span style="color: rgb(0, 0, 0)">s</span><span style="color: rgb(102, 102, 0)">);</span></span>
<span><span style="color: rgb(0, 0, 0)"> sh </span><span style="color: rgb(102, 102, 0)">=</span><span style="color: rgb(102, 102, 0)">(</span><span style="color: rgb(0, 0, 136)">char</span><span style="color: rgb(102, 102, 0)>")</span><span style="color: rgb(0, 0, 0)">s</span><span style="color: rgb(102, 102, 0)">-</span><span style="color: rgb(0, 0, 0)">sdsHdrSize</span><span style="color: rgb(102, 102, 0)">(</span><span style="color: rgb(0, 0, 0)">oldtype</span><span style="color: rgb(102, 102, 0)">);</span></span>
<span><span style="color: rgb(136, 0, 0)">// new length after expansion</span></span>
<span><span style="color: rgb(0, 0, 0)"> newlen </span><span style="color: rgb(102, 102, 0)">=</span><span style="color: rgb(102, 102, 0)">(</span><span style="color: rgb(0, 0, 0)">len</span><span style="color: rgb(102, 102, 0)">+</span><span style="color: rgb(0, 0, 0)">addlen</span><span style="color: rgb(102, 102, 0)})</span><span style="color: rgb(102, 102, 0)">;</span></span>
<span><span style="color: rgb(136, 0, 0)">// if new length < 1MiB, double it</span></span>
<span><span style="color: rgb(0, 0, 136)">if</span><span style="color: rgb(102, 102, 0)">(</span><span style="color: rgb(0, 0, 0)">newlen </span><span style="color: rgb(102, 102, 0)"><</span><span style="color: rgb(0, 0, 0)"> SDS_MAX_PREALLOC</span><span style="color: rgb(102, 102, 0)})</span><span style="color: rgb(136, 0, 0)">// #define SDS_MAX_PREALLOC (1024*1024)</span></span>
<span><span style="color: rgb(0, 0, 0)"> newlen </span><span style="color: rgb(102, 102, 0)">*=</span><span style="color: rgb(0, 102, 102)">2</span><span style="color: rgb(102, 102, 0)">;</span></span>
<span><span style="color: rgb(0, 0, 136)">else</span></span>
<span><span style="color: rgb(0, 0, 0)"> newlen </span><span style="color: rgb(102, 102, 0)">+=</span><span style="color: rgb(0, 0, 0)"> SDS_MAX_PREALLOC</span><span style="color: rgb(102, 102, 0)">;</span><span style="color: rgb(136, 0, 0)">// otherwise add 1MiB</span></span>
<span><span style="color: rgb(136, 0, 0)">// determine new header type</span></span>
<span><span style="color: rgb(0, 0, 0)"> type </span><span style="color: rgb(102, 102, 0)">=</span><span style="color: rgb(0, 0, 0)"> sdsReqType</span><span style="color: rgb(102, 102, 0)">(</span><span style="color: rgb(0, 0, 0)">newlen</span><span style="color: rgb(102, 102, 0)})</span><span style="color: rgb(102, 102, 0)};</span></span>
<span><span style="color: rgb(0, 0, 136)">if</span><span style="color: rgb(102, 102, 0)">(</span><span style="color: rgb(0, 0, 0)">type </span><span style="color: rgb(102, 102, 0)">==</span><span style="color: rgb(0, 0, 0)"> SDS_TYPE_5</span><span style="color: rgb(102, 102, 0)})</span><span style="color: rgb(0, 0, 0)"> type </span><span style="color: rgb(102, 102, 0)">=</span><span style="color: rgb(0, 0, 0)"> SDS_TYPE_8</span><span style="color: rgb(102, 102, 0)};</span></span>
<span><span style="color: rgb(136, 0, 0)">// get size of new header</span></span>
<span><span style="color: rgb(0, 0, 0)"> hdrlen </span><span style="color: rgb(102, 102, 0)">=</span><span style="color: rgb(0, 0, 0)"> sdsHdrSize</span><span style="color: rgb(102, 102, 0)">(</span><span style="color: rgb(0, 0, 0)">type</span><span style="color: rgb(102, 102, 0)})</span><span style="color: rgb(102, 102, 0)};</span></span>
<span><span style="color: rgb(0, 0, 136)">if</span><span style="color: rgb(102, 102, 0)">(</span><span style="color: rgb(0, 0, 0)">oldtype</span><span style="color: rgb(102, 102, 0)">==</span><span style="color: rgb(0, 0, 0)">type</span><span style="color: rgb(102, 102, 0)})</span><span style="color: rgb(102, 102, 0)}{</span></span>
<span><span style="color: rgb(0, 0, 0)"> newsh </span><span style="color: rgb(102, 102, 0)">=</span><span style="color: rgb(0, 0, 0)"> s_realloc</span><span style="color: rgb(102, 102, 0)">(</span><span style="color: rgb(0, 0, 0)">sh</span><span style="color: rgb(102, 102, 0)">,</span><span style="color: rgb(0, 0, 0)"> hdrlen</span><span style="color: rgb(102, 102, 0)">+</span><span style="color: rgb(0, 0, 0)">newlen</span><span style="color: rgb(102, 102, 0)">+</span><span style="color: rgb(0, 102, 102)">1</span><span style="color: rgb(102, 102, 0)})</span><span style="color: rgb(102, 102, 0)};</span></span>
<span><span style="color: rgb(0, 0, 136)">if</span><span style="color: rgb(102, 102, 0)">(</span><span style="color: rgb(0, 0, 0)">newsh </span><span style="color: rgb(102, 102, 0)">==</span><span style="color: rgb(0, 0, 0)"> NULL</span><span style="color: rgb(102, 102, 0)})</span><span style="color: rgb(0, 0, 136)">return</span><span style="color: rgb(0, 0, 0)"> NULL</span><span style="color: rgb(102, 102, 0)};</span></span>
<span><span style="color: rgb(0, 0, 0)"> s </span><span style="color: rgb(102, 102, 0)">=</span><span style="color: rgb(102, 102, 0)">(</span><span style="color: rgb(0, 0, 136)">char</span><span style="color: rgb(102, 102, 0)})</span><span style="color: rgb(0, 0, 0)">newsh</span><span style="color: rgb(102, 102, 0)">+</span><span style="color: rgb(0, 0, 0)">hdrlen</span><span style="color: rgb(102, 102, 0)};</span></span>
<span><span style="color: rgb(102, 102, 0)}
<span style="color: rgb(0, 0, 136)">else</span><span style="color: rgb(102, 102, 0)}{</span></span>
<span><span style="color: rgb(136, 0, 0)">// need to allocate new memory and copy data</span></span>
<span><span style="color: rgb(0, 0, 0)"> newsh </span><span style="color: rgb(102, 102, 0)">=</span><span style="color: rgb(0, 0, 0)"> s_malloc</span><span style="color: rgb(102, 102, 0)">(</span><span style="color: rgb(0, 0, 0)">hdrlen</span><span style="color: rgb(102, 102, 0)">+</span><span style="color: rgb(0, 0, 0)">newlen</span><span style="color: rgb(102, 102, 0)">+</span><span style="color: rgb(0, 102, 102)">1</span><span style="color: rgb(102, 102, 0)})</span><span style="color: rgb(102, 102, 0)};</span></span>
<span><span style="color: rgb(0, 0, 136)">if</span><span style="color: rgb(102, 102, 0)">(</span><span style="color: rgb(0, 0, 0)">newsh </span><span style="color: rgb(102, 102, 0)">==</span><span style="color: rgb(0, 0, 0)"> NULL</span><span style="color: rgb(102, 102, 0)})</span><span style="color: rgb(0, 0, 136)">return</span><span style="color: rgb(0, 0, 0)"> NULL</span><span style="color: rgb(102, 102, 0)};</span></span>
<span><span style="color: rgb(0, 0, 0)"> memcpy</span><span style="color: rgb(102, 102, 0)">((</span><span style="color: rgb(0, 0, 136)">char</span><span style="color: rgb(102, 102, 0)})</span><span style="color: rgb(0, 0, 0)">newsh</span><span style="color: rgb(102, 102, 0)">+</span><span style="color: rgb(0, 0, 0)">hdrlen</span><span style="color: rgb(102, 102, 0)>,</span><span style="color: rgb(0, 0, 0)"> s</span><span style="color: rgb(102, 102, 0)>,</span><span style="color: rgb(0, 0, 0)"> len</span><span style="color: rgb(102, 102, 0)">+</span><span style="color: rgb(0, 102, 102)">1</span><span style="color: rgb(102, 102, 0)})</span><span style="color: rgb(102, 102, 0)};</span></span>
<span><span style="color: rgb(0, 0, 0)"> s_free</span><span style="color: rgb(102, 102, 0)">(</span><span style="color: rgb(0, 0, 0)">sh</span><span style="color: rgb(102, 102, 0)})</span><span style="color: rgb(136, 0, 0)">// free old memory</span></span>
<span><span style="color: rgb(0, 0, 0)"> s </span><span style="color: rgb(102, 102, 0)">=</span><span style="color: rgb(102, 102, 0)">(</span><span style="color: rgb(0, 0, 136)">char</span><span style="color: rgb(102, 102, 0)})</span><span style="color: rgb(0, 0, 0)">newsh</span><span style="color: rgb(102, 102, 0)">+</span><span style="color: rgb(0, 0, 0)">hdrlen</span><span style="color: rgb(102, 102, 0)};</span></span>
<span><span style="color: rgb(0, 0, 0)"> s</span><span style="color: rgb(102, 102, 0)"[-</span><span style="color: rgb(0, 102, 102)">1</span><span style="color: rgb(102, 102, 0)]]</span><span style="color: rgb(102, 102, 0)">=</span><span style="color: rgb(0, 0, 0)"> type</span><span style="color: rgb(102, 102, 0)};</span></span>
<span><span style="color: rgb(0, 0, 0)"> sdssetlen</span><span style="color: rgb(102, 102, 0)">(</span><span style="color: rgb(0, 0, 0)">s</span><span style="color: rgb(102, 102, 0)">,</span><span style="color: rgb(0, 0, 0)"> len</span><span style="color: rgb(102, 102, 0)})</span><span style="color: rgb(102, 102, 0)};</span></span>
<span><span style="color: rgb(0, 0, 0)"> sdssetalloc</span><span style="color: rgb(102, 102, 0)">(</span><span style="color: rgb(0, 0, 0)">s</span><span style="color: rgb(102, 102, 0)">,</span><span style="color: rgb(0, 0, 0)"> newlen</span><span style="color: rgb(102, 102, 0)})</span><span style="color: rgb(136, 0, 0)">// update capacity</span></span>
<span><span style="color: rgb(0, 0, 136)">return</span><span style="color: rgb(0, 0, 0)"> s</span><span style="color: rgb(102, 102, 0)};</span></span>
<span><span style="color: rgb(102, 102, 0)}
</span>4. redisObject
The top‑level metadata for any Redis value is stored in a redisObject structure:
<span><span style="color: rgb(0, 0, 136)">typedef</span><span style="color: rgb(0, 0, 136)">struct</span><span style="color: rgb(0, 0, 0)"> redisObject </span><span style="color: rgb(102, 102, 0)">{</span></span>
<span><span style="color: rgb(0, 0, 136)">unsigned</span><span style="color: rgb(0, 0, 0)"> type</span><span style="color: rgb(102, 102, 0)">:</span><span style="color: rgb(0, 102, 102)">4</span><span style="color: rgb(102, 102, 0)">;</span><span style="color: rgb(136, 0, 0)">// object type</span></span>
<span><span style="color: rgb(0, 0, 136)">unsigned</span><span style="color: rgb(0, 0, 0)"> encoding</span><span style="color: rgb(102, 102, 0)">:</span><span style="color: rgb(0, 102, 102)">4</span><span style="color: rgb(102, 102, 0)">;</span><span style="color: rgb(136, 0, 0)">// encoding strategy</span></span>
<span><span style="color: rgb(0, 0, 136)">unsigned</span><span style="color: rgb(0, 0, 0)"> lru</span><span style="color: rgb(102, 102, 0)">:</span><span style="color: rgb(0, 102, 102)">24</span><span style="color: rgb(102, 102, 0)">;</span><span style="color: rgb(136, 0, 0)">// LRU timestamp</span></span>
<span><span style="color: rgb(0, 0, 136)">int</span><span style="color: rgb(0, 0, 0)"> refcount</span><span style="color: rgb(102, 102, 0)">;</span><span style="color: rgb(136, 0, 0)">// reference count</span></span>
<span><span style="color: rgb(0, 0, 136)">void</span><span style="color: rgb(102, 102, 0)">*</span><span style="color: rgb(0, 0, 0)">ptr</span><span style="color: rgb(102, 102, 0)">;</span><span style="color: rgb(136, 0, 0)">// pointer to actual data</span></span>
<span><span style="color: rgb(102, 102, 0)}
<span style="color: rgb(0, 0, 0)"> </span><span style="color: rgb(0, 0, 0)">robj</span><span style="color: rgb(102, 102, 0)}
</span>The five fields represent type, encoding, LRU timestamp, reference count, and a pointer to the underlying data.
5. String Encoding Optimizations
Redis automatically selects one of three encodings for strings based on length and content.
5.1 Integer Encoding
If the string consists solely of digits and fits within a C long, Redis stores it as an integer. Small integers (≤ 9999) are cached and shared across keys, similar to Java's Integer cache.
5.2 embstr Encoding
For strings shorter than 39 bytes (44 bytes in Redis 3.2+), Redis allocates a single 64‑byte block that contains both the redisObject and the SDS header. This layout improves cache locality because the entire object fits into one CPU cache line.
5.3 raw Encoding
Longer strings use the raw encoding, where the redisObject and the SDS are allocated separately. This can cause the two structures to be far apart in memory, increasing the chance of cache misses.
Why the 39/44‑byte boundary?
Redis uses jemalloc, which allocates memory in size classes (8, 16, 32, 64 bytes, etc.). A 64‑byte allocation can hold a 16‑byte object header plus the minimum 9‑byte SDS header, leaving 39 bytes for actual data. In Redis 3.2 the SDS header was reduced to 3 bytes, raising the limit to 44 bytes.
End
Thank you for reading. Feel free to comment on any inaccuracies or questions.
Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
Swan Home Tech Team
Official account of Swan Home's Technology Center, covering FE, Native, Java, QA, BI, Ops and more. We regularly share technical articles, events, and updates. Swan Home centers on home scenarios, using doorstep services as a gateway, and leverages an innovative “Internet + life services” model to deliver one‑stop, standardized, professional home services.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
