Redis String Data Structure: Implementation, Encoding Formats, and Operations
This article explains Redis string basics, its mutable SDS implementation, common commands, internal memory layout, and the three encoding formats (int, embstr, raw) that determine how strings are stored and optimized in the database.
Introduction
Redis offers five fundamental data structures; the string type is the simplest and most widely used. Although simple on the surface, its internal design is highly refined.
Basic Overview
Unlike Java's immutable String, a Redis string is a mutable dynamic string (Simple Dynamic String, SDS). Its internal structure resembles Java's ArrayList: it maintains a byte array with pre-allocated spare capacity to reduce the number of memory reallocations. While the string is shorter than 1 MB, each expansion doubles the existing space; beyond 1 MB, each expansion adds only 1 MB. The maximum string length is 512 MB.
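The growth rule can be sketched in a few lines of Python. This is a minimal illustration rather than the Redis source: the function name `grown_capacity` is hypothetical, and applying the rule to the newly required length (rather than to the old capacity) is an assumption.

```python
MB = 1024 * 1024
MAX_LEN = 512 * MB  # maximum string length stated in the text


def grown_capacity(required_len: int) -> int:
    """Return the capacity to reserve for a string that must hold required_len bytes.

    Hypothetical helper: below 1 MB the space is doubled,
    at 1 MB and above only 1 MB of spare space is added.
    """
    if required_len > MAX_LEN:
        raise ValueError("string exceeds the 512 MB limit")
    if required_len < MB:
        return required_len * 2
    return required_len + MB


print(grown_capacity(10))      # 20
print(grown_capacity(2 * MB))  # 3145728, i.e. 3 MB
```

Doubling keeps the number of reallocations low for small strings, while the fixed 1 MB step avoids reserving large amounts of unused memory for big ones.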
Typical string operations include setting, getting, and batch commands that reduce network overhead:
    > set name test
    OK
    > get name
    "test"
    > mset name1 test1 name2 test2
    OK
    > mget name1 name2
    1) "test1"
    2) "test2"
    > del name
    (integer) 1

Redis strings can also store integers and support atomic increment operations. An integer value must fit in a signed 64-bit integer, that is, the range −2⁶³ to 2⁶³−1; values outside this range are treated as ordinary strings and cannot be incremented. Because a string consists of bytes (8 bits each), it can also be used as a bitmap.
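The range rule for increments can be illustrated with a small Python sketch. It assumes the signed 64-bit range and a strict decimal form (no leading zeros, spaces, or plus sign); the function `incr` and its error messages are hypothetical stand-ins, not the Redis implementation.

```python
import re

INT64_MIN = -(2 ** 63)
INT64_MAX = 2 ** 63 - 1
# Assumption: only canonical decimal integers count as integer values.
_INT_RE = re.compile(r"0|-?[1-9][0-9]*")


def incr(stored: str, delta: int = 1) -> str:
    """Return the new stored value after adding delta, or raise if not possible."""
    if not _INT_RE.fullmatch(stored):
        raise ValueError("value is not an integer or out of range")
    value = int(stored)
    if not INT64_MIN <= value <= INT64_MAX:
        raise ValueError("value is not an integer or out of range")
    result = value + delta
    if not INT64_MIN <= result <= INT64_MAX:
        raise OverflowError("increment or decrement would overflow")
    return str(result)


print(incr("1"))  # 2
```

A value such as "abc", or a number one past 2⁶³−1, is rejected instead of being incremented.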
    > set foo 1
    OK
    > get foo
    "1"
    > incr foo
    (integer) 2
    > get foo
    "2"

Internal Principles
Basic Implementation
The core structure of a Redis string is outlined below (the original diagram is omitted). The content field holds the actual bytes and is terminated by a 0x0 byte that is not counted in len.
    struct SDS{
     T capacity; // array capacity
     T len; // actual length
     byte flags; // flag bits, low three indicate type
     byte[] content; // array content
    }

Both capacity and len are generic types rather than plain int, so Redis can use the smallest possible integer type for each string and minimize memory waste.
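How the generic type T might be chosen can be sketched as follows. This is an illustrative model only: the helper names are hypothetical, the candidate widths of 1, 2, 4, and 8 bytes are an assumption, and the real SDS code contains further details that are left out here.

```python
def length_field_bytes(capacity: int) -> int:
    """Smallest unsigned integer width (in bytes) that can represent capacity."""
    for width in (1, 2, 4, 8):
        if capacity < 1 << (8 * width):
            return width
    raise ValueError("capacity does not fit in 64 bits")


def header_bytes(capacity: int) -> int:
    """Header size: capacity field + len field + one flags byte."""
    return 2 * length_field_bytes(capacity) + 1


print(header_bytes(44))       # 3
print(header_bytes(100_000))  # 9
```

Under this model a short string pays only 3 bytes of header, whereas a fixed 4-byte int for both fields would always cost 9 bytes.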
Encoding Formats
Redis strings can be stored using three encoding formats: int, embstr, and raw. Understanding these formats requires knowledge of the RedisObject header that precedes every Redis value.
    struct RedisObject{
     int4 type; // data type (5 kinds)
     int4 encoding; // internal encoding (int, embstr, raw, …)
     int24 lru; // LRU information for memory eviction
     int32 refcount; // reference count
     void *ptr; // pointer to actual data
    }

int Encoding
When the stored value fits into a 64‑bit signed integer, Redis uses the int encoding, enabling fast atomic increment operations. Small non-negative integers in the range [0, 10000) are stored as shared objects, avoiding extra allocations.
    > set foo 1
    OK
    > object encoding foo
    "int"
    > debug object foo
    Value at:0x7f44b020aca0 refcount:2147483647 encoding:int serializedlength:2 lru:14691591 lru_seconds_idle:72588

The refcount of 2147483647 (INT_MAX) marks a shared object: a second key holding the same small integer, such as foo2, points to the same shared object address as foo.
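The idea of a shared integer pool can be modelled with a short Python sketch. The class `IntObject`, the function `make_int_object`, and the pool size of 10000 are assumptions made for illustration; the refcount constant mirrors the value in the debug output above.

```python
from dataclasses import dataclass

SHARED_INTEGER_COUNT = 10000   # assumed pool size
SHARED_REFCOUNT = 2 ** 31 - 1  # 2147483647, as seen in the debug output


@dataclass
class IntObject:
    value: int
    refcount: int = 1


# Pre-built objects for small integers, created once and reused by every key.
_shared = [IntObject(i, SHARED_REFCOUNT) for i in range(SHARED_INTEGER_COUNT)]


def make_int_object(value: int) -> IntObject:
    """Return the shared object for small integers, a fresh object otherwise."""
    if 0 <= value < SHARED_INTEGER_COUNT:
        return _shared[value]
    return IntObject(value)


foo = make_int_object(1)
foo2 = make_int_object(1)
print(foo is foo2)  # True
```

Two keys holding a large value, by contrast, each get their own object in this model.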
embstr Encoding
For short strings (length ≤ 44 bytes), Redis uses the embstr (embedded string) encoding. The SDS structure is embedded directly inside the RedisObject, and a single malloc call allocates a contiguous memory block.
Diagram illustrating the embedded layout (image omitted).
raw Encoding
For longer strings (length > 44 bytes), Redis switches to the raw encoding. In this case the RedisObject and the SDS are allocated separately, so their memory addresses are not contiguous.
Diagram illustrating the separate allocation (image omitted).
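The choice among the three encodings can be summarized in a small decision function. This is a simplified sketch, not the Redis logic itself: the name `choose_encoding` is hypothetical, and the strict canonical-integer check is an assumption.

```python
import re

INT64_MIN = -(2 ** 63)
INT64_MAX = 2 ** 63 - 1
EMBSTR_LIMIT = 44  # bytes of content, as stated in the text
# Assumption: only canonical decimal integers qualify for the int encoding.
_INT_RE = re.compile(rb"0|-?[1-9][0-9]*")


def choose_encoding(value: bytes) -> str:
    """Return "int", "embstr", or "raw" for the given string content."""
    if _INT_RE.fullmatch(value):
        if INT64_MIN <= int(value.decode("ascii")) <= INT64_MAX:
            return "int"
    if len(value) <= EMBSTR_LIMIT:
        return "embstr"
    return "raw"


print(choose_encoding(b"1"))        # int
print(choose_encoding(b"test"))     # embstr
print(choose_encoding(b"a" * 45))   # raw
```

The integer check comes first, so a numeric string never reaches the length comparison unless it falls outside the 64-bit range.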
Thoughts
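The boundary between embstr and raw is 44 bytes because of how jemalloc, Redis’s default allocator, sizes small allocations: it hands out blocks in fixed size classes such as 32 and 64 bytes. The RedisObject header occupies 16 bytes and the smallest SDS header occupies 3 bytes (capacity, len, and flags at one byte each), so together with the terminating 0x0 byte the fixed overhead is 20 bytes. A 32-byte block would leave only 12 bytes for content, so embstr is sized to fit a 64-byte block, which leaves 64 − 20 = 44 bytes. A string with more than 44 bytes of content (45 bytes including the terminating 0x0) no longer fits in 64 bytes and is therefore stored as raw.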
Thus, the practical limit for an embstr string is 44 bytes of content.
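The arithmetic behind the 44-byte limit can be checked with a few lines of Python. The field widths come from the RedisObject and SDS layouts shown earlier; a 64-bit pointer and one-byte capacity and len fields are assumptions, and the function name is hypothetical.

```python
# RedisObject: type (4 bits) + encoding (4 bits) + lru (24 bits),
# then refcount (32 bits), then a pointer (assumed 8 bytes on a 64-bit system).
REDIS_OBJECT_BYTES = (4 + 4 + 24) // 8 + 32 // 8 + 8

# Smallest SDS header: capacity, len, and flags at one byte each.
SDS_HEADER_BYTES = 1 + 1 + 1

# Trailing 0x0 byte that is not counted in len.
TERMINATOR_BYTES = 1


def max_embstr_content(allocation_bytes: int) -> int:
    """Content bytes left after the fixed overhead in one allocation."""
    return allocation_bytes - REDIS_OBJECT_BYTES - SDS_HEADER_BYTES - TERMINATOR_BYTES


print(REDIS_OBJECT_BYTES)      # 16
print(max_embstr_content(32))  # 12
print(max_embstr_content(64))  # 44
```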
Big Data Technology & Architecture
Wang Zhiwu, a big data expert, dedicated to sharing big data technology.