Understanding Hash Functions, Hash Tables, and Their Implementation in Redis
This article explains the concept of hash functions and hash tables, illustrates how they map data to array indices, discusses collision resolution methods, and details Redis's internal dictionary implementation, including zipmap optimization, operation complexities, and practical usage differences between hashes and sets.
A hash (also called a hash or digest) transforms an input of arbitrary length into a fixed‑length output, providing a compression mapping where different inputs may produce the same output, making it impossible to recover the original input from the hash value.
A hash table stores data in an array so that each key can be accessed via its index, achieving an average lookup time of O(1). The article shows two mapping schemes for four integers (6, 7, 9, 12): a sparse array of length 13 that wastes space, and a compact hash table of length 4 using modulo arithmetic.
When different keys map to the same slot (a collision), common resolution strategies such as open addressing, re‑hashing, and chaining are used; Redis adopts the chaining method, linking colliding entries in a linked list.
In Redis, a hash is called a dictionary and is implemented with a hash table. Each dictionary entry stores a key‑value pair, and Redis may maintain two hash tables ( ht[2]) to gradually expand capacity. Keys are placed using a function like HASH(key) MOD N.
Redis hashes consist of field ‑ value mappings. Small hashes use a zipmap (also known as a small hash) to reduce metadata overhead, allowing up to 2^32‑1 entries. Operations on zipmap have average complexity O(1) because the number of fields is limited, while larger hashes use a regular hash table.
The article compares Redis hash (via hset) with set structures: set stores each key‑value pair as a separate key with O(1) operations and supports expiration per key, whereas hash stores multiple fields under a single key, offering memory savings but higher complexity ( O(n)) for operations that depend on the number of fields.
Typical Redis hash commands are illustrated, e.g., hset key field value, which creates the hash if it does not exist and overwrites existing fields. An example error is shown when the argument count is incorrect.
Configuration parameters for zipmap limits are provided:
# Configure maximum number of fields in a zipmap
hash-max-zipmap-entries 512
# Configure maximum field value size (bytes)
hash-max-zipmap-value 64When both limits are satisfied, Redis compresses the hash; otherwise it falls back to a regular hash structure.
Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
Big Data Technology & Architecture
Wang Zhiwu, a big data expert, dedicated to sharing big data technology.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
