How to Safely Scan Large Redis Keyspaces with the SCAN Command
This article explains why running the KEYS command against a very large Redis dataset can block the service, analyzes the underlying O(n) traversal cost, and shows how the incremental SCAN command, with its cursor, MATCH, and COUNT options, lets you iterate keys without stalling the server.
When monitoring Redis usage, especially for keys with a specific prefix, many operators mistakenly use the KEYS command (e.g., keys user_token*) to list matching keys. In a production environment with millions of login tokens stored as user_token:userid, this approach made Redis unresponsive, because KEYS performs a full O(n) scan of the keyspace.
The KEYS command traverses every entry in the keyspace, and because Redis executes commands on a single thread, the operation blocks all other commands until it finishes. Under heavy load, the instance appears “dead” to its clients.
To avoid this, use the SCAN command, which iterates the keyspace incrementally using a cursor. A complete iteration is still O(n), but each call examines only a small portion of the dictionary and returns quickly, so other clients' commands can run between calls instead of waiting behind one long traversal.
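The arithmetic behind that trade-off can be sketched with a toy model. This is not Redis code: the helper name keys_examined_per_call is hypothetical, and it assumes each call examines exactly batch_size keys, which is a simplification of how COUNT behaves.

```python
def keys_examined_per_call(total_keys, batch_size):
    """Return how many keys each call examines when walking total_keys in batches.

    Toy model only: real SCAN treats COUNT as a hint, not an exact batch size.
    """
    calls = []
    remaining = total_keys
    while remaining > 0:
        step = min(batch_size, remaining)
        calls.append(step)
        remaining -= step
    return calls


TOTAL = 1_000_000

# KEYS-style: one command walks the whole keyspace before anything else can run.
keys_style = keys_examined_per_call(TOTAL, TOTAL)

# SCAN-style with COUNT 100: same total work, split into many short commands.
scan_style = keys_examined_per_call(TOTAL, 100)

print(len(keys_style), max(keys_style))  # 1 1000000
print(len(scan_style), max(scan_style))  # 10000 100
```

The total number of keys examined is the same in both cases; what changes is the longest stretch during which no other command can be served.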
SCAN command format:
SCAN cursor [MATCH pattern] [COUNT count]
The cursor starts at 0, and each call returns a new cursor to pass to the next call; when the returned cursor is 0, the iteration is complete. The MATCH option filters keys by pattern, and COUNT hints how many entries to examine per call (it is not a limit on returned results).
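The cursor protocol can be sketched as a small client-side loop. The sketch below assumes a client object with a scan(cursor, match, count) method that returns a (next_cursor, keys) pair, which is the shape redis-py's Redis.scan uses. FakeRedis is a hypothetical in-memory stand-in so the example runs without a server, and scan_keys is an illustrative helper name, not part of any library.

```python
from fnmatch import fnmatchcase


class FakeRedis:
    """In-memory stand-in for a Redis client, for illustration only.

    Its cursor is a plain list offset. Real Redis cursors are opaque values
    and must never be interpreted or computed by the client.
    """

    def __init__(self, keys):
        self._keys = list(keys)

    def scan(self, cursor=0, match=None, count=10):
        end = cursor + count
        page = self._keys[cursor:end]
        # Like Redis, apply the pattern after the batch has been examined.
        if match is not None:
            page = [k for k in page if fnmatchcase(k, match)]
        next_cursor = end if end < len(self._keys) else 0
        return next_cursor, page


def scan_keys(client, pattern, count=100):
    """Yield matching keys, following the cursor until it returns to 0."""
    cursor = 0
    while True:
        cursor, keys = client.scan(cursor=cursor, match=pattern, count=count)
        yield from keys
        if cursor == 0:  # a returned cursor of 0 means the iteration is complete
            break


client = FakeRedis([f"user_token:{i}" for i in range(250)] + ["session:1", "session:2"])
found = list(scan_keys(client, "user_token:*", count=100))
print(len(found))  # 250
```

Against a real server, the same scan_keys loop should work with a redis-py client in place of FakeRedis; redis-py also ships a scan_iter helper that wraps this loop.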
Key characteristics of SCAN:
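- A full iteration is O(n), but results arrive in small batches, so no single call blocks the server for long.
- Provides a COUNT parameter that hints at how many dictionary slots to examine per call.
- Supports glob-style pattern matching with MATCH, as KEYS does.
- Keeps no iteration state on the server; the cursor returned by each call is the only state, and the client holds it.
- May return the same key more than once, so the client must deduplicate results.
- An empty result set does not necessarily mean the scan is finished. MATCH is applied after a batch has been examined, so a call can return no keys while the cursor is still nonzero; only a returned cursor of 0 ends the iteration.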
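The last two characteristics shape how a client loop has to be written. The sketch below is a minimal illustration under stated assumptions: ScriptedClient is a hypothetical test double that replays hand-written replies (the cursor values 6 and 11 are made up), and scan_unique is an illustrative helper name.

```python
class ScriptedClient:
    """Test double that replays fixed SCAN replies keyed by the incoming cursor."""

    def __init__(self, replies):
        self._replies = dict(replies)  # cursor -> (next_cursor, keys)

    def scan(self, cursor=0, match=None, count=None):
        return self._replies[cursor]


def scan_unique(client, pattern, count=100):
    """Collect matching keys into a set, stopping only when the cursor is 0."""
    seen = set()
    cursor = 0
    while True:
        cursor, keys = client.scan(cursor=cursor, match=pattern, count=count)
        seen.update(keys)  # a set absorbs keys that are returned more than once
        if cursor == 0:    # stop on the cursor, never on an empty page
            return seen


replies = {
    0: (6, ["user_token:1", "user_token:2"]),
    6: (11, []),                                # empty page, scan not finished
    11: (0, ["user_token:2", "user_token:3"]),  # user_token:2 is a duplicate
}
unique = scan_unique(ScriptedClient(replies), "user_token:*")
print(sorted(unique))  # ['user_token:1', 'user_token:2', 'user_token:3']
```

Holding every key in a set costs memory proportional to the number of matches. For very large keyspaces, processing each page as it arrives with an idempotent operation may be a better fit than deduplicating in memory.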
Example usage (illustrated with screenshots in the original article): start with SCAN 0 MATCH user_token:* COUNT 100, which returns a new cursor (e.g., 6) together with a subset of the matching keys; then continue with SCAN 6 …, passing each returned cursor to the next call, until the returned cursor is 0.
In summary, understanding the difference between KEYS and SCAN is essential for interview questions and real‑world operations; using SCAN prevents performance degradation when dealing with large Redis datasets.