How to Build a Fast Golang Offline RDB Parser for Redis Big‑Key Detection
This article walks through the motivation, design, implementation, performance tuning, deployment, and lessons learned when creating a Golang‑based offline Redis RDB parser that efficiently identifies large keys without impacting a live cluster.
Motivation
Large keys (big‑keys) in Redis can act as hidden bombs that threaten cluster stability. Existing solutions such as redis-cli --bigkeys or the Python‑based redis‑rdb‑tools either require live queries or are too slow for GB‑scale RDB files, and they do not integrate with internal monitoring platforms.
Why Golang
Go compiles to a single static binary, has built-in concurrency primitives (goroutines and channels), and runs considerably faster than an interpreted tool. That makes it a good fit for an offline RDB parser that can be copied to any server and run there without touching production workloads.
Core Implementation Steps
Select an RDB parsing library. The project uses the lightweight library github.com/HDT3213/rdb, which supports Redis 6.0+ formats. Because the original library did not expose internal attributes needed for the tool, it was forked and referenced via replace directives in go.mod.
Parse the RDB file and extract key data. The workflow is: back up the RDB → open the file → initialize the decoder → iterate over key‑value pairs → filter big keys → output results.
// Requires errors, fmt, os, and time, plus the core and model packages of the forked RDB library.
func MyFindBiggestKeys(rdbFilename string, output chan<- RedisData, options ...interface{}) error {
	if rdbFilename == "" {
		return errors.New("src file path is required")
	}
	rdbFile, err := os.Open(rdbFilename)
	if err != nil {
		return fmt.Errorf("open rdb %s failed: %w", rdbFilename, err)
	}
	defer rdbFile.Close()
	dec := core.NewDecoder(rdbFile)
	// optional wrapper for custom options
	if dec, err = wrapDecoder(dec, options...); err != nil {
		return err
	}
	// sendErr records a timeout raised inside the callback. Writing it to err
	// would not survive, because err is reassigned from Parse's return value.
	var sendErr error
	err = dec.Parse(func(object model.RedisObject) bool {
		data := RedisData{Data: object, Err: nil}
		select {
		case output <- data:
			return true // keep parsing
		case <-time.After(5 * time.Second):
			sendErr = errors.New("send to output channel timeout")
			return false // stop parsing
		}
	})
	if sendErr != nil {
		return sendErr
	}
	if err != nil {
		return fmt.Errorf("parse rdb failed: %w", err)
	}
	return nil
}
Identify big keys. For each RedisObject, the memory size is compared against a configurable threshold. Keys exceeding the threshold are sent to the result channel.
if uint64(data.Data.GetSize()) <= *size {
	continue // not a big key
}
// process big key
Persist results. Big‑key information is stored in a MySQL table for later analysis and Grafana visualization.
rediskey := &models.RedisKey{
	JobID:     jobID,
	RedisName: redisName,
	Key:       data.Data.GetKey(),
	Type:      data.Data.GetType(),
	Size:      int64(data.Data.GetSize()),
	CreatedAt: time.Now(),
}
if err := b.ResultV1().CreateTaskResult(ctx, rediskey); err != nil {
	slog.Error("operation failed", "err", err, "key", data.Data.GetKey())
}
Performance Optimizations
Concurrent parsing. Multiple RDB files are processed in parallel using goroutines and a sync.WaitGroup, improving throughput on multi‑node clusters.
Streaming to reduce memory usage. Instead of loading the entire RDB into memory, each key is processed and sent to the output channel immediately, preventing the program from consuming tens of gigabytes of RAM.
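The two optimizations above can be combined in one small pattern. The following is a minimal sketch, not the tool's actual code: KeyInfo, scanFunc, findBigKeys, and fakeScan are hypothetical names, and the RDB decoder is replaced by an injected function so the example runs without any third-party library. Each file is scanned in its own goroutine, keys flow through a small buffered channel, and anything at or below the threshold is dropped as soon as it is received. The final sort is only there to make the output deterministic.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// KeyInfo is a hypothetical stand-in for the parser's per-key result.
type KeyInfo struct {
	Source string // RDB file the key came from
	Key    string
	Size   uint64 // estimated size in bytes
}

// scanFunc streams every key of one RDB file into out. In a real tool it
// would wrap the RDB decoder; here it is injected so the sketch is runnable.
type scanFunc func(file string, out chan<- KeyInfo) error

// findBigKeys scans all files concurrently and keeps only keys larger than threshold.
func findBigKeys(files []string, threshold uint64, scan scanFunc) ([]KeyInfo, []error) {
	keys := make(chan KeyInfo, 128)      // small buffer: keys are consumed as they arrive
	errs := make(chan error, len(files)) // one slot per file, so scanners never block on it

	var wg sync.WaitGroup
	for _, f := range files {
		wg.Add(1)
		go func(file string) {
			defer wg.Done()
			if err := scan(file, keys); err != nil {
				errs <- fmt.Errorf("%s: %w", file, err)
			}
		}(f)
	}

	// Close both channels once every scanner has finished so the loops below end.
	go func() {
		wg.Wait()
		close(keys)
		close(errs)
	}()

	var big []KeyInfo
	for k := range keys {
		if k.Size <= threshold {
			continue // not a big key: dropped immediately, never accumulated
		}
		big = append(big, k)
	}
	var failed []error
	for err := range errs {
		failed = append(failed, err)
	}
	sort.Slice(big, func(i, j int) bool { return big[i].Size > big[j].Size })
	return big, failed
}

// fakeScan emits synthetic keys instead of decoding a real RDB file.
func fakeScan(file string, out chan<- KeyInfo) error {
	sizes := map[string]uint64{"user:1": 512, "feed:hot": 5 << 20, "cart:9": 2 << 20}
	for k, s := range sizes {
		out <- KeyInfo{Source: file, Key: k, Size: s}
	}
	return nil
}

func main() {
	big, errs := findBigKeys([]string{"node1.rdb", "node2.rdb"}, 1<<20, fakeScan)
	for _, k := range big {
		fmt.Printf("%s %s %d bytes\n", k.Source, k.Key, k.Size)
	}
	fmt.Println("errors:", len(errs))
}
```

Because only keys above the threshold are retained, memory use in this sketch grows with the number of big keys rather than with the size of the RDB files. For very large clusters it may also be worth bounding the number of concurrent scanners, which this sketch does not do.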
Deployment and Automation
The tool is compiled into static binaries for Linux and Windows via a Makefile, enabling easy distribution across servers.
# Build Linux binary
GOOS=linux GOARCH=amd64 go build -o $(OUTPUT_DIR)/$(BINARY_NAME)-linux-amd64 $(MAIN_FILE)
# Build Windows binary
GOOS=windows GOARCH=amd64 go build -o $(OUTPUT_DIR)/$(BINARY_NAME)-windows-amd64.exe $(MAIN_FILE)
Run the binary with a configuration file, for example:
./rdb-bigkey-linux-amd64 -c configs/rdb-server.yaml
Pitfalls and Solutions
RDB format incompatibility. Older Redis versions (e.g., 4.0) produce RDB files that the chosen library cannot parse. The fix is to switch to a library that supports multiple versions or to extend the forked library.
Memory explosion on huge RDBs. Parsing a 20 GB RDB can take more than 40 GB of RAM if the whole file is loaded. Streaming parsing, which processes each database block and releases its memory immediately, solves the issue.
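One way to surface the format incompatibility early is to read the RDB header before handing the file to the decoder. An RDB file starts with the five ASCII bytes "REDIS" followed by a four-digit ASCII format version. The sketch below only extracts that version; parseRDBHeader and readRDBVersion are hypothetical helper names, and which versions a given library accepts is something to verify against that library rather than assume.

```go
package main

import (
	"fmt"
	"io"
	"strings"
)

// parseRDBHeader validates the 9-byte RDB header ("REDIS" + 4 ASCII digits)
// and returns the format version as an integer.
func parseRDBHeader(header []byte) (int, error) {
	if len(header) < 9 {
		return 0, fmt.Errorf("rdb header too short: %d bytes", len(header))
	}
	if string(header[:5]) != "REDIS" {
		return 0, fmt.Errorf("not an RDB file: bad magic %q", header[:5])
	}
	version := 0
	for _, c := range header[5:9] {
		if c < '0' || c > '9' {
			return 0, fmt.Errorf("bad rdb version %q", header[5:9])
		}
		version = version*10 + int(c-'0')
	}
	return version, nil
}

// readRDBVersion reads exactly the header bytes from r and parses them.
func readRDBVersion(r io.Reader) (int, error) {
	header := make([]byte, 9)
	if _, err := io.ReadFull(r, header); err != nil {
		return 0, fmt.Errorf("read rdb header: %w", err)
	}
	return parseRDBHeader(header)
}

func main() {
	// Synthetic header for illustration; a real caller would pass the opened file.
	v, err := readRDBVersion(strings.NewReader("REDIS0009 rest of file"))
	fmt.Println(v, err)
}
```

Note that reading the header consumes the first nine bytes, so a caller working on an os.File would need to seek back to the start before passing it to the decoder. A tool could use the returned version to fail with a clear message, or to route the file to a different parser, instead of stopping partway through with an opaque decode error.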
Results and Future Work
The Golang RDB big‑key parser has been running in production for two months, achieving a 3‑5× speedup over the previous Python tool (20 GB files parsed in ~15 minutes). Planned enhancements include automatic business‑line classification based on key prefixes and trend analysis to alert on continuously growing big keys.
dbaplus Community
