
Troubleshooting Golang Memory Leaks: A Production Case Study

This case study walks through debugging a Go production service whose resident memory regularly spiked above 6 GB. Unbuffered channels were leaking goroutines and the HTTP client lacked an overall timeout, but the root cause turned out to be a cgo-based image-processing library that spawned unmanaged threads. The article closes with a systematic troubleshooting workflow.

Tencent Cloud Developer

This article presents a case study of debugging and resolving memory leaks in a production Go (trpc-go) application. The author walks through the full troubleshooting process, starting from the initial symptom: resident memory (RES) spiking above 6 GB every evening around 8 PM.

The investigation proceeded in stages. First, pprof heap analysis showed normal allocation patterns. Next, goroutine analysis revealed goroutines blocked on channel sends: when a caller timed out and stopped receiving, the worker goroutine writing to an unbuffered channel blocked forever and leaked. The fix was to give the channel a buffer of one, ch := make(chan struct{}, 1), so the send completes even when nobody is left to receive.
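The channel leak described above can be reproduced in a few lines. The following is a minimal sketch, not the service's actual code: callWithTimeout and leakedAfter are hypothetical helpers, and the sleep durations are arbitrary stand-ins for a slow downstream call.

```go
package main

import (
	"fmt"
	"runtime"
	"time"
)

// callWithTimeout (hypothetical helper) runs a slow task in a goroutine and
// waits at most timeout for its completion signal.
// bufSize 0 reproduces the leak; bufSize 1 is the fix from the article.
func callWithTimeout(bufSize int, work, timeout time.Duration) bool {
	ch := make(chan struct{}, bufSize)
	go func() {
		time.Sleep(work) // stand-in for a slow downstream call
		ch <- struct{}{} // unbuffered: blocks forever once the caller has left
	}()
	select {
	case <-ch:
		return true
	case <-time.After(timeout):
		return false // caller gives up; nobody will ever receive from ch
	}
}

// leakedAfter issues n calls that all time out and reports how many extra
// goroutines are still alive after the workers have had time to finish.
func leakedAfter(bufSize, n int) int {
	before := runtime.NumGoroutine()
	for i := 0; i < n; i++ {
		callWithTimeout(bufSize, 20*time.Millisecond, time.Millisecond)
	}
	time.Sleep(200 * time.Millisecond) // let every worker reach its send
	return runtime.NumGoroutine() - before
}

func main() {
	fmt.Println("unbuffered, goroutines left behind:", leakedAfter(0, 50))
	fmt.Println("buffered,   goroutines left behind:", leakedAfter(1, 50))
}
```

With a buffer of one, the late send lands in the buffer, the worker exits, and the channel is garbage collected along with its unread value.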

After the channel fix, memory continued to spike. Further investigation found a misconfigured HTTP client: only the DialContext (connection) timeout was set, with no overall Timeout on the client, so a request that had already connected could hang indefinitely. The fix added Timeout: time.Second * 4 to the client.
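The difference between the two timeouts can be seen against a server that accepts the connection and then stalls. This is a sketch under assumptions: newClient and timesOut are hypothetical helpers, the 2-second dial timeout is illustrative, and the test server comes from the standard library's httptest package rather than anything in the original service.

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"net/http"
	"net/http/httptest"
	"time"
)

// newClient (hypothetical helper) sets both a dial timeout and the overall
// Client.Timeout. The article's fix used Timeout: time.Second * 4.
func newClient(overall time.Duration) *http.Client {
	return &http.Client{
		Transport: &http.Transport{
			// Bounds connection establishment only.
			DialContext: (&net.Dialer{Timeout: 2 * time.Second}).DialContext,
		},
		// Bounds the whole request: connect, send, wait for headers, read body.
		Timeout: overall,
	}
}

// timesOut reports whether a GET against a server that stalls for
// serverDelay fails with a timeout error under the given overall timeout.
func timesOut(overall, serverDelay time.Duration) bool {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		select {
		case <-time.After(serverDelay): // simulate a slow backend
		case <-r.Context().Done(): // client went away
		}
	}))
	defer srv.Close()

	resp, err := newClient(overall).Get(srv.URL)
	if err != nil {
		var ne net.Error
		return errors.As(err, &ne) && ne.Timeout()
	}
	resp.Body.Close()
	return false
}

func main() {
	fmt.Println("stalled server, short timeout:", timesOut(100*time.Millisecond, 500*time.Millisecond))
	fmt.Println("fast server, generous timeout:", timesOut(2*time.Second, 10*time.Millisecond))
}
```

The dial succeeds immediately in both cases, so the dial timeout never fires; only Client.Timeout ends the stalled request.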

Even after these fixes, the problem persisted. The breakthrough came from the ThreadNum metric, which correlated closely with memory usage. This pointed to cgo as the root cause: the service used a cgo-based image-processing library that was creating unmanaged threads. The final solution was to rewrite the image-processing module in pure Go.
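The article does not say how ThreadNum was collected. One way to observe a comparable number from inside the process is sketched below, assuming Linux, where /proc/self/status exposes a Threads: field. parseThreads and processThreads are hypothetical helpers. The threadcreate profile from runtime/pprof is printed alongside for comparison; it only covers threads the Go runtime itself created.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"runtime"
	"runtime/pprof"
	"strconv"
	"strings"
)

// parseThreads extracts the value of the "Threads:" field from the text of
// a Linux /proc/<pid>/status file.
func parseThreads(status string) (int, error) {
	sc := bufio.NewScanner(strings.NewReader(status))
	for sc.Scan() {
		line := sc.Text()
		if strings.HasPrefix(line, "Threads:") {
			return strconv.Atoi(strings.TrimSpace(strings.TrimPrefix(line, "Threads:")))
		}
	}
	return 0, fmt.Errorf("no Threads field found")
}

// processThreads reads the OS thread count of the current process.
// Assumption: Linux only; on other systems the read fails with an error.
func processThreads() (int, error) {
	b, err := os.ReadFile("/proc/self/status")
	if err != nil {
		return 0, err
	}
	return parseThreads(string(b))
}

func main() {
	if n, err := processThreads(); err == nil {
		fmt.Println("OS threads in process:", n)
	} else {
		fmt.Println("OS thread count unavailable:", err)
	}
	fmt.Println("threads created by the Go runtime:", pprof.Lookup("threadcreate").Count())
	fmt.Println("goroutines:", runtime.NumGoroutine())
}
```

A thread count that keeps climbing while the goroutine count and heap profile stay flat is the pattern the article associates with cgo.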

The article concludes with a systematic approach to debugging Go memory leaks: 1) use top to check RES memory, 2) use pprof heap analysis to find allocation problems, 3) use pprof goroutine analysis to find goroutine leaks, 4) if all of the above look normal, investigate cgo-related thread leaks. The author summarizes: "Out of 10 Go memory leaks, 8 are goroutine leaks, 1 is a true memory leak, and 1 is cgo-related."
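Steps 2 and 3 of this workflow need the pprof endpoints to be reachable. The sketch below uses the standard library's net/http/pprof package; the trpc-go service in the article may expose profiles through its own admin interface instead. fetchProfile is a hypothetical helper that serves the default mux on a throwaway local listener purely for demonstration.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	_ "net/http/pprof" // side effect: registers /debug/pprof/* on http.DefaultServeMux
	"time"
)

// fetchProfile (hypothetical helper) serves http.DefaultServeMux on a
// temporary local server and returns the status code and body size for one
// pprof path.
func fetchProfile(path string) (int, int, error) {
	srv := httptest.NewServer(http.DefaultServeMux)
	defer srv.Close()

	client := &http.Client{Timeout: 5 * time.Second}
	resp, err := client.Get(srv.URL + path)
	if err != nil {
		return 0, 0, err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	return resp.StatusCode, len(body), err
}

func main() {
	paths := []string{
		"/debug/pprof/heap",                 // step 2: allocation analysis
		"/debug/pprof/goroutine?debug=1",    // step 3: goroutine leaks
		"/debug/pprof/threadcreate?debug=1", // step 4: thread creation
	}
	for _, p := range paths {
		code, size, err := fetchProfile(p)
		fmt.Println(p, "status:", code, "bytes:", size, "err:", err)
	}
}
```

In a long-running service the same endpoints are typically served from a dedicated listener bound to localhost or an internal port, and inspected with go tool pprof against the heap or goroutine URL.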

Tags: debugging, golang, pprof, Memory Leak, goroutine, performance-optimization, cgo