How to Let Go Programs Profile Themselves Automatically
This article explains why traditional pprof sampling often fails in production, introduces Go's built‑in profiling tools and the runtime/pprof package, defines practical rules for triggering automatic sampling on resource spikes, and demonstrates how to collect self‑diagnostic profiles with the open‑source Holmes library and its Docker demo.
Analyzing the performance of online services is challenging because it is hard to capture runtime information at the moment an anomaly occurs. Traditional approaches that manually request pprof endpoints often miss short‑lived spikes, especially when a service restarts automatically in Kubernetes.
Go's profiling tools
The Go pprof toolkit provides several built‑in samplers:
profile: CPU sampling
heap: memory allocation sampling
goroutine: stack traces of all goroutines
allocs: allocation events since start (including reclaimed memory)
threadcreate: stack traces for thread creation
How to obtain sampling data
Typical usage exposes HTTP routes that a client can query:
import (
	"net/http"
	"net/http/pprof"
)

func main() {
	http.HandleFunc("/debug/pprof/heap", pprof.Index)
	http.HandleFunc("/debug/pprof/profile", pprof.Profile)
	// ... register business routes here ...
	http.ListenAndServe(":80", nil)
}

Running the tool manually:
$ go tool pprof http://localhost/debug/pprof/profile

Drawbacks of this method include:
Requires a client to request a specific route, so it cannot capture the first moment of a resource spike.
Registers several /debug/pprof routes on the business service, which is intrusive.
Non‑web services need an extra HTTP port, increasing invasiveness.
Runtime pprof
The runtime/pprof package offers a Lookup function to sample resources directly from the running process:
// Lookup takes a profile name; f is an *os.File (or any io.Writer)
// opened by the caller.
pprof.Lookup("heap").WriteTo(f, 0)
pprof.Lookup("goroutine").WriteTo(f, 0)
pprof.Lookup("threadcreate").WriteTo(f, 0)

CPU sampling can be performed with runtime/pprof:
bf, err := os.OpenFile("/tmp/profile.out", os.O_RDWR|os.O_CREATE|os.O_APPEND, 0644)
if err != nil {
	log.Fatal(err)
}
defer bf.Close()
pprof.StartCPUProfile(bf)
time.Sleep(2 * time.Second)
pprof.StopCPUProfile()

This method writes profiles directly to a file without exposing extra ports, but continuous sampling degrades performance, and fixed‑interval sampling may miss critical moments.
When to trigger sampling
The most effective strategy is to let the Go process invoke pprof only when resource usage spikes or exceeds predefined thresholds.
Rules for determining sampling points
CPU, memory, and goroutine counts can be expressed numerically, allowing two simple rules:
A sudden increase (e.g., 25% above the recent average) indicates a resource spike.
Usage exceeding a hard threshold (e.g., 80% of allocated memory) signals sustained pressure.
Rule 1 captures brief, intense spikes; Rule 2 captures gradual climbs that eventually breach limits.
To detect sudden spikes without historical data, the program can keep a short‑term rolling window (e.g., the last 5‑10 intervals) and compare the current value against the average.
When a spike is detected, the program can automatically dump a memory profile, preserving the stack trace for later analysis even if the service restarts.
Open‑source automatic sampling library
The community provides the Holmes library, which implements unattended automatic dumps based on configurable thresholds.
WithCollectInterval("2s") – sets a 2‑second monitoring interval (recommend >10 s in production).
WithMemDump(3, 25, 80) – triggers a dump when memory usage exceeds 3 % and either a 25 % sudden rise or an absolute 80 % threshold is reached.
Example integration:
package main

import (
	"net/http"
	"time"

	"github.com/mosn/holmes"
)

func init() {
	http.HandleFunc("/make1gb", make1gbslice)
	go http.ListenAndServe(":10003", nil)
}

func main() {
	h, _ := holmes.New(
		holmes.WithCollectInterval("2s"),
		holmes.WithCoolDown("1m"),
		holmes.WithDumpPath("/tmp"),
		holmes.WithTextDump(),
		holmes.WithMemDump(3, 25, 80),
	)
	h.EnableMemDump().Start()
	time.Sleep(time.Hour)
}

func make1gbslice(wr http.ResponseWriter, req *http.Request) {
	var a = make([]byte, 1073741824)
	_ = a
}

Running the program produces heap profiles such as:
heap profile: 0: 0 [1: 1073741824] @ heap/1048576
0: 0 [1: 1073741824] @ 0x42ba3ef 0x4252254 0x4254095 0x4254fd3 0x425128c 0x40650a1
# 0x42ba3ee main.make1gbslice+0x3e /path/to/1gbslice.go:24
# 0x4252253 net/http.HandlerFunc.ServeHTTP+0x43 /usr/local/go/src/net/http/server.go:2012
# 0x4254094 net/http.(*ServeMux).ServeHTTP+0x1a4 /usr/local/go/src/net/http/server.go:2387
# 0x4254fd2 net/http.serverHandler.ServeHTTP+0xa2 /usr/local/go/src/net/http/server.go:2807
# 0x425128b net/http.(*conn).serve+0x86b /usr/local/go/src/net/http/server.go:1895

For more details, see the Holmes repository’s Get Started guide. A Docker image is also provided for quick experimentation:
docker run --name go-profile-demo -v /tmp:/tmp -p 10030:80 --rm -d kevinyan001/go-profiling

The container exposes three routes for memory spikes, CPU overload, and channel blockage, allowing you to generate load and observe the automatic dumps stored in the mapped /tmp directory.
Xiao Lou's Tech Notes
Backend technology sharing, architecture design, performance optimization, source code reading, troubleshooting, and pitfall practices