
Using Go pprof for Online Performance Profiling: Case Studies and Lessons

The article demonstrates how Go’s built‑in pprof tools can be used for live performance profiling, walking through two real‑world cases—one where a malformed JSON request caused massive object allocation and CPU spikes, and another where per‑call self‑referencing structs leaked memory—while offering practical tips on input validation, allocation reduction, and GC monitoring.

Didi Tech

This article introduces online performance problem diagnosis and optimization for developers, focusing on profiling as a powerful tool. Profiling collects runtime events and samples, enabling precise pinpointing of bottlenecks. The Go language’s built‑in runtime/pprof and net/http/pprof packages, together with visual tools, are used as examples.

What profiling is

Profiling (performance analysis) records CPU usage, memory consumption, thread states, and blocking information while a program runs. By examining these metrics, developers can locate the root cause of performance issues.

Go support for profiling

Go provides the runtime/pprof package for in-process profiling and, through net/http/pprof, an HTTP endpoint for on-the-fly analysis. Adding a single blank import and starting an HTTP server is enough to expose profiling data:

package main

import (
    "log"
    "net/http"
    _ "net/http/pprof" // registers the /debug/pprof/* handlers on the default mux
)

func main() {
    go func() {
        // pprof endpoints are served on port 8005 alongside the business logic
        log.Println(http.ListenAndServe("0.0.0.0:8005", nil))
    }()
    // ... business logic
}

After deployment, developers can fetch a 30‑second CPU profile with:

go tool pprof -http=:1234 "http://your-prd-addr:8005/debug/pprof/profile?seconds=30"

The tool renders a flame graph that visualizes which functions consume CPU.

Case Study 1 – CPU usage spikes to 99%

Symptoms:

- CPU idle drops to ~0% on three machines simultaneously.
- The issue is intermittent and resolves on its own after about two hours.

Steps taken:

1. Enabled pprof on the production service.
2. Collected a CPU profile when the problem recurred.

The flame graph highlighted GetLeadCallRecordByLeadId as the dominant CPU consumer, especially its database calls. A deeper look revealed unusually high activity in runtime.gcBgMarkWorker, indicating GC pressure caused by a massive number of short‑lived objects.

Further investigation showed that the endpoint /lp-api/v2/leadCallRecord/getLeadCallRecord was receiving a JSON object as the leadId parameter, which should be an integer. The malformed request caused the SQL builder to treat the string as a parameter, matching millions of rows and creating billions of short‑lived objects. A sample request from the access log:

[net/http.HandlerFunc.ServeHTTP/server.go:1947] _com_request_in||traceid=091d682895eda2fsdffsd0cbe3f9a95||spanid=297b2a9sdfsdfsdfb8bf739||hintCode=||hintContent=||method=GET||host=10.88.128.40:8000||uri=/lp-api/v2/leadCallRecord/getLeadCallRecord||params=leadId={"id":123123}||from=10.0.0.0||proto=HTTP/1.0

Root cause: the backend function GetLeadCallRecord accepted leadId as a string without type validation and passed the value straight into the SQL query builder.

func GetLeadCallRecord(leadId string, bizType int) ([]model.LeadCallRecords, error) {
    sql := "SELECT record.* FROM lead_call_record AS record " +
        "WHERE record.lead_id = {{leadId}} AND record.biz_type = {{bizType}}"
    conditions := make(map[string]interface{}, 2)
    conditions["leadId"] = leadId // the raw string is passed through with no type check
    conditions["bizType"] = bizType
    cond, val, err := builder.NamedQuery(sql, conditions)
    // ... execute cond/val against the database and scan rows (omitted)
}

Fix: enforce correct parameter types and validate inputs before building SQL.
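A hedged sketch of the kind of guard that prevents this class of bug; the function name and error messages are illustrative, not from the original service:

```go
package main

import (
	"fmt"
	"strconv"
)

// parseLeadID rejects anything that is not a plain positive integer,
// so a payload like `{"id":123123}` never reaches the SQL builder.
func parseLeadID(raw string) (int64, error) {
	id, err := strconv.ParseInt(raw, 10, 64)
	if err != nil {
		return 0, fmt.Errorf("leadId must be an integer, got %q: %w", raw, err)
	}
	if id <= 0 {
		return 0, fmt.Errorf("leadId must be positive, got %d", id)
	}
	return id, nil
}

func main() {
	if _, err := parseLeadID(`{"id":123123}`); err != nil {
		fmt.Println("rejected:", err)
	}
	id, _ := parseLeadID("123123")
	fmt.Println("accepted:", id)
}
```

Validating at the handler boundary keeps the typed value flowing through the rest of the call chain, so the SQL layer never sees an untrusted string.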

Case Study 2 – Memory usage climbs to 90%+

Symptoms:

- CPU remains low (idle >85%).
- Memory grows rapidly from 2 GB to 15 GB within weeks.

Profiling the heap revealed that 92% of live objects originated from event.GetInstance, each occupying only 16 bytes.

var (
    firstActivationEventHandler FirstActivationEventHandler
    firstOnlineEventHandler FirstOnlineEventHandler
)

func GetInstance(eventType string) Handler {
    if eventType == FirstActivation {
        // stores a fresh *copy* of the handler in an interface field on
        // every call, heap-allocating a new 16-byte object each time
        firstActivationEventHandler.ChildHandler = firstActivationEventHandler
        return firstActivationEventHandler
    } else if eventType == FirstOnline {
        firstOnlineEventHandler.ChildHandler = firstOnlineEventHandler
        return firstOnlineEventHandler
    }
    // ... other cases omitted
    return nil
}

On every call the function stores a fresh copy of the handler in its own ChildHandler interface field; each copy references the previous one, so the 16‑byte interface objects chain together and the GC can never reclaim them. Assigning pointers once, at package initialization, removes the per‑call allocation:

func init() {
    firstActivationEventHandler.ChildHandler = &firstActivationEventHandler
    firstOnlineEventHandler.ChildHandler = &firstOnlineEventHandler
    // ... omitted
}

The Go runtime explains the 16‑byte size: an interface value is two machine words, a pointer to the type/method table and a pointer to the data. The runtime's convT2I shows where the allocation happens: every value‑to‑interface conversion calls mallocgc and copies the value to the heap.

type iface struct {
    tab  *itab
    data unsafe.Pointer
}

type eface struct {
    _type *_type
    data  unsafe.Pointer
}

func convT2I(tab *itab, elem unsafe.Pointer) (i iface) {
    t := tab._type
    if raceenabled {
        raceReadObjectPC(t, elem, getcallerpc(), funcPC(convT2I))
    }
    if msanenabled {
        msanread(elem, t.size)
    }
    x := mallocgc(t.size, t, true)
    typedmemmove(t, x, elem)
    i.tab = tab
    i.data = x
    return
}
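The cost of convT2I's mallocgc is easy to observe from user code. This small experiment (an assumed setup, not from the original article) uses testing.AllocsPerRun to count heap allocations per value‑to‑interface conversion:

```go
package main

import (
	"fmt"
	"testing"
)

// handler is a small pointer-free struct, standing in for the
// article's per-call handler copies.
type handler struct{ a, b int32 }

var sink interface{} // global sink so the conversion cannot be optimized away

// allocsPerConversion reports how many heap allocations one
// struct-to-interface conversion costs.
func allocsPerConversion() float64 {
	h := handler{a: 1, b: 2}
	return testing.AllocsPerRun(1000, func() {
		sink = h // value is copied to the heap, as in convT2I's mallocgc
	})
}

func main() {
	fmt.Println("allocs per conversion:", allocsPerConversion())
}
```

Storing a pointer in the interface instead (`sink = &h` for an already-escaped `h`) avoids the copy, which is exactly why the init-based fix below uses `&firstActivationEventHandler`.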

Fix: initialize the singleton handlers once (e.g., in init()) and return pointers, avoiding per‑call allocations.
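Putting the fix together, a minimal self-contained sketch; the type and constant names follow the article's snippet, while the Handler interface body and constant values are assumptions:

```go
package main

import "fmt"

const (
	FirstActivation = "firstActivation" // assumed values
	FirstOnline     = "firstOnline"
)

type Handler interface{ Name() string }

type FirstActivationEventHandler struct{ ChildHandler Handler }
type FirstOnlineEventHandler struct{ ChildHandler Handler }

func (h *FirstActivationEventHandler) Name() string { return FirstActivation }
func (h *FirstOnlineEventHandler) Name() string     { return FirstOnline }

var (
	firstActivationEventHandler FirstActivationEventHandler
	firstOnlineEventHandler     FirstOnlineEventHandler
)

// init wires the self-references exactly once, storing pointers so no
// struct is ever copied into an interface afterwards.
func init() {
	firstActivationEventHandler.ChildHandler = &firstActivationEventHandler
	firstOnlineEventHandler.ChildHandler = &firstOnlineEventHandler
}

// GetInstance returns a pointer to the package-level singleton:
// no allocation, no ever-growing chain of copies.
func GetInstance(eventType string) Handler {
	switch eventType {
	case FirstActivation:
		return &firstActivationEventHandler
	case FirstOnline:
		return &firstOnlineEventHandler
	}
	return nil
}

func main() {
	// both calls return the same instance
	fmt.Println(GetInstance(FirstActivation) == GetInstance(FirstActivation))
}
```

Because the interface now holds a pointer, repeated calls are allocation-free and the heap profile no longer shows growth attributed to GetInstance.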

Key Takeaways

- When GC‑related functions dominate the CPU profile, inspect object counts (-inuse_objects / -alloc_objects on the heap profile).
- For CPU‑bound GC pressure, focus on the volume of objects allocated.
- For memory‑bound issues, monitor allocated space (-inuse_space / -alloc_space).
- Always validate input types, especially for values that end up in SQL parameters.
- Prefer passing pointers to large structs to reduce copying and heap pressure.
- Avoid unnecessary cyclic references and per‑call initialization.

By leveraging Go’s pprof tooling, developers can systematically trace performance problems from high‑level symptoms down to concrete code defects.

Tags: Backend Development, Go, Performance Profiling, pprof, CPU optimization, Memory Leak
Written by Didi Tech, the official Didi technology account.