
Automated Performance Profiling for Go Services with Conan

Conan is an automated profiling solution for Go microservices that embeds an SDK to continuously or adaptively sample CPU, memory and goroutine metrics, detects anomalies via user‑defined rules, uploads data to a Pyroscope server, and reports results through Feishu or Pyroscope, delivering sub‑5 % overhead and faster root‑cause analysis.

DeWu Technology

Troubleshooting performance problems in production is a routine part of Go development. Traditional methods such as log analysis, distributed tracing, and manual pprof profiling either reveal too little about high‑CPU and memory‑leak issues or require someone to intervene by hand while the problem is still happening.

In the Go ecosystem, net/http/pprof exposes profiling data, which is often visualized as flame graphs. However, relying on developers to trigger profiling manually does not scale in large microservice environments.

Conan is an automated profiling solution that integrates with Go services to collect, store, and visualize profile data without manual effort.

Architecture

Conan consists of a client SDK embedded in the application and a server side built on Pyroscope. The SDK automatically captures profile data at appropriate moments and uploads it to the server, where Pyroscope stores the data and provides a visual UI.

Working Modes

Conan supports two modes:

Adaptive mode: profiling is triggered only when a predefined performance anomaly is detected.

Continuous mode: profiling runs at a fixed interval (default every 5 seconds) regardless of the service state.

Both modes share a common data‑collection pipeline that samples three key metrics: CPU usage, RSS (resident set size), and the number of goroutines.

Metric collection details

CPU metric: Conan uses gopsutil to read CPU usage. Because gopsutil returns a value scaled by the number of CPU cores (it can exceed 100 % on a multi‑core machine), Conan normalizes it to a 0‑100 % scale by dividing by the number of usable cores. The number of usable cores is determined by:

Reading runtime.GOMAXPROCS when the process limits itself.

Parsing /sys/fs/cgroup/cpu/cpu.cfs_quota_us and /sys/fs/cgroup/cpu/cpu.cfs_period_us for container‑restricted processes.

Falling back to runtime.NumCPU for bare‑metal processes.

RSS metric: Collected via gopsutil. The usable memory limit is obtained either from the cgroup file /sys/fs/cgroup/memory/memory.limit_in_bytes (container) or from the OS (bare metal).

Goroutine metric: Retrieved directly with runtime.NumGoroutine.
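The limit detection described above can be sketched with the standard library alone. This is a minimal sketch under stated assumptions: the function names are hypothetical, taking the minimum of the three core counts is a guess at how Conan combines them, and the host memory total (which the article reads through gopsutil) is passed in as a parameter rather than queried.

```go
package main

import (
	"fmt"
	"os"
	"runtime"
	"strconv"
	"strings"
)

// cgroup v1 paths named in the text above.
const (
	cpuQuotaPath  = "/sys/fs/cgroup/cpu/cpu.cfs_quota_us"
	cpuPeriodPath = "/sys/fs/cgroup/cpu/cpu.cfs_period_us"
	memLimitPath  = "/sys/fs/cgroup/memory/memory.limit_in_bytes"
)

// readInt reads a file containing a single decimal integer.
func readInt(path string) (int64, error) {
	b, err := os.ReadFile(path)
	if err != nil {
		return 0, err
	}
	return strconv.ParseInt(strings.TrimSpace(string(b)), 10, 64)
}

// coresFromQuota converts a CFS quota/period pair into a core count.
// ok is false when no quota is set (cgroup v1 reports -1) or the input is invalid.
func coresFromQuota(quota, period int64) (cores float64, ok bool) {
	if quota <= 0 || period <= 0 {
		return 0, false
	}
	return float64(quota) / float64(period), true
}

// usableCores takes the smallest of NumCPU, GOMAXPROCS and the cgroup quota.
// Taking the minimum is an assumption about how the three sources combine.
func usableCores() float64 {
	cores := float64(runtime.NumCPU())
	if p := float64(runtime.GOMAXPROCS(0)); p < cores {
		cores = p
	}
	quota, qErr := readInt(cpuQuotaPath)
	period, pErr := readInt(cpuPeriodPath)
	if qErr == nil && pErr == nil {
		if c, ok := coresFromQuota(quota, period); ok && c < cores {
			cores = c
		}
	}
	return cores
}

// normalizeCPU maps a core-scaled percentage (0..100*cores) onto 0..100.
func normalizeCPU(raw, cores float64) float64 {
	if cores <= 0 {
		return 0
	}
	if p := raw / cores; p < 100 {
		return p
	}
	return 100
}

// memoryLimit prefers the cgroup limit when it is set and below the host total.
// An unlimited cgroup v1 reports a very large number, which the comparison filters out.
func memoryLimit(cgroupLimit, hostTotal uint64) uint64 {
	if cgroupLimit > 0 && (hostTotal == 0 || cgroupLimit < hostTotal) {
		return cgroupLimit
	}
	return hostTotal
}

func main() {
	cores := usableCores()
	fmt.Printf("usable cores: %.2f\n", cores)
	fmt.Printf("raw 150%% normalized: %.1f%%\n", normalizeCPU(150, cores))

	var cg uint64
	if v, err := readInt(memLimitPath); err == nil && v > 0 {
		cg = uint64(v)
	}
	// hostTotal would come from the OS; 0 means "unknown" in this sketch.
	fmt.Printf("memory limit: %d bytes\n", memoryLimit(cg, 0))
	fmt.Printf("goroutines: %d\n", runtime.NumGoroutine())
}
```

On hosts without these cgroup v1 files, such as bare metal or a cgroup v2 system, the reads fail and the sketch falls back to the runtime values.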

Rule evaluation

Collected metrics are matched against user‑defined rules to decide whether an anomaly occurred. Two main anomaly types are considered:

Spike : a short‑term surge in resource usage.

Gradual increase : a slow rise to a high water‑mark.

For spikes, Conan uses a relative (period‑over‑period) rule, e.g., a CPU increase of more than 30 % compared to the average of the most recent N samples, combined with an absolute threshold (e.g., CPU >40 %) so that small fluctuations at low usage do not trigger profiling. For gradual increases, an absolute threshold alone (e.g., CPU >50 %) is applied.
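The two rule types can be expressed in a few lines. The sketch below is illustrative only: the Rule type, its field names and the Evaluate method are hypothetical, and it reads "increase >30 %" as a relative rise over the recent average rather than a rise of 30 percentage points.

```go
package main

import "fmt"

// Anomaly names the kind of anomaly a rule detected; empty means none.
type Anomaly string

const (
	None      Anomaly = ""
	Spike     Anomaly = "spike"
	Threshold Anomaly = "threshold"
)

// Rule holds the thresholds for one metric (hypothetical type).
type Rule struct {
	SpikeRatio float64 // relative rise over the recent average, e.g. 0.30
	SpikeMin   float64 // absolute floor a spike must also exceed, e.g. 40
	AbsMax     float64 // absolute threshold for gradual increases, e.g. 50
}

// mean returns the arithmetic mean of xs, or 0 for an empty slice.
func mean(xs []float64) float64 {
	if len(xs) == 0 {
		return 0
	}
	sum := 0.0
	for _, x := range xs {
		sum += x
	}
	return sum / float64(len(xs))
}

// Evaluate compares the current value with the recent history.
// The spike rule is checked first, then the absolute threshold.
func (r Rule) Evaluate(history []float64, current float64) Anomaly {
	avg := mean(history)
	if avg > 0 && current > r.SpikeMin && (current-avg)/avg > r.SpikeRatio {
		return Spike
	}
	if current > r.AbsMax {
		return Threshold
	}
	return None
}

func main() {
	rule := Rule{SpikeRatio: 0.30, SpikeMin: 40, AbsMax: 50}
	fmt.Printf("%q\n", rule.Evaluate([]float64{30, 30, 30}, 45)) // rise of 50% and above 40
	fmt.Printf("%q\n", rule.Evaluate([]float64{48, 49, 50}, 52)) // small rise, but above 50
	fmt.Printf("%q\n", rule.Evaluate([]float64{20, 20, 20}, 35)) // large rise, but below 40
}
```

The absolute floor on the spike rule is what keeps a jump from 2 % to 4 % CPU, a 100 % relative rise, from triggering a profile.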

Additional safeguards limit profiling overhead: profiling stops when CPU usage exceeds a configured ceiling, and goroutine profiling is disabled once the goroutine count exceeds a configurable limit.
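A guard of this kind might look like the sketch below. The Guard type and its fields are hypothetical; the only behavior taken from the text is that nothing is profiled above the CPU ceiling and that goroutine profiles are skipped above the goroutine limit.

```go
package main

import "fmt"

// Guard holds the safety limits (hypothetical type and field names).
type Guard struct {
	CPUCeiling    float64 // no profiling at all above this CPU percentage
	MaxGoroutines int     // no goroutine profile above this goroutine count
}

// Allowed returns the profile kinds that may be captured right now.
func (g Guard) Allowed(cpuPercent float64, goroutines int) []string {
	if cpuPercent > g.CPUCeiling {
		return nil // the service is already saturated; do not add load
	}
	kinds := []string{"cpu", "heap"}
	if goroutines <= g.MaxGoroutines {
		// Dumping goroutine stacks costs more as the count grows,
		// so this profile is only taken below the limit.
		kinds = append(kinds, "goroutine")
	}
	return kinds
}

func main() {
	g := Guard{CPUCeiling: 90, MaxGoroutines: 10000}
	fmt.Println(g.Allowed(50, 200))   // all three kinds
	fmt.Println(g.Allowed(50, 20000)) // goroutine profile skipped
	fmt.Println(g.Allowed(95, 200))   // nothing
}
```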

Reporters

After a profiling session, Conan hands the data to a reporter component. Two built‑in reporters are provided:

Feishu reporter: stores the profile locally (default /tmp) and sends a Feishu webhook message with a download link.

Pyroscope reporter: pushes the profile directly to a Pyroscope server for centralized storage and visualization.

The reporter interface can be extended by implementing the following Go interface:

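// ProfileReporter delivers a finished profile to a destination such as Feishu or Pyroscope.
type ProfileReporter interface {
    // Report sends the captured profile (parameters elided in the original).
    Report(...) error
    // Name identifies the reporter.
    Name() string
}

// WithProfileReporter passes one or more reporters to the SDK as an option.
func WithProfileReporter(r ...ProfileReporter) Option {
    // ...
}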

Stability verification

Conan has been deployed in dozens of core services at DeWu (得物) for over a year. Chaos testing showed that, even under extreme load (CPU, memory, and goroutine spikes), Conan’s overhead stays below 5 %.

Case study

A nightly script caused a CPU spike at 04:00. Traditional alerts missed the issue, but Conan captured a profile during the spike, which revealed a third‑party library consuming excessive CPU during sorting. The problem was fixed after analysis, demonstrating Conan’s value in rapid root‑cause identification.

Conclusion

Conan provides an end‑to‑end automated profiling solution for Go microservices, covering detection, data collection, storage, and visualization, thereby improving stability and reducing troubleshooting time.
