Backend Development

Boost Go Performance: 7 Practical Optimization Techniques

This article presents seven practical Go performance optimization techniques—including using sync.Pool, avoiding pointer‑heavy maps, generating marshal code, leveraging strings.Builder, preferring strconv over fmt, pre‑allocating slices, and passing byte slices—to reduce garbage collection overhead, improve allocation efficiency, and achieve up to 97% faster execution.

360 Zhihui Cloud Developer

Introduction

In this article we share a collection of Go performance optimization tricks that require only modest code changes but can dramatically improve program speed and reduce memory pressure.

0. Benchmark Baseline

Before making any changes, establish a baseline with pprof or Go's built-in benchmarks, so the impact of each optimization can actually be measured rather than guessed at.
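As a minimal sketch (the function under test, `concat`, is illustrative), a baseline can be captured with the standard testing package; `testing.Benchmark` even works outside `go test`, which is handy for quick scripted measurements:

```go
package main

import (
	"fmt"
	"strings"
	"testing"
)

// concat is a stand-in for whatever function you want to baseline.
func concat(parts []string) string {
	return strings.Join(parts, ",")
}

func main() {
	// testing.Benchmark runs a benchmark function programmatically,
	// outside the `go test` harness.
	res := testing.Benchmark(func(b *testing.B) {
		parts := []string{"a", "b", "c"}
		b.ReportAllocs() // record B/op and allocs/op as well
		for i := 0; i < b.N; i++ {
			_ = concat(parts)
		}
	})
	fmt.Println(res.String())    // iterations and ns/op
	fmt.Println(res.MemString()) // B/op and allocs/op
}
```

Re-run the same benchmark after each change and compare ns/op and allocs/op against this baseline.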

1. Reuse Objects with sync.Pool

sync.Pool maintains a free list of previously allocated objects, allowing reuse and reducing allocation overhead. Example:

<code>var bufpool = sync.Pool{New: func() interface{} { buf := make([]byte, 512); return &buf }}</code>

Retrieve an object with Get() and return it with Put(). Remember to reset its contents before reuse so stale data does not leak between callers:

<code>bp := bufpool.Get().(*[]byte)
b := (*bp)[:0] // reset length, keep capacity, so old contents are not exposed
defer func() { *bp = b; bufpool.Put(bp) }()
buf := bytes.NewBuffer(b)</code>

Before Go 1.13 the pool was fully emptied on every GC cycle, which could hurt performance under GC pressure; since Go 1.13 a victim cache lets pooled objects survive one collection.
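Putting the pieces together, here is a self-contained sketch of the pattern (the buffer size and the `format` helper are illustrative):

```go
package main

import (
	"fmt"
	"sync"
)

// bufpool hands out *[]byte rather than []byte so that Get/Put
// do not allocate a fresh interface box for the slice header.
var bufpool = sync.Pool{
	New: func() interface{} {
		buf := make([]byte, 0, 512)
		return &buf
	},
}

// format builds a small string using a pooled buffer.
func format(id int) string {
	bp := bufpool.Get().(*[]byte)
	b := (*bp)[:0] // reset length; capacity (and backing array) is reused
	defer func() {
		*bp = b // store the possibly-grown slice back before returning it
		bufpool.Put(bp)
	}()
	b = append(b, "user-"...)
	b = append(b, byte('0'+id))
	return string(b)
}

func main() {
	fmt.Println(format(7)) // user-7
}
```

Across many calls, the backing array is allocated once and reused, instead of once per call.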

2. Avoid Large Maps with Pointer Keys

Maps whose keys or values contain pointers force the GC to scan every entry during collection; a string key is itself a pointer plus a length, so a large map[string]int is expensive to scan. Replacing it with map[int]int removes the pointers and reduces GC work dramatically.

<code>package main

import (
    "fmt"
    "runtime"
    "time"
)

const numElements = 10000000

// Integer keys and values contain no pointers, so the GC can skip
// the map's contents entirely.
var foo = map[int]int{}

func timeGC() {
    t := time.Now()
    runtime.GC()
    fmt.Printf("gc took: %s\n", time.Since(t))
}

func main() {
    for i := 0; i < numElements; i++ {
        foo[i] = i
    }
    for {
        timeGC()
        time.Sleep(1 * time.Second)
    }
}</code>

Switching to an integer key cut GC time from ~100 ms to ~4 ms (≈97% reduction).
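For comparison, the string-key variant that produced the ~100 ms pauses differs only in how the map is filled (sketch; strconv.Itoa turns the index into a key):

```go
package main

import (
	"fmt"
	"runtime"
	"strconv"
	"time"
)

const numElements = 10000000

// Each string key carries a pointer to its backing bytes, so the GC
// must scan every bucket of this map on each collection cycle.
var bar = map[string]int{}

func fill() {
	for i := 0; i < numElements; i++ {
		bar[strconv.Itoa(i)] = i
	}
}

func main() {
	fill()
	t := time.Now()
	runtime.GC()
	fmt.Printf("gc took: %s\n", time.Since(t)) // far longer than the int-key version
}
```

If the keys genuinely must be strings, an alternative is sharding the map or encoding the strings into a pointer-free structure, but simply switching the key type is the cheapest win when it applies.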

3. Generate Marshal Code to Avoid Reflection

Reflection‑based JSON (or other) marshaling is slower than generated code. Tools like easyjson inspect structs and produce high‑performance marshalers that implement the standard json.Marshaler interface.

<code>easyjson -all $file.go</code>

The generated $file_easyjson.go replaces the default reflection path.

4. Build Strings with strings.Builder

Strings are immutable; concatenating them repeatedly allocates new buffers. strings.Builder writes to an internal byte buffer and creates the final string only once.

<code>var strs = []string{"here's","a","some","long","list","of","strings","for","you"}

func buildStrNaive() string { var s string; for _, v := range strs { s += v }; return s }

func buildStrBuilder() string { var b strings.Builder; b.Grow(60); for _, v := range strs { b.WriteString(v) }; return b.String() }</code>

Benchmarks show a 4.7× speedup and an 8× reduction in allocations.

5. Prefer strconv Over fmt for Number‑to‑String

Converting integers to strings with strconv.Itoa is significantly faster than fmt.Sprintf.

<code>func strconvFmt(a string, b int) string { return a + ":" + strconv.Itoa(b) }
func fmtFmt(a string, b int) string { return fmt.Sprintf("%s:%d", a, b) }</code>

Benchmarks reveal a 3.5× speed improvement and lower allocation counts.

6. Pre‑allocate Slices with make

When the final length of a slice is known, allocate it with make to set capacity upfront, avoiding repeated reallocations and copies.

<code>userIDs := make([]string, 0, len(rsp.Users))
for _, bar := range rsp.Users { userIDs = append(userIDs, bar.ID) }</code>

This eliminates extra allocations during growth.
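The difference shows up in how capacity evolves: with a pre-sized slice, append only writes into existing storage (sketch with an illustrative element count):

```go
package main

import "fmt"

func main() {
	// Pre-allocated: capacity is fixed up front, so append never
	// reallocates or copies the backing array.
	pre := make([]int, 0, 1000)
	for i := 0; i < 1000; i++ {
		pre = append(pre, i)
	}
	fmt.Println(cap(pre)) // 1000

	// Naive: append grows the backing array repeatedly, copying
	// all existing elements each time capacity is exceeded.
	var naive []int
	for i := 0; i < 1000; i++ {
		naive = append(naive, i)
	}
	fmt.Println(cap(naive) >= 1000) // true; reached in several growth steps
}
```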

7. Pass Byte Slices Directly

Prefer APIs that append to a caller-supplied []byte (e.g., time.AppendFormat) over ones that return a new string; this lets you reuse buffers, for example from a sync.Pool, and reduces allocations.

Conclusion

By applying these techniques—object pooling, map-key optimization, generated marshalers, efficient string building, strconv usage, slice pre-allocation, and byte-slice passing—you can build a mental model for Go performance, dramatically reduce GC overhead, and improve overall application speed.

Tags: Performance, Optimization, Go, Benchmark, sync.Pool, strconv, strings.Builder
Written by

360 Zhihui Cloud Developer

360 Zhihui Cloud is an enterprise open service platform that aims to "aggregate data value and empower an intelligent future," leveraging 360's extensive product and technology resources to deliver platform services to customers.
