Why Assigning Array Elements from the End First Speeds Up Go Code

This article explains how the vtprotobuf library fills a byte slice from the tail toward the head, which lets the Go compiler eliminate some bounds checks. It demonstrates the effect with two small functions, compiler diagnostics, and a reference to the Go compiler source.

Golang Shines

The author discovered the github.com/planetscale/vtprotobuf library, which claims to be the fastest Go protobuf implementation. Its marshal function writes to a byte slice by assigning the tail elements first and then moving backward toward the head.

A key snippet from MarshalToSizedBufferVT shows this pattern:

func (m *Child) MarshalToSizedBufferVT(dAtA []byte) (int, error) {
    if m == nil {
        return 0, nil
    }
    i := len(dAtA)
    if len(m.ChildName) > 0 {
        i -= len(m.ChildName)
        copy(dAtA[i:], m.ChildName)
        i = protohelpers.EncodeVarint(dAtA, i, uint64(len(m.ChildName)))
        i--
        dAtA[i] = 0x12
    }
    if m.ChildId != 0 {
        i = protohelpers.EncodeVarint(dAtA, i, uint64(m.ChildId))
        i--
        dAtA[i] = 0x8
    }
    return len(dAtA) - i, nil
}
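
To see why writing backward still yields correctly ordered output, here is a minimal, self-contained sketch of the same pattern. The names child, sizeOfVarint, encodeVarint and marshalBackward are hypothetical stand-ins written for this illustration; they are not the library's actual helpers, and the sketch assumes the buffer is already sized to fit the message.

```go
package main

import "fmt"

// child is a hypothetical stand-in for the generated message type.
type child struct {
	ChildId   uint64
	ChildName string
}

// sizeOfVarint returns how many bytes v occupies as a protobuf varint.
func sizeOfVarint(v uint64) int {
	n := 1
	for v >= 0x80 {
		v >>= 7
		n++
	}
	return n
}

// encodeVarint writes v so that it ends just before offset and returns
// the index of its first byte. Illustrative helper, not the library's code.
func encodeVarint(buf []byte, offset int, v uint64) int {
	offset -= sizeOfVarint(v)
	base := offset
	for v >= 0x80 {
		buf[offset] = byte(v&0x7f | 0x80)
		v >>= 7
		offset++
	}
	buf[offset] = byte(v)
	return base
}

// marshalBackward fills buf from the tail and returns the bytes written.
// buf must be at least as large as the encoded message.
func marshalBackward(m child, buf []byte) []byte {
	i := len(buf)
	if len(m.ChildName) > 0 {
		i -= len(m.ChildName)
		copy(buf[i:], m.ChildName)
		i = encodeVarint(buf, i, uint64(len(m.ChildName)))
		i--
		buf[i] = 0x12 // field 2, wire type 2 (length-delimited)
	}
	if m.ChildId != 0 {
		i = encodeVarint(buf, i, m.ChildId)
		i--
		buf[i] = 0x08 // field 1, wire type 0 (varint)
	}
	return buf[i:]
}

func main() {
	out := marshalBackward(child{ChildId: 1, ChildName: "ab"}, make([]byte, 6))
	fmt.Printf("% x\n", out) // prints "08 01 12 02 61 62"
}
```

The last field is written first, at the end of the buffer, so the finished bytes read in the usual order: ChildId, then ChildName. Each field's length is known before its prefix is written, so nothing has to be shifted afterwards.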

The author hypothesised that writing the larger index first lets the compiler infer that later accesses with smaller indices are safe, thus removing the bounds‑check code.

Two functions that perform the same writes in opposite order illustrate the effect:

// bce.go
package bce

// f1 writes the smaller index first.
func f1(arr []byte) {
    arr[0] = 1
    arr[9] = 2
}

// f2 writes the larger index first.
func f2(arr []byte) {
    arr[9] = 2
    arr[0] = 1
}

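Compiling with go tool compile -d=ssa/check_bce/debug=1 bce.go makes the compiler report a "Found IsInBounds" diagnostic for every access that still carries a bounds check. Mapping those diagnostics back onto the source gives: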

func f1(arr []byte) {
    arr[0] = 1 // still has bounds check
    arr[9] = 2 // still has bounds check
}

func f2(arr []byte) {
    arr[9] = 2 // still has bounds check
    arr[0] = 1 // no bounds check found, it was eliminated
}

The missing check in f2 shows that once arr[9] has passed its bounds check, the compiler knows len(arr) > 9, so the later access arr[0] is provably in range and its check is dropped. In f1 the first access only proves len(arr) > 0, which says nothing about index 9, so both checks remain.
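
The two functions are only interchangeable when the slice holds at least ten elements. The sketch below makes the difference on a shorter slice visible, which is also why the compiler cannot simply merge or reorder the checks in f1 on its own. The helper firstByteAfter is hypothetical and exists only for this illustration.

```go
package main

import "fmt"

func f1(arr []byte) {
	arr[0] = 1
	arr[9] = 2
}

func f2(arr []byte) {
	arr[9] = 2
	arr[0] = 1
}

// firstByteAfter runs f on a zeroed slice of length n (n must be >= 1),
// recovers from an out-of-range panic, and returns arr[0] so that any
// partial write stays observable.
func firstByteAfter(f func([]byte), n int) (first byte, panicked bool) {
	arr := make([]byte, n)
	defer func() {
		if recover() != nil {
			panicked = true
		}
		first = arr[0]
	}()
	f(arr)
	return
}

func main() {
	first, panicked := firstByteAfter(f1, 5)
	fmt.Println(first, panicked) // prints "1 true": arr[0] was written before the panic

	first, panicked = firstByteAfter(f2, 5)
	fmt.Println(first, panicked) // prints "0 true": the panic happened before any write

	first, panicked = firstByteAfter(f1, 10)
	fmt.Println(first, panicked) // prints "1 false": long enough, no panic
}
```

Because f1 must leave arr[0] modified before panicking on a short slice, its first check cannot be folded into the second. In f2 the failing access comes first, so nothing observable depends on the second check and it can be removed.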

This behaviour is implemented in the prove pass of the Go compiler, in the source file cmd/compile/internal/ssa/prove.go. When an OpIsInBounds check of index a0 against length a1 succeeds, the pass records that 0 <= a0 < a1 holds in the signed domain and a0 < a1 in the unsigned domain, and it uses these facts to prove later accesses safe.

Consequently, vtprotobuf’s strategy of assigning array elements from the tail first reduces the number of generated bounds‑check instructions, leading to measurable performance gains.
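
The same effect can be requested by hand when the natural write order runs from low to high: touch the highest index once, up front. Go's standard library uses this idiom in encoding/binary. The function putUint32LE below is a hypothetical example written for this article, a sketch of the idiom rather than a copy of any library code.

```go
package main

import "fmt"

// putUint32LE stores v in b[0:4] in little-endian order.
func putUint32LE(b []byte, v uint32) {
	_ = b[3] // early bounds check: proves len(b) > 3 for the writes below
	b[0] = byte(v)
	b[1] = byte(v >> 8)
	b[2] = byte(v >> 16)
	b[3] = byte(v >> 24)
}

func main() {
	buf := make([]byte, 4)
	putUint32LE(buf, 0x04030201)
	fmt.Printf("% x\n", buf) // prints "01 02 03 04"
}
```

Whether the later checks actually disappear depends on the compiler version, so it is worth confirming with the check_bce diagnostic shown earlier rather than assuming it.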
