Why Go’s range Loop Can Slow You Down with Large Structs—and How to Fix It
In Go, a range loop over a slice of large structs implicitly copies each element into the loop variable, which can cost significant performance. Modifying the loop variable also leaves the original slice untouched. This article explains the copying behavior, benchmarks three loop styles, and offers practical guidelines for writing fast, correct code.
1. Large or Complex Struct Scenarios
When iterating over a slice of structs that contain large fields (e.g., an 8 KB byte array), the common pattern for _, v := range xs { … } copies each element by value, moving the entire large field on every iteration.
package main
// User struct with an 8KB metadata array
type User struct {
	Email    string     `json:"email"`
	Name     string     `json:"name"`
	Age      int        `json:"age"`
	Metadata [8192]byte `json:"metadata"`
}
Each copy of User transfers roughly 8 KB of data, which can become a bottleneck even without heap allocation.
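To make the per-copy cost concrete, the following minimal sketch measures the struct with unsafe.Sizeof. The helper name userSize is an assumption introduced for illustration and does not appear in the original code.

```go
package main

import (
	"fmt"
	"unsafe"
)

// User mirrors the struct above: three small fields plus an 8 KB array.
type User struct {
	Email    string     `json:"email"`
	Name     string     `json:"name"`
	Age      int        `json:"age"`
	Metadata [8192]byte `json:"metadata"`
}

// userSize reports how many bytes a single by-value copy of User moves.
// Hypothetical helper, added only for this illustration.
func userSize() uintptr {
	return unsafe.Sizeof(User{})
}

func main() {
	fmt.Printf("sizeof(User) = %d bytes\n", userSize())
}
```

The reported size is slightly more than 8192 bytes, because the two string headers and the int sit alongside the array. Note that the string contents themselves are not part of the struct, so copying a User does not duplicate the underlying email or name bytes.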
2. The Semantics of range
Using for _, user := range users creates a new variable user that holds a copy of users[i]. Printing the addresses of the iteration variable and the original slice element shows they differ, confirming the copy.
2.1 Address Comparison Example
package main
import "fmt"
func main() {
	users := []User{
		{Email: "[email protected]", Name: "Alice", Age: 25},
		{Email: "[email protected]", Name: "Bob", Age: 30},
		{Email: "[email protected]", Name: "Charlie", Age: 35},
	}
	fmt.Println("\n=== Using 'for i, user := range users' (value copy) ===")
	for i, user := range users {
		// &user is the address of the loop variable, not of the slice element
		fmt.Printf("Index: %d, Copy address: %p, Original address: %p\n", i, &user, &users[i])
	}
	fmt.Println("\n=== Using 'for i := range users' (reference via users[i]) ===")
	for i := range users {
		// users[i] addresses the slice element itself, so both values match
		fmt.Printf("Index: %d, Reference address: %p, Original address: %p\n", i, &users[i], &users[i])
	}
}
The output shows that the first loop copies the struct (the addresses differ), while the second loop works directly on the slice element.
3. Benchmark: Comparing Three Loop Styles
The benchmark tests three variants:
for _, user := range testUsers – copies each User into user.
for j := 0; j < len(testUsers); j++ – index access, no copy.
for j := range testUsers – index‑only range, also no copy.
Each benchmark body sums the Age field so that the compiler cannot discard the loop as dead code.
package main
import (
	"math/rand"
	"testing"
	"time"
)
var testUsers = generateTestUsers(1024)
// generateTestUsers builds count users, each with random Metadata bytes.
func generateTestUsers(count int) []User {
	source := rand.NewSource(time.Now().UnixNano())
	rng := rand.New(source)
	users := make([]User, count)
	for i := range users {
		user := User{Email: "[email protected]", Name: "John Doe", Age: 30}
		for j := range user.Metadata {
			user.Metadata[j] = byte(rng.Intn(256))
		}
		users[i] = user
	}
	return users
}
// Each benchmark iterates b.N times. b.N belongs to the testing package
// and should not be modified by the benchmark itself.
func BenchmarkRangeWithIndex(b *testing.B) {
	var sum int
	for i := 0; i < b.N; i++ {
		for _, user := range testUsers {
			sum += user.Age // access field to avoid optimization
		}
	}
	_ = sum
}
func BenchmarkTraditionalFor(b *testing.B) {
	var sum int
	for i := 0; i < b.N; i++ {
		for j := 0; j < len(testUsers); j++ {
			sum += testUsers[j].Age
		}
	}
	_ = sum
}
func BenchmarkRangeIndexOnly(b *testing.B) {
	var sum int
	for i := 0; i < b.N; i++ {
		for j := range testUsers {
			sum += testUsers[j].Age
		}
	}
	_ = sum
}
3.1 Benchmark Results
go test -bench=Benchmark -benchmem
Typical output:
BenchmarkRangeWithIndex-12 6356 180326 ns/op 0 B/op 0 allocs/op
BenchmarkTraditionalFor-12 2056147 590.1 ns/op 0 B/op 0 allocs/op
BenchmarkRangeIndexOnly-12 2116189 566.9 ns/op 0 B/op 0 allocs/op
The copy‑heavy BenchmarkRangeWithIndex takes about 180 µs per operation, versus roughly 0.6 µs for the two index‑based versions, which makes it roughly 300 times slower. B/op and allocs/op stay at 0 for all three, confirming that the cost comes from copying large objects, not from heap allocation.
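For readers who want to reproduce the comparison without a test file, here is a minimal sketch that drives the same two access patterns through testing.Benchmark from an ordinary main package. The names sink, sumByValue, and sumByIndex are assumptions introduced for this illustration, and the numbers it prints depend on the Go version and hardware, so they will not necessarily match the figures above.

```go
package main

import (
	"fmt"
	"testing"
)

// User has the same shape as the struct used throughout the article.
type User struct {
	Email    string
	Name     string
	Age      int
	Metadata [8192]byte
}

// sink is a package-level variable; storing results here makes it harder
// for the compiler to discard the loops as dead code.
var sink int

// sumByValue copies every element into the loop variable u.
func sumByValue(users []User) int {
	sum := 0
	for _, u := range users {
		sum += u.Age
	}
	return sum
}

// sumByIndex reads the field in place through the index.
func sumByIndex(users []User) int {
	sum := 0
	for i := range users {
		sum += users[i].Age
	}
	return sum
}

func main() {
	users := make([]User, 1024)
	for i := range users {
		users[i].Age = 30
	}

	byValue := testing.Benchmark(func(b *testing.B) {
		for i := 0; i < b.N; i++ {
			sink = sumByValue(users)
		}
	})
	byIndex := testing.Benchmark(func(b *testing.B) {
		for i := 0; i < b.N; i++ {
			sink = sumByIndex(users)
		}
	})

	fmt.Println("by value:", byValue)
	fmt.Println("by index:", byIndex)
}
```

Writing to a package-level sink is a slightly stronger guard against dead-code elimination than assigning to the blank identifier, which is why this sketch uses it.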
4. Behavioral Difference: Modifying the Loop Variable
Because the iteration variable is a copy, changes to its fields do not affect the original slice:
for _, user := range users {
	user.Age = 40 // modifies only the copy
}
The correct way to update elements is to use the index:
for i := range users {
	users[i].Age = 40 // writes back to the slice
}
5. Practical Recommendations
5.1 Read‑Only Scenarios (no element modification)
Small structs: for _, v := range xs is readable and usually fine.
Large structs: prefer for i := range xs and access xs[i] to avoid implicit copies.
5.2 Write Scenarios (need to modify elements)
Use the index to obtain a pointer to the element, then modify through the pointer:
for i := range users {
	u := &users[i]
	// modify u safely
}
This is safer than iterating with for _, u := range users and then taking &u, which would give the address of the copy rather than the slice element.
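The difference between writing to the loop copy and writing through a pointer to the slice element can be checked with a small runnable sketch. The function names setAgeByValue and setAgeByPointer are assumptions introduced for illustration, and the User type is trimmed to two fields to keep the example short.

```go
package main

import "fmt"

// User is a reduced version of the article's struct.
type User struct {
	Name string
	Age  int
}

// setAgeByValue writes to the range copy, so the slice is left unchanged.
func setAgeByValue(users []User, age int) {
	for _, u := range users {
		u.Age = age // updates the per-iteration copy only
	}
}

// setAgeByPointer takes the address of each slice element and writes through it.
func setAgeByPointer(users []User, age int) {
	for i := range users {
		u := &users[i]
		u.Age = age
	}
}

func main() {
	users := []User{{Name: "Alice", Age: 25}, {Name: "Bob", Age: 30}}

	setAgeByValue(users, 40)
	fmt.Println(users[0].Age, users[1].Age) // prints "25 30"

	setAgeByPointer(users, 40)
	fmt.Println(users[0].Age, users[1].Age) // prints "40 40"
}
```

Both functions receive the same slice header, so the only thing that differs is whether the write targets the copy or the element's own storage.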
Summary
The loop for _, v := range []Struct copies each element into the iteration variable v.
When the struct is large, the copy cost can be substantial even without any memory allocation.
Benchmarking shows that copy‑heavy range loops are noticeably slower than index‑based loops.
Modifying the iteration variable does not affect the original slice; use index‑based access or pointers for updates.
Adopt index‑only loops for large structs and pointer‑based updates to achieve both performance and correctness.
