Unlocking Go’s BitVec: How Bitmap Structures Power Massive Data Processing
This article explains the bitmap (BitVec) data structure, its memory efficiency for handling billions of integers, real‑world applications such as bloom filters and permission flags, and walks through Go’s internal BitVec implementation with concrete code examples and performance insights.
Bitmap fundamentals
A bitmap (bit vector) stores a set of integers as individual bits, achieving one‑bit per value memory usage. Representing 1 billion distinct integers requires only 125 MiB, far less than the several gigabytes needed by hash tables.
Typical scenarios include:
Duplicate detection in massive data streams.
Bitmap indexes in databases for fast enum‑field queries.
Permission encoding where each bit represents a distinct right (e.g., a 32‑bit integer can encode read, write, admin flags).
Bloom filters extend the bitmap idea by applying multiple hash functions to set bits, providing probabilistic membership checks with minimal memory overhead.
Go standard library BitVec
The Go compiler implements a BitVec type in cmd/compile/internal/bitvec (see https://github.com/golang/go/blob/989eed28497cde7145958985f50bb3dd6ab698b6/src/cmd/compile/internal/bitvec/bv.go#L21). It is used for SSA liveness analysis and also appears in the runtime stack implementation (runtime/stack.go).
Structure definition:
type BitVec struct {
N int32 // number of bits in the vector
B []uint32 // backing array of 32‑bit words
}
func New(n int32) BitVec {
nword := (n + wordBits - 1) / wordBits // minimal word count
return BitVec{n, make([]uint32, nword)}
}The type provides a comprehensive set of operations (And, AndNot, Clear, Copy, Count, Eq, Get, IsEmpty, Next, Not, Or, Set, String, Unset). The following excerpts illustrate the core logic and the trade‑offs considered during implementation.
Bitwise AND (intersection)
func (dst BitVec) And(src1, src2 BitVec) {
if len(src1.B) == 0 {
return
}
// hoist bounds checks out of the loop
_, _ = dst.B[len(src1.B)-1], src2.B[len(src1.B)-1]
for i, x := range src1.B {
dst.B[i] = x & src2.B[i]
}
}This method computes the intersection of two vectors word‑wise using the & operator. The explicit hoisting of the last element accesses eliminates repeated length checks inside the loop, a micro‑optimisation that improves hot‑path performance.
Bitwise NOT (complement)
func (bv BitVec) Not() {
for i, x := range bv.B {
bv.B[i] = ^x
}
if bv.N%wordBits != 0 {
// mask out bits beyond the declared length in the final word
bv.B[len(bv.B)-1] &= 1<<uint(bv.N%wordBits) - 1
}
}The loop flips every bit with the ^ operator. Because the vector length may not be a multiple of the word size, the final word is masked with 1<<uint(bv.N%wordBits) - 1 to clear excess bits, preserving logical consistency.
Population count (Count)
func (bv BitVec) Count() int {
n := 0
for _, x := range bv.B {
n += bits.OnesCount32(x)
}
return n
}Each 32‑bit word is processed by bits.OnesCount32, a highly optimised CPU instruction (or intrinsic) that returns the number of set bits. The total sum yields the cardinality of the set represented by the bitmap.
All methods are deliberately simple; their performance impact becomes significant when operating on large vectors (e.g., millions of words). SIMD‑friendly loops and the use of intrinsics such as bits.OnesCount32 enable the compiler to generate vectorised code on supported architectures.
Design observations
Receiver naming varies ( dst, bv, bv1) to reflect the role of the argument: the destination vector for binary ops versus the primary operand for unary ops.
The implementation favours in‑place updates (e.g., And writes into dst) to avoid allocations during intensive analyses.
Masking of the trailing word in Not prevents spurious bits from influencing subsequent operations, a subtle bug source in earlier bitmap implementations.
Method set mirrors the needs of the Go compiler’s SSA engine: fast set/clear, intersection, union, complement, and iteration (via Next).
References
bitvec crate – https://crates.io/crates/bitvec
Go source: cmd/compile/internal/bitvec – https://github.com/golang/go/blob/989eed28497cde7145958985f50bb3dd6ab698b6/src/cmd/compile/internal/bitvec/bv.go#L21
runtime/stack.go – https://github.com/golang/go/blob/master/src/runtime/stack.go#L595
Method documentation (Go 1.23.2):
And – https://pkg.go.dev/cmd/compile/internal/[email protected]#BitVec.And
AndNot – https://pkg.go.dev/cmd/compile/internal/[email protected]#BitVec.AndNot
Clear – https://pkg.go.dev/cmd/compile/internal/[email protected]#BitVec.Clear
Copy – https://pkg.go.dev/cmd/compile/internal/[email protected]#BitVec.Copy
Count – https://pkg.go.dev/cmd/compile/internal/[email protected]#BitVec.Count
Eq – https://pkg.go.dev/cmd/compile/internal/[email protected]#BitVec.Eq
Get – https://pkg.go.dev/cmd/compile/internal/[email protected]#BitVec.Get
IsEmpty – https://pkg.go.dev/cmd/compile/internal/[email protected]#BitVec.IsEmpty
Next – https://pkg.go.dev/cmd/compile/internal/[email protected]#BitVec.Next
Not – https://pkg.go.dev/cmd/compile/internal/[email protected]#BitVec.Not
Or – https://pkg.go.dev/cmd/compile/internal/[email protected]#BitVec.Or
Set – https://pkg.go.dev/cmd/compile/internal/[email protected]#BitVec.Set
String – https://pkg.go.dev/cmd/compile/internal/[email protected]#BitVec.String
Unset – https://pkg.go.dev/cmd/compile/internal/[email protected]#BitVec.Unset
Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
BirdNest Tech Talk
Author of the rpcx microservice framework, original book author, and chair of Baidu's Go CMC committee.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
