Fundamentals 12 min read

Unlocking Go’s BitVec: How Bitmap Structures Power Massive Data Processing

This article explains the bitmap (BitVec) data structure, its memory efficiency for handling billions of integers, real‑world applications such as bloom filters and permission flags, and walks through Go’s internal BitVec implementation with concrete code examples and performance insights.

BirdNest Tech Talk
BirdNest Tech Talk
BirdNest Tech Talk
Unlocking Go’s BitVec: How Bitmap Structures Power Massive Data Processing

Bitmap fundamentals

A bitmap (bit vector) stores a set of integers as individual bits, achieving one‑bit per value memory usage. Representing 1 billion distinct integers requires only 125 MiB, far less than the several gigabytes needed by hash tables.

Typical scenarios include:

Duplicate detection in massive data streams.

Bitmap indexes in databases for fast enum‑field queries.

Permission encoding where each bit represents a distinct right (e.g., a 32‑bit integer can encode read, write, admin flags).

Bloom filters extend the bitmap idea by applying multiple hash functions to set bits, providing probabilistic membership checks with minimal memory overhead.

Go standard library BitVec

The Go compiler implements a BitVec type in cmd/compile/internal/bitvec (see https://github.com/golang/go/blob/989eed28497cde7145958985f50bb3dd6ab698b6/src/cmd/compile/internal/bitvec/bv.go#L21). It is used for SSA liveness analysis and also appears in the runtime stack implementation (runtime/stack.go).

Structure definition:

type BitVec struct {
    N int32      // number of bits in the vector
    B []uint32   // backing array of 32‑bit words
}

func New(n int32) BitVec {
    nword := (n + wordBits - 1) / wordBits // minimal word count
    return BitVec{n, make([]uint32, nword)}
}

The type provides a comprehensive set of operations (And, AndNot, Clear, Copy, Count, Eq, Get, IsEmpty, Next, Not, Or, Set, String, Unset). The following excerpts illustrate the core logic and the trade‑offs considered during implementation.

Bitwise AND (intersection)

func (dst BitVec) And(src1, src2 BitVec) {
    if len(src1.B) == 0 {
        return
    }
    // hoist bounds checks out of the loop
    _, _ = dst.B[len(src1.B)-1], src2.B[len(src1.B)-1]
    for i, x := range src1.B {
        dst.B[i] = x & src2.B[i]
    }
}

This method computes the intersection of two vectors word‑wise using the & operator. The explicit hoisting of the last element accesses eliminates repeated length checks inside the loop, a micro‑optimisation that improves hot‑path performance.

Bitwise NOT (complement)

func (bv BitVec) Not() {
    for i, x := range bv.B {
        bv.B[i] = ^x
    }
    if bv.N%wordBits != 0 {
        // mask out bits beyond the declared length in the final word
        bv.B[len(bv.B)-1] &= 1<<uint(bv.N%wordBits) - 1
    }
}

The loop flips every bit with the ^ operator. Because the vector length may not be a multiple of the word size, the final word is masked with 1<<uint(bv.N%wordBits) - 1 to clear excess bits, preserving logical consistency.

Population count (Count)

func (bv BitVec) Count() int {
    n := 0
    for _, x := range bv.B {
        n += bits.OnesCount32(x)
    }
    return n
}

Each 32‑bit word is processed by bits.OnesCount32, a highly optimised CPU instruction (or intrinsic) that returns the number of set bits. The total sum yields the cardinality of the set represented by the bitmap.

All methods are deliberately simple; their performance impact becomes significant when operating on large vectors (e.g., millions of words). SIMD‑friendly loops and the use of intrinsics such as bits.OnesCount32 enable the compiler to generate vectorised code on supported architectures.

Design observations

Receiver naming varies ( dst, bv, bv1) to reflect the role of the argument: the destination vector for binary ops versus the primary operand for unary ops.

The implementation favours in‑place updates (e.g., And writes into dst) to avoid allocations during intensive analyses.

Masking of the trailing word in Not prevents spurious bits from influencing subsequent operations, a subtle bug source in earlier bitmap implementations.

Method set mirrors the needs of the Go compiler’s SSA engine: fast set/clear, intersection, union, complement, and iteration (via Next).

References

bitvec crate – https://crates.io/crates/bitvec

Go source: cmd/compile/internal/bitvec – https://github.com/golang/go/blob/989eed28497cde7145958985f50bb3dd6ab698b6/src/cmd/compile/internal/bitvec/bv.go#L21

runtime/stack.go – https://github.com/golang/go/blob/master/src/runtime/stack.go#L595

Method documentation (Go 1.23.2):

And – https://pkg.go.dev/cmd/compile/internal/[email protected]#BitVec.And

AndNot – https://pkg.go.dev/cmd/compile/internal/[email protected]#BitVec.AndNot

Clear – https://pkg.go.dev/cmd/compile/internal/[email protected]#BitVec.Clear

Copy – https://pkg.go.dev/cmd/compile/internal/[email protected]#BitVec.Copy

Count – https://pkg.go.dev/cmd/compile/internal/[email protected]#BitVec.Count

Eq – https://pkg.go.dev/cmd/compile/internal/[email protected]#BitVec.Eq

Get – https://pkg.go.dev/cmd/compile/internal/[email protected]#BitVec.Get

IsEmpty – https://pkg.go.dev/cmd/compile/internal/[email protected]#BitVec.IsEmpty

Next – https://pkg.go.dev/cmd/compile/internal/[email protected]#BitVec.Next

Not – https://pkg.go.dev/cmd/compile/internal/[email protected]#BitVec.Not

Or – https://pkg.go.dev/cmd/compile/internal/[email protected]#BitVec.Or

Set – https://pkg.go.dev/cmd/compile/internal/[email protected]#BitVec.Set

String – https://pkg.go.dev/cmd/compile/internal/[email protected]#BitVec.String

Unset – https://pkg.go.dev/cmd/compile/internal/[email protected]#BitVec.Unset

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Sign in to view source
Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactadmin@besthub.devand we will review it promptly.

performanceconcurrencyGoBitmapData StructuresBitVec
BirdNest Tech Talk
Written by

BirdNest Tech Talk

Author of the rpcx microservice framework, original book author, and chair of Baidu's Go CMC committee.

0 followers
Reader feedback

How this landed with the community

Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.