How to Process a 16 GB Log File in Seconds with Go Concurrency

This article explains how to efficiently extract time‑range logs from a massive 16 GB .txt/.log file using Go's bufio.NewReader, sync.Pool for buffer reuse, and concurrent goroutines, achieving processing times of around 25 seconds.

21CTO

Modern computer systems generate massive amounts of log data every day. Storing this immutable debugging information in a database is often impractical, so most companies keep logs in local files.

We demonstrate how to extract logs from a 16 GB .txt or .log file using Go. First, the file is opened with os.Open, which returns an *os.File. Two naive approaches do not scale: reading line by line is too slow, and loading the entire file into memory is not possible on a machine with less RAM than the file size.

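Instead, we read the file in chunks with bufio.NewReader. A buffer of a few kilobytes is allocated, filled with Read, and processed. Because a fixed-size read can stop in the middle of a log line, a chunk has to be extended to the next newline before it is parsed. To reduce garbage-collector pressure, sync.Pool is used to recycle the byte buffers and the slices that hold the split lines.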

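f, err := os.Open(fileName)
if err != nil {
    fmt.Println("cannot open the file:", err)
    return
}
defer f.Close()

r := bufio.NewReader(f)
buf := make([]byte, 4*1024) // 4 KB chunk size
n, err := r.Read(buf)       // n may be smaller than len(buf)
if err != nil && err != io.EOF {
    fmt.Println("cannot read the file:", err)
    return
}
// ... process buf[:n]; only the first n bytes hold data ...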
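The snippet above reads a single chunk. A minimal sketch of the full read loop could look like the following. It assumes that each chunk is extended to the next newline so no log line is split, and that the buffer comes from a sync.Pool. The names readChunks and collectChunks are hypothetical and do not come from the article.

```go
package main

import (
    "bufio"
    "fmt"
    "io"
    "strings"
    "sync"
)

// readChunks reads r in pieces of about chunkSize bytes and extends each piece
// to the next newline, so that handle never sees a line cut in half.
// handle must not keep the chunk after it returns: the buffer is reused.
func readChunks(r io.Reader, chunkSize int, handle func(chunk []byte)) error {
    pool := sync.Pool{New: func() interface{} { return make([]byte, chunkSize) }}
    br := bufio.NewReader(r)
    for {
        buf := pool.Get().([]byte)
        n, err := br.Read(buf)
        if n > 0 {
            chunk := buf[:n]
            if chunk[n-1] != '\n' {
                // Finish the line that the fixed-size read stopped inside.
                rest, restErr := br.ReadBytes('\n')
                chunk = append(chunk, rest...)
                if restErr != nil && restErr != io.EOF {
                    pool.Put(buf)
                    return restErr
                }
            }
            handle(chunk)
        }
        pool.Put(buf) // hand the buffer back for the next iteration
        if err == io.EOF {
            return nil
        }
        if err != nil {
            return err
        }
    }
}

// collectChunks is a hypothetical demo helper that copies every chunk into a string.
func collectChunks(input string, chunkSize int) ([]string, error) {
    var chunks []string
    err := readChunks(strings.NewReader(input), chunkSize, func(chunk []byte) {
        chunks = append(chunks, string(chunk)) // string() copies the bytes
    })
    return chunks, err
}

func main() {
    chunks, err := collectChunks("a,1\nbb,2\nccc,3\n", 5)
    if err != nil {
        fmt.Println("read error:", err)
        return
    }
    for _, c := range chunks {
        fmt.Printf("%q\n", c)
    }
}
```

In this sketch the callback runs synchronously, which is why the buffer can be returned to the pool right after handle. If the chunk is passed to a goroutine instead, the buffer should only be returned once that goroutine has finished with it.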

Each chunk is handed off to a separate goroutine, so several chunks are parsed at the same time while the file is still being read. The ProcessChunk function splits the chunk into individual log lines, parses the ISO-8601 timestamp at the start of each line, and prints the lines whose timestamps fall strictly between the user-specified start and end times.

func ProcessChunk(chunk []byte, linesPool *sync.Pool, stringPool *sync.Pool, start, end time.Time) {
    // string(chunk) copies the bytes, so the buffer can go back to the pool right away.
    logs := string(chunk)
    linesPool.Put(chunk)
    // stringPool is not used in this excerpt.
    logsSlice := strings.Split(logs, "\n")
    for _, text := range logsSlice {
        if len(text) == 0 {
            continue
        }
        // The timestamp is the first comma-separated field of the line.
        parts := strings.SplitN(text, ",", 2)
        t, err := time.Parse("2006-01-02T15:04:05.0000Z", parts[0])
        if err != nil {
            fmt.Printf("Could not parse time: %s\n", parts[0])
            continue
        }
        // Both bounds are exclusive.
        if t.After(start) && t.Before(end) {
            fmt.Println(text)
        }
    }
}
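To make the timestamp check in ProcessChunk easier to see in isolation, here is a small sketch of it as a standalone function. The layout string is taken from the code above; inRange, mustParse, and the sample lines are hypothetical. It illustrates two details of that check: the ".0000" in the layout expects exactly four fractional digits, and After/Before exclude the bounds themselves.

```go
package main

import (
    "fmt"
    "strings"
    "time"
)

// layout is the timestamp layout used by ProcessChunk. The trailing "Z" is
// matched literally, so parsed times are in UTC.
const layout = "2006-01-02T15:04:05.0000Z"

// inRange reports whether the timestamp in the first comma-separated field of
// line lies strictly between start and end.
func inRange(line string, start, end time.Time) (bool, error) {
    field := strings.SplitN(line, ",", 2)[0]
    t, err := time.Parse(layout, field)
    if err != nil {
        return false, fmt.Errorf("could not parse time %q: %w", field, err)
    }
    return t.After(start) && t.Before(end), nil
}

// mustParse is a hypothetical demo helper; it panics on a malformed timestamp.
func mustParse(s string) time.Time {
    t, err := time.Parse(layout, s)
    if err != nil {
        panic(err)
    }
    return t
}

func main() {
    start := mustParse("2020-01-31T20:00:00.0000Z")
    end := mustParse("2020-01-31T21:00:00.0000Z")
    lines := []string{
        "2020-01-31T20:12:38.1234Z,info,inside the window",
        "2020-01-31T21:00:00.0000Z,info,exactly on the end bound",
        "2020-01-31T20:12:38Z,info,no fractional seconds",
    }
    for _, line := range lines {
        ok, err := inRange(line, start, end)
        fmt.Printf("match=%v err=%v\n", ok, err)
    }
}
```

If lines that fall exactly on the start or end time should be included, the comparison could be written as !t.Before(start) && !t.After(end) instead.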

A benchmark on a 16 GB log file shows the entire extraction completes in roughly 25 seconds, demonstrating that chunked reading combined with buffer pooling and goroutine parallelism can handle very large log files efficiently.

func main() {
    // parse command‑line arguments for start time, end time, and file path
    // open file, set up sync.Pool objects, and launch goroutine workers as shown above
    // wait for all workers to finish and report elapsed time
}
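
The outline above leaves the wiring open. One possible way to fill it in is sketched below, under several assumptions: the flag names -f, -s, and -e are invented for illustration, matches are collected in a slice under a mutex instead of being printed from each goroutine, and buffers are allocated per chunk rather than pooled to keep the example short. The function names filterChunk, extract, and runSample are hypothetical as well.

```go
package main

import (
    "bufio"
    "flag"
    "fmt"
    "io"
    "os"
    "strings"
    "sync"
    "time"
)

const layout = "2006-01-02T15:04:05.0000Z" // timestamp layout from ProcessChunk

// filterChunk returns the lines of chunk whose leading timestamp lies strictly
// between start and end. Lines that fail to parse are skipped silently.
func filterChunk(chunk string, start, end time.Time) []string {
    var out []string
    for _, line := range strings.Split(chunk, "\n") {
        t, err := time.Parse(layout, strings.SplitN(line, ",", 2)[0])
        if err == nil && t.After(start) && t.Before(end) {
            out = append(out, line)
        }
    }
    return out
}

// extract reads r in line-aligned chunks and filters each chunk in its own
// goroutine. The order of the returned matches is not guaranteed.
func extract(r io.Reader, chunkSize int, start, end time.Time) ([]string, error) {
    var (
        wg      sync.WaitGroup
        mu      sync.Mutex
        matches []string
    )
    br := bufio.NewReader(r)
    for {
        buf := make([]byte, chunkSize)
        n, err := br.Read(buf)
        if n > 0 {
            chunk := buf[:n]
            if chunk[n-1] != '\n' { // finish the line the read stopped inside
                rest, restErr := br.ReadBytes('\n')
                chunk = append(chunk, rest...)
                if restErr != nil && restErr != io.EOF {
                    err = restErr
                }
            }
            wg.Add(1)
            go func(text string) {
                defer wg.Done()
                found := filterChunk(text, start, end)
                mu.Lock()
                matches = append(matches, found...)
                mu.Unlock()
            }(string(chunk))
        }
        if err != nil {
            wg.Wait() // all workers must finish before matches is read
            if err == io.EOF {
                return matches, nil
            }
            return matches, err
        }
    }
}

// runSample filters a small in-memory log so the sketch runs without a file.
func runSample() []string {
    logs := "2020-01-31T19:59:59.0000Z,info,before\n" +
        "2020-01-31T20:12:38.1234Z,info,first match\n" +
        "2020-01-31T20:45:00.0000Z,warn,second match\n" +
        "2020-01-31T21:30:00.0000Z,info,after\n"
    start, _ := time.Parse(layout, "2020-01-31T20:00:00.0000Z")
    end, _ := time.Parse(layout, "2020-01-31T21:00:00.0000Z")
    matches, _ := extract(strings.NewReader(logs), 16, start, end)
    return matches
}

func main() {
    file := flag.String("f", "", "log file to scan (hypothetical flag name)")
    from := flag.String("s", "", "start time, e.g. 2020-01-31T20:00:00.0000Z")
    to := flag.String("e", "", "end time, same layout as the start time")
    flag.Parse()
    if *file == "" { // no file given: run the built-in sample
        fmt.Println(strings.Join(runSample(), "\n"))
        return
    }
    start, err1 := time.Parse(layout, *from)
    end, err2 := time.Parse(layout, *to)
    f, err3 := os.Open(*file)
    if err1 != nil || err2 != nil || err3 != nil {
        fmt.Fprintln(os.Stderr, "bad arguments:", err1, err2, err3)
        return
    }
    defer f.Close()
    began := time.Now()
    matches, err := extract(f, 4*1024, start, end)
    if err != nil {
        fmt.Fprintln(os.Stderr, "read error:", err)
    }
    fmt.Println(strings.Join(matches, "\n"))
    fmt.Fprintln(os.Stderr, "elapsed:", time.Since(began))
}
```

This sketch starts one goroutine per chunk without any limit. For a very large file it may be worth bounding the number of in-flight goroutines, for example with a buffered channel used as a semaphore, and streaming matches to the output instead of holding them all in memory.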