How to Process a 16 GB Log File in Seconds with Go

This tutorial explains how to extract logs from a 16 GB file in about 25 seconds by reading the file in chunks, reusing buffers with sync.Pool, and processing those chunks concurrently with Go goroutines, avoiding full‑memory loads or slow line‑by‑line scans.

Liangxu Linux

Large log files live on disk. Loading a 16 GB file into memory in one go is impractical on most machines, and scanning it line by line on a single thread is too slow for time‑critical analysis. This Go tutorial shows how to extract the matching log lines from such a file in about 25 seconds by reading the file in chunks, reusing buffers with sync.Pool, and processing the chunks concurrently with goroutines.

First, the file is opened with os.Open and wrapped in a buffered reader with bufio.NewReader. Instead of reading the whole file, the program repeatedly fills a fixed‑size byte slice (4 KB in the snippet below) using Read. Because a chunk boundary usually falls in the middle of a line, the rest of that line, up to the next newline, is appended with ReadBytes('\n') so that every chunk holds only complete lines. Each chunk is then handed to a worker goroutine.

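// Fragment; needs bufio, fmt, io, os, sync. The pools, start and end come from the caller.
f, err := os.Open(fileName)
if err != nil {
    fmt.Println("cannot open the file:", err)
    return
}
defer f.Close()

r := bufio.NewReader(f)
var wg sync.WaitGroup
for {
    buf := make([]byte, 4*1024) // chunk size
    n, err := r.Read(buf)
    buf = buf[:n]
    if n == 0 {
        if err != nil && err != io.EOF {
            fmt.Println("read error:", err)
        }
        break // end of file or read error
    }
    // read until newline so that no line is split across chunks
    rest, _ := r.ReadBytes('\n') // at EOF the last line may lack a newline
    buf = append(buf, rest...)
    wg.Add(1)
    go func(chunk []byte) { // launch goroutine for this chunk
        defer wg.Done()
        ProcessChunk(chunk, linesPool, stringPool, slicePool, start, end)
    }(buf)
}
wg.Wait()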
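
The chunking step can also be tried in isolation. The following is a minimal, self-contained sketch, not the article's code: readChunks and chunkStrings are hypothetical names, and the sketch only calls ReadBytes when a chunk does not already end on a newline, which is a small refinement over the loop above.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"strings"
)

// readChunks reads r in pieces of roughly size bytes and extends each piece
// to the next newline, so that no line is split across two chunks.
// The name and signature are illustrative, not taken from the article.
func readChunks(r io.Reader, size int) ([][]byte, error) {
	if size <= 0 {
		return nil, fmt.Errorf("chunk size must be positive, got %d", size)
	}
	br := bufio.NewReader(r)
	var chunks [][]byte
	for {
		buf := make([]byte, size)
		n, err := br.Read(buf)
		buf = buf[:n]
		if n > 0 {
			if buf[n-1] != '\n' {
				// Finish the partially read line.
				rest, rerr := br.ReadBytes('\n')
				if rerr != nil && rerr != io.EOF {
					return nil, rerr
				}
				buf = append(buf, rest...)
			}
			chunks = append(chunks, buf)
		}
		if err == io.EOF {
			return chunks, nil
		}
		if err != nil {
			return nil, err
		}
	}
}

// chunkStrings is a demo helper that runs readChunks over an in-memory string.
func chunkStrings(data string, size int) []string {
	chunks, err := readChunks(strings.NewReader(data), size)
	if err != nil {
		return nil
	}
	out := make([]string, len(chunks))
	for i, c := range chunks {
		out[i] = string(c)
	}
	return out
}

func main() {
	for i, c := range chunkStrings("alpha\nbeta\ngamma\ndelta\n", 8) {
		fmt.Printf("chunk %d: %q\n", i, c)
	}
}
```

Because readChunks accepts any io.Reader, the same function works on an *os.File. For a real 16 GB file the chunks would be processed as they are produced rather than collected in a slice.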

Two performance optimisations are introduced:

sync.Pool objects (linesPool, stringPool, slicePool) recycle byte slices, strings, and slices to reduce garbage‑collector pressure.

Goroutine workers process each chunk in parallel, coordinated by a sync.WaitGroup.
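
How the two optimisations fit together can be sketched as follows. This is an illustration under assumptions, not the article's implementation: bufPool and countLines are hypothetical names, and counting newlines stands in for real log processing.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// bufPool hands out reusable byte buffers. Storing *[]byte rather than
// []byte avoids an extra allocation each time a buffer is put into the pool.
var bufPool = sync.Pool{
	New: func() any {
		b := make([]byte, 0, 4*1024)
		return &b
	},
}

// countLines is a hypothetical stand-in for chunk processing: each chunk is
// copied into a pooled buffer and scanned for newlines in its own goroutine.
func countLines(chunks []string) int64 {
	var wg sync.WaitGroup
	var total atomic.Int64
	for _, c := range chunks {
		wg.Add(1)
		go func(chunk string) {
			defer wg.Done()
			bp := bufPool.Get().(*[]byte)
			buf := append((*bp)[:0], chunk...) // reuse the pooled backing array
			var n int64
			for _, b := range buf {
				if b == '\n' {
					n++
				}
			}
			total.Add(n)
			*bp = buf       // keep any growth for the next user
			bufPool.Put(bp) // return the buffer once this worker is done with it
		}(c)
	}
	wg.Wait() // block until every worker has called Done
	return total.Load()
}

func main() {
	fmt.Println(countLines([]string{"a\nb\n", "c\n", "d\ne\nf\n"}))
}
```

A buffer must only go back into the pool after the last read of its contents, otherwise another goroutine may overwrite it while it is still in use.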

The core processing function ProcessChunk converts the byte chunk to a string, splits it into lines, and then parses the ISO‑8601 timestamp at the beginning of each line. If the timestamp falls between the user‑provided start and end times, the line is output.

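import (
    "fmt"
    "strings"
    "sync"
    "time"
)

// chunkSize is the number of lines each worker goroutine handles.
// The value is an assumption for illustration; tune it for your data.
const chunkSize = 300

func ProcessChunk(chunk []byte, linesPool *sync.Pool, stringPool *sync.Pool, slicePool *sync.Pool, start, end time.Time) {
    var wg sync.WaitGroup
    logs := string(chunk) // the conversion copies the bytes, so the buffer can be recycled
    linesPool.Put(chunk)
    logsSlice := strings.Split(logs, "\n")
    // split work into sub‑chunks of chunkSize lines
    for i := 0; i < len(logsSlice); i += chunkSize {
        wg.Add(1)
        go func(s, e int) {
            defer wg.Done()
            for _, text := range logsSlice[s:e] {
                if len(text) == 0 {
                    continue
                }
                // the timestamp is the first comma‑separated field
                parts := strings.SplitN(text, ",", 2)
                t, err := time.Parse("2006-01-02T15:04:05.0000Z", parts[0])
                if err != nil {
                    continue // skip lines without a valid timestamp
                }
                if t.After(start) && t.Before(end) {
                    fmt.Println(text)
                }
            }
        }(i, min(i+chunkSize, len(logsSlice))) // built‑in min needs Go 1.21+
    }
    wg.Wait()
}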
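
The timestamp check at the heart of ProcessChunk can be exercised on its own. The sketch below assumes log lines of the form "timestamp,rest" with the same layout string as above; the window type and its methods are hypothetical helpers introduced only for this example.

```go
package main

import (
	"fmt"
	"strings"
	"time"
)

// tsLayout is the timestamp layout used in ProcessChunk.
const tsLayout = "2006-01-02T15:04:05.0000Z"

// window is a hypothetical helper type holding the query interval.
type window struct{ start, end time.Time }

// newWindow parses both interval bounds with tsLayout.
func newWindow(start, end string) (window, error) {
	s, err := time.Parse(tsLayout, start)
	if err != nil {
		return window{}, err
	}
	e, err := time.Parse(tsLayout, end)
	if err != nil {
		return window{}, err
	}
	return window{start: s, end: e}, nil
}

// contains reports whether the line's leading timestamp lies strictly
// between start and end. Lines without a parsable timestamp are rejected.
func (w window) contains(line string) bool {
	ts, _, _ := strings.Cut(line, ",")
	t, err := time.Parse(tsLayout, ts)
	if err != nil {
		return false
	}
	return t.After(w.start) && t.Before(w.end)
}

func main() {
	w, err := newWindow("2020-01-01T00:00:00.0000Z", "2020-01-02T00:00:00.0000Z")
	if err != nil {
		fmt.Println("bad interval:", err)
		return
	}
	for _, line := range []string{
		"2020-01-01T12:00:00.0000Z,INFO,service started",
		"2020-01-03T08:30:00.0000Z,WARN,disk almost full",
		"not a log line",
	} {
		fmt.Printf("%v  %s\n", w.contains(line), line)
	}
}
```

After and Before are both strict, so a line stamped exactly at the start or end of the interval is excluded. Whether the bounds should be inclusive depends on the query semantics you want.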

The main program parses command‑line arguments for the start time, end time and file path, determines the file size, reads the last line to obtain the most recent timestamp, and invokes Process only when the log’s time range overlaps the query interval. Finally, it prints the total execution time.
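
Reading the last line does not require scanning the file from the start. One way to do it, sketched here under assumptions, is to read a fixed-size tail with ReadAt. The names lastLine and lastLineOf, and the tailSize bound on line length, are hypothetical and not from the article.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"strings"
)

// lastLine returns the final non-empty line of a file-like source by reading
// only its last tailSize bytes. tailSize is an assumed upper bound on the
// length of one log line; a longer last line would come back truncated.
func lastLine(r io.ReaderAt, size, tailSize int64) (string, error) {
	if tailSize > size {
		tailSize = size
	}
	buf := make([]byte, tailSize)
	if _, err := r.ReadAt(buf, size-tailSize); err != nil && err != io.EOF {
		return "", err
	}
	buf = bytes.TrimRight(buf, "\r\n") // ignore the trailing newline, if any
	if i := bytes.LastIndexByte(buf, '\n'); i >= 0 {
		buf = buf[i+1:]
	}
	return string(buf), nil
}

// lastLineOf is a demo helper that runs lastLine over an in-memory string.
func lastLineOf(s string) string {
	line, err := lastLine(strings.NewReader(s), int64(len(s)), 1024)
	if err != nil {
		return ""
	}
	return line
}

func main() {
	log := "2020-01-01T00:00:00.0000Z,first\n2020-01-01T00:00:05.0000Z,last\n"
	fmt.Println(lastLineOf(log))
}
```

An *os.File satisfies io.ReaderAt, and its size is available from f.Stat(), so the same function applies to the log file itself. The timestamp of the returned line can then be parsed and compared with the query interval before any chunk is read.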

On a 16 GB log file, the complete extraction finishes in roughly 25 seconds, which shows that chunked reading, buffer pooling, and concurrent processing can turn a seemingly intractable task into a fast operation.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.
