How to Process a 16 GB Log File in Seconds with Go
This guide shows how to efficiently extract timestamp‑filtered logs from a 16 GB text file in Go by reading the file in chunks, reusing memory with sync.Pool, and parallelising processing with goroutines, achieving roughly 25 seconds runtime.
Modern computer systems generate massive logs daily, and storing debugging data in a database becomes impractical. Most companies keep logs as files on local disks. The article demonstrates how to extract specific log entries from a 16 GB .txt / .log file using Go.
The program starts by opening the file with os.Open and handling any error. Two naïve approaches are considered: line‑by‑line reading (low memory, high CPU) and loading the whole file into memory (fast but impossible for 16 GB). The chosen solution reads the file in fixed‑size chunks using bufio.NewReader, which balances memory usage and speed.
```go
f, err := os.Open(fileName)
if err != nil {
	fmt.Println("cannot open the file:", err)
	return
}
defer f.Close()
```

Chunked reading is performed in a loop that obtains a buffer from a sync.Pool, reads into it, and processes the data concurrently. The pool reuses byte slices to reduce garbage‑collector pressure.
```go
r := bufio.NewReader(f)

for {
	buf := linesPool.Get().([]byte)

	n, err := r.Read(buf)
	buf = buf[:n]

	if n == 0 {
		if err == io.EOF {
			break
		}
		if err != nil {
			fmt.Println(err)
			return err
		}
		continue
	}

	// read to the end of the current line so no log entry
	// is split across two chunks
	nextUntilNewline, err := r.ReadBytes('\n')
	if err != io.EOF {
		buf = append(buf, nextUntilNewline...)
	}

	wg.Add(1)
	go func() {
		// ProcessChunk returns buf to the pool when it is done with it
		ProcessChunk(buf, &linesPool, &stringPool, &slicePool, start, end)
		wg.Done()
	}()
}
wg.Wait()
```

Two optimisation points are highlighted:
- sync.Pool reuses memory slices, lowering GC overhead.
- Goroutines process each chunk in parallel, dramatically increasing throughput.
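The pools referenced above are never shown being constructed in the article. A minimal sketch might look like the following, where the chunkSize value is our assumption (the article does not state its buffer size):

```go
package main

import (
	"fmt"
	"sync"
)

// chunkSize is an assumed value; pick it to balance memory use
// against the number of goroutines spawned per file.
const chunkSize = 250 * 1024

// linesPool hands out reusable byte slices for raw chunks. New runs
// only when the pool is empty, so steady-state reads allocate nothing.
var linesPool = sync.Pool{
	New: func() interface{} {
		return make([]byte, chunkSize)
	},
}

// stringPool plays the same role for the chunk-as-string conversion
// done inside ProcessChunk.
var stringPool = sync.Pool{
	New: func() interface{} {
		return ""
	},
}

func main() {
	buf := linesPool.Get().([]byte)
	fmt.Println(len(buf)) // 256000 with the assumed chunkSize
	linesPool.Put(buf)
}
```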
The ProcessChunk function splits a chunk into individual log lines, parses the ISO‑8601 timestamp, and prints the line if the timestamp falls between the user‑provided start and end times.
```go
func ProcessChunk(chunk []byte, linesPool *sync.Pool, stringPool *sync.Pool,
	slicePool *sync.Pool, start time.Time, end time.Time) {

	var wg2 sync.WaitGroup

	logs := string(chunk)
	linesPool.Put(chunk) // hand the byte slice back for reuse

	logsSlice := strings.Split(logs, "\n")
	stringPool.Put(logs)

	chunkSize := 300
	n := len(logsSlice)
	noOfThread := n / chunkSize
	if n%chunkSize != 0 {
		noOfThread++
	}

	for i := 0; i < noOfThread; i++ {
		wg2.Add(1)
		go func(s, e int) {
			defer wg2.Done()
			for i := s; i < e; i++ {
				text := logsSlice[i]
				if len(text) == 0 {
					continue
				}
				logSlice := strings.SplitN(text, ",", 2)
				logCreationTimeString := logSlice[0]
				logCreationTime, err := time.Parse("2006-01-02T15:04:05.0000Z", logCreationTimeString)
				if err != nil {
					fmt.Printf("\nCould not parse the time :%s for log : %v", logCreationTimeString, text)
					continue // skip the malformed line rather than abort the batch
				}
				if logCreationTime.After(start) && logCreationTime.Before(end) {
					fmt.Println(text)
				}
			}
		}(i*chunkSize, int(math.Min(float64((i+1)*chunkSize), float64(len(logsSlice)))))
	}
	wg2.Wait()
}
```

The command‑line interface expects six arguments after the executable name: the -f flag, the start timestamp, the -t flag, the end timestamp, an input‑file flag, and the log file path. It validates the argument count and parses the timestamps using the layout 2006-01-02T15:04:05.0000Z.
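One small detail in ProcessChunk above: the end index of each sub-slice goes through float64 and math.Min because Go's math package only operates on floats. A plain integer helper (the name minInt is ours, not the article's) does the same without conversions:

```go
package main

import "fmt"

// minInt mirrors the bound computation in ProcessChunk without the
// int -> float64 -> int round-trip through math.Min.
func minInt(a, b int) int {
	if a < b {
		return a
	}
	return b
}

func main() {
	chunkSize, n := 300, 700 // e.g. 700 lines split into batches of 300
	for i := 0; i*chunkSize < n; i++ {
		fmt.Println(i*chunkSize, minInt((i+1)*chunkSize, n))
	}
	// prints the batch bounds: 0 300, 300 600, 600 700
}
```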
```go
func main() {
	s := time.Now()

	args := os.Args[1:]
	if len(args) != 6 {
		fmt.Println("Please give proper command line arguments")
		return
	}
	startTimeArg := args[1]
	finishTimeArg := args[3]
	fileName := args[5]

	// open file and parse timestamps ...
	// call Process or ProcessChunk as needed
	_, _, _ = startTimeArg, finishTimeArg, fileName // consumed by the elided code

	fmt.Println("\nTime taken - ", time.Since(s))
}
```

Benchmarking on a 16 GB log file shows the extraction completes in about 25 seconds, demonstrating that chunked reading, memory pooling, and concurrent processing can handle massive files efficiently.
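The timestamp parsing elided in main above uses Go's reference-time layout, where the layout string is the fixed moment Mon Jan 2 15:04:05 2006 written in the shape of the data. A sketch of that step (the helper name parseBoundary is ours):

```go
package main

import (
	"fmt"
	"time"
)

// layout is Go's reference time written in the shape of the log
// timestamps; the trailing Z here is matched as a literal character,
// so the parsed time defaults to UTC.
const layout = "2006-01-02T15:04:05.0000Z"

// parseBoundary turns a CLI timestamp argument into a time.Time.
func parseBoundary(arg string) (time.Time, error) {
	return time.Parse(layout, arg)
}

func main() {
	start, err := parseBoundary("2020-01-01T00:00:00.0000Z")
	if err != nil {
		fmt.Println("wrong time format:", err)
		return
	}
	fmt.Println(start.UTC()) // 2020-01-01 00:00:00 +0000 UTC
}
```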
For the full source code, see the original Medium article: https://medium.com/swlh/processing-16gb-file-in-seconds-go-lang-3982c235dfa2
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact us and we will review it promptly.
Go Development Architecture Practice
Daily sharing of Golang-related technical articles, practical resources, language news, tutorials, real-world projects, and more. Looking forward to growing together. Let's go!