
Linux Page Cache Optimization for Kafka: Concepts, Parameter Tuning, and Performance Evaluation

The article explains Linux page cache fundamentals, shows how to inspect and reclaim cache, and provides detailed tuning of vm.dirty_* and vm.swappiness parameters to smooth Kafka write traffic, reduce I/O spikes, and improve overall performance, illustrated with before‑and‑after benchmarks.

vivo Internet Technology

This article describes the background of Linux Page Cache optimization, basic concepts of Page Cache, previous solutions for Kafka I/O bottlenecks, how to adjust Page Cache related parameters, and a performance comparison before and after optimization.

1. Optimization Background

When business volume grows to trillions of records per day, Kafka clusters come under enormous disk I/O pressure, and disk I/O becomes the biggest performance bottleneck. Sudden spikes in inbound or outbound traffic can saturate disk I/O, causing request failures and even cascading broker failures.

This article focuses on one optimization: tuning Linux Page Cache parameters.

2. Basic Concepts

What is Page Cache? Page Cache is a memory cache for the file system that stores file data in RAM to reduce disk I/O.

Two reasons make it effective:

Disk access is orders of magnitude slower than memory.

Frequently accessed data is likely to be accessed again.

Read Cache

When a read request arrives, the kernel first checks if the data is already in Page Cache. If it is, the request is served from memory (cache hit). If not, the kernel reads from disk, caches the data, and subsequent reads hit the cache.
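The difference between a cold and a warm read is easy to observe directly. A minimal sketch (timings vary by hardware; note that a freshly written file is usually already cached, so for a true cold first read you would need to drop caches first as root, as shown in a later section):

```shell
# Create a 256 MB test file, then time two consecutive reads.
dd if=/dev/zero of=/tmp/pc_test bs=1M count=256 2>/dev/null
sync
# First read: may be served from disk (or from cache, since we just wrote it).
time dd if=/tmp/pc_test of=/dev/null bs=1M 2>/dev/null
# Second read: served from Page Cache, typically much faster than disk.
time dd if=/tmp/pc_test of=/dev/null bs=1M 2>/dev/null
rm -f /tmp/pc_test
```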

Write Cache

Write requests are written to the cache first; the underlying storage is not updated immediately. The kernel marks the pages as dirty, adds them to the dirty list, and periodically flushes them to disk. Flushing is triggered when either of the following conditions is met:

The dirty data has existed longer than vm.dirty_expire_centisecs (default 3000 centiseconds = 30 s).

The proportion of dirty pages exceeds vm.dirty_background_ratio (default 10 % of available memory).
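Both triggers can be watched in real time: the kernel exposes the current amount of dirty data and in-flight writeback in /proc/meminfo, with no extra tooling required:

```shell
# Show how much page-cache data is currently dirty (awaiting flush)
# and how much is being written back to disk right now.
grep -E '^(Dirty|Writeback):' /proc/meminfo
```

During a Kafka write burst, Dirty grows until one of the two flush conditions fires, after which Writeback rises as the kernel flusher threads push pages to disk.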

3. Page Cache Inspection Tool

The cachestat tool can be used to view cache statistics.

Installation:

mkdir /opt/bigdata/app/cachestat
cd /opt/bigdata/app/cachestat
git clone --depth 1 https://github.com/brendangregg/perf-tools

Run the tool from the cloned repository (it lives at perf-tools/bin/cachestat). Each interval it prints one line with HITS (reads served from Page Cache), MISSES (requests that went to disk), DIRTIES (pages dirtied by writes), RATIO (the hit ratio, HITS / (HITS + MISSES)), and BUFFERS_MB / CACHED_MB (current buffer and page-cache sizes in MB).
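cachestat requires root and ftrace support. As a lighter-weight cross-check, the figures it reports as BUFFERS_MB and CACHED_MB correspond to the Buffers and Cached fields of /proc/meminfo:

```shell
# Read the same buffer and page-cache sizes cachestat reports,
# straight from the kernel (/proc/meminfo values are in kB).
awk '/^(Buffers|Cached):/ {printf "%s %d MB\n", $1, $2/1024}' /proc/meminfo
```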

4. How to Reclaim Page Cache

Run the following as root. Writing 1 drops clean page-cache pages, 2 drops dentries and inodes, and 3 drops both; run sync first so dirty pages are written back and become reclaimable:

sync
echo 1 > /proc/sys/vm/drop_caches

After reclamation, buff/cache should be close to zero unless there are ongoing writes.

Images before and after reclamation illustrate the effect.

5. Parameter Tuning

View current Page Cache parameters:

sysctl -a | grep dirty

Default values (Linux kernel):

vm.dirty_background_bytes = 0  # absolute-size alternative to vm.dirty_background_ratio; a nonzero value overrides the ratio
vm.dirty_background_ratio = 10
vm.dirty_bytes = 0  # absolute-size alternative to vm.dirty_ratio; a nonzero value overrides the ratio
vm.dirty_ratio = 20
vm.dirty_expire_centisecs = 3000  # 30 s
vm.dirty_writeback_centisecs = 500  # 5 s
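The same values can also be read directly from procfs, which is handy on minimal hosts without the sysctl binary (each vm.* parameter maps to a file under /proc/sys/vm/):

```shell
# Print the current value of each dirty-page tuning parameter.
for p in dirty_background_ratio dirty_ratio dirty_expire_centisecs dirty_writeback_centisecs; do
    printf '%-28s %s\n' "$p" "$(cat /proc/sys/vm/$p)"
done
```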

Potential problems when a large amount of data stays cached:

Higher risk of data loss on power failure.

Long IO spikes that degrade write performance.

Optimization Recommendations

vm.dirty_background_ratio : Reduce the ratio to trigger more frequent, smaller flushes, smoothing write spikes. Example setting:

# Temporary change
sysctl -w vm.dirty_background_ratio=1
# Permanent change
echo "vm.dirty_background_ratio=1" >> /etc/sysctl.conf
sysctl -p /etc/sysctl.conf

Alternative: create a dedicated file in /etc/sysctl.d/ :

touch /etc/sysctl.d/kafka-optimization.conf
echo "vm.dirty_background_ratio=1" >> /etc/sysctl.d/kafka-optimization.conf
sysctl --system

vm.dirty_ratio : The hard limit. When the proportion of dirty pages exceeds it, the kernel blocks the writing processes and flushes synchronously, stalling application writes until enough dirty data has been written back. Increase it for heavy write workloads to avoid these stalls; a lower value suffices for lighter workloads.
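To reason about these percentages in absolute terms, they can be converted against the machine's memory size. A rough sketch (the kernel applies the ratios to available rather than strictly total memory, so treat the result as an estimate):

```shell
# Estimate the background-flush and write-stall thresholds in MB.
total_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
bg=$(cat /proc/sys/vm/dirty_background_ratio)
hard=$(cat /proc/sys/vm/dirty_ratio)
echo "background flush starts near: $(( total_kb * bg   / 100 / 1024 )) MB"
echo "writes stall near:            $(( total_kb * hard / 100 / 1024 )) MB"
```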

vm.dirty_expire_centisecs : Works together with vm.dirty_background_ratio . It defines a time‑based flush trigger, ensuring data is persisted even if the size threshold is not reached.

vm.dirty_writeback_centisecs : Smaller values increase flush frequency; ensure the interval is long enough for the flush to complete.

vm.swappiness : Set to a low value such as 1 so the kernel reclaims page cache before swapping out anonymous memory. Note that 0 does not disable swap outright; on kernels 3.5 and later it avoids swapping so aggressively that the OOM killer can fire under memory pressure, so 1 is usually the safer minimum.

6. Performance Comparison

After tuning, the following metrics improved:

Write traffic became smoother, with fewer spikes.

Disk I/O utilization showed a flatter curve.

Network inbound traffic remained unchanged.

Charts illustrating these improvements are included in the original article.

7. Summary

Different hardware configurations may yield varying absolute results, but the trend of parameter changes is consistent:

Increasing vm.dirty_background_ratio or vm.dirty_expire_centisecs leads to larger traffic and I/O spikes.

Decreasing those parameters smooths traffic and I/O.

Setting vm.dirty_ratio too low (<10) creates periodic traffic valleys due to write stalls.

Setting vm.dirty_ratio high (>40) avoids write stalls and keeps traffic smooth.

An example of a well‑balanced configuration: vm.dirty_background_ratio=1 , vm.dirty_ratio=80 , vm.dirty_expire_centisecs=1000 .
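Collected into a single sysctl drop-in, that example configuration would look like this (the file name follows the earlier /etc/sysctl.d example; apply it with sysctl --system):

```
# /etc/sysctl.d/kafka-optimization.conf
vm.dirty_background_ratio = 1
vm.dirty_ratio = 80
vm.dirty_expire_centisecs = 1000
```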

The overall flow of traffic remains unaffected while the system becomes more stable.
