Master Log Management: Automate Cleanup with crontab & logrotate
This guide explains log management goals, special scenarios that cause uncontrolled log growth, and practical solutions using Linux's crontab for scheduled cleanup and the logrotate tool for automated rotation and retention across common services like MySQL, nginx, and Kafka.
Log Management Goals
Log management generally includes two parts: (1) the log content itself—well‑structured logs (anchors, formats, etc.) provide valuable records for troubleshooting, and (2) log archiving rules—such as splitting by date or size and retaining a limited number of archives (e.g., only the last month).
For custom services, developers can tailor logging via components like Logback or Log4j.
For third‑party components (MySQL, nginx, Redis, Nacos, Sentinel, etc.) that do not expose rich log configuration, precise log management is often impossible.
Special Log Scenarios
Some services or components, if not specially configured, will produce uncontrolled log files that eventually exhaust disk space. Common cases include:
Applications started with nohup without log redirection, causing output to accumulate in nohup.out or a single redirected file.
MySQL can set a log file path but cannot automatically clean old logs.
nginx allows configuring log templates and paths (default access.log, error.log) but does not auto‑clean.
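For the nohup case, a minimal sketch of starting a process with its output appended to a named log file, so that later cleanup or rotation has a known path to target. The temp directory, the app.log name, and the stand-in command are all hypothetical.

```shell
#!/bin/sh
# Hypothetical paths and command: substitute the real service and log location.
log_dir=$(mktemp -d)
app_log="$log_dir/app.log"

# Append stdout and stderr to a named file instead of letting nohup
# fall back to nohup.out in the current directory.
nohup sh -c 'echo "service started"' >> "$app_log" 2>&1 &
wait $!

cat "$app_log"
```

Appending with >> rather than > matters later: a writer in append mode keeps working cleanly if the file is truncated in place.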
Special Tool – Scheduled Cleanup
You can use Linux's built‑in scheduler crontab together with a cleanup script. Example:
<code>crontab -e
# Every hour, delete log files last modified more than 7 days ago
0 * * * * find /logs.dir/ -type f -mtime +7 -exec rm -f {} +</code>
Important Considerations
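The first thing to check is what the cleanup command above will actually match. The sketch below rehearses it against a temp directory standing in for /logs.dir/ (the file names are made up, and touch -d assumes GNU coreutils). Adding -type f restricts the match to regular files, so the log directory itself is never removed.

```shell
#!/bin/sh
# Sketch only: a temp directory stands in for the hypothetical /logs.dir/.
logs_dir=$(mktemp -d)
touch -d "10 days ago" "$logs_dir/old.log"   # GNU touch: backdate the mtime
touch "$logs_dir/new.log"

# Dry run: list what the cleanup would remove, without removing anything.
matched=$(find "$logs_dir" -type f -mtime +7 -print)
echo "would delete: $matched"

# Real run: same expression, with the delete action swapped in.
find "$logs_dir" -type f -mtime +7 -exec rm -f {} +
remaining=$(ls "$logs_dir")
echo "remaining: $remaining"
```

Running the -print form by hand before installing the cron entry is a cheap way to confirm the path and the age threshold.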
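A running service keeps its log file open through a file handle, and that handle follows the file itself rather than its path. Rotating or cleaning logs by hand therefore leads to issues such as:
After a log file is renamed, the service keeps writing to the renamed file instead of a new file at the original path.
After a log file is deleted (rm -f), its space is not released until the service is restarted; until then the file remains held open.
Deleted but still‑open files are invisible to ls and du, yet df reports the disk space they occupy; lsof is needed to locate them.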
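These behaviors can be reproduced without a real service. In the sketch below the shell itself plays the part of the service by holding a file open on descriptor 3; it is Linux-only because it reads /proc, and the file names are hypothetical.

```shell
#!/bin/sh
# Linux-only sketch: relies on /proc/<pid>/fd. File names are hypothetical.
work_dir=$(mktemp -d)
cd "$work_dir" || exit 1

# Hold app.log open on descriptor 3, as a long-running service would.
exec 3>> app.log
echo "before rename" >&3

# Rename: the descriptor follows the file, not the path.
mv app.log app.log.1
echo "after rename" >&3
renamed_lines=$(wc -l < app.log.1)   # both lines ended up in app.log.1

# Delete: the name is gone, but the kernel still tracks the open file.
rm -f app.log.1
fd_target=$(readlink "/proc/$$/fd/3")
echo "fd 3 -> $fd_target"

exec 3>&-   # only closing the descriptor releases the space
```

The readlink output ends in "(deleted)", which is the same marker that lsof reports for such files.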
Solutions:
For one‑time cleanup, truncate the file in place (e.g., echo > log.log, which leaves a single newline, or : > log.log for a fully empty file) instead of deleting it.
If the file was already removed, use lsof | grep -i deleted to find the owning process and restart it.
To retain content while controlling size, use logrotate with a copy‑and‑truncate approach.
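A hand-rolled version of the copy‑and‑truncate idea, as a rough sketch: the shell holds the file open in append mode to stand in for the service, and the file names are hypothetical. It is meant to show why the writer does not need a restart, not to replace logrotate.

```shell
#!/bin/sh
# Sketch only: descriptor 3 stands in for a service writing in append mode.
work_dir=$(mktemp -d)
cd "$work_dir" || exit 1

exec 3>> app.log
echo "old entry" >&3

# Copy-and-truncate by hand: keep the content, then empty the live file in place.
cp app.log app.log.1
: > app.log                 # truncates the same file; the open handle stays valid

echo "new entry" >&3        # the writer carries on without a restart
exec 3>&-

archived=$(cat app.log.1)
live=$(cat app.log)
echo "archived: $archived"
echo "live: $live"
```

Anything written between the cp and the truncation is lost, which is the known trade-off of this approach. It also assumes the writer opened the file in append mode; a writer that tracks its own offset would leave a hole at the start of the file after truncation.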
Special Tool – logrotate
For components that cannot be managed individually, Linux provides logrotate, which can be scheduled via crontab and supports custom retention policies. Common options include:
Log rotation period (daily, weekly, monthly)
Log file extension
Rotation method (copytruncate, create, etc.)
Compression of rotated logs
Number of retained archives
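As a rough illustration of how the options above map onto directives, the sketch below writes a throwaway configuration into a temp directory and, if logrotate is installed, checks it with a dry run. The paths and the app.log name are hypothetical, and treating dateext as the "extension" option is an assumption based on the configurations used elsewhere in this guide.

```shell
#!/bin/sh
# Throwaway paths under a temp dir; nothing here touches /etc or /var/log.
work_dir=$(mktemp -d)
log_file="$work_dir/app.log"
conf_file="$work_dir/app.conf"
echo "sample entry" > "$log_file"

# One directive per option from the list above.
cat > "$conf_file" <<EOF
$log_file {
    daily
    dateext
    copytruncate
    compress
    rotate 7
    missingok
    notifempty
}
EOF

# -d is a dry run; -s keeps the state file out of the system location.
if command -v logrotate >/dev/null 2>&1; then
    logrotate -d -s "$work_dir/status" "$conf_file"
fi
```

Because -d makes no changes, the sample log is left untouched; dropping -d and adding -f would perform a real rotation inside the temp directory.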
Typical logrotate command syntax:
<code>logrotate [OPTION...] <configfile>
-d, --debug # dry run: show what would be done without changing anything
-f, --force # force rotation even if logrotate thinks it is not needed
-m, --mail=command # command to use when mailing logs
-s, --state=statefile # use an alternate state file
-v, --verbose # verbose output</code>
Example nginx configuration (/etc/logrotate.d/nginx):
<code>/usr/share/nginx/log/*.log {
    daily
    missingok
    rotate 7
    compress
    delaycompress
    notifempty
    create 644 root root
    sharedscripts
    postrotate
        [ ! -f /var/run/nginx.pid ] || kill -USR1 `cat /var/run/nginx.pid`
    endscript
}</code>
Schedule via crontab:
<code>echo "0 0 * * * /usr/sbin/logrotate -vf /etc/logrotate.d/nginx > /dev/null 2>&1" >> /var/spool/cron/root</code>
Appendix: Simple logrotate Configurations
<code># MySQL
/data/mysql/log/mysqld.log {
    daily
    dateext
    dateyesterday
    copytruncate
    notifempty
    missingok
    olddir backup
    rotate 60
    compress
}
# nginx
/usr/local/nginx/logs/access.log
/usr/local/nginx/logs/error.log {
    daily
    dateext
    dateyesterday
    copytruncate
    notifempty
    missingok
    olddir backup
    rotate 30
    compress
}</code>
Appendix: Log Management for Common Components
nginx does not auto‑clean; logs grow in a single file.
MySQL does not auto‑clean; logs grow in a single file.
Zookeeper supports auto‑clean via log4j size/number limits.
Redis records only minimal core logs; no cleanup needed.
Kafka data logs (topic, offset) support auto‑clean via configuration.
Kafka operation logs rotate automatically but are not auto‑cleaned; managed via log4j.
Raymond Ops
Linux ops automation, cloud-native, Kubernetes, SRE, DevOps, Python, Golang and related tech discussions.