
How to Build a Real‑Time Log Collector with Java and Tail – Why -F Beats -f

This article shows how to use the Linux tail command together with a simple Java program to collect logs in real time, explains the difference between tail -f and tail -F, demonstrates file rotation handling, and offers practical tips for reliable log monitoring.

macrozheng
The tail command can show a log scrolling by as it is written, which is very convenient. So xjjdog wondered whether tail could also be used for log collection.

Imagine you need a fast real‑time log collection tool; tail is an excellent choice, far simpler than Flume, Logstash, or Filebeat. In the past, I built such a collector with tail and it worked well.

Below is Java code that reads the output of a tail process line by line, allowing you to process the logs however you like.

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

public class TailReader {
    public static void main(String[] args) throws Exception {
        // start tail as a child process that follows /tmp/tail0
        ProcessBuilder ps = new ProcessBuilder("tail", "-f", "/tmp/tail0");
        // merge stderr into stdout so a single reader sees both
        ps.redirectErrorStream(true);
        Process process = ps.start();
        // continuously read tail output; readLine() returns null once tail exits
        // (the log is assumed to be UTF-8 encoded)
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(process.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = in.readLine()) != null) {
                try {
                    setLogToKafka(line);
                } catch (RuntimeException e) {
                    // do not let a handler error escape, otherwise the loop stops
                    System.err.println("failed to handle line: " + e);
                }
            }
        }
    }

    // simulate sending to Kafka; here we just print
    static void setLogToKafka(String line) {
        System.out.println(line);
    }
}

The main idea is to start tail as a child process from Java, merge its stderr into its stdout, and read that combined output line by line through a BufferedReader. From there you can handle each line as needed.

Be aware that if the tail process is killed, its output stream closes, readLine() returns null, and the Java program stops collecting logs.
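
A killed tail does not have to end collection for good. The following is a minimal sketch, assuming it is acceptable to simply start tail again after it exits. The class name TailSupervisor, the helpers runOnce and supervise, and the restart limit and delay are hypothetical choices made for illustration, not part of the original collector.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.function.Consumer;

public class TailSupervisor {

    // Runs the command once, hands every output line to the handler,
    // and returns the exit code once the child process has ended.
    static int runOnce(List<String> command, Consumer<String> handler)
            throws IOException, InterruptedException {
        Process process = new ProcessBuilder(command)
                .redirectErrorStream(true) // merge stderr into stdout
                .start();
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(process.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = in.readLine()) != null) {
                try {
                    handler.accept(line);
                } catch (RuntimeException e) {
                    // a failing handler must not end the read loop
                    System.err.println("handler failed: " + e);
                }
            }
        }
        return process.waitFor();
    }

    // Runs the command, then restarts it each time it exits, up to maxRestarts
    // times, sleeping delayMillis between attempts. Returns the number of runs.
    static int supervise(List<String> command, Consumer<String> handler,
                         int maxRestarts, long delayMillis)
            throws IOException, InterruptedException {
        int runs = 0;
        while (runs <= maxRestarts) {
            int exitCode = runOnce(command, handler);
            runs++;
            System.err.println("child exited with code " + exitCode);
            if (runs <= maxRestarts) {
                Thread.sleep(delayMillis);
            }
        }
        return runs;
    }

    public static void main(String[] args) throws Exception {
        // same tail command as in the article; 5 restarts and 1 s delay are arbitrary
        supervise(List.of("tail", "-f", "/tmp/tail0"), System.out::println, 5, 1000L);
    }
}
```

Keep in mind that a freshly started tail prints the last lines of the file again (10 by default), so the downstream consumer may see duplicates after a restart.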

Do you know the difference between tail -f and tail -F?

Before answering, recall how common Java logging frameworks handle log files.

<configuration>
  <appender name="FILE" class="ch.qos.logback.core.rolling.RollingFileAppender">
    <prudent>true</prudent>
    <rollingPolicy class="ch.qos.logback.core.rolling.TimeBasedRollingPolicy">
      <fileNamePattern>logFile.%d{yyyy-MM-dd}.log</fileNamePattern>
      <maxHistory>30</maxHistory>
      <totalSizeCap>3GB</totalSizeCap>
    </rollingPolicy>
    <encoder>
      <pattern>%-4relative [%thread] %-5level %logger{35} - %msg%n</pattern>
    </encoder>
  </appender>
  <root level="DEBUG">
    <appender-ref ref="FILE" />
  </root>
</configuration>

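This configuration rolls over to a new file once a day. Note that it sets no file property, so logback writes straight to the dated file. When an appender does set file (for example run.log), the active file is renamed to the dated name at rollover and a fresh one is created.

That rename-style roll can be simulated with: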

mv run.log run.2020-11-02.log
touch run.log

Test

Will tail still follow after the file rolls?

Step 1: create the file to monitor

touch /tmp/tail0

Step 2: start the Java program

Step 3: generate a continuous stream

watch -n 1 'echo `date` >> /tmp/tail0 '

The command appends the current date every second; the Java side receives the data.

Step 4: simulate file rotation

mv /tmp/tail0 /tmp/tail.bak
touch /tmp/tail0

After this, the Java program stops receiving data, even though watch keeps appending to the new /tmp/tail0.

Why?

Check the tail process status:

ps -ef|grep tail
 501 21374 21373   0 1:51PM ??        0:00.01 tail -f /tmp/tail0

Inspect the files associated with the process:

lsof -p 21374 | awk '{print $4 "\t"  $9}'
FD NAME
cwd /tmp/
txt /usr/bin/tail
txt /usr/lib/dyld
3r /private/tmp/tail.bak

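The tail process is actually watching tail.bak, not the new file. Renaming a file does not change the file itself, so the descriptor that tail opened at startup simply follows it to its new name.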

Writing to tail.bak makes the Java process react again:

echo "haha: xjjdog, i am from tail.bak" >> /tmp/tail.bak
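
The same effect can be reproduced without tail. The sketch below is an illustration only, assuming a POSIX file system (Linux or macOS) and Java 11 or later. The class RenameDemo, the method readAfterRotation, and the temporary directory are hypothetical names introduced for the demo. It opens a file, rotates it the way Step 4 does, writes one line to each file, and then reads from the handle that was opened before the rotation.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class RenameDemo {

    // Opens 'live', rotates it to 'rotated', appends one line to each file,
    // and returns what the handle opened before the rotation reads next.
    static String readAfterRotation(Path live, Path rotated) throws IOException {
        try (BufferedReader reader = Files.newBufferedReader(live, StandardCharsets.UTF_8)) {
            Files.move(live, rotated); // like: mv /tmp/tail0 /tmp/tail.bak
            Files.createFile(live);    // like: touch /tmp/tail0
            Files.writeString(live, "to new file\n", StandardOpenOption.APPEND);
            Files.writeString(rotated, "to renamed file\n", StandardOpenOption.APPEND);
            return reader.readLine();
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("rotate-demo");
        Path live = Files.createFile(dir.resolve("tail0"));
        System.out.println(readAfterRotation(live, dir.resolve("tail.bak")));
    }
}
```

On Linux or macOS this prints "to renamed file": the already opened handle stays attached to the renamed file and never sees the line written to the new one, which matches what tail -f did in the test.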

Solution

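Replace tail -f with tail -F. With -f, tail follows the file descriptor it opened at startup, so it stays attached to the renamed file. With -F, tail follows the file name and keeps retrying while the file is missing, so it reopens the new file after rotation. In GNU tail, -F is shorthand for --follow=name --retry.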

Thus, change "-f" to "-F" in the ProcessBuilder arguments of the Java code.
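
To make the idea of following by name more concrete, here is a minimal sketch of how a follower could notice that a name now points to a different file. It is not a description of how tail is implemented. The class RotationCheck and its methods are hypothetical, and the check relies on BasicFileAttributes.fileKey(), which is backed by device and inode on Linux and macOS but may be null on other platforms, in which case this check cannot detect anything.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.Objects;

public class RotationCheck {

    // Returns the file system's identity for whatever the name points to now,
    // or null when the platform does not provide one.
    static Object identityOf(Path path) throws IOException {
        return Files.readAttributes(path, BasicFileAttributes.class).fileKey();
    }

    // True when the name no longer refers to the file that was opened earlier:
    // either the name is gone, or it now points to a different file.
    static boolean nameWasReplaced(Object openedIdentity, Path path) throws IOException {
        try {
            return !Objects.equals(openedIdentity, identityOf(path));
        } catch (NoSuchFileException e) {
            return true; // renamed away and not recreated yet
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("rotation-check");
        Path log = Files.createFile(dir.resolve("tail0"));
        Object opened = identityOf(log);
        System.out.println(nameWasReplaced(opened, log)); // before rotation
        Files.move(log, dir.resolve("tail.bak"));
        Files.createFile(log);
        System.out.println(nameWasReplaced(opened, log)); // after rotation
    }
}
```

On Linux or macOS the two checks print false and then true. A follower that sees true would close its old handle and reopen the path, which is the behaviour that -F provides out of the box.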

End

Understanding these nuances helps explain many mysterious issues in daily work. For example, deleting a file with rm while a process still holds it open only removes the name: the data stays on disk, and the space is not freed, until the process closes the file or exits. To empty such a log safely, truncate it in place by redirecting /dev/null into it:

cat /dev/null > logpath

This empties the file instead of removing it, so the process that writes to it keeps a valid handle and the disk space is released right away.
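
If the cleanup has to happen from Java rather than from the shell, the same in-place truncation can be sketched as follows. This is only an illustration: the class TruncateLog, the method truncateInPlace, and the temporary file standing in for logpath are hypothetical.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class TruncateLog {

    // Empties the file in place, the Java counterpart of `cat /dev/null > logpath`.
    // The file is never deleted, so processes that hold it open keep a valid handle.
    static void truncateInPlace(Path log) throws IOException {
        try (FileChannel channel = FileChannel.open(log, StandardOpenOption.WRITE)) {
            channel.truncate(0);
        }
    }

    public static void main(String[] args) throws IOException {
        Path log = Files.createTempFile("demo", ".log"); // stand-in for logpath
        Files.writeString(log, "old log data\n");
        truncateInPlace(log);
        System.out.println(Files.size(log)); // prints 0
    }
}
```

One caveat applies to both the shell and the Java variant: truncation behaves cleanly when the writing process opened the log in append mode. A writer that keeps its own file offset will continue at the old position and leave a sparse gap at the start of the file.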
