Using strace for Linux Troubleshooting, Debugging, and Performance Analysis

This article explains what strace is, how it works via ptrace, and demonstrates its practical use in diagnosing service startup failures, process exits, shared‑memory errors, and performance bottlenecks through detailed examples and common command‑line options.

Qunar Tech Salon
According to its official description, strace is a Linux user‑space tracer used for diagnosis, debugging, and teaching; it monitors interactions between user processes and the kernel such as system calls, signals, and process state changes by leveraging the kernel's ptrace feature.

In daily operations, fault handling and diagnosis are essential skills, and strace helps locate process and service problems by opening up the "black box" of system calls. In one example, a some_server binary fails to start because it cannot open its log file. strace shows an open call returning -1 ENOENT, which means the log directory is missing; once the directory is created, the service starts.
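The failure mode in that example can be reproduced in a few lines. The following is a minimal Python sketch, not the original some_server code: the directory and file names are hypothetical, and a temporary directory stands in for the real log location. Running it under strace should show the failing call with ENOENT; on current glibc the call usually appears as openat rather than open.

```python
import errno
import os
import tempfile


def try_open_log(log_path):
    """Open log_path for appending; return None on success or the errno on failure."""
    try:
        fd = os.open(log_path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
    except OSError as exc:
        return exc.errno
    os.close(fd)
    return None


def demo():
    """Try the open before and after creating the (hypothetical) log directory."""
    with tempfile.TemporaryDirectory() as base:
        log_dir = os.path.join(base, "log")             # assumed directory name
        log_path = os.path.join(log_dir, "server.log")  # assumed file name

        before = try_open_log(log_path)  # parent directory does not exist yet
        os.makedirs(log_dir)
        after = try_open_log(log_path)   # same call once the directory exists
        return before, after


if __name__ == "__main__":
    before, after = demo()
    print("before mkdir:", errno.errorcode.get(before, before))
    print("after mkdir:", "ok" if after is None else errno.errorcode.get(after, after))
```

O_CREAT creates the file itself but never the parent directory, which is why the first call fails even though the program asked for the file to be created.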

The article also reviews basic system-call concepts, describing how user-space programs request privileged services from the kernel through system calls such as open, fork, execve, and shmget. It lists the main categories of Linux system calls: file and device access, process management, signals, memory management, IPC, networking, and others.
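To make the process-management category concrete, here is a small Python sketch that uses the thin os wrappers around fork, execve, and wait. The helper name run_child is an assumption made for illustration. When traced with strace -f, the fork step may show up as clone rather than fork, depending on the C library.

```python
import os
import sys


def run_child(exit_code):
    """Fork a child, replace it with a new program via exec, return its exit status."""
    pid = os.fork()
    if pid == 0:
        # Child: replace the process image with a fresh interpreter.
        try:
            os.execv(
                sys.executable,
                [sys.executable, "-c", f"import sys; sys.exit({exit_code})"],
            )
        finally:
            # Only reached if exec itself failed.
            os._exit(127)
    # Parent: block until the child terminates, then decode its status.
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status) if os.WIFEXITED(status) else None


if __name__ == "__main__":
    print("child exit status:", run_child(7))
```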

strace can be invoked in two modes: prefix it to a command to trace a new process (e.g., strace ls -lh /var/log/messages), or attach to a running process with -p (e.g., strace -p 17553). The tool offers many options; a typical comprehensive command is: strace -tt -T -v -f -e trace=file -o /data/log/strace.log -s 1024 -p 23489. Here -tt prefixes each line with a microsecond-resolution timestamp, -T shows the time spent in each call, -v prints unabbreviated output, -f follows child processes, -e trace=file restricts tracing to file-related calls, -o writes the trace to a log file, -s 1024 raises the maximum printed string length to 1024 bytes, and -p attaches to PID 23489.
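When the same option set is reused across incidents, it can help to assemble it in one place. The sketch below builds the argument list for the attach-mode command quoted above; build_strace_command and its parameter names are hypothetical, and only flags already named in this article are used.

```python
import shlex


def build_strace_command(pid, log_path, trace="file", max_string=1024):
    """Assemble the strace argv for attaching to a running PID."""
    return [
        "strace",
        "-tt",                   # timestamp with microseconds on every line
        "-T",                    # time spent inside each system call
        "-v",                    # unabbreviated output
        "-f",                    # follow child processes
        "-e", f"trace={trace}",  # restrict tracing to one class of calls
        "-o", str(log_path),     # write the trace to a file
        "-s", str(max_string),   # maximum string length to print
        "-p", str(pid),          # attach to an existing process
    ]


if __name__ == "__main__":
    print(shlex.join(build_strace_command(23489, "/data/log/strace.log")))
```

Keeping the command as a list rather than a single string means it can be passed directly to subprocess.run without shell quoting concerns.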

Practical cases include tracing nginx startup to see which files it accesses, diagnosing a process that was killed by SIGKILL, and observing that a program that exits normally ends with an exit_group system call followed by an "exited with X" line.
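The two termination paths can be told apart from the parent side as well. This Python sketch, with hypothetical helper names, starts one child that exits on its own and one that is killed; under strace the first ends with "+++ exited with 3 +++" and the second with "+++ killed by SIGKILL +++".

```python
import signal
import subprocess
import sys


def exit_normally(code):
    """Run a child that calls exit(code) itself and return its exit status."""
    proc = subprocess.run([sys.executable, "-c", f"import sys; sys.exit({code})"])
    return proc.returncode


def killed_by_sigkill():
    """Run a sleeping child, kill it with SIGKILL, and return its return code."""
    proc = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
    proc.send_signal(signal.SIGKILL)
    proc.wait()
    # subprocess reports death by signal N as the negative value -N.
    return proc.returncode


if __name__ == "__main__":
    print("normal exit:", exit_normally(3))
    print("killed:", killed_by_sigkill())
```

A process killed by SIGKILL never reaches exit_group, so the absence of that call in a trace is itself a clue that the exit was forced from outside.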

Another case examines a shared-memory error in which shmget fails with EINVAL because the requested size does not match the segment that already exists for that key; ipcs confirms the mismatch, which was caused by mixing 32-bit and 64-bit builds.
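The error can be reproduced without two differently built binaries. The sketch below calls shmget through ctypes on Linux: it creates a segment of one size and then asks for the same key with a larger size, which the shmget(2) manual page documents as an EINVAL condition. The key and the sizes 4096 and 8192 are arbitrary stand-ins chosen for illustration, not values from the original case.

```python
import ctypes
import errno
import os

# Constants from <sys/ipc.h> on Linux.
IPC_CREAT = 0o1000
IPC_EXCL = 0o2000
IPC_RMID = 0

libc = ctypes.CDLL(None, use_errno=True)
libc.shmget.argtypes = [ctypes.c_int, ctypes.c_size_t, ctypes.c_int]
libc.shmget.restype = ctypes.c_int
libc.shmctl.argtypes = [ctypes.c_int, ctypes.c_int, ctypes.c_void_p]
libc.shmctl.restype = ctypes.c_int


def shmget(key, size, flags):
    """Call shmget(2); return (shmid, 0) on success or (-1, errno) on failure."""
    ctypes.set_errno(0)
    shmid = libc.shmget(key, size, flags)
    return shmid, (ctypes.get_errno() if shmid == -1 else 0)


def demo_size_mismatch():
    """Create a 4096-byte segment, then request 8192 bytes for the same key."""
    key = 0x53540000 | (os.getpid() & 0xFFFF)  # arbitrary demo key (assumption)
    shmid, err = shmget(key, 4096, IPC_CREAT | IPC_EXCL | 0o600)
    if shmid == -1:
        raise OSError(err, os.strerror(err))
    try:
        _, err = shmget(key, 8192, 0o600)  # larger than the existing segment
        return err
    finally:
        libc.shmctl(shmid, IPC_RMID, None)  # always remove the demo segment


if __name__ == "__main__":
    err = demo_size_mismatch()
    print("second shmget failed with:", errno.errorcode.get(err, err))
```

While the first segment exists, ipcs -m lists it with its size in bytes, which is the same check the article uses to confirm the mismatch.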

The article also compares the performance of two shell scripts that count lines of source code. strace -c reveals that the inefficient script spawns over 100,000 processes, while the optimized version completes in seconds, which highlights the cost of process creation.
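The original shell scripts are not reproduced here, so the following Python sketch only illustrates the same contrast under stated assumptions: one function starts a child process per file, the other counts lines in the current process. All function names and the generated sample files are hypothetical.

```python
import os
import subprocess
import sys
import tempfile
import time

# One-liner run in each child: print the number of lines in the given file.
COUNT_ONE = "import sys; print(sum(1 for _ in open(sys.argv[1])))"


def count_by_spawning(paths):
    """One child process per file. Returns (total_lines, processes_spawned)."""
    total = 0
    spawned = 0
    for path in paths:
        result = subprocess.run(
            [sys.executable, "-c", COUNT_ONE, path],
            capture_output=True, text=True, check=True,
        )
        total += int(result.stdout)
        spawned += 1
    return total, spawned


def count_in_process(paths):
    """Read every file in this process. Returns (total_lines, 0)."""
    total = 0
    for path in paths:
        with open(path, "rb") as fh:
            total += sum(1 for _ in fh)
    return total, 0


def make_sample_files(directory, n_files=10, lines_per_file=5):
    """Create small sample source files and return their paths."""
    paths = []
    for i in range(n_files):
        path = os.path.join(directory, f"file{i}.c")
        with open(path, "w") as fh:
            fh.write("int x;\n" * lines_per_file)
        paths.append(path)
    return paths


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        files = make_sample_files(tmp, n_files=50)
        for fn in (count_by_spawning, count_in_process):
            start = time.perf_counter()
            total, spawned = fn(files)
            elapsed = time.perf_counter() - start
            print(f"{fn.__name__}: {total} lines, {spawned} processes, {elapsed:.3f}s")
```

Both functions return the same line count; the difference lies in how many fork and exec pairs the kernel has to service, which is what the strace -c summary makes visible.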

In summary, strace is a powerful tool for revealing what a Linux process is doing, and it speeds up fault isolation. However, when a process is stuck in user space and makes no system calls, strace has nothing to show, and other tools such as gdb, perf, or SystemTap are needed.

Tags: Linux, Troubleshooting, performance analysis, system calls, strace
Written by Qunar Tech Salon

Qunar Tech Salon is a learning and exchange platform for Qunar engineers and industry peers. We share cutting-edge technology trends and topics, providing a free platform for mid-to-senior technical professionals to exchange and learn.
