How Does strace Peek Inside Other Processes? A Deep Dive into ptrace
This article explains the inner workings of the classic strace command by walking through a hand‑crafted C program that uses ptrace to attach to a target process, set syscall tracing, wait for signals, read the ORIG_RAX register, and translate syscall numbers into readable names, while also discussing the performance impact of such tracing.
Why strace works despite process isolation
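strace observes the system calls a process makes by attaching to the target PID and printing each syscall as it occurs. Ordinary processes cannot look into one another, so strace relies on the ptrace system call, through which the kernel itself lets a permitted tracer stop another process and inspect its state.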
Hand‑crafting a minimal strace
A small C program (full source at https://github.com/yanfeizhang/coder-kung-fu/blob/main/tests/cpu/test11/main.c) implements the core logic of strace:
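#include <stdio.h>
#include <stdlib.h>
#include <sys/ptrace.h>
#include <sys/reg.h>    // ORIG_RAX
#include <sys/wait.h>

const char *decode(long syscall_number);  // number-to-name lookup, defined in the full source

int main(int argc, char *argv[]) {
    if (argc < 2) return 1;  // usage: ./main <pid>
    pid_t pid = atoi(argv[1]);
    int status;
    // 1. attach to target pid, then wait for the stop that the attach triggers
    ptrace(PTRACE_ATTACH, pid, NULL, NULL);
    waitpid(pid, &status, 0);
    while (1) {
        // 2. resume the target and request a stop at the next syscall
        ptrace(PTRACE_SYSCALL, pid, NULL, NULL);
        // 3. wait for the target to stop on a syscall
        waitpid(pid, &status, 0);
        if (WIFEXITED(status) || WIFSIGNALED(status)) break;  // target is gone
        // 4. read and decode the syscall number
        long syscall_number = ptrace(PTRACE_PEEKUSER, pid, 8 * ORIG_RAX, NULL);
        const char *syscall_name = decode(syscall_number);
        printf("Syscall: %s (number: %ld)\n", syscall_name, syscall_number);
    }
    return 0;
}
The program follows three logical steps: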
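Attach: ptrace(PTRACE_ATTACH, …) creates a tracing relationship. This requires permission to trace the target: in practice root (CAP_SYS_PTRACE), or the same user where the system's ptrace restrictions (such as Yama's ptrace_scope) allow it.
Register as syscall debugger: ptrace(PTRACE_SYSCALL, …) resumes the target and tells the kernel to stop it and notify the tracer at the next syscall entry or exit.
Read the syscall: On x86_64 the kernel saves the number of the current syscall in the orig_rax slot of the target's saved registers. The tracer reads it with PTRACE_PEEKUSER at offset 8*ORIG_RAX, translates the number using /usr/include/x86_64-linux-gnu/asm/unistd_64.h, and prints the name.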
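The number-to-name translation in the last step can be as simple as a table lookup. The following is a minimal sketch, not the code from the linked repository: the helper name decode_syscall and the four-entry table are assumptions made for illustration, and the numbers come from the SYS_* constants in <sys/syscall.h> rather than being typed in by hand. A real tool would generate the full table from unistd_64.h.

```c
#include <stddef.h>
#include <sys/syscall.h>   /* SYS_read, SYS_write, ... for the build architecture */

/* Hypothetical helper type: one row per syscall we know how to name. */
struct syscall_entry {
    long number;
    const char *name;
};

/* Deliberately tiny table; a complete one would cover every entry in unistd_64.h. */
static const struct syscall_entry syscall_table[] = {
    { SYS_read,   "read"   },
    { SYS_write,  "write"  },
    { SYS_close,  "close"  },
    { SYS_getpid, "getpid" },
};

/* Return the name for a syscall number, or "unknown" if it is not in the table. */
static const char *decode_syscall(long number) {
    size_t count = sizeof(syscall_table) / sizeof(syscall_table[0]);
    for (size_t i = 0; i < count; i++) {
        if (syscall_table[i].number == number)
            return syscall_table[i].name;
    }
    return "unknown";
}
```

With this table, decode_syscall(SYS_write) yields "write", and any number that is not listed yields "unknown".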
Attaching to the target process
The ptrace system call (kernel/ptrace.c) looks up the target’s task_struct via find_get_task_by_vpid(pid) and hands it to ptrace_attach, which performs permission checks and links the tracer to the target with ptrace_link. ptrace_link inserts the target into the tracer’s ptraced list and sets the tracer as the target’s parent, so the tracer can collect the target’s stop notifications with waitpid.
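From user space, the outcome of these permission checks shows up only as the return value of ptrace and the value of errno. A hedged sketch of an attach helper that reports this follows; the name attach_and_wait is hypothetical and does not appear in the original program.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>

/*
 * Hypothetical helper: attach to pid and wait until the target has stopped.
 * Returns 0 on success and -1 on failure, with errno set by the failing call.
 */
static int attach_and_wait(pid_t pid) {
    if (ptrace(PTRACE_ATTACH, pid, NULL, NULL) == -1) {
        /* Typical causes: EPERM (not allowed to trace) or ESRCH (no such process). */
        int saved = errno;
        fprintf(stderr, "PTRACE_ATTACH %d: %s\n", (int)pid, strerror(saved));
        errno = saved;
        return -1;
    }
    int status;
    /* The attach sends SIGSTOP to the target; waitpid reports the resulting stop. */
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    return WIFSTOPPED(status) ? 0 : -1;
}
```

Checking the result before entering the trace loop avoids a loop that spins on a target it never managed to attach to.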
// kernel/ptrace.c (excerpt)
SYSCALL_DEFINE4(ptrace, long request, long pid, unsigned long addr, ...){
    struct task_struct *child;
    child = find_get_task_by_vpid(pid);
    // permission checks omitted
    if (request == PTRACE_ATTACH || request == PTRACE_SEIZE)
        ret = ptrace_attach(child, request, addr, data);
    ...
}
static int ptrace_attach(struct task_struct *task, long request, ...){
    // permission checks omitted
    ptrace_link(task, current);
    ...
}
void __ptrace_link(struct task_struct *child, struct task_struct *new_parent, ...){
    list_add(&child->ptrace_entry, &new_parent->ptraced);
    child->parent = new_parent;
    ...
}
Capturing the target’s syscall
Setting the wait condition
After attaching, the tracer repeatedly calls ptrace(PTRACE_SYSCALL, …) and then blocks in waitpid(pid, &status, 0). Inside the kernel the request reaches ptrace_resume, which sets the SYSCALL_TRACE flag on the target via set_task_syscall_work. With the flag set, the target stops and reports SIGTRAP to the tracer at each syscall entry and each syscall exit.
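Because PTRACE_SYSCALL stops the target both when a syscall starts and when it returns, a loop that prints on every stop reports each call twice. One common way to handle this, sketched below under the assumption that the first stop seen is an entry and that no other kinds of stop are interleaved, is to keep a per-target flag that flips on each syscall stop. The struct and function names are made up for this example.

```c
#include <stdbool.h>

/* Hypothetical per-target state: are we currently between entry and exit? */
struct trace_state {
    bool in_syscall;
};

/*
 * Call once per syscall stop. Returns true when the stop is a syscall entry
 * and false when it is the matching exit, flipping the flag each time.
 */
static bool on_syscall_stop(struct trace_state *state) {
    state->in_syscall = !state->in_syscall;
    return state->in_syscall;
}
```

Printing only when on_syscall_stop returns true gives one line per syscall instead of two.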
// kernel/ptrace.c (excerpt)
static int ptrace_resume(struct task_struct *child, long request, unsigned long data){
    if (request == PTRACE_SYSCALL)
        set_task_syscall_work(child, SYSCALL_TRACE);
    ...
}
#define set_task_syscall_work(t, fl) \
    set_bit(SYSCALL_WORK_BIT_##fl, &task_thread_info(t)->syscall_work)
Waking the tracer
When the target hits a syscall, ptrace_stop marks it as TASK_TRACED, records the exit code, notifies the parent with do_notify_parent_cldstop, and calls schedule() to suspend the target.
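On the tracer's side, this notification arrives as the status word filled in by waitpid. The sketch below shows one way to classify it; the enum and function names are illustrative and are not taken from the original program. It treats any stop whose signal is SIGTRAP as a syscall stop, which matches the simple loop in this article but can misclassify a genuine SIGTRAP. The PTRACE_O_TRACESYSGOOD option exists to remove that ambiguity by reporting syscall stops as SIGTRAP|0x80.

```c
#include <signal.h>
#include <sys/wait.h>

enum stop_kind { TARGET_EXITED, TARGET_KILLED, SYSCALL_STOP, OTHER_STOP };

/* Classify a status word returned by waitpid for a traced process. */
static enum stop_kind classify_status(int status) {
    if (WIFEXITED(status))
        return TARGET_EXITED;   /* target called exit */
    if (WIFSIGNALED(status))
        return TARGET_KILLED;   /* target was terminated by a signal */
    if (WIFSTOPPED(status)) {
        int sig = WSTOPSIG(status);
        /* Plain SIGTRAP, or SIGTRAP|0x80 when PTRACE_O_TRACESYSGOOD is set. */
        if (sig == SIGTRAP || sig == (SIGTRAP | 0x80))
            return SYSCALL_STOP;
    }
    return OTHER_STOP;          /* e.g. the SIGSTOP caused by the attach */
}
```

A trace loop can then read the syscall number only for SYSCALL_STOP and leave the loop on TARGET_EXITED or TARGET_KILLED.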
// kernel/signal.c (excerpt)
static int ptrace_stop(int exit_code, int why, unsigned long message, kernel_siginfo_t *info){
    set_special_state(TASK_TRACED);
    current->ptrace_message = message;
    current->last_siginfo = info;
    current->exit_code = exit_code;
    if (current->ptrace)
        do_notify_parent_cldstop(current, true, why);
    schedule();
    ...
}
Reading the syscall number
The tracer reads the saved orig_rax value using PTRACE_PEEKUSER (address 8*ORIG_RAX). The retrieved number is matched against a table (e.g., on x86_64, 0 → read, 1 → write, 2 → open, 3 → close) to obtain the syscall name.
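The offset 8*ORIG_RAX is the byte position of the orig_rax field within struct user_regs_struct from <sys/user.h>. The sketch below, which assumes x86_64 Linux with glibc and uses hypothetical helper names, derives the same offset with offsetof. It also shows the errno check that PTRACE_PEEKUSER needs, since a return value of -1 can be either an error or a legitimately peeked value.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/reg.h>     /* ORIG_RAX (x86_64 only) */
#include <sys/types.h>
#include <sys/user.h>    /* struct user_regs_struct */

/* Byte offset of orig_rax in the user area; equals 8 * ORIG_RAX on x86_64. */
static size_t orig_rax_offset(void) {
    return offsetof(struct user_regs_struct, orig_rax);
}

/*
 * Hypothetical helper: read the saved syscall number of a stopped tracee.
 * Returns 0 and stores the number in *out on success, -1 on failure.
 */
static int peek_syscall_number(pid_t pid, long *out) {
    errno = 0;  /* must be cleared first: -1 is also a valid peeked value */
    long value = ptrace(PTRACE_PEEKUSER, pid, (void *)orig_rax_offset(), NULL);
    if (value == -1 && errno != 0)
        return -1;
    *out = value;
    return 0;
}
```

Using offsetof keeps the code tied to the structure definition rather than to a hand-computed multiple of eight.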
// arch/x86/kernel/ptrace.c (excerpt)
case PTRACE_PEEKUSR: {
    if (addr < sizeof(struct user_regs_struct))
        tmp = getreg(child, addr);
    ret = put_user(tmp, datap);
    break;
}
Putting it all together
Tracer calls ptrace(PTRACE_ATTACH,…) to link to the target.
Tracer calls ptrace(PTRACE_SYSCALL,…) to enable syscall tracing.
Tracer calls waitpid and blocks until the target stops with SIGTRAP at a syscall entry or exit.
Tracer reads ORIG_RAX, translates the number, and prints the syscall.
The loop repeats, allowing the target to continue execution after each syscall.
Because each syscall forces the target to stop, strace introduces noticeable context switches and can increase the target’s runtime. It is therefore best suited for short‑term debugging rather than continuous production monitoring.
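One rough way to see this cost is to time a loop of cheap syscalls, then attach strace (or the minimal tracer) to the process and time the loop again. The sketch below is an illustration only: the function name and the choice of getppid as the cheap syscall are assumptions, and the absolute numbers will vary by machine and kernel.

```c
#define _GNU_SOURCE
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

/*
 * Hypothetical micro-benchmark: issue `count` getppid syscalls and return the
 * elapsed wall-clock time in nanoseconds. syscall() is used so that every
 * iteration really enters the kernel.
 */
static long long time_syscalls(long count) {
    struct timespec start, end;
    clock_gettime(CLOCK_MONOTONIC, &start);
    for (long i = 0; i < count; i++)
        syscall(SYS_getppid);
    clock_gettime(CLOCK_MONOTONIC, &end);
    return (end.tv_sec - start.tv_sec) * 1000000000LL
         + (end.tv_nsec - start.tv_nsec);
}
```

Calling time_syscalls with the same count once on its own and once while a tracer is attached gives a direct comparison of the per-syscall cost.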
Conclusion
strace works by using ptrace to attach to a process, set a syscall‑trace flag, wait for SIGTRAP signals, read the ORIG_RAX register, and map numbers to names. The overhead of stopping the traced process on every syscall makes it unsuitable for long‑running online monitoring.
Liangxu Linux
Liangxu, a self‑taught IT professional now working as a Linux development engineer at a Fortune 500 multinational, shares extensive Linux knowledge—fundamentals, applications, tools, plus Git, databases, Raspberry Pi, etc. (Reply “Linux” to receive essential resources.)
