Understanding Linux Kernel Tracing Mechanisms: Probes, Tracepoints, Kprobes, Uprobes, and Ftrace

This article provides a comprehensive overview of Linux kernel tracing mechanisms—including probes, tracepoints, kprobes, uprobes, and ftrace—explaining their principles, implementation details, practical tools, and real‑world case studies for performance optimization, fault diagnosis, and security monitoring.

Deepin Linux

Jan 18, 2025

1. Introduction to Tracing Mechanisms

The kernel controls every part of a Linux system, and kernel tracing is a key tool for understanding how it behaves. Like a map in a complex maze, tracing lets developers see internal execution paths, locate performance bottlenecks, diagnose faults, and monitor security risks.

With the rapid growth of internet applications, Linux is increasingly used in servers, embedded devices, and cloud computing. As application complexity rises, requirements for performance, stability, and security become stricter. Kernel tracing helps meet these challenges by providing insight for optimization and maintenance.

2. Overview of Core Concepts

2.1 Probes

A probe is the "scout" of the tracing system: a hook placed at a chosen point in the code that captures runtime information when execution reaches it. Probes are either static (fixed at compile time) or dynamic (inserted at runtime). For example, a kprobe is a dynamic probe that can attach to almost any kernel instruction, apart from a small set of blacklisted functions, to retrieve arguments, register state, and other context.

2.2 Tracepoints

Tracepoints are predefined static hooks in the kernel source, placed on critical code paths such as system-call entry/exit or scheduler events. When execution reaches an enabled tracepoint, the attached handler records the event. A disabled tracepoint costs close to nothing, so the hooks can stay compiled into production kernels.

2.3 Events

Events are abstract descriptions of specific behaviors or state changes (e.g., hardware interrupts, function calls, memory allocation). Each event carries rich information such as timestamps, PID, and parameters, enabling detailed system‑state analysis.

3. Significance of Tracing

Tracing is indispensable for performance optimization, fault diagnosis, and security monitoring. By collecting fine‑grained events, developers can pinpoint CPU hotspots, detect memory leaks, reconstruct crash scenarios, and identify unauthorized system calls.

4. Main Tracing Technologies and Principles

4.1 Kprobes

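Kprobes allow dynamic insertion of probe handlers without modifying kernel source. Two types exist: kprobe, which can be placed on almost any kernel instruction, and kretprobe, which fires when a function returns. When a probe is registered, the kernel saves a copy of the original instruction and replaces its first byte with a breakpoint (int3 on x86). Hitting the breakpoint traps into the kprobes code, which runs the pre-handler. The saved instruction copy is then executed out of line; classically this is done by setting the trap flag (TF) to single-step, which raises a debug exception (vector 1) that invokes the post-handler before normal flow resumes. Where possible, the kernel optimizes the breakpoint into a jump to reduce overhead. Functions on the kprobes blacklist, such as the kprobes code itself, cannot be probed.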
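The save-and-patch step can be modeled in a few lines of user-space C. This is only a toy sketch over a plain byte buffer, not kernel code: the names toy_probe, toy_probe_arm, and toy_probe_disarm are hypothetical, and the real kprobes implementation also has to patch live kernel text safely across CPUs, handle the trap, and execute the saved instruction out of line.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define X86_INT3 0xCC  /* one-byte x86 breakpoint opcode */

/* Toy stand-in for a registered probe; NOT the kernel's struct kprobe. */
struct toy_probe {
    uint8_t *addr;   /* probed location inside a byte buffer */
    uint8_t  saved;  /* original first byte, kept for restore */
    int      armed;  /* nonzero while the breakpoint is in place */
};

/* Arm: remember the original byte, then overwrite it with int3. */
static void toy_probe_arm(struct toy_probe *p, uint8_t *text, size_t off)
{
    p->addr = text + off;
    p->saved = *p->addr;
    *p->addr = X86_INT3;
    p->armed = 1;
}

/* Disarm: put the original byte back so the code is unmodified again. */
static void toy_probe_disarm(struct toy_probe *p)
{
    assert(p->armed);
    *p->addr = p->saved;
    p->armed = 0;
}
```

Arming a probe at offset 0 of a buffer that starts with 0x55 (push %rbp) leaves 0xCC in the buffer and 0x55 in the saved field; disarming restores the buffer.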

4.2 Uprobes

Uprobes apply the same principle to user-space binaries. A uprobe is identified by the target file's inode and the file offset of the probed instruction, and that offset is calculated from the ELF symbol and section tables. The session below compiles a simple program whose test function will be probed:

root@zfane-maxpower:~/traceing# cat hello.c
#include <stdio.h>
void test(){
    printf("hello world");
}
int main() {
    test();
    return 0;
}
root@zfane-maxpower:~/traceing# gcc hello.c -o hello

Using readelf to obtain the symbol address (readelf -s) and the .text section's address and file offset (readelf -S), the offset is computed as:

offset = test_virtual_addr - .text_virtual_addr + .text_file_offset
       = 0x1149 - 0x1060 + 0x1060
       = 0x1149
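As a sanity check, the same arithmetic can be written as a small C helper. This is a sketch: uprobe_file_offset is a hypothetical name, and the inputs are assumed to be taken from readelf output for the symbol and its containing section.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Convert a symbol's virtual address into a file offset, given the
 * virtual address and file offset of the section that contains it.
 */
static uint64_t uprobe_file_offset(uint64_t sym_vaddr,
                                   uint64_t sec_vaddr,
                                   uint64_t sec_file_off)
{
    /* The symbol must lie at or after the start of its section. */
    assert(sym_vaddr >= sec_vaddr);
    return sym_vaddr - sec_vaddr + sec_file_off;
}
```

With the values above, uprobe_file_offset(0x1149, 0x1060, 0x1060) yields 0x1149. The address and the file offset coincide here only because .text is mapped at a virtual address equal to its file offset; for binaries where the two differ, the subtraction matters.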

A kernel module can then register an uprobe at that offset, as illustrated:

#include <linux/kernel.h>
#include <linux/init.h>
#include <linux/module.h>
#include <linux/fs.h>
#include <linux/uprobes.h>
#include <linux/namei.h>
#include <linux/string.h>
#include <linux/uaccess.h>

#define DEBUGGEE_FILE "/home/zfane/hello/hello"
#define DEBUGGEE_FILE_OFFSET (0x1149)
static struct inode *debuggee_inode;

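/* Runs when the probed instruction is hit. */
static int uprobe_sample_handler(struct uprobe_consumer *con,
                                 struct pt_regs *regs)
{
    /* regs->di is the first argument register on x86-64. It is a raw
     * value (possibly a user-space pointer), so print it as a number
     * instead of dereferencing it with %s. */
    printk(KERN_INFO "handler is executed, arg0: 0x%lx\n", regs->di);
    return 0;
}

/* Runs when the probed function returns. */
static int uprobe_sample_ret_handler(struct uprobe_consumer *con,
                                     unsigned long func,
                                     struct pt_regs *regs)
{
    printk(KERN_INFO "ret_handler is executed\n");
    return 0;
}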

static struct uprobe_consumer uc = {
    .handler = uprobe_sample_handler,
    .ret_handler = uprobe_sample_ret_handler
};

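static int __init init_uprobe_sample(void)
{
    int ret;
    struct path path;

    /* Resolve the target binary and pin its inode. */
    ret = kern_path(DEBUGGEE_FILE, LOOKUP_FOLLOW, &path);
    if (ret)
        return ret;
    debuggee_inode = igrab(path.dentry->d_inode);
    path_put(&path);
    if (!debuggee_inode)
        return -ENOENT;

    /* Three-argument form used by older kernels; the uprobe API has
     * changed in recent releases, so check linux/uprobes.h for yours. */
    ret = uprobe_register(debuggee_inode, DEBUGGEE_FILE_OFFSET, &uc);
    if (ret < 0) {
        iput(debuggee_inode);
        return ret;
    }
    printk(KERN_INFO "insmod uprobe_sample\n");
    return 0;
}

static void __exit exit_uprobe_sample(void)
{
    uprobe_unregister(debuggee_inode, DEBUGGEE_FILE_OFFSET, &uc);
    iput(debuggee_inode);   /* drop the reference taken by igrab() */
    printk(KERN_INFO "rmmod uprobe_sample\n");
}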

module_init(init_uprobe_sample);
module_exit(exit_uprobe_sample);
MODULE_LICENSE("GPL");

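Unlike kprobes, uprobes are not subject to a blacklist of unprobeable functions, which makes them flexible for user-space tracing, performance tuning, and security monitoring. Each hit traps into the kernel, so probes on very hot code paths add measurable overhead.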

4.3 Ftrace

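Ftrace is a versatile tracing framework built on compile-time instrumentation. Building the kernel with -pg (combined with -mfentry where supported) makes the compiler emit a call to mcount (or __fentry__) at the start of every traceable function. During boot, the kernel replaces these calls with NOPs, so the instrumentation costs almost nothing while tracing is off. When tracing is enabled for a function, its NOP is patched back into a call to an ftrace trampoline, which invokes the registered callbacks. Modern kernels also provide fprobe, a thin wrapper over ftrace for attaching entry and exit handlers to functions without using the full ftrace feature set.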

Example fprobe module (the handler signatures follow the original fprobe API; later kernels changed them):

#define pr_fmt(fmt) "%s: " fmt, __func__

#include <linux/kernel.h>
#include <linux/module.h>
#include <linux/fprobe.h>
#include <linux/sched/debug.h>
#include <linux/slab.h>

#define BACKTRACE_DEPTH 16
#define MAX_SYMBOL_LEN 4096
static struct fprobe sample_probe;
static unsigned long nhit;
static char symbol[MAX_SYMBOL_LEN] = "kernel_clone";
module_param_string(symbol, symbol, sizeof(symbol), 0644);
MODULE_PARM_DESC(symbol, "Probed symbol(s), given by comma separated symbols or a wildcard pattern.");
static char nosymbol[MAX_SYMBOL_LEN] = "";
module_param_string(nosymbol, nosymbol, sizeof(nosymbol), 0644);
MODULE_PARM_DESC(nosymbol, "Not-probed symbols, given by a wildcard pattern.");
static bool stackdump = true;
module_param(stackdump, bool, 0644);
MODULE_PARM_DESC(stackdump, "Enable stackdump.");
static bool use_trace = false;
module_param(use_trace, bool, 0644);
MODULE_PARM_DESC(use_trace, "Use trace_printk instead of printk. This is only for debugging.");

static void show_backtrace(void)
{
    unsigned long stacks[BACKTRACE_DEPTH];
    unsigned int len;
    len = stack_trace_save(stacks, BACKTRACE_DEPTH, 2);
    stack_trace_print(stacks, len, 24);
}

static void sample_entry_handler(struct fprobe *fp, unsigned long ip, struct pt_regs *regs)
{
    if (use_trace)
        trace_printk("Enter <%pS> ip = 0x%p\n", (void *)ip, (void *)ip);
    else
        pr_info("Enter <%pS> ip = 0x%p\n", (void *)ip, (void *)ip);
    nhit++;
    if (stackdump)
        show_backtrace();
}

static void sample_exit_handler(struct fprobe *fp, unsigned long ip, struct pt_regs *regs)
{
    unsigned long rip = instruction_pointer(regs);
    if (use_trace)
        trace_printk("Return from <%pS> ip = 0x%p to rip = 0x%p (%pS)\n",
                     (void *)ip, (void *)ip, (void *)rip, (void *)rip);
    else
        pr_info("Return from <%pS> ip = 0x%p to rip = 0x%p (%pS)\n",
                (void *)ip, (void *)ip, (void *)rip, (void *)rip);
    nhit++;
    if (stackdump)
        show_backtrace();
}

static int __init fprobe_init(void)
{
    char *p, *symbuf = NULL;
    const char **syms;
    int ret, count, i;
    sample_probe.entry_handler = sample_entry_handler;
    sample_probe.exit_handler = sample_exit_handler;
    if (strchr(symbol, '*')) {
        ret = register_fprobe(&sample_probe, symbol,
                              nosymbol[0] == '\0' ? NULL : nosymbol);
        goto out;
    } else if (!strchr(symbol, ',')) {
        symbuf = symbol;
        ret = register_fprobe_syms(&sample_probe, (const char **)&symbuf, 1);
        goto out;
    }
    symbuf = kstrdup(symbol, GFP_KERNEL);
    if (!symbuf) return -ENOMEM;
    p = symbuf;
    count = 1;
    while ((p = strchr(++p, ',')) != NULL)
        count++;
    pr_info("%d symbols found\n", count);
    syms = kcalloc(count, sizeof(char *), GFP_KERNEL);
    if (!syms) {
        kfree(symbuf);
        return -ENOMEM;
    }
    p = symbuf;
    for (i = 0; i < count; i++)
        syms[i] = strsep(&p, ",");
    ret = register_fprobe_syms(&sample_probe, syms, count);
    kfree(syms);
    kfree(symbuf);
out:
    if (ret < 0)
        pr_err("register_fprobe failed, returned %d\n", ret);
    else
        pr_info("Planted fprobe at %s\n", symbol);
    return ret;
}

static void __exit fprobe_exit(void)
{
    unregister_fprobe(&sample_probe);
    pr_info("fprobe at %s unregistered. %ld times hit, %ld times missed\n",
            symbol, nhit, sample_probe.nmissed);
}

module_init(fprobe_init);
module_exit(fprobe_exit);
MODULE_LICENSE("GPL");

Ftrace can trace function calls, system calls, interrupts, and timers, providing detailed timing and call‑graph information useful for performance analysis and debugging.

5. Tracing Tools Overview

5.1 perf

perf is a powerful performance analysis tool that can profile CPU usage, cache misses, hardware and software events, and system-call frequencies. Commands like perf top, perf record, and perf report help identify hot functions and optimize code.
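Under the hood, perf obtains its counters through the perf_event_open system call. As a minimal sketch of that interface (not of how the perf tool itself is structured), the code below counts user-space instructions retired while a function runs. The helper names perf_open, busy_loop, and count_instructions are hypothetical, and the call can fail where hardware counters are unavailable or access is restricted (for example in some virtual machines or containers), in which case the helper returns -1.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <linux/perf_event.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* glibc provides no wrapper for perf_event_open, so use syscall(2). */
static int perf_open(struct perf_event_attr *attr)
{
    /* pid = 0, cpu = -1: measure the calling thread on any CPU.
     * group_fd = -1, flags = 0: standalone event, no special flags. */
    return (int)syscall(SYS_perf_event_open, attr, 0, -1, -1, 0UL);
}

/* Some work to measure. */
static void busy_loop(void)
{
    volatile unsigned long x = 0;
    for (unsigned long i = 0; i < 1000000UL; i++)
        x += i;
}

/* Count user-space instructions retired while work() runs.
 * Returns 0 and stores the count on success, -1 on failure. */
static int count_instructions(void (*work)(void), uint64_t *count)
{
    struct perf_event_attr attr;
    memset(&attr, 0, sizeof(attr));
    attr.type = PERF_TYPE_HARDWARE;
    attr.size = sizeof(attr);
    attr.config = PERF_COUNT_HW_INSTRUCTIONS;
    attr.disabled = 1;        /* start stopped, enable explicitly below */
    attr.exclude_kernel = 1;  /* user space only */
    attr.exclude_hv = 1;

    int fd = perf_open(&attr);
    if (fd < 0)
        return -1;

    ioctl(fd, PERF_EVENT_IOC_RESET, 0);
    ioctl(fd, PERF_EVENT_IOC_ENABLE, 0);
    work();
    ioctl(fd, PERF_EVENT_IOC_DISABLE, 0);

    ssize_t n = read(fd, count, sizeof(*count));
    close(fd);
    return n == (ssize_t)sizeof(*count) ? 0 : -1;
}
```

Calling count_instructions(busy_loop, &n) from a program gives a single raw count. The perf tool builds sampling, symbol resolution, and reporting on top of the same system call.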

5.2 ftrace Toolset

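The ftrace toolset includes tracers such as function, which records every traced function entry, and function_graph, which records both entry and exit to produce a call graph with per-function durations. Users mount tracefs (historically exposed through debugfs) and write the desired tracer name to /sys/kernel/debug/tracing/current_tracer, optionally restricting the traced functions via set_ftrace_filter. On recent kernels the same files are also available under /sys/kernel/tracing.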
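The same steps can be driven from a program. The sketch below assumes the standard tracefs file names current_tracer, set_ftrace_filter, and tracing_on; the helper names tracefs_write and start_function_graph are hypothetical, and the tracing directory is passed in as a parameter because its mount point varies between systems. Writing to the real files requires root.

```c
#include <stdio.h>
#include <string.h>

/* Write one value to <dir>/<file>. Returns 0 on success, -1 on error. */
static int tracefs_write(const char *dir, const char *file, const char *value)
{
    char path[512];
    size_t len = strlen(value);
    int n = snprintf(path, sizeof(path), "%s/%s", dir, file);

    if (n < 0 || (size_t)n >= sizeof(path))
        return -1;

    FILE *f = fopen(path, "w");
    if (!f)
        return -1;

    int ok = fwrite(value, 1, len, f) == len;
    if (fclose(f) != 0)   /* a rejected value may only surface at flush time */
        ok = 0;
    return ok ? 0 : -1;
}

/* Limit tracing to one function, select function_graph, and turn tracing on. */
static int start_function_graph(const char *dir, const char *func)
{
    if (tracefs_write(dir, "set_ftrace_filter", func) != 0)
        return -1;
    if (tracefs_write(dir, "current_tracer", "function_graph") != 0)
        return -1;
    return tracefs_write(dir, "tracing_on", "1");
}
```

For example, start_function_graph("/sys/kernel/tracing", "vfs_read") would restrict the call graph to vfs_read; the results can then be read from the trace file in the same directory.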

5.3 ply

ply leverages BPF: scripts written in a small C-like language are compiled into BPF programs and loaded into the kernel. It can attach to both kprobes and tracepoints, enabling lightweight, high-performance tracing. Typical uses include measuring vfs_read return sizes, counting processes whose calls return errors, and printing information about TCP reset packets.

6. Real‑World Case Studies

6.1 Performance Optimization

A large e-commerce platform experienced slow response times. Using perf top, the team identified a CPU-intensive order-processing function. With perf record and perf report, they pinpointed a costly algorithm inside it. The ftrace function_graph tracer then showed that the function was called frequently and ran for a long time on each call. After the team optimized the algorithm and its data structures, CPU usage dropped and response times improved dramatically.

6.2 Fault Diagnosis

In a production environment, a Linux server suffered frequent crashes. System logs hinted at memory errors. The ops team placed kprobes on key memory-management functions to capture parameters and return values. Tracing showed a kernel module leaking memory during large allocations. Using the ftrace function tracer, they followed the allocation path, fixed the leak, and restored system stability.

Overall, Linux kernel tracing mechanisms—probes, tracepoints, kprobes, uprobes, and ftrace—provide essential insight for performance tuning, debugging, and security monitoring. Combined with tools such as perf, the ftrace suite, and ply, they enable engineers to solve complex system problems efficiently.
