
Inside Android’s traced_perf: Architecture, IPC, and Unwinding Explained

This article provides an in‑depth technical walkthrough of Android’s traced_perf component, covering its code layout, integration with Perfetto, IPC mechanisms, perf_event handling, sample acquisition, stack unwinding, and trace writing, while illustrating each step with diagrams and reference links.

OPPO Kernel Craftsman

1. Overview of perf tools

Linux includes many performance analysis utilities. The perf tool (linux‑tools perf) was introduced in kernel 2.6.31 (2009) and can trace hardware performance counters, tracepoints, software counters, and dynamic probes. It exposes these as events via the perf_event_open syscall and provides sub‑commands such as stat, top, record, and report.

Android does not ship the vanilla linux‑tools perf. Instead it provides a customized tool called simpleperf (available since Android 6.0) and a newer component called traced_perf, which is built as a Perfetto producer (data source).

traced_perf code layout diagram

2. traced_perf code layout

traced_perf lives under the AOSP path external/perfetto/src/profiling/perf/, making it a sub‑directory of the Perfetto project.

The source tree can be divided into three parts:

BUILD.gn – build scripts

X_unittest.cc – unit tests

Main implementation files

The executable is launched by the rc script external/perfetto/traced_perf.rc and installed to /system/bin/traced_perf as a daemon.

traced_perf runtime diagram
traced_perf lifecycle diagram

3. traced_perf architecture

3.1 Perfetto framework

traced_perf acts as a Perfetto producer. Perfetto follows a service model consisting of a Tracing Service, producers, and consumers.

3.1.1 Producer

traced_perf registers two DataSourceDescriptors (“linux.perf” and “perfetto.metatrace”) with the Tracing Service through the ProducerEndpoint interface.

3.1.2 Tracing Service

The Tracing Service runs as the traced process, receives configuration from consumers, and forwards it to producers. Communication uses an IPC channel (Unix domain socket) and shared memory.

3.1.3 Consumer

Consumers such as Perfetto UI, shell commands, or custom Android consumers read trace data from the service.

3.2 Interaction flow

The overall flow has four stages: the producer connects to the service, registers its data sources, receives IPC messages, and handles the resulting events. Key classes include PerfProducer, ProducerEndpoint, and ClientImpl. The main Producer callbacks are OnConnect, OnDisconnect, SetupDataSource, and StartDataSource, among others.

overall interaction diagram
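
The callback sequence described above can be modelled in a few lines. This is a toy sketch only: ToyTracingService and ToyPerfProducer are hypothetical Python stand-ins, not Perfetto classes, and the real IPC transport is replaced by direct method calls. Only the callback names and the two data source names come from the article.

```python
from dataclasses import dataclass, field
from enum import Enum, auto


class State(Enum):
    NOT_CONNECTED = auto()
    CONNECTED = auto()


@dataclass
class ToyTracingService:
    """Hypothetical stand-in for the service end of the IPC channel."""
    registered: list = field(default_factory=list)

    def register_data_source(self, name):
        self.registered.append(name)


@dataclass
class ToyPerfProducer:
    """Hypothetical stand-in for PerfProducer; method names mirror the callbacks."""
    service: ToyTracingService
    state: State = State.NOT_CONNECTED
    configs: dict = field(default_factory=dict)   # instance id -> config
    started: set = field(default_factory=set)     # running instance ids

    def on_connect(self):
        # Mark the producer connected, then advertise its data sources.
        self.state = State.CONNECTED
        for name in ("linux.perf", "perfetto.metatrace"):
            self.service.register_data_source(name)

    def setup_data_source(self, instance_id, config):
        self.configs[instance_id] = config

    def start_data_source(self, instance_id, config):
        if self.state is not State.CONNECTED:
            raise RuntimeError("producer is not connected")
        self.configs.setdefault(instance_id, config)
        self.started.add(instance_id)

    def on_disconnect(self):
        # Drop all per-session state when the service goes away.
        self.state = State.NOT_CONNECTED
        self.configs.clear()
        self.started.clear()


service = ToyTracingService()
producer = ToyPerfProducer(service)
producer.on_connect()
producer.setup_data_source(1, {"name": "linux.perf"})
producer.start_data_source(1, {"name": "linux.perf"})
```

The point of the model is the ordering: data sources are registered once per connection, while setup and start arrive later, once per tracing session.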

4. Event handling in traced_perf

traced_perf implements the required Producer callbacks to receive profiling events, open perf_event file descriptors, and forward samples to the unwinder.

Key steps:

OnConnect – set state to kConnected and register data sources.

StartDataSource – receives a DataSourceInstanceID and a DataSourceConfig protobuf.

MetaTraceSource – creates a TraceWriter for meta‑trace data.

Tracepoint‑ID mapping – reads the ID from /sys/kernel/debug/tracing/events/.../id.

Opening perf events – builds a perf_event_attr structure and calls perf_event_open.

perf_event_open flow
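
The last two steps, reading a tracepoint ID and filling in perf_event_attr, can be sketched as follows. The numeric constants are taken from <linux/perf_event.h>. The helper names (read_tracepoint_id, pack_attr_v0) are illustrative, and the sketch packs only the original 64‑byte attr layout on a little‑endian host. The attr that traced_perf builds is larger and sets more fields, and the syscall itself is not issued here.

```python
import struct
import tempfile
from pathlib import Path

# Constants from <linux/perf_event.h>.
PERF_TYPE_TRACEPOINT = 2
PERF_SAMPLE_IP = 1 << 0
PERF_SAMPLE_TID = 1 << 1
PERF_SAMPLE_TIME = 1 << 2
PERF_SAMPLE_CALLCHAIN = 1 << 5
PERF_SAMPLE_CPU = 1 << 7
PERF_ATTR_SIZE_VER0 = 64        # size of the first published attr layout
ATTR_FLAG_DISABLED = 1 << 0     # "disabled" is bit 0 of the flags bitfield


def read_tracepoint_id(events_root, group, name):
    """Read the numeric tracepoint ID from <events_root>/<group>/<name>/id."""
    return int(Path(events_root, group, name, "id").read_text().strip())


def pack_attr_v0(event_type, config, sample_period, sample_type,
                 flags=ATTR_FLAG_DISABLED):
    """Pack the 64-byte VER0 layout of perf_event_attr (little-endian assumed)."""
    # Field order: type, size, config, sample_period, sample_type,
    # read_format, flags, wakeup_events, bp_type, config1.
    return struct.pack("<IIQQQQQIIQ", event_type, PERF_ATTR_SIZE_VER0, config,
                       sample_period, sample_type, 0, flags, 0, 0, 0)


# A temporary directory stands in for the tracefs events directory so the
# sketch runs without root; the group and event names are made up.
with tempfile.TemporaryDirectory() as events_root:
    id_file = Path(events_root, "demo_group", "demo_event", "id")
    id_file.parent.mkdir(parents=True)
    id_file.write_text("314\n")
    tracepoint_id = read_tracepoint_id(events_root, "demo_group", "demo_event")

sample_type = (PERF_SAMPLE_IP | PERF_SAMPLE_TID | PERF_SAMPLE_TIME |
               PERF_SAMPLE_CPU | PERF_SAMPLE_CALLCHAIN)
# For a tracepoint event, the ID read from tracefs goes into attr.config.
attr = pack_attr_v0(PERF_TYPE_TRACEPOINT, tracepoint_id, 1, sample_type)
```

On a device, the events root would be the tracing events directory quoted in the step list, and the packed structure is what perf_event_open receives as its first argument.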

5. Sample acquisition

Samples are read from the kernel‑provided ring buffer (perf‑event mmap area). The read loop uses EventReader::ReadUntilSample to obtain ParsedSample objects, which contain common data, registers, user stack, and kernel IPs.
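
As a rough illustration of that read loop, the sketch below drains records from a ring buffer given the data_tail and data_head positions. It is a simplified model, not the EventReader code: the buffer is a plain bytes object rather than an mmap region, the memory barriers needed around data_head and data_tail are omitted, and the helper names are made up.

```python
import struct

HEADER = struct.Struct("<IHH")  # perf_event_header: u32 type, u16 misc, u16 size


def read_bytes(ring, offset, length):
    """Copy `length` bytes starting at `offset`, wrapping at the end of the ring."""
    start = offset % len(ring)
    first = ring[start:start + length]
    return first + ring[:length - len(first)]


def drain(ring, data_tail, data_head):
    """Return ([(type, misc, payload), ...], new_tail) for records in [tail, head)."""
    records = []
    while data_tail < data_head:
        rtype, misc, size = HEADER.unpack(read_bytes(ring, data_tail, HEADER.size))
        if size < HEADER.size or data_tail + size > data_head:
            break  # corrupt or not-yet-complete record
        payload = read_bytes(ring, data_tail + HEADER.size, size - HEADER.size)
        records.append((rtype, misc, payload))
        data_tail += size
    return records, data_tail


# Build a 32-byte ring holding two 16-byte records; the first one wraps around.
ring = bytearray(32)


def put(offset, data):
    for i, byte in enumerate(data):
        ring[(offset + i) % len(ring)] = byte


put(24, HEADER.pack(9, 0, 16) + b"A" * 8)   # type 9 is PERF_RECORD_SAMPLE
put(40, HEADER.pack(2, 0, 16) + b"B" * 8)   # type 2 is PERF_RECORD_LOST
records, new_tail = drain(bytes(ring), 24, 56)
```

The positions grow monotonically and are reduced modulo the buffer size only when indexing, which is why a record can straddle the end of the buffer and must be copied in two pieces.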

Parsing the perf_event_header and the optional fields depends on the sample_type bits set in perf_event_attr.

sample structure diagram
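
To make the dependence on sample_type concrete, here is a minimal parser for the body of a PERF_RECORD_SAMPLE. It handles only five sample_type bits and assumes a little‑endian layout. The field order follows the perf_event_open man page. The function name and dictionary keys are illustrative, and the real ParsedSample also carries registers and user stack data, which this sketch does not decode.

```python
import struct

# sample_type bits from <linux/perf_event.h>.
PERF_SAMPLE_IP = 1 << 0
PERF_SAMPLE_TID = 1 << 1
PERF_SAMPLE_TIME = 1 << 2
PERF_SAMPLE_CALLCHAIN = 1 << 5
PERF_SAMPLE_CPU = 1 << 7
SUPPORTED = (PERF_SAMPLE_IP | PERF_SAMPLE_TID | PERF_SAMPLE_TIME |
             PERF_SAMPLE_CPU | PERF_SAMPLE_CALLCHAIN)


def parse_sample(payload, sample_type):
    """Decode the sample body (the bytes after perf_event_header)."""
    if sample_type & ~SUPPORTED:
        raise ValueError("sample_type has bits this sketch does not handle")
    offset = 0
    sample = {}

    def take(fmt):
        nonlocal offset
        values = struct.unpack_from("<" + fmt, payload, offset)
        offset += struct.calcsize("<" + fmt)
        return values

    # Fields appear in this fixed order, each present only if its bit is set.
    if sample_type & PERF_SAMPLE_IP:
        (sample["ip"],) = take("Q")
    if sample_type & PERF_SAMPLE_TID:
        sample["pid"], sample["tid"] = take("II")
    if sample_type & PERF_SAMPLE_TIME:
        (sample["time"],) = take("Q")
    if sample_type & PERF_SAMPLE_CPU:
        sample["cpu"], _reserved = take("II")
    if sample_type & PERF_SAMPLE_CALLCHAIN:
        (count,) = take("Q")
        sample["callchain"] = list(take(f"{count}Q"))
    return sample


# A synthetic sample body: ip, pid/tid, time, cpu/reserved, 2-entry callchain.
payload = struct.pack("<QIIQIIQQQ", 0x7F0000001000, 42, 43, 123456789, 3, 0,
                      2, 0xFFFFFFC000080000, 0x7F0000001000)
sample = parse_sample(payload, SUPPORTED)
```

Because absent fields take up no space, the same bytes decode differently under a different sample_type, so the reader has to use exactly the mask that was passed to perf_event_open.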

6. Unwinding

After samples are parsed, Unwinder::ConsumeAndUnwindReadySamples performs stack unwinding.

Kernel frames arrive in the sample as raw instruction addresses, and /proc/kallsyms is used to map those addresses to symbols. User‑space unwinding requires the sampled registers, a copy of the stack memory, and the target process’s /proc/<pid>/maps and /proc/<pid>/mem. Because of Android’s stricter SELinux policies, traced_perf obtains these files via a signal‑based request handled by AndroidRemoteDescriptorGetter, which receives the file descriptors over a Unix socket.
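
The kernel‑side lookup can be sketched as a sorted table plus a binary search. This is an illustration of the general technique, not the Perfetto symbolizer: the sample text below is synthetic, the function names are made up, and the nearest preceding symbol is returned without checking symbol sizes.

```python
import bisect

# Synthetic excerpt in /proc/kallsyms format: address, type, name, [module].
KALLSYMS = """\
0000000000000000 A hidden_symbol
ffffffc000080000 T _text
ffffffc000081000 T do_one_thing
ffffffc000082000 t do_another_thing\t[demo_module]
"""


def parse_kallsyms(text):
    """Return a list of (address, name) sorted by address."""
    symbols = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 3:
            continue
        address = int(parts[0], 16)
        if address == 0:
            continue  # addresses read as zero when the kernel hides them
        symbols.append((address, parts[2]))
    symbols.sort()
    return symbols


def symbolize(symbols, ip):
    """Return (name, offset) of the nearest symbol at or below ip, else None."""
    addresses = [address for address, _ in symbols]
    index = bisect.bisect_right(addresses, ip) - 1
    if index < 0:
        return None
    address, name = symbols[index]
    return name, ip - address


symbols = parse_kallsyms(KALLSYMS)
frame = symbolize(symbols, 0xFFFFFFC000081234)
```

A production lookup would build the address list once and reuse it, and would treat addresses far beyond the last symbol as unknown.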

signal based descriptor request
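
The descriptor hand‑off relies on a standard Unix mechanism: open file descriptors can be sent over a Unix domain socket as SCM_RIGHTS ancillary data. The sketch below shows that mechanism in isolation. It is not the AndroidRemoteDescriptorGetter code: the signal step is omitted, both ends live in one process, and an ordinary temporary file stands in for the /proc files.

```python
import array
import os
import socket
import tempfile


def send_fds(sock, fds, payload=b"x"):
    """Send open file descriptors as SCM_RIGHTS ancillary data."""
    ancillary = [(socket.SOL_SOCKET, socket.SCM_RIGHTS, array.array("i", fds))]
    sock.sendmsg([payload], ancillary)


def recv_fds(sock, max_fds):
    """Receive up to max_fds descriptors; returns (payload, [fd, ...])."""
    fds = array.array("i")
    payload, ancillary, _flags, _addr = sock.recvmsg(
        16, socket.CMSG_LEN(max_fds * fds.itemsize))
    for level, kind, data in ancillary:
        if level == socket.SOL_SOCKET and kind == socket.SCM_RIGHTS:
            # Ignore any trailing bytes that do not form a whole int.
            fds.frombytes(data[:len(data) - (len(data) % fds.itemsize)])
    return payload, list(fds)


def fetch_via_socket(path):
    """The 'target' end opens path; the 'profiler' end reads it via the passed fd."""
    target, profiler = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    with target, profiler:
        fd = os.open(path, os.O_RDONLY)
        try:
            send_fds(target, [fd])
        finally:
            os.close(fd)  # the receiver gets its own duplicate
        _, received = recv_fds(profiler, max_fds=1)
    with os.fdopen(received[0], "rb") as handle:
        return handle.read()


with tempfile.NamedTemporaryFile() as stand_in:
    stand_in.write(b"stand-in for a /proc file\n")
    stand_in.flush()
    content = fetch_via_socket(stand_in.name)
```

The receiver never opens the path itself. It only reads through a descriptor that the other side opened, which is what makes the scheme compatible with a policy that denies direct access.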

Once the descriptors are received, they are passed to the UnwinderHandle, which calls into libunwindstack to produce a call stack.
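
One thing an unwinder needs from the maps descriptor is the ability to locate the mapping that contains a given program counter. The sketch below shows only that lookup on synthetic data. It is not libunwindstack: the library path is made up, the helper names are illustrative, and the real unwinder goes on to read unwind tables from the mapped file.

```python
from typing import NamedTuple

# Synthetic excerpt in /proc/<pid>/maps format:
# address range, permissions, file offset, device, inode, path.
MAPS = """\
7f0000000000-7f0000002000 r--p 00000000 fd:00 1234 /system/lib64/libdemo.so
7f0000002000-7f0000005000 r-xp 00002000 fd:00 1234 /system/lib64/libdemo.so
7ffc00000000-7ffc00021000 rw-p 00000000 00:00 0 [stack]
"""


class Mapping(NamedTuple):
    start: int
    end: int
    perms: str
    offset: int
    path: str


def parse_maps(text):
    """Parse maps text into a list of Mapping tuples."""
    mappings = []
    for line in text.splitlines():
        fields = line.split(None, 5)
        if len(fields) < 5:
            continue
        start, end = (int(part, 16) for part in fields[0].split("-"))
        path = fields[5].strip() if len(fields) == 6 else ""
        mappings.append(Mapping(start, end, fields[1], int(fields[2], 16), path))
    return mappings


def find_mapping(mappings, pc):
    """Return the mapping whose [start, end) range contains pc, or None."""
    for mapping in mappings:
        if mapping.start <= pc < mapping.end:
            return mapping
    return None


def file_offset(mapping, pc):
    """Translate a runtime address into an offset within the mapped file."""
    return pc - mapping.start + mapping.offset


mappings = parse_maps(MAPS)
hit = find_mapping(mappings, 0x7F0000002ABC)
offset_in_file = file_offset(hit, 0x7F0000002ABC)
```

The mem descriptor complements this: where the stack snapshot in the sample does not cover an address, the unwinder can read the target's memory through it.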

7. Writing samples

Unwound samples are written to the trace as TracePacket protobuf messages through the TraceWriter.

trace writer packet

8. Summary

traced_perf integrates into Perfetto as a producer, acquires perf events, parses them, performs kernel and user‑space unwinding, and writes the results to the trace. The design follows a clear service‑producer‑consumer model.

9. References

traced_perf source: https://cs.android.com/

Perfetto documentation: https://perfetto.dev/docs

Linux perf history: https://en.wikipedia.org/wiki/Perf_(Linux)

simpleperf documentation: https://android.googlesource.com/platform/prebuilts/simpleperf/+/782cdf2ea6e33f2414b53884742d59fe11f01ebe/README.md

perf_event_open man page: https://man7.org/linux/man-pages/man2/perf_event_open.2.html
