Unlock Faster System Performance Analysis with Alibaba’s Open‑Source ssar Tool
This article introduces the open‑source ssar system‑performance monitoring tool, explains its architecture, compares it with traditional sar utilities, demonstrates fast‑iteration development, showcases load5s metrics and detailed command usage, and provides configuration guidance for precise Linux performance diagnostics.
1. Positioning of the ssar System Performance Analysis Tool
Brendan Gregg's book *Systems Performance* (the Chinese edition is titled 性能之巅) classifies observability tools into counters, tracing, profiling, and monitoring. Depending on how a tool acquires its data and whether it works in real time or on recorded history, tools fall into four quadrants (A‑D). This article focuses on quadrant B (system performance monitoring) and quadrant C (tracing & profiling).
2. Introduction to ssar
ssar is an open‑source system‑performance monitoring tool released by Alibaba and hosted in the OpenAnolis (龙蜥) community. It covers all traditional sar functions and adds further machine‑level metrics, process‑level metrics, and the unique load5s metric for precise load diagnosis.
ssar consists of a collector (sresar), a generic query command (ssar), a classic tsar2 wrapper, and an enhanced ssar+ wrapper.
3. Rapid Development & Iteration
Traditional sar tools require code changes for each new metric, which leads to long release cycles. ssar collects data file by file, so a new metric can be added by listing its source file in sys.conf, without recompiling. Queries go through the single generic ssar command, and more complex logic can be handled in Python wrappers.
4. Machine‑Level Metrics
ssar provides both incremental metrics (marked with a /s suffix) and instantaneous metrics. The unique load5s metric reports the number of threads in the R (runnable) and D (uninterruptible sleep) states at 5‑second granularity, which detects load changes more accurately than the traditional load1.
Additional views (loadrd, stack, loadr, loadd, psr, stackinfo) expose thread states, call stacks, and per‑CPU aggregation for deep diagnosis.
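As a rough illustration of the R+D count behind load5s, the sketch below tallies thread states from lines in the format of /proc/<pid>/task/<tid>/stat. It is not ssar's implementation; the helper names and the sample lines are made up for the example, and the lines are truncated to the first few fields.

```python
def thread_state(stat_line: str) -> str:
    """Return the one-letter state from a /proc/<pid>/task/<tid>/stat line."""
    # The comm field is wrapped in parentheses and may itself contain spaces
    # or ')' characters, so split after the LAST closing parenthesis.
    after_comm = stat_line.rsplit(")", 1)[1]
    return after_comm.split()[0]


def count_r_and_d(stat_lines):
    """Count runnable (R) and uninterruptible-sleep (D) threads."""
    counts = {"R": 0, "D": 0}
    for line in stat_lines:
        state = thread_state(line)
        if state in counts:
            counts[state] += 1
    return counts


# Hypothetical, truncated stat lines used only for illustration.
sample = [
    "101 (java) R 1 101 101 0 -1",
    "102 (kworker/0:1) D 2 0 0 0 -1",
    "103 (my (odd) name) S 1 103 103 0 -1",
    "104 (stress) R 1 104 104 0 -1",
]
counts = count_r_and_d(sample)
print(counts, "R+D =", counts["R"] + counts["D"])
```

On a live system the same function could be fed the contents of every /proc/<pid>/task/<tid>/stat file, although a scan like that is not atomic and only approximates what the kernel counts.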
5. Process‑Level Metrics
The ssar procs command displays historical process information similar to ps. Options support time range selection, field selection, sorting, and special views (--job, --sched) for group and scheduling analysis.
6. Load5s Metric Demonstration
Experiments with stress and custom uninterruptible workloads show load5s reacts instantly to load changes, while load1 lags, confirming load5s as a more precise indicator of system pressure.
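The lag of load1 follows from how Linux computes it: roughly every 5 seconds the kernel samples the R+D count and folds it into an exponentially damped average with a one‑minute time constant. The sketch below reproduces that damping in floating point (the kernel uses fixed‑point arithmetic) for a hypothetical workload; the workload values are assumptions, not measurements from the article.

```python
import math

SAMPLE_INTERVAL = 5                            # seconds between load samples
DECAY_1MIN = math.exp(-SAMPLE_INTERVAL / 60)   # damping factor for load1


def simulate_load1(active_counts, start=0.0):
    """Apply the 1-minute exponential damping to a series of R+D samples."""
    load1 = start
    series = []
    for active in active_counts:
        load1 = load1 * DECAY_1MIN + active * (1 - DECAY_1MIN)
        series.append(load1)
    return series


# Hypothetical workload: 8 R+D threads for 30 s, then idle for 30 s.
samples = [8] * 6 + [0] * 6
load1_series = simulate_load1(samples)

for step, (instant, damped) in enumerate(zip(samples, load1_series), start=1):
    print(f"t={step * SAMPLE_INTERVAL:3d}s  R+D sample={instant}  load1~{damped:.2f}")
```

The raw 5‑second samples jump to 8 and back to 0 immediately, while the damped value climbs only part of the way during the burst and is still well above zero 30 seconds after the workload stops.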
7. Configuration Files
The main configuration /etc/ssar/ssar.conf contains [main], [load], and [proc] sections. Options control data retention, disk‑space thresholds, feature flags, and collection intervals. The collector reads files defined in /etc/ssar/sys.conf (e.g., /proc/stat, /proc/meminfo).
8. CPU Usage Correlation
Comparisons between top, tsar2, and ssar illustrate how ssar’s raw tick counts map to percentage values, linking machine‑wide CPU usage with per‑process statistics.
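The mapping from raw ticks to percentages can be sketched as follows: take two snapshots of the aggregate cpu line from /proc/stat, subtract them, and divide each delta by the total. This is a minimal sketch of the general technique, not the code used by top, tsar2, or ssar; the snapshot values are invented for the example.

```python
FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal")


def parse_cpu_line(line):
    """Map the first eight counters of a /proc/stat 'cpu' line to field names."""
    parts = line.split()
    # The guest and guest_nice columns are skipped: the kernel already
    # includes that time in user and nice, so adding them would double count.
    return dict(zip(FIELDS, (int(v) for v in parts[1:1 + len(FIELDS)])))


def cpu_percentages(before, after):
    """Convert tick deltas between two snapshots into percentages of total time."""
    deltas = {name: after[name] - before[name] for name in FIELDS}
    total = sum(deltas.values())
    if total == 0:
        return {name: 0.0 for name in FIELDS}
    return {name: 100.0 * delta / total for name, delta in deltas.items()}


# Made-up snapshots taken one interval apart (values are illustrative only).
snap1 = parse_cpu_line("cpu 1000 0 500 8000 100 0 0 0 0 0")
snap2 = parse_cpu_line("cpu 1300 0 600 8500 200 0 0 0 0 0")
pct = cpu_percentages(snap1, snap2)
print({name: round(value, 1) for name, value in pct.items()})
```

Per‑process utime and stime in /proc/<pid>/stat are expressed in the same clock ticks, so a process's tick delta over the same interval can be set against the machine‑wide total to relate the two views.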
9. Memory Reclamation Case Study
A detailed scenario shows how massive Java memory allocation triggers kswapd, direct memory reclamation, high sys‑CPU, and elevated load5s. ssar metrics trace the entire chain from free‑memory thresholds to kernel‑level thread states.
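One way to see background versus direct reclaim independently of ssar is to compare two snapshots of /proc/vmstat. The sketch below assumes a kernel that exposes the aggregated counters pgscan_kswapd and pgscan_direct (older kernels split them per zone); the function names and sample values are hypothetical.

```python
def parse_vmstat(text):
    """Parse /proc/vmstat-style 'name value' lines into a dict of ints."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            counters[parts[0]] = int(parts[1])
    return counters


def counter_deltas(before, after, names):
    """Return how much each named counter grew between two snapshots."""
    return {name: after.get(name, 0) - before.get(name, 0) for name in names}


# Invented snapshots; real input would be the text of /proc/vmstat.
before_text = "pgscan_kswapd 1000\npgscan_direct 50\npgsteal_kswapd 900\n"
after_text = "pgscan_kswapd 5000\npgscan_direct 2050\npgsteal_kswapd 4000\n"

deltas = counter_deltas(
    parse_vmstat(before_text),
    parse_vmstat(after_text),
    ("pgscan_kswapd", "pgscan_direct"),
)
print(deltas)
```

A growing pgscan_direct delta indicates that allocating threads are reclaiming memory themselves rather than leaving it to kswapd, which matches the high sys‑CPU stage of the scenario.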
Source code and binaries are available at https://gitee.com/anolis/tracing-ssar.git
