Tagged articles
240 articles
Linux Kernel Journey
May 7, 2026 · Backend Development

KernelScript: A Unified Language for Full‑Stack eBPF Development

KernelScript tackles the growing complexity of eBPF projects by unifying kernel‑side programs, userspace loaders, and kernel modules into a single codebase, using annotations to let the compiler generate the necessary glue code, thereby reducing boilerplate and improving team productivity.

Compiler design · KernelScript · Linux kernel
15 min read
Linux Kernel Journey
May 6, 2026 · Operations

How eBPF and AI Redefine Mobile Microarchitectural Energy‑Efficiency Analysis

By combining low‑overhead eBPF data collection with AI‑driven diagnosis and an agent‑based execution layer, the authors present a three‑tier system that shifts mobile optimization from peak performance to sustained energy efficiency, achieving sub‑1% monitoring overhead and up to 20% power savings in real‑world video workloads.

AI · Agent Architecture · eBPF
12 min read
Linux Kernel Journey
Apr 30, 2026 · Information Security

Understanding the BPF Loader and Its Signature Schemes

This article provides an in‑depth technical walkthrough of the eBPF loader implementation, explains two signature schemes, details map creation, hash calculation, CO‑RE relocation handling, the use_loader mode, kernel‑side verification via Hornet LSM, and discusses the advantages, limitations, and TOCTOU concerns.

BPF loader · CO-RE · Hornet LSM
31 min read
IT Services Circle
Apr 7, 2026 · Industry Insights

How a Single 8 GB Server Powered 500 K Users for 15 Years – The Webminal Story

Webminal, a free online Linux learning platform, has survived for fifteen years on a single 8 GB CentOS server, serving over half a million users by using a minimalist stack—including Python 2.7, Flask, Shellinabox, User Mode Linux and eBPF—while deliberately avoiding modern container orchestration and commercial monetisation.

Case Study · Infrastructure · Online Linux
10 min read
Black & White Path
Apr 3, 2026 · Information Security

Can You Trust ps, netstat, and ss on a Compromised Linux Host? Meet LinIR

The article examines why traditional Linux commands like ps, netstat, and ss cannot be trusted on a potentially root‑kit‑infected system, introduces the LinIR tool that collects forensic data without relying on the host's user‑space toolchain, and compares it against manual scripts, other automation tools, and commercial EDR solutions.

Go · LinIR · Linux incident response
14 min read
Java Tech Enthusiast
Apr 2, 2026 · Industry Insights

How a Single 8 GB Server Powered 500k Users for 15 Years – The Webminal Story

Webminal, a free online Linux learning platform built in 2010, has survived 15 years on a single 8 GB CentOS server, serving over 500,000 users by using minimalist architecture, User Mode Linux, Shellinabox, eBPF monitoring, and a community‑first mindset despite multiple outages and failed monetisation attempts.

CaseStudy · LinuxEducation · Shellinabox
9 min read
Woodpecker Software Testing
Mar 4, 2026 · Artificial Intelligence

Deep Dive into Adversarial Testing Performance Optimization for AI Systems

The article examines Adversarial Testing Performance Optimization (ATPO) as a new industrial-quality paradigm, detailing how adversarial samples expose hidden performance bottlenecks across AI pipelines, presenting three typical adversarial loads with corresponding optimization targets, common implementation pitfalls, and emerging intelligent approaches using reinforcement learning and digital twins.

AI pipelines · Digital Twin · Performance Optimization
8 min read
Deepin Linux
Dec 27, 2025 · Operations

Master Linux Kernel Stack Tracing: From GDB Basics to Advanced ftrace & eBPF

This practical guide walks you through Linux kernel stack backtracing, covering GDB installation, core dump analysis, handling corrupted stacks, and advanced tracing techniques using ftrace and eBPF, with step‑by‑step commands, code examples, and troubleshooting tips to pinpoint the root cause of crashes.

Linux · eBPF · ftrace
32 min read
Linux Kernel Journey
Dec 21, 2025 · Artificial Intelligence

How to Trace Intel NPU Kernel Driver Operations Using eBPF and bpftrace

This tutorial explains how to use eBPF and bpftrace to monitor the Intel NPU kernel driver on Lunar Lake and Meteor Lake CPUs, mapping Level Zero API calls to kernel ioctls, tracking memory allocation, IPC communication, and identifying performance bottlenecks through detailed function‑call statistics.

Intel NPU · Level Zero API · bpftrace
17 min read
Alibaba Cloud Observability
Nov 25, 2025 · Cloud Native

How SysOM Uncovers Hidden Memory Usage in Cloud‑Native Environments

In cloud‑native deployments, container abstraction hides memory consumption, leading to high file cache, SReclaimable, cgroup leaks, and invisible kernel‑allocated memory, but SysOM’s non‑intrusive, low‑overhead diagnostics map pages to inodes and containers to pinpoint the root causes quickly.

Cloud Native · SysOM · container monitoring
13 min read
Efficient Ops
Nov 17, 2025 · Operations

Mastering pwru: A Step‑by‑Step Guide to eBPF Packet Tracing with Cilium

This article introduces pwru, Cilium's eBPF‑based packet‑tracing tool, explains kernel requirements, shows how to install the pre‑built binary, details command‑line options, and provides practical examples for filtering, output customization, and debugging dropped packets in Linux networking.

Cilium · Linux networking · Operations
6 min read
Tencent Technical Engineering
Nov 17, 2025 · Backend Development

How Profile‑Guided Optimization Supercharged WeChat’s Backend Services

This article details the year‑long exploration of Profile‑Guided Optimization (PGO) for WeChat’s backend, covering its theory, compiler implementations, practical experiments with Propeller and BOLT, transparent eBPF sampling, engineering challenges, and the measurable CPU and memory savings achieved across production services.

PGO · backend services · compiler
48 min read
vivo Internet Technology
Nov 12, 2025 · Fundamentals

Linux Kernel Innovations Powering the AI Agent Era – Highlights from China’s 20th CLK

The 20th China Linux Kernel Developers Conference, hosted by vivo, presented eleven technical talks covering AI‑driven kernel challenges, memory‑compression techniques, heterogeneous compression, async file‑cache management, uncached I/O, direct I/O for compressed files, parallel writeback, host‑initiated defragmentation, zoned storage, energy‑efficient I/O, and eBPF‑based CPU idle policies, each with concrete performance results and implementation details.

AI · File Systems · Linux kernel
12 min read
Linux Kernel Journey
Nov 4, 2025 · Operations

How to Use Kernel Tracepoints for Zero‑Overhead GPU Driver Monitoring

This tutorial explains how to leverage Linux kernel tracepoints with eBPF and bpftrace to capture real‑time GPU driver activity—including job scheduling, memory management, and command submission—across Intel, AMD, Nouveau, and NVIDIA GPUs, providing detailed examples, scripts, and analysis of the resulting data.

DRM · GPU · Performance Monitoring
20 min read
MaGe Linux Operations
Oct 27, 2025 · Operations

Essential Ops Playbook: Real‑World Linux Tuning & Incident Diagnosis

This article walks ops engineers through a real production incident, explains why deep Linux kernel knowledge is crucial, presents typical high‑traffic, log‑burst, and DB‑slow‑query scenarios, and shares a three‑step practical tuning methodology with code snippets, monitoring scripts, and future‑proof tips such as eBPF and AIOps.

Linux · Operations · System Tuning
14 min read
Linux Kernel Journey
Oct 27, 2025 · Fundamentals

Exploring eBPF‑Based Programmable Memory Management in the Linux Kernel

This article examines recent efforts to make Linux kernel memory management programmable with eBPF, covering BPF‑MM patches for mTHP order, cache‑ext’s customizable LRU, FetchBPF prefetch policies, and BPF OOM hooks, and discusses their design, implementation details, and performance impacts.

LRU · Linux kernel · Memory Management
8 min read
Linux Kernel Journey
Oct 21, 2025 · Industry Insights

Bridging the GPU Observability Gap: Why eBPF on GPUs Matters

The article explains how bpftime extends eBPF to NVIDIA and AMD GPUs, exposing fine‑grained execution details that traditional CPU‑side tools miss, and demonstrates a unified, programmable observability stack that overcomes the limitations of existing GPU profilers in both synchronous and asynchronous workloads.

CUDA · GPU · Observability
23 min read
Linux Code Review Hub
Oct 9, 2025 · Operations

Non‑Intrusive MCP Observability with eBPF: Introducing MCPSpy

The article explains how the emerging Model Context Protocol (MCP) for AI tools lacks visibility, outlines security and monitoring challenges, compares alternative tracing methods, and presents MCPSpy—a Linux‑only eBPF‑based, non‑intrusive solution that captures MCP stdio traffic, parses JSON‑RPC messages, and outputs human‑readable or JSON logs.

AI security · Go · MCP
17 min read
Efficient Ops
Sep 28, 2025 · Cloud Native

Why Cilium Is the Game-Changing Cloud‑Native CNI for Kubernetes

Cilium leverages eBPF to provide a high‑performance, secure, and observable cloud‑native networking solution for Kubernetes, offering flat L3 networking, flexible routing, advanced load balancing, identity‑based security policies, and seamless integration via CNI, Helm, and Hubble, with step‑by‑step deployment instructions.

CNI · Cilium · Cloud Native Networking
8 min read
MaGe Linux Operations
Sep 24, 2025 · Operations

How I Pinpointed the Real Culprit of a 100% CPU Spike in Production in Just 3 Minutes

After a production server hit 100% CPU at 3 AM, the author walks through a three‑minute, step‑by‑step method (quickly identifying the offending process, drilling into threads, and pinpointing the problematic code) and shares useful shell commands, common pitfalls, and advanced safeguards such as cgroup limits and eBPF tracing.

CPU troubleshooting · Linux performance · Operations
9 min read
Ops Community
Sep 3, 2025 · Operations

Master TCP/IP, Routing, and Firewall Techniques for Advanced Ops Engineers

An in‑depth guide for operations engineers covering TCP/IP stack fundamentals, practical routing and firewall configurations, kernel and NIC tuning, automation scripts, and emerging technologies such as eBPF, providing real‑world case studies and step‑by‑step commands to master network reliability and performance.

Kernel Tuning · TCP/IP · eBPF
24 min read
Deepin Linux
Aug 24, 2025 · Information Security

How PacketScope Uses eBPF to Visualize and Secure TCP/IP Protocol Interactions

PacketScope leverages eBPF to provide a real-time, kernel-level visualization of TCP/IP protocol interactions, enabling detailed security analysis, performance diagnostics, and zero-delay defense, while offering installation guides and a UI that highlights packet analysis, function call chains, and cross-layer metrics.

eBPF · network tracing · protocol interaction
12 min read
Alibaba Cloud Native
Jul 29, 2025 · Cloud Native

How LoongCollector Redefines Cloud‑Native Observability for AI Workloads

LoongCollector, the core component of Alibaba Cloud's LoongSuite, delivers zero‑intrusion, multi‑tenant, high‑performance data collection and processing for AI services, integrating logs, metrics, traces, events, and profiles into a unified, programmable pipeline that scales elastically across heterogeneous GPU clusters.

AI · cloud-native · data collection
17 min read
MaGe Linux Operations
Jul 23, 2025 · Cloud Native

Build a Real‑Time eBPF‑Based Kubernetes Network Anomaly Detector

This article walks through designing and implementing a zero‑intrusion, real‑time network anomaly detection system for Kubernetes using eBPF, covering architecture, kernel‑space eBPF programs, Go user‑space collectors, deployment via DaemonSet, performance optimizations, alerting integration with Prometheus/Grafana, and real‑world case studies.

Go · Grafana · Kubernetes
16 min read
Big Data Technology Tribe
Jun 11, 2025 · Fundamentals

Mastering eBPF with BCC: A Step‑by‑Step Guide to Building the opensnoop Tool

This article outlines the standard BCC workflow for creating eBPF tools, then dissects the opensnoop source code, covering requirement analysis, kernel‑space program writing, BPF map configuration, user‑space Python integration, argument handling, testing, optimization, and deployment steps to monitor open system calls.

BCC · Linux tracing · Python
13 min read
Linux Kernel Journey
Jun 9, 2025 · Fundamentals

How to Trace CUDA GPU Operations with eBPF

This tutorial explains how to build an eBPF‑based tracing tool that intercepts CUDA runtime API calls via uprobes, captures detailed event data such as memory sizes, transfer directions, kernel launches and errors, and presents it in a readable format for debugging and performance analysis.

Benchmark · CUDA · GPU tracing
17 min read
Deepin Linux
Jun 6, 2025 · Fundamentals

How eBPF Can Tackle Linux Memory Fragmentation and Boost Android Performance

This article explains the problem of internal and external memory fragmentation in Linux systems, introduces eBPF as a powerful tracing tool, and provides step‑by‑step guidance for building, loading, and running eBPF programs to analyze and mitigate fragmentation on both Linux and Android platforms.

Android · BPF · Linux kernel
22 min read
Alibaba Cloud Observability
May 19, 2025 · Information Security

How Tool‑Poisoning Attacks Exploit MCP and What to Do About It

This article analyzes the security risks of the Model Context Protocol (MCP), demonstrates a tool‑poisoning attack that steals private keys via malicious tool descriptions, explores client‑side and server‑side threat vectors, and presents observability‑based mitigation using eBPF and LoongCollector.

AI model security · MCP · Observability
23 min read
Linux Kernel Journey
May 5, 2025 · Operations

Reflections on the 3rd eBPF Developer Conference: Harnessing eBPF for AI

The article recaps the 3rd eBPF Developer Conference in Xi'an, highlighting talks on BPF‑on‑MPTCP, system‑wide PGO, bperf, autonomous‑driving use cases, and AI‑driven observability, while sharing the author's insights on continuous profiling, SysOM, and future challenges of scaling eBPF with large models.

AI · Linux · Observability
10 min read
Alibaba Cloud Infrastructure
May 1, 2025 · Artificial Intelligence

Fine-grained Profiling of Online AI Workloads on Kubernetes Using ACK AI Profiling

This article demonstrates how to use ACK AI Profiling, built on eBPF and dynamic process injection, to perform non-intrusive, low‑overhead profiling of Kubernetes‑deployed large‑language‑model inference services, identify GPU memory growth causes, and apply optimization recommendations to prevent OOM issues.

AI profiling · GPU Memory · Kubernetes
10 min read
Linux Kernel Journey
Apr 23, 2025 · Industry Insights

Highlights from the 3rd eBPF Developer Conference: A Technical Recap

The 3rd eBPF Developer Conference, held on April 19, 2025 at Xi'an University of Posts and Telecommunications, featured 36 expert talks on eBPF advancements, network and security innovations, observability, and performance optimization, along with a vibrant project marketplace and student projects; video and PPT resources are available to the community.

Linux kernel · Observability · Security
7 min read
Linux Kernel Journey
Apr 15, 2025 · Operations

Efficiently Resolving Performance Bottlenecks and Jitter with Process Hotspot Tracing in Alibaba Cloud OS Console

The article explains how Alibaba Cloud's SysOM console uses low‑overhead process hotspot tracing, stack unwinding, symbol resolution, eBPF and AI diagnostics to pinpoint CPU, memory, lock and network issues, offering visual flame‑graph analysis and real‑world case studies for faster root‑cause identification.

AI diagnostics · Cloud Native · SysOM
15 min read
Alibaba Cloud Infrastructure
Apr 14, 2025 · Operations

Process Hotspot Tracing and Performance Analysis with SysOM

This article explains the concept of process hotspot tracing, analyzes common performance pain points in cloud‑native environments, and details SysOM's solution, including stack unwinding, symbol resolution, flame‑graph generation, and real‑world case studies, to help developers and operators quickly locate and resolve system bottlenecks.

SysOM · eBPF · flamegraph
17 min read
Deepin Linux
Apr 2, 2025 · Operations

Comprehensive Guide to bpftrace: Features, Architecture, Installation, and Practical Use Cases

This article introduces bpftrace, an eBPF‑based dynamic tracing tool for Linux, explains its core concepts, technical architecture, installation methods, basic syntax, and demonstrates real‑world performance analysis, fault diagnosis, and security monitoring scenarios while comparing it with DTrace, SystemTap, and BCC.

Debugging · Linux performance · System Tracing
24 min read
DataFunSummit
Mar 20, 2025 · Artificial Intelligence

Evolution of AI Training Stability and Baidu Baige’s Full-Stack Solutions for Large-Scale Model Training

The article traces the evolution of AI training stability from early manual operations on small GPU clusters to sophisticated, fault‑tolerant infrastructures for thousand‑card and ten‑thousand‑card models, detailing Baidu Baige’s metrics, monitoring, eBPF‑based diagnostics, and checkpoint strategies that reduce invalid training time and accelerate fault recovery.

Distributed Systems · Large-Scale Training · checkpointing
22 min read
Baidu Geek Talk
Mar 17, 2025 · Industry Insights

From Manual Restarts to Automated Fault Tolerance: The Evolution of AI Training Stability

This article traces the decade‑long evolution of AI training stability—from early small‑model manual operations to large‑scale, multi‑thousand‑GPU clusters—detailing metrics like invalid training time, fault‑tolerance architectures, eBPF‑based hidden‑fault detection, BCCL enhancements, multi‑level restart strategies, and trigger‑based checkpointing that together shrink downtime from minutes to seconds.

AI training · Distributed Systems · Infrastructure
22 min read
Baidu Intelligent Cloud Tech Hub
Mar 10, 2025 · Artificial Intelligence

How Baidu Baige Achieves Near‑Zero Downtime in Massive AI Model Training

The article examines how Baidu Baige evolved AI training stability from manual operations to precise engineering, detailing metrics, fault‑perception techniques, eBPF‑based diagnostics, multi‑level restart strategies, and trigger‑based checkpointing that together achieve sub‑minute recovery and 99.5% effective training time on massive GPU clusters.

AI training · Large-Scale Clusters · checkpointing
25 min read
Linux Kernel Journey
Feb 25, 2025 · Operations

How to Dynamically Trace Kernel Functions with eBPF Using Last Branch Record

Last Branch Record (LBR) is a CPU‑level feature that records branch jumps; the Linux kernel’s bpf_get_branch_snapshot helper (since 5.16) enables eBPF programs to capture LBR data, and the bpflbr tool demonstrates how to trace kernel functions and bpf program execution, disassemble code, and output call stacks.

Last Branch Record · Linux · bpf_get_branch_snapshot
9 min read
OPPO Kernel Craftsman
Feb 21, 2025 · Mobile Development

UprobeStats: Dynamic User‑Space Instrumentation on Android via eBPF uprobe

UprobeStats, introduced in Android 15, uses the Linux kernel eBPF uprobe mechanism to dynamically insert probes into user‑space methods, capture timestamps and arguments, load BPF programs, and forward the data to StatsD via configurable protobufs, enabling flexible, source‑free instrumentation with minimal overhead.

Android · BPF · Instrumentation
16 min read
Alibaba Cloud Observability
Feb 17, 2025 · Operations

What’s Driving Observability in 2025? AIOps, OpenTelemetry, and eBPF Trends

The article outlines 2025 observability trends, covering the rise of AIOps platforms, AI‑driven prediction, OpenTelemetry becoming the de‑facto standard, unified telemetry platforms, the shift of observability left and right, eBPF’s role in platform engineering, and cost‑effective strategies for modern cloud‑native environments.

Observability · OpenTelemetry · aiops
10 min read
Infra Learning Club
Feb 16, 2025 · Operations

GPUprobe: Using eBPF to Monitor CUDA Memory Leaks

The article introduces GPUprobe, an eBPF‑based tool that provides lightweight, continuous, application‑level monitoring of CUDA memory allocation, leaks, and kernel launches, compares it with Nsight Systems and DCGM, and demonstrates near‑zero overhead integration with Prometheus and Grafana through detailed code examples and real‑world output analysis.

GPU monitoring · Grafana · Observability
13 min read
Alibaba Cloud Developer
Feb 13, 2025 · Operations

What Will Observability Look Like in 2025? Key Trends and Technologies

This article compiles predictions from multiple sources to outline ten common observability trends for 2025, covering AIOps platform evolution, AI‑driven prediction, OpenTelemetry adoption, unified monitoring, edge observability, shift‑left development, eBPF integration, log‑centric analytics, cost‑saving strategies, and proactive reliability.

2025 trends · OpenTelemetry · aiops
12 min read
Linux Kernel Journey
Feb 12, 2025 · Cloud Native

Dynamic Filtering of Function Parameters with eBPF

The article explains how to add runtime‑configurable filtering of kernel function arguments in eBPF programs by parsing a C‑style expression, validating its AST, converting it to BPF instructions using BTF metadata, and injecting the generated code into the probe, with a complete example for skb filtering.

BPF · BTF · Go
15 min read
Architect
Feb 4, 2025 · Databases

How to Detect Redis Big Keys in Real Time with Zero Code Changes

This article presents a lightweight, non‑intrusive eBPF‑based method for instantly identifying Redis big‑key operations, explains the underlying kernel and user‑space implementation, provides complete code samples, and evaluates performance before and after optimization.

Go · Performance Monitoring · big key detection
21 min read
21CTO
Jan 30, 2025 · Cloud Native

How ByteDance Uses eBPF netkit to Replace veth for Faster Container Networking

ByteDance engineers are adopting the Linux kernel's new netkit feature, an eBPF‑based container network device that bypasses veth's L2 bottlenecks, delivering up to 10% performance gains and lower CPU usage while maintaining compatibility with existing workloads.

Cloud Native · Veth · container networking
7 min read
Linux Kernel Journey
Jan 13, 2025 · Operations

Why the sched_ext BPF Scheduler Is Booming in 2024

The article explains how eBPF‑based sched_ext enables painless design, implementation and deployment of new Linux schedulers, offering faster iteration, better observability, lower entry barriers, and showcases simple FIFO examples, advanced LAVD and rustland schedulers, their adoption in major distros, and performance gains for gaming workloads.

BPF · LAVD · Linux scheduler
7 min read
Linux Kernel Journey
Jan 1, 2025 · Backend Development

eBPF Tailcall: 6 Common Pitfalls and How to Detect Them

The article outlines six distinct kernel‑level bugs affecting the eBPF tailcall feature across multiple Linux versions, explains the underlying causes and the commits that fixed them, and introduces a detection tool to verify whether a running kernel is affected.

BPF · Linux kernel · bpf2bpf
7 min read
Alibaba Cloud Observability
Dec 24, 2024 · Operations

How to Achieve Full Observability for Go Apps Without Intrusive Agents

This article compares three Go observability solutions—SDK instrumentation, eBPF‑based monitoring, and compile‑time code injection—explaining their mechanisms, open‑source implementations, trade‑offs, and why Alibaba Cloud's Instgo compile‑time approach offers a low‑overhead, non‑intrusive APM alternative.

Cloud Native · Go · Instrumentation
11 min read
Linux Kernel Journey
Dec 24, 2024 · Operations

How to Use tcpw with eBPF to Capture Curl’s Five‑Tuple Information

This article introduces tcpw, a small eBPF‑based utility that traces TCP, UDP, and Unix‑domain sockets to display the five‑tuple of commands like curl or telnet, explains its command‑line options, shows concrete usage examples, and details the underlying BPF and Go implementation, including connect, accept, and fork tracing.

Go · Linux tracing · accept
8 min read
Linux Kernel Journey
Dec 21, 2024 · Fundamentals

Identify the Most Time‑Consuming Process Functions with eBPF

This tutorial shows how to use an eBPF program with the PERF_EVENT type to trace kernel activity, collect samples via performance counters, and pinpoint which processes and functions consume the most execution time, covering dynamic tracing concepts and overflow handling.

Linux profiling · Performance Monitoring · dynamic tracing
3 min read
Linux Kernel Journey
Dec 18, 2024 · Operations

Tracing Linux Soft Interrupts with eBPF: Measuring Processing Time

This article demonstrates how to write an eBPF program that attaches to Linux soft‑interrupt entry and exit points, records timestamps in eBPF maps, computes handling duration, updates counters and histograms, and exposes the data to user space for performance analysis.

Linux · Performance Monitoring · eBPF
5 min read
Linux Kernel Journey
Dec 16, 2024 · Fundamentals

eBPF Talk: Manually Performing Backtrace in arm64 fentry

The article explains why backtracing with eBPF fentry on arm64 is harder than on x86, details the stack layout differences, shows how recent commits changed register saving, and provides a practical detection routine to locate the frame pointer and retrieve the tracee's instruction pointer.

ARM64 · BPF · backtrace
5 min read
Linux Code Review Hub
Dec 4, 2024 · Cloud Native

How Dahua and openEuler’s Kmesh‑bwm Cut Latency 50% and Double Container Density

Facing bandwidth contention when high‑volume video analytics compete with online services, Dahua partnered with the openEuler community to replace the tc htb limiter with an eBPF‑based Kmesh‑bwm solution that introduces lock‑free packet scheduling, directional monitoring and multi‑priority bandwidth guarantees, achieving over 50% latency reduction, more than 50% increase in container deployment density, and roughly 30% overall resource savings.

Kmesh · QoS · Video Streaming
6 min read
Linux Kernel Journey
Nov 20, 2024 · Operations

eBPF Talk: Who Modified My BPF Map?

This article demonstrates how to use eBPF together with BTF to trace BPF map update and delete functions, shows concrete command‑line output, explains the code that identifies target kernel functions, and details the data‑dumping logic for debugging map contents.

BPF map · BTF · Debugging
8 min read
Linux Kernel Journey
Nov 14, 2024 · Artificial Intelligence

Deep Dive: How DeepFlow Collects Business Metrics for Large‑Model Services

This article explains how China Mobile built a hybrid‑cloud production environment for its customer‑service LLM, using eBPF and WebAssembly plugins from DeepFlow to achieve zero‑intrusion observability and automatically capture full‑stack topology, application and network metrics, and key LLM business indicators such as TTFT, TPOT, and token throughput.

DeepFlow · Grafana · LLM
0 likes · 19 min read
Linux Kernel Journey
Nov 12, 2024 · Operations

eBPF Talk: Fixing a 7‑Year‑Old Bug in bpftool

The article details how a long‑standing bug that displayed incorrect call‑address information in bpftool’s JIT disassembly was reproduced, analyzed, and fixed by correcting the PC parameter to use the function’s kernel symbol address, with patches applied to both LLVM and libbfd back‑ends.

LLVM · bpftool · disassembly
0 likes · 9 min read
Linux Kernel Journey
Nov 7, 2024 · Information Security

Using eBPF to Protect, Detect, and Audit Malicious eBPF Programs

The article analyzes how attackers can abuse eBPF to steal data, elevate privileges, execute commands, and hide processes, then presents concrete eBPF code for such attacks and outlines practical protection, detection, and auditing techniques—including file analysis, bpftool usage, and kernel tracing—to mitigate these threats.

Kernel Security · bpftool · eBPF
0 likes · 27 min read
Linux Kernel Journey
Nov 5, 2024 · Artificial Intelligence

Understanding AI Flame Graphs: Insights from Brendan Gregg

The article introduces Intel's AI Flame Graph, a low‑overhead profiling tool that visualizes AI accelerator and GPU workloads across the full software stack, explains its design, demonstrates SYCL matrix‑multiply benchmarks, discusses challenges of AI instruction analysis, and outlines future adoption and impact.

AI profiling · GPU · Intel
0 likes · 16 min read
Linux Code Review Hub
Nov 2, 2024 · Artificial Intelligence

Inside Intel’s AI Flame Graph: Low‑Overhead Profiling for Faster, Greener AI

The article introduces Intel’s AI Flame Graph, a low‑overhead profiling tool that visualizes AI accelerator and GPU execution alongside the full software stack, explains its design, shows SYCL matrix‑multiply examples, discusses challenges of AI workload analysis, and outlines future adoption and impact on performance and energy savings.

AI profiling · GPU · Intel
0 likes · 16 min read
Linux Kernel Journey
Oct 31, 2024 · Information Security

A New Perspective on eBPF Security: Auditing Complex Attack Techniques

This article demonstrates how to use eBPF to audit fileless command‑execution attacks and reverse‑shell techniques by tracing memfd_create, Kprobe/LSM hooks, dup2 redirections, and related kernel functions, providing concrete code examples and analysis of the detection logic.

Kprobe · LSM · Linux security
0 likes · 18 min read
Linux Kernel Journey
Oct 29, 2024 · Fundamentals

Exploring KPROBE_OVERRIDE for Kernel Error Injection

This article examines how the KPROBE_OVERRIDE feature, combined with eBPF, enables precise kernel‑level error injection, discusses its configuration requirements, demonstrates a practical example on a Mellanox NIC driver, and evaluates the associated security and performance implications.

ALLOW_ERROR_INJECTION · KPROBE_OVERRIDE · Linux
0 likes · 15 min read
Linux Code Review Hub
Oct 29, 2024 · Information Security

How to Audit and Intercept File Read/Write Operations Using eBPF

This guide explains how to leverage eBPF’s Kprobe, Tracepoint, and LSM features to audit file read/write activity, extract process and file details, and optionally block operations using helpers like bpf_send_signal or bpf_override_return, with complete code examples and configuration steps.

File Auditing · Kprobe · LSM
0 likes · 17 min read
360 Zhihui Cloud Developer
Oct 28, 2024 · Operations

How Zero‑Intrusion eBPF Transforms TCP Network Monitoring and Troubleshooting

This article explains how zero‑intrusion eBPF technology enables detailed, non‑disruptive TCP network monitoring, covering data collection interfaces, aggregation methods, implementation steps, usage limitations, and practical installation and visualization guidance for improving network performance and fault analysis.

Linux kernel · Network Monitoring · Observability
0 likes · 9 min read
Linux Kernel Journey
Oct 25, 2024 · Operations

Tracing Linux Process Capability Changes with eBPF

The article explains how to use eBPF tracepoints to monitor and record changes in Linux process capabilities, detailing the kernel data structures, BPF program logic, and user‑space handling needed to debug real‑world capability issues such as tcpdump failures and systemd service launches.

BPF maps · Linux capabilities · eBPF
0 likes · 14 min read
Refining Core Development Skills
Oct 22, 2024 · Operations

netcap: An eBPF‑Based Next‑Generation Kernel Network Capture Tool

netcap is an open‑source eBPF‑driven kernel network packet capture tool that extends tcpdump syntax to trace skb‑related functions across the Linux network stack, offering detailed packet tracing, customizable filters, multi‑trace aggregation, and user‑defined output to improve debugging of packet loss and performance issues.

Packet Capture · eBPF · netcap
0 likes · 9 min read
21CTO
Oct 15, 2024 · Fundamentals

Can eBPF Run on Windows? Exploring Cross‑Platform Kernel Programmability

At the recent virtual eBPF summit, Isovalent CTO Thomas Graf revealed that Microsoft is developing a Windows version of eBPF, aiming for cross‑platform compatibility with Linux, while the IETF works on standardizing the eBPF ISA and verifier to ensure secure, portable kernel bytecode execution.

Cross‑Platform · Kernel · Linux
0 likes · 5 min read
Linux Kernel Journey
Oct 9, 2024 · Fundamentals

Understanding Linux Process Creation and Termination (Part 1)

This article walks through the Linux kernel mechanisms for creating and destroying processes, covering copy‑on‑write, the fork/vfork/clone system calls, the kernel_clone implementation in kernels 5.0 and 6.5, the copy_process workflow, and the steps the kernel takes to wake up a new task and clean up a terminated one.

Kernel · Linux · clone
0 likes · 26 min read
Rare Earth Juejin Tech Community
Oct 9, 2024 · Operations

Introducing Kyanos: A Lightweight eBPF‑Based Tool for Fast Network Issue Diagnosis

Kyanos is an open‑source command‑line utility that leverages eBPF to provide low‑overhead, kernel‑compatible network tracing and performance analysis for HTTP, MySQL, and Redis traffic, offering simple watch and stat commands that replace slow tcpdump workflows with seconds‑level diagnostics.

Observability · Performance debugging · command-line tool
0 likes · 11 min read
Linux Kernel Journey
Oct 7, 2024 · Operations

retsnoop: Kernel Error Debugging Tool that Traces All Functions and Shows Stack on Failure

retsnoop is an eBPF‑based tracing utility that uses wildcard patterns to hook kernel functions, automatically captures full stack traces whenever a function returns an error, and offers three complementary modes—stack trace, function‑call trace, and LBR—to quickly pinpoint the source of kernel failures, with practical examples and source‑code insights.

Linux · eBPF · kernel debugging
0 likes · 9 min read
Linux Kernel Journey
Oct 2, 2024 · Operations

eBPF Tutorial 36: Tracing Nginx Requests with bpftrace

This tutorial shows how to use eBPF, bpftrace, and the funclatency tool to instrument key Nginx functions, measure their execution latency, analyze the distribution of request processing times, and identify performance bottlenecks for optimization.

Linux tracing · NGINX · Performance Monitoring
0 likes · 9 min read
Linux Kernel Journey
Sep 30, 2024 · Cloud Native

How to Eliminate the Per‑CPU Map in XDP TCP‑Option Parsing

This article walks through removing the per‑CPU map used to pass the TCP‑option offset in an XDP program, shows the required code changes, explains the verifier errors that arise, and presents the final fix using an int offset with a bitwise mask.

BPF verifier · Linux kernel · Networking
0 likes · 8 min read
Alibaba Cloud Infrastructure
Sep 29, 2024 · Cloud Native

Building a Production‑Grade Observability System for Alibaba Cloud ACK Container Service

The presentation outlines the observability framework of Alibaba Cloud's ACK container service. It covers the architecture and key capabilities such as eBPF‑based tracing, GPU profiling, network diagnostics, storage monitoring, and FinOps integration, and demonstrates how these features support AI workloads, large‑scale production stability, and automated incident response.

AI · Cloud Native · Container Service
0 likes · 15 min read
Linux Kernel Journey
Sep 27, 2024 · Fundamentals

Understanding eBPF Ringbuf: Design, API, and Comparison

The article explains the motivation, design, and API of the new multi‑producer single‑consumer eBPF Ring Buffer, compares it with perf buffers and other alternatives, and provides complete BPF and userspace code examples demonstrating reservation, commit, and polling of events while preserving ordering across CPUs.

BPF_MAP_TYPE_RINGBUF · Ring Buffer · eBPF
0 likes · 8 min read
BirdNest Tech Talk
Sep 26, 2024 · Operations

How to Trace Linux Packet Drops with eBPF and kfree_skb_reason

This article explains why packets are dropped in Linux, introduces the kfree_skb_reason API added in kernel 5.17, and shows step‑by‑step how to use bpftrace to capture drop reasons, five‑tuple details, and stack traces for precise network debugging.

Linux kernel · bpftrace · eBPF
0 likes · 9 min read