Tagged articles
134 articles
Linux Kernel Journey
May 6, 2026 · Operations

How eBPF and AI Redefine Mobile Microarchitectural Energy‑Efficiency Analysis

By combining low‑overhead eBPF data collection with AI‑driven diagnosis and an agent‑based execution layer, the authors present a three‑tier system that shifts mobile optimization from peak performance to sustained energy efficiency, achieving sub‑1% monitoring overhead and up to 20% power savings in real‑world video workloads.

AI · Agent Architecture · eBPF
0 likes · 12 min read
Coder Trainee
Apr 28, 2026 · Backend Development

Spring Cloud Microservices Series #7: Implementing Distributed Tracing with SkyWalking

This article explains why distributed tracing is essential for Spring Cloud microservices, introduces SkyWalking’s core concepts, compares it with other tracing tools, shows how to deploy SkyWalking via Docker Compose, integrate the Java agent, and use the UI to analyze performance, errors, and alerts.

Alerting · Distributed Tracing · Docker Compose
0 likes · 15 min read
AI Era Action Guide
Apr 21, 2026 · Industry Insights

How to Use IBM Process Mining to Uncover Complex Multi‑Agent Collaboration Workflows

The article explains how multi‑agent AI systems create hidden bottlenecks and abnormal paths in customer‑service workflows, demonstrates how IBM Process Mining automatically discovers end‑to‑end processes, quantifies performance, and identifies variants and root causes, and provides concrete optimization steps that deliver measurable business value.

AI workflow · IBM · business optimization
0 likes · 21 min read
Java Tech Enthusiast
Apr 18, 2026 · Artificial Intelligence

Why Claude Code Is Failing Complex Engineering Tasks: AMD’s Deep Dive Reveals Four Critical Flaws

An AMD AI director’s GitHub issue sparked a data‑driven investigation of Claude Code that found four major shortcomings, a 67% drop in thinking depth, and a surge in API usage costs; the article closes with concrete recommendations for restoring trust in its ability to handle complex engineering workloads.

AI coding assistant · AMD AI director · Claude Code
0 likes · 12 min read
Architects' Tech Alliance
Apr 15, 2026 · Industry Insights

How DeepSeek V4 Uses Huawei Ascend 950PR to Outperform Nvidia H20 by 2.9×

The article analyzes DeepSeek V4's migration to Huawei's Ascend 950PR chip and CANN framework, detailing three hardware‑level innovations, the CUDA‑to‑CANN transition, and the resulting 35× inference speed boost, 2.87× performance over Nvidia H20, and dramatic cost reductions for trillion‑parameter models.

AI hardware · CANN framework · DeepSeek
0 likes · 10 min read
AI Large-Model Wave and Transformation Guide
Apr 7, 2026 · Artificial Intelligence

Why Claude Code Is Getting Dumber: Data‑Driven Dive into AI Programming Decline

An in‑depth analysis of 6,852 Claude Code sessions reveals a 67‑75% drop in reasoning depth, concrete lazy‑output patterns, and systemic cost‑driven optimizations that degrade model performance, while offering practical mitigation strategies for developers facing similar AI tool regressions.

AI model degradation · Claude · Prompt Engineering
0 likes · 7 min read
Linux Kernel Journey
Dec 21, 2025 · Artificial Intelligence

How to Trace Intel NPU Kernel Driver Operations Using eBPF and bpftrace

This tutorial explains how to use eBPF and bpftrace to monitor the Intel NPU kernel driver on Lunar Lake and Meteor Lake CPUs, mapping Level Zero API calls to kernel ioctls, tracking memory allocation, IPC communication, and identifying performance bottlenecks through detailed function‑call statistics.

Intel NPU · Level Zero API · bpftrace
0 likes · 17 min read
Linux Kernel Journey
Oct 21, 2025 · Industry Insights

Bridging the GPU Observability Gap: Why eBPF on GPUs Matters

The article explains how bpftime extends eBPF to NVIDIA and AMD GPUs, exposing fine‑grained execution details that traditional CPU‑side tools miss, and demonstrates a unified, programmable observability stack that overcomes the limitations of existing GPU profilers in both synchronous and asynchronous workloads.

CUDA · GPU · Observability
0 likes · 23 min read
Java Tech Enthusiast
Oct 5, 2025 · Backend Development

Why Go Needs Goroutine IDs: A Proposal to Enhance Runtime Profiling

This article explains a Go proposal to add unique Goroutine identifiers and start program counters to the runtime profiling API, detailing the background problem, example code, the suggested API changes, community discussion, and the practical impact on performance analysis.

Goroutine · Runtime · goid
0 likes · 9 min read
Baobao Algorithm Notes
Sep 28, 2025 · Artificial Intelligence

How Much GPU Memory Do LLMs Really Need? A Deep Dive into Training & Inference

This article breaks down the GPU memory requirements of large language models during training and inference, detailing the contributions of model weights, optimizer states, activations, KV cache, and activation recomputation, and provides concrete formulas, examples, and scaling insights for models like Qwen3 and DeepSeek V3.

GPU Memory · KV cache · LLM
0 likes · 18 min read
Deepin Linux
Sep 9, 2025 · Fundamentals

Master ARM32/64 Architecture: From Instruction Sets to Performance Analysis

This intensive two‑day course covers ARM32/64 processor instruction sets, mode switching, exception vectors, system call mechanisms, memory management, atomic operations, cache synchronization, and top‑down performance analysis with perf, while also introducing M‑series MCU architectures and providing hands‑on labs for embedded Linux developers.

ARM · Linux · embedded systems
0 likes · 7 min read
Alibaba Cloud Big Data AI Platform
Aug 13, 2025 · Operations

Unlock Faster System Performance Analysis with Alibaba’s Open‑Source ssar Tool

This article introduces the open‑source ssar system‑performance monitoring tool, explains its architecture, compares it with traditional sar utilities, demonstrates fast‑iteration development, showcases load5s metrics and detailed command usage, and provides configuration guidance for precise Linux performance diagnostics.

Linux · load metrics · performance analysis
0 likes · 27 min read
Linux Kernel Journey
Jul 21, 2025 · Fundamentals

Mastering CUDA GPU Performance Analysis and Tracing

This guide walks you through a complete workflow for profiling CUDA applications, covering GPU performance fundamentals, key metrics, NVIDIA Nsight tools, CUPTI programming, example code, common bottlenecks, and best‑practice recommendations to identify and eliminate performance limits.

CUDA · CUPTI · GPU profiling
0 likes · 13 min read
Su San Talks Tech
Jul 18, 2025 · Backend Development

Mastering Production Debugging: How Arthas Instantly Pinpoints Java Issues

This article explains why traditional monitoring tools often fail in production, introduces Arthas as a lightweight, non‑intrusive Java diagnostic solution, and walks through five real‑world scenarios—slow interfaces, thread blockage, memory leaks, hot‑fixes, and data inconsistency—showing exact commands, code snippets, and visualizations to quickly locate and resolve root causes.

Arthas · Java debugging · Production troubleshooting
0 likes · 11 min read
Linux Kernel Journey
Jun 9, 2025 · Fundamentals

How to Trace CUDA GPU Operations with eBPF

This tutorial explains how to build an eBPF‑based tracing tool that intercepts CUDA runtime API calls via uprobes, captures detailed event data such as memory sizes, transfer directions, kernel launches and errors, and presents it in a readable format for debugging and performance analysis.

Benchmark · CUDA · GPU tracing
0 likes · 17 min read
Open Source Linux
May 9, 2025 · Artificial Intelligence

Inside Huawei’s Ascend 910C AI Chip: Architecture, Performance Gaps & Strategy

This article translates and expands on analyst Lennart Heim’s X‑platform report, dissecting Huawei’s newly mass‑produced Ascend 910C AI accelerator, its dual‑chip packaging, performance estimates versus NVIDIA’s H100 and upcoming B200, supply‑chain origins, potential domestic production, and the broader strategic impact on China’s AI competitiveness.

AI Chip · AI strategy · Ascend 910C
0 likes · 18 min read
Open Source Linux
Apr 25, 2025 · Artificial Intelligence

What Lies Behind Huawei’s Ascend 910C AI Chip? Performance, Supply Chain, and Strategic Impact

This article translates and analyzes Lennart Heim’s deep dive into Huawei’s Ascend 910C AI accelerator, covering its dual‑chip architecture, packaging trade‑offs, performance versus NVIDIA’s H100 and upcoming B200, mysterious supply‑chain origins, and the broader strategic implications for China’s AI competition.

AI Chip · AI competition · Ascend 910C
0 likes · 17 min read
Architects' Tech Alliance
Apr 17, 2025 · Artificial Intelligence

Can Huawei’s Ascend 910C Challenge Nvidia’s H100? A Deep Dive into Architecture, Performance, and Strategy

This article dissects Huawei's Ascend 910C AI accelerator, examining its dual‑chip architecture, cost‑focused packaging, performance metrics that reach roughly 80% of Nvidia's H100, speculative supply‑chain origins, and the broader strategic implications for China's position in the global AI chip race.

AI accelerator · Ascend 910C · Huawei
0 likes · 19 min read
Linux Kernel Journey
Apr 15, 2025 · Operations

Efficiently Resolving Performance Bottlenecks and Jitter with Process Hotspot Tracing in Alibaba Cloud OS Console

The article explains how Alibaba Cloud's SysOM console uses low‑overhead process hotspot tracing, stack unwinding, symbol resolution, eBPF and AI diagnostics to pinpoint CPU, memory, lock and network issues, offering visual flame‑graph analysis and real‑world case studies for faster root‑cause identification.

AI diagnostics · Cloud Native · SysOM
0 likes · 15 min read
Alibaba Cloud Infrastructure
Apr 14, 2025 · Operations

Process Hotspot Tracing and Performance Analysis with SysOM

This article explains the concept of process hotspot tracing, analyzes common performance pain points in cloud‑native environments, and details SysOM's solution (stack unwinding, symbol resolution, flame‑graph generation, and real‑world case studies) to help developers and operators quickly locate and resolve system bottlenecks.

SysOM · eBPF · flamegraph
0 likes · 17 min read
Deepin Linux
Mar 31, 2025 · Fundamentals

Understanding and Using Ftrace for Linux Kernel Tracing

This article provides a comprehensive guide to Linux's ftrace tool, explaining its purpose, various tracers, how to set up and use it via debugfs, detailed command examples, implementation details, practical use cases for performance tuning and debugging, and a comparison with other tracing utilities.

Debugging · System Tracing · ftrace
0 likes · 40 min read
BirdNest Tech Talk
Mar 7, 2025 · Operations

Mastering ltrace: How to Trace Library Calls on Linux for Debugging

This guide explains what ltrace is, how it works, how to install it, basic and advanced usage examples, real‑world scenarios, output interpretation, performance impact, troubleshooting, and comparisons with other debugging tools, providing a complete tutorial for Linux developers.

Debugging · Linux · library tracing
0 likes · 11 min read
BirdNest Tech Talk
Mar 7, 2025 · Operations

Master Linux Debugging: How to Use Strace for Deep System Call Insight

This guide explains what strace is, how it leverages ptrace to intercept system calls, shows installation steps, demonstrates basic and advanced usage patterns, discusses real‑world scenarios, interprets output formats, highlights performance trade‑offs, and compares alternative tracing tools.

linux debugging · performance analysis · ptrace
0 likes · 9 min read
Linux Kernel Journey
Dec 2, 2024 · Operations

Getting Started with rtla timerlat: Cross‑Compilation and Usage Guide

This article introduces rtla timerlat, a Linux scheduling‑latency analysis tool that breaks latency into detailed phases, explains how to cross‑compile its libtracefs and libtraceevent dependencies, compile the tool, enable the kernel timerlat tracer, configure common parameters, and run a typical command with sample output.

Linux · cross-compilation · kernel tracing
0 likes · 11 min read
Architect
Nov 22, 2024 · Backend Development

Why Is Async Log4j2 Logging So Slow? A Deep Dive into Disruptor and JNI Overheads

The article investigates a severe performance bottleneck in a Java service caused by massive async Log4j2 logging, analyzes the Disruptor‑based async logger, explores JNI stack‑trace overhead, reproduces the issue with benchmarks, and provides practical recommendations to eliminate the slowdown.

Disruptor · JNI · Java
0 likes · 18 min read
Linux Kernel Journey
Nov 5, 2024 · Artificial Intelligence

Understanding AI Flame Graphs: Insights from Brendan Gregg

The article introduces Intel's AI Flame Graph, a low‑overhead profiling tool that visualizes AI accelerator and GPU workloads across the full software stack, explains its design, demonstrates SYCL matrix‑multiply benchmarks, discusses challenges of AI instruction analysis, and outlines future adoption and impact.

AI profiling · GPU · Intel
0 likes · 16 min read
Linux Code Review Hub
Nov 2, 2024 · Artificial Intelligence

Inside Intel’s AI Flame Graph: Low‑Overhead Profiling for Faster, Greener AI

The article introduces Intel’s AI Flame Graph, a low‑overhead profiling tool that visualizes AI accelerator and GPU execution alongside the full software stack, explains its design, shows SYCL matrix‑multiply examples, discusses challenges of AI workload analysis, and outlines future adoption and impact on performance and energy savings.

AI profiling · GPU · Intel
0 likes · 16 min read
Tencent Cloud Developer
Aug 1, 2024 · Backend Development

Linux Performance Analysis Tools and Troubleshooting Methods for Backend Development

The article presents a concise mind‑map of essential Linux performance tools and a flexible troubleshooting workflow, guiding backend developers through CPU, memory, disk, and network issues by using utilities such as top, oprofile, slabtop, iotop, netstat, and strace to quickly pinpoint and resolve bottlenecks.

Backend Development · CPU · Disk I/O
0 likes · 11 min read
Baidu Geek Talk
Jul 31, 2024 · Artificial Intelligence

Quantitative Analysis of Transformer Architecture and Llama Model Performance

This engineering‑focused document reviews transformer fundamentals, derives precise FLOP and memory formulas for attention and feed‑forward layers, defines the MFU performance metric, analyzes memory components and parallelism strategies, examines recent architecture variants such as MQA, GQA, sliding‑window attention and MoE, and provides practice problems applying these calculations.

AI · GPU computing · Transformer
0 likes · 30 min read
DaTaobao Tech
Jul 15, 2024 · Mobile Development

First‑Frame Optimization for Mobile Apps: Principles, Metrics, and Strategies

Optimizing a mobile app’s first frame—by defining scope, measuring latency, using profiling tools, and applying strategies such as pre‑loading, lazy initialization, parallel processing, and skeleton screens—boosts brand perception, conversion rates, and resource efficiency, while requiring continuous monitoring, A/B testing, and anti‑degradation safeguards.

Android · first frame optimization · mobile performance
0 likes · 15 min read
Kujiale Project Management
Jun 13, 2024 · R&D Management

Balancing Agile Metrics: How to Prevent Single-Number Pitfalls

This article explores why agile teams must interpret measurement data holistically, showing how focusing on a single metric can create trade‑offs, and offers a systematic approach to analyzing productivity, stability, and quality indicators for continuous improvement.

Software Management · agile metrics · performance analysis
0 likes · 8 min read
Liangxu Linux
May 29, 2024 · Operations

Master Linux System Monitoring: Top, Free, Vmstat, Iostat, Mpstat, Sar, Netstat, Uptime, Ps, Watch, Strace & Lsof

This comprehensive guide explains how to use essential Linux monitoring commands—including top, free, vmstat, iostat, mpstat, sar, netstat, uptime, ps, watch, strace, and lsof—detailing their purpose, key options, output fields, and practical examples to help you diagnose system performance and resource usage.

Linux · command-line tools · performance analysis
0 likes · 39 min read
Test Development Learning Exchange
May 20, 2024 · Backend Development

Using HiPlot for Visualizing API Test Results

This article demonstrates how to employ HiPlot in API automated testing to efficiently visualize and analyze large sets of test data, covering single-run results, version comparisons, parameter impact studies, long‑running test sequences, and multi‑environment performance evaluations.

API testing · HiPlot · Python
0 likes · 5 min read
Test Development Learning Exchange
Mar 30, 2024 · Operations

Monitoring macOS and Windows System Resources with Python

This guide explains why and how to monitor CPU, memory, and disk I/O on macOS or Windows using Python's psutil, matplotlib, and numpy libraries, covering performance analysis, troubleshooting, capacity planning, automated alerts, and includes a complete example script that visualizes resource usage over time.

Python · performance analysis · psutil
0 likes · 6 min read
Tencent Cloud Developer
Jan 9, 2024 · Operations

Tencent Cloud APM Full-Link Tracing Implementation and Best Practices

The article explains how Tencent Cloud APM implements full‑link tracing using OpenTelemetry standards, addresses challenges such as protocol compatibility, massive trace storage, and bytecode overhead with solutions like conversion gateways, tail sampling and thread profiling, and showcases best‑practice scenarios for topology analysis, front‑end/back‑end integration, and log‑trace correlation within the broader TCOP observability suite.

APM · Full‑Link Tracing · Observability
0 likes · 11 min read
Efficient Ops
Dec 12, 2023 · Databases

How to Diagnose MySQL Lock Waits and Transaction Timeouts with MyAWR

This article explains how commercial banks can analyze MySQL lock‑wait events and transaction timeouts using MDL metadata lock monitoring, innodb_lock_wait_timeout, custom MyAWR collection, SQL de‑parameterization, and post‑mortem queries to pinpoint blocking SQL and its source.

Lock Wait · SQL Optimization · myawr
0 likes · 19 min read
Coolpad Technology Team
Nov 10, 2023 · Mobile Development

Adapting Mesa GPU Driver for OpenHarmony on the Spreadtrum T606 Platform

This article details the process of integrating the open‑source Mesa 3D graphics stack and the Panfrost GPU driver into OpenHarmony 3.2.2 on a Spreadtrum T606 (Mali‑G57) platform, covering hardware setup, compilation options, code modifications, debugging steps, performance analysis, and lessons learned.

GPU Driver · Mesa · OpenHarmony
0 likes · 15 min read
Beijing SF i-TECH City Technology Team
Sep 27, 2023 · Backend Development

Performance Analysis and Optimization Report for a Logistics Platform

This report analyzes performance issues in a logistics platform, identifying backend IC card interface timeouts causing database connection pool exhaustion and frontend problems from excessive DOM elements and network loading, then proposes optimizations including interface migration, CDN, compression, preloading, domain sharding, and virtual scrolling.

Frontend Optimization · IC card interface · backend optimization
0 likes · 9 min read
Baidu Tech Salon
Sep 20, 2023 · Artificial Intelligence

Live Session: Introduction to NVIDIA Nsight Systems and Compute for AI Performance Analysis

In a live session, NVIDIA senior deep‑learning solutions architect Zhai Jian demonstrates how to use Nsight Systems and Nsight Compute to analyze a simple neural‑network training workload, accelerate BERT with mixed precision, and examine matrix‑transpose kernels, with registration via QR code and a detailed event schedule.

AI tools · BERT · GPU performance
0 likes · 2 min read
Architect's Guide
Sep 17, 2023 · Operations

Full‑Link Monitoring in Microservice Architectures: Concepts, Requirements, Architecture and Comparison of Zipkin, SkyWalking and Pinpoint

This article explains the need for full‑link monitoring in microservice systems, outlines its goals and functional modules, describes the core data structures of tracing such as Span and Annotation, and provides a detailed comparison of three popular APM solutions—Zipkin, SkyWalking and Pinpoint—covering performance impact, scalability, data analysis, developer transparency, topology visualization and community support.

APM · Full‑Link Monitoring · Microservices
0 likes · 24 min read
FunTester
Aug 4, 2023 · Operations

How Tencent Scales Its Services for Chinese New Year: Inside Cloud Load‑Testing Strategies

This article details Tencent's cloud load‑testing approach for handling massive traffic spikes during Chinese New Year, covering background challenges, model selection, script authoring options, data construction, report analysis, and real‑world case studies that demonstrate capacity planning and performance optimization.

Load Testing · Microservices · RPS
0 likes · 21 min read
DataFunSummit
Jul 24, 2023 · Big Data

Design and Practice of OPPO Big Data Diagnostic Platform

This article presents the background, technical architecture, feature set, workflow, and practical results of OPPO's big data diagnostic platform, illustrating how intelligent, non‑intrusive task analysis improves efficiency, stability, and cost across massive offline and real‑time workloads.

Data Platform · OPPO · Task Optimization
0 likes · 10 min read
Programmer DD
Jul 24, 2023 · Backend Development

Boost Your Spring Boot Startup Speed with Spring Startup Analyzer

Spring Startup Analyzer is an open-source tool that captures Spring Boot startup data, generates interactive reports, and offers async bean initialization to speed up launches, with detailed statistics, bean timelines, method call metrics, unused JAR detection, and flame-graph visualizations, plus step-by-step usage instructions.

Backend Development · Java · Spring Boot
0 likes · 9 min read
Sohu Tech Products
Jun 7, 2023 · Mobile Development

Perfetto: Android Performance Tracing Tool – Concepts, Usage, and Analysis

This article introduces Perfetto, the Android platform‑level tracing tool, explains its architecture and data sources, details how to record and import trace files, demonstrates analysis features such as slice, counter and lock‑contention views, provides SQL query examples, and shares practical troubleshooting cases for mobile developers.

Android Tracing · Mobile Development · Perfetto
0 likes · 14 min read
Zhuanzhuan Tech
Jun 7, 2023 · Backend Development

Understanding Rule Engines and Their Application in Zhuanzhuan Wallet with EasyRules Performance Analysis

This article explains the concepts of rule engines versus imperative programming, demonstrates rule‑engine implementation with EasyRules through code examples, discusses its business value and application in Zhuanzhuan Wallet, and analyzes the performance of EasyRules in rule evaluation and execution.

Backend Development · Business Rules · Java
0 likes · 10 min read
JavaEdge
Apr 2, 2023 · Operations

How Big Tech Analyzes System Performance: The Proven RESAR 7‑Step Method

The article presents the RESAR seven‑step performance analysis method used by large tech companies, detailing how to build a performance‑analysis decision tree, collect and correlate system counters, and combine global and targeted monitoring to uncover bottleneck evidence chains with concrete Linux commands and diagrams.

CPU profiling · Linux monitoring · RESAR Method
0 likes · 17 min read
Architects' Tech Alliance
Feb 22, 2023 · Industry Insights

RDNA 2 vs Nvidia Ampere: Architecture, Cache, and Game Performance

This article provides an in‑depth technical analysis of AMD’s RDNA 2 GPU architecture, comparing its compute units, cache hierarchy, latency and bandwidth characteristics with Nvidia’s Ampere, and evaluates real‑world game performance in titles such as Cyberpunk 2077, Titanic Honor & Glory, and Gunner HEAT PC.

AMD · GPU architecture · RDNA 2
0 likes · 30 min read
Top Architect
Dec 26, 2022 · Operations

An Introduction to eBPF: Concepts, Use Cases, and Practical Examples

This article provides a comprehensive overview of eBPF, explaining its origins, core concepts, comparison with SystemTap and DTrace, common use cases such as network monitoring, security filtering, and performance analysis, and includes step‑by‑step Python examples with BCC for tracing and latency measurement.

BCC · Linux kernel · Network Monitoring
0 likes · 21 min read
Code Ape Tech Column
Nov 16, 2022 · Operations

Using VisualVM for JVM Monitoring and Memory Leak Analysis

This article introduces VisualVM, a Java profiling tool bundled with the JDK, explains how to install and use its plugins for monitoring CPU, memory, threads, and garbage collection, and demonstrates step‑by‑step memory‑leak detection and remote Tomcat monitoring with code examples.

JVM Monitoring · Java profiling · VisualVM
0 likes · 7 min read
DataFunSummit
Oct 25, 2022 · Databases

Design and Implementation of Meituan's Database Autonomy Service (DAS)

This article presents the background, challenges, architectural design, technical solutions, and future roadmap of Meituan's Database Autonomy Service (DAS), a platform that leverages big‑data collection, AI‑assisted root‑cause analysis, and automated operations to improve database performance, reliability, and self‑service capabilities.

AI · Big Data · Database Autonomy
0 likes · 18 min read
Alibaba Cloud Native
Sep 21, 2022 · Cloud Native

Why Continuous Profiling Is Essential for Cloud‑Native Java Applications

Continuous profiling (CP) bridges production and development by constantly feeding performance data back to developers, enabling on‑CPU and off‑CPU analysis, reducing overhead, and supporting tools like JFR and async‑profiler to diagnose CPU, memory, lock, and I/O bottlenecks in cloud‑native Java services.

JVM · Java · Profiling Tools
0 likes · 20 min read
Baidu Intelligent Testing
Aug 24, 2022 · Artificial Intelligence

Intelligent Test Analysis Practices: Contract Validation, Memory‑Leak Detection, Performance Diff, Test‑Case Completion, and Visual UI Recall

This article presents a comprehensive overview of intelligent test analysis techniques—including contract‑based validation point generation, time‑sliced C++ memory‑leak detection with DTW and CART, dynamic‑threshold performance diff, transformer‑based test‑case completion, and visual UI recall—demonstrating how data, algorithms, and engineering combine to improve testing accuracy and efficiency.

AI testing · contract testing · memory leak detection
0 likes · 11 min read
OPPO Kernel Craftsman
Aug 5, 2022 · Operations

Understanding Linux ftrace Function Graph Tracer on ARM64

The article details how Linux’s function‑graph ftrace tracer works on ARM64, explaining required kernel configs, how -pg inserts _mcount calls, the runtime patching of ftrace_graph_caller, register usage for argument passing, and return handling, and why shadow‑call‑stack must be disabled to enable precise call‑graph and timing analysis.

ARM64 · Linux · ftrace
0 likes · 10 min read
Architects' Tech Alliance
Jun 10, 2022 · Cloud Computing

In‑Depth Analysis of AWS Graviton 3: Architecture, Performance, and Comparison with x86 Competitors

The article provides a comprehensive technical review of AWS’s Graviton 3 ARM server CPU, detailing its SVE support, branch prediction, front‑end, renamer, execution units, cache hierarchy, and performance comparisons with Neoverse N1, Intel Ice Lake, and AMD Zen 3, while discussing cloud‑centric design trade‑offs.

CPU architecture · SVE · performance analysis
0 likes · 18 min read
Model Perspective
Jun 6, 2022 · Operations

How to Derive the Core Formulas of a Single-Server Queueing System

This article walks through the theoretical derivation of the classic M/M/1 queueing model, detailing arrival and service rates, state balance equations, performance metrics such as utilization, average number in system, average waiting time, and average residence time, with illustrative formulas and explanations.

M/M/1 · Operations Research · performance analysis
0 likes · 4 min read
Efficient Ops
Feb 22, 2022 · Operations

Mastering ssar: A Deep Dive into Alibaba’s Open‑Source System Performance Tool

ssar is Alibaba’s open‑source system performance monitoring tool that extends traditional sar capabilities with comprehensive machine‑level, process‑level, and load metrics, offering rapid development, flexible configuration, and advanced diagnostics such as load5s, thread analysis, and custom Python query extensions for detailed OS troubleshooting.

Linux · diagnostics · open source
0 likes · 29 min read
Tencent Cloud Developer
Feb 14, 2022 · Cloud Computing

Feedback‑Driven Compiler Optimizations for Cloud C/C++ Applications

The article shows how profiling‑driven compiler and OS techniques (sampling‑ and instrumentation‑based PGO, BOLT code layout, AutoFDO pipelines, basic‑block reordering, partial inlining, and branch and function reordering) can alleviate instruction‑cache and front‑end stalls in large C/C++ cloud workloads, delivering performance gains of up to 18%.

C++ · Compiler Optimization · Profile Guided Optimization
0 likes · 16 min read
vivo Internet Technology
Dec 22, 2021 · Frontend Development

In-Depth Analysis of Chrome DevTools Architecture, Protocols, and Performance Tools

The article offers a thorough technical examination of Chrome DevTools’s client‑server architecture, the JSON‑based Chrome DevTools Protocol, and performance tooling within Android Chromium 87, guiding front‑end developers and engineers through its history, core implementation, code examples, JavaScript evaluation, performance diagnostics, and broader ecosystem impact.

Android · CDP · Chrome DevTools
0 likes · 27 min read
IT Architects Alliance
Dec 12, 2021 · Operations

System Performance Issue Analysis and Optimization Process for Business Applications

The article outlines a comprehensive process for diagnosing and optimizing performance problems in production business systems, covering causes such as high concurrency, data growth, hardware constraints, and detailing analysis of hardware, OS, database, middleware, JVM settings, code inefficiencies, and the role of monitoring and APM tools.

Backend · Database Tuning · JVM
0 likes · 13 min read
OPPO Kernel Craftsman
Dec 10, 2021 · Fundamentals

The Life Cycle of Block IO: From Generation to Return

The article walks through the entire block I/O lifecycle in mobile devices—defining block devices, describing how user actions and system processes generate I/O, detailing scheduling, bio-to-request conversion, dispatch via SCSI/UFS, and the interrupt‑driven return path that wakes waiting processes, linking each stage to performance and power metrics.

SCSI subsystem · Scheduling Algorithms · UFS devices
0 likes · 9 min read
Code Ape Tech Column
Nov 15, 2021 · Operations

A Comprehensive Guide to Using Apache SkyWalking for Distributed Tracing, Logging, and Performance Analysis

This article introduces Apache SkyWalking as a powerful open‑source APM solution, compares it with Spring Cloud Sleuth + Zipkin, explains its architecture, walks through server and client setup, data persistence, log collection, performance profiling, and alert configuration, and provides practical code snippets and configuration examples.

Distributed Tracing · Java · Observability
0 likes · 14 min read
Code Ape Tech Column
Jun 29, 2021 · Industry Insights

Which Distributed Tracing Tool Wins? A Deep Dive into Dapper, Zipkin, Pinpoint, and SkyWalking

This article examines the challenges of monitoring complex micro‑service architectures, outlines the objectives of full‑link tracing, explains the Span/Trace data model, describes core functional modules, and provides a detailed performance and feature comparison of Google Dapper, Zipkin, Pinpoint, and SkyWalking.

APM · Distributed Tracing · Full‑Link Monitoring
0 likes · 22 min read
Top Architect
Mar 16, 2021 · Operations

Full-Link Monitoring: Concepts, Architecture, and Comparison of Zipkin, SkyWalking, and Pinpoint

This article explains the fundamentals of full‑link (distributed) monitoring, describes its core components such as spans, traces and annotations, outlines typical system architecture, and provides a detailed performance and feature comparison of three popular APM solutions—Zipkin, SkyWalking, and Pinpoint.

APM · Distributed Tracing · Full‑Link Monitoring
0 likes · 22 min read
Tencent Music Tech Team
Mar 4, 2021 · Mobile Development

Analysis of Android Virtual Memory and Address Space in QQ Music/Karaoke App

The article explains Android virtual memory concepts, address space limits for 32‑ and 64‑bit apps, layout of QQ Music/Karaoke process memory, tools for inspecting /proc/pid/smaps, and shows how memory growth (e.g., loading libYTCommon.so) can cause 32‑bit apps to hit the 4 GB limit and crash.

Android · Memory Management · Mobile Development
0 likes · 20 min read
Beike Product & Technology
Feb 23, 2021 · Mobile Development

Beike iOS App Startup Optimization: Measurement, Analysis, and Governance Practices

This article details Beike's iOS app startup optimization workflow, covering online startup speed measurement, offline performance analysis with a custom profiler, and governance practices such as +load method control, code stripping, and binary reordering to improve launch time and user experience.

BKTimeProfiler · Binary Reordering · Mobile Development
0 likes · 16 min read
MaGe Linux Operations
Feb 13, 2021 · Operations

Comparing Full‑Link Tracing Tools: Zipkin vs Pinpoint vs SkyWalking

This article examines the challenges of monitoring distributed micro‑service architectures, outlines the requirements for a full‑link tracing system, and provides a detailed comparison of three popular APM solutions—Zipkin, Pinpoint, and SkyWalking—covering performance impact, scalability, data analysis, developer transparency, and topology visualization.

APM · Distributed Tracing · Full‑Link Monitoring
0 likes · 28 min read
Efficient Ops
Jan 26, 2021 · Operations

How Full‑Link Tracing Tools Compare: Zipkin vs SkyWalking vs Pinpoint

This article examines the challenges of monitoring complex micro‑service architectures, outlines the goals and functional modules of full‑link tracing systems, explains Google Dapper’s core concepts such as Span, Trace and Annotation, and provides a detailed performance, scalability and feature comparison of three popular APM solutions—Zipkin, SkyWalking and Pinpoint.

APM · Distributed Tracing · Full‑Link Monitoring
0 likes · 25 min read
Amap Tech
Jan 15, 2021 · Mobile Development

MemTower: A Rust‑Based Native Memory Profiling Solution for Android

MemTower is a Rust‑rewritten native memory profiler for Android that supports versions from 4.x onward, uses an LD_PRELOAD custom allocator to avoid recursive malloc loops, provides fast ELF‑based stack unwinding, multi‑dimensional leak analysis and flame‑graph visualisation, and cuts leak‑investigation time from days to minutes.

Android · Rust · memory profiling
0 likes · 13 min read
Architects' Tech Alliance
Sep 28, 2020 · Fundamentals

In‑Depth Analysis of Loongson GS464E CPU Architecture and Performance

This article provides a comprehensive technical review of the Chinese Loongson GS464E processor, covering its micro‑architectural design choices, instruction‑fetch and out‑of‑order execution units, cache hierarchy, benchmark results, manufacturing details, and the challenges it faces in competing with mainstream Intel and AMD CPUs.

CPU architecture · GS464E · Loongson
0 likes · 21 min read
Aikesheng Open Source Community
Sep 22, 2020 · Databases

Using pt-query-digest to Analyze MySQL Slow Query Logs

This article introduces pt-query-digest from Percona Toolkit, explains how to install the toolkit, configure MySQL slow‑query logging, run pt-query-digest on the slow log, interpret its detailed output, and locate specific SQL statements for performance tuning.

Slow Query Log · mysql · percona-toolkit
0 likes · 15 min read
Liangxu Linux
Aug 12, 2020 · Fundamentals

Why Microkernels Can Beat Monolithic Kernels: A Deep Dive with C Simulations

The article examines the performance drawbacks of traditional monolithic kernels, especially IPC overhead, and argues that microkernel designs using arbitration can reduce lock contention, supported by C code simulations and benchmark graphs that compare execution time, CPU utilization, and scalability across thread and CPU counts.

IPC · Monolithic Kernel · Operating System
0 likes · 19 min read
Efficient Ops
Jul 7, 2020 · Operations

Mastering Linux Performance: From CPU to Flame Graphs

This article presents a comprehensive guide to Linux performance analysis, covering background, methodology, tools, and step‑by‑step case studies for CPU, memory, disk I/O, network, system load, and flame‑graph techniques to quickly locate and resolve bottlenecks.

CPU profiling · Linux · flame graph
0 likes · 19 min read