Tagged articles

431 articles

Page 1 of 5

May 8, 2026 · Operations

Deep Dive into Server Performance: Analyzing CPU, Memory, Disk, and Network Bottlenecks

This article explains how to identify and troubleshoot the four main resource bottlenecks—CPU, memory, disk I/O, and network—by detailing Linux internals, key metrics, practical command examples, real‑world case studies, and a step‑by‑step decision tree for accurate diagnosis and tuning.

CPU · Disk I/O · Linux

0 likes · 46 min read


Su San Talks Tech

May 5, 2026 · Databases

Alibaba Interview: Key Considerations for Indexing Tens‑Millions‑Row Tables

The article explains how to safely add indexes to a tens‑of‑millions‑row MySQL table, covering lock duration, disk‑space impact, write‑performance degradation, and six practical principles—selective columns, proper order, covering indexes, avoiding redundancy, short‑lock tools, and ongoing monitoring—plus pros, cons, and suitable use cases.

Large Tables · Online DDL · Performance Monitoring

0 likes · 10 min read


Java Architecture Diary

Apr 15, 2026 · Operations

Unlock JVM Mysteries: How Arthas and AI Turn Debugging into a One‑Click Process

When a service’s P99 latency spikes to seconds and CPU hits 90% without logs, Arthas lets you inspect the JVM in real time without code changes, and its AI‑driven MCP extension automates command selection, enabling developers to diagnose, trace, decompile, and monitor issues through simple natural‑language prompts and Spring Boot integration.

AI Operations · Arthas · JVM debugging

0 likes · 8 min read


Alibaba Cloud Observability

Mar 30, 2026 · Mobile Development

Why Your Android App’s Network Requests Stall and How RUM Reveals the Fix

The article explains how mobile network performance issues—especially long connection‑pool wait times—cause slow page loads in Android apps, demonstrates how to interpret Alibaba Cloud RUM’s Resource event metrics and timing data, walks through a real‑world case study with detailed stage‑by‑stage analysis, and provides concrete diagnostic steps and optimization recommendations for OkHttp connection‑pool configuration and other common bottlenecks.

Android · Connection Pool · Mobile Development

0 likes · 16 min read


Alibaba Cloud Native

Mar 24, 2026 · Mobile Development

Why Your Android App Is Slow: Uncovering Hidden Network Bottlenecks with RUM

This article explains how mobile network diversity, device fragmentation, and limited visibility make performance troubleshooting hard, then introduces Alibaba Cloud RUM's Resource event model, walks through its attribute and metric fields, demonstrates a real‑world Android case study with step‑by‑step timing analysis, and provides concrete diagnosis and optimization guidelines for connection‑pool, DNS, SSL, and TTFB issues.

Android · OkHttp · Performance Monitoring

0 likes · 18 min read


JD Retail Technology

Mar 3, 2026 · Frontend Development

How JD’s Order Module Achieved One‑Code‑Three‑Platform Success with Taro

This article details JD.com’s six‑month engineering effort to refactor its high‑traffic order module into a single Taro codebase that runs on Android, iOS and HarmonyOS, covering business background, preparation, multi‑mode adaptation, core challenges, quality assurance, efficiency gains and the resulting business impact.

HarmonyOS · Order Module · Performance Monitoring

0 likes · 21 min read


java1234

Mar 3, 2026 · Backend Development

One‑Line Java Time Tracker: Millisecond Precision and Up to 300% Faster Performance

The article shows how to replace repetitive System.currentTimeMillis() timing code with a concise TimeTracker utility that leverages try‑with‑resources, functional interfaces, and flexible exception handling to achieve millisecond‑level measurement and dramatically improve code readability and performance.

Exception Handling · Functional Interface · Java

0 likes · 13 min read


Linux Tech Enthusiast

Feb 25, 2026 · Fundamentals

How Linux File Systems and Disk I/O Work

The article explains Linux's core storage components—inode, dentry, superblock, and logical blocks—how the Virtual File System abstracts different file systems, the classification of file systems and I/O types, disk technologies, the block layer, I/O schedulers, and practical performance metrics and monitoring tools.

Disk I/O · I/O scheduler · Linux

0 likes · 20 min read


Deepin Linux

Feb 14, 2026 · Operations

Master Linux Memory Troubleshooting: Detect Leaks and High Usage Efficiently

This guide walks Linux operators through the fundamentals of memory management, explains key metrics, and provides step‑by‑step instructions for using tools like top, free, vmstat, pmap, valgrind, smem, smaps and slabtop to pinpoint and resolve memory leaks and excessive memory consumption in production systems.

Debugging Tools · Performance Monitoring · System Administration

0 likes · 45 min read


Woodpecker Software Testing

Feb 12, 2026 · Operations

How to Build a Full‑Chain JMeter Load Test for an E‑Commerce Mega‑Sale

This article walks through designing and implementing a complete JMeter load‑testing solution for an e‑commerce platform's big‑sale scenario, covering business‑flow mapping, request correlation, multi‑stage stress testing, real‑time monitoring with InfluxDB + Grafana, bottleneck identification, and practical optimization tips.

Grafana · InfluxDB · JMeter

0 likes · 7 min read


Java Backend Technology

Feb 10, 2026 · Operations

Boost Java Service Performance with MyPerf4J: A High‑Throughput Monitoring Tool

This article introduces MyPerf4J, a Java‑agent based, low‑overhead performance monitoring tool that provides real‑time metrics such as RPS, latency percentiles, memory usage and GC statistics, and explains how to install, configure, run and uninstall it in development and production environments.

Backend · Java · JavaAgent

0 likes · 7 min read


Sohu Tech Products

Feb 4, 2026 · Frontend Development

How to Seamlessly Integrate Legacy Backends with Vue and Qiankun Micro‑Frontend

This article presents a comprehensive case study on unifying multiple legacy backend systems by building a modern Vue‑based front‑end, combining native iframe integration with the qiankun micro‑frontend framework to achieve a single entry point, smooth migration, reduced costs, and an evolvable architecture for enterprise applications.

Frontend Architecture · Iframe Integration · Performance Monitoring

0 likes · 18 min read


Coder Trainee

Jan 24, 2026 · Operations

Essential Linux Commands for User, System Management and Performance Monitoring

This article explains essential Linux commands for programmers, covering user management, system administration, and performance monitoring, describes how Linux treats everything as a file, distinguishes Linux and shell commands, and suggests learning 10‑20 commands daily, with visual examples.

Performance Monitoring · Shell Commands · User Management

0 likes · 2 min read


Full-Stack DevOps & Kubernetes

Jan 5, 2026 · Operations

Why High Load Doesn’t Mean High CPU: Uncovering the Real Cause of Linux Server Bottlenecks

A production incident shows a server with 80% CPU usage but a Load Average over 40, revealing that high load often stems from IO wait and soft interrupts rather than CPU saturation, and provides a step‑by‑step troubleshooting guide using top, vmstat, iostat and ps.

CPU · IO Wait · Load Average

0 likes · 9 min read


Woodpecker Software Testing

Dec 23, 2025 · Mobile Development

Ultimate Apache Weex UX Testing Guide: 10 Key Metrics and Optimization Tips

This article explains how to evaluate and improve Apache Weex applications by using built‑in performance monitors, defining essential metrics such as first‑screen render time and interaction latency, and applying ten practical testing and optimization techniques for both iOS and Android platforms.

Android · Mobile Development · Performance Monitoring

0 likes · 7 min read


IT Architects Alliance

Dec 14, 2025 · Operations

How to Build a Scientific KPI System for Enterprise Architecture Efficiency

This article explains why many enterprises lack quantitative architecture efficiency metrics, outlines the multidimensional challenges of assessing technical, business, cost, and organizational performance, and provides a detailed, step‑by‑step KPI framework—including technical, business, cost, and organizational indicators, data collection automation, monitoring dashboards, and continuous improvement practices—to enable data‑driven architecture optimization.

Enterprise · KPI · Operations

0 likes · 9 min read


Alibaba Cloud Native

Dec 13, 2025 · Mobile Development

How to Diagnose Android Crash Issues with RUM: A Step‑by‑Step Case Study

This article walks through a real‑world Android crash investigation, showing how to collect, visualize, and analyze crash data with Alibaba Cloud RUM, trace user behavior, identify the root cause in ProductListAdapter, and apply symbolication and version comparison to resolve the issue efficiently.

Android · Performance Monitoring · RUM

0 likes · 13 min read


Raymond Ops

Dec 7, 2025 · Operations

Master Linux System Performance: Top, htop, vmstat, iostat & Advanced Tuning Secrets

This comprehensive guide walks you through Linux system performance monitoring using tools like top, htop, vmstat, iostat, sar, netstat, ps, and free, explains each metric, provides practical shell scripts for real‑time analysis, alerts, and detailed tuning strategies for CPU, memory, disk, and network.

Linux · Performance Monitoring · Shell Scripts

0 likes · 34 min read


Tech Stroll Journey

Dec 4, 2025 · Operations

Understanding Linux CPU Load Average and Utilization: A Practical Guide

This article explains the key CPU performance metrics on Linux—including load average, CPU usage percentages, and process states—showing how to interpret top, ps, uptime, and mpstat outputs and how to differentiate between load and utilization in various workload scenarios.

CPU · Linux · Load Average

0 likes · 11 min read
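
To make the load-versus-utilization distinction in the entry above concrete, here is a minimal Python sketch that parses the text format of /proc/loadavg and normalizes the 1-minute load by core count. The sample string, the 8-core figure, and the function names are assumptions chosen for illustration; none of them come from the article.

```python
def parse_loadavg(text):
    """Parse the contents of /proc/loadavg.

    Format: three load averages (1, 5, 15 min), then
    "runnable/total" scheduling entities, then the last PID.
    """
    fields = text.split()
    load1, load5, load15 = (float(x) for x in fields[:3])
    runnable, total = (int(x) for x in fields[3].split("/"))
    return {
        "load1": load1,
        "load5": load5,
        "load15": load15,
        "runnable": runnable,
        "total_tasks": total,
    }


def load_per_core(load, cores):
    """Normalize a load average by the number of logical CPUs."""
    return load / cores


# Hypothetical sample; on a real host, read /proc/loadavg and use os.cpu_count().
sample = "40.12 35.80 20.05 3/812 12345"
info = parse_loadavg(sample)
ratio = load_per_core(info["load1"], cores=8)
print(f"1-min load {info['load1']} on 8 cores -> {ratio:.2f} per core")
```

On Linux the load average counts tasks in uninterruptible sleep as well as runnable ones, so a per-core value well above 1 does not by itself show that the CPUs are saturated; it is worth comparing against the utilization figures from top or mpstat.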


Ray's Galactic Tech

Dec 1, 2025 · Databases

Build an End-to-End MySQL Slow Query Log Collection and Analysis System with ELK

This guide walks through configuring MySQL slow query logging, using Logstash to ingest and parse logs, storing them in Elasticsearch, visualizing with Kibana dashboards, and setting up alerts for real-time performance monitoring.

Database Optimization · ELK · Kibana

0 likes · 5 min read


Architecture Digest

Nov 24, 2025 · Operations

Boost Java Service Performance with MyPerf4J: A High‑Speed, Low‑Impact Monitoring Tool

MyPerf4J is an open‑source, high‑performance Java monitoring and statistics tool that uses a JavaAgent for zero‑intrusion, records up to ten million method calls per second with nanosecond precision, and provides real‑time metrics such as QPS, latency percentiles, memory and GC stats, making it ideal for both development and production environments.

Java · JavaAgent · Performance Monitoring

0 likes · 6 min read


Liangxu Linux

Nov 23, 2025 · Operations

20 Essential Linux Commands Every Ops Engineer Must Master

This article presents twenty indispensable Linux command‑line tools—covering system monitoring, performance analysis, process management, network diagnostics, disk handling, and kernel tuning—explaining their syntax, practical tips, common pitfalls, and how they integrate with modern cloud‑native environments.

Linux · Network Diagnostics · Operations

0 likes · 12 min read


Tech Stroll Journey

Nov 13, 2025 · Operations

Understanding Disk Performance Metrics and Process‑Level I/O on Linux

This guide explains Linux disk performance indicators—utilization, saturation, IOPS, throughput, and latency—their interrelationships, and how to monitor both overall and per‑process I/O using tools like iotop, pidstat, and the /proc filesystem.

Disk Performance · I/O Metrics · Linux

0 likes · 10 min read
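
The relationship between the indicators named in the entry above (IOPS, throughput, latency, utilization) can be sketched by differencing two samples of /proc/diskstats. The Python below is a minimal illustration under stated assumptions: the two sample lines, the 2-second interval, and the function names are made up for the example, and only the first 14 whitespace-separated fields of each line are used.

```python
SECTOR_BYTES = 512  # /proc/diskstats counts sectors in 512-byte units


def parse_diskstats_line(line):
    """Extract the counters needed here from one /proc/diskstats line."""
    f = line.split()
    return {
        "name": f[2],
        "reads": int(f[3]),            # reads completed
        "sectors_read": int(f[5]),
        "read_ms": int(f[6]),          # time spent reading
        "writes": int(f[7]),           # writes completed
        "sectors_written": int(f[9]),
        "write_ms": int(f[10]),        # time spent writing
        "io_ms": int(f[12]),           # time the device had I/O in flight
    }


def disk_metrics(before, after, interval_s):
    """Derive rate metrics from two cumulative samples taken interval_s apart."""
    d = {k: after[k] - before[k] for k in before if k != "name"}
    ios = d["reads"] + d["writes"]
    sectors = d["sectors_read"] + d["sectors_written"]
    return {
        "iops": ios / interval_s,
        "throughput_bytes_s": sectors * SECTOR_BYTES / interval_s,
        "avg_latency_ms": (d["read_ms"] + d["write_ms"]) / ios if ios else 0.0,
        "util_pct": min(100.0, 100.0 * d["io_ms"] / (interval_s * 1000)),
    }


# Hypothetical samples for device "sda", taken 2 seconds apart.
t0 = parse_diskstats_line("8 0 sda 1000 0 80000 500 2000 0 160000 1500 0 1200 2000")
t1 = parse_diskstats_line("8 0 sda 1200 0 96000 700 2600 0 208000 2300 0 2200 3000")
metrics = disk_metrics(t0, t1, interval_s=2.0)
print(metrics)
```

This is roughly the arithmetic that tools such as iostat perform per interval. Utilization is only a rough saturation signal on devices that serve requests in parallel, such as SSDs or RAID arrays.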


Ray's Galactic Tech

Nov 12, 2025 · Operations

Mastering JVM: The Ultimate Toolbox for Monitoring, Profiling, and Debugging

Explore the comprehensive suite of built‑in and third‑party JVM tools—including jps, jstack, jmap, jcmd, JConsole, VisualVM, async‑profiler, Arthas, MAT, JProfiler, and APM solutions—detailing their core features, common commands, and practical use‑cases for monitoring, diagnosing, and optimizing Java applications in development and production.

JVM · Java Tools · Performance Monitoring

0 likes · 9 min read


Alibaba Cloud Infrastructure

Nov 12, 2025 · Operations

How Alibaba Cloud’s One‑Click IO Diagnosis Solves Multi‑Tenant Performance Bottlenecks

The article explains how Alibaba Cloud’s OS console implements a one‑click IO diagnostic that automatically detects, classifies, and resolves high‑latency, burst, and iowait IO issues in multi‑tenant cloud environments by using dynamic thresholds, periodic metric collection, and targeted root‑cause analysis.

Alibaba Cloud · IO diagnostics · Performance Monitoring

0 likes · 11 min read


Alibaba Cloud Observability

Nov 10, 2025 · Cloud Native

How to Diagnose and Fix Memory & CPU Latency Issues in Cloud‑Native Kubernetes Clusters

This article explains why resource over‑commit in cloud‑native Kubernetes clusters leads to memory and CPU latency, shows how to visualize kernel delays with the ack‑sysom‑monitor exporter, outlines common latency scenarios, and provides step‑by‑step troubleshooting and remediation guidance.

CPU scheduling · Cloud Native · Kubernetes

0 likes · 11 min read


Linux Kernel Journey

Nov 4, 2025 · Operations

How to Use Kernel Tracepoints for Zero‑Overhead GPU Driver Monitoring

This tutorial explains how to leverage Linux kernel tracepoints with eBPF and bpftrace to capture real‑time GPU driver activity—including job scheduling, memory management, and command submission—across Intel, AMD, Nouveau, and NVIDIA GPUs, providing detailed examples, scripts, and analysis of the resulting data.

DRM · GPU · Performance Monitoring

0 likes · 20 min read


Architect's Guide

Nov 1, 2025 · Backend Development

How to Use Java Agents for Non‑Intrusive Method Timing and Tracing

This article demonstrates how to use Java Agent and the java.lang.instrument API to non‑intrusively measure method execution time, dynamically modify bytecode with ASM, leverage Attach for runtime instrumentation, and explore related tools like Arthas and Bytekit for tracing and debugging Java applications.

Arthas · Bytekit · Instrumentation

0 likes · 24 min read


Ops Community

Oct 28, 2025 · Operations

Master Linux Performance: Top, iotop, pidstat, sar – Real‑World Diagnostic Guide

This guide covers Linux performance analysis tools—including top, htop, iotop, pidstat, iostat, sar, and vmstat—detailing installation, usage, key metrics, troubleshooting scenarios, monitoring integration with Prometheus, and best‑practice recommendations for effective system diagnostics and capacity planning.

Ops · Performance Monitoring · iotop

0 likes · 29 min read


Architect's Guide

Oct 22, 2025 · Backend Development

Top 10 Essential Java Development Tools from Alibaba for Faster Coding

This article introduces ten widely used Alibaba‑developed Java tools—including the Java Initializr scaffolding service, Arthas diagnostic utility, Dragonwell JDK, code‑style scanner, ARMS monitoring platform, PTS performance tester, AHAS high‑availability suite, EasyExcel, HandyJSON, and the Druid connection pool—highlighting their features, use cases, and links to official sites.

Backend Development · Java · OpenJDK

0 likes · 12 min read


Alibaba Cloud Observability

Oct 20, 2025 · Mobile Development

Accelerate iOS Issue Diagnosis with Cloud‑Native Data Collection SDK

Mobile developers often struggle with unreproducible crashes and lag reported by users, spending days sifting through logs and isolated stack traces; this article explains how a cloud‑native iOS SDK links performance metrics, error logs, and user behavior through systematic data collection to dramatically speed up issue diagnosis.

Method Swizzling · Mobile Development · Performance Monitoring

0 likes · 9 min read


Tech Stroll Journey

Oct 17, 2025 · Fundamentals

Understanding Linux Memory Management: From Physical RAM to Virtual Address Spaces

Linux memory management relies on virtual address spaces that map to physical RAM, using page tables, multi-level paging, and dynamic allocation across segments like stack, heap, and mmap, while tools such as free, top, and ps let administrators monitor usage and handle OOM scenarios.

Linux · Memory Management · Operating System

0 likes · 9 min read


Alibaba Cloud Native

Oct 14, 2025 · Mobile Development

How Alibaba Cloud RUM SDK Captures iOS App Performance and Crashes

The article explains the architecture, data collection methods, and crash monitoring techniques of Alibaba Cloud's RUM SDK for iOS, detailing session tracing, performance metrics, Method Swizzling, system event handling, and KSCrash integration to improve issue diagnosis.

Crash Reporting · Method Swizzling · Mobile Development

0 likes · 9 min read


AndroidPub

Oct 10, 2025 · Mobile Development

How Koin Annotations 2.2 Makes Migrating from Dagger/Hilt to Koin Seamless

Koin Annotations 2.2 introduces JSR‑330 compatibility, predefined scope archetypes, smart module configuration, and built‑in performance monitoring, enabling Android and Kotlin Multiplatform projects to migrate from Dagger or Hilt to Koin safely, progressively, and with minimal code changes.

Android · JSR-330 · Koin

0 likes · 10 min read


Mike Chen's Internet Architecture

Oct 7, 2025 · Databases

Essential MySQL Commands Cheat Sheet: Manage Databases, Tables, Users & More

This guide provides a comprehensive collection of essential MySQL commands covering database creation, table schema management, data manipulation, user permissions, performance monitoring, and backup/restore procedures, offering a practical reference for developers and administrators.

Backup · Performance Monitoring · SQL

0 likes · 4 min read


Raymond Ops

Sep 29, 2025 · Operations

Master Linux Performance: 5 Essential Monitoring Commands Explained

This article introduces five essential Linux performance monitoring commands—vmstat, iostat, free, df, and sar—detailing their purpose, key options, example usages, and the meaning of each output column to help you effectively track system resources.

Free · Linux · Performance Monitoring

0 likes · 11 min read


Alibaba Cloud Observability

Sep 29, 2025 · Cloud Native

What Makes HarmonyOS NEXT a Pure Cloud‑Native OS? Inside the Architecture and SDK

This article introduces HarmonyOS NEXT's pure, fast, and minimal design, its development base with DevEco Studio and ArkTS, the compilation artifacts (HAR, HSP, HAP), the system‑level open capability map, and the ARMS RUM SDK's architecture, session management, and three unobtrusive data‑collection schemes for performance monitoring.

Cloud Native · HarmonyOS · Performance Monitoring

0 likes · 11 min read


Tech Freedom Circle

Sep 25, 2025 · Operations

RAGFlow Link Tracing: GPS‑Style Observability for LLM‑Powered Applications

The article explains why RAGFlow needs end‑to‑end link tracing, introduces OpenTelemetry’s core concepts, shows how custom tracing utilities are implemented in Python, describes the layered architecture, provides concrete Docker and YAML configurations, and offers best‑practice guidelines for performance monitoring and fault diagnosis.

Distributed Systems · LLM · Observability

0 likes · 24 min read


Code Ape Tech Column

Sep 22, 2025 · Backend Development

How to Build an Elegant Java TimeTracker with AutoCloseable and Lambdas

This article explains how to design a lightweight, flexible Java TimeTracker utility that leverages AutoCloseable, try‑with‑resources, functional interfaces and lambda expressions to simplify performance monitoring, support automatic exception handling, and provide extensible features for real‑world backend development.

Lambda · Performance Monitoring · autocloseable

0 likes · 13 min read


Liangxu Linux

Sep 21, 2025 · Operations

10 Essential Linux Performance Monitoring Commands Every Sysadmin Should Know

Master Linux system performance by learning ten powerful monitoring commands—top, vmstat, lsof, iotop, iostat, htop, netstat, iftop, tcpdump, and nethogs—each illustrated with usage examples and output, enabling quick diagnosis of CPU, memory, disk, and network issues.

Linux · Linux tools · Operations

0 likes · 10 min read


BirdNest Tech Talk

Sep 16, 2025 · Operations

How Go 1.25’s Trace Flight Recorder Enables Low‑Overhead Production Debugging

Go 1.25 introduces Trace Flight Recorder, a lightweight circular‑buffer tracing tool that lets developers capture recent execution data in production with minimal overhead, and the article walks through its concepts, configuration, code demos, analysis workflow, and practical use cases.

Go · Performance Monitoring · Runtime

0 likes · 12 min read


Architect's Guide

Sep 15, 2025 · Backend Development

How to Use Java Agents and Instrumentation for Non‑Intrusive Performance Monitoring

This article explains why inserting manual timing code is invasive, introduces the java.lang.instrument API, shows how to write a tiny Java Agent with premain and agentmain methods, demonstrates dynamic class redefinition via the Attach API, and explores how tools like Arthas and Bytekit leverage these mechanisms for runtime tracing and bytecode enhancement.

Arthas · Attach API · Instrumentation

0 likes · 19 min read


NiuNiu MaTe

Sep 12, 2025 · Backend Development

Mastering QPS: From Basics to Real‑World Interview Wins

This article explains what QPS (queries per second) really means, distinguishes it from TPS and concurrency, shows typical QPS ranges for different systems, and provides practical methods—production monitoring, APM tools, custom metrics, load‑testing and manual estimation—to obtain, validate, and interpret QPS for performance optimization and interview success.

Interview Preparation · Load Testing · Performance Monitoring

0 likes · 22 min read


MaGe Linux Operations

Sep 4, 2025 · Operations

Master tcpdump: Real-World Linux Network Troubleshooting Techniques

This comprehensive guide walks you through why tcpdump is essential for ops engineers, how to install and configure it, basic and advanced filtering commands, real incident case studies, performance tuning, security analysis, and integration with other tools, turning raw packet captures into actionable insights.

Performance Monitoring · Security · network troubleshooting

0 likes · 22 min read


Code Ape Tech Column

Sep 2, 2025 · Operations

Avoid QPS Miscalculations: 5 Proven Methods to Accurately Measure Traffic

This article explains five practical ways to count QPS—from gateway and application instrumentation to monitoring tools, log analysis, and database metrics—while highlighting common pitfalls such as health‑check filtering, thread‑safety, and multi‑node aggregation, helping engineers make informed scaling decisions.

ELK · Java · Performance Monitoring

0 likes · 16 min read
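
Two of the pitfalls named in the entry above, health-check filtering and thread safety, can be sketched with a small sliding-window counter. The Python below is an illustrative sketch, not the article's implementation. The class name, the "/health" path, and the 2-second window are assumptions, and timestamps are passed in explicitly (and assumed non-decreasing) so the behavior is deterministic.

```python
import threading
from collections import deque


class QpsCounter:
    """Sliding-window request counter that skips health-check paths."""

    def __init__(self, window_seconds=1.0, ignored_paths=("/health",)):
        self._window = window_seconds
        self._ignored = set(ignored_paths)
        self._events = deque()          # timestamps of counted requests
        self._lock = threading.Lock()   # guards _events across worker threads

    def record(self, path, now):
        """Count one request unless its path is an ignored probe."""
        if path in self._ignored:
            return
        with self._lock:
            self._events.append(now)

    def qps(self, now):
        """Average requests per second over the trailing window."""
        with self._lock:
            cutoff = now - self._window
            # Drop timestamps that have aged out of the window.
            while self._events and self._events[0] <= cutoff:
                self._events.popleft()
            return len(self._events) / self._window


counter = QpsCounter(window_seconds=2.0)
counter.record("/orders", now=0.5)
counter.record("/health", now=0.6)   # ignored: health-check probe
counter.record("/orders", now=1.0)
counter.record("/orders", now=2.4)
current = counter.qps(now=2.5)       # the request at 0.5 has aged out
print(current)
```

In a real service the timestamps would typically come from time.monotonic(). A counter like this only sees one node, so per-node values still have to be summed to get cluster-wide QPS.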


Architect's Guide

Sep 1, 2025 · Operations

How Does Distributed Link Tracing Work? Inside SkyWalking’s Architecture

This article explains the concept of distributed link tracing, its principles, metrics, and implementation details—including monolithic and microservice approaches, OpenTracing standards, and how SkyWalking solves challenges like automatic span collection, context propagation, unique trace IDs, and sampling performance.

Distributed Tracing · Microservices · Observability

0 likes · 12 min read


FunTester

Sep 1, 2025 · Operations

Why Load Testing Is Critical for High‑Traffic Apps and How to Do It Right

This article explains why load testing is essential for modern applications that must serve millions of users, outlines various test types and best‑practice steps, recommends tools and frameworks, and shows how continuous testing integrated into CI/CD pipelines ensures scalability, reliability, and optimal performance under unpredictable traffic spikes.

Load Testing · Performance Monitoring · Scalability

0 likes · 11 min read


Raymond Ops

Aug 27, 2025 · Operations

Essential Linux Commands for Quick System Performance Diagnosis

When a Linux server shows performance issues, this guide walks you through the most important one‑minute metrics—using commands like uptime, dmesg, vmstat, mpstat, pidstat, iostat, free, sar and top—to quickly pinpoint CPU, memory, I/O, and network bottlenecks.

Linux · Performance Monitoring · command-line

0 likes · 17 min read


dbaplus Community

Aug 25, 2025 · Operations

20 Essential Linux Disk Management Techniques Every Sysadmin Must Master

This guide presents twenty practical Linux disk‑management techniques—from basic inspection commands like lsblk and blkid to advanced LVM, RAID, and performance‑monitoring tools—providing step‑by‑step examples, command syntax, and best‑practice recommendations for reliable storage administration.

LVM · Linux · Performance Monitoring

0 likes · 8 min read


Code Ape Tech Column

Aug 12, 2025 · Operations

How to Use Meteor for Real-Time Java App Diagnosis and Performance Tuning

Meteor, built on Alibaba's Arthas, is a non‑intrusive Java diagnostic console that lets developers monitor running applications, locate performance bottlenecks, memory leaks, and thread deadlocks without restarting services, offering a SpringBoot‑based architecture, quick start commands, and features such as class inspection, live code editing, method monitoring, thread management, and dashboards.

Arthas · Java · Performance Monitoring

0 likes · 4 min read


MaGe Linux Operations

Aug 10, 2025 · Operations

How to Resolve 100% CPU Outages in Under 3 Minutes: A Real‑World Emergency Guide

This article walks through a real‑world 100% CPU incident on an e‑commerce platform, showing how to detect the problem within seconds, analyze Java threads, apply quick emergency fixes, implement permanent refactoring, and set up long‑term monitoring to prevent future outages.

CPU · Java · Operations

0 likes · 14 min read


Tencent Architect

Jul 28, 2025 · Operations

How TencentOS NBS Solves Network Latency Mysteries: Real‑Time Trace Without Disruption

Network latency spikes often leave developers guessing whether the culprit lies in user‑space, the kernel stack, or the physical link; this article introduces TencentOS’s NBS (Net Blackboard System), a low‑overhead, zero‑disruption solution that pinpoints delay sources, supports continuous deployment, and outperforms traditional tools like tcpdump and bpftrace.

NBS · Network Latency · Operations

0 likes · 14 min read


Qunhe Technology Quality Tech

Jul 23, 2025 · Operations

How We Boosted CDN Stability by 80% with Automated Testing and Dual Architecture

This article details how test development engineers tackled CDN stability challenges by implementing a dual overseas architecture, automated backend interface validation, and a data‑driven front‑end image detection pipeline, achieving an 80% efficiency gain and minute‑level issue detection.

Automated Testing · CDN · DevOps

0 likes · 11 min read


Code Ape Tech Column

Jul 3, 2025 · Backend Development

How to Detect and Fix Memory Leaks in Spring Boot Applications

This guide explains the fundamentals of memory leaks in Java, outlines common causes in Spring Boot, and provides step‑by‑step techniques—including GC log analysis, JConsole, VisualVM, MAT, Actuator, custom endpoints, jstack, BTrace, and best‑practice recommendations—to identify, diagnose, and prevent memory leaks for stable long‑running services.

Heap Dump · JVM · Performance Monitoring

0 likes · 18 min read


Liangxu Linux

Jun 29, 2025 · Operations

Essential Linux Performance Monitoring Tools and How to Use Them

This guide introduces a comprehensive set of Linux performance monitoring utilities—including vmstat, iostat, dstat, iotop, pidstat, top, htop, mpstat, netstat, ps, strace, uptime, lsof, and perf—explaining their purpose, key options, and example commands for effective system analysis and optimization.

Performance Monitoring · System Tools · iostat

0 likes · 14 min read

macrozheng

Jun 26, 2025 · Operations

Master JVM Performance: Visual Tools, JConsole, VisualVM & Arthas Guide

This guide introduces JVM performance monitoring by explaining built‑in tools like JConsole and VisualVM, showcasing third‑party solutions such as Arthas, and providing step‑by‑step commands and screenshots to help developers quickly visualize and troubleshoot Java applications.

Arthas · JConsole · JVM

0 likes · 12 min read

Selected Java Interview Questions

Jun 20, 2025 · Backend Development

How to Use Java Agents for Non‑Intrusive Method Timing and Bytecode Tracing

This article explains how to replace invasive timing code with a Java Agent using the Instrumentation API, demonstrates both premain and attach approaches, shows ASM‑based bytecode transformation examples, and explores how Arthas trace leverages similar techniques for runtime monitoring.

Arthas · Bytecode Manipulation · Java Agent

0 likes · 23 min read

IT Xianyu

Jun 5, 2025 · Operations

Diagnose MySQL Slow Queries on AlmaLinux: One‑Click Shell Monitoring Scripts

Learn how to quickly pinpoint MySQL performance bottlenecks on AlmaLinux by using built‑in tools like top, iotop, ss, and netstat, and automate snapshot collection with a simple Bash script that records system and container metrics for later analysis.

AlmaLinux · Docker · Performance Monitoring

0 likes · 20 min read

Practical DevOps Architecture

May 20, 2025 · Cloud Computing

Understanding and Monitoring CPU Steal Time in Virtual Machines

This article explains what CPU steal time is, how to view it with the top command, why it occurs in virtualized environments, and provides practical guidelines for interpreting and addressing high steal‑time values to ensure stable cloud VM performance.

CPU Steal Time · Performance Monitoring · cloud computing

0 likes · 4 min read

Deepin Linux

May 19, 2025 · Operations

Linux System Performance Bottleneck Analysis and Optimization Guide

This comprehensive guide explains how to monitor, diagnose, and resolve Linux server performance bottlenecks by covering essential command‑line tools, logging, automated scripts, and optimization techniques for CPU, memory, disk I/O, and network, with practical case studies and code examples.

Performance Monitoring · Shell scripting · System Administration

0 likes · 40 min read

Efficient Ops

May 18, 2025 · Operations

Mastering API Latency: What P90, P95, P99 and SLA Really Mean

This article explains key performance metrics such as API latency, SLA commitments, and percentile indicators P90, P95, and P99, illustrating how to calculate and interpret these values along with average and maximum latency to improve system reliability and user experience.

API latency · Performance Monitoring · SLA

0 likes · 5 min read

IT Services Circle

May 16, 2025 · Fundamentals

Understanding Page Faults and Their Impact on System Performance

The article explains what page faults are, how the Linux kernel handles them, methods to measure their frequency with tools like perf, vmstat and ftrace, and discusses why frequent faults degrade performance and how to mitigate them through memory, configuration, and code optimizations.

Linux · Performance Monitoring · Virtual Memory

0 likes · 7 min read

JD Cloud Developers

May 6, 2025 · Backend Development

How to Instantly Trace Java Method Call Stacks for Faster Debugging

This article explains how to build and use a Java StackTrace utility that extracts and filters method call chains, enabling developers to quickly locate error sources, streamline debugging, and improve operational efficiency by visualizing execution paths through customizable parameters and integration examples.

Backend Development · Debugging · Java

0 likes · 17 min read

Test Development Learning Exchange

May 5, 2025 · Operations

Track User Access Stats in Locust: Real‑Time Dashboard, CSV Export, Custom Metrics

Locust provides a built‑in web UI for real‑time monitoring of metrics such as RPS, average response time, failures, and active users, supports headless CSV export for deeper analysis, and allows custom statistics via event listeners in your test scripts, enabling comprehensive user access reporting.

CSV export · Load Testing · Locust

0 likes · 5 min read

Raymond Ops

Apr 13, 2025 · Operations

Master Linux Resource Monitoring: How to Use /usr/bin/time Effectively

This guide explains how to use the Linux /usr/bin/time utility to measure program resource usage—including user and kernel CPU time, memory consumption, and I/O statistics—covers its syntax, common options, custom format strings, and the distinction between the external command and the shell built‑in.

Performance Monitoring · resource usage · time command

0 likes · 9 min read

MaGe Linux Operations

Apr 5, 2025 · Operations

5 Essential Linux Commands for Real‑Time Performance Monitoring

This article introduces five Linux commands—vmstat, iostat, free, df, and sar—explaining their purpose, common options, example usages, and the meaning of each column in their output to help you monitor memory, CPU, I/O, and filesystem performance in real time.

Free · Performance Monitoring · df

0 likes · 11 min read

Test Development Learning Exchange

Mar 31, 2025 · Mobile Development

Setting Monitoring Metrics and Creating Test Report Templates for Android Monkey Testing

This guide explains how to monitor key system resources such as CPU, memory, battery, network traffic, and startup time during Android Monkey testing, provides ADB commands for data collection, and offers a structured test report template with practical steps for automation and analysis.

Android · Monkey testing · Performance Monitoring

0 likes · 5 min read

360 Zhihui Cloud Developer

Mar 20, 2025 · Operations

Unlocking Application Reliability: Core APM Modules and Yunzhou’s OpenTelemetry Design

This article explains Application Performance Monitoring (APM), its key benefits such as business continuity, performance optimization, and cost reduction, outlines essential APM modules, and details Yunzhou Observation’s OpenTelemetry‑based design, data ingestion, processing, visualization, and future roadmap for observability.

APM · Observability · OpenTelemetry

0 likes · 10 min read

Alibaba Cloud Observability

Mar 17, 2025 · Cloud Native

How to Master LLM Observability in Cloud‑Native Environments

This article explains the unique observability challenges of large language model (LLM) applications, outlines essential performance, cost, and safety metrics, and presents a comprehensive cloud‑native solution—including trace, metric, and log collection, domain‑specific dashboards, and step‑by‑step integration with Alibaba Cloud's Python Agent—to ensure reliable, efficient LLM deployments.

AI gateway · Cloud Native · LLM Observability

0 likes · 18 min read

Liangxu Linux

Mar 5, 2025 · Fundamentals

Understanding Linux Context Switching: Concepts, Types, and Monitoring Tools

This article explains what Linux context switching is, details the information stored in a process control block, distinguishes between process, thread, and interrupt context switches, and shows how to monitor them using vmstat and pidstat commands.

Linux · Operating System · Performance Monitoring

0 likes · 6 min read

MaGe Linux Operations

Feb 26, 2025 · Operations

Which Linux Metrics Should You Check in the First Minute of a Performance Issue?

When a Linux server shows performance problems, this guide walks you through the essential commands—uptime, dmesg, vmstat, mpstat, pidstat, iostat, free, sar, and top—explaining what each metric reveals and how to interpret the output to quickly narrow down the root cause.

Performance Monitoring · command-line · system metrics

0 likes · 16 min read

Rare Earth Juejin Tech Community

Feb 16, 2025 · Frontend Development

Building a Frontend Performance Monitoring SDK: Theory, Metrics, and Implementation

This article explains the importance of frontend performance monitoring, outlines core metrics such as FP, FCP, LCP, and CLS, and demonstrates how to implement a comprehensive SDK using PerformanceObserver, custom configuration, and wrappers for fetch and XMLHttpRequest to capture resource and network data for batch reporting.

JavaScript · Performance Monitoring · SDK

0 likes · 16 min read

Architect

Feb 4, 2025 · Databases

How to Detect Redis Big Keys in Real Time with Zero Code Changes

This article presents a lightweight, non‑intrusive eBPF‑based method for instantly identifying Redis big‑key operations, explains the underlying kernel and user‑space implementation, provides complete code samples, and evaluates performance before and after optimization.

Go · Performance Monitoring · big key detection

0 likes · 21 min read

Java Tech Enthusiast

Feb 4, 2025 · Backend Development

Unlock Precise Method Timing with Cool Request’s New Trace Feature

The article introduces Cool Request, an IDEA plugin that now supports a Trace feature for measuring execution time of any method, automatic MyBatis function tracing, customizable duration coloring, and script-based environment manipulation, complete with usage examples and code snippets.

Cool Request · HTTP debugging · IDEA Plugin

0 likes · 6 min read

Test Development Learning Exchange

Feb 4, 2025 · Operations

Using Locust for HTTP Request Statistics: Real‑time Monitoring, CSV Export, Custom Metrics, and Analysis

This guide explains how to leverage Locust's built‑in statistics and reporting features to monitor HTTP requests in real time via its web UI, export results as CSV files, customize metric collection in scripts, and analyze the data with external tools.

CSV export · Load Testing · Locust

0 likes · 5 min read

Test Development Learning Exchange

Jan 30, 2025 · Operations

Locust Event System Overview and Usage

The article explains Locust's event system, detailing common events such as init, test_start, test_stop, request_success, request_failure, quitting, worker_report, and hatch_complete, and provides Python code examples for attaching listeners to customize load‑testing behavior.

Event System · Load Testing · Locust

0 likes · 4 min read

Alibaba Cloud Developer

Jan 2, 2025 · Operations

Mastering Error and Latency Diagnosis for Online Applications

This article presents a systematic root‑cause diagnosis framework for online applications, covering how to identify and resolve both error ("wrong") and performance ("slow") problems using trace links, associated data, high‑quality observability, and large‑language‑model‑driven intelligence.

Performance Monitoring · Root Cause Analysis · Trace Analysis

0 likes · 12 min read

Su San Talks Tech

Dec 27, 2024 · Operations

Unveiling Distributed Tracing: How SkyWalking Tackles Microservice Performance

This article explains the principles and benefits of distributed tracing, introduces OpenTracing and SkyWalking architecture, and shares practical implementations and performance comparisons that help identify bottlenecks in microservice systems.

Distributed Tracing · Microservices · OpenTracing

0 likes · 17 min read

Linux Kernel Journey

Dec 21, 2024 · Fundamentals

Identify the Most Time‑Consuming Process Functions with eBPF

This tutorial shows how to use an eBPF program with the PERF_EVENT type to trace kernel activity, collect samples via performance counters, and pinpoint which processes and functions consume the most execution time, covering dynamic tracing concepts and overflow handling.

Linux profiling · Performance Monitoring · dynamic tracing

0 likes · 3 min read

Linux Kernel Journey

Dec 18, 2024 · Operations

Tracing Linux Soft Interrupts with eBPF: Measuring Processing Time

This article demonstrates how to write an eBPF program that attaches to Linux soft‑interrupt entry and exit points, records timestamps in eBPF maps, computes handling duration, updates counters and histograms, and exposes the data to user space for performance analysis.

Linux · Performance Monitoring · eBPF

0 likes · 5 min read

Full-Stack Cultivation Path

Dec 15, 2024 · Frontend Development

How Chrome Recorder Supercharges Development Efficiency

The article explains how Chrome's Recorder feature lets developers capture, replay, edit, and export user interaction flows, turning repetitive manual testing into a one‑click process that speeds up debugging, performance monitoring, and automated test creation.

Chrome DevTools · Performance Monitoring · Recorder

0 likes · 6 min read

Liangxu Linux

Nov 27, 2024 · Operations

Quick Guide to Linux System Performance Diagnosis with Common Commands

This article explains how to use essential Linux commands such as uptime, dmesg, vmstat, mpstat, pidstat, iostat, free, sar, and top to monitor system load, CPU usage, memory, I/O, and network activity, interpreting their outputs for effective troubleshooting.

Linux · Performance Monitoring · iostat

0 likes · 36 min read

Test Development Learning Exchange

Nov 12, 2024 · Fundamentals

Python Decorators: Concepts, Examples, and Practical Applications

This article provides a comprehensive guide to Python decorators, covering their basic concepts, practical examples including logging, multiple decorators, parameterized decorators, metadata preservation, class decoration, and performance monitoring applications.

Metadata Preservation · Performance Monitoring · Python

0 likes · 5 min read

Spring Full-Stack Practical Cases

Nov 8, 2024 · Backend Development

Master Java Agent with Spring Boot 3: Real‑World API Latency Monitoring

This tutorial explains Java Agent technology, shows how to implement a custom agent using the Instrumentation API and Javassist, integrates it with a Spring Boot 3 application to log API execution time, and provides step‑by‑step packaging and execution instructions.

Bytecode Manipulation · Instrumentation · Java Agent

0 likes · 10 min read

Test Development Learning Exchange

Oct 22, 2024 · Operations

Key Linux Server Performance Metrics, Monitoring Tools, and a Python Script for Automated Data Collection

When testing Linux server performance, you should monitor key metrics such as CPU usage, memory consumption, disk I/O, network bandwidth, process information, file system usage, system logs, boot and response times, context switches, and interrupts, using tools like top, vmstat, iostat, netstat, and custom Python scripts.

Linux · Performance Monitoring · Python

0 likes · 8 min read

ITPUB

Oct 15, 2024 · Databases

8 Essential MongoDB Admin Scripts to Master Database Management

This guide presents eight practical MongoDB shell scripts—covering database size analysis, connection monitoring, long‑running query termination, table‑scan detection, replica‑set health checks, and slow‑query profiling—plus usage tips and cautions for reliable administration.

Database Administration · MongoDB · Performance Monitoring

0 likes · 15 min read

Open Source Linux

Oct 11, 2024 · Operations

Essential IT Operations Metrics: Definitions, Formulas, and Benchmarks

This article explains why operations metrics are vital for businesses, describes how tracking availability, failure rate, MTTR, MTBF, response time, throughput, error rate, capacity utilization, latency, data integrity, backup success, recovery time, security patch time, server and network utilization can improve reliability, reduce costs, and boost competitiveness.

Availability · IT Operations · MTBF

0 likes · 7 min read

Su San Talks Tech

Oct 9, 2024 · Operations

How to Instantly Diagnose Java Production Issues with Arthas

This article introduces Arthas, an open‑source Java diagnostic tool, and demonstrates how to install it, use its command‑line and web console, and apply common commands such as dashboard, thread, watch, trace, and tt to quickly locate CPU spikes, deadlocks, memory leaks, and other production problems without redeploying code.

Arthas · JVM diagnostics · Java debugging

0 likes · 20 min read

Linux Kernel Journey

Oct 2, 2024 · Operations

eBPF Tutorial 36: Tracing Nginx Requests with bpftrace

This tutorial shows how to use eBPF, bpftrace, and the funclatency tool to instrument key Nginx functions, measure their execution latency, analyze the distribution of request processing times, and identify performance bottlenecks for optimization.

Linux tracing · NGINX · Performance Monitoring

0 likes · 9 min read

dbaplus Community

Sep 22, 2024 · Operations

Mastering Linux Performance: A Deep Dive into the top Command and Thread Analysis

This guide walks through real‑world scenarios of high CPU and memory alerts, demonstrating how to use Linux's top tool, interpret its detailed output, convert thread IDs, and leverage jstack dumps to pinpoint and resolve performance bottlenecks.

CPU · Memory · Performance Monitoring

0 likes · 5 min read

Airbnb Technology Team

Sep 19, 2024 · Mobile Development

How Airbnb Instruments Android Apps to Capture User‑Centric Performance Metrics

Airbnb’s Android Page Performance Score (PPS) framework instruments fragments to collect user‑centric metrics such as TTFL, TTIL, MTH, ALT and RCLT, using a standardized logging config, LoadableView interface, and visibility algorithms, enabling detailed performance analysis and automated alerts for mobile teams.

Android · Instrumentation · Mobile Development

0 likes · 10 min read

Architect

Sep 13, 2024 · Operations

Introducing MyPerf4J: A High‑Performance Java Monitoring and Statistics Tool

The article presents MyPerf4J, a Java‑agent based, low‑overhead performance monitoring library that provides real‑time metrics such as method latency, QPS, memory usage, GC statistics, and class loading, along with quick‑start instructions, configuration details, and open‑source links for Java backend services.

Backend · Java · JavaAgent

0 likes · 7 min read

Bilibili Tech

Sep 13, 2024 · Backend Development

Architectural Evolution of Bilibili Live Interaction Center

To solve duplicated functionality, legacy code, and scalability limits in Bilibili’s live‑streaming interaction services, the team created a unified Interaction Center that abstracts RTC, consolidates session, link, UI, scoring and role management, introduces a shared state machine and tracing, and evolves through phased, extensible architecture for higher performance and maintainability.

Performance Monitoring · RTC · live streaming

0 likes · 22 min read

Full-Stack DevOps & Kubernetes

Sep 13, 2024 · Operations

Diagnosing and Resolving Server Memory Spikes: Tools, Causes, and Fixes

This guide explains how to monitor memory usage on Linux servers, identify common causes of sudden memory consumption such as leaks, stuck processes, or high load, and provides concrete commands and remediation steps to stabilize system performance.

Linux · Performance Monitoring · Shell Commands

0 likes · 9 min read

Xiaohongshu Tech REDtech

Sep 9, 2024 · Cloud Native

Applying eBPF for Cloud‑Native Observability and Continuous Profiling

By deploying eBPF agents as DaemonSets that hook kernel network and performance events, the Xiaohongshu observability team extended cloud‑native monitoring from the application to the kernel, delivering real‑time traffic analysis and low‑overhead continuous profiling for C++ services, aggregating data into centralized collectors for dashboards, flame‑graphs, and rapid root‑cause diagnosis.

Kubernetes · Observability · Performance Monitoring

0 likes · 37 min read

Linux Kernel Journey

Sep 7, 2024 · Operations

Building and Running an eBPF Application – Part 1

This article walks through creating a first eBPF program using C and Go on Ubuntu 22.04, covering required dependencies, kernel‑space vs user‑space concepts, event selection, BPF map definition, and a tracepoint function that measures per‑process CPU time.

BPF maps · C · Go

0 likes · 11 min read

Bilibili Tech

Sep 6, 2024 · Operations

Design and Implementation of a Cross‑Platform Real‑Time Troubleshooting System for Live Streaming

The team built a cross‑platform real‑time troubleshooting system for live streaming that adds critical‑business monitoring and a unified trace_id‑based tracing framework. The article covers simplifying OpenTracing, iterating on the reporting components, handling multi‑threading, and stitching telemetry into searchable event chains; with dashboards on top, diagnosis time fell from two hours to five minutes and the fault‑resolution rate reached 91%.

Distributed Tracing · Performance Monitoring · live streaming

0 likes · 15 min read

MaGe Linux Operations

Aug 29, 2024 · Operations

Master Linux’s /usr/bin/time: Measure CPU, Memory, and More with Custom Formats

This guide explains how to use the Linux /usr/bin/time utility to analyze program performance—including user and kernel CPU time, memory usage, and other resources—by invoking the command with various options, customizing output formats, redirecting results, and distinguishing it from the shell built‑in time command.

Linux · Performance Monitoring · Shell

0 likes · 9 min read

macrozheng

Aug 20, 2024 · Operations

Boost Java App Performance with MyPerf4J: High‑Throughput, Low‑Latency Monitoring

MyPerf4J is a high‑performance, non‑intrusive Java Agent that records millions of method calls per second with nanosecond precision, offering real‑time metrics, low memory overhead, and comprehensive monitoring for both development and production environments.

Backend · Java · MyPerf4J

0 likes · 8 min read

FunTester

Aug 14, 2024 · Backend Development

Mastering Java Agents: Build, Package, and Deploy Runtime Instrumentation

This guide explains what Java Agents are, their core capabilities such as bytecode enhancement, performance monitoring, security checks, and debugging, and provides step‑by‑step instructions for implementing the premain method, creating a ClassFileTransformer, packaging the agent with Maven, and loading it both statically and dynamically.

Instrumentation · Java · Java Agent

0 likes · 10 min read
