Tagged articles
72 articles
Deepin Linux
May 6, 2026 · Fundamentals

Master Linux Memory Performance: From Theory to Real‑World Optimization

This article systematically breaks down Linux's core memory mechanisms, identifies common performance bottlenecks, and demonstrates how to use tools like numastat, perf, and Valgrind together with kernel parameters such as swappiness and min_free_kbytes to achieve practical memory optimizations.

Kernel, Linux, Memory
55 min read
Deepin Linux
Mar 28, 2026 · Fundamentals

Unlocking Linux Performance: A Deep Dive into NUMA Architecture

This article explains the core principles of NUMA, its deep integration with the Linux kernel, practical memory‑node and scheduling mechanisms, real‑world database and virtualization use cases, and step‑by‑step commands for inspecting and tuning NUMA on modern servers.

Linux kernel, Memory Management, NUMA
23 min read
dbaplus Community
Feb 24, 2026 · Cloud Native

How CPU Architecture Bottlenecks Cripple Netflix’s Container Scaling

Netflix discovered that scaling hundreds of containers on modern CPUs hit severe lock‑contention due to mount‑related kernel locks, with performance varying across AWS instance types, NUMA designs, and hyper‑threading, leading them to redesign containerd mounting and choose hardware‑aware scheduling to restore efficient scaling.

AWS, CPU architecture, Hyper-threading
16 min read
FunTester
Jan 20, 2026 · Fundamentals

Why Data Movement, Not CPU Speed, Is the Real Performance Bottleneck

Most engineers blame slow CPUs for performance issues, but the true bottleneck is often data latency—from registers and caches to DRAM, NUMA nodes, disks, and networks—so understanding and minimizing data movement is key to reducing tail latency and improving system performance.

Latency, NUMA, Systems
11 min read
360 Zhihui Cloud Developer
Dec 30, 2025 · Cloud Native

How HBox Boosts GPU Utilization with Multi‑Pool and NUMA‑Aware Scheduling

The HBox scheduling platform tackles large‑scale AI cluster challenges by introducing a three‑pool resource model, priority‑based preemptive scheduling, network‑topology and NUMA‑aware dispatch, and GPU virtualization techniques like MIG and vGPU, dramatically improving GPU utilization, SLA guarantees, and overall cluster efficiency.

AI clusters, GPU scheduling, GPU virtualization
24 min read
Ops Community
Nov 3, 2025 · Operations

Master Linux Memory Management: Core Commands & Tuning in 10 Minutes

This comprehensive guide walks you through Linux memory management fundamentals, from prerequisite environments and a quick checklist to step‑by‑step installation of monitoring tools, memory diagnostics, kernel parameter adjustments, THP and swap optimization, NUMA affinity tuning, validation, Prometheus alerts, security hardening, troubleshooting, rollback procedures, best‑practice recommendations, and ready‑to‑use scripts and configuration snippets.

Linux, Memory Management, NUMA
24 min read
MaGe Linux Operations
Aug 25, 2025 · Operations

Why Your 128‑Core Server Underperforms: Unlock 300% Gains with CPU Affinity

This article explains why a newly purchased 128‑core AMD EPYC server may perform worse than a 32‑core machine, demonstrates how improper CPU affinity and NUMA configuration cause severe performance loss, and provides step‑by‑step practical methods—including system topology analysis, taskset, numactl, kernel scheduler tweaks, and container settings—to achieve up to 300% improvement.

CPU affinity, NUMA
15 min read
Deepin Linux
Jul 8, 2025 · Operations

Unlock Linux NUMA Performance: A Practical Multithreaded Tuning Guide

This article explains the fundamentals of NUMA architecture, why it matters for multithreaded Linux applications, and provides step‑by‑step practical guidance—including kernel internals, memory allocation policies, useful commands, and performance‑monitoring tools—to help developers optimize memory locality and boost overall program efficiency.

Linux, NUMA, multithreading
37 min read
Bilibili Tech
Jul 4, 2025 · Operations

Solving CPU Performance Layering in Heterogeneous Data Centers: A Practical Guide

This article explains why heterogeneous servers cause CPU performance layering, describes how to detect the issue using metrics such as NUMA hit/miss rates, cache miss ratios and frequency states, and provides step‑by‑step remediation techniques—including NUMA binding, cache isolation, recompilation and frequency locking—to improve resource pooling efficiency in modern data centers.

CPU performance, Data center, NUMA
24 min read
Linux Kernel Journey
Mar 8, 2025 · Backend Development

Optimizing MPTCP Flow Selection and Exploring a User‑Space MPTCP Stack – ByteDance STE at Netdev 0x19

At Netdev 0x19, ByteDance's STE team presented two technical talks: a NUMA‑aware MPTCP flow‑selection strategy that boosts Redis benchmark throughput by up to 30% and cuts tail latency by 6%, and a DPDK‑based user‑space MPTCP stack that halves latency and doubles throughput in data‑center tests.

DPDK, Linux networking, MPTCP
8 min read
ByteDance SYS Tech
Mar 7, 2025 · Fundamentals

How NUMA‑Aware MPTCP Flow Selection Boosts Throughput and Cuts Latency

At Netdev 0x19, ByteDance's STE team presented two talks—one on a NUMA‑locality‑aware MPTCP flow‑selection strategy that can raise throughput by up to 30% and lower tail latency by 6%, and another on a DPDK‑based user‑space MPTCP stack that reduces latency by nearly 10% and more than doubles throughput—showcasing practical performance gains for data‑center networking.

DPDK, Data Center Networking, MPTCP
8 min read
Linux Kernel Journey
Feb 16, 2025 · Fundamentals

Understanding Multi‑Core Hardware Topology and Linux sched_domain

The article explains how Linux kernel scheduling uses a hierarchical topology—balancing load and preserving cache affinity—by mapping real‑world multi‑core hardware structures such as sockets, dies, clusters, and NUMA nodes to sched_domain and sched_group, and shows how to inspect and tune this layout with CONFIG_SCHED_DEBUG and QEMU simulation.

Kernel, Linux, NUMA
9 min read
Linux Code Review Hub
Feb 12, 2025 · Fundamentals

Understanding Multi‑core Hardware Architecture and Linux sched_domain

The article explains how Linux builds sched_domain and sched_group hierarchies based on physical CPU topology—sockets, dies, clusters, and NUMA nodes—illustrating load‑balancing (BALANCE) versus affinity (AFFINE) with concrete examples, kernel code references, and QEMU‑based experiments.

CPU topology, Kernel Scheduling, NUMA
9 min read
Deepin Linux
Dec 30, 2024 · Fundamentals

Understanding NUMA Node Detection and Memory Management in the Linux Kernel

This article explains the fundamentals of NUMA architecture, how Linux detects and represents NUMA nodes, the memory zone hierarchy, allocation policies, and practical techniques such as using numactl and taskset to bind processes for optimal performance on multi‑socket servers.

Linux kernel, Memory Management, NUMA
22 min read
360 Zhihui Cloud Developer
Nov 18, 2024 · Cloud Computing

How Dynamic Resource Scheduling Boosts OpenStack Efficiency and Cuts Costs

Virtualization resource scheduling algorithms, especially in OpenStack, address fragmented CPU allocation and uneven node utilization by dynamically consolidating VMs, employing NUMA-aware placement, and using resource scoring to trigger migrations, ultimately improving utilization, reducing costs, and enhancing performance in cloud environments.

NUMA, OpenStack, Virtualization
12 min read
Architects' Tech Alliance
Sep 20, 2024 · Operations

Unlocking Kunpeng CPU Performance: Real-World Optimization Techniques and Benchmarks

This article provides a comprehensive, step‑by‑step guide to tuning Kunpeng‑based servers, covering hardware characteristics, matrix‑multiplication benchmarks, NUMA‑aware scheduling, compiler and JDK optimizations, acceleration libraries, disk and NIC tuning, and a practical MariaDB performance‑tuning workflow.

CPU optimization, Kunpeng, Linux
17 min read
Tencent Cloud Developer
Aug 15, 2024 · Databases

Architecture Upgrade Challenges and Atomic Write Solutions for Cloud-native Databases

Drawing on collaboration between the TencentOS and database kernel teams, the article details how architecture upgrades (moving to TKE HouseKeeper, switching to AMD CPUs, and adding a portable 16 KB atomic‑write feature) combined with kernel optimizations such as huge‑page support, NUMA‑aware qspinlocks, speculative page‑fault handling, and ORC unwinding to deliver gains of up to 30% on mixed workloads and over 100% on write‑only workloads while reducing memory usage.

NUMA, ORC unwinder, atomic write
16 min read
Architects' Tech Alliance
Jul 13, 2024 · Operations

How to Supercharge Kunpeng CPUs: Real‑World Performance Tuning Techniques

This article provides a comprehensive guide to optimizing Kunpeng‑based servers, covering hardware characteristics, matrix multiplication benchmarks, Von Neumann architecture insights, soft and hard acceleration, compiler and JDK tweaks, NUMA tuning, Nginx and OpenSSL acceleration, disk and network optimizations, application‑level tuning, and a step‑by‑step MariaDB performance‑tuning checklist.

CPU performance, Database Tuning, Hardware acceleration
16 min read
Alibaba Cloud Infrastructure
Jul 5, 2024 · Cloud Native

Koordinator v1.5.0 Release: New Features and Enhancements

Koordinator v1.5.0, the 13th major release since its open‑source debut, introduces pod‑level NUMA alignment, Terway network QoS, core scheduling, and numerous performance and stability improvements, while also being accepted as a CNCF Sandbox project and outlining future roadmap plans.

Cloud Native, Core Scheduling, Kubernetes
14 min read
Open Source Linux
Apr 29, 2024 · Fundamentals

Unlocking DPDK Memory Management: How Hugepages Boost Performance

This article consolidates DPDK 17.11 source‑code notes to explain the library’s memory‑management subsystem, covering hugepage concepts, shared configuration mapping, NUMA‑aware allocation, and the custom allocator that enables high‑throughput packet processing on Linux.

DMA, DPDK, Memory Management
40 min read
OPPO Kernel Craftsman
Apr 19, 2024 · Fundamentals

Large Folios in the Linux Kernel: Benefits, Implementations, and Future Directions

Large folios in the Linux kernel combine multiple pages to reduce TLB misses, page faults, and reclamation cost while enabling more efficient compression; they are supported by filesystems like XFS and bcachefs, and recent patches add multi‑size THP, swap‑in/out handling, TAO allocation, NUMA balancing, and debug tools, with OPPO’s production deployment showing performance gains and motivating broader adoption and fragmentation mitigation.

NUMA, Swap, TLB
17 min read
MaGe Linux Operations
Mar 1, 2024 · Operations

Master Linux Virtualization: Tuning KVM Performance with Tuned and KSM

This guide walks through Linux virtualization management and performance tuning, covering tuned profiles for guests and hosts, kernel parameters, NUMA awareness, CPU pinning, memory limits, KSM configuration, qcow2 image creation, disk cache modes, I/O throttling, and monitoring commands to optimize KVM workloads.

KSM, KVM, Linux
21 min read
AI Cyberspace
May 19, 2023 · Cloud Computing

Mastering OpenStack Neutron SR‑IOV: Boost Network Performance with VLAN & NUMA

This guide explains the performance limitations of Neutron OVS networking, introduces SR‑IOV as a high‑performance I/O virtualization solution, and provides step‑by‑step configuration for enabling SR‑IOV agents, mapping physical networks, creating VLAN and flat networks, handling NUMA affinity, security groups, and bonding, with detailed command examples and XML snippets.

NUMA, Neutron, OVS
27 min read
Bin's Tech Cabin
May 4, 2023 · Fundamentals

How Linux’s Slab Allocator Manages Memory: Deep Dive into Fast and Slow Paths

This article dissects the Linux kernel’s slab allocator, explaining its complete architecture, the fast‑path allocation from per‑CPU caches, the slow‑path mechanisms involving partial lists, NUMA node caches, and fallback to the buddy system, while detailing object initialization and freelist construction.

Linux, Memory Management, NUMA
41 min read
AI Cyberspace
Mar 28, 2023 · Fundamentals

Why NUMA Slows Multithreaded Apps and How to Optimize It

This article explains NUMA architecture, its multithreaded performance overheads such as remote memory access, cache synchronization, context and mode switches, interrupt handling, TLB misses, and memory copies, and then presents optimization techniques like NUMA and CPU affinity, IRQ tuning, and large‑page usage.

CPU affinity, Linux, NUMA
20 min read
ByteDance SYS Tech
Feb 10, 2023 · Fundamentals

Mastering Linux Memory: Reclaim, Huge Pages, and NUMA Optimization

This article explains common Linux memory‑related performance bottlenecks—such as memory reclamation, page‑cache pressure, huge‑page usage, and cross‑NUMA access—and provides practical tuning methods to improve latency and throughput on servers and applications.

Huge Pages, NUMA
16 min read
Bin's Tech Cabin
Dec 28, 2022 · Fundamentals

How Linux Allocates Physical Memory: Inside the Kernel’s Buddy Allocator

This article walks through Linux kernel physical memory allocation, explaining the hierarchy of allocation interfaces, the role of gfp_mask and ALLOC flags, the fast and slow allocation paths, memory watermarks, NUMA zone handling, and the complex fallback mechanisms including compaction, direct reclaim, and OOM, all illustrated with code snippets and diagrams.

Linux, Memory Management, NUMA
68 min read
Bin's Tech Cabin
Nov 21, 2022 · Fundamentals

Inside Linux Physical Memory Management: From FLATMEM to NUMA, Watermarks, and Page Structures

This article provides an in‑depth, step‑by‑step explanation of how the Linux kernel organizes and manages physical memory, covering memory models (FLATMEM, DISCONTIGMEM, SPARSEMEM), NUMA vs. UMA architectures, zone partitioning, watermarks, reserved pages, hot‑cold page handling, and the detailed struct page layout used for both anonymous and file‑backed pages.

Linux, Memory Management, NUMA
99 min read
Architects' Tech Alliance
Aug 22, 2022 · Fundamentals

DPDK Performance Tuning: Influencing Factors and Optimization Techniques

This article explains how hardware architecture, Linux OS version, kernel configuration, OVS integration, memory management, NUMA awareness, and CPU micro‑architecture affect DPDK application performance and provides concrete tuning steps such as CPU isolation, service disabling, huge‑page setup, and optimized memory allocation.

CPU optimization, DPDK, Linux
11 min read
Liangxu Linux
May 29, 2022 · Operations

Why Linux Triggers OOM Killer and How to Manage Memory Reclamation

This article explains Linux virtual memory, the page‑fault allocation process, the two memory‑reclaim paths (kswapd and direct reclaim), OOM killer scoring, swappiness tuning, NUMA‑aware reclamation, and practical steps to protect critical processes from being killed.

Linux, NUMA, OOM killer
19 min read
IT Services Circle
May 24, 2022 · Fundamentals

Understanding Linux Memory Management, Page Reclamation, and OOM Killer

This article explains Linux virtual memory concepts, the process of memory allocation, page fault handling, background and direct memory reclamation methods, LRU-based page types, NUMA considerations, tuning parameters like swappiness and min_free_kbytes, and strategies to prevent OOM killer termination.

Linux, Memory Management, NUMA
18 min read
NetEase Cloud Music Tech Team
May 19, 2022 · Artificial Intelligence

Performance Evaluation of Cloud Music Online Estimation System on NUMA Architecture

Evaluating the Cloud Music online estimation system on NUMA‑based servers revealed that CPU pinning across both memory nodes dramatically boosts throughput on high‑end 96‑core machines—up to 75% for complex models—while low‑end servers gain only modestly, confirming NUMA‑aware scheduling’s critical role for CPU‑intensive inference workloads.

CPU architecture, NUMA, Performance Testing
8 min read
Alibaba Cloud Native
Feb 14, 2022 · Cloud Native

How to Overcome CPU Throttling and NUMA Bottlenecks in Cloud‑Native Containers

This article explains why container workloads suffer from CPU throttling and NUMA‑related performance loss in cloud‑native environments, examines Kubelet's CPU allocation policies, demonstrates the impact of CPU bursts and topology‑aware scheduling, and shows how Alibaba Cloud ACK mitigates these issues with concrete data.

Alibaba Cloud ACK, CPU Burst, CPU throttling
11 min read
360 Tech Engineering
Sep 2, 2021 · Cloud Computing

Performance Comparison and CPU Pinning Techniques for Enterprise‑Level Virtual Machine Instances

The article analyzes the instability of shared‑type virtual machines, introduces enterprise‑level instances with fixed CPU scheduling and NUMA topology, details the applied technologies such as CPU pinning, PCI‑Passthrough and multi‑queue NICs, and presents extensive sysbench and STREAM benchmark results that demonstrate superior isolation, stability and performance of enterprise instances over shared ones.

CPU pinning, NUMA, Performance Testing
12 min read
Liangxu Linux
Jun 2, 2021 · Operations

Mastering Linux Multi‑Core Scheduling: Strategies, Algorithms, and Performance Optimizations

This article explains Linux's sophisticated scheduling system for multi‑core, SMP, and NUMA architectures, describes global, clustered, partitioned, and arbitrary schedulers, details scheduling domains and load‑balancing mechanisms, and provides practical performance‑tuning techniques using tools like perf, flame graphs, and various kernel optimizations.

BFS, CFS, CPU optimization
31 min read
360 Smart Cloud
Jun 1, 2021 · Fundamentals

Physical Address Space Management and Memory Allocation in Linux (NUMA, Nodes, Zones, Pages, Slab, and Page Fault Handling)

This article explains how Linux manages physical address space using SMP and NUMA architectures, describes the node, zone, and page data structures, details page allocation via the buddy system and slab allocator, and outlines user‑ and kernel‑mode page‑fault handling, swapping, and address translation mechanisms.

Linux, Memory Management, NUMA
17 min read
Aikesheng Open Source Community
Apr 22, 2021 · Databases

Understanding NUMA and Its Impact on MySQL Performance

This article explains NUMA architecture, how its memory allocation policies can cause swap‑related performance issues for MySQL, provides step‑by‑step methods to disable NUMA at BIOS, kernel or MySQL levels, and discusses the innodb_numa_interleave parameter and best‑practice recommendations.

Linux, NUMA, database
7 min read
Architects' Tech Alliance
Apr 11, 2021 · Industry Insights

How to Supercharge Ceph on Huawei Kunpeng ARM: Deep Performance Tuning Guide

This article examines Ceph’s architecture, identifies performance bottlenecks on Huawei’s Kunpeng ARM platform, and presents practical tuning methods—including NUMA placement, cache tagging, vector acceleration, thread scaling, and monitoring tools—to improve storage efficiency, reduce latency, and lower power consumption.

ARM, Ceph, Kunpeng
17 min read
Architects' Tech Alliance
Nov 11, 2020 · Fundamentals

Understanding DPDK Memory Management: Large Pages, NUMA, DMA, and IOMMU

This article explains the core principles of DPDK memory management, covering standard huge pages, NUMA node binding, direct memory access, IOMMU and IOVA addressing, custom allocators, and memory pools, and how these mechanisms together enable high‑performance packet processing on Linux systems.

DMA, DPDK, High‑Performance Networking
14 min read
ITPUB
May 10, 2020 · Databases

How We Migrated MySQL to Tencent Cloud CDB and Boosted Performance Up to 10×

This case study details the migration of Weimeng's MySQL databases to Tencent Cloud CDB, describing the testing methodology, performance bottlenecks discovered (NUMA, network parameters, low‑concurrency issues, and version bugs), the step‑by‑step optimizations applied, and the resulting QPS improvements across various workloads.

NUMA, Tencent Cloud CDB, database migration
20 min read
Ctrip Technology
Nov 21, 2019 · Cloud Native

Case Study: Intermittent Container Timeout Issues – Analysis and Resolution

This article presents a detailed case study of intermittent container timeout problems in a Kubernetes environment, examining kernel upgrades, NUMA configurations, CPU affinity bindings, kubelet behavior, cadvisor overhead, and hardware faults, and outlines the investigative steps and solutions applied.

CPU affinity, Container, Hardware Fault
8 min read
Huawei Cloud Developer Alliance
Nov 8, 2019 · Operations

Boost Network Performance on Kunpeng CPUs: Tuning Tips & Tools

This guide explains how to improve network subsystem performance on Kunpeng processors by using tools such as ethtool and strace, adjusting PCIe payload size, binding NIC interrupts to NUMA‑local cores, tweaking interrupt coalescing, enabling TSO, and replacing select with epoll for high‑concurrency workloads.

Kunpeng, NUMA, Network Tuning
12 min read
Huawei Cloud Developer Alliance
Oct 30, 2019 · Operations

Master CPU & Memory Subsystem Tuning on Kunpeng Processors: Tools & Strategies

This article introduces practical CPU and memory subsystem performance tuning for Kunpeng processors, covering optimization concepts, key parameters, common monitoring tools such as top, perf and numactl, and detailed methods like NUMA binding, prefetch control, timer tuning, TLB page size adjustment, and thread concurrency optimization.

CPU tuning, Kunpeng, Linux performance
15 min read
Programmer DD
Feb 12, 2019 · Fundamentals

How ZGC Achieves Sub‑10 ms Pauses: A Deep Dive into Java’s Low‑Latency GC

ZGC is a scalable, low‑latency Java garbage collector designed to keep pause times under 10 ms regardless of heap size, supporting up to 4 TB, and leveraging concurrent, region‑based, compacting, NUMA‑aware techniques, colored pointers, and load barriers, with detailed compilation and tuning guidance.

Garbage Collection, JDK, Java
8 min read
JD Tech
Dec 11, 2018 · Big Data

Introduction to Graph Computing and the JoyGraph System

This article introduces graph computing, compares it with graph databases, surveys notable graph processing systems, and details the architecture, NUMA‑aware design, execution model, push/pull dual mode, and load‑balancing strategies of the JoyGraph framework while outlining its future development directions.

Big Data, JoyGraph, NUMA
9 min read
dbaplus Community
May 15, 2018 · Operations

Why High‑Throughput Redis Still Drops Packets: Deep Dive into Linux Network Stack and Interrupt Optimization

The article investigates massive packet loss in Meituan‑Dianping's Redis service despite 10 Gbps NIC upgrades, traces the issue to kernel receive‑buffer drops and single‑CPU interrupt handling, and presents a step‑by‑step optimization using backlog tuning, CPU and Redis affinity, and NUMA‑aware placement to eliminate drops and improve latency.

Interrupts, Linux, NUMA
30 min read
Qunar Tech Salon
Mar 21, 2018 · Operations

Root Cause Analysis and Optimization of Network Packet Loss in High‑Traffic Redis Services

The article investigates why massive Redis deployments experience network packet loss despite using 10 Gbps NICs, explains how Linux kernel counters such as net.if.in.dropped are derived from /proc/net/dev, walks through the driver‑to‑kernel processing path, and proposes CPU‑affinity, interrupt‑affinity and NUMA‑aware tuning to eliminate the drops.

CPU affinity, Linux kernel, NUMA
28 min read
Alibaba Cloud Infrastructure
Jul 8, 2015 · Operations

Design and Implementation of High‑Concurrency (C10M) Load Balancing in Alibaba's AGW Middlebox

The article analyzes the challenges of scaling network devices to handle ten‑million concurrent connections (C10M) and describes Alibaba's AGW solution, which uses lock‑free data planes, hugepages, NUMA‑aware memory placement, and user‑space NIC drivers to achieve high‑performance four‑layer load balancing.

C10M, NUMA, hugepage
9 min read