Tagged articles
8 articles
Su San Talks Tech
Jul 4, 2022 · Fundamentals

Understanding CPU Cache False Sharing and How to Eliminate It

This article explains the concept of CPU cache false sharing, how it degrades performance on multi‑core systems, and provides practical techniques—including cache‑line alignment macros and padding strategies—to prevent it and improve multithreaded application efficiency.

CPU cache · Performance Optimization · cache line
0 likes · 10 min read
Top Architect
Feb 1, 2022 · Fundamentals

Understanding CPU Cache Hierarchy, Cache Coherence, and Performance Optimization

This article explains the structure of modern CPU caches, the principles of cache lines, associativity, and coherence protocols, and demonstrates how these hardware details affect program performance through multiple code examples covering loop stride, matrix traversal, multithreading, and false sharing.

CPU cache · Memory Hierarchy · Performance Optimization
0 likes · 21 min read
Baidu Geek Talk
Apr 21, 2021 · Backend Development

Performance Optimization in Baidu's C++ Backend: Memory Allocation and Access Techniques

Baidu engineers reduce latency and cost in a C++ backend by eliminating unnecessary string zero-initialization, using zero-copy splitting with SIMD, replacing deep protobuf merges with repeated string fields, allocating from job-scoped arenas and custom memory resources, and applying prefetching, cache-line awareness, and tuned memory-order semantics, achieving speedups that range from severalfold to an order of magnitude.

Memory Access · Performance Optimization · cache line
0 likes · 31 min read
vivo Internet Technology
Mar 10, 2021 · Fundamentals

CPU Performance Optimization Using Top‑Down Micro‑architecture Analysis (TMAM)

The article demonstrates how the Top-down Micro-architecture Analysis Methodology (TMAM) can quickly pinpoint CPU bottlenecks (front-end stalls, back-end stalls, and bad speculation) in a simple C++ accumulation loop, and shows that targeted compiler, alignment, and branch-prediction optimizations reduce runtime by roughly 34% while increasing the share of retiring slots.

C · CPU performance · TMAM
0 likes · 20 min read
JavaEdge
Jan 9, 2020 · Fundamentals

Why False Sharing Slows Your Java Programs and How to Eliminate It

False sharing occurs when multiple threads modify variables that reside on the same CPU cache line, causing unnecessary cache coherency traffic; this article explains cache line basics, CPU cache hierarchy, MESI protocol, and presents Java solutions—including padding, @sun.misc.Contended annotation, and JVM flags—to prevent performance degradation.
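
As a rough illustration of "the same CPU cache line", the arithmetic can be sketched in a few lines of Python. This is not code from the article: the 64-byte line size is an assumption (common on x86-64, but not universal), and the helper names `cache_line_index`, `share_a_line`, and `padding_to_next_line` are hypothetical.

```python
CACHE_LINE_BYTES = 64  # assumed line size; check the target CPU


def cache_line_index(address: int, line_bytes: int = CACHE_LINE_BYTES) -> int:
    """Return the index of the cache line that contains a byte address."""
    return address // line_bytes


def share_a_line(addr_a: int, addr_b: int, line_bytes: int = CACHE_LINE_BYTES) -> bool:
    """True if both byte addresses fall inside the same cache line."""
    return cache_line_index(addr_a, line_bytes) == cache_line_index(addr_b, line_bytes)


def padding_to_next_line(offset: int, line_bytes: int = CACHE_LINE_BYTES) -> int:
    """Bytes of padding needed so the next field starts on a fresh cache line."""
    return (-offset) % line_bytes


# Two adjacent 8-byte counters at offsets 0 and 8 sit on the same line,
# so writes from two threads would contend for it.
adjacent = share_a_line(0, 8)

# Padding after the first counter pushes the second one to offset 64.
pad = padding_to_next_line(8)
second_offset = 8 + pad
padded = share_a_line(0, second_offset)
```

Here `adjacent` is true and `padded` is false: the padding moves the second counter onto its own line, which is the effect that manual padding fields or a contention annotation aim for.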

Java concurrency · MESI · Performance Optimization
0 likes · 9 min read
Qunar Tech Salon
Mar 23, 2015 · Fundamentals

Understanding CPU Cache: Purpose, Multi‑Level Design, Cache Lines, and Optimization Techniques

This article explains why CPU caches are needed, the evolution to multi‑level caches, the concept of cache lines, practical experiments demonstrating their impact, and how different cache organization strategies such as fully associative, direct‑mapped, and N‑way set‑associative affect performance and eviction policies.
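
The difference between the three organization strategies can be sketched with simple index arithmetic. This is an illustrative model, not code from the article: the 32 KiB capacity, 64-byte lines, and 8 ways are assumed values, and `num_sets` and `set_index` are hypothetical helper names.

```python
def num_sets(cache_bytes: int, line_bytes: int, ways: int) -> int:
    """Number of sets in a cache with the given capacity, line size, and associativity."""
    return cache_bytes // (line_bytes * ways)


def set_index(address: int, line_bytes: int, sets: int) -> int:
    """Set that a byte address maps to: line number modulo the number of sets."""
    return (address // line_bytes) % sets


CACHE_BYTES = 32 * 1024  # assumed capacity
LINE_BYTES = 64          # assumed line size

eight_way = num_sets(CACHE_BYTES, LINE_BYTES, ways=8)            # N-way set-associative
direct_mapped = num_sets(CACHE_BYTES, LINE_BYTES, ways=1)        # one line per set
fully_associative = num_sets(CACHE_BYTES, LINE_BYTES, ways=512)  # a single set holds every line

# In the assumed 8-way cache, addresses 4096 bytes apart map to the same set,
# so a ninth such address must evict one of the previous eight.
conflicting = [set_index(i * 4096, LINE_BYTES, eight_way) for i in range(9)]
```

Under these assumptions the 8-way cache has 64 sets, the direct-mapped one has 512, and the fully associative one has a single set; every entry in `conflicting` is set 0, which is why large power-of-two strides tend to trigger evictions.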

Memory Hierarchy · Performance Optimization · cache architecture
0 likes · 14 min read