Tagged articles
12 articles
IT Services Circle
Jul 17, 2025 · Backend Development

Boosting Dubbo Performance: Extract Hot Branches, If vs Switch, and CPU Branch Prediction

The article explores how Dubbo’s ChannelEventRunnable code was optimized by separating the frequently‑taken ChannelState.RECEIVED case into its own if statement, compares the runtime efficiency of pure if‑else, mixed if‑switch, and pure switch structures, and explains the underlying CPU branch‑prediction and instruction‑pipeline mechanisms that affect these choices.

CPU optimization · Dubbo · Java performance
0 likes · 15 min read
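The hot-branch idea in the summary above can be sketched as follows. This is a minimal illustration, not Dubbo's actual source: the class name `HotBranchDispatch`, the `dispatch` method, and the returned strings are all hypothetical, and only the state names mirror the `ChannelState` cases the summary refers to.

```java
// Illustrative sketch only; names other than the ChannelState constants are hypothetical.
final class HotBranchDispatch {

    enum ChannelState { CONNECTED, DISCONNECTED, SENT, RECEIVED, CAUGHT }

    // The frequently taken RECEIVED case is tested first in its own if,
    // so the common path is a single comparison before the switch is reached.
    static String dispatch(ChannelState state) {
        if (state == ChannelState.RECEIVED) {
            return "received";
        }
        // Cold path: the remaining, rarer states stay in the switch.
        switch (state) {
            case CONNECTED:
                return "connected";
            case DISCONNECTED:
                return "disconnected";
            case SENT:
                return "sent";
            case CAUGHT:
                return "caught";
            default:
                return "unknown";
        }
    }

    public static void main(String[] args) {
        for (ChannelState state : ChannelState.values()) {
            System.out.println(state + " -> " + dispatch(state));
        }
    }
}
```

Whether this layout actually beats a plain switch depends on the JIT compiler and the real distribution of states, so it is worth measuring before adopting it.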
OPPO Kernel Craftsman
Dec 1, 2023 · Fundamentals

Performance Optimization: Register Access, Assembly Basics, and CPU Pipeline Techniques

The article explains how performance can be dramatically improved by keeping frequently used data in CPU registers instead of memory, understanding basic assembly syntax and instruction types, using branch‑prediction hints, and exploiting the CPU pipeline to reduce stalls and wasted cycles.

Assembly Language · CPU registers · Performance Optimization
0 likes · 12 min read
Tencent Cloud Developer
Oct 19, 2023 · Fundamentals

Profile-Guided Optimization (PGO) Principles and Practice in Go and C++

Profile‑Guided Optimization (PGO) uses profiling data collected at runtime to recompile a program for higher performance, reducing branch mispredictions and improving code layout. Go gained built‑in PGO in 1.21, with typical gains of about 5 %, while C++ sees 15‑18 % QPS improvements plus devirtualization benefits. Future work aims at deeper block ordering and register allocation.

C++ · Go · PGO
0 likes · 16 min read
IT Services Circle
Feb 22, 2022 · Fundamentals

Why Sorting an Array Speeds Up Summation: CPU Pipeline, Hazards, and Branch Prediction Explained

The article examines a puzzling StackOverflow case where sorting a random array before summation yields a six‑fold speedup, explains the phenomenon through CPU five‑stage pipeline fundamentals, structural, data, and control hazards, and shows how branch prediction and operand forwarding mitigate the performance loss.

CPU · Pipeline · Sorting
0 likes · 16 min read
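The experiment described in the summary above can be approximated with the sketch below. It is an assumption-laden reconstruction rather than the article's code: the class `SortedSumDemo`, the threshold of 128, the array size, and the round count are illustrative choices. The timings are only indicative, because the JIT compiler may turn the branch into a conditional move and erase the difference.

```java
import java.util.Arrays;
import java.util.Random;

// Illustrative sketch; all names and sizes here are assumptions, not the article's code.
final class SortedSumDemo {

    // Sums only the elements >= 128. The if is a data-dependent branch:
    // on random data its outcome is hard to predict, on sorted data it is
    // false for a long run and then true for the rest.
    static long sumLarge(int[] data) {
        long sum = 0;
        for (int value : data) {
            if (value >= 128) {
                sum += value;
            }
        }
        return sum;
    }

    // Fills an array with pseudo-random values in [0, 256) from a fixed seed.
    static int[] randomBytes(int length, long seed) {
        Random random = new Random(seed);
        int[] data = new int[length];
        for (int i = 0; i < length; i++) {
            data[i] = random.nextInt(256);
        }
        return data;
    }

    // Crude wall-clock timing; a real benchmark should use a harness with warm-up.
    static long timeNanos(int[] data, int rounds) {
        long checksum = 0;
        long start = System.nanoTime();
        for (int round = 0; round < rounds; round++) {
            checksum += sumLarge(data);
        }
        long elapsed = System.nanoTime() - start;
        if (checksum == -1) {
            System.out.println("unreachable"); // keeps the result observable
        }
        return elapsed;
    }

    public static void main(String[] args) {
        int[] unsorted = randomBytes(32768, 1L);
        int[] sorted = unsorted.clone();
        Arrays.sort(sorted);
        System.out.println("unsorted ns: " + timeNanos(unsorted, 2000));
        System.out.println("sorted ns:   " + timeNanos(sorted, 2000));
    }
}
```

Sorting changes only the order in which the branch outcomes arrive, not the result, so both arrays must produce the same sum.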
ITPUB
Apr 8, 2021 · Fundamentals

Why Ordered Arrays Run 10× Faster: CPU Pipelines and Branch Prediction Explained

This article explains how the invention of assembly‑line manufacturing parallels modern CPU pipelines, why processing an ordered array can be nearly ten times faster than an unordered one, and shows a practical bit‑wise optimization to eliminate costly if‑statements for high‑performance code.

CPU · Pipeline · assembly line
0 likes · 10 min read
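One well-known form of the bit-wise optimization mentioned in the summary above replaces the conditional add with a mask. The sketch below assumes that form and assumes the values lie in a small non-negative range such as 0 to 255, so that `value - 128` cannot overflow. The article's own code may differ, and the class name `BranchlessSum` is hypothetical.

```java
// Illustrative sketch; assumes inputs in a small range (e.g. 0..255) so the subtraction cannot overflow.
final class BranchlessSum {

    // Reference version with the data-dependent if.
    static long sumBranchy(int[] data) {
        long sum = 0;
        for (int value : data) {
            if (value >= 128) {
                sum += value;
            }
        }
        return sum;
    }

    // Same result without a conditional jump in the loop body.
    static long sumBranchless(int[] data) {
        long sum = 0;
        for (int value : data) {
            // Arithmetic shift copies the sign bit into every position:
            // mask is -1 (all ones) when value < 128, and 0 otherwise.
            int mask = (value - 128) >> 31;
            // ~mask is 0 for small values and all ones for large ones,
            // so the AND keeps the value only when value >= 128.
            sum += ~mask & value;
        }
        return sum;
    }

    public static void main(String[] args) {
        int[] sample = {0, 127, 128, 255};
        System.out.println(sumBranchy(sample));
        System.out.println(sumBranchless(sample));
    }
}
```

The branchless version does a little more arithmetic per element, so it tends to pay off only when the branch is genuinely unpredictable.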
vivo Internet Technology
Mar 10, 2021 · Fundamentals

CPU Performance Optimization Using Top‑Down Micro‑architecture Analysis (TMAM)

The article demonstrates how Top‑down Micro‑architecture Analysis Methodology (TMAM) can quickly pinpoint CPU bottlenecks—such as front‑end, back‑end, and bad speculation stalls—in a simple C++ accumulation loop, and shows that applying targeted compiler, alignment, and branch‑prediction optimizations reduces runtime by roughly 34 % while increasing retiring slots.

C · CPU performance · TMAM
0 likes · 20 min read
ITPUB
Jul 31, 2020 · Backend Development

What Java Developers Can Learn from Top StackOverflow Questions: Branch Prediction, Security, Exceptions, and More

This article reviews several of the most popular Java questions on StackOverflow, explaining branch prediction for sorted arrays, why char[] is safer than String for passwords, handling NullPointerException, deterministic random strings, historic timezone quirks, creating an uncatchable exception, and the differences between HashMap, TreeMap and LinkedHashMap, highlighting practical lessons for developers.

Exception Handling · HashMap · Java
0 likes · 10 min read
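Of the topics listed in the summary above, the `char[]` versus `String` point is easy to show in a few lines. A `String` is immutable and stays in memory until it is garbage collected, whereas a `char[]` can be overwritten as soon as it is no longer needed. The sketch below is a minimal illustration; the class `PasswordWipe` and its methods are hypothetical names.

```java
import java.util.Arrays;

// Illustrative sketch; class and method names are hypothetical.
final class PasswordWipe {

    // Overwrites every character so the secret no longer sits in this array.
    static void wipe(char[] secret) {
        Arrays.fill(secret, '\0');
    }

    // Stand-in for real authentication; compares against an expected value.
    static boolean matches(char[] password, char[] expected) {
        return Arrays.equals(password, expected);
    }

    public static void main(String[] args) {
        char[] password = {'s', '3', 'c', 'r', 'e', 't'};
        try {
            System.out.println(matches(password, "s3cret".toCharArray()));
        } finally {
            // Clear the array even if the check throws.
            wipe(password);
        }
    }
}
```

Wiping narrows the window in which the secret is readable in memory; it does not remove copies made elsewhere, such as the `String` literal used for comparison in this demo.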
ITPUB
Oct 8, 2017 · Fundamentals

Mastering Branch Prediction: Techniques to Minimize Branch Overhead in x86 Code

This article explains the different types of CPU branches, how branch prediction works, and presents practical techniques—including branch‑prediction hints, SETcc/CMOVx instructions, and branch‑less coding—to reduce the performance impact of conditional and indirect jumps in x86 programs.

CMOV · CPU pipeline · SETcc
0 likes · 14 min read
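The SETcc idea from the summary above, turning a comparison result into a 0 or 1 that is then used arithmetically, can be imitated in a high-level language. The sketch below does this in Java rather than x86 assembly, so it only approximates what the article describes. It assumes `value - threshold` does not overflow, and the class name `BranchlessCount` is hypothetical.

```java
// Illustrative sketch; assumes value - threshold cannot overflow an int.
final class BranchlessCount {

    // Reference version: one conditional jump per element.
    static int countBelowBranchy(int[] data, int threshold) {
        int count = 0;
        for (int value : data) {
            if (value < threshold) {
                count++;
            }
        }
        return count;
    }

    // SETcc-style version: the comparison becomes a 0/1 integer that is added directly.
    static int countBelowBranchless(int[] data, int threshold) {
        int count = 0;
        for (int value : data) {
            // The difference is negative exactly when value < threshold.
            // The unsigned shift moves its sign bit to bit 0, giving 1 or 0.
            count += (value - threshold) >>> 31;
        }
        return count;
    }

    public static void main(String[] args) {
        int[] sample = {1, 5, 10, -3};
        System.out.println(countBelowBranchy(sample, 5));
        System.out.println(countBelowBranchless(sample, 5));
    }
}
```

A JIT or optimizing compiler may already emit SETcc or CMOV for the branchy version, so the hand-written form is worth keeping only if measurements show a gain.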
ITPUB
Aug 22, 2017 · Fundamentals

Why Adding Unnecessary Sorting Can Triple Your x86 Code Speed – A Deep Dive into Performance Metrics

This article explores x86 performance optimization by comparing a simple sum‑of‑array loop with and without a pre‑sort step, demonstrating how branch prediction and cache behavior can make seemingly redundant code run up to three times faster, and outlines practical benchmarking principles and common pitfalls.

Benchmarking · C programming · CPU cycles
0 likes · 14 min read