Master the Five ‘Optimization Secrets’: Pooling, Sequencing, Batching, Reduction, and Concurrency

This article presents five practical performance-optimization principles: pooling, sequential I/O, batching, reduction, and concurrency. It explains the rationale behind each, gives real-world examples (object pools, ordered reads, batch APIs in MySQL, Redis, and Kafka, sharding strategies, read/write separation, and common concurrency anomalies), and highlights trade-offs and implementation tips.

dbaplus Community

Apr 30, 2022

1. Pooling (池)

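Pooling reduces the cost of repeatedly creating and destroying objects that could be reused. In games, many short-lived enemies are essentially identical, so constructing a new instance for each one is wasteful. An object pool keeps idle instances and hands them out again, which cuts allocation and garbage-collection overhead. Thread pools and connection pools apply the same idea to threads and network connections.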
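A minimal sketch of the object-pool idea, assuming a hypothetical Enemy class and an ObjectPool helper (neither name comes from the article). It only illustrates reuse; a production pool would also need thread safety and object validation.

```python
from collections import deque


class Enemy:
    """Hypothetical game object that is cheap to reset but costly to build."""

    def __init__(self):
        self.hp = 0
        self.position = (0, 0)

    def reset(self, hp, position):
        self.hp = hp
        self.position = position


class ObjectPool:
    """Hands out idle objects and takes them back instead of discarding them."""

    def __init__(self, factory, max_idle=64):
        self._factory = factory
        self._idle = deque()
        self._max_idle = max_idle
        self.created = 0  # how many objects were actually constructed

    def acquire(self):
        if self._idle:
            return self._idle.pop()
        self.created += 1
        return self._factory()

    def release(self, obj):
        # Keep at most max_idle spare objects; extras are left to the GC.
        if len(self._idle) < self._max_idle:
            self._idle.append(obj)


pool = ObjectPool(Enemy)
for wave in range(3):
    enemies = [pool.acquire() for _ in range(10)]
    for enemy in enemies:
        enemy.reset(hp=100, position=(wave, 0))
    for enemy in enemies:
        pool.release(enemy)
```

Three waves of ten enemies are served by ten constructions in total, because the second and third waves reuse the objects released by the first.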

2. Sequential Access (序)

Sequential reads and writes are far faster than random accesses for both memory and disk, because ordered access reduces cache misses and disk seeks. The article cites two examples: game servers that iterate over an ordered snapshot instead of a hash table, and Kafka, which appends to its log sequentially. According to the article, with per-write fsync disabled, sequential SSD throughput in Kafka can reach roughly 1 GiB/s, a figure it likens to memory speed, whereas throughput for random reads drops dramatically.
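The sequential-write pattern can be sketched as an append-only log. This is an illustrative assumption, not Kafka's actual storage format: the AppendOnlyLog class, the 4-byte length prefix, and the file name are all hypothetical.

```python
import os
import struct
import tempfile


class AppendOnlyLog:
    """Stores length-prefixed records; writes only ever go to the end of the file."""

    def __init__(self, path):
        self._path = path
        self._file = open(path, "ab")

    def append(self, payload: bytes) -> None:
        # 4-byte big-endian length header followed by the payload.
        self._file.write(struct.pack(">I", len(payload)) + payload)

    def flush(self, durable=False):
        # flush() hands data to the OS; fsync additionally forces it to disk.
        self._file.flush()
        if durable:
            os.fsync(self._file.fileno())

    def close(self):
        self._file.close()

    def read_all(self):
        """Scans the file front to back, which is a sequential read."""
        records = []
        with open(self._path, "rb") as f:
            while True:
                header = f.read(4)
                if len(header) < 4:
                    break
                (size,) = struct.unpack(">I", header)
                records.append(f.read(size))
        return records


log_path = os.path.join(tempfile.mkdtemp(), "events.log")
log = AppendOnlyLog(log_path)
for i in range(5):
    log.append(f"event-{i}".encode())
log.flush()  # durable=False: faster, but data may be lost on power failure
records = log.read_all()
log.close()
```

The durable flag makes the trade-off explicit: skipping fsync keeps writes fast, at the price of losing recent records if the machine loses power before the OS writes them out.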

3. Batching (分)

Batching combines multiple operations into a single I/O, reducing the number of disk syncs and network round‑trips. It can be viewed as “cache + buffer”: caches accelerate reads, buffers accelerate writes. Common batch APIs include:

MySQL INSERT … VALUES with multiple rows.

Redis PIPELINE for bulk commands.

Elasticsearch _bulk for indexing/deleting many documents.

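Kafka producers accumulate messages in memory and transmit them in batches. The producer's send() call accepts one record at a time, but that record is queued into a batch rather than sent on its own.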
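The batch APIs listed above share one shape: accumulate, then flush in a single call. Below is a rough sketch using Python's built-in sqlite3 module as a stand-in for MySQL. The BatchingWriter class and the metrics table are assumptions made for illustration, and executemany is used here in place of a literal multi-row VALUES statement.

```python
import sqlite3


class BatchingWriter:
    """Buffers rows in memory and writes them in one transaction per batch."""

    def __init__(self, conn, batch_size=100):
        self._conn = conn
        self._batch_size = batch_size
        self._buffer = []
        self.flushes = 0

    def add(self, row):
        self._buffer.append(row)
        if len(self._buffer) >= self._batch_size:
            self.flush()

    def flush(self):
        if not self._buffer:
            return
        # One transaction and one executemany call for the whole batch.
        with self._conn:
            self._conn.executemany(
                "INSERT INTO metrics (name, value) VALUES (?, ?)", self._buffer
            )
        self._buffer.clear()
        self.flushes += 1


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE metrics (name TEXT, value REAL)")
writer = BatchingWriter(conn, batch_size=100)
for i in range(250):
    writer.add((f"m{i}", float(i)))
writer.flush()  # push the final partial batch
row_count = conn.execute("SELECT COUNT(*) FROM metrics").fetchone()[0]
```

Here 250 rows cost three commits instead of 250. Rows still sitting in the buffer are lost if the process dies before flush, which is the durability trade-off batching introduces.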

Batching also appears as frame‑level processing (e.g., Redis incremental rehash, Redis SCAN/HSCAN for paginated reads) and time‑sharing (IO multiplexing, reactor pattern). The article warns that batching can weaken data consistency: caches may become stale without strict guarantees, and buffers can lose data on process crash, so designers must weigh durability versus performance.
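Cursor-based pagination of the kind SCAN offers can be approximated as follows. This is a simplified sketch over a list snapshot, not Redis's real cursor algorithm; the scan function and its parameters are hypothetical, and only the convention that a returned cursor of 0 ends the iteration is borrowed from Redis.

```python
def scan(snapshot, cursor=0, count=3):
    """Returns (next_cursor, batch); next_cursor is 0 when the scan is done."""
    batch = snapshot[cursor:cursor + count]
    next_cursor = cursor + count
    if next_cursor >= len(snapshot):
        next_cursor = 0
    return next_cursor, batch


keys = [f"user:{i}" for i in range(8)]
seen = []
cursor = 0
rounds = 0
while True:
    # Each call does a small, bounded amount of work, so other
    # requests can be served between calls.
    cursor, batch = scan(keys, cursor, count=3)
    seen.extend(batch)
    rounds += 1
    if cursor == 0:
        break
```

Splitting one large read into bounded steps keeps any single call short, which matters most for a single-threaded server where one long operation blocks every other client.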

3.1 Sharding, Partitioning, Database Splitting

Horizontal scaling splits data across shards, partitions, databases, or tables. Routing determines which key maps to which shard. Two routing families exist:

Non‑deterministic (random, round‑robin) for stateless task distribution.

Deterministic (range, hash, config tables) for stateful workloads such as Kafka partitions.
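The two routing families can be sketched as small router factories. All function names, shard counts, and boundaries below are illustrative assumptions, not taken from any particular system.

```python
import bisect
import itertools
import zlib


def round_robin_router(workers):
    """Non-deterministic with respect to the task: just cycles through workers."""
    cycle = itertools.cycle(workers)

    def route(_task):
        return next(cycle)

    return route


def hash_router(num_shards):
    """Deterministic: the same key always lands on the same shard."""

    def route(key):
        # crc32 is stable across processes, unlike Python's salted str hash().
        return zlib.crc32(key.encode()) % num_shards

    return route


def range_router(upper_bounds):
    """Deterministic: shard i holds keys below upper_bounds[i];
    keys at or above the last bound go to one extra, final shard."""

    def route(key):
        return bisect.bisect_right(upper_bounds, key)

    return route


pick_worker = round_robin_router(["w0", "w1", "w2"])
assignments = [pick_worker(task) for task in range(5)]

by_hash = hash_router(4)
by_range = range_router([100, 200, 300])
```

Round-robin spreads load evenly but cannot find a key again later, so it suits stateless tasks. Hash and range routing always map a key to the same place, which stateful data requires. Range routing also keeps neighboring keys together, which helps range scans but can create hot spots.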

MySQL supports Range, List, Hash, and Key partition types. In-process sharding also appears in Java's pre-1.8 ConcurrentHashMap, which divides the map into segments (16 by default), each guarded by its own lock, to reduce lock contention.
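The segment-locking idea can be illustrated with a small striped map. This is a sketch of the concept only, not the JDK implementation; the StripedCounterMap class and its methods are hypothetical.

```python
import threading
import zlib


class StripedCounterMap:
    """A dict of counters guarded by N locks instead of one global lock."""

    def __init__(self, num_segments=16):
        self._segments = [({}, threading.Lock()) for _ in range(num_segments)]

    def _segment_for(self, key):
        return self._segments[zlib.crc32(key.encode()) % len(self._segments)]

    def increment(self, key, delta=1):
        data, lock = self._segment_for(key)
        with lock:  # only threads hitting the same segment contend here
            data[key] = data.get(key, 0) + delta

    def get(self, key):
        data, lock = self._segment_for(key)
        with lock:
            return data.get(key, 0)


counters = StripedCounterMap()


def worker():
    for i in range(1000):
        counters.increment(f"key-{i % 8}")


threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
total = sum(counters.get(f"key-{k}") for k in range(8))
```

Threads that touch keys in different segments never wait on each other, while updates to the same key stay serialized by that segment's lock.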

3.2 Separation (分离)

Separation isolates responsibilities: read/write separation, compute/storage separation, and service‑level separation. Read/write separation decouples read paths from write paths, allowing independent scaling. Event streams act as the glue between write‑side services and read‑side consumers, enabling “Unix‑style” pipelines (e.g., mysql | elasticsearch).
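A toy version of read/write separation connected by an event stream might look like this. The EventLog, OrderWriter, and SpendByCustomerView classes are invented for illustration; in a real deployment the log would be something like a database changelog or a message topic.

```python
from collections import defaultdict


class EventLog:
    """In-memory stand-in for an event stream."""

    def __init__(self):
        self._events = []

    def append(self, event):
        self._events.append(event)

    def read_from(self, offset):
        return self._events[offset:]


class OrderWriter:
    """Write side: validates the command and records what happened."""

    def __init__(self, log):
        self._log = log

    def place_order(self, customer, amount):
        if amount <= 0:
            raise ValueError("amount must be positive")
        self._log.append(
            {"type": "order_placed", "customer": customer, "amount": amount}
        )


class SpendByCustomerView:
    """Read side: a query-optimized projection built from the stream."""

    def __init__(self, log):
        self._log = log
        self._offset = 0
        self._totals = defaultdict(int)

    def catch_up(self):
        for event in self._log.read_from(self._offset):
            if event["type"] == "order_placed":
                self._totals[event["customer"]] += event["amount"]
            self._offset += 1

    def total_for(self, customer):
        return self._totals.get(customer, 0)


log = EventLog()
writer = OrderWriter(log)
view = SpendByCustomerView(log)

writer.place_order("alice", 30)
writer.place_order("bob", 5)
writer.place_order("alice", 12)
stale = view.total_for("alice")  # the view has not consumed anything yet
view.catch_up()
fresh = view.total_for("alice")
```

The read side lags the write side until it catches up, so this arrangement buys independent scaling at the cost of eventual rather than immediate consistency.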

4. Reduction (减)

Early‑stage systems often don’t need heavy optimization; simple designs win. When performance bottlenecks appear, cheap wins include adding indexes, archiving stale data, or reducing data volume. The article stresses that trimming data (similar to Java GC) can dramatically speed up queries without major refactoring.
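Archiving stale rows and adding an index can be sketched with sqlite3. The table names, the created_day column, and the cutoff value are assumptions chosen for the example, not details from the article.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE orders (id INTEGER PRIMARY KEY, created_day INTEGER, total REAL);
    CREATE TABLE orders_archive (id INTEGER PRIMARY KEY, created_day INTEGER, total REAL);
    CREATE INDEX idx_orders_created_day ON orders (created_day);
    """
)
conn.executemany(
    "INSERT INTO orders (created_day, total) VALUES (?, ?)",
    [(day, 10.0) for day in range(1, 366)],
)
conn.commit()


def archive_older_than(conn, cutoff_day):
    """Moves rows with created_day < cutoff_day into the archive in one transaction."""
    with conn:
        conn.execute(
            "INSERT INTO orders_archive SELECT * FROM orders WHERE created_day < ?",
            (cutoff_day,),
        )
        moved = conn.execute(
            "DELETE FROM orders WHERE created_day < ?", (cutoff_day,)
        ).rowcount
    return moved


moved = archive_older_than(conn, cutoff_day=336)
hot_rows = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
cold_rows = conn.execute("SELECT COUNT(*) FROM orders_archive").fetchone()[0]
```

After the move, queries against orders touch 30 rows instead of 365, and the copy and delete happen in one transaction so a failure cannot leave a row in both tables or in neither.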

5. Concurrency (并)

Concurrency is the most complex of the five tips. It encompasses multi‑threading, lock management, and database isolation anomalies (dirty read, non‑repeatable read, phantom read, etc.). The article notes that DB transactions (ACID) center on isolation, and modern systems face up to seven classic anomalies.
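To make one such anomaly concrete, here is a deterministic simulation of a lost update. The article does not name this anomaly explicitly, but it is commonly listed alongside the ones above. The VersionedStore class and the deposit helper are hypothetical, and the version check is one possible remedy (optimistic concurrency), not the only one.

```python
class VersionedStore:
    """A single-value store whose writes can be conditional on a version."""

    def __init__(self, value):
        self.value = value
        self.version = 0

    def read(self):
        return self.value, self.version

    def write(self, value):
        # Unconditional write: the last writer silently wins.
        self.value = value
        self.version += 1

    def compare_and_set(self, value, expected_version):
        # Conditional write: rejected if someone else wrote in between.
        if self.version != expected_version:
            return False
        self.value = value
        self.version += 1
        return True


# Lost update: A and B both read 100, then each writes back its read plus a deposit.
account = VersionedStore(100)
a_seen, _ = account.read()
b_seen, _ = account.read()
account.write(a_seen + 50)
account.write(b_seen + 20)
lost_update_balance = account.value  # A's deposit of 50 has vanished


def deposit(store, amount):
    """Retries the read-modify-write until no other write interferes."""
    while True:
        current, version = store.read()
        if store.compare_and_set(current + amount, version):
            return


# Same interleaving with version checks: B's stale write is rejected.
account = VersionedStore(100)
a_seen, a_ver = account.read()
b_seen, b_ver = account.read()
a_ok = account.compare_and_set(a_seen + 50, a_ver)
b_ok = account.compare_and_set(b_seen + 20, b_ver)
if not b_ok:
    deposit(account, 20)  # B retries against the fresh value
safe_balance = account.value
```

The interleaving is written out by hand rather than produced by real threads so that the anomaly shows up on every run.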

Two main scaling paths are described:

Vertical scaling via multi‑threading when hardware is under‑utilized.

Horizontal scaling (scale‑out) via sharding, partitioning, or distributed databases when a single node hits its limits.
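The first path, using threads to keep under-utilized hardware busy, can be sketched with the standard library's thread pool. The fetch function is a hypothetical I/O-bound call, with a sleep standing in for network latency.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fetch(resource_id):
    """Hypothetical I/O-bound call; sleep stands in for network latency."""
    time.sleep(0.05)
    return resource_id * 2


ids = list(range(8))

with ThreadPoolExecutor(max_workers=4) as executor:
    # map() preserves input order even though the calls overlap in time.
    results = list(executor.map(fetch, ids))
```

With four workers the eight calls overlap while they wait, so the total wall time should be closer to two rounds of latency than eight. This helps only while the work is waiting on I/O. CPU-bound work in CPython would need processes instead of threads.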

Introducing concurrency raises code complexity and consistency challenges, making it a last‑resort optimization after simpler techniques have been exhausted.

Conclusion

Continuous, incremental optimization—likened to “splitting pancakes”—is a sustainable habit. Simpler, cheaper techniques (pooling, sequential I/O, batching, reduction, careful concurrency) should be applied first; more invasive changes like full architectural separation are reserved for when performance demands outgrow these basics.
