Optimizing High‑Concurrency Services: Practical Strategies for QPS Over 200k
This article outlines practical techniques for handling online services with QPS exceeding 200,000, including avoiding relational databases, employing multi‑level caching, leveraging multithreading, implementing degradation and circuit‑breaker patterns, optimizing I/O, using controlled retries, handling edge cases, and logging efficiently.
When an online service receives more than 200,000 queries per second (QPS), it faces challenges such as the inability to use offline caching, strict response‑time requirements (typically under 300 ms), and massive data volumes that stress storage and access layers.
1. Say No to Relational Databases – Large‑scale consumer‑facing services should not rely on MySQL/Oracle as the primary store; instead, they should use NoSQL caches like Redis or Memcached for hot data, while relational databases serve as asynchronous backups for less‑frequent queries.
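The read path described here is the classic cache-aside pattern. A minimal sketch, using a `ConcurrentHashMap` as a stand-in for Redis and a plain map as a stand-in for the relational backup store (both names are illustrative, not from the article):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Cache-aside: serve hot reads from the cache; fall back to the slower
// relational backup only on a miss, then back-fill the cache.
public class CacheAside {
    static final Map<String, String> redisStandIn = new ConcurrentHashMap<>();
    static final Map<String, String> dbStandIn = new ConcurrentHashMap<>();

    public static String get(String key) {
        String v = redisStandIn.get(key);          // fast path: cache hit
        if (v != null) return v;
        v = dbStandIn.get(key);                    // slow path: relational backup
        if (v != null) redisStandIn.put(key, v);   // back-fill for future reads
        return v;
    }
}
```

In a real deployment the back-fill would carry a TTL, and writes to the relational store would happen asynchronously off the request path.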
2. Multi‑Level Cache – Combine a local in‑process memory cache, a thread‑safe cache layer (e.g., MemoryCache), and Redis to absorb millions of QPS, mitigating cache‑penetration and cache‑stampede problems that are especially acute in flash‑sale scenarios.
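One way to sketch a two-level cache is a small in-process LRU (here built on `LinkedHashMap`'s access-order eviction) in front of a shared tier, with a map standing in for Redis; capacity and class names are illustrative assumptions:

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// L1: small in-process LRU; L2: shared cache tier (Redis stand-in).
// A lookup only crosses the network when L1 misses.
public class MultiLevelCache {
    static final int L1_CAPACITY = 1000;
    static final Map<String, String> l1 = Collections.synchronizedMap(
        new LinkedHashMap<String, String>(16, 0.75f, true) {
            protected boolean removeEldestEntry(Map.Entry<String, String> e) {
                return size() > L1_CAPACITY;   // evict least-recently-used
            }
        });
    static final Map<String, String> l2 = new ConcurrentHashMap<>();

    public static String get(String key) {
        String v = l1.get(key);
        if (v != null) return v;        // L1 hit: no network round trip
        v = l2.get(key);
        if (v != null) l1.put(key, v);  // promote the hot key into L1
        return v;
    }
}
```

Keeping L1 small and bounded is the point: it shields the shared tier from the hottest keys without risking unbounded heap growth.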
3. Multithreading – Replace synchronous loops that read Redis (≈3 ms per call) with a thread‑pool implementation; this can reduce an operation from >30 seconds to a few seconds, but thread pool size and queue depth must be tuned and monitored.
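The fan-out can be sketched with a fixed `ExecutorService`: N independent lookups run in parallel, so total latency is roughly ceil(N / poolSize) × per-call latency instead of N × per-call latency. The pool size and the `slowGet` stand-in (simulating the ~3 ms Redis round trip) are illustrative assumptions:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Fan independent cache lookups out to a bounded thread pool instead
// of reading them one by one in a synchronous loop.
public class ParallelLookup {
    public static List<String> fetchAll(List<String> keys) {
        ExecutorService pool = Executors.newFixedThreadPool(32); // size must be tuned
        try {
            List<Callable<String>> tasks = new ArrayList<>();
            for (String k : keys) tasks.add(() -> slowGet(k));
            List<String> out = new ArrayList<>();
            for (Future<String> f : pool.invokeAll(tasks)) out.add(f.get());
            return out;
        } catch (InterruptedException | ExecutionException e) {
            throw new RuntimeException("parallel lookup failed", e);
        } finally {
            pool.shutdown();
        }
    }

    static String slowGet(String key) throws InterruptedException {
        Thread.sleep(3);            // stands in for a ~3 ms Redis round trip
        return "value-of-" + key;
    }
}
```

As the article warns, the pool size and queue depth here are exactly the knobs that need tuning and monitoring; an oversized pool just moves the contention downstream.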
4. Degradation and Circuit‑Breaker – Use degradation to disable non‑critical features under overload, and circuit‑breaker to stop cascading failures when downstream services become saturated, protecting the system from collapse.
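A minimal count-based circuit breaker illustrates the mechanism (production systems typically use a library such as Sentinel or Resilience4j; the threshold and cool-down values here are illustrative):

```java
// After `threshold` consecutive failures the breaker opens and calls
// fail fast until `coolDownMs` has elapsed, giving the saturated
// downstream service a chance to recover.
public class CircuitBreaker {
    private final int threshold;
    private final long coolDownMs;
    private int failures = 0;
    private long openedAt = -1;     // -1 means the breaker is closed

    public CircuitBreaker(int threshold, long coolDownMs) {
        this.threshold = threshold;
        this.coolDownMs = coolDownMs;
    }

    public synchronized boolean allowRequest() {
        if (openedAt < 0) return true;                        // closed: pass through
        if (System.currentTimeMillis() - openedAt >= coolDownMs) {
            openedAt = -1;                                    // half-open: allow a probe
            failures = 0;
            return true;
        }
        return false;                                         // open: fail fast
    }

    public synchronized void recordSuccess() { failures = 0; }

    public synchronized void recordFailure() {
        if (++failures >= threshold) openedAt = System.currentTimeMillis();
    }
}
```

When `allowRequest()` returns false, the caller serves the degraded path (cached defaults, an empty feed, etc.) instead of waiting on the struggling dependency.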
5. I/O Optimization – Batch downstream calls to avoid exponential I/O growth; reducing the number of external requests per user request dramatically improves latency under high traffic.
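The difference is easy to see with a counter on the downstream round trips. The `remoteBatchGet` stand-in below plays the role of a batched downstream API (in the Redis case, MGET); the method names are illustrative:

```java
import java.util.Collection;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// One downstream round trip for the whole key set instead of one per key.
public class BatchedFetch {
    static int callCount = 0;   // counts downstream round trips

    // stand-in for a downstream service that supports batched reads
    static Map<String, String> remoteBatchGet(Collection<String> keys) {
        callCount++;
        Map<String, String> out = new HashMap<>();
        for (String k : keys) out.put(k, "v:" + k);
        return out;
    }

    // naive: N round trips for N keys; latency grows with fan-out
    public static Map<String, String> fetchOneByOne(List<String> keys) {
        Map<String, String> out = new HashMap<>();
        for (String k : keys) out.putAll(remoteBatchGet(Collections.singletonList(k)));
        return out;
    }

    // batched: a single round trip regardless of key count
    public static Map<String, String> fetchBatched(List<String> keys) {
        return remoteBatchGet(keys);
    }
}
```

At 200k QPS the distinction compounds: if each user request fans out to 20 keys, batching turns 4 million downstream calls per second into 200k.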
6. Controlled Retry – Implement retries with configurable limits and back‑off intervals; excessive retries can cause severe lag (e.g., Kafka consumer lag), so they must be carefully managed.
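A bounded retry wrapper with exponential back-off might look like the following sketch (the attempt limit, base delay, and delay cap are illustrative knobs):

```java
import java.util.concurrent.Callable;

// Bounded retry with exponential back-off: a hard attempt limit keeps a
// flaky downstream from turning into unbounded retry pressure.
public class BoundedRetry {
    public static <T> T call(Callable<T> op, int maxAttempts, long baseDelayMs) {
        long delay = baseDelayMs;
        for (int attempt = 1; ; attempt++) {
            try {
                return op.call();
            } catch (Exception e) {
                if (attempt >= maxAttempts) {
                    throw new RuntimeException("retries exhausted", e); // give up
                }
                try {
                    Thread.sleep(delay);               // back off before retrying
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    throw new RuntimeException(ie);
                }
                delay = Math.min(delay * 2, 1000L);    // double the delay, capped
            }
        }
    }
}
```

The cap matters as much as the limit: without it, a burst of failures stacks sleeping threads (or, in a Kafka consumer, stalls the poll loop) and the retries themselves become the source of lag.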
7. Edge‑Case Handling and Fallbacks – Guard against null or empty inputs and other boundary conditions; missing checks can lead to massive data leaks and service outages.
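The pattern reduces to guard clauses plus a safe fallback; a tiny sketch (the `recommend`/`lookup` names are hypothetical):

```java
import java.util.Collections;
import java.util.List;

// Validate inputs up front and return a safe fallback instead of
// letting a null or empty request reach the storage layer.
public class GuardedQuery {
    public static List<String> recommend(String userId) {
        if (userId == null || userId.trim().isEmpty()) {
            return Collections.emptyList();   // safe fallback, never null
        }
        return lookup(userId);
    }

    static List<String> lookup(String userId) {
        // stand-in for the real storage/recommendation call
        return Collections.singletonList("feed-for-" + userId);
    }
}
```

Returning an empty collection rather than null also spares every caller from repeating the same null check.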
8. Graceful Logging – Apply rate‑limited or whitelist‑based logging to prevent disk overflow and I/O contention; token‑bucket algorithms can restrict log output to a manageable level.
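The token-bucket idea mentioned here can be sketched as a small limiter in front of the logger: tokens refill at a fixed rate up to a burst ceiling, and a log line is only emitted if a token is available. Rate and burst values are illustrative:

```java
// Token-bucket log limiter: at most ~ratePerSec log lines per second
// (plus a small burst), so a hot error path cannot flood the disk.
public class RateLimitedLogger {
    private final double ratePerSec;
    private final double burst;
    private double tokens;
    private long lastRefillNanos;

    public RateLimitedLogger(double ratePerSec, double burst) {
        this.ratePerSec = ratePerSec;
        this.burst = burst;
        this.tokens = burst;
        this.lastRefillNanos = System.nanoTime();
    }

    public synchronized boolean tryLog(String message) {
        long now = System.nanoTime();
        tokens = Math.min(burst,
            tokens + (now - lastRefillNanos) / 1e9 * ratePerSec);  // refill
        lastRefillNanos = now;
        if (tokens < 1.0) return false;   // over budget: drop the line
        tokens -= 1.0;
        System.out.println(message);      // in production: hand off to the logger
        return true;
    }
}
```

Counting dropped lines (and logging that count at a low rate) keeps the limiter from silently hiding an incident.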
In summary, these eight recommendations provide a foundational checklist for building resilient, high‑performance online services, though real‑world systems may require additional, more complex measures.
Architecture Digest
Focusing on Java backend development, covering application architecture from top-tier internet companies (high availability, high performance, high stability), big data, machine learning, Java architecture, and other popular fields.