Backend Development

Building Scalable High‑Concurrency Backend Systems: Guarding the Baseline, Raising Throughput, and Horizontal Expansion

This article shares practical guidance on designing, protecting, and continuously improving high‑concurrency backend services—covering baseline capacity, rate limiting, data‑structure optimization, stateless architecture, and horizontal scaling—to help engineers evolve small systems into robust, production‑grade platforms.

JD Tech

Author Introduction

Liu Shenbao — JD Finance R&D Department Architect

Liu has over ten years of experience in internet R&D, focusing on core components and technical solutions for the finance R&D division. He has led major architecture upgrades, database migrations, and performance‑critical projects such as settlement engine refactoring and order ledger optimization.

Introduction

High‑concurrency architecture talks are often geared toward massive systems, leaving smaller teams wondering how to bridge the gap between “beginner” and “expert”. This article focuses on that middle path: guard the capacity you have, raise it steadily, and scale out only when traffic justifies it.

First: Guard Your Baseline

Baseline? The maximum processing capacity of a single instance.

Single Instance refers to one application, cache, or storage instance.

How is the baseline determined? Through load testing.

Is the baseline fixed? No — it must be re‑measured whenever the service architecture changes.

Example: a Java + DB instance can handle 500 req/s; after caching, the peak can rise to 5,000 req/s, but if the cache fails the system must fall back to the original 500 req/s.
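The arithmetic behind that example can be sketched directly. The 90% cache hit ratio below is an assumed figure, chosen because it reproduces the 500 → 5,000 req/s numbers above:

```python
def max_sustainable_rate(db_baseline: float, cache_hit_ratio: float) -> float:
    """With a cache absorbing `cache_hit_ratio` of requests, only the
    misses reach the database, so the front-door rate can rise accordingly."""
    if not 0.0 <= cache_hit_ratio < 1.0:
        raise ValueError("hit ratio must be in [0, 1) for a finite bound")
    return db_baseline / (1.0 - cache_hit_ratio)

# A DB-backed instance handles 500 req/s on its own; an assumed 90% cache
# hit ratio lifts the sustainable peak to 5,000 req/s.
peak = max_sustainable_rate(500, 0.9)

# If the cache fails, the hit ratio drops to 0 and the safe limit falls
# back to the original 500 req/s baseline.
fallback = max_sustainable_rate(500, 0.0)
```

The point of the fallback line is that the rate limit you rely on must track the weakest surviving layer, not the best case.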

Can’t meet the baseline? The system may be overwhelmed, leading to a cascade of 502 errors.

How to protect the baseline? Rate limiting, rate limiting, and rate limiting!

Rate Limiting is the fundamental safeguard of a stable system and should never be ignored.

Unexpected traffic spikes are unpredictable; robust rate‑limiting mechanisms are essential.
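As one illustration of such a mechanism, here is a minimal token‑bucket limiter — a sketch, not a production implementation; the rate and burst figures are placeholders:

```python
import time

class TokenBucket:
    """Minimal token-bucket limiter: tokens refill at `rate` per second
    up to `capacity`; each request spends one token or is rejected."""
    def __init__(self, rate: float, capacity: float):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Refill in proportion to elapsed time, never past capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False

# Placeholder numbers: a 500 req/s baseline with a burst allowance of 50.
limiter = TokenBucket(rate=500, capacity=50)
```

A token bucket absorbs short bursts up to its capacity while enforcing the average rate, which fits the goal here: protect the baseline without rejecting every normal spike.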

Adjusting Online Rate Limits should be based on monitoring and traffic segmentation.

Monitoring granularity must match the depth of rate‑limiting layers.

We have built a traffic analysis platform that allows custom rule definitions for traffic splitting reports and fine‑grained flow control.
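The platform's actual rule format is not shown in this article, but per‑segment flow control can be sketched as independent thresholds keyed by traffic type; the segment names and limits below are illustrative:

```python
import time
from collections import defaultdict

class SegmentedLimiter:
    """Fixed-window limiter with an independent threshold per traffic
    segment (e.g. per API, per client type)."""
    def __init__(self, thresholds: dict, window: float = 1.0):
        self.thresholds = thresholds       # segment -> max requests per window
        self.window = window
        self.counts = defaultdict(int)
        self.window_start = time.monotonic()

    def allow(self, segment: str) -> bool:
        now = time.monotonic()
        if now - self.window_start >= self.window:
            self.counts.clear()            # new window: reset all counters
            self.window_start = now
        limit = self.thresholds.get(segment, 0)   # unknown segments rejected
        if self.counts[segment] < limit:
            self.counts[segment] += 1
            return True
        return False

# Illustrative rule set: interactive web traffic gets a larger budget
# than batch traffic within the same instance.
rules = SegmentedLimiter({"web": 100, "batch": 10})
```

This is the "traffic segmentation" half of the advice: the limiter is only as useful as the monitoring that tells you which segment is actually spiking.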

Second: Continuously Raise the Baseline

Increase Per‑Instance Throughput

Optimize data structures and employ caching to boost the maximum throughput of a single instance.

Static web traffic splitting: page staticization, app‑side caching, CDN distribution.

Data caching: preload configuration data locally, hot‑data Redis pre‑load (note that cache also has throughput and capacity limits).

Process simplification: split order‑taking and production workflows.

Request cleanup: enable gzip compression, remove unnecessary AJAX payloads.
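The data‑caching item above can be sketched as a cache‑aside store with an explicit capacity bound — since, as noted, the cache has its own limits. `load_from_db` is a hypothetical stand‑in for the real data source:

```python
from collections import OrderedDict

class BoundedCache:
    """Cache-aside with an LRU capacity bound: misses fall through to the
    loader, and the least-recently-used entry is evicted when full."""
    def __init__(self, loader, capacity: int):
        self.loader = loader          # slow path: the source of truth
        self.capacity = capacity
        self.data = OrderedDict()

    def preload(self, keys):
        for k in keys:                # warm hot keys before traffic arrives
            self.get(k)

    def get(self, key):
        if key in self.data:
            self.data.move_to_end(key)        # mark as recently used
            return self.data[key]
        value = self.loader(key)              # miss: hit the slow path
        self.data[key] = value
        if len(self.data) > self.capacity:
            self.data.popitem(last=False)     # evict least-recently-used
        return value

# `load_from_db` is hypothetical; it stands in for the real DB call and
# records each call so cache hits are observable.
calls = []
def load_from_db(key):
    calls.append(key)
    return f"value:{key}"

cache = BoundedCache(load_from_db, capacity=2)
cache.preload(["config:a", "config:b"])
```

After the preload, reads of the warmed keys never touch the database, which is exactly how local configuration pre‑loading raises the per‑instance throughput.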

Statelessness

Design for high cohesion and low coupling; package application + DB as an externally extensible service module.

Three standards for such a module:

Multi‑source internal data dependencies.

Decoupled upstream interfaces.

Automatic downstream data propagation.
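A minimal sketch of what statelessness means at the code level, assuming session data lives in an external store (Redis, a DB) rather than in instance memory; the handler and store here are illustrative, not the article's pricing system:

```python
# The shared store stands in for an external Redis/DB; the handler itself
# keeps no instance state, so any node in the cluster can serve any request.
def handle_request(session_store: dict, session_id: str, action: str) -> str:
    """Everything the handler needs arrives via the request or the
    external store -- nothing lives in per-instance memory."""
    session = session_store.setdefault(session_id, {"visits": 0})
    if action == "visit":
        session["visits"] += 1
    return f"user {session_id}: {session['visits']} visits"

# Two "instances" sharing one external store behave identically.
shared_store = {}
first = handle_request(shared_store, "u1", "visit")   # served by instance A
second = handle_request(shared_store, "u1", "visit")  # served by instance B
```

Because no request depends on which instance handled the previous one, adding or removing instances never loses state — the precondition for the horizontal expansion discussed next.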

Example: pricing‑protection system workflow diagram.

Finally: Orchestrate Change and Horizontal Expansion

Monitoring as the eyes – monitoring systems give you comprehensive, real‑time visibility into traffic.

Process optimization – keep business processes simple; split order‑taking and processing to allocate resources flexibly.

Horizontal scaling – container + DB clusters enable dynamic scaling; external dependency failures can be switched over seamlessly.
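Once instances are stateless, the scaling decision reduces to simple capacity arithmetic against the measured baseline. The 20% headroom and traffic figures below are assumptions for illustration, not recommendations:

```python
import math

def instances_needed(peak_rps: float, baseline_rps: float,
                     headroom: float = 0.2) -> int:
    """Divide expected peak traffic by the per-instance baseline, with
    headroom so losing one instance doesn't immediately breach the baseline."""
    if baseline_rps <= 0:
        raise ValueError("baseline must be positive")
    return math.ceil(peak_rps * (1 + headroom) / baseline_rps)

# E.g. an assumed 12,000 req/s peak against the 500 req/s baseline,
# with 20% headroom, calls for 29 instances.
n = instances_needed(12_000, 500)
```

The same formula, re‑run continuously against live traffic, is what a container platform's autoscaler effectively computes.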

System Leveling Standards

Single instance survivability – combine DB‑level and application‑level rate limiting so that under heavy load the service degrades gracefully instead of crashing.

Typed rate limiting – configure different thresholds for different business flows based on fine‑grained monitoring.

Cluster expansion (DB + instance) – package DB and instances into a cluster that can be auto‑scaled according to traffic.

External dependency obliviousness – pre‑fetch and cache critical data so that when external APIs fail the core system continues to operate.
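The pre‑fetch‑and‑cache idea can be sketched as a client that refreshes on a schedule but never discards its last good copy; the `fetch` callable is a hypothetical stand‑in for the external API:

```python
class ResilientClient:
    """Pre-fetch critical data from an external dependency and keep the
    last good copy, so the core flow survives a dependency outage."""
    def __init__(self, fetch):
        self.fetch = fetch       # callable hitting the external API
        self.last_good = None

    def refresh(self):
        """Called periodically (e.g. by a background job)."""
        try:
            self.last_good = self.fetch()   # success: replace the copy
        except Exception:
            pass                 # dependency down: keep the stale copy

    def get(self):
        if self.last_good is None:
            raise RuntimeError("no data pre-fetched yet")
        return self.last_good
```

Serving slightly stale data is the deliberate trade‑off here: the core system keeps operating through an outage that would otherwise cascade.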

Each stage should evolve with traffic growth; over‑design leads to higher development and operational costs and harms system stability.

Brainstorm

How do top‑level experts forge their skills? What equipment and tactics enable systems to tackle “over‑level” challenges?

What’s the next step for your system?



Tags: backend, monitoring, microservices, scalability, high concurrency, rate limiting
Written by JD Tech

Official JD technology sharing platform. All the cutting‑edge JD tech, innovative insights, and open‑source solutions you’re looking for, all in one place.
