Designing High‑Concurrency Systems: Key Strategies Interviewers Expect

This guide explains practical techniques—service splitting, caching, message queues, database sharding, read/write separation, and Elasticsearch—to design high‑concurrency back‑end systems that impress interviewers and handle real‑world traffic spikes.

JavaEdge

Service Splitting (Micro‑service Decomposition)

Break a monolithic application into several independent services. Each service owns its own database schema (or even a separate physical database). This isolates traffic, reduces contention on a single DB, and allows each service to be scaled, deployed, and tuned independently. Typical steps:

Identify bounded contexts (e.g., user, order, inventory).

Extract each context into a separate codebase or module.

Provision an independent relational (or NoSQL) instance for each service.

Define API contracts (REST, gRPC) for inter‑service communication.

Cache Layer

High‑concurrency workloads are usually read‑heavy. Deploy an in‑memory cache such as Redis or Memcached to store frequently accessed data.

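Write path: a write‑through strategy updates the cache and the DB together, keeping them consistent; write‑behind acknowledges the write from the cache and flushes to the DB later, so it is only eventually consistent and can lose data if the cache fails before the flush.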

Read path: application queries cache first; on miss, fetch from DB and populate cache.

Typical cache TTL values range from seconds to minutes depending on data freshness requirements.

Redis can handle tens of thousands of QPS on a single node; cluster mode scales horizontally.
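The read path and TTL behaviour described above can be sketched as follows. This is a minimal illustration, not a production cache: the ConcurrentHashMap stands in for Redis or Memcached, and the names CacheAside, dbLoader, and clock are hypothetical, introduced only for this example.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.LongSupplier;

// Cache-first read path with a per-entry TTL.
// Assumption: the in-memory map is a stand-in for Redis/Memcached.
class CacheAside<K, V> {
    private static final class Entry<V> {
        final V value;
        final long expiresAtMillis;

        Entry(V value, long expiresAtMillis) {
            this.value = value;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final Map<K, Entry<V>> cache = new ConcurrentHashMap<>();
    private final Function<K, V> dbLoader; // hypothetical DB lookup
    private final long ttlMillis;
    private final LongSupplier clock;      // injectable so expiry is testable

    CacheAside(Function<K, V> dbLoader, long ttlMillis, LongSupplier clock) {
        this.dbLoader = dbLoader;
        this.ttlMillis = ttlMillis;
        this.clock = clock;
    }

    V get(K key) {
        long now = clock.getAsLong();
        Entry<V> e = cache.get(key);
        if (e != null && e.expiresAtMillis > now) {
            return e.value;                    // cache hit
        }
        V value = dbLoader.apply(key);         // miss or expired: fetch from DB
        if (value != null) {
            cache.put(key, new Entry<>(value, now + ttlMillis)); // populate cache
        }
        return value;
    }

    // Call after a DB write so the next read reloads fresh data.
    void invalidate(K key) {
        cache.remove(key);
    }
}
```

A caller would construct it with something like new CacheAside<>(userDao::findById, 30_000L, System::currentTimeMillis), where userDao is again an assumed name. Note that concurrent misses on the same key can each hit the DB in this sketch; a real deployment would usually add request coalescing or a lock for hot keys.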

Message Queue for Write‑Spike Smoothing

When write traffic spikes (e.g., during a flash sale), direct DB writes can overwhelm MySQL. Use an asynchronous queue to decouple request intake from persistence.

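// Example using RabbitMQ with Spring AMQP (Java)
// Producer: enqueue the order instead of writing to the DB directly.
// MessageProducer is an application-level wrapper (for example, around RabbitTemplate.convertAndSend).
MessageProducer.produce(orderMessage);

// Consumer: processes messages from the queue at its own pace
@RabbitListener(queues = "order_queue")
public void handle(OrderMessage msg) {
    // Apply business logic, then persist
    orderRepository.save(msg.toEntity());
}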

Key points:

Queue buffers bursts, allowing downstream workers to process at a controlled rate.

Multiple consumer instances can be added to increase write throughput.

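Make consumers idempotent so that redelivered messages do not cause duplicate writes, and consider the transactional outbox pattern so that a message is published if and only if the corresponding DB change commits.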
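One way to handle redelivery safely is to deduplicate on a message ID before persisting. The sketch below is a simplified illustration under stated assumptions: IdempotentOrderConsumer, messageId, and the in-memory collections are hypothetical stand-ins. In a real system the "already processed" marker would typically live in the database (for example, a unique key) and be written in the same transaction as the order.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Idempotent consumer: a message ID is applied at most once,
// even if the broker redelivers it.
class IdempotentOrderConsumer {
    // Assumption: stand-in for a unique-key table or similar durable store.
    private final Set<String> processedIds = ConcurrentHashMap.newKeySet();
    // Assumption: stand-in for orderRepository.
    private final List<String> savedOrders = new ArrayList<>();

    // Returns true if the message was applied, false if it was a duplicate.
    synchronized boolean handle(String messageId, String orderPayload) {
        if (!processedIds.add(messageId)) {
            return false;              // already seen: skip the write
        }
        savedOrders.add(orderPayload); // first delivery: persist
        return true;
    }

    synchronized int savedCount() {
        return savedOrders.size();
    }
}
```

Delivering the same messageId twice results in a single saved order, which is what makes consumer retries safe.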

Database Sharding (Splitting Databases and Tables)

When a single database becomes a bottleneck, split it horizontally:

Database‑level sharding: distribute rows across multiple physical databases based on a sharding key (e.g., user_id % N).

Table‑level partitioning: split a large table into several smaller tables (e.g., order_2023_01, order_2023_02) while keeping the same schema.

Maintain a routing layer (middleware or application logic) that determines the target shard for each query.
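The routing rules above (user_id % N for the database, a month suffix for the table) can be sketched in application code as follows. ShardRouter and its method names are hypothetical, and the sketch assumes a fixed shard count; changing N later requires data migration or a scheme such as consistent hashing.

```java
import java.time.YearMonth;
import java.util.Locale;

// Application-level routing for database- and table-level splitting.
class ShardRouter {
    private final int dbCount;

    ShardRouter(int dbCount) {
        if (dbCount <= 0) {
            throw new IllegalArgumentException("dbCount must be positive");
        }
        this.dbCount = dbCount;
    }

    // Database-level sharding: user_id % N.
    // floorMod keeps the index non-negative even for negative IDs.
    int dbIndexFor(long userId) {
        return (int) Math.floorMod(userId, (long) dbCount);
    }

    // Table-level partitioning: order_YYYY_MM, e.g. order_2023_01.
    static String orderTableFor(YearMonth month) {
        return String.format(Locale.ROOT, "order_%04d_%02d",
                month.getYear(), month.getMonthValue());
    }
}
```

With four databases, user 10 maps to database index 2, and an order placed in January 2023 is stored in the table order_2023_01.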

Benefits:

Reduced per‑shard data volume → faster index scans.

Load is spread across multiple DB instances, increasing overall QPS.

Read/Write Separation (Master‑Slave Replication)

Configure a primary (master) instance for writes and one or more replicas for reads.

Writes go to the master; replication asynchronously propagates changes to slaves.

Read traffic can be load‑balanced across slaves using a proxy (e.g., MySQL Router, ProxySQL) or application‑level routing.

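When stale reads are unacceptable (e.g., reading data immediately after writing it), route those reads to the master. Semi‑synchronous replication makes the master wait until at least one slave has received the change before acknowledging the commit, which reduces the risk of losing writes on failover at the cost of higher write latency; it does not guarantee that slaves have already applied the change.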
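Application‑level routing for this setup might look like the sketch below. ReadWriteRouter, the requireFresh flag, and the use of plain strings as data‑source names are assumptions made for illustration; a real application would typically route between DataSource objects or rely on a proxy.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Routes writes to the master and spreads reads across slaves round-robin.
class ReadWriteRouter {
    private final String master;
    private final List<String> slaves;
    private final AtomicInteger next = new AtomicInteger();

    ReadWriteRouter(String master, List<String> slaves) {
        this.master = master;
        this.slaves = new ArrayList<>(slaves); // defensive copy
    }

    String routeWrite() {
        return master;
    }

    // requireFresh = true sends the read to the master to avoid replication lag.
    String routeRead(boolean requireFresh) {
        if (requireFresh || slaves.isEmpty()) {
            return master;
        }
        // floorMod keeps the index valid even after the counter overflows.
        int i = Math.floorMod(next.getAndIncrement(), slaves.size());
        return slaves.get(i);
    }
}
```

Passing requireFresh = true for reads that immediately follow a write is a simple way to get read‑your‑writes behaviour despite asynchronous replication.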

Elasticsearch for Search‑Heavy Queries

Offload full‑text search, analytics, and simple aggregations to Elasticsearch, a distributed search engine.

Synchronize data from MySQL to ES via CDC (Change Data Capture) tools such as Debezium, or by periodic polling with Logstash's JDBC input.

Typical use cases: product keyword search, real‑time dashboards, statistical reports.

ES clusters scale horizontally: an index is split into primary shards, and each primary shard can have one or more replica shards placed on different nodes, providing high availability.

Design Considerations & Caveats

While the techniques above form a solid foundation, practical high‑concurrency design must address:

When to apply sharding vs. scaling a single instance (cost, operational complexity).

How to perform joins across shards—often denormalization or application‑level aggregation is required.

Cache invalidation strategies to keep stale data from causing consistency bugs.

Idempotent processing in MQ consumers to handle retries safely.

Monitoring of replication lag, cache hit ratio, and queue depth to detect bottlenecks early.
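As an illustration of the application‑level aggregation mentioned for cross‑shard queries, the sketch below merges per‑shard GROUP BY results in the application. CrossShardAggregator and the map‑based shard results are hypothetical; each map stands in for the rows returned by running the same aggregate query on one shard.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Scatter-gather: run the same aggregate on every shard, then merge in the app.
class CrossShardAggregator {
    // Each input map is one shard's result, e.g. order status -> count.
    static Map<String, Long> mergeCounts(List<Map<String, Long>> perShardCounts) {
        Map<String, Long> total = new TreeMap<>(); // sorted keys for stable output
        for (Map<String, Long> shard : perShardCounts) {
            for (Map.Entry<String, Long> e : shard.entrySet()) {
                total.merge(e.getKey(), e.getValue(), Long::sum);
            }
        }
        return total;
    }
}
```

This works for aggregates that combine cleanly, such as counts and sums. Averages and percentiles need more care: for an average, carry a sum and a count per shard and divide after merging.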

Written by

JavaEdge

First‑line development experience at multiple leading tech firms; now a software architect at a Shanghai state‑owned enterprise and founder of Programming Yanxuan. Nearly 300k followers online; expertise in distributed system design, AIGC application development, and quantitative finance investing.
