30 Essential Architecture Patterns for Scalable Backend Systems

This article presents a comprehensive catalog of thirty architectural patterns (including ambassador, anti-corruption layer, gateway aggregation, CQRS, event sourcing, sharding, and circuit breaker) that help developers design, manage, and scale modern backend services efficiently and reliably.

ITFLY8 Architecture Home

1. Management and Monitoring

1.1 Ambassador Pattern: Create a helper service that sends network requests on behalf of a consumer service or application

An out-of-process proxy service acts as a network agent that communicates with remote services on the application's behalf, typically handling:

Service routing

Service circuit breaking

Service tracing

Service monitoring

Service authorization

Data encryption

Log recording

Because it runs as an independent process, this pattern suits multi‑language, multi‑framework environments where client‑side responsibilities can be offloaded to the ambassador service, though it adds network overhead and deployment considerations.

1.2 Anti-Corruption Layer Pattern: Implement an adapter or façade layer between modern applications and legacy systems

A protective layer acts as an intermediary, allowing new systems to use modern communication and architecture while legacy systems remain unchanged; the layer can be discarded once legacy components are retired.
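As an illustration, the layer can be as small as an adapter class. The sketch below assumes a hypothetical legacy system that returns pipe-delimited customer records; the `Customer` model and the field layout are invented for the example:

```python
from dataclasses import dataclass

@dataclass
class Customer:
    # Modern domain model used by the new system (hypothetical).
    customer_id: str
    full_name: str
    email: str

class LegacyCustomerAdapter:
    """Anti-corruption layer: translates the legacy system's pipe-delimited
    records into the modern Customer model, so legacy naming and formatting
    conventions never leak into new code."""

    def to_customer(self, legacy_record: str) -> Customer:
        # Assumed legacy format: "ID|LAST, FIRST|EMAIL"
        cust_id, name, email = legacy_record.split("|")
        last, first = name.split(",")
        return Customer(customer_id=cust_id.strip(),
                        full_name=f"{first.strip()} {last.strip()}",
                        email=email.strip().lower())

adapter = LegacyCustomerAdapter()
customer = adapter.to_customer("A042|DOE, JANE|Jane.Doe@EXAMPLE.COM")
print(customer.full_name, customer.email)
```

When the legacy system is retired, only the adapter is deleted; the modern model and the code consuming it stay untouched.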

1.3 External Configuration Store Pattern: Move configuration information from application packages to a centralized location

A centralized configuration service stores settings, providing shared, secure, and manageable configuration for large‑scale sites; many open‑source projects offer such services.

1.4 Gateway Aggregation Pattern: Use a gateway to aggregate multiple individual requests into a single request

A gateway layer concurrently issues multiple downstream requests, aggregates the results, and returns them to the caller, improving performance, enabling elasticity (circuit breaking, retries, rate limiting), caching, and serving as an external network entry point.

Concurrent calls to multiple services improve performance and allow partial data returns

Gateway can implement resilience patterns such as circuit breaking, retries, and rate limiting

Gateway can provide caching

Gateway serves as a network middle layer for external communication

Implementation can be as simple as using OpenResty or Nginx.
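Conceptually, the aggregation step can be sketched in a few lines of Python. The three downstream calls here are hypothetical stand-ins for real HTTP requests, with one deliberately failing to show partial data returns:

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical downstream services; in practice these would be HTTP calls.
def fetch_profile(uid): return {"name": "alice"}
def fetch_orders(uid): return [{"order_id": 1}]
def fetch_recommendations(uid): raise TimeoutError("slow service")

def aggregate(uid):
    """Gateway aggregation: fan out to all services concurrently,
    combine the results, and tolerate partial failure."""
    calls = {"profile": fetch_profile, "orders": fetch_orders,
             "recommendations": fetch_recommendations}
    result = {}
    with ThreadPoolExecutor(max_workers=len(calls)) as pool:
        futures = {key: pool.submit(fn, uid) for key, fn in calls.items()}
        for key, fut in futures.items():
            try:
                result[key] = fut.result(timeout=1.0)
            except Exception:
                result[key] = None  # partial data rather than total failure
    return result

print(aggregate("u1"))
```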

1.5 Gateway Offloading Pattern: Place shared or specific service functions into a gateway proxy

The gateway handles non‑business concerns such as SSL termination; for example, external HTTPS is terminated at the gateway while internal services communicate over HTTP.

1.6 Gateway Routing Pattern: Use a single endpoint to route requests to multiple services

APIs like /cart, /order, /search are routed by the gateway to different backend services, enabling load balancing, failover, and flexible version routing.

1.7 Health Endpoint Monitoring Pattern: Execute functional checks in the application that external tools can periodically access via exposed endpoints

Expose detailed health information (service dependencies, thread pools, connection pools, queue lengths) so external monitoring and load balancers can determine true service health.

Decide which information to expose, including external storage and internal metrics

Both websites and services should expose health data for monitoring and failover

Secure the health endpoint to prevent unauthorized access

In Spring Boot, the Actuator module provides this capability.

1.8 Strangler Fig Pattern: Incrementally migrate a legacy system by gradually replacing specific pieces of functionality with new applications and services

A façade routes traffic between old and new services; over time the new services replace the old ones, making the migration transparent to consumers.

2. Performance and Scalability

2.1 Cache-Aside Pattern: Load data from storage into a cache on demand

For relatively static data, this pattern can be taken further by loading the full data set into the cache, achieving near-100% hit rates and lookups orders of magnitude faster than database queries.

Periodic synchronization of cache data

Different expiration times with active or passive updates

Synchronous updates of cache and database on data modification
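A minimal sketch of the read and write paths, using plain dictionaries as stand-ins for the database and the cache:

```python
import time

db = {"user:1": "Alice"}   # stand-in for the database
cache = {}                 # stand-in for Redis/Memcached
TTL = 60.0                 # illustrative expiration time in seconds

def get(key):
    """Cache-aside read: try the cache first; on a miss, load from
    the database and populate the cache with an expiration time."""
    entry = cache.get(key)
    if entry is not None and entry[1] > time.monotonic():
        return entry[0]                          # cache hit
    value = db.get(key)                          # cache miss: load from storage
    cache[key] = (value, time.monotonic() + TTL)
    return value

def put(key, value):
    """On writes, update the database and invalidate the cached copy
    so the next read reloads fresh data."""
    db[key] = value
    cache.pop(key, None)

print(get("user:1"))   # loads from db, fills cache
put("user:1", "Alicia")
print(get("user:1"))   # reloads the updated value
```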

2.2 Command and Query Responsibility Segregation (CQRS) Pattern: Separate read and write operations via distinct interfaces

Two independent data models—one optimized for reads, the other for writes—reduce interference, simplify permission management, and can be combined with event sourcing and materialized views.

2.3 Event Sourcing Pattern: Record a series of immutable events that describe operations on domain data

Instead of persisting current state, store an append‑only log of state‑changing events, providing immutability, high performance, low coupling, and a complete audit trail.

Events are immutable and only appended

Event‑driven external processing with low coupling

Preserves original information without loss
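The core mechanics fit in a short sketch: an append-only list stands in for the event store, and current state is rebuilt by replaying history. The event names and account schema are invented for the example:

```python
# Append-only event log: state is never stored directly; it is
# reconstructed by folding over the recorded events.
events = []

def append(event_type, **data):
    events.append({"type": event_type, **data})  # events are immutable

def balance(account):
    """Rebuild current state by replaying the event history."""
    total = 0
    for e in events:
        if e.get("account") != account:
            continue
        if e["type"] == "Deposited":
            total += e["amount"]
        elif e["type"] == "Withdrawn":
            total -= e["amount"]
    return total

append("Deposited", account="acc-1", amount=100)
append("Withdrawn", account="acc-1", amount=30)
append("Deposited", account="acc-1", amount=5)
print(balance("acc-1"))  # 75
```

Because the log preserves every intermediate event, it doubles as a complete audit trail, and other consumers can subscribe to the same events with low coupling.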

2.4 Materialized View Pattern: Generate pre-filled views in one or more data stores for required query operations

Pre‑compute and store query‑optimized data structures to avoid costly joins; suitable for complex calculations, unstable back‑ends, or multi‑store queries, but not ideal for highly mutable data requiring strong consistency.

Complex calculations needed for queries

Unstable underlying storage

Need to join multiple heterogeneous stores

2.5 Queue-Based Load Leveling Pattern: Use a queue as a buffer between tasks and services to smooth intermittent heavy loads

Message queues decouple components, allowing each part to scale independently; they do not increase processing capacity but provide elasticity and protect against overload.

Queue introduces buffering, not unlimited storage

Ensure processing throughput exceeds peak enqueue rate (e.g., 2×) to handle load spikes

2.6 Priority Queue Pattern: Determine request priority so higher-priority requests are processed faster

Messages are assigned priorities; the queue either reorders in real time or uses separate processing pools for different priority levels.

Real‑time priority reordering within the queue

Separate processing pools with dedicated resources for high‑priority messages
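The real-time reordering variant can be sketched with a binary heap; the tie-breaking counter is needed because a bare heap does not preserve FIFO order among equal priorities:

```python
import heapq
import itertools

# Priority queue sketch: lower number = higher priority; the counter
# keeps FIFO order among messages with equal priority.
_counter = itertools.count()
queue = []

def enqueue(priority, message):
    heapq.heappush(queue, (priority, next(_counter), message))

def dequeue():
    return heapq.heappop(queue)[2]

enqueue(5, "send newsletter")
enqueue(1, "payment failed alert")   # high priority jumps the line
enqueue(5, "sync analytics")
print(dequeue())  # payment failed alert
print(dequeue())  # send newsletter
```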

2.7 Throttling (Rate-Limiting) Pattern: Control the resources consumed by an application, tenant, or service instance

Common algorithms include simple counters, token bucket (allows bursts), and leaky bucket (smooths traffic). Implementations such as Guava's RateLimiter provide token‑bucket functionality.

Rate limiting must be fast; reject excess requests immediately

Trigger limiting before the system reaches roughly 80% of capacity

Return specific error codes indicating throttling

Monitor throttling metrics to detect sudden traffic drops

Edge‑node throttling (e.g., CDN or client‑side) can filter massive spikes before they reach back‑end services
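The token bucket mentioned above is straightforward to sketch (this is the same idea Guava's RateLimiter implements; the simplified version below is not thread-safe):

```python
import time

class TokenBucket:
    """Token-bucket rate limiter sketch: tokens refill at a steady rate,
    and a burst may spend saved-up tokens, up to the bucket's capacity."""

    def __init__(self, rate, capacity):
        self.rate = rate            # tokens added per second
        self.capacity = capacity    # maximum burst size
        self.tokens = capacity
        self.last = time.monotonic()

    def allow(self):
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False  # reject immediately; caller returns a throttling error code

bucket = TokenBucket(rate=1, capacity=5)
results = [bucket.allow() for _ in range(8)]
print(results)  # the 5-token burst passes, the rest are throttled
```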

3. Data Management

3.1 Sharding Pattern: Partition data storage into a set of horizontal partitions or shards

When a single database table cannot sustain the load (for example, beyond roughly 10,000 TPS) even with caching and queuing in front of it, horizontal sharding distributes the load across multiple databases.

Determine sharding strategy (condition, range, hash) and modify business code accordingly

Choose hard‑code, framework, or middleware for routing

Provide operational tools for unified indexing and data warehousing across shards
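Hash-based routing is the simplest strategy to sketch. Note that changing the shard count remaps most keys, which is one reason consistent hashing or a routing middleware is often preferred in practice:

```python
import hashlib

SHARDS = ["db_0", "db_1", "db_2", "db_3"]  # hypothetical shard names

def shard_for(user_id):
    """Hash-based sharding: a stable hash of the shard key selects the
    database. (md5 is used here for a stable, even distribution of keys,
    not for security.)"""
    digest = hashlib.md5(str(user_id).encode()).hexdigest()
    return SHARDS[int(digest, 16) % len(SHARDS)]

# The same key always routes to the same shard:
print({uid: shard_for(uid) for uid in [1, 2, 3, 4]})
```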

3.2 Static Content Hosting Pattern: Deploy static content to cloud-based storage services for direct client delivery

Separate static assets from dynamic sites, serve them via Nginx or CDN, improving load, parallel downloads, and user experience.

Define cache‑invalidation and refresh mechanisms for CDN

Ensure versioned filenames to avoid stale content

Use front‑end error‑handling to capture script errors for debugging

3.3 Index Table Pattern: Create indexes for fields frequently referenced in queries

When primary‑key‑based indexes are insufficient, dedicated index tables can accelerate high‑selectivity queries, though they add storage and maintenance overhead.

4. Design and Implementation

4.1 Backends for Frontends Pattern: Create separate backend services for specific front-end applications or interfaces

Different front‑ends (PC web vs. mobile app) may require distinct back‑ends due to signing, encryption, UI complexity, and security differences.

App APIs often require signatures and encryption

PC flows differ from app flows

PC pages display more content than mobile screens

Security designs vary (e.g., captchas)

4.2 Compute Resource Consolidation Pattern: Combine multiple tasks or operations into a single compute unit

Aggregating related operations reduces resource fragmentation and overhead.

4.3 Leader Election Pattern: Elect one instance to coordinate the others in a distributed application

Leader election (e.g., via Zookeeper) can be implemented with non‑fair (ephemeral non‑sequential nodes) or fair (ephemeral sequential nodes) algorithms.

Non-fair: simple, but having many nodes watch the same node can cause thundering-herd performance issues

Fair: uses sequence numbers for orderly, low‑contention leader selection

4.4 Pipes and Filters Pattern: Decompose complex processing tasks into reusable individual elements

Filters (or handlers) can be chained, inserted, or removed to build flexible processing pipelines, as seen in Spring MVC, Netty, or custom middleware.
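The pattern reduces to composing small functions with a shared signature; here is a sketch using text-normalization filters as a stand-in for real processing stages:

```python
# Pipes-and-filters sketch: each filter takes and returns the same type,
# so handlers can be inserted, removed, or reordered freely.
def strip_whitespace(text): return text.strip()
def lowercase(text): return text.lower()
def collapse_spaces(text): return " ".join(text.split())

def make_pipeline(*filters):
    """Compose filters into a pipeline: each filter's output feeds the next."""
    def run(value):
        for f in filters:
            value = f(value)
        return value
    return run

normalize = make_pipeline(strip_whitespace, lowercase, collapse_spaces)
print(normalize("  Hello    WORLD  "))  # hello world
```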

5. Messaging

5.1 Competing Consumers Pattern: Use multiple concurrent consumers to handle messages on the same channel

Stateless consumers compete for messages, enabling horizontal scaling and fault tolerance.

5.2 Retry Pattern: Transparently retry failed operations when temporary faults occur

Retries can be user‑initiated, middleware‑initiated, or manually coded, with configurable attempts, exception filters, and back‑off strategies.

Maximum retry count

Exception whitelist/blacklist

Fixed or incremental delay between attempts
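The three knobs above map directly onto a small retry decorator. This sketch assumes `ConnectionError` as the whitelist of temporary faults and uses exponential delay between attempts:

```python
import functools
import time

def retry(max_attempts=3, base_delay=0.01, retry_on=(ConnectionError,)):
    """Retry decorator sketch: bounded attempts, an exception whitelist,
    and exponentially increasing delay between attempts."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            for attempt in range(1, max_attempts + 1):
                try:
                    return fn(*args, **kwargs)
                except retry_on:
                    if attempt == max_attempts:
                        raise  # attempts exhausted: surface the fault
                    time.sleep(base_delay * 2 ** (attempt - 1))
        return wrapper
    return decorator

calls = {"n": 0}

@retry(max_attempts=3)
def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("temporary fault")
    return "ok"

print(flaky(), "after", calls["n"], "attempts")  # ok after 3 attempts
```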

5.3 Scheduler Agent Supervisor Pattern: Coordinate a set of operations across distributed services and remote resources

Three roles work together: the Scheduler orchestrates tasks, the Agent communicates with remote services, and the Supervisor monitors execution and triggers compensation.

Scheduler maintains task state

Agent handles communication, retries, and error handling

Supervisor watches execution and requests compensation when needed

6. Resilience

6.1 Bulkhead Pattern: Isolate application elements into pools so that failure of one does not affect others

Isolation can be at thread‑pool, service, VM, or container level, preventing cascading failures.

Java's parallelStream tasks share the common ForkJoinPool and can starve other tasks

Dead‑letter messages occupying a queue block normal traffic

Heavy upload operations can monopolize threads, starving lightweight queries
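At the thread-pool level, the isolation is simply a separate executor per workload; `handle_upload` and `handle_query` below are hypothetical handlers:

```python
from concurrent.futures import ThreadPoolExecutor

# Bulkhead sketch: each workload gets its own isolated thread pool, so a
# flood of heavy uploads cannot starve lightweight queries of threads.
upload_pool = ThreadPoolExecutor(max_workers=2)  # heavy, slow work
query_pool = ThreadPoolExecutor(max_workers=4)   # light, latency-sensitive work

def handle_upload(n): return f"uploaded {n}"
def handle_query(q): return f"result for {q}"

uploads = [upload_pool.submit(handle_upload, i) for i in range(10)]
queries = [query_pool.submit(handle_query, i) for i in range(10)]

# Queries complete on their own pool even if the upload pool is saturated.
print([f.result() for f in queries[:3]])
```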

6.2 Circuit Breaker Pattern: Handle faults when connecting to remote services or resources

A three‑state circuit breaker (closed, open, half‑open) protects callers from repeatedly invoking failing services.

Closed: count errors

Open: block calls after threshold

Half‑open: allow limited traffic to test recovery
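The three states above can be sketched as a small state machine; the threshold and timeout values are illustrative, and this simplified version is not thread-safe:

```python
import time

class CircuitBreaker:
    """Three-state circuit breaker sketch: CLOSED counts failures,
    OPEN rejects calls fast, HALF_OPEN lets a trial call probe recovery."""

    def __init__(self, failure_threshold=3, reset_timeout=0.05):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout  # seconds before a recovery probe
        self.failures = 0
        self.state = "CLOSED"
        self.opened_at = 0.0

    def call(self, fn):
        if self.state == "OPEN":
            if time.monotonic() - self.opened_at >= self.reset_timeout:
                self.state = "HALF_OPEN"  # allow a limited trial call
            else:
                raise RuntimeError("circuit open: failing fast")
        try:
            result = fn()
        except Exception:
            self.failures += 1
            if self.state == "HALF_OPEN" or self.failures >= self.failure_threshold:
                self.state = "OPEN"
                self.opened_at = time.monotonic()
            raise
        else:
            self.state = "CLOSED"  # success resets the breaker
            self.failures = 0
            return result

breaker = CircuitBreaker()
def failing(): raise ConnectionError("service down")

for _ in range(3):  # three consecutive failures trip the breaker
    try: breaker.call(failing)
    except ConnectionError: pass
print(breaker.state)  # OPEN
```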

6.3 Compensating Transaction Pattern: Undo work performed through a series of steps to achieve eventual consistency

Compensation middleware records call status, queries service state, and invokes compensating actions when necessary.

Mark critical calls for automatic compensation

Record call status in a database

Providers expose status query APIs; consumers expose compensation callbacks

Middleware orchestrates compensation based on recorded state

7. Security

7.1 Valet Key Pattern: Provide tokens or keys that grant limited direct access to specific resources or services

Generate time‑bound tokens that clients present directly to storage or CDN services, avoiding proxying through the application.
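One common realization is an HMAC-signed, time-bound token, similar in spirit to the signed URLs offered by cloud storage services; the shared secret, resource naming, and token layout here are assumptions for the sketch:

```python
import hashlib
import hmac
import time

SECRET = b"shared-secret"  # known to the app and the storage service (assumed)

def issue_token(resource, ttl=300):
    """Valet-key sketch: a time-bound, HMAC-signed token granting direct
    access to one resource, so downloads bypass the application servers."""
    expires = int(time.time()) + ttl
    payload = f"{resource}:{expires}"
    sig = hmac.new(SECRET, payload.encode(), hashlib.sha256).hexdigest()
    return f"{payload}:{sig}"

def verify_token(token):
    """Run by the storage service: check the signature, then the expiry."""
    resource, expires, sig = token.rsplit(":", 2)
    payload = f"{resource}:{expires}"
    expected = hmac.new(SECRET, payload.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(sig, expected):
        return None  # forged or tampered token
    if int(expires) < time.time():
        return None  # expired token
    return resource  # access granted to exactly this resource

token = issue_token("bucket/report.pdf")
print(verify_token(token))        # bucket/report.pdf
print(verify_token(token + "x"))  # None
```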

7.2 Federated Identity Pattern: Delegate authentication to an external identity provider

Use a dedicated identity service (e.g., OAuth 2.0 with Spring Security) to centralize login and enable single sign‑on.

In summary, these thirty patterns span design details and high‑level philosophies; they are often combined, and selecting the right ones for a given scenario can provide a breakthrough in system architecture.

Tags: design-patterns, microservices, scalability
Written by ITFLY8 Architecture Home

ITFLY8 Architecture Home - focused on architecture knowledge sharing and exchange, covering project management and product design. Includes large-scale distributed website architecture (high performance, high availability, caching, message queues...), design patterns, architecture patterns, big data, project management (SCRUM, PMP, Prince2), product design, and more.