30 Essential Architecture Patterns for Scalable Backend Systems
This article presents a comprehensive catalog of thirty architectural patterns—including ambassador, anti‑corruption, gateway aggregation, CQRS, event sourcing, sharding, and circuit breaker—that help developers design, manage, and scale modern backend services efficiently and reliably.
1. Management and Monitoring
1.1 Ambassador Pattern: Deploy a helper service that sends network requests on behalf of a consumer service or application
An out-of-process proxy service (often deployed alongside the application as a sidecar) acts as a network agent communicating with remote services to perform routing, circuit breaking, tracing, monitoring, authorization, data encryption, and logging.
Service routing
Service circuit breaking
Service tracing
Service monitoring
Service authorization
Data encryption
Log recording
Because it runs as an independent process, this pattern suits multi‑language, multi‑framework environments where client‑side responsibilities can be offloaded to the ambassador service, though it adds network overhead and deployment considerations.
1.2 Anti‑Corruption Layer Pattern: Implement an adapter or façade layer between modern applications and legacy systems
A protective layer acts as an intermediary, allowing new systems to use modern communication and architecture while legacy systems remain unchanged; the layer can be discarded once legacy components are retired.
1.3 External Configuration Store Pattern: Move configuration information out of the application package into a centralized store
A centralized configuration service stores settings, providing shared, secure, and manageable configuration for large‑scale sites; many open‑source projects offer such services.
1.4 Gateway Aggregation Pattern: Use a gateway to aggregate multiple individual requests into a single request
A gateway layer concurrently issues multiple downstream requests, aggregates the results, and returns them to the caller, improving performance, enabling elasticity (circuit breaking, retries, rate limiting), caching, and serving as an external network entry point.
Concurrent calls to multiple services improve performance and allow partial data returns
Gateway can implement resilience patterns such as circuit breaking, retries, and rate limiting
Gateway can provide caching
Gateway serves as a network middle layer for external communication
Implementation can be as simple as using OpenResty or Nginx.
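Beyond OpenResty or Nginx, the fan-out-and-merge behavior itself is straightforward to express in application code. The sketch below uses CompletableFuture to call two downstream services concurrently and merge their results; AggregatingGateway and the cart/order call names are illustrative assumptions, not a real gateway API.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.function.Supplier;

// Hypothetical gateway that fans out to two backend calls in parallel
// and merges the results into one response for the caller.
public class AggregatingGateway {
    public static Map<String, String> aggregate(Supplier<String> cartCall,
                                                Supplier<String> orderCall) {
        // Both downstream requests start concurrently.
        CompletableFuture<String> cart = CompletableFuture.supplyAsync(cartCall);
        CompletableFuture<String> order = CompletableFuture.supplyAsync(orderCall);
        Map<String, String> response = new LinkedHashMap<>();
        // join() waits for both calls; a real gateway would add timeouts
        // and fall back to partial data when one dependency fails.
        response.put("cart", cart.join());
        response.put("order", order.join());
        return response;
    }

    public static void main(String[] args) {
        System.out.println(aggregate(() -> "2 items", () -> "1 open order"));
    }
}
```

The total latency is roughly that of the slowest downstream call rather than the sum of all of them, which is the performance benefit the pattern targets.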
1.5 Gateway Offloading Pattern: Offload shared or specialized service functionality to a gateway proxy
The gateway handles non‑business concerns such as SSL termination; for example, external HTTPS is terminated at the gateway while internal services communicate over HTTP.
1.6 Gateway Routing Pattern: Route requests to multiple services using a single endpoint
APIs like /cart, /order, /search are routed by the gateway to different backend services, enabling load balancing, failover, and flexible version routing.
1.7 Health Endpoint Monitoring Pattern: Implement functional checks in the application that external tools can access periodically through exposed endpoints
Expose detailed health information (service dependencies, thread pools, connection pools, queue lengths) so external monitoring and load balancers can determine true service health.
Decide which information to expose, including external storage and internal metrics
Both websites and services should expose health data for monitoring and failover
Secure the health endpoint to prevent unauthorized access
In Spring Boot, the Actuator module provides this capability.
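Independently of Actuator, the payload such an endpoint returns can be sketched in a framework-neutral way: each dependency registers a check, and the endpoint reports per-dependency status plus an overall result. HealthReport and the dependency names here are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.BooleanSupplier;

// Sketch of a health endpoint payload builder: the service is "UP" only
// when every registered dependency check passes.
public class HealthReport {
    private final Map<String, BooleanSupplier> checks = new LinkedHashMap<>();

    public void register(String name, BooleanSupplier check) {
        checks.put(name, check);
    }

    public Map<String, String> report() {
        Map<String, String> out = new LinkedHashMap<>();
        boolean allUp = true;
        for (Map.Entry<String, BooleanSupplier> e : checks.entrySet()) {
            boolean up = e.getValue().getAsBoolean();
            allUp &= up;
            out.put(e.getKey(), up ? "UP" : "DOWN");
        }
        // Aggregate status is what a load balancer would key on.
        out.put("status", allUp ? "UP" : "DOWN");
        return out;
    }
}
```

As noted above, anything exposed this way (connection pools, queue depths) should sit behind authentication, since it reveals internal topology.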
1.8 Strangler Pattern: Incrementally migrate a legacy system by gradually replacing specific pieces of functionality with new applications and services
A façade routes traffic between old and new services; over time the new services replace the old ones, making the migration transparent to consumers.
2. Performance and Scalability
2.1 Cache‑Aside Pattern: Load data into the cache from the data store on demand
This pattern focuses on a full‑data‑in‑cache approach for relatively static data, achieving near‑100% hit rates and orders‑of‑magnitude faster lookups than databases.
Periodic synchronization of cache data
Different expiration times with active or passive updates
Synchronous updates of cache and database on data modification
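The basic cache-aside read path is: check the cache, load from the store on a miss, and populate the cache for later reads. The sketch below is a minimal in-memory illustration; CacheAside is a hypothetical class, and the Function stands in for a database lookup.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Cache-aside read path: cache miss -> load from store -> populate cache.
public class CacheAside<K, V> {
    private final Map<K, V> cache = new HashMap<>();
    private final Function<K, V> store;   // stand-in for a database query
    private int storeReads = 0;

    public CacheAside(Function<K, V> store) {
        this.store = store;
    }

    public V get(K key) {
        V cached = cache.get(key);
        if (cached != null) {
            return cached;            // cache hit: no store access
        }
        storeReads++;
        V loaded = store.apply(key);  // cache miss: go to the store
        cache.put(key, loaded);       // populate for subsequent reads
        return loaded;
    }

    // On writes, update the store first, then invalidate the cached entry
    // so the next read repopulates it (one of the update strategies above).
    public void invalidate(K key) {
        cache.remove(key);
    }

    public int storeReads() {
        return storeReads;
    }
}
```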
2.2 Command and Query Responsibility Segregation (CQRS) Pattern: Separate read and write operations through distinct interfaces
Two independent data models—one optimized for reads, the other for writes—reduce interference, simplify permission management, and can be combined with event sourcing and materialized views.
2.3 Event Sourcing Pattern: Record a series of immutable events that describe operations on domain data
Instead of persisting current state, store an append‑only log of state‑changing events, providing immutability, high performance, low coupling, and a complete audit trail.
Events are immutable and only appended
Event‑driven external processing with low coupling
Preserves original information without loss
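The core idea, current state as a fold over an append-only event log, can be sketched in a few lines. EventSourcedAccount is an illustrative in-memory example, not a framework API; a real system would persist the log and publish each event to downstream consumers.

```java
import java.util.ArrayList;
import java.util.List;

// Minimal event-sourced aggregate: state is never stored directly;
// it is recomputed by replaying the immutable, append-only event log.
public class EventSourcedAccount {
    record Event(String type, long amount) {}

    private final List<Event> log = new ArrayList<>();

    public void deposit(long amount)  { log.add(new Event("DEPOSIT", amount)); }
    public void withdraw(long amount) { log.add(new Event("WITHDRAW", amount)); }

    // Current balance = left fold over the full event history.
    public long balance() {
        long balance = 0;
        for (Event e : log) {
            balance += e.type().equals("DEPOSIT") ? e.amount() : -e.amount();
        }
        return balance;
    }

    // The log itself is the audit trail; nothing is ever updated in place.
    public List<Event> history() {
        return List.copyOf(log);
    }
}
```

Because events are only appended, the complete history (who deposited what, and when) survives even after later withdrawals, which is the audit-trail property described above.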
2.4 Materialized View Pattern: Generate pre-populated views in one or more data stores for the required query operations
Pre‑compute and store query‑optimized data structures to avoid costly joins; suitable for complex calculations, unstable back‑ends, or multi‑store queries, but not ideal for highly mutable data requiring strong consistency.
Complex calculations needed for queries
Unstable underlying storage
Need to join multiple heterogeneous stores
2.5 Queue‑Based Load Leveling Pattern: Use a queue as a buffer between a task and a service to smooth intermittent heavy loads
Message queues decouple components, allowing each part to scale independently; they do not increase processing capacity but provide elasticity and protect against overload.
Queue introduces buffering, not unlimited storage
Ensure processing throughput exceeds peak enqueue rate (e.g., 2×) to handle load spikes
2.6 Priority Queue Pattern: Prioritize requests so that higher-priority requests are processed sooner
Messages are assigned priorities; the queue either reorders in real time or uses separate processing pools for different priority levels.
Real‑time priority reordering within the queue
Separate processing pools with dedicated resources for high‑priority messages
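The in-queue reordering variant maps directly onto java.util.PriorityQueue. The Message record and the numeric priority scale below are illustrative assumptions; a production system would use a broker with native priority support or the separate-pool approach.

```java
import java.util.Comparator;
import java.util.PriorityQueue;

// Messages carry a priority; consumers receive higher-priority messages
// first regardless of arrival order.
public class PriorityDispatch {
    record Message(String body, int priority) {}

    public static PriorityQueue<Message> newQueue() {
        // Higher numeric priority is served first (hence reversed()).
        return new PriorityQueue<>(
                Comparator.comparingInt(Message::priority).reversed());
    }

    public static void main(String[] args) {
        PriorityQueue<Message> q = newQueue();
        q.add(new Message("bulk report", 1));
        q.add(new Message("payment", 9));
        q.add(new Message("email", 3));
        System.out.println(q.poll().body()); // "payment" is dequeued first
    }
}
```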
2.7 Rate‑Limiting Pattern: Control the resources consumed by an application, tenant, or service instance
Common algorithms include simple counters, token bucket (allows bursts), and leaky bucket (smooths traffic). Implementations such as Guava's RateLimiter provide token‑bucket functionality.
Rate limiting must be fast; reject excess requests immediately
Trigger limiting before system reaches ~80% capacity
Return specific error codes indicating throttling
Monitor throttling metrics to detect sudden traffic drops
Edge‑node throttling (e.g., CDN or client‑side) can filter massive spikes before they reach back‑end services
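The token-bucket algorithm mentioned above can be sketched deterministically by injecting the clock. TokenBucket is a hand-rolled illustration of the algorithm that Guava's RateLimiter is built around, not Guava's actual API.

```java
// Token bucket: tokens refill at a fixed rate up to a capacity cap; a
// request is admitted only if a token is available. Bursts up to the
// capacity are allowed while the long-run rate stays bounded.
public class TokenBucket {
    private final long capacity;
    private final double refillPerNano;
    private double tokens;
    private long lastRefill;

    public TokenBucket(long capacity, double tokensPerSecond, long nowNanos) {
        this.capacity = capacity;
        this.refillPerNano = tokensPerSecond / 1_000_000_000.0;
        this.tokens = capacity;       // start full so an initial burst passes
        this.lastRefill = nowNanos;
    }

    // The clock is passed in so behavior is deterministic and testable;
    // production code would use System.nanoTime().
    public synchronized boolean tryAcquire(long nowNanos) {
        tokens = Math.min(capacity,
                tokens + (nowNanos - lastRefill) * refillPerNano);
        lastRefill = nowNanos;
        if (tokens >= 1.0) {
            tokens -= 1.0;
            return true;   // admit the request
        }
        return false;      // throttle: return a specific throttling error code
    }
}
```

Note that tryAcquire rejects immediately instead of queueing, matching the guidance above that rate limiting must itself be fast.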
3. Data Management Patterns
3.1 Sharding Pattern: Divide a data store into a set of horizontal partitions, or shards
When a single table cannot sustain >10k TPS even with caching and queuing, horizontal sharding distributes load across multiple databases.
Determine sharding strategy (condition, range, hash) and modify business code accordingly
Choose hard‑code, framework, or middleware for routing
Provide operational tools for unified indexing and data warehousing across shards
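The hash strategy from the list above can be sketched as a small router. ShardRouter and the orders_db_N naming are hypothetical; note also that plain modulo hashing moves most keys when the shard count changes, which is why middleware typically uses consistent hashing or a lookup table.

```java
// Hash-based shard routing: a stable hash of the shard key selects one
// of N physical databases.
public class ShardRouter {
    private final int shardCount;

    public ShardRouter(int shardCount) {
        this.shardCount = shardCount;
    }

    public int shardFor(String key) {
        // floorMod keeps the result non-negative even for negative hashCodes.
        return Math.floorMod(key.hashCode(), shardCount);
    }

    // Maps a shard index to a (hypothetical) data source name.
    public String dataSourceName(String key) {
        return "orders_db_" + shardFor(key);
    }
}
```

Whether this logic lives hard-coded in the service, in a data-access framework, or in a proxy middleware is exactly the routing choice listed above.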
3.2 Static Content Hosting Pattern: Deploy static content to a cloud-based storage service that can deliver it directly to clients
Separate static assets from dynamic sites, serve them via Nginx or CDN, improving load, parallel downloads, and user experience.
Define cache‑invalidation and refresh mechanisms for CDN
Ensure versioned filenames to avoid stale content
Use front‑end error‑handling to capture script errors for debugging
3.3 Index Table Pattern: Create indexes over the fields that queries frequently reference
When primary‑key‑based indexes are insufficient, dedicated index tables can accelerate high‑selectivity queries, though they add storage and maintenance overhead.
4. Design and Implementation Patterns
4.1 Backends for Frontends Pattern: Create separate backend services to be consumed by specific front-end applications
Different front‑ends (PC web vs. mobile app) may require distinct back‑ends due to signing, encryption, UI complexity, and security differences.
App APIs often require signatures and encryption
PC flows differ from app flows
PC pages display more content than mobile screens
Security designs vary (e.g., captchas)
4.2 Compute Resource Consolidation Pattern: Consolidate multiple tasks or operations into a single computational unit
Aggregating related operations reduces resource fragmentation and overhead.
4.3 Leader Election Pattern: Elect one instance as the leader to coordinate the other instances of a distributed application
Leader election (e.g., via Zookeeper) can be implemented with non‑fair (ephemeral non‑sequential nodes) or fair (ephemeral sequential nodes) algorithms.
Non‑fair: simple but may suffer performance issues under many nodes
Fair: uses sequence numbers for orderly, low‑contention leader selection
4.4 Pipes and Filters Pattern: Decompose a complex processing task into a series of reusable individual elements
Filters (or handlers) can be chained, inserted, or removed to build flexible processing pipelines, as seen in Spring MVC, Netty, or custom middleware.
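A generic pipeline over composable filters captures the idea; Pipeline below is an illustrative sketch in the spirit of those filter chains, not the Spring MVC or Netty API.

```java
import java.util.List;
import java.util.function.Function;

// Pipes and filters: each filter is an independent, reusable stage, and
// the pipeline is just their composition, so stages can be added, removed,
// or reordered without touching the others.
public class Pipeline<T> {
    private final List<Function<T, T>> filters;

    public Pipeline(List<Function<T, T>> filters) {
        this.filters = filters;
    }

    public T run(T input) {
        T value = input;
        for (Function<T, T> filter : filters) {
            value = filter.apply(value);   // output of one stage feeds the next
        }
        return value;
    }

    public static void main(String[] args) {
        Pipeline<String> normalize = new Pipeline<>(List.of(
                String::trim,                       // drop surrounding whitespace
                s -> s.toLowerCase(),               // canonical case
                s -> s.replaceAll("\\s+", " ")));   // collapse runs of spaces
        System.out.println(normalize.run("  Hello   WORLD "));
    }
}
```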
5. Messaging Patterns
5.1 Competing Consumers Pattern: Enable multiple concurrent consumers to process messages received on the same channel
Stateless consumers compete for messages, enabling horizontal scaling and fault tolerance.
5.2 Retry Pattern: Transparently retry failed operations when transient faults occur
Retries can be user‑initiated, middleware‑initiated, or manually coded, with configurable attempts, exception filters, and back‑off strategies.
Maximum retry count
Exception whitelist/blacklist
Fixed or incremental delay between attempts
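The knobs listed above (attempt cap, delay policy) can be sketched in a small helper. Retry and the injected Sleeper are illustrative; a production version would also honor the exception whitelist/blacklist rather than retrying every failure.

```java
import java.util.concurrent.Callable;

// Retry sketch: bounded attempts with an incrementally growing delay.
// The sleeper is injected so tests can run without real waiting.
public class Retry {
    public interface Sleeper {
        void sleep(long millis) throws InterruptedException;
    }

    public static <T> T withRetry(Callable<T> op, int maxAttempts,
                                  long baseDelayMillis, Sleeper sleeper) throws Exception {
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return op.call();
            } catch (Exception e) {
                // A real implementation would rethrow non-retryable exceptions here.
                last = e;
                if (attempt < maxAttempts) {
                    sleeper.sleep(baseDelayMillis * attempt); // incremental back-off
                }
            }
        }
        throw last;   // all attempts exhausted: surface the final failure
    }
}
```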
5.3 Scheduler Agent Supervisor Pattern: Coordinate a set of actions across distributed services and remote resources
Three roles work together: the Scheduler orchestrates tasks, the Agent communicates with remote services, and the Supervisor monitors execution and triggers compensation.
Scheduler maintains task state
Agent handles communication, retries, and error handling
Supervisor watches execution and requests compensation when needed
6. Resilience Patterns
6.1 Bulkhead Pattern: Isolate application elements into pools so that the failure of one does not affect the others
Isolation can be at thread‑pool, service, VM, or container level, preventing cascading failures.
ParallelStream sharing a common pool can starve other tasks
Dead‑letter messages occupying a queue block normal traffic
Heavy upload operations can monopolize threads, starving lightweight queries
6.2 Circuit Breaker Pattern: Handle faults that occur when connecting to a remote service or resource
A three‑state circuit breaker (closed, open, half‑open) protects callers from repeatedly invoking failing services.
Closed: count errors
Open: block calls after threshold
Half‑open: allow limited traffic to test recovery
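The three states above can be sketched as a small state machine; CircuitBreaker here is a hand-rolled illustration with an injected clock, not a library such as Resilience4j or Hystrix.

```java
// Three-state circuit breaker. Closed: calls flow and failures are counted.
// Open: calls are rejected immediately. Half-open (after a cool-down):
// a trial call decides whether to close again or re-open.
public class CircuitBreaker {
    enum State { CLOSED, OPEN, HALF_OPEN }

    private State state = State.CLOSED;
    private int failures = 0;
    private final int failureThreshold;
    private final long openMillis;
    private long openedAt;

    public CircuitBreaker(int failureThreshold, long openMillis) {
        this.failureThreshold = failureThreshold;
        this.openMillis = openMillis;
    }

    // Time is passed in explicitly so the transitions are testable.
    public synchronized boolean allowRequest(long nowMillis) {
        if (state == State.OPEN && nowMillis - openedAt >= openMillis) {
            state = State.HALF_OPEN;   // cool-down elapsed: let one probe through
        }
        return state != State.OPEN;
    }

    public synchronized void recordSuccess() {
        state = State.CLOSED;          // probe succeeded: resume normal traffic
        failures = 0;
    }

    public synchronized void recordFailure(long nowMillis) {
        failures++;
        if (state == State.HALF_OPEN || failures >= failureThreshold) {
            state = State.OPEN;        // trip: fail fast instead of calling out
            openedAt = nowMillis;
        }
    }

    public synchronized State state() { return state; }
}
```

Callers wrap each remote invocation in allowRequest / recordSuccess / recordFailure, so a failing dependency is given time to recover instead of being hammered.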
6.3 Compensating Transaction Pattern: Undo the work performed by a series of steps to achieve eventual consistency
Compensation middleware records call status, queries service state, and invokes compensating actions when necessary.
Mark critical calls for automatic compensation
Record call status in a database
Providers expose status query APIs; consumers expose compensation callbacks
Middleware orchestrates compensation based on recorded state
7. Security Patterns
7.1 Valet Key Pattern: Use a token or key that grants clients restricted direct access to a specific resource or service
Generate time‑bound tokens that clients present directly to storage or CDN services, avoiding proxying through the application.
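One way to realize such tokens is an HMAC-signed (resource, expiry) pair, assuming the storage or CDN front end shares the signing secret and can verify tokens without calling back to the application. ValetKey and the token layout below are illustrative, not any cloud provider's actual signed-URL format.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

// Valet-key sketch: the application issues a token binding a resource to an
// expiry time; storage verifies the signature and expiry on its own.
public class ValetKey {
    private final byte[] secret;

    public ValetKey(byte[] secret) {
        this.secret = secret;
    }

    public String issue(String resource, long expiresAtEpochSec) {
        String payload = resource + "|" + expiresAtEpochSec;
        return payload + "|" + sign(payload);
    }

    public boolean verify(String token, String resource, long nowEpochSec) {
        String[] parts = token.split("\\|");
        if (parts.length != 3 || !parts[0].equals(resource)) return false;
        long expiry = Long.parseLong(parts[1]);
        String payload = parts[0] + "|" + parts[1];
        // Reject expired tokens and any tampering with resource or expiry.
        return nowEpochSec < expiry && sign(payload).equals(parts[2]);
    }

    private String sign(String payload) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(secret, "HmacSHA256"));
            return Base64.getUrlEncoder().withoutPadding()
                    .encodeToString(mac.doFinal(payload.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Because the token is self-verifying, large uploads and downloads bypass the application servers entirely, which is the scalability win of this pattern.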
7.2 Federated Identity Pattern: Delegate authentication to an external identity provider
Use a dedicated identity service (e.g., OAuth 2.0 with Spring Security) to centralize login and enable single sign‑on.
In summary, these thirty patterns span design details and high‑level philosophies; they are often combined, and selecting the right ones for a given scenario can provide a breakthrough in system architecture.
ITFLY8 Architecture Home
