Mastering Distributed System Design: Core Principles Every Engineer Should Know
This article outlines essential distributed system concepts—including system decomposition, concurrency, caching strategies, online vs. offline processing, push/pull communication, load limiting, service degradation, CAP theorem, and eventual consistency—to help engineers design scalable, reliable architectures for high‑traffic applications.
In today's internet‑driven world, distributed systems are ubiquitous, powering search engines, e‑commerce sites, social platforms, and any service with massive users and high concurrency.
There is no single best architecture; each business faces unique challenges that require tailored designs.
Nevertheless, several fundamental ideas are common across distributed systems, and this article summarizes them.
Decomposition
System Decomposition
As an architect once said, "Make big systems small." Large, complex systems should be split into multiple subsystems, each with its own storage, services, and interfaces, allowing independent development, testing, deployment, and operation. Teams can use familiar languages and collaborate via well‑defined interfaces.
Subsystem Decomposition
Within a subsystem, further layering and modularization are possible. Concepts like "system," "subsystem," "layer," and "module" are relative; a complex module may become its own system, while simple modules may stay combined.
Storage Decomposition
NoSQL databases such as MongoDB are inherently distributed and support sharding easily. Relational databases like MySQL require database‑and‑table sharding, raising issues of split dimensions, join handling, and distributed transactions.
Computation Decomposition
Two approaches exist: data partitioning—splitting a large dataset into smaller chunks for parallel processing (e.g., large‑scale merge sort); and task partitioning—breaking a long task into stages that run in parallel. Frameworks like Java's Fork/Join, Hadoop's Map/Reduce, and distributed search indexing follow this pattern.
Concurrency
Increasing concurrency, often via multithreading, boosts performance—e.g., converting sequential RPC calls to asynchronous parallel calls or partitioning data to process large tables faster.
Cache
Caching is a primary performance technique; the key consideration is cache granularity. Finer granularity improves reuse but may require more lookups, while coarser granularity simplifies access but risks higher invalidation rates.
Online vs. Offline / Synchronous vs. Asynchronous
Not all workloads need real‑time results. Scenarios like internal reporting, social media feed propagation, search indexing, or payment processing can tolerate delays, allowing asynchronous processing via message queues or background jobs, and enabling read‑write separation such as MySQL master‑slave replication.
Full + Incremental
Full and incremental processing mirrors online/offline strategies—for example, full and incremental indexes in search engines, or periodic merges of small tables in OceanBase.
Push vs. Pull
State notification between nodes can be push (active notification) or pull (periodic polling). This choice impacts module responsibilities and coupling, similar to bidirectional associations in object‑oriented design.
Batch Processing
Transforming real‑time tasks into batch jobs reduces system pressure, as seen in Kafka's batch message sending or aggregating ad clicks before billing.
Write‑Heavy vs. Read‑Heavy
"Write‑light, read‑heavy" (e.g., MySQL) favors strong consistency but may suffer performance issues, whereas "write‑heavy, read‑light" (e.g., feed pre‑generation) stores results for fast reads at the cost of eventual consistency.
Read‑Write Separation
Traditional single‑node MySQL synchronously serves reads and writes, but many scenarios allow asynchronous replication, enabling master‑slave setups or OLTP/OLAP separation where online data syncs periodically to analytical systems.
Static‑Dynamic Separation
Web front‑ends often separate dynamic pages (served by web servers) from static assets (served via CDN), improving performance and reducing server load.
Hot‑Cold Separation
Historical data can be periodically moved from MySQL to Hive for long‑term storage and analysis.
Rate Limiting
During traffic spikes like flash sales, limiting request rates prevents overload, similar to controlling crowd size at popular venues.
Service Circuit Breaker and Degradation
When downstream services become unavailable, degrading them—returning default responses—keeps the primary workflow functional and maintains overall system availability.
CAP Theory
The discussed principles can be unified under the CAP theorem: Consistency (data replicas agree), Availability (service remains responsive), and Partition tolerance (network partitions are inevitable). In practice, systems balance Consistency and Availability while accepting Partition tolerance.
Examples: MySQL emphasizes Consistency, sacrificing Availability; NoSQL systems favor Partition tolerance and Availability, weakening Consistency; social feeds may relax Consistency for higher Availability.
Eventual Consistency
Distributed systems often adopt eventual consistency to mitigate the difficulty of strong consistency across partitions, typically using reliable message queues to propagate updates.
Source: https://blog.csdn.net/chunlongyu/article/details/52525604
Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
ITFLY8 Architecture Home
ITFLY8 Architecture Home - focused on architecture knowledge sharing and exchange, covering project management and product design. Includes large-scale distributed website architecture (high performance, high availability, caching, message queues...), design patterns, architecture patterns, big data, project management (SCRUM, PMP, Prince2), product design, and more.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
