Master the Distributed Systems Knowledge Map: From SOA to MSA and Beyond
This comprehensive guide walks you through the fundamentals, design patterns, consistency models, core components, and engineering practices of modern distributed systems, helping you understand micro‑service architecture, network protocols, data management, fault tolerance, and performance optimization in cloud‑native environments.
Introduction
This article outlines the essential knowledge for building distributed systems based on a micro‑service architecture (MSA). It covers theoretical foundations, design patterns, engineering practices, deployment, and operations.
Fundamental Theory
The evolution from Service‑Oriented Architecture (SOA) to Micro‑Service Architecture (MSA) is driven by the need to decouple services and enable independent deployment. SOA typically relies on a central service bus and shared databases, which create single points of failure. MSA eliminates the bus by making each service fully independent from entry to persistence, at the cost of increased orchestration complexity.
Node and Network
Nodes have progressed from physical machines to virtual machines and finally to lightweight containers that host services.
Three network models are defined:
Synchronous network – nodes execute in lockstep, latency is bounded, and a global lock can be used.
Semi‑synchronous network – the scope of locking is relaxed, so nodes coordinate only partially and limited asynchrony is allowed.
Asynchronous network – nodes run independently, latency is unbounded, and no global lock exists.
Time and Order
Physical clocks cannot guarantee ordering across nodes. Distributed systems therefore use protocols such as NTP, logical clocks, and vector clocks.
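As a rough illustration of how logical and vector clocks order events, here is a minimal Python sketch. The class and function names (`LamportClock`, `VectorClock`, `happened_before`) are hypothetical, and incrementing the receiver's own entry after merging a vector clock is a common convention assumed here, not something this article specifies.

```python
from typing import Dict


class LamportClock:
    """Scalar logical clock for a single node."""

    def __init__(self) -> None:
        self.time = 0

    def tick(self) -> int:
        # Local event or send: advance the clock by one.
        self.time += 1
        return self.time

    def receive(self, msg_time: int) -> int:
        # Receive: move past the sender's timestamp if it is ahead of ours.
        self.time = max(self.time, msg_time + 1)
        return self.time


class VectorClock:
    """One counter per node; detects concurrent events, unlike a scalar clock."""

    def __init__(self, node_id: str) -> None:
        self.node_id = node_id
        self.clock: Dict[str, int] = {node_id: 0}

    def tick(self) -> Dict[str, int]:
        # Local event or send: bump only our own entry, return a snapshot.
        self.clock[self.node_id] = self.clock.get(self.node_id, 0) + 1
        return dict(self.clock)

    def receive(self, msg_clock: Dict[str, int]) -> Dict[str, int]:
        # Element-wise maximum with the incoming clock...
        for node, t in msg_clock.items():
            self.clock[node] = max(self.clock.get(node, 0), t)
        # ...then count the receive itself as a local event (assumed convention).
        return self.tick()


def happened_before(a: Dict[str, int], b: Dict[str, int]) -> bool:
    """True if snapshot a causally precedes snapshot b."""
    keys = set(a) | set(b)
    no_entry_greater = all(a.get(k, 0) <= b.get(k, 0) for k in keys)
    some_entry_smaller = any(a.get(k, 0) < b.get(k, 0) for k in keys)
    return no_entry_greater and some_entry_smaller
```

Two snapshots for which `happened_before` is false in both directions are concurrent, which is exactly the case a scalar logical clock cannot detect.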
Logical clock update: t' = max(t, t_msg + 1)
Vector clock update: t_i' = max(t_i, t_msg_i)
Consistency Theory
Strong consistency (ACID)
Atomicity
Consistency
Isolation
Durability
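CAP theorem – A distributed system cannot simultaneously guarantee Consistency, Availability, and Partition tolerance. Because network partitions cannot be ruled out in practice, the real choice during a partition is between consistency and availability.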
FLP impossibility – In a fully asynchronous network, where message delay is unbounded, no deterministic protocol can guarantee that consensus is reached in finite time if even a single node may crash.
BASE – Basically Available, Soft State, Eventual Consistency, which relaxes ACID for higher availability.
CALM principle – Consistency As Logical Monotonicity: a program whose logic is monotonic can reach eventual consistency without a central coordinator.
CRDT (Conflict‑Free Replicated Data Types)
State‑based CRDT – merge states from all nodes.
Operation‑based CRDT – broadcast operations to all nodes.
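To make the state-based variant concrete, the sketch below implements a grow-only counter (G-Counter), one of the simplest CRDTs. The class name `GCounter` and its method names are illustrative assumptions. The key property is that `merge` takes a per-node maximum, so merging is commutative, associative, and idempotent, and replicas converge regardless of the order in which states are exchanged.

```python
from typing import Dict


class GCounter:
    """State-based grow-only counter: each node increments only its own slot."""

    def __init__(self, node_id: str) -> None:
        self.node_id = node_id
        self.counts: Dict[str, int] = {}

    def increment(self, amount: int = 1) -> None:
        if amount < 0:
            raise ValueError("a G-Counter can only grow")
        self.counts[self.node_id] = self.counts.get(self.node_id, 0) + amount

    def value(self) -> int:
        # The counter's value is the sum of every node's contribution.
        return sum(self.counts.values())

    def merge(self, other: "GCounter") -> None:
        # Per-node maximum: safe to apply repeatedly and in any order.
        for node, count in other.counts.items():
            self.counts[node] = max(self.counts.get(node, 0), count)


# Two replicas accept writes independently, then exchange state.
replica_a = GCounter("a")
replica_b = GCounter("b")
replica_a.increment(2)
replica_b.increment(3)
replica_a.merge(replica_b)
replica_b.merge(replica_a)
replica_a.merge(replica_b)  # merging again changes nothing
```

After the exchange both replicas report the same value without any coordinator deciding the outcome.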
Key protocols include Highly Available Transactions (HATs) and Zookeeper Atomic Broadcast (ZAB).
Core Distributed Systems
File systems
HDFS
FastDFS
Ceph
MooseFS
Databases
Column store: HBase
Document store: Elasticsearch, MongoDB
Key‑Value store: Redis
Distributed relational: Spanner
Computing frameworks
Offline batch: Hadoop
Real‑time analytics: Spark
Streaming: Storm, Flink/Blink
Cache
Persistent: Redis
Non‑persistent: Memcached
Message queues
Kafka
RabbitMQ
RocketMQ
ActiveMQ
Monitoring
Zookeeper (used for health checks and coordination)
Security mechanisms
Federated identity
Gateway‑proxy
Token‑based access control
Engineering Practices
Design Patterns
Typical patterns for distributed systems include reverse proxy, adapters, front‑back separation, resource aggregation, configuration separation, gateway aggregation, leader election, pipeline‑filter, sidecar, and static‑content CDN.
Availability
Health checks
Load balancing
Rate limiting (throttling)
Data Management
Cache
CQRS (Command Query Responsibility Segregation)
Event sourcing
Indexing
Materialized views
Sharding and partitioning
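Event sourcing, CQRS, and materialized views often appear together, so a small combined sketch may help. It assumes a hypothetical bank-account domain: commands (`deposit`, `withdraw`) only append events, and the read side (`balance_view`) derives its answer by replaying the log. All names are illustrative, and a real system would persist the log and update the view incrementally.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Event:
    kind: str    # "deposited" or "withdrawn"
    amount: int


class AccountEventStore:
    """Append-only log: the events, not the balance, are the source of truth."""

    def __init__(self) -> None:
        self._events: List[Event] = []

    def append(self, event: Event) -> None:
        self._events.append(event)

    def events(self) -> List[Event]:
        return list(self._events)


def balance_view(events: List[Event]) -> int:
    """Query side: a view derived by replaying the event log."""
    balance = 0
    for event in events:
        if event.kind == "deposited":
            balance += event.amount
        elif event.kind == "withdrawn":
            balance -= event.amount
    return balance


def deposit(store: AccountEventStore, amount: int) -> None:
    """Command side: validate, then record what happened."""
    if amount <= 0:
        raise ValueError("deposit must be positive")
    store.append(Event("deposited", amount))


def withdraw(store: AccountEventStore, amount: int) -> None:
    if amount <= 0:
        raise ValueError("withdrawal must be positive")
    if amount > balance_view(store.events()):
        raise ValueError("insufficient funds")
    store.append(Event("withdrawn", amount))
```

Because state is never overwritten, the balance at any earlier point can be recomputed by replaying a prefix of the log.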
Implementation Details
Reverse proxy
Adapter layer
Front‑back separation
Resource aggregation
Configuration center
Gateway aggregation, offload, routing
Leader election
Pipeline‑filter
Sidecar deployment
Static‑content CDN
Resource Scheduling
Elastic scaling replaces manual provisioning. Automatic scaling, instance termination, and replacement of faulty nodes are essential.
Network Management
Domain name registration and updates
Load management
Outbound security filtering
Unified access control
Fault Snapshot
Capture the memory distribution and thread counts at the moment of failure (e.g., Java heap and thread dumps)
Non‑intrusive bytecode instrumentation to add diagnostic logging in production without changing application code
Traffic Scheduling
Traffic passes through gateways. Strategies include:
Load balancers: hardware switches, F5, LVS/ALI‑LVS, Nginx/Tengine, VIPServer/ConfigServer
Gateway design: high‑performance, distributed, business filtering
Traffic management: request validation, CDN caching
Flow control: counters, queues, leaky bucket, token bucket, dynamic control
Rate‑limiting tools: Sentinel
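Of the flow-control strategies listed above, the token bucket is probably the easiest to sketch. The version below is a single-process illustration, not Sentinel's API: the class name `TokenBucket`, its parameters, and the injectable `clock` are assumptions made so the behavior can be checked deterministically.

```python
import time
from typing import Callable


class TokenBucket:
    """Allows bursts up to `capacity`, refilling at `rate` tokens per second."""

    def __init__(
        self,
        rate: float,
        capacity: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.rate = rate
        self.capacity = capacity
        self._clock = clock
        self._tokens = capacity      # start with a full bucket
        self._last = clock()

    def allow(self, cost: float = 1.0) -> bool:
        # Refill lazily, based on the time elapsed since the last call.
        now = self._clock()
        elapsed = now - self._last
        self._last = now
        self._tokens = min(self.capacity, self._tokens + elapsed * self.rate)
        if self._tokens >= cost:
            self._tokens -= cost
            return True
        return False                 # caller should reject or queue the request


# Usage with a fake clock so the outcome does not depend on wall time.
fake_now = [0.0]
bucket = TokenBucket(rate=1.0, capacity=2.0, clock=lambda: fake_now[0])
burst = [bucket.allow(), bucket.allow(), bucket.allow()]
fake_now[0] = 1.0                    # one second later, one token has refilled
after_refill = [bucket.allow(), bucket.allow()]
```

A leaky bucket differs mainly in that it drains requests at a fixed rate instead of permitting bursts. In a distributed gateway the token state would need to live in shared storage, which this sketch does not address.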
Service Scheduling
Service registry for state detection and lifecycle management
Version management (cluster version, rollback)
Orchestration: Kubernetes, Spring Cloud, HSF, Zookeeper + Dubbo
Service control: registration, health check, degradation, circuit breaker (Hystrix), idempotency (global ID, Snowflake)
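The circuit-breaker idea behind tools such as Hystrix can be shown in a few lines. This is a simplified, single-threaded sketch and not Hystrix's actual API: `CircuitBreaker`, `CircuitOpenError`, `failure_threshold`, and `reset_timeout` are hypothetical names, and a production breaker would also track rolling error rates and handle concurrency.

```python
import time
from typing import Any, Callable, Optional


class CircuitOpenError(RuntimeError):
    """Raised when a call is rejected because the circuit is open."""


class CircuitBreaker:
    def __init__(
        self,
        failure_threshold: int = 3,
        reset_timeout: float = 30.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self._clock = clock
        self._failures = 0
        self._opened_at: Optional[float] = None

    @property
    def state(self) -> str:
        if self._opened_at is None:
            return "closed"
        if self._clock() - self._opened_at < self.reset_timeout:
            return "open"
        return "half-open"

    def call(self, func: Callable[..., Any], *args: Any, **kwargs: Any) -> Any:
        if self.state == "open":
            # Fail fast instead of piling more load onto a failing dependency.
            raise CircuitOpenError("circuit is open; failing fast")
        was_half_open = self._opened_at is not None
        try:
            result = func(*args, **kwargs)
        except Exception:
            self._failures += 1
            # A failed trial call, or too many failures, (re)opens the circuit.
            if was_half_open or self._failures >= self.failure_threshold:
                self._opened_at = self._clock()
            raise
        # Any success closes the circuit and clears the failure count.
        self._failures = 0
        self._opened_at = None
        return result
```

Callers typically catch `CircuitOpenError` and return a degraded response, which ties the breaker to the degradation item in the same list.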
Data Scheduling
State transfer to global storage (e.g., login info in Redis)
Horizontal scaling via sharding, partitioning, replication
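One common way to implement sharding so that adding or removing a node moves only a small share of the keys is consistent hashing. The article does not name a specific technique, so treat the following as one possible approach; `HashRing`, `vnodes`, and the node names are illustrative assumptions.

```python
import bisect
import hashlib
from typing import Iterable, List, Tuple


class HashRing:
    """Consistent-hash ring with virtual nodes for a more even key spread."""

    def __init__(self, nodes: Iterable[str] = (), vnodes: int = 64) -> None:
        self._vnodes = vnodes
        self._ring: List[Tuple[int, str]] = []   # sorted (position, node) pairs
        for node in nodes:
            self.add(node)

    @staticmethod
    def _hash(key: str) -> int:
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    def add(self, node: str) -> None:
        # Each physical node owns several positions on the ring.
        for i in range(self._vnodes):
            bisect.insort(self._ring, (self._hash(f"{node}#{i}"), node))

    def remove(self, node: str) -> None:
        self._ring = [(pos, n) for pos, n in self._ring if n != node]

    def lookup(self, key: str) -> str:
        if not self._ring:
            raise LookupError("ring has no nodes")
        # First ring position clockwise from the key's hash, wrapping around.
        idx = bisect.bisect_right(self._ring, (self._hash(key), ""))
        return self._ring[idx % len(self._ring)][1]
```

When a node is removed, only the keys it owned are reassigned, and when a node is added, keys either stay where they were or move to the new node. Replication can be layered on top by also storing each key on the next few distinct nodes along the ring.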
Automation & Operations
Configuration center (e.g., Switch, Diamond)
Deployment strategies: stop‑the‑world, rolling, blue‑green, canary, A/B testing
Job scheduling: SchedulerX, Spring scheduled tasks
Application management: restart, offline, log cleanup
Fault Tolerance
Active handling: retries (spring‑retry)
Passive handling: transaction compensation, idempotent operations
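Retries and idempotency belong together: a retry is only safe if repeating the operation cannot apply its effect twice. The sketch below is a Python analogue of the idea, not the spring‑retry API. The names `retry` and `IdempotentProcessor`, and the in-memory result cache, are assumptions for illustration; a real service would store processed request IDs in shared, durable storage.

```python
import random
import time
from typing import Any, Callable, Dict


def retry(
    func: Callable[[], Any],
    attempts: int = 3,
    base_delay: float = 0.1,
    sleep: Callable[[float], None] = time.sleep,
) -> Any:
    """Call func, retrying on any exception with exponential backoff and jitter."""
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except Exception:
            if attempt == attempts:
                raise                     # out of attempts: surface the error
            delay = base_delay * 2 ** (attempt - 1)
            # Random jitter keeps many clients from retrying in lockstep.
            sleep(delay + random.uniform(0, delay))


class IdempotentProcessor:
    """Runs each request ID at most once and replays the stored result."""

    def __init__(self) -> None:
        self._results: Dict[str, Any] = {}

    def process(self, request_id: str, handler: Callable[[], Any]) -> Any:
        if request_id in self._results:
            return self._results[request_id]   # duplicate: no second side effect
        result = handler()
        self._results[request_id] = result
        return result
```

The request ID plays the role of the global ID mentioned under service control: the client generates it once and reuses it on every retry of the same logical request.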
Performance Tuning
Performance optimization spans distributed lock design, high‑concurrency programming, and asynchronous event‑driven models.
Distributed lock for cache consistency
High‑concurrency patterns
Asynchronous event‑driven programming
Conclusion
Distributed systems provide scalability but introduce complexity and new failure modes. When possible, a single‑node solution should be considered first. If a distributed approach is required, the combination of Docker, Kubernetes, and Spring Cloud offers a practical foundation.
Alibaba Cloud Native
We publish cloud-native tech news, curate in-depth content, host regular events and live streams, and share Alibaba product and user case studies. Join us to explore and share the cloud-native insights you need.
