Scalable Web Architecture: Layers, Load Balancing, and Storage
This article explains the layered architecture of large‑scale web systems, covering flexible component choices, load distribution strategies, business service and communication layers, storage options from file to object systems, and key evaluation criteria such as cost, scalability, security, and maintainability.
1. Architecture Layer Diagram
The diagram illustrates typical components of a web system architecture and common technology stacks for each layer, emphasizing that the architecture is flexible and not every layer or technology is required for every project.
Simple CRM systems may omit caching layers like K‑V stores.
Low‑traffic systems might not need a load‑balancing layer.
Inter-service (business) communication generally avoids plain HTTP requests because of their latency and per-request overhead, such as headers the services do not need; HTTP is better suited to client-to-server calls (web, iOS, Android).
Caching systems such as Redis are treated as key‑value databases in the data storage layer.
Layers are not strictly dependent; for example, static image requests can bypass the business layer and go directly to distributed file storage.
2. Load Distribution Layer
Load balancing distributes external traffic across multiple internal processing nodes, similar to traffic control in everyday life. Large‑scale web services (hundreds of millions of daily page views) require multiple servers, and the load layer routes requests based on rules—for instance, directing image requests to storage and order requests to the order service.
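The rule-based routing described above can be sketched as follows. This is a minimal illustration, not the configuration of any particular load balancer: the pool names, addresses, and path prefixes are all hypothetical, and round-robin is assumed as the way to pick a node inside each pool.

```python
from itertools import cycle

# Hypothetical backend pools; the addresses are placeholders, not from the article.
POOLS = {
    "storage": ["10.0.1.10:8080", "10.0.1.11:8080"],
    "order": ["10.0.2.10:9000", "10.0.2.11:9000", "10.0.2.12:9000"],
    "default": ["10.0.3.10:8000"],
}

# Ordered routing rules: the first matching path prefix wins.
RULES = [
    ("/images/", "storage"),
    ("/orders/", "order"),
]


class Router:
    """Pick a pool by path prefix, then a node within the pool by round-robin."""

    def __init__(self, pools, rules):
        self.rules = rules
        # One independent round-robin iterator per pool.
        self._cursors = {name: cycle(nodes) for name, nodes in pools.items()}

    def pool_for(self, path):
        for prefix, pool in self.rules:
            if path.startswith(prefix):
                return pool
        return "default"

    def route(self, path):
        pool = self.pool_for(path)
        return pool, next(self._cursors[pool])


if __name__ == "__main__":
    router = Router(POOLS, RULES)
    for path in ["/images/logo.png", "/orders/42", "/images/banner.jpg", "/about"]:
        print(path, "->", router.route(path))
```

Keeping the rules in an ordered list makes precedence explicit, which matters once prefixes overlap. A production load layer would also need health checks and weights, which this sketch leaves out.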
Common load‑balancing architectures include:
Standalone Nginx or HAProxy
LVS (DR) + Nginx
DNS round‑robin + LVS + Nginx
Smart DNS + LVS + Nginx
Further articles will detail each scheme.
3. Business Service and Communication Layer
3.1 Overview
The core business layer handles orders, construction management, medical services, payments, logging, etc. In medium‑to‑large systems, subsystems are decoupled; a system may know only the underlying services (e.g., authentication) but not peer systems. Complex workflows often require inter‑service calls.
3.2 HTTP Requests Not Recommended
Using HTTP for inter-service communication incurs TCP connection setup overhead, exchanges headers the services do not need, and provides no built-in mechanism for keeping call context consistent. Connection pooling can reduce connection setup time, but the other drawbacks remain, so the article recommends limiting HTTP to client-to-server scenarios and advises against it for service-to-service calls.
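To make the header-overhead point concrete, the sketch below compares the bytes surrounding the same payload in a minimal HTTP/1.1 request and in a length-prefixed binary frame. The frame layout (4-byte length plus 2-byte method id), the host name, the path, and the method id are assumptions made for illustration; the article does not prescribe a specific wire format, and this comparison says nothing about latency.

```python
import struct

payload = b'{"order_id": 42}'

# A minimal HTTP/1.1 POST carrying the payload. Host and path are placeholders.
http_request = (
    b"POST /orders/status HTTP/1.1\r\n"
    b"Host: order-service.internal\r\n"
    b"Content-Type: application/json\r\n"
    b"Content-Length: " + str(len(payload)).encode() + b"\r\n"
    b"\r\n"
) + payload

# Hypothetical binary framing: 4-byte payload length + 2-byte method id,
# both in network byte order, followed by the payload.
METHOD_ORDER_STATUS = 7  # hypothetical method identifier
HEADER = struct.Struct("!IH")
frame = HEADER.pack(len(payload), METHOD_ORDER_STATUS) + payload


def decode_frame(data):
    """Split a frame back into (method id, payload)."""
    length, method = HEADER.unpack(data[:HEADER.size])
    return method, data[HEADER.size:HEADER.size + length]


if __name__ == "__main__":
    print("HTTP overhead bytes: ", len(http_request) - len(payload))
    print("frame overhead bytes:", len(frame) - len(payload))
    print("decoded:", decode_frame(frame))
```

Real clients usually send more headers than this minimal request does, so the gap tends to be larger in practice.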
4. Data Storage Layer
4.1 File Storage Basics
Demonstrates creating an Ext4 filesystem on CentOS 6.5 using fdisk, mkfs, and mounting, highlighting physical blocks, sectors, and the abstraction provided by file systems.
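The relationship between file sizes and fixed-size blocks can be illustrated with a little arithmetic. The sketch below assumes a 4096-byte block size, a common value for Ext4 but not one stated in the article, and ignores metadata, sparse files, and other filesystem-specific details. The function names are hypothetical.

```python
import os


def blocks_needed(file_size, block_size):
    """Number of whole blocks required to hold file_size bytes."""
    if file_size < 0 or block_size <= 0:
        raise ValueError("file_size must be >= 0 and block_size must be > 0")
    # Integer ceiling division: a partially filled block still occupies a full block.
    return (file_size + block_size - 1) // block_size


def allocated_bytes(file_size, block_size):
    """Bytes actually reserved on disk for the file's data."""
    return blocks_needed(file_size, block_size) * block_size


if __name__ == "__main__":
    block_size = 4096  # assumed block size, for illustration only
    for size in (1, 4096, 10_000):
        print(size, "bytes ->", blocks_needed(size, block_size), "block(s),",
              allocated_bytes(size, block_size), "bytes allocated")

    # On POSIX systems the mounted filesystem reports its own sizes.
    if hasattr(os, "statvfs"):
        st = os.statvfs("/")
        print("fragment size reported for /:", st.f_frsize)
```

The gap between a file's size and its allocated bytes is the slack that the filesystem abstraction hides from applications.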
4.2 Block and File Storage
Explains the need for scalable, shared storage. Block storage separates disks from hosts and carries block-level I/O over a network using protocols such as Fibre Channel (FC) or iSCSI, both of which transport SCSI commands; it offers high throughput but limited sharing. File storage (for example FTP, NFS, or NAS appliances) provides network-shared access at lower performance.
When faced with heavy read/write pressure, block storage is preferred; when sharing files is essential, file storage is used, each with trade‑offs.
4.3 Object Storage Systems
Object storage aims to combine the high throughput of block storage with the sharing capability of file storage. Examples include Swift, Ceph, and Ozone. Such systems typically use metadata services and coordination nodes to manage distributed objects, which gives them scalability and fault tolerance.
Object storage is a distributed file system, though not all distributed file systems are object storage.
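As a rough illustration of how objects can be spread over nodes with replicas, the toy in-memory store below places each key on a fixed number of nodes using rendezvous (highest-random-weight) hashing. This is a swapped-in technique chosen for brevity, not the placement algorithm of Swift, Ceph, or Ozone; the class, its methods, and the node names are hypothetical.

```python
import hashlib


class ToyObjectStore:
    """In-memory stand-in: flat keys, data plus metadata, replicated across nodes."""

    def __init__(self, node_names, replicas=2):
        self.node_names = list(node_names)
        if replicas > len(self.node_names):
            raise ValueError("replicas cannot exceed the number of nodes")
        self.replicas = replicas
        self.nodes = {name: {} for name in self.node_names}

    def placement(self, key):
        """Rank nodes by hash(node, key) and keep the top `replicas` nodes."""
        ranked = sorted(
            self.node_names,
            key=lambda n: hashlib.sha256(f"{n}:{key}".encode()).hexdigest(),
            reverse=True,
        )
        return ranked[: self.replicas]

    def put(self, key, data, **metadata):
        for node in self.placement(key):
            self.nodes[node][key] = (data, dict(metadata))

    def get(self, key, failed=()):
        """Read from the first replica that is not marked as failed."""
        for node in self.placement(key):
            if node not in failed and key in self.nodes[node]:
                return self.nodes[node][key]
        raise KeyError(key)


if __name__ == "__main__":
    store = ToyObjectStore(["node-a", "node-b", "node-c"], replicas=2)
    store.put("photos/cat.jpg", b"...", content_type="image/jpeg")
    replicas = store.placement("photos/cat.jpg")
    print("replicas:", replicas)
    # The object stays readable when the first replica is unavailable.
    print(store.get("photos/cat.jpg", failed=(replicas[0],)))
```

Because placement is computed from the key alone, any client can locate an object without a central lookup, which is one way such systems scale out.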
4.4 Database Storage
Future articles will cover MySQL architecture, performance tuning, and other databases such as Cassandra, HBase, and MongoDB.
5. Evaluating Architecture Characteristics
5.1 Construction Cost
Considers design, hardware, operation, and third‑party service costs; architects must avoid over‑design.
5.2 Scalability and Planning
Describes horizontal and vertical scaling with minimal service disruption.
5.3 Attack Resistance
Focuses on preventing and mitigating external (DoS/DDoS) and internal attacks.
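One common building block for limiting request floods is a token-bucket rate limiter, sketched below. The article does not name a specific mitigation, so this is only one possible example; the class name and parameters are hypothetical, and a limiter like this does not replace network-level defenses against large distributed attacks.

```python
import time


class TokenBucket:
    """Allow bursts up to `capacity` requests, refilled at `rate` tokens per second."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock  # injectable so the limiter can be tested deterministically
        self.tokens = float(capacity)
        self.updated = clock()

    def allow(self, cost=1.0):
        now = self.clock()
        # Refill according to elapsed time, never exceeding capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


if __name__ == "__main__":
    bucket = TokenBucket(rate=5, capacity=10)
    accepted = sum(bucket.allow() for _ in range(100))
    print("accepted", accepted, "of 100 back-to-back requests")
```

In practice one bucket would typically be kept per client identifier, such as an IP address or API key, so that one noisy source cannot exhaust the allowance of others.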
5.4 Disaster Recovery Levels
Discusses cluster, distributed, and cross‑site disaster recovery strategies.
5.5 Business Fit
Architecture must serve business needs; technology choices (SOA, message queues) should be driven by specific scenarios.
5.6 Maintenance Difficulty
Evaluates operational complexity and ongoing maintenance costs.
6. Additional Notes
Detailed designs for load, business, and storage layers will be explored in subsequent articles.
Data analysis layers, such as Hadoop ecosystems, will also be introduced.
ITFLY8 Architecture Home
