Technical Summary of Large-Scale Distributed E‑Commerce Website Architecture
This article provides a comprehensive technical overview of large distributed website architecture, covering performance, high availability, scalability, security, and agility, and illustrates the evolution, design patterns, and practical optimization techniques for modern e‑commerce platforms.
This article is a technical summary for learning large distributed website architecture. It gives an overview of designing a high‑performance, highly‑available, scalable, and extensible distributed site, offering both reading notes and personal experience for reference.
1. Large Distributed Website Architecture Technologies
1. Characteristics of Large Websites
Many users, widely distributed.
High traffic, high concurrency.
Massive data, high service availability.
Harsh security environment, prone to network attacks.
Many functions, rapid changes, frequent releases.
Gradual growth from small to large.
User‑centric.
Basic services are free; premium experiences are paid for.
2. Goals of Large Website Architecture
High performance: provide fast access experience.
High availability: services remain accessible at all times.
Scalability: increase or decrease hardware to adjust processing capacity.
Security: ensure secure access, data encryption, and safe storage.
Extensibility: easily add or remove modules and functions.
Agility: respond quickly to changing needs.
3. Architecture Patterns for Large Websites
Layered: typically application, service, data, management, and analytics layers.
Segmentation: split by business/module/function, e.g., front‑end layer divided into homepage, user center.
Distributed: deploy applications on multiple physical machines and coordinate via remote calls.
Cluster: deploy multiple instances of an application/module and use load balancing for external access.
Cache: place data close to the application or user to accelerate access.
Asynchronous: convert synchronous operations to asynchronous, using notification or polling after processing.
Redundancy: add replicas to improve availability, security, and performance.
Security: provide effective solutions for known issues and mechanisms for unknown threats.
Automation: replace repetitive manual tasks with tools and machines.
Agility: quickly adapt to requirement changes and business growth.
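The asynchronous pattern above (convert a blocking call into one that notifies on completion) can be sketched with the JDK's CompletableFuture. This is a minimal illustration; the task name and return value are made up for the example.

```java
import java.util.concurrent.CompletableFuture;
import java.util.function.Supplier;

// Sketch of the asynchronous pattern: the caller submits work and is
// notified via a callback instead of blocking on the result.
public class AsyncExample {
    // Wraps a synchronous supplier into an asynchronous call on the common pool.
    public static CompletableFuture<String> async(Supplier<String> task) {
        return CompletableFuture.supplyAsync(task);
    }

    public static void main(String[] args) {
        async(() -> "order-1001 persisted")                      // synchronous work, run elsewhere
            .thenAccept(r -> System.out.println("notified: " + r)) // notification-style callback
            .join(); // wait only so the demo can exit cleanly
    }
}
```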
4. High‑Performance Architecture
Centering on the user, provide fast web access. Key parameters include short response time, high concurrency, high throughput, and stable performance.
It can be divided into front‑end optimization, application‑layer optimization, code‑level optimization, and storage‑layer optimization.
Front‑end optimization: the part before business logic.
Browser optimization: reduce HTTP requests, use browser cache, enable compression, place CSS/JS appropriately, async JS, reduce cookie transmission; use CDN acceleration and reverse proxy.
Application‑layer optimization: use caching, asynchronous processing, clustering.
Code optimization: proper architecture, multithreading, resource reuse (object pool, thread pool), good data structures, JVM tuning, singleton, cache, etc.
Storage optimization: cache, SSD, fiber transmission, read/write tuning, disk redundancy, distributed storage (HDFS), NoSQL, etc.
5. High‑Availability Architecture
Large websites must remain accessible at all times. Because of their complexity, distributed deployment, and reliance on commodity servers and open-source databases and operating systems, achieving high availability is challenging, and failures are inevitable.
Improving availability must be considered at the architectural level during planning. Industry often uses "nines" to denote availability, e.g., four nines (99.99%) allow about 53 minutes of downtime per year.
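The "nines" figures follow from simple arithmetic over the minutes in a year, as this small sketch shows:

```java
// Downtime budget implied by an availability target ("nines").
// Availability is treated as a fraction of a 365-day year (525,600 minutes).
public class AvailabilityBudget {
    // Returns allowed downtime in minutes per year for e.g. 0.9999 (four nines).
    public static double downtimeMinutesPerYear(double availability) {
        double minutesPerYear = 365 * 24 * 60; // 525,600
        return (1.0 - availability) * minutesPerYear;
    }

    public static void main(String[] args) {
        System.out.printf("99.9%%  -> %.1f min/year%n", downtimeMinutesPerYear(0.999));  // ~525.6
        System.out.printf("99.99%% -> %.1f min/year%n", downtimeMinutesPerYear(0.9999)); // ~52.6
    }
}
```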
Different layers adopt different strategies, typically redundancy and failover.
Application layer: design stateless services; use load balancing (with session synchronization) for high availability.
Service layer: load balancing, hierarchical management, fast failure (timeouts), asynchronous calls, service degradation, idempotent design, etc.
Data layer: redundant backups (cold, hot [sync/async], warm), failover; based on CAP theorem (consistency, availability, partition tolerance).
6. Scalability Architecture
Scalability means adjusting processing capacity by adding or removing hardware without changing the original design.
Application layer: vertical or horizontal splitting, then load balance each partition (DNS round‑robin, HTTP reverse proxy, IP layer, data‑link layer).
Service layer: similar to application layer.
Data layer: database sharding, table partitioning, NoSQL; common algorithms include hash and consistent hash.
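Consistent hashing, mentioned above for the data layer, maps keys onto a hash ring of (virtual) nodes so that adding or removing a node relocates only a fraction of the keys. A minimal sketch, with illustrative node names and a simple FNV-1a hash:

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Sketch of consistent hashing: keys and virtual nodes share one hash ring;
// a key is served by the first node clockwise from its hash position.
public class ConsistentHash {
    private final TreeMap<Integer, String> ring = new TreeMap<>();
    private final int virtualNodes;

    public ConsistentHash(List<String> nodes, int virtualNodes) {
        this.virtualNodes = virtualNodes;
        for (String n : nodes) addNode(n);
    }

    public void addNode(String node) {
        // Virtual nodes spread each physical node around the ring for balance.
        for (int i = 0; i < virtualNodes; i++)
            ring.put(hash(node + "#" + i), node);
    }

    public String nodeFor(String key) {
        // First ring position at or after the key's hash, wrapping around.
        Map.Entry<Integer, String> e = ring.ceilingEntry(hash(key));
        return (e != null ? e : ring.firstEntry()).getValue();
    }

    private static int hash(String s) {
        int h = 0x811c9dc5; // FNV-1a: cheap, reasonably well distributed
        for (char c : s.toCharArray()) { h ^= c; h *= 16777619; }
        return h;
    }

    public static void main(String[] args) {
        ConsistentHash ch = new ConsistentHash(Arrays.asList("db-0", "db-1", "db-2"), 100);
        System.out.println("user:42 -> " + ch.nodeFor("user:42"));
    }
}
```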
7. Extensibility Architecture
Allows easy addition or removal of functional modules, providing good code/module level extensibility.
Modularization, componentization: high cohesion, low coupling, improve reusability and extensibility.
Stable interfaces: define stable APIs so internal structures can change without affecting callers.
Design patterns: apply OOP principles and design patterns for code‑level design.
Message queues: decouple modules via asynchronous messaging.
Distributed services: expose common modules as services for reuse and extension.
8. Security Architecture
Provide effective solutions for known problems and mechanisms for unknown/potential threats. First, raise security awareness and establish policies such as regular password changes, weekly security scans, and institutionalized security measures.
Infrastructure security: procure hardware from reputable sources, choose secure operating systems, patch vulnerabilities promptly, install anti‑virus and firewalls, implement DDoS protection, intrusion detection, and subnet isolation.
Application security: address common issues (XSS, injection, CSRF, error leakage, file upload, path traversal) during development; use Web Application Firewalls like ModSecurity and conduct vulnerability scans.
Data confidentiality: secure storage (reliable devices, regular backups), secure saving (encryption, access control), and secure transmission (prevent data theft and tampering).
Common cryptographic algorithms: hashing (MD5, SHA; note that MD5 and SHA‑1 are considered broken and should not be used for new security‑sensitive designs), symmetric (DES, 3DES, the RC family), asymmetric (RSA).
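As a concrete example of the hashing category, the JDK's own MessageDigest produces a fixed-size one-way digest. This sketch uses SHA-256 (from the SHA-2 family) rather than the deprecated MD5/SHA-1:

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;

// Hash functions are one-way digests: the same input always yields the
// same fixed-size output, and the input cannot be recovered from it.
public class HashDemo {
    public static String sha256Hex(String input) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            StringBuilder sb = new StringBuilder();
            for (byte b : md.digest(input.getBytes(StandardCharsets.UTF_8)))
                sb.append(String.format("%02x", b)); // hex-encode each byte
            return sb.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e); // SHA-256 is always available in the JDK
        }
    }

    public static void main(String[] args) {
        System.out.println(sha256Hex("password123")); // 64 hex characters (256 bits)
    }
}
```

Note that for password storage specifically, a plain fast hash is not enough; a salted, slow scheme (bcrypt, scrypt, PBKDF2) is the usual recommendation.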
9. Agility
Architecture and operations must adapt to change, providing high stretchability and extensibility to handle rapid business growth and traffic spikes.
Beyond the technical elements above, agile management and development practices should be introduced to align business, product, technology, and operations.
10. Example of Large‑Scale Architecture
The example uses a seven‑layer logical architecture: client layer, front‑end optimization layer, application layer, service layer, data storage layer, big‑data storage layer, and big‑data processing layer.
Client layer: supports PC browsers and mobile apps; mobile apps can reach the site directly by IP or through the reverse proxy.
Front‑end layer: DNS load balancing, CDN acceleration, reverse proxy.
Application layer: website application cluster; vertical splitting by business (e.g., product, member center).
Service layer: provides common services such as user, order, payment.
Data layer: relational DB cluster (read/write separation), NoSQL cluster, distributed file system cluster, distributed cache.
Big‑data storage layer: collects logs from application and service layers, structured and semi‑structured data.
Big‑data processing layer: offline analysis via MapReduce or real‑time analysis via Storm, storing results in relational databases for downstream use.
2. Evolution of Large E‑Commerce Site Architecture
A mature large site (e.g., Taobao, Tmall, Tencent) does not start with a complete high‑performance, high‑availability, highly‑scalable design; it evolves as user volume and business functions grow, changing development models, technical stacks, and design philosophies.
Different business characteristics lead to different architectural emphases, but common technologies can be identified across sites.
1. Initial Architecture
All components (application, database, files) deployed on a single server.
2. Separation of Application, Data, and Files
As traffic grows, separate servers are used for application, database, and file storage, each with hardware tuned for its role.
3. Using Cache to Improve Performance
Cache hot data (80% of requests hit 20% of data) to reduce latency. Common approaches include local cache, distributed cache, CDN, and reverse proxy.
Local cache (e.g., OSCache) is fast but limited in size; distributed cache (e.g., Memcached, Redis) scales to massive data.
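Because the hot 20% of data serves most requests, a bounded local cache with least-recently-used eviction keeps exactly that working set in memory. A minimal sketch built on the JDK's LinkedHashMap (the SKU keys are illustrative):

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Minimal local-cache sketch: LinkedHashMap in access order evicts the
// least-recently-used entry once capacity is exceeded, so hot data stays.
public class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int capacity;

    public LruCache(int capacity) {
        super(16, 0.75f, true); // accessOrder = true -> iteration order is LRU order
        this.capacity = capacity;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > capacity; // called on every put; true evicts the eldest
    }

    public static void main(String[] args) {
        LruCache<String, String> cache = new LruCache<>(2);
        cache.put("sku:1", "phone");
        cache.put("sku:2", "laptop");
        cache.get("sku:1");          // touch sku:1 so it becomes most recent
        cache.put("sku:3", "watch"); // evicts sku:2, the least recently used
        System.out.println(cache.keySet()); // [sku:1, sku:3]
    }
}
```

Production local caches (e.g., Guava Cache, Caffeine) add expiry, size weighing, and statistics on top of the same idea.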
4. Application Server Clustering
Deploy a cluster of application servers behind a load balancer to share request load.
Load balancing options: hardware (F5), software (LVS – layer 4, Nginx/HAProxy – layer 7). LVS offers higher performance; Nginx/HAProxy provide richer configuration such as static‑resource separation.
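The simplest distribution strategy such balancers apply is round-robin (Nginx's default for an upstream group). A sketch with made-up server addresses:

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Sketch of round-robin load balancing: requests rotate evenly across
// the cluster, independent of request content.
public class RoundRobinBalancer {
    private final List<String> servers;
    private final AtomicInteger counter = new AtomicInteger();

    public RoundRobinBalancer(List<String> servers) {
        this.servers = servers;
    }

    public String next() {
        // floorMod keeps the index valid even if the counter wraps negative.
        int i = Math.floorMod(counter.getAndIncrement(), servers.size());
        return servers.get(i);
    }

    public static void main(String[] args) {
        RoundRobinBalancer lb = new RoundRobinBalancer(
                List.of("10.0.0.1:8080", "10.0.0.2:8080", "10.0.0.3:8080"));
        for (int i = 0; i < 4; i++) System.out.println(lb.next()); // wraps back to the first
    }
}
```

Real balancers layer weights, health checks, and session affinity on top of this rotation.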
5. Database Read/Write Separation and Sharding
To alleviate database bottlenecks, use master‑slave replication for read/write separation and split databases/tables horizontally or vertically.
6. CDN and Reverse Proxy for Performance
Deploy CDN to cache content near users in different regions, reducing latency. Use reverse proxy (e.g., Squid, Nginx) to serve cached responses before hitting application servers.
7. Distributed File Systems
When file volume grows, adopt distributed file systems such as GFS, HDFS, or TFS.
8. NoSQL and Search Engines
For massive data queries, combine NoSQL databases (MongoDB, HBase, Redis) with search engines (Lucene, Solr, Elasticsearch).
9. Business‑Level Service Splitting
As the monolithic application grows unwieldy, split it into business‑level services (e.g., news, web, image) that communicate via messaging or shared data storage.
10. Building Distributed Services
Extract common functionalities (user, order, payment, security) into distributed services using frameworks like Dubbo.
3. One‑Page Overview of E‑Commerce Architecture
4. Large E‑Commerce Site Architecture Case Study
1. Reasons for Choosing an E‑Commerce Case
Distributed large‑scale sites fall into three categories: portals (e.g., NetEase, Sina), social networks (e.g., campus sites, Kaixin), and e‑commerce (e.g., Alibaba, JD, Gome, Autohome). E‑commerce combines characteristics of portals and social sites, making it a suitable case study.
2. E‑Commerce Requirements
Build a full‑category B2C platform with online purchase, payment, and cash‑on‑delivery.
Online customer service chat.
User reviews and ratings after purchase.
Integration with existing ERP/inventory systems.
Support 3‑5 years of business growth.
Target 10 million registered users within 3‑5 years.
Handle promotional events like Double 11, Double 12, etc.
Reference features from JD or Gome.
3. Initial Simple Architecture
Typical early setup: three servers – one for application, one for database, one for NFS file storage.
4. Capacity Estimation
Based on 10 million users, estimate daily UV = 2 million with an average of 30 page views per user → 60 million PV per day. Assuming 80% of PV arrives in 20% of the day (4.8 hours), peak traffic ≈ 10 million PV per hour → ≈ 167 k requests per minute → ≈ 2.8 k per second; tripling for spikes gives ≈ 8.3 k per second.
Assuming one Tomcat instance handles 300 QPS, need ~10 servers for normal load and ~30 for peak.
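The estimate above is plain arithmetic; spelled out as a sketch (all inputs are the assumed figures from the text):

```java
// Back-of-envelope capacity estimate from the assumed traffic figures.
public class CapacityEstimate {
    static final long DAILY_PV = 2_000_000L * 30; // 2M UV x 30 PV = 60M PV/day

    // Assume 80% of traffic arrives in 20% of the day (4.8 hours).
    static double steadyPeakPerSecond() { return DAILY_PV * 0.8 / (4.8 * 3600); }

    // Promotional spikes assumed at 3x the steady peak.
    static double burstPerSecond() { return steadyPeakPerSecond() * 3; }

    static long serversNeeded(double qps, long qpsPerServer) {
        return (long) Math.ceil(qps / qpsPerServer);
    }

    public static void main(String[] args) {
        long perTomcat = 300; // assumed QPS one Tomcat instance sustains
        System.out.printf("steady: %.0f req/s -> %d servers%n",
                steadyPeakPerSecond(), serversNeeded(steadyPeakPerSecond(), perTomcat));
        System.out.printf("burst:  %.0f req/s -> %d servers%n",
                burstPerSecond(), serversNeeded(burstPerSecond(), perTomcat));
    }
}
```

The burst figure works out to about 28 servers; provisioning ~30 leaves some headroom.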
5. Architecture Analysis and Optimization
Deploy many servers for peak load; consider over‑provisioning waste.
Current monolithic deployment causes tight coupling; need vertical and horizontal splitting.
Redundant code across applications.
Session synchronization consumes memory and bandwidth.
Database becomes a bottleneck due to frequent access.
Optimization measures include business splitting, application clustering, multi‑level caching, distributed session (single sign‑on), database clustering (read/write separation, sharding), service‑oriented architecture, message queues, and other techniques.
6. Detailed Optimizations
6.1 Business Splitting
Separate vertical domains: product, shopping cart, payment, review, customer service, and integration interfaces. Define core (product, shopping, payment) and non‑core systems.
6.2 Application Cluster Deployment
Deploy each split service on multiple machines, use RPC for communication, and load balancers for high availability.
6.3 Multi‑Level Caching
Use local cache for immutable or slowly changing data and distributed cache (e.g., Redis) for hot data; fallback to database if both miss.
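The lookup order above (local cache, then distributed cache, then database) can be sketched as follows. Plain HashMaps stand in for the local cache and for Redis, and the loader function stands in for the database query; all names are illustrative.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Sketch of a multi-level cache read path with backfill on miss.
public class MultiLevelCache {
    private final Map<String, String> local = new HashMap<>();        // stand-in for in-process cache
    private final Map<String, String> distributed = new HashMap<>();  // stand-in for Redis

    public String get(String key, Function<String, String> dbLoader) {
        String v = local.get(key);            // 1) check the local cache
        if (v != null) return v;
        v = distributed.get(key);             // 2) check the distributed cache
        if (v == null) {
            v = dbLoader.apply(key);          // 3) fall back to the database
            distributed.put(key, v);          // backfill the distributed level
        }
        local.put(key, v);                    // backfill the local level
        return v;
    }

    public static void main(String[] args) {
        MultiLevelCache cache = new MultiLevelCache();
        Function<String, String> db = k -> "row-for-" + k;
        System.out.println(cache.get("sku:1", db)); // misses both caches, hits the "database"
        System.out.println(cache.get("sku:1", db)); // now served from the local cache
    }
}
```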
6.4 Distributed Session / Single Sign‑On
Store session data in a shared cache (Redis) with expiration, enabling stateless application servers.
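The idea can be sketched as a shared session store keyed by session id: any application server behind the load balancer reads and writes the same entry, so the servers themselves stay stateless. A ConcurrentHashMap stands in for Redis here, and a real deployment would also set a TTL on each session.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Sketch of a distributed session: servers keep no session state locally;
// all of it lives in a shared store addressed by session id.
public class SessionStore {
    private final Map<String, Map<String, String>> store = new ConcurrentHashMap<>();

    public void put(String sessionId, String field, String value) {
        store.computeIfAbsent(sessionId, id -> new ConcurrentHashMap<>()).put(field, value);
    }

    public String get(String sessionId, String field) {
        Map<String, String> session = store.get(sessionId);
        return session == null ? null : session.get(field);
    }

    public static void main(String[] args) {
        SessionStore shared = new SessionStore();
        // "Server A" writes after login; "Server B" reads the same session id,
        // so any instance behind the load balancer can serve the user.
        shared.put("sess-abc123", "userId", "10001");
        System.out.println(shared.get("sess-abc123", "userId")); // 10001
    }
}
```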
6.5 Database Cluster (Read/Write Separation, Sharding)
Each business subsystem has its own database; large databases are further sharded; read/write separation applied on top.
6.6 Service‑Oriented Architecture
Extract common functionalities into reusable services (e.g., member service).
6.7 Message Queues
Use MQ (e.g., RabbitMQ) to decouple modules; order placement writes to queue, then inventory and delivery services consume asynchronously.
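The decoupling described above can be sketched in-process with a BlockingQueue standing in for the broker: the order service returns as soon as the message is enqueued, and the inventory service consumes it on its own thread. Message and service names are illustrative.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Sketch of queue-based decoupling: producer and consumer share only the
// queue, not each other's code or availability.
public class OrderQueueDemo {
    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<String> orderQueue = new LinkedBlockingQueue<>();

        // Consumer thread: the "inventory service".
        Thread inventory = new Thread(() -> {
            try {
                String msg = orderQueue.take(); // blocks until a message arrives
                System.out.println("inventory deducted for " + msg);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        inventory.start();

        // Producer: placing an order only enqueues a message and returns.
        orderQueue.put("order-1001");
        inventory.join();
    }
}
```

With a real broker such as RabbitMQ, the queue additionally survives process restarts and absorbs bursts when consumers fall behind.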
6.8 Other Techniques
Include CDN, reverse proxy, distributed file systems, big‑data processing, etc.
7. Architecture Summary
The architecture of large websites continuously evolves based on business needs; this article outlines common techniques and considerations to inspire further design.
Author: Luan Zhupi – over ten years of experience, former Google engineer, proficient in Java, distributed architecture, micro‑services, databases, currently researching big data and blockchain.
Source: https://my.oschina.net/editorial-story/blog/1808757
Copyright statement: Content originates from the internet; copyright belongs to the original author. We credit the author and source unless verification is impossible. Please inform us of any infringement.
Architecture Digest
Focusing on Java backend development, covering application architecture from top-tier internet companies (high availability, high performance, high stability), big data, machine learning, Java architecture, and other popular fields.