
Technical Summary of Large-Scale Distributed E‑Commerce Website Architecture

This article provides a comprehensive technical overview of large distributed website architecture, covering performance, high availability, scalability, security, and agility, and illustrates the evolution, design patterns, and practical optimization techniques for modern e‑commerce platforms.

Architecture Digest

This article is a technical summary for learning large distributed website architecture. It gives an overview of designing a high‑performance, highly‑available, scalable, and extensible distributed site, offering both reading notes and personal experience for reference.

1. Large Distributed Website Architecture Technologies

1. Characteristics of Large Websites

Many users, widely distributed.

High traffic, high concurrency.

Massive data, high service availability.

Harsh security environment, prone to network attacks.

Many functions, rapid changes, frequent releases.

Gradual growth from small to large.

User‑centric.

Free services, paid experiences.

2. Goals of Large Website Architecture

High performance: provide fast access experience.

High availability: services remain accessible at all times.

Scalability: increase or decrease hardware to adjust processing capacity.

Security: ensure secure access, data encryption, and safe storage.

Extensibility: easily add or remove modules and functions.

Agility: respond quickly to changing needs.

3. Architecture Patterns for Large Websites

Layered: typically application, service, data, management, and analytics layers.

Segmentation: split by business/module/function, e.g., front‑end layer divided into homepage, user center.

Distributed: deploy applications on multiple physical machines and coordinate via remote calls.

Cluster: deploy multiple instances of an application/module and use load balancing for external access.

Cache: place data close to the application or user to accelerate access.

Asynchronous: convert synchronous operations to asynchronous, using notification or polling after processing.

Redundancy: add replicas to improve availability, security, and performance.

Security: provide effective solutions for known issues and mechanisms for unknown threats.

Automation: replace repetitive manual tasks with tools and machines.

Agility: quickly adapt to requirement changes and business growth.

4. High‑Performance Architecture

The goal is fast page access from the user's point of view. Key metrics are response time (short), concurrency (high), throughput (high), and performance stability under load.

It can be divided into front‑end optimization, application‑layer optimization, code‑level optimization, and storage‑layer optimization.

Front‑end optimization: the part before business logic.

Browser optimization: reduce HTTP requests, use browser cache, enable compression, place CSS/JS appropriately, async JS, reduce cookie transmission; use CDN acceleration and reverse proxy.

Application‑layer optimization: use caching, asynchronous processing, clustering.

Code optimization: proper architecture, multithreading, resource reuse (object pool, thread pool), good data structures, JVM tuning, singleton, cache, etc.

Storage optimization: cache, SSD, fiber transmission, read/write tuning, disk redundancy, distributed storage (HDFS), NoSQL, etc.

5. High‑Availability Architecture

Large websites must remain accessible at all times. Due to complexity, distribution, cheap servers, open‑source databases, and OS variations, achieving high availability is challenging, and failures are inevitable.

Improving availability must be considered at the architectural level during planning. Industry often uses "nines" to denote availability, e.g., four nines (99.99%) allow about 53 minutes of downtime per year.
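The "nines" figure is simple arithmetic over the minutes in a year. The sketch below is illustrative only; the function name `allowed_downtime_minutes` is made up for this example, and a 365-day year is assumed.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def allowed_downtime_minutes(availability_percent: float) -> float:
    """Return the yearly downtime budget, in minutes, for a given availability."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


for label, pct in [("two nines", 99.0), ("three nines", 99.9),
                   ("four nines", 99.99), ("five nines", 99.999)]:
    print(f"{label} ({pct}%): {allowed_downtime_minutes(pct):.1f} minutes/year")
```

Each additional nine cuts the downtime budget by a factor of ten, which is why the step from three nines to four usually requires architectural redundancy rather than faster manual recovery.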

Different layers adopt different strategies, typically redundancy and failover.

Application layer: design stateless services; use load balancing (with session synchronization) for high availability.

Service layer: load balancing, hierarchical management, fast failure (timeouts), asynchronous calls, service degradation, idempotent design, etc.

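Data layer: redundant backups (cold, warm, and hot with synchronous or asynchronous replication) plus failover. The trade‑offs are framed by the CAP theorem: during a network partition, a distributed store cannot guarantee both consistency and availability, so each data system must choose which to favor.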
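Three of the service‑layer techniques listed above (fast failure through timeouts, service degradation, and idempotent design) can be sketched in a few lines. This is a minimal illustration, not a production pattern: `call_with_fallback`, `IdempotentPayments`, and `slow_recommendations` are hypothetical names, and the in‑memory dictionary stands in for whatever durable store a real service would use to remember request IDs.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout

_executor = ThreadPoolExecutor(max_workers=4)


def call_with_fallback(func, timeout_s, fallback):
    """Fail fast: return `fallback` if `func` raises or exceeds `timeout_s`."""
    future = _executor.submit(func)
    try:
        return future.result(timeout=timeout_s)
    except FutureTimeout:
        future.cancel()  # best effort; a call that is already running keeps going
        return fallback
    except Exception:
        return fallback  # degrade instead of propagating the failure


class IdempotentPayments:
    """Replaying the same request_id returns the first result, charging once."""

    def __init__(self):
        self._results = {}      # request_id -> result (assumed durable in practice)
        self.charges_made = 0

    def pay(self, request_id, amount):
        if request_id in self._results:
            return self._results[request_id]
        self.charges_made += 1
        result = {"request_id": request_id, "amount": amount, "status": "PAID"}
        self._results[request_id] = result
        return result


def slow_recommendations():
    time.sleep(0.5)  # simulates a dependency that is too slow right now
    return ["personalised", "items"]


# The slow call is cut off after 50 ms and a static default is served instead.
print(call_with_fallback(slow_recommendations, 0.05, ["bestsellers"]))
```

Idempotency matters here because timeouts cause retries: the caller cannot tell whether a timed‑out payment succeeded, so a retry with the same request ID must be safe.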

6. Scalability Architecture

Scalability means adjusting processing capacity by adding or removing hardware without changing the original design.

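Application layer: split the application vertically (by layer) or horizontally (by business function), then load balance each resulting unit. Load balancing can be done at several levels: DNS, HTTP reverse proxy, IP (network layer), or data link layer (layer 2).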

Service layer: similar to application layer.

Data layer: database sharding, table partitioning, NoSQL; common algorithms include hash and consistent hash.
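To make the consistent‑hash idea concrete, here is a minimal ring with virtual nodes. It is a sketch under assumptions: the class name `ConsistentHashRing`, the node names, and the choice of SHA‑256 with 100 virtual nodes per server are all illustrative, not taken from the article.

```python
import bisect
import hashlib


class ConsistentHashRing:
    """Maps keys to nodes so that adding or removing a node moves few keys."""

    def __init__(self, nodes=(), virtual_nodes=100):
        self._virtual_nodes = virtual_nodes
        self._ring = []     # sorted hash positions on the ring
        self._owners = {}   # hash position -> node name
        for node in nodes:
            self.add_node(node)

    @staticmethod
    def _hash(key: str) -> int:
        return int(hashlib.sha256(key.encode("utf-8")).hexdigest(), 16)

    def add_node(self, node: str) -> None:
        # Each physical node gets many positions to even out the distribution.
        for i in range(self._virtual_nodes):
            pos = self._hash(f"{node}#{i}")
            self._owners[pos] = node
            bisect.insort(self._ring, pos)

    def remove_node(self, node: str) -> None:
        for i in range(self._virtual_nodes):
            pos = self._hash(f"{node}#{i}")
            if self._owners.get(pos) == node:
                del self._owners[pos]
                self._ring.remove(pos)

    def get_node(self, key: str) -> str:
        if not self._ring:
            raise ValueError("ring is empty")
        # First position clockwise from the key's hash, wrapping around.
        idx = bisect.bisect_right(self._ring, self._hash(key)) % len(self._ring)
        return self._owners[self._ring[idx]]


ring = ConsistentHashRing(["db0", "db1", "db2"])
print(ring.get_node("user:42"))
```

With plain `hash(key) % N`, changing N remaps most keys. On the ring, adding a node only takes over the keys that fall just before its positions, so roughly 1/N of the data moves.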

7. Extensibility Architecture

Allows easy addition or removal of functional modules, providing good code/module level extensibility.

Modularization, componentization: high cohesion, low coupling, improve reusability and extensibility.

Stable interfaces: define stable APIs so internal structures can change without affecting callers.

Design patterns: apply OOP principles and design patterns for code‑level design.

Message queues: decouple modules via asynchronous messaging.

Distributed services: expose common modules as services for reuse and extension.

8. Security Architecture

Provide effective solutions for known problems and mechanisms for unknown/potential threats. First, raise security awareness and establish policies such as regular password changes, weekly security scans, and institutionalized security measures.

Infrastructure security: procure hardware from reputable sources, choose secure operating systems, patch vulnerabilities promptly, install anti‑virus software and firewalls, and implement DDoS protection, intrusion detection, and subnet isolation.

Application security: address common issues (XSS, injection, CSRF, error leakage, file upload, path traversal) during development; use Web Application Firewalls like ModSecurity and conduct vulnerability scans.

Data confidentiality: secure storage (reliable devices, regular backups), secure saving (encryption, access control), and secure transmission (prevent data theft and tampering).

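Common cryptographic algorithms: hashing (MD5, SHA family), symmetric encryption (DES, 3DES, RC family such as RC4), asymmetric encryption (RSA). Note that MD5, SHA‑1, DES, and RC4 are now considered weak and should not be chosen for new designs.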
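Of the application‑level issues listed above, SQL injection is the easiest to demonstrate. The sketch below uses Python's built‑in `sqlite3` module with an in‑memory database; the table, the sample rows, and the function names `find_user_unsafe` and `find_user_safe` are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO users (name) VALUES (?)", [("alice",), ("bob",)])


def find_user_unsafe(name):
    # Vulnerable: user input is spliced directly into the SQL text.
    sql = f"SELECT id, name FROM users WHERE name = '{name}'"
    return conn.execute(sql).fetchall()


def find_user_safe(name):
    # Parameterized query: the driver passes `name` as data, never as SQL.
    return conn.execute(
        "SELECT id, name FROM users WHERE name = ?", (name,)
    ).fetchall()


malicious = "x' OR '1'='1"
print(find_user_unsafe(malicious))  # the injected OR clause matches every row
print(find_user_safe(malicious))    # no user has this literal name
```

A Web Application Firewall can block many such payloads, but parameterized queries remove the vulnerability at its source.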

9. Agility

Architecture and operations must adapt to change, providing enough scalability and extensibility to handle rapid business growth and traffic spikes.

Beyond the technical elements above, agile management and development practices should be introduced to align business, product, technology, and operations.

10. Example of Large‑Scale Architecture

The example uses a seven‑layer logical architecture: client layer, front‑end optimization layer, application layer, service layer, data storage layer, big‑data storage layer, and big‑data processing layer.

Client layer: supports PC browsers and mobile apps; mobile apps can skip DNS resolution and connect to the reverse proxy directly by IP address.

Front‑end layer: DNS load balancing, CDN acceleration, reverse proxy.

Application layer: website application cluster; vertical splitting by business (e.g., product, member center).

Service layer: provides common services such as user, order, payment.

Data layer: relational DB cluster (read/write separation), NoSQL cluster, distributed file system cluster, distributed cache.

Big‑data storage layer: collects logs from application and service layers, structured and semi‑structured data.

Big‑data processing layer: offline analysis via MapReduce or real‑time analysis via Storm, storing results in relational databases for downstream use.

2. Evolution of Large E‑Commerce Site Architecture

A mature large site (e.g., Taobao, Tmall, Tencent) does not start with a complete high‑performance, high‑availability, highly‑scalable design; it evolves as user volume and business functions grow, changing development models, technical stacks, and design philosophies.

Different business characteristics lead to different architectural emphases, but common technologies can be identified across sites.

1. Initial Architecture

All components (application, database, files) deployed on a single server.

2. Separation of Application, Data, and Files

As traffic grows, separate servers are used for application, database, and file storage, each with hardware tuned for its role.

3. Using Cache to Improve Performance

Cache hot data (80% of requests hit 20% of data) to reduce latency. Common approaches include local cache, distributed cache, CDN, and reverse proxy.

Local cache (e.g., OSCache) is fast but limited in size; distributed cache (e.g., Memcached, Redis) scales to massive data.

4. Application Server Clustering

Deploy a cluster of application servers behind a load balancer to share request load.

Load balancing options: hardware (F5), software (LVS – layer 4, Nginx/HAProxy – layer 7). LVS offers higher performance; Nginx/HAProxy provide richer configuration such as static‑resource separation.
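Whatever product is chosen, the core job of a load balancer is to pick a healthy backend for each request. The following is a toy round‑robin selector, not a model of how F5, LVS, Nginx, or HAProxy are implemented; the class name `RoundRobinBalancer` and the server names are hypothetical.

```python
class RoundRobinBalancer:
    """Cycles through backends in order, skipping any marked as down."""

    def __init__(self, servers):
        if not servers:
            raise ValueError("at least one server is required")
        self._servers = list(servers)
        self._healthy = set(self._servers)
        self._next = 0

    def mark_down(self, server):
        self._healthy.discard(server)

    def mark_up(self, server):
        if server in self._servers:
            self._healthy.add(server)

    def pick(self):
        # Try each server at most once per call, starting after the last pick.
        for _ in range(len(self._servers)):
            server = self._servers[self._next]
            self._next = (self._next + 1) % len(self._servers)
            if server in self._healthy:
                return server
        raise RuntimeError("no healthy servers available")


lb = RoundRobinBalancer(["app1", "app2", "app3"])
print([lb.pick() for _ in range(4)])
lb.mark_down("app2")  # e.g. after a failed health check
print([lb.pick() for _ in range(3)])
```

Real balancers add weights, least‑connections strategies, and active health checks, but failover works on the same principle: an unhealthy instance is simply removed from rotation.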

5. Database Read/Write Separation and Sharding

To alleviate database bottlenecks, use master‑slave replication for read/write separation and split databases/tables horizontally or vertically.

6. CDN and Reverse Proxy for Performance

Deploy CDN to cache content near users in different regions, reducing latency. Use reverse proxy (e.g., Squid, Nginx) to serve cached responses before hitting application servers.

7. Distributed File Systems

When file volume grows, adopt distributed file systems such as GFS, HDFS, or TFS.

8. NoSQL and Search Engines

For massive data queries, combine NoSQL databases (MongoDB, HBase, Redis) with search engines (Lucene, Solr, Elasticsearch).

9. Business‑Level Service Splitting

As a single application grows too large to maintain, split it into independently deployed business applications (e.g., news, web pages, images) that communicate via message queues or shared databases.

10. Building Distributed Services

Extract common functionalities (user, order, payment, security) into distributed services using frameworks like Dubbo.

3. One‑Page Overview of E‑Commerce Architecture

4. Large E‑Commerce Site Architecture Case Study

1. Reasons for Choosing an E‑Commerce Case

Large distributed sites fall into three broad categories: portals (e.g., NetEase, Sina), social networks (e.g., Xiaonei, Kaixin), and e‑commerce (e.g., Alibaba, JD, Gome, Autohome). E‑commerce combines characteristics of portals and social sites, making it a suitable case study.

2. E‑Commerce Requirements

Build a full‑category B2C platform with online purchase, payment, and cash‑on‑delivery.

Online customer service chat.

User reviews and ratings after purchase.

Integration with existing ERP/inventory systems.

Support 3‑5 years of business growth.

Target 10 million registered users within 3‑5 years.

Handle promotional events like Double 11, Double 12, etc.

Use JD or Gome as the reference for features.

3. Initial Simple Architecture

Typical early setup: three servers – one for application, one for database, one for NFS file storage.

4. Capacity Estimation

Based on 10 million registered users, estimate daily UV = 2 million with an average of 30 page views per visitor → 60 million PV per day. Applying the 80/20 rule, 80% of PV (48 million) arrives within 20% of the day (4.8 hours = 288 minutes) → ≈ 167 k requests per minute → ≈ 2.8 k requests per second. Assuming promotional peaks reach three times that rate → ≈ 8.3 k requests per second.

Assuming one Tomcat instance handles 300 QPS, about 10 servers are needed for normal load (2,780 / 300 ≈ 9.3) and about 28 for peak load (8,340 / 300 ≈ 27.8), which the estimate rounds to ~30.
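The estimate can be reproduced step by step. The variable names below are illustrative; the inputs (2 million UV, 30 pages per visitor, the 80/20 rule, a 3x peak factor, 300 QPS per Tomcat) are the article's stated assumptions.

```python
import math

daily_uv = 2_000_000
pages_per_visitor = 30
daily_pv = daily_uv * pages_per_visitor        # 60,000,000 PV per day

busy_minutes = 24 * 60 * 20 // 100             # 20% of the day = 288 minutes
busy_pv = daily_pv * 80 // 100                 # 80% of traffic = 48,000,000 PV

per_minute = busy_pv / busy_minutes            # requests per minute in busy hours
per_second = per_minute / 60
peak_per_second = per_second * 3               # assumed 3x spike during promotions

TOMCAT_QPS = 300                               # assumed capacity of one instance
normal_servers = math.ceil(per_second / TOMCAT_QPS)
peak_servers = math.ceil(peak_per_second / TOMCAT_QPS)

print(f"busy-hour load: {per_minute:,.0f}/min, {per_second:,.0f}/s")
print(f"peak load:      {peak_per_second:,.0f}/s")
print(f"servers needed: {normal_servers} normal, {peak_servers} peak")
```

The calculation yields 10 servers for normal load and 28 for peak, consistent with the rough figure of 30 once some headroom is added. It treats every page view as one request; pages that fan out into several backend calls would raise the numbers.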

5. Architecture Analysis and Optimization

Sizing for peak load requires roughly 30 web servers, most of which sit idle outside flash sales and promotions, so over‑provisioning is wasteful.

Current monolithic deployment causes tight coupling; need vertical and horizontal splitting.

Redundant code across applications.

Session synchronization consumes memory and bandwidth.

Database becomes a bottleneck due to frequent access.

Optimization measures include business splitting, application clustering, multi‑level caching, distributed session (single sign‑on), database clustering (read/write separation, sharding), service‑oriented architecture, message queues, and other techniques.

6. Detailed Optimizations

6.1 Business Splitting

Separate vertical domains: product, shopping cart, payment, review, customer service, and integration interfaces. Define core (product, shopping, payment) and non‑core systems.

6.2 Application Cluster Deployment

Deploy each split service on multiple machines, use RPC for communication, and load balancers for high availability.

6.3 Multi‑Level Caching

Use local cache for immutable or slowly changing data and distributed cache (e.g., Redis) for hot data; fallback to database if both miss.
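The lookup order (local cache, then distributed cache, then database) can be sketched as follows. This is a simplified illustration: `TTLCache` and `MultiLevelCache` are hypothetical names, and a second in‑process `TTLCache` stands in for the distributed tier, which in practice would be a Redis or Memcached client.

```python
import time


class TTLCache:
    """Tiny in-process cache whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds):
        self._ttl = ttl_seconds
        self._data = {}  # key -> (value, expiry time)

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() >= expires_at:
            del self._data[key]
            return None
        return value

    def set(self, key, value):
        self._data[key] = (value, time.monotonic() + self._ttl)


class MultiLevelCache:
    """Reads try local, then distributed, then the database loader."""

    def __init__(self, local, distributed, load_from_db):
        self._local = local
        self._distributed = distributed
        self._load_from_db = load_from_db

    def get(self, key):
        value = self._local.get(key)
        if value is not None:
            return value
        value = self._distributed.get(key)
        if value is None:
            value = self._load_from_db(key)   # both tiers missed
            if value is None:
                return None
            self._distributed.set(key, value)
        self._local.set(key, value)           # warm the faster tier
        return value


def load_product(sku):
    print(f"database hit for {sku}")
    return {"sku": sku, "name": "demo product"}


cache = MultiLevelCache(TTLCache(5), TTLCache(60), load_product)
cache.get("sku-1")  # goes to the database
cache.get("sku-1")  # served from the local cache
```

The local tier gets a short TTL because each application server holds its own copy and nothing invalidates it when the data changes elsewhere.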

6.4 Distributed Session / Single Sign‑On

Store session data in a shared cache (Redis) with expiration, enabling stateless application servers.
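A minimal sketch of the idea, assuming an in‑memory dictionary in place of Redis: any application server can serve any request because the session lives in the shared store, not in server memory. `SharedSessionStore` and `handle_request` are hypothetical names, and the sliding 30‑minute expiry is an assumed policy.

```python
import secrets
import time


class SharedSessionStore:
    """Stand-in for a shared cache that holds sessions with an expiry."""

    def __init__(self, ttl_seconds=1800, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._sessions = {}  # session_id -> (data, expiry time)

    def create(self, user_id):
        session_id = secrets.token_urlsafe(32)  # unguessable identifier
        self._sessions[session_id] = ({"user_id": user_id},
                                      self._clock() + self._ttl)
        return session_id

    def get(self, session_id):
        entry = self._sessions.get(session_id)
        if entry is None:
            return None
        data, expires_at = entry
        if self._clock() >= expires_at:
            del self._sessions[session_id]
            return None
        # Sliding expiration: each access extends the session lifetime.
        self._sessions[session_id] = (data, self._clock() + self._ttl)
        return data


def handle_request(server_name, store, session_id):
    """Any server can answer, since no session state is kept locally."""
    session = store.get(session_id)
    if session is None:
        return f"{server_name}: please sign in"
    return f"{server_name}: hello {session['user_id']}"


store = SharedSessionStore()
sid = store.create("user-42")
print(handle_request("app1", store, sid))
print(handle_request("app2", store, sid))
```

Because both servers read the same store, the load balancer no longer needs sticky sessions or session replication between nodes.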

6.5 Database Cluster (Read/Write Separation, Sharding)

Each business subsystem has its own database; large databases are further sharded; read/write separation applied on top.
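The routing decision behind sharding plus read/write separation can be sketched as below. It is an illustration under assumptions: `ShardedCluster`, the modulo‑by‑user‑ID sharding rule, and the server names are invented, and the SELECT check is a deliberately naive way to classify statements.

```python
import itertools


class ShardedCluster:
    """Picks a shard by user ID, then a primary or replica by statement type."""

    def __init__(self, shards):
        # shards: list of {"primary": name, "replicas": [names]}
        self._shards = shards
        self._replica_cycles = [
            itertools.cycle(shard["replicas"] or [shard["primary"]])
            for shard in shards
        ]

    def shard_index(self, user_id: int) -> int:
        return user_id % len(self._shards)

    def route(self, user_id: int, sql: str) -> str:
        idx = self.shard_index(user_id)
        is_read = sql.lstrip().upper().startswith("SELECT")
        if is_read:
            return next(self._replica_cycles[idx])  # spread reads over replicas
        return self._shards[idx]["primary"]         # all writes go to the primary


cluster = ShardedCluster([
    {"primary": "orders0-primary", "replicas": ["orders0-r1", "orders0-r2"]},
    {"primary": "orders1-primary", "replicas": ["orders1-r1"]},
])
print(cluster.route(42, "INSERT INTO orders (user_id) VALUES (42)"))
print(cluster.route(42, "SELECT * FROM orders WHERE user_id = 42"))
```

One caveat: with asynchronous replication, a read sent to a replica immediately after a write may not see that write yet, so read‑your‑own‑writes paths are often routed to the primary.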

6.6 Service‑Oriented Architecture

Extract common functionalities into reusable services (e.g., member service).

6.7 Message Queues

Use MQ (e.g., RabbitMQ) to decouple modules; order placement writes to queue, then inventory and delivery services consume asynchronously.
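The order flow can be sketched with an in‑process stand‑in for the broker. This does not use the RabbitMQ client API; `InProcessBroker`, `place_order`, and the two handlers are hypothetical, and each subscriber gets its own queue so that both inventory and delivery see every order (similar in spirit to a fan‑out exchange).

```python
import queue
import threading


class InProcessBroker:
    """Delivers every published message to each subscriber's own queue."""

    def __init__(self):
        self._queues = []

    def subscribe(self, handler):
        q = queue.Queue()

        def worker():
            while True:
                message = q.get()
                try:
                    handler(message)  # handlers are assumed not to raise
                finally:
                    q.task_done()

        threading.Thread(target=worker, daemon=True).start()
        self._queues.append(q)

    def publish(self, message):
        for q in self._queues:
            q.put(message)

    def drain(self):
        """Block until every subscriber has processed its backlog."""
        for q in self._queues:
            q.join()


stock = {"sku-1": 10}
deliveries = []


def reserve_inventory(order):
    stock[order["sku"]] -= order["qty"]


def schedule_delivery(order):
    deliveries.append(order["order_id"])


def place_order(broker, order_id, sku, qty):
    # The order service only enqueues; it does not wait for downstream work.
    broker.publish({"order_id": order_id, "sku": sku, "qty": qty})
    return order_id


broker = InProcessBroker()
broker.subscribe(reserve_inventory)
broker.subscribe(schedule_delivery)
place_order(broker, "order-1001", "sku-1", 2)
broker.drain()
print(stock, deliveries)
```

The order service returns as soon as the message is queued, so a slow or temporarily unavailable delivery system delays delivery scheduling but not checkout. A real broker also persists messages so they survive a consumer restart.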

6.8 Other Techniques

Include CDN, reverse proxy, distributed file systems, big‑data processing, etc.

7. Architecture Summary

The architecture of large websites continuously evolves based on business needs; this article outlines common techniques and considerations to inspire further design.

Author: Luan Zhupi – over ten years of experience, former Google engineer, proficient in Java, distributed architecture, micro‑services, databases, currently researching big data and blockchain.
Source: https://my.oschina.net/editorial-story/blog/1808757

Copyright statement: Content originates from the internet; copyright belongs to the original author. We credit the author and source unless verification is impossible. Please inform us of any infringement.
