Transforming Monolithic Websites to Scalable, High‑Performance Distributed Systems
Learn how early monolithic websites evolve into distributed architectures by splitting applications, services, and data, implementing load balancers, reverse proxies, caching, CDN, database sharding, and security measures, while focusing on performance, high availability, scalability, and extensibility for robust, high‑traffic sites.
Early websites often use a centralized architecture to keep costs down, deploying the application, database, and other components on a single server. As the business grows, that single server becomes a bottleneck, which prompts a move to a distributed system built on splitting and decoupling applications, services, and data.
Main Steps
Business splitting: divide the whole site into independent applications, each deployed separately and communicating via RPC or message queues.
Clustering: run application servers and RPC‑based microservices as clusters of identical nodes.
LVS load balancing to forward requests to different business clusters.
Reverse proxy servers, commonly Nginx.
Application servers, e.g., servlet containers like Tomcat.
Separate application and data services, deploying them on different servers.
Logical layering of backend applications: presentation/gateway layer, business logic layer, data persistence layer.
Caching: local cache and distributed cache.
CDN: deploy static content to CDN for proximity delivery and faster response.
Database read‑write separation using master‑slave hot standby; writes go to master, replication syncs to slaves.
Database sharding and distributed data frameworks.
Introduce NoSQL for massive data storage.
Leverage open‑source search engines such as Elasticsearch.
Asynchrony to decouple systems.
Shorten business processes to accelerate site access.
Smooth out peaks in concurrent access (peak shaving).
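The read‑write separation step above can be sketched as a small router that sends writes to the master and spreads reads across the slaves. This is a minimal illustration, not a production data‑access layer: the class `ReadWriteRouter`, the node names, and the prefix‑based write detection are all assumptions made for the example.

```python
import itertools


class ReadWriteRouter:
    """Hypothetical router: writes go to the master, reads rotate over slaves."""

    # Simplified write detection by statement prefix (an assumption for this sketch).
    WRITE_PREFIXES = ("insert", "update", "delete", "replace", "create", "alter", "drop")

    def __init__(self, master, slaves):
        self.master = master
        # Fall back to the master when no slave is configured.
        self._slaves = itertools.cycle(slaves or [master])

    def route(self, sql, in_transaction=False):
        """Return the name of the node that should execute `sql`."""
        is_write = sql.lstrip().lower().startswith(self.WRITE_PREFIXES)
        # Reads inside a transaction stay on the master to see their own writes.
        if is_write or in_transaction:
            return self.master
        return next(self._slaves)


router = ReadWriteRouter("db-master", ["db-slave-1", "db-slave-2"])
targets = [
    router.route("INSERT INTO orders (id) VALUES (1)"),
    router.route("SELECT * FROM orders"),
    router.route("SELECT * FROM orders"),
    router.route("SELECT * FROM orders", in_transaction=True),
]
print(targets)
```

Because replication is usually asynchronous, a read sent to a slave right after a write may return stale data. Routing such reads to the master, as the `in_transaction` flag does here, is one common way to handle that.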
Architecture Five Elements
High performance
Availability
Scalability
Extensibility
Security
1. High Performance
Key performance metrics include response time, concurrency, throughput such as QPS (queries per second), and system performance counters. Optimization falls into four areas:
Web front‑end performance: reduce HTTP requests, use browser caching, enable compression, place CSS at the top and JavaScript at the bottom, reduce Cookie transmission.
Application server optimization: caching, clustering, asynchronous processing (message queues that smooth load peaks), multithreading (stateless design, local objects, locks for shared resources), resource reuse (singletons, object pools), and efficient data structures.
Database server optimization: indexing, caching, SQL tuning, and for NoSQL, model and storage optimization.
Storage optimization: choose SSD over HDD, compare B+ tree vs. LSM tree, consider RAID vs. HDFS.
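Caching appears at both the application and the database layer in the list above. The sketch below shows one common form, a read‑through local cache with a time‑to‑live, under the assumption that stale reads within the TTL are acceptable. The names `TTLCache` and `load_user_from_db` are hypothetical, and the injectable clock exists only to make the example deterministic.

```python
import time


class TTLCache:
    """Hypothetical local cache: entries expire after `ttl_seconds`."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._store = {}  # key -> (expires_at, value)

    def get_or_load(self, key, loader):
        now = self.clock()
        entry = self._store.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # cache hit: skip the slow source
        value = loader(key)  # cache miss or expired entry: load and remember
        self._store[key] = (now + self.ttl, value)
        return value


db_calls = []


def load_user_from_db(user_id):
    """Stand-in for a slow database query; records each call."""
    db_calls.append(user_id)
    return {"id": user_id, "name": f"user-{user_id}"}


fake_now = [0.0]  # controllable clock for the demonstration
cache = TTLCache(ttl_seconds=60, clock=lambda: fake_now[0])
first = cache.get_or_load(42, load_user_from_db)   # miss, hits the "database"
second = cache.get_or_load(42, load_user_from_db)  # hit, served from memory
fake_now[0] = 61.0
third = cache.get_or_load(42, load_user_from_db)   # expired, reloaded
print(len(db_calls))
```

A distributed cache follows the same get‑or‑load pattern, with the dictionary replaced by a remote store shared by all application servers.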
2. High Availability
High‑availability architecture ensures services remain accessible and data stays intact despite hardware failures, using redundancy, backup, and failover mechanisms.
Stateless application design.
Load balancing for failover of stateless services.
Session management in application server clusters.
Service‑level protection: timeout settings, asynchronous calls, service degradation, and rate limiting.
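Data layer: keep redundant copies of data, confirm a failure before switching over, redirect access to a healthy replica, and recover the data afterwards. Cold standby gives no strong consistency guarantee; hot standby replicates asynchronously or synchronously. The CAP theorem (Consistency, Availability, Partition tolerance) frames the trade‑off: when the network partitions, a system has to give up either consistency or availability.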
Software quality assurance, automated testing, pre‑release validation, gray‑release, real‑time monitoring, alert systems, graceful degradation, user behavior logging, server performance monitoring, and monitoring data collection and management.
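Rate limiting and service degradation from the list above are often combined: when a limiter rejects a request, the service returns a cheaper degraded response instead of failing. The following is a minimal token‑bucket sketch under that assumption. `TokenBucket` and `handle_request` are hypothetical names, and the fake clock is there only so the behavior is reproducible.

```python
import time


class TokenBucket:
    """Hypothetical token bucket: refills at `rate_per_second` up to `capacity`."""

    def __init__(self, rate_per_second, capacity, clock=time.monotonic):
        self.rate = rate_per_second
        self.capacity = capacity
        self.clock = clock
        self.tokens = float(capacity)
        self.last_refill = clock()

    def allow(self):
        now = self.clock()
        # Add the tokens earned since the last call, capped at the bucket size.
        self.tokens = min(self.capacity, self.tokens + (now - self.last_refill) * self.rate)
        self.last_refill = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


def handle_request(bucket, full_handler, degraded_handler):
    """Serve the full response when allowed, otherwise degrade gracefully."""
    return full_handler() if bucket.allow() else degraded_handler()


fake_now = [0.0]
bucket = TokenBucket(rate_per_second=1, capacity=2, clock=lambda: fake_now[0])
results = [handle_request(bucket, lambda: "full", lambda: "degraded") for _ in range(3)]
fake_now[0] = 1.0  # one second later, one token has been refilled
results.append(handle_request(bucket, lambda: "full", lambda: "degraded"))
print(results)
```

The bucket capacity sets how large a burst is tolerated, and the refill rate sets the sustained request rate. Both values here are illustrative.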
3. Scalability
Large‑scale sites must handle massive concurrent users and data. Scalability is achieved by adding servers to clusters, using appropriate load‑balancing devices, and ensuring new nodes provide identical services.
Application server cluster scalability: add servers without storing data on any node.
Cache cluster scalability: design routing algorithms to keep cached data accessible when nodes are added.
Relational database scalability: implement sharding and routing outside the database.
Many NoSQL databases are designed to scale out horizontally, so capacity grows by adding nodes with comparatively little operational effort.
Load balancing: scheduling algorithms include Round Robin, Weighted Round Robin, Random, Least Connections, and Source Hashing; implementation options include DNS‑based, reverse‑proxy (HTTP layer), IP‑level, and data‑link‑layer (LVS) load balancing.
Distributed cache design: client APIs, routing algorithms, server lists, Memcached clusters, consistent‑hash algorithms.
Data storage service cluster design.
Relational and NoSQL database cluster design.
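The consistent‑hash routing mentioned for cache clusters can be sketched as a hash ring with virtual nodes. The point of the technique is that adding a server remaps only the keys that now belong to the new server, so most cached data stays reachable. The class `ConsistentHashRing`, the node names, the choice of SHA‑256, and the 100 virtual nodes per server are assumptions for this example, not details from the article.

```python
import bisect
import hashlib


class ConsistentHashRing:
    """Hypothetical hash ring with virtual nodes for smoother key distribution."""

    def __init__(self, nodes=(), vnodes=100):
        self.vnodes = vnodes
        self._ring = []  # sorted list of (hash, node) points on the ring
        for node in nodes:
            self.add_node(node)

    @staticmethod
    def _hash(key):
        return int(hashlib.sha256(key.encode("utf-8")).hexdigest(), 16)

    def add_node(self, node):
        # Each server is placed on the ring at `vnodes` pseudo-random points.
        for i in range(self.vnodes):
            bisect.insort(self._ring, (self._hash(f"{node}#{i}"), node))

    def get_node(self, key):
        if not self._ring:
            raise ValueError("ring is empty")
        # First ring point clockwise from the key's hash, wrapping around.
        idx = bisect.bisect_left(self._ring, (self._hash(key), ""))
        return self._ring[idx % len(self._ring)][1]


keys = [f"user:{i}" for i in range(1000)]
ring = ConsistentHashRing(["cache-1", "cache-2", "cache-3"])
before = {k: ring.get_node(k) for k in keys}
ring.add_node("cache-4")
after = {k: ring.get_node(k) for k in keys}
moved = [k for k in keys if before[k] != after[k]]
print(f"{len(moved)} of {len(keys)} keys moved, all to the new node")
```

With plain modulo routing (`hash(key) % server_count`), changing the server count would remap most keys. Here every key that moves lands on `cache-4`, and the rest keep their original server.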
4. Extensibility
Applying the Open/Closed principle at the architectural level enables building extensible site architectures.
Use distributed message queues to reduce coupling.
Adopt event‑driven architecture.
Build reusable business platforms with distributed service frameworks (e.g., Thrift, Dubbo).
Design extensible data structures such as HBase column families.
Leverage open platforms to create an ecosystem.
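The message‑queue and event‑driven items above share one idea: the producer publishes an event and does not know which consumers exist. The in‑process sketch below illustrates that decoupling. `EventBus`, the topic name `order.created`, and the handlers are hypothetical; a real deployment would use a distributed message broker rather than a local queue.

```python
import queue
from collections import defaultdict


class EventBus:
    """Hypothetical in-process stand-in for a distributed message queue."""

    def __init__(self):
        self._queue = queue.Queue()
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, handler):
        self._subscribers[topic].append(handler)

    def publish(self, topic, payload):
        # The producer only enqueues; it never calls consumers directly.
        self._queue.put((topic, payload))

    def drain(self):
        """Deliver queued events; a real broker does this in consumer processes."""
        delivered = 0
        while not self._queue.empty():
            topic, payload = self._queue.get()
            for handler in self._subscribers[topic]:
                handler(payload)
                delivered += 1
        return delivered


log = []
bus = EventBus()
bus.subscribe("order.created", lambda o: log.append(f"inventory reserved for {o['id']}"))
bus.subscribe("order.created", lambda o: log.append(f"email sent for {o['id']}"))

bus.publish("order.created", {"id": 1001})
pending_before = list(log)  # nothing has run yet: publishing is asynchronous
delivered = bus.drain()
print(log)
```

Adding a third consumer means one more `subscribe` call and no change to the order code, which is the Open/Closed principle applied at the architecture level.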
5. Security Architecture
Common web attacks include XSS and SQL injection, as well as CSRF and session hijacking.
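XSS defense: escape user‑supplied content so that embedded JavaScript is rendered as plain text instead of being executed.
Other defenses: HttpOnly cookies to keep session cookies out of reach of scripts, plus token validation and Referer checking against forged requests.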
Website vulnerability scanning.
CSRF protection mechanisms.
Error code handling.
Form tokens and CAPTCHAs.
JSONP requests with Referer verification.
SQL injection mitigation.
HTML dangerous character escaping.
XSS mitigation.
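Two of the mitigations listed above can be shown concretely: escaping dangerous HTML characters against XSS, and parameterized queries against SQL injection. The sketch uses Python's standard `html` and `sqlite3` modules. The table, the data, and the function names `render_comment` and `find_user` are made up for the example.

```python
import html
import sqlite3


def render_comment(comment):
    """Escape <, >, &, and quotes so user input is shown as text, not run as script."""
    return f"<p>{html.escape(comment)}</p>"


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
conn.executemany("INSERT INTO users VALUES (?, ?)", [("alice", "a1"), ("bob", "b2")])


def find_user(name):
    """The ? placeholder passes `name` as data, never as part of the SQL text."""
    return conn.execute("SELECT name FROM users WHERE name = ?", (name,)).fetchall()


safe_html = render_comment("<script>alert('x')</script>")
injected = find_user("alice' OR '1'='1")  # classic injection attempt
normal = find_user("alice")
print(safe_html)
print(injected, normal)
```

The injection attempt matches no row because the whole string is compared as a name. Concatenating it into the SQL text instead would have turned the `OR '1'='1'` part into a condition that matches every user.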
Java Backend Technology
