How to Scale a Mid‑Size Website: From Caching to Search Indexes
This article walks through the evolution of a medium‑traffic website’s architecture, covering early rapid development, the introduction of caching, database‑app separation, read/write splitting, horizontal scaling with additional servers, and the later addition of full‑text search to handle millions of daily visits.
Many write-ups on website architecture are available online, but most take an operations or infrastructure perspective, focusing on stacks of machines and clusters, which makes them hard for ordinary developers to follow.
The first part of this series introduced large‑site infrastructure scaling; the second part focuses on application‑level scaling and evolution.
In the early, grassroots stage, a website is built and launched quickly: it has few users, a limited budget, and minimal investment.
When traffic grows, caching is introduced to improve site speed.
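One common way to introduce caching is the cache-aside pattern: look in the cache first, and only fall back to the slow source on a miss. The sketch below is a minimal in-process illustration, not the original site's implementation. The class name `TTLCache`, the `loader` callback, and the expiry policy are all assumptions made for the example.

```python
import time
from typing import Any, Callable, Dict, Tuple


class TTLCache:
    """Tiny in-process cache with per-entry expiry (illustrative only)."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._store: Dict[str, Tuple[float, Any]] = {}

    def get_or_load(self, key: str, loader: Callable[[str], Any]) -> Any:
        """Return the cached value, or call loader(key) on a miss or expiry."""
        now = self._clock()
        entry = self._store.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # cache hit: the slow source is not touched
        value = loader(key)  # cache miss: e.g. a database query or page render
        self._store[key] = (now + self._ttl, value)
        return value

    def invalidate(self, key: str) -> None:
        """Drop an entry, for example after the underlying data is updated."""
        self._store.pop(key, None)
```

A caller would wrap an expensive lookup as `cache.get_or_load("home", render_page)`, where `render_page` is a hypothetical function, and call `invalidate` whenever the underlying data changes.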
As the user base keeps growing, the database comes under heavy read/write load, which prompts the separation of the DB and application layers so that each runs on its own server.
A single database soon becomes a bottleneck, so read/write separation is typically adopted, taking advantage of the "read‑heavy, write‑light" pattern common to internet services: writes go to the master, reads are spread across slave nodes, and the number of slaves depends on the measured read/write ratio.
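The routing idea behind read/write separation can be sketched as follows. This is a simplified illustration under stated assumptions: the names `ReadWriteRouter` and `replicas_needed` are hypothetical, statements are classified only by their leading keyword, replicas (the slave nodes in the text) are chosen round-robin, and replication lag is ignored. The sizing helper is a back-of-the-envelope estimate, not a capacity-planning method.

```python
import itertools
import math
from typing import Sequence


class ReadWriteRouter:
    """Send SELECT statements to replicas and everything else to the primary."""

    def __init__(self, primary: str, replicas: Sequence[str]):
        self.primary = primary
        # Round-robin over replicas; None means "no replicas configured".
        self._replicas = itertools.cycle(replicas) if replicas else None

    def route(self, sql: str) -> str:
        """Return the name of the node that should execute this statement."""
        is_read = sql.lstrip().upper().startswith("SELECT")
        if is_read and self._replicas is not None:
            return next(self._replicas)
        return self.primary  # writes, DDL, and anything unrecognized


def replicas_needed(read_qps: float, per_replica_qps: float) -> int:
    """Rough replica count: peak read load divided by what one node can serve."""
    return max(1, math.ceil(read_qps / per_replica_qps))
```

With hypothetical figures of 900 read queries per second and 400 per replica, `replicas_needed(900, 400)` gives 3. In practice a read that must see a just-committed write may still need to go to the primary, because replicas can lag behind.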
This relieves the database bottleneck, but the application layer then hits its own limits: traffic keeps growing, the early code was poorly written, and high staff turnover makes it hard to maintain. A common remedy is simply to add more servers.
Adding servers is easy, but it has to produce real benefits. Once requests are spread across several machines, state kept on a single server becomes a problem: page output caches, local caches, and session storage are the common trouble spots.
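The session problem can be illustrated with a small simulation. In this sketch, `SharedSessionStore` is a hypothetical stand-in for an external store that every application server can reach (a cache server or a database, for instance), and `AppServer` is a toy model of a web node. Neither reflects a specific product's API.

```python
import secrets
from typing import Dict, Optional


class SharedSessionStore:
    """Stand-in for a session store that lives outside any one app server."""

    def __init__(self) -> None:
        self._data: Dict[str, dict] = {}

    def create(self, payload: dict) -> str:
        session_id = secrets.token_hex(16)  # unguessable session identifier
        self._data[session_id] = dict(payload)
        return session_id

    def load(self, session_id: str) -> Optional[dict]:
        found = self._data.get(session_id)
        return dict(found) if found is not None else None


class AppServer:
    """Toy web node that keeps no session state of its own."""

    def __init__(self, name: str, sessions: SharedSessionStore):
        self.name = name
        self.sessions = sessions

    def login(self, user: str) -> str:
        return self.sessions.create({"user": user})

    def whoami(self, session_id: str) -> Optional[str]:
        session = self.sessions.load(session_id)
        return session["user"] if session else None
```

If two servers are built on the same store, a session created by one is visible to the other, so a load balancer can send each request to any node. If each server had its own private store, a user who logged in on one node would appear logged out on the next.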
At this point, horizontal scaling at both the DB and application layers is largely achieved, allowing attention to shift to other aspects such as improving internal search precision, reducing DB dependence, and introducing full‑text indexes.
In the Java ecosystem, Lucene and Solr are popular; in the PHP world, Sphinx/Coreseek are commonly used.
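The core data structure behind full‑text search is the inverted index, which maps each term to the documents containing it so that a query does not have to scan every row in the database. The toy version below only illustrates that idea. It is not the API of Lucene, Solr, or Sphinx, and the tokenizer, the class name `InvertedIndex`, and the AND-only query semantics are assumptions made for the example.

```python
import re
from collections import defaultdict
from typing import Dict, List, Optional, Set

TOKEN = re.compile(r"[a-z0-9]+")


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into alphanumeric terms."""
    return TOKEN.findall(text.lower())


class InvertedIndex:
    """Map each term to the set of document ids that contain it."""

    def __init__(self) -> None:
        self._postings: Dict[str, Set[int]] = defaultdict(set)

    def add(self, doc_id: int, text: str) -> None:
        for term in tokenize(text):
            self._postings[term].add(doc_id)

    def search(self, query: str) -> List[int]:
        """Return sorted ids of documents containing every query term."""
        result: Optional[Set[int]] = None
        for term in tokenize(query):
            ids = self._postings.get(term, set())
            result = set(ids) if result is None else result & ids
        return sorted(result) if result else []
```

Real engines add ranking, stemming, and language-specific tokenization on top of this structure, which is where most of the gains in search precision come from.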
Thus far, the architecture of a medium‑size website capable of handling millions of daily visits has been outlined; each scaling step contains many technical details that will be explored in separate future articles.
The next part will continue the discussion.
Source: http://blog.csdn.net/dinglang_2009/article/details/46398885
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact us and we will review it promptly.
ITFLY8 Architecture Home
