50 Proven Principles for Building Highly Scalable Websites

This article distills the key takeaways from the book “50 Principles of High‑Scalability Websites,” presenting concise guidelines on avoiding over‑design, planning capacity, simplifying architecture, optimizing DNS and assets, leveraging horizontal scaling, proper database usage, caching, fault isolation, asynchronous messaging, and continuous learning to build robust, easily extensible web systems.

MaGe Linux Operations
This article summarizes the book “50 Principles of High-Scalability Websites,” extracting the most important recommendations for building a website that can grow smoothly as traffic and business demands increase.

Main Content

The book presents 50 suggestions covering design simplicity, capacity planning, horizontal scaling, proper use of databases, caching strategies, fault tolerance, and continuous improvement.

Simplify Design

1. Avoid over-design – Excessive features add complexity and maintenance cost without real benefit.
2. Design for scalability – Aim for 20× capacity in design, 3× in implementation, and 1.5× in deployment.
3. Focus on the vital few – Apply the Pareto principle: 20% of the design delivers 80% of the value.
4. Reduce DNS lookups – Serving a page from fewer domains means fewer DNS queries, which improves performance under load.
5. Minimize objects – Combine static assets (e.g., image sprites) to reduce the number of HTTP requests.
6. Use uniform network equipment – Consistent hardware reduces unexpected issues.

Distributed Work

7. X-axis: horizontal replication – Clone services across multiple servers (clusters, load balancers, read/write database separation).
8. Y-axis: functional separation – Split distinct functions such as registration, purchase, search, and storage.
9. Z-axis: user-based segmentation – Partition by user tier, geography, etc.

Horizontal Scaling Design

10. Plan horizontal expansion – Scale by adding machines to a cluster rather than relying solely on hardware upgrades.
11. Choose economical hardware – A cluster of small, inexpensive machines often outperforms a single expensive server.
12. Multi-site data centers – Deploy hot and cold sites; switch to a cold site when the hot site fails.
13. Leverage cloud technology – Virtualization enables elastic scaling during traffic peaks, though it can increase coupling.

Use the Right Tools

14. Choose appropriate databases – Select relational (Oracle, MySQL) or NoSQL (MongoDB, Aerospike) based on speed, consistency, and workload.
15. Apply firewalls selectively – Protect sensitive operations while allowing static assets to bypass firewall checks.
16. Harness logs for monitoring – Use tools like Splunk to collect, store, and visualize logs for both infrastructure and business metrics.

Avoid Redundant Work

17. Do not read immediately after write – Avoid re-reading data just to verify a write; use asynchronous logging instead of synchronous verification.
18. Eliminate unnecessary redirects – Redirects add latency and consume resources.
19. Relax strict ordering when possible – Loosening ACID constraints can improve performance for certain workloads.

Effective Caching

20. Use CDNs – Distribute content globally to reduce latency.
21. Set proper expiration headers – Control caching via Cache-Control, Expires, etc.
22. Cache AJAX calls – Set HTTP headers such as Last-Modified so that AJAX responses can be cached.
23. Page-level caching – Store full page responses to lower server load.
24. Application-level caching – Cache personalized data per user.
25. Object caching – Cache frequently accessed data objects (e.g., hot products).
26. Isolate cache layer – Use a dedicated caching tier for easier scaling.

Learning from Mistakes

27. Foster a learning culture – Encourage continuous growth in technical and domain knowledge.
28. Do not rely solely on QA – Developers must also ensure code correctness.
29. Provide rollback mechanisms – The ability to revert to a previous stable version is essential.
30. Discuss failures openly – Analyze each failure to prevent recurrence.

Database Principles

ACID properties: atomicity, consistency, isolation, durability.

31. Avoid costly schema changes – Plan table structures early to prevent expensive migrations.
32. Use appropriate locks – Choose the right lock type (row, page, table) to maintain throughput.
33. Skip two-phase commit when possible – It hampers scalability.
34. Avoid SELECT FOR UPDATE – It takes row locks that slow down other transactions.
35. Do not select all columns – Explicit column lists reduce data transfer and keep code adaptable when the schema changes.
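
As an illustration of principle 35, the following minimal sketch uses Python's built-in sqlite3 module with a hypothetical `users` table (the schema, the data, and the `fetch_login_info` helper are assumptions made up for this example, not taken from the book). It shows that a query with an explicit column list keeps returning the same shape after a column is added, while `SELECT *` silently changes.

```python
import sqlite3

# Hypothetical schema and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, email TEXT, bio TEXT)"
)
conn.execute(
    "INSERT INTO users (name, email, bio) VALUES (?, ?, ?)",
    ("alice", "alice@example.com", "x" * 10_000),  # large column we do not need
)


def fetch_login_info(conn, user_id):
    """Fetch only the columns the caller needs, named explicitly."""
    return conn.execute(
        "SELECT id, name FROM users WHERE id = ?", (user_id,)
    ).fetchone()


# A later schema change adds a column.
conn.execute("ALTER TABLE users ADD COLUMN last_login TEXT")

login_row = fetch_login_info(conn, 1)  # still exactly (id, name)
star_row = conn.execute("SELECT * FROM users WHERE id = 1").fetchone()  # now 5 columns
```

The explicit query also avoids transferring the large `bio` value that the caller never reads.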
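
Principle 21 (expiration headers) can be sketched in a few lines. The helper below is hypothetical and framework-independent: it only builds the `Cache-Control` and `Expires` header values that a web server or application would attach to a static-asset response. The one-hour lifetime and the `public` directive are assumptions chosen for the example.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime


def cache_headers(max_age_seconds, now=None):
    """Build response headers that let browsers and proxies cache a resource.

    `now` is injectable so the function is easy to test; it must be a
    timezone-aware UTC datetime.
    """
    now = now or datetime.now(timezone.utc)
    expires = now + timedelta(seconds=max_age_seconds)
    return {
        # Modern caches use max-age; Expires is kept for older clients.
        "Cache-Control": f"public, max-age={max_age_seconds}",
        "Expires": format_datetime(expires, usegmt=True),
    }


headers = cache_headers(3600, now=datetime(2024, 1, 1, tzinfo=timezone.utc))
# headers["Expires"] is an HTTP-date one hour after the supplied time
```

When both headers are present, caches that understand `max-age` give it precedence over `Expires`.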

Fault-Tolerant Design

36. Isolate failures with “lanes” – Separate services and data into independent domains.
37. Never trust a single point of failure – Redundant components prevent total outages.
38. Avoid cascading failures – Making each component highly available does not by itself guarantee system-wide reliability.
39. Enable feature toggles – Allow turning services on and off without redeployment.

Statelessness and State Management

40. Strive for stateless services – Statelessness reduces coupling and scaling cost.
41. Keep session data client-side when feasible – This lessens server load.
42. Use distributed caches for state – Solutions like Memcached provide scalable state storage.

Asynchronous Communication & Message Bus

43. Prefer asynchronous messaging – It decouples services and improves scalability.
44. Ensure the message bus scales – Design bus expansion along the Y- or Z-axis rather than by simple cloning.
45. Prevent bus congestion – Balance message value against cost.

Other Guidelines

46. Use third-party solutions cautiously – Relying on external vendors can create hidden dependencies.
47. Archive or delete low-value data – Regularly purge unnecessary data; back up valuable data for quick access.
48. Separate BI from transaction processing – This improves product scalability.
49. Design for observability – Implement global monitoring to answer: Is there a problem? Where? What? Will it happen again? Can it self-heal?
50. Aim for competence in design – Build simple, high-quality architectures rather than over-relying on open-source solutions.
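
The point behind principles 37 and 38 can be made with simple arithmetic. The sketch below assumes that component failures are independent (a simplification) and uses illustrative availability figures that are not from the book. Components chained in series multiply their availabilities, so the chain is less available than any single link, while redundant components in parallel fail only if all of them fail.

```python
import math


def series_availability(availabilities):
    """Availability of a chain where every component must be up.

    Assumes independent failures.
    """
    return math.prod(availabilities)


def parallel_availability(availabilities):
    """Availability of redundant components where any one being up is enough.

    Assumes independent failures.
    """
    return 1 - math.prod(1 - a for a in availabilities)


# Five components in series, each 99.9% available (illustrative figure).
chain = series_availability([0.999] * 5)      # roughly 0.995, worse than any link
# Two redundant copies of one 99.9% component.
redundant = parallel_availability([0.999] * 2)  # better than a single copy
```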
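
Principle 39 (feature toggles) might look like the following minimal sketch. The `FeatureToggles` class, the `recommendations` flag, and the in-memory dictionary are all hypothetical; in a real system the flag store would more likely be a configuration service or database that can be changed at runtime.

```python
class FeatureToggles:
    """Looks up on/off flags at call time, so flipping a flag needs no redeploy."""

    _TRUTHY = {"1", "true", "on", "yes"}

    def __init__(self, store):
        # `store` is any mapping of flag name -> string value (assumed interface).
        self._store = store

    def is_enabled(self, name, default=False):
        raw = self._store.get(name)
        if raw is None:
            return default
        return raw.strip().lower() in self._TRUTHY


flags = {"recommendations": "on"}  # stand-in for a runtime-editable flag store
toggles = FeatureToggles(flags)


def render_product_page(product_id):
    """Build a page, including the optional section only when it is switched on."""
    page = {"product": product_id}
    if toggles.is_enabled("recommendations"):
        page["recommendations"] = ["related-1", "related-2"]
    return page


page_on = render_product_page(42)
flags["recommendations"] = "off"   # operator disables the feature at runtime
page_off = render_product_page(42)
```

Because the flag is read on every call, a misbehaving feature can be turned off without touching the deployed code.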
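
Principle 43 (asynchronous messaging) can be illustrated in a single process with Python's standard `queue` and `threading` modules. This is only a sketch: the `place_order` function and the event format are invented for the example, and a production system would use a real message bus rather than an in-process queue. The producer returns as soon as the event is queued, and a separate consumer does the slow work later.

```python
import queue
import threading

# Bounded queue: if the consumer falls behind, producers block instead of
# filling memory without limit.
events = queue.Queue(maxsize=1000)
processed = []


def consumer():
    """Drain the queue until a None sentinel arrives."""
    while True:
        event = events.get()
        if event is None:
            events.task_done()
            break
        # Stand-in for slow work such as sending a confirmation email.
        processed.append(f"email sent for order {event['order_id']}")
        events.task_done()


def place_order(order_id):
    """Accept the order and hand follow-up work to the consumer."""
    events.put({"order_id": order_id})
    return "accepted"


worker = threading.Thread(target=consumer, daemon=True)
worker.start()

results = [place_order(i) for i in (1, 2, 3)]
events.join()      # wait until every queued event has been handled
events.put(None)   # tell the consumer to stop
worker.join()
```

The caller never waits for the email step, which is the decoupling the principle describes.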

Written by

MaGe Linux Operations

