Comprehensive Guide to Building a Resilient, High‑Performance Web Infrastructure

This guide outlines essential steps for creating a robust, high‑availability website architecture, covering domain acquisition, DNS management, CDN deployment, image caching, data center selection, monitoring, DDoS mitigation, redundancy, server configuration, database replication, testing environments, security practices, and operational tooling.

Open Source Linux

1. Domain

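Buy multiple domains (50-100) and split them into primary and promotional domains. Purchase from GoDaddy for stability and enable domain privacy protection. Privacy protection hides registrant details only; keeping the real server IPs hidden depends on placing a CDN or proxy in front of the origin (see section 2).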

Do not manage DNS on GoDaddy; use Cloudflare, DNSPod, ZNDNS, or your own DNS server for faster changes.

2. CDN

Purchase a CDN service (e.g., Cloudflare) and point the domain at it. The CDN caches content globally and forwards the remaining traffic to your origin. Look for at least 200 Gbps of DDoS mitigation capacity.

3. Image Server

Deploy image caching servers in-country, close to your users; Nginx can serve as the image cache.

4. Data Center

Select data centers close to your users. US servers typically offer high bandwidth, but test latency (ping) from your users' regions first with a tool such as Chinaz. Choose providers that offer high availability, DDoS protection, reliable support, and real-time status monitoring.

Example: Hong Kong Jiuhé for core servers, US Santa Ana for DDoS-protected ("high-defense") nodes.
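One way to compare candidate locations is to collect the average round-trip time from `ping` and pick the lowest. The following is a minimal sketch, not a definitive tool: the function names `avg_rtt` and `lowest_latency` and the example hostnames are hypothetical, and it assumes the `min/avg/max` summary line printed by common Linux and BSD `ping` implementations.

```shell
# Extract the average round-trip time (ms) from ping's summary line on stdin.
# Assumes a line such as: rtt min/avg/max/mdev = 10.1/12.3/15.0/1.2 ms
avg_rtt() {
    awk -F'/' '/min\/avg\/max/ { print $5 }'
}

# Read "name avg_ms" pairs on stdin and print the name with the lowest average.
# Lines without a latency value (for example, unreachable hosts) are skipped.
lowest_latency() {
    awk 'NF == 2 && (best == "" || $2 + 0 < min) { best = $1; min = $2 + 0 }
         END { if (best != "") print best }'
}

# Usage (hypothetical hostnames):
#   for h in hk.example.com us.example.com; do
#       printf '%s %s\n' "$h" "$(ping -c 5 "$h" | avg_rtt)"
#   done | lowest_latency
```

Run the comparison from machines located where your users are, since latency measured from the office may not reflect what they experience.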

5. Homepage

Host the landing page on a cloud VM and include a link to the game homepage. Prefer a URL without a port number: serve it through the CDN, or from a data center that does not require ICP filing (备案), so the domain can be reached directly.

6. Monitoring System

Implement real-time monitoring that detects attacks and records traffic spikes, and store logs on a central syslog server; use Cacti for visualization. Set up alerts for abnormal traffic and investigate the source IPs when they fire.
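When an alert fires, a quick first step in investigating source IPs is to rank clients by request count. This is a rough sketch: `top_source_ips` is a hypothetical helper, and it assumes the client IP is the first whitespace-separated field of each log line, as in Nginx's default access log format.

```shell
# Print the N busiest client IPs from an access log on stdin as "count ip".
# Assumes the client IP is the first field of every non-empty line.
top_source_ips() {
    n="${1:-10}"
    awk 'NF { count[$1]++ } END { for (ip in count) print count[ip], ip }' |
        sort -k1,1nr -k2,2 | head -n "$n"
}

# Usage (the log path is an assumption; adjust to your setup):
#   top_source_ips 20 < /var/log/nginx/access.log
```

If the server sits behind a CDN, the first field is usually the CDN's address rather than the visitor's, so the log format would need to record the forwarded client IP instead.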

7. DDoS Mitigation

Small attacks can be blocked with Nginx and iptables; large attacks require DDoS-protected ("high-defense") data centers with at least 200 Gbps of mitigation capacity. For single-source attacks, ask the data center to block the offending IPs.

During severe attacks, redirect the domain to another server or CDN to maintain service.
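For the small-attack case, blocking a handful of offending addresses with iptables might look like the sketch below. `block_rules` is a hypothetical helper that only prints the commands, so they can be reviewed before being run as root. It handles IPv4 addresses only and inserts plain `DROP` rules into the `INPUT` chain; that policy is an assumption, not something the article prescribes.

```shell
# Read candidate IPv4 addresses on stdin and print one iptables DROP rule per
# valid-looking, not-yet-seen address. Nothing is executed here.
block_rules() {
    awk '/^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+$/ && !seen[$0]++ {
        print "iptables -I INPUT -s " $0 " -j DROP"
    }'
}

# Usage: review the output first, then pipe it to a root shell if it looks right.
#   printf '203.0.113.7\n' | block_rules
```

Printing instead of executing keeps a mistyped address from locking out legitimate traffic, or your own SSH session.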

8. Redundancy

Design for double the expected concurrent users (e.g., provision for 2,000 when you expect 1,000) to absorb traffic spikes.
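The doubling rule is simple arithmetic, but encoding it makes it easy to check in a capacity review script. In this sketch, `target_capacity` and `has_headroom` are hypothetical names, and the default factor of 2 mirrors the rule above.

```shell
# Print the capacity to provision for: expected users times a factor (default 2).
target_capacity() {
    echo $(( $1 * ${2:-2} ))
}

# Succeed when the design capacity is at least factor x the expected peak.
has_headroom() {
    expected="$1"; capacity="$2"; factor="${3:-2}"
    [ "$capacity" -ge $(( expected * factor )) ]
}

# Usage:
#   target_capacity 1000          # prints 2000
#   has_headroom 1000 1500 || echo "capacity below 2x target"
```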

9. Server Configuration

Use three network cards: one for external user traffic, one for internal server communication, and one for SSH management. Assign multiple IPs per NIC, mirror disks (RAID 1), and use dual CPUs and dual power supplies to avoid single points of failure.

10. Database

Implement master-slave replication with off-site backups, and use Nginx upstream blocks to balance load across clustered servers. Run front-end and back-end services on separate machines; other services can share a VM.
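Replication is only useful if the replica keeps up, so it helps to check lag regularly. The sketch below assumes a MySQL-compatible database, which the article does not name, and the classic `Seconds_Behind_Master` field from `SHOW SLAVE STATUS\G`. Newer MySQL releases rename this to `Seconds_Behind_Source` under `SHOW REPLICA STATUS`. The function names and the 60-second threshold are hypothetical.

```shell
# Extract replication lag (seconds) from `SHOW SLAVE STATUS\G` output on stdin.
# Exits non-zero when the field is missing from the input.
replication_lag() {
    awk -F': *' '/Seconds_Behind_Master/ { print $2; found = 1 }
                 END { exit !found }'
}

# Succeed when lag is a number no greater than the limit (default 60 seconds).
# A value of NULL means replication is not running and counts as a failure.
lag_ok() {
    lag="$1"; max="${2:-60}"
    case "$lag" in
        ''|*[!0-9]*) return 1 ;;
    esac
    [ "$lag" -le "$max" ]
}

# Usage (run on the replica; credentials omitted):
#   lag_ok "$(mysql -e 'SHOW SLAVE STATUS\G' | replication_lag)" 60 || echo "replica lagging"
```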

11. Test Environments

Maintain three test environments (developer machines, LAN test, and Internet test) in addition to production. Manage code with SVN or Git, and run the LAN test environment on stable hardware.

12. Core and Shield Servers

Ensure ping connectivity between shield (DDoS protection) and core servers to verify network paths.

13. Operations Staff

Staff at least two ops personnel; one manager plus one engineer is sufficient. Document procedures, maintain a 24-hour on-call rotation, and have a dedicated network admin.

14. Large‑Scale Architecture

For large deployments, maintain a dedicated core data center with specialized roles: ops engineers, DBAs, network, security, storage, and coordination.

15. Linux Optimization

Optimize Linux and Nginx based on CPU and memory limits.

16. Security

Rotate all passwords every three months, especially domain and email accounts.
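A rotation policy is easier to keep when something flags overdue credentials. In this minimal sketch, `rotation_due` is a hypothetical helper, "three months" is approximated as 90 days, and timestamps are Unix epoch seconds, which you would record yourself whenever a password changes.

```shell
# Succeed when a credential last changed at epoch second $1 is at least
# $2 days old (default 90) as of epoch second $3 (default: now).
rotation_due() {
    changed="$1"; max_days="${2:-90}"; now="${3:-$(date +%s)}"
    age_days=$(( (now - changed) / 86400 ))
    [ "$age_days" -ge "$max_days" ]
}

# Usage (the timestamp is a placeholder):
#   rotation_due 1700000000 && echo "domain registrar password is due for rotation"
```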

17. LAN

Provide a stable LAN with at least 10 Mbps of bandwidth, dual cables, and a mobile Wi-Fi hotspot for staff.

18. Ops Tools

Standardize tools: SQLyog for databases, SecureCRT for SSH, KeePass for passwords, and WinSCP for file transfer. Encourage continuous learning and English documentation.

19. Disaster Recovery Plan

Maintain a documented DR plan with regular drills and backup restoration tests to ensure availability when primary servers fail.

20. Server Security

Apply comprehensive security hardening covering user, application, system, and file security.

21. High‑Concurrency Testing

Simulate 2,000 concurrent users to evaluate behavior under load; invest where the results show bottlenecks, and optimize IP selection and bandwidth.

22. Knowledge Sharing

Share all operational information, passwords, and configurations among at least two team members under the ops manager’s guidance.

23. Logging

Record all server actions with timestamps; perform risk assessments before production changes.
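A lightweight way to record actions with timestamps is a small helper that appends to a shared log. This is a sketch under stated assumptions: `log_action` and the `OPS_LOG` variable are hypothetical names, and the entry format (UTC timestamp, user, message) is one reasonable choice among many.

```shell
# Append "<UTC timestamp> <user> <message>" to the file named by OPS_LOG
# (hypothetical variable; defaults to ./ops-actions.log).
log_action() {
    printf '%s %s %s\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "${USER:-unknown}" "$*" \
        >> "${OPS_LOG:-./ops-actions.log}"
}

# Usage:
#   OPS_LOG=/var/log/ops-actions.log
#   log_action "restarted nginx on web01 after config change"
```

Using UTC avoids ambiguity when servers and staff are spread across time zones.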

24. Ops Principles

Focus on availability, monitoring & alerts, capacity planning, process standards, knowledge management, and automation.

25. Ongoing Ops Work

Post‑deployment tasks include version upgrades, monitoring, statistics, routine inspections, incident response, change management, clustering, performance tuning, DB optimization, scaling, security, and ops development.

26. Connection Count Example

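# Count TCP connections on $ip port 80, all states (the trailing space avoids matching :8080)
netstat -ant | grep -cF "$ip:80 "
# Count established connections only
netstat -ant | grep -F "$ip:80 " | grep -c ESTABLISHED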
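A per-state breakdown is often more informative than a single total, since a pile of `SYN_RECV` or `TIME_WAIT` entries points to different problems than many `ESTABLISHED` ones. The following sketch assumes the Linux `netstat -ant` column layout (local address in field 4, state in field 6); `conn_states` is a hypothetical helper name.

```shell
# Summarise TCP states for connections whose local address ends in ":<port>".
# Reads `netstat -ant` output on stdin; prints "count state", busiest first.
conn_states() {
    port="${1:-80}"
    awk -v suffix=":$port" '
        $1 ~ /^tcp/ && substr($4, length($4) - length(suffix) + 1) == suffix {
            states[$6]++
        }
        END { for (s in states) print states[s], s }
    ' | sort -k1,1nr -k2,2
}

# Usage:
#   netstat -ant | conn_states 80
```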