Testing and Tuning Nginx for Two Million Long‑Lived Connections
This article describes how to configure and benchmark an Nginx‑based comet server to sustain two million simultaneous long‑lived connections by analyzing resource bottlenecks, tuning kernel parameters, adjusting file descriptor limits, and scaling client machines.
For certain applications such as message‑push or chat systems, the number of concurrent connections—not just QPS—is the critical performance metric, requiring the server to hold many idle (long) connections.
The primary resources consumed by such a service are CPU, network, and memory; idle connections mainly occupy memory, so sufficient RAM is essential, and kernel data structures must handle the load.
To test this, a server and many clients are needed. The server runs an Nginx comet module that accepts requests and holds the connections without returning data, while the client repeatedly opens connections.
1. Server Preparation
The server is a high‑memory Dell R710 with the following specifications:
Summary: Dell R710, 2 x Xeon E5520 2.27GHz, 23.5GB / 24GB 1333MHz
System: Dell PowerEdge R710 (Dell 0VWN1R)
Processors: 2 x Xeon E5520 2.27GHz 5860MHz FSB (16 cores)
Memory: 23.5GB / 24GB 1333MHz == 6 x 4GB, 12 x empty
Disk-Control: megaraid_sas0: Dell/LSILogic PERC 6/i, Package 6.2.0-0013, FW 1.22.02-0612,
Network: eth0 (bnx2):Broadcom NetXtreme II BCM5709 Gigabit Ethernet,1000Mb/s
OS: RHEL Server 5.4 (Tikanga), Linux 2.6.18-164.el5 x86_64, 64-bit

The Nginx comet module holds the incoming connections open, and the Nginx status module is used to monitor the peak connection count.
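The status module mentioned above is typically nginx's stub_status; a minimal monitoring endpoint (the /nginx_status path is an arbitrary choice) looks like:

```nginx
# Inside a server block: expose connection counters for monitoring.
location /nginx_status {
    stub_status on;    # reports active, reading, writing, waiting counts
    access_log  off;
    allow 127.0.0.1;   # limit access to the local monitoring host
    deny  all;
}
```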
System parameters are tuned in /etc/sysctl.conf :
net.core.somaxconn = 2048
net.core.rmem_default = 262144
net.core.wmem_default = 262144
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
net.ipv4.tcp_rmem = 4096 4096 16777216
net.ipv4.tcp_wmem = 4096 4096 16777216
net.ipv4.tcp_mem = 786432 2097152 3145728
net.ipv4.tcp_max_syn_backlog = 16384
net.core.netdev_max_backlog = 20000
net.ipv4.tcp_fin_timeout = 15
net.ipv4.tcp_tw_reuse = 1
net.ipv4.tcp_tw_recycle = 1
net.ipv4.tcp_max_orphans = 131072
/sbin/sysctl -p

Key settings explained: tcp_rmem and tcp_wmem control per-socket read/write buffer sizes; reducing the defaults to 4 KB minimizes memory per idle socket. tcp_mem sets the system-wide TCP memory thresholds (measured in pages, not bytes), and tcp_max_orphans caps the number of orphaned sockets, i.e. connections no longer attached to any file descriptor.
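Because tcp_mem is expressed in pages, the thresholds above are easy to misread; a quick sanity check (assuming the usual 4 KB page size on x86_64) converts them to bytes:

```python
PAGE_SIZE = 4096  # bytes per page on this x86_64 machine

# The three tcp_mem thresholds from sysctl.conf above, in pages:
low, pressure, high = 786432, 2097152, 3145728

def pages_to_gib(pages: int) -> float:
    """Convert a kernel page count to GiB."""
    return pages * PAGE_SIZE / 1024 ** 3

print(pages_to_gib(low))       # 3.0  GiB: below this, no memory pressure
print(pages_to_gib(pressure))  # 8.0  GiB: kernel begins throttling
print(pages_to_gib(high))      # 12.0 GiB: hard ceiling for all TCP buffers
```

At two million sockets, 4 KB send plus 4 KB receive buffers could in the worst case demand about 16 GB, so the 12 GB tcp_mem ceiling is the effective bound on this 24 GB machine.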
2. Client Preparation
Each client machine has roughly 64 k usable local ports (1024-65535) toward a single server address, so one client can reliably contribute on the order of 60 k connections; reaching two million connections therefore requires about 34 machines (or 34 virtual IPs).
The client port range is expanded:
net.ipv4.ip_local_port_range = 1024 65535
/sbin/sysctl -p

The client program, built with libevent, continuously opens new connections to the server and holds them.
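The original client is a C program on libevent; the core idea can be sketched in Python (the function name and parameters are illustrative, and the event loop and reconnect handling are omitted):

```python
import socket

def open_idle_connections(host: str, port: int, count: int) -> list:
    """Open `count` TCP connections and hold them idle (no data sent)."""
    conns = []
    for _ in range(count):
        s = socket.create_connection((host, port))
        conns.append(s)  # keep the reference so the socket stays open
    return conns
```

Holding the socket objects in a list is what keeps the connections alive; dropping the references would let them be garbage-collected and closed.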
3. File‑Descriptor Limits
Clients need a limit of about 100 k descriptors:
admin soft nofile 100000
admin hard nofile 100000

For the server, a limit of two million descriptors is required. On kernels prior to 2.6.25 the hard limit is capped at 1048576, so the kernel is upgraded to 2.6.32, after which the ceiling can be raised via /proc/sys/fs/nr_open:
sudo bash -c 'echo 2000000 > /proc/sys/fs/nr_open'

Then the nofile limits are set:
admin soft nofile 2000000
admin hard nofile 2000000

During testing, sysctl values are adjusted in response to dmesg warnings until the server successfully maintains two million long-lived connections.
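Progress toward the two-million mark can be watched by polling the status endpoint; a small parser for stub_status output (the sample counter values below are made up) might look like:

```python
import re

def active_connections(status_text: str) -> int:
    """Extract the active-connection count from nginx stub_status output."""
    m = re.search(r"Active connections:\s*(\d+)", status_text)
    if m is None:
        raise ValueError("not a stub_status response")
    return int(m.group(1))

# Sample stub_status output (counter values are illustrative):
sample = """Active connections: 2000012
server accepts handled requests
 2000300 2000300 2000300
Reading: 0 Writing: 1 Waiting: 2000011
"""
print(active_connections(sample))  # 2000012
```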
Memory usage is reduced by shrinking Nginx’s request_pool_size from the default 4 KB to 1 KB, and by setting the default TCP buffer sizes to 4 KB.
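The request_pool_size reduction is a single http-level directive; a fragment such as the following (in nginx.conf) captures it. At two million connections, saving 3 KB per connection frees roughly 6 GB.

```nginx
http {
    # Shrink the per-request allocation pool from the 4k default.
    # Each held connection then costs 1 KB here instead of 4 KB.
    request_pool_size 1k;
}
```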
Monitoring via Nginx status shows the server handling two million connections, and memory statistics confirm the feasibility of such a scale.