How Tcpcopy Replicates Live Traffic for Realistic Load Testing
Tcpcopy is a distributed online pressure testing tool that copies live traffic to test machines, so production conditions can be simulated realistically before new code is released. This article explains its principles, code details, deployment steps, advanced usage, and real‑world application at Yitao.
Tool Overview
Tcpcopy is a distributed online pressure testing tool that can copy live traffic to test machines, allowing real‑time simulation of production environments without releasing new code, helping discover bugs early and increasing confidence before deployment.
Originally open‑sourced by NetEase Technology in September 2011, it is now at version 0.4. Compared with traditional tools such as abench, Tcpcopy’s main advantage is real‑time, high‑fidelity traffic replication with only minimal packet loss.
How Tcpcopy Works
Using nginx as the front‑end example, the workflow is:
Online front‑end machine runs the Tcpcopy client process.
Test front‑end machine runs the Tcpcopy server (interception) process.
Both machines run nginx.
The traffic copy steps are:
1️⃣ A request reaches the online front‑end.
2️⃣ The packet is duplicated at the IP layer and handed to the Tcpcopy process.
3️⃣ Tcpcopy rewrites the destination and source addresses and forwards the packet to the test front‑end.
4️⃣ The copied packet arrives at the test front‑end.
5️⃣ Nginx on the test front‑end processes the request and returns a response.
6️⃣ The response is intercepted at the IP layer and discarded; the interception process keeps a copy of its IP header.
7️⃣ The IP header is sent to the online front‑end’s Tcpcopy process.
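Step 3, the address rewrite, can be sketched as follows. This is a minimal illustration, not Tcpcopy's actual code: the function names (csum_add, csum_fold, rewrite_ipv4_tcp) are hypothetical, and the sketch assumes a buffer that starts at an IPv4 header carrying a TCP segment. The point it shows is that changing addresses or ports invalidates both the IP header checksum and the TCP checksum (which covers a pseudo-header containing the addresses), so both have to be recomputed before the packet is sent.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Add len bytes to a running Internet checksum (RFC 1071), big-endian words. */
static uint32_t csum_add(uint32_t sum, const uint8_t *data, size_t len)
{
    while (len > 1) {
        sum += (uint32_t)((data[0] << 8) | data[1]);
        data += 2;
        len -= 2;
    }
    if (len == 1)
        sum += (uint32_t)(data[0] << 8); /* pad the odd trailing byte with zero */
    return sum;
}

/* Fold the carries into 16 bits and return the one's complement. */
static uint16_t csum_fold(uint32_t sum)
{
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)~sum;
}

/* Hypothetical helper: rewrite source/destination address and destination
 * port of an IPv4/TCP packet in place, then fix both checksums.
 * Returns 0 on success, -1 if the buffer is not a well-formed IPv4/TCP packet. */
static int rewrite_ipv4_tcp(uint8_t *pkt, size_t len,
                            const uint8_t new_src[4], const uint8_t new_dst[4],
                            uint16_t new_dport)
{
    if (len < 20 || (pkt[0] >> 4) != 4 || pkt[9] != 6) /* 6 = TCP */
        return -1;
    size_t ihl = (size_t)(pkt[0] & 0x0f) * 4;           /* IP header length */
    size_t tot_len = (size_t)((pkt[2] << 8) | pkt[3]);  /* IP total length */
    if (ihl < 20 || tot_len > len || tot_len < ihl + 20)
        return -1;

    uint8_t *tcp = pkt + ihl;
    size_t tcp_len = tot_len - ihl;

    memcpy(pkt + 12, new_src, 4);        /* IP source address */
    memcpy(pkt + 16, new_dst, 4);        /* IP destination address */
    tcp[2] = (uint8_t)(new_dport >> 8);  /* TCP destination port */
    tcp[3] = (uint8_t)(new_dport & 0xff);

    /* IP header checksum covers the IP header only. */
    pkt[10] = pkt[11] = 0;
    uint16_t c = csum_fold(csum_add(0, pkt, ihl));
    pkt[10] = (uint8_t)(c >> 8);
    pkt[11] = (uint8_t)(c & 0xff);

    /* TCP checksum covers pseudo-header (src, dst, protocol, length) + segment. */
    tcp[16] = tcp[17] = 0;
    uint32_t sum = csum_add(0, pkt + 12, 8);
    sum += 6;
    sum += (uint32_t)tcp_len;
    c = csum_fold(csum_add(sum, tcp, tcp_len));
    tcp[16] = (uint8_t)(c >> 8);
    tcp[17] = (uint8_t)(c & 0xff);
    return 0;
}
```

Working on byte offsets rather than on struct iphdr and struct tcphdr keeps the sketch independent of platform headers and host byte order; real code would more likely use those structs.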
Code Analysis
At the link or IP layer, the kernel checks for processes that have created raw sockets (e.g., socket(AF_PACKET,SOCK_DGRAM,…) or socket(AF_INET,SOCK_RAW,…)). If such a socket exists, the packet is duplicated and delivered to that socket’s buffer, which is how Tcpcopy captures traffic.
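Both socket types named above hand the reader a buffer that begins at the IP header, so a capturing process still has to decide which packets belong to the service being copied. The following is a minimal sketch of that filtering step, assuming IPv4 and a hypothetical helper name (is_tcp_to_port); it is not taken from the Tcpcopy source.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 if buf holds an IPv4/TCP packet whose
 * destination port equals port, 0 otherwise.
 * buf must start at the IP header. */
static int is_tcp_to_port(const uint8_t *buf, size_t len, uint16_t port)
{
    if (len < 20 || (buf[0] >> 4) != 4)       /* too short or not IPv4 */
        return 0;

    size_t ihl = (size_t)(buf[0] & 0x0f) * 4; /* IP header length in bytes */
    if (ihl < 20 || len < ihl + 20)           /* no room for a TCP header */
        return 0;

    if (buf[9] != 6)                          /* protocol field, 6 = TCP */
        return 0;

    const uint8_t *tcp = buf + ihl;
    uint16_t dport = (uint16_t)((tcp[2] << 8) | tcp[3]); /* network byte order */
    return dport == port;
}
```

A capture loop would call this on every buffer returned by recvfrom and pass only the matching packets (for example, those destined for port 80) on to the copy logic.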
In version 0.3 the capture socket is created with:
int sock = socket(AF_PACKET, SOCK_RAW, htons(ETH_P_IP));
In version 0.4 it uses:
int sock = socket(AF_INET, SOCK_RAW, IPPROTO_TCP);
When sending copied packets, Tcpcopy uses a raw socket with IP_HDRINCL set, modifies the destination IP and port, and sends the packet via sendto:
int n = 1; // nonzero value enables IP_HDRINCL
sock = socket(AF_INET, SOCK_RAW, IPPROTO_RAW);
setsockopt(sock, IPPROTO_IP, IP_HDRINCL, &n, sizeof(n)); // the caller supplies the IP header
... // modify tcp_header->dest and ip_header->daddr
send_len = sendto(sock, (char *)ip_header, tot_len, 0, (struct sockaddr *)&toaddr, sizeof(toaddr));
On the test front‑end, the ip_queue kernel module and an iptables rule (iptables -I OUTPUT -p tcp --sport 80 -j QUEUE) direct packets to a user‑space queue. The interception process creates a netlink socket (socket(AF_NETLINK, SOCK_RAW, NETLINK_FIREWALL)) to receive these packets, decides whether to DROP or ACCEPT them, and sends the verdict back to the kernel.
struct receiver_msg_st msg;
... // copy ip_header and tcp_header into msg
send(sock, (const void *)&msg, sizeof(struct receiver_msg_st), 0);
struct ipq_verdict_msg *ver_data = NULL;
struct sockaddr_nl addr;
nl_header->nlmsg_type=IPQM_VERDICT;
... // set verdict to NF_DROP or NF_ACCEPT
sendto(firewall_sock, (void *)nl_header, nl_header->nlmsg_len, 0, (struct sockaddr *)&addr, sizeof(struct sockaddr_nl));
Operation Steps
Assume two machines:
Machine A (online front‑end) IP: 61.135.xxx.1
Machine B (test front‑end) IP: 61.135.xxx.2
Both run nginx and the operator must have sudo privileges on both.
Steps on Machine B:
Load the ip_queue module: sudo modprobe ip_queue
Set the iptables rule: sudo iptables -t filter -I OUTPUT -p tcp --sport 80 -j QUEUE
Start the interception process: sudo ./interception &
Steps on Machine A:
Start the Tcpcopy client: sudo ./tcpcopy 61.135.xxx.1 80 61.135.xxx.2 80 &
If “I am booted” appears on A, the copy is working, and you can verify it via the nginx logs on B.
Advanced Usage
Cascading: Copy traffic from one online front‑end to multiple test machines (A→B→C→D…) to amplify load.
Multiple copies per instance: Use the -n option to duplicate requests, e.g., sudo ./tcpcopy A 80 B 80 -n 2.
Allowed IP list: From version 0.4 you can specify IPs that are permitted to reach the test nginx, e.g., sudo ./interception 61.135.xxx.3:61.135.xxx.4.
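The allowed IP list can be pictured as a small lookup that turns each queued response into an ACCEPT or DROP decision. The sketch below is an illustration under stated assumptions, not the interception process's real code: the names (struct allow_list, allow_list_parse, allow_list_contains) and the 32‑entry cap are hypothetical, and it assumes the check is made against the destination address of the outgoing response. Only the colon‑separated format comes from the command line shown above.

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

#define MAX_ALLOWED 32 /* hypothetical cap for this sketch */

struct allow_list {
    uint32_t addrs[MAX_ALLOWED]; /* IPv4 addresses in network byte order */
    size_t count;
};

/* Parse "a.b.c.d:e.f.g.h:..." into list.
 * Returns 0 on success, -1 on a malformed entry or too many entries. */
static int allow_list_parse(struct allow_list *list, const char *spec)
{
    list->count = 0;
    while (*spec != '\0') {
        char buf[INET_ADDRSTRLEN];
        size_t n = strcspn(spec, ":"); /* length of the next entry */
        struct in_addr addr;

        if (n == 0 || n >= sizeof(buf) || list->count == MAX_ALLOWED)
            return -1;
        memcpy(buf, spec, n);
        buf[n] = '\0';
        if (inet_pton(AF_INET, buf, &addr) != 1)
            return -1;
        list->addrs[list->count++] = addr.s_addr;

        spec += n;
        if (*spec == ':')
            spec++; /* skip the separator */
    }
    return 0;
}

/* Returns 1 if the address (network byte order) is on the list, so the
 * response would be accepted; 0 means it would be dropped. */
static int allow_list_contains(const struct allow_list *list, uint32_t addr_be)
{
    for (size_t i = 0; i < list->count; i++) {
        if (list->addrs[i] == addr_be)
            return 1;
    }
    return 0;
}
```

A linear scan is enough for a handful of addresses; a longer list would call for a hash set or a sorted array with binary search.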
Real‑World Application at Yitao
During a major update in February, Yitao’s engine used Tcpcopy to mirror all front‑end traffic to a new demo front‑end for online simulation. After a week of testing, the new engine achieved an average QPS of 110 (peak 240) with a packet loss rate of 1.37%. Further scaling to a single test front‑end showed QPS >1000 and latency around 40 ms, meeting launch requirements.
Resource consumption was low: Tcpcopy used ~7.7 % CPU and 77 MB RAM, while the interception process used ~5.8 % CPU and 38 MB RAM.
References
Project homepage: http://code.google.com/p/tcpcopy/
sock_raw paper: http://sock-raw.org/papers/sock_raw
Netlink documentation: http://smacked.org/docs/netlink.pdf
Related blog category: http://blog.csdn.net/wangbin579/article/category/926096/1
SVN source: http://tcpcopy.googlecode.com/svn/trunk