How the Linux Kernel Handles TCP Connections: Deep Dive into sock_common and Lookup
This article explores Linux kernel TCP connection handling by examining socket data structures, port range and file descriptor tuning, core functions like tcp_v4_rcv, and lookup mechanisms, while offering practical tips to boost client-side concurrent connections beyond traditional limits.
To increase the range of local ports Linux can use, run:
echo "5000 65000" > /proc/sys/net/ipv4/ip_local_port_range
Typical client connections look like:
192.168.1.101 5000 → 192.168.1.100 8090
192.168.1.101 5001 → 192.168.1.100 8090
…
192.168.1.101 65000 → 192.168.1.100 8090
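As a rough illustration of how many ports such a range provides, the sketch below counts the ports in a range written in the same "low high" text format as ip_local_port_range. The helper name port_range_size is an assumption made for this example; it is not a kernel or libc function.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Count the local ports in a range given as "low high" text, the
 * format used by /proc/sys/net/ipv4/ip_local_port_range.
 * Returns -1 if the text cannot be parsed or the range is invalid.
 * port_range_size is a hypothetical helper for illustration only.
 */
int port_range_size(const char *text)
{
    int low, high;

    assert(text != NULL);
    if (sscanf(text, "%d %d", &low, &high) != 2)
        return -1;
    if (low < 1 || high > 65535 || low > high)
        return -1;
    return high - low + 1; /* both ends of the range are usable */
}
```

For the range set above, port_range_size("5000 65000") returns 60001, which matches the listing that runs from port 5000 through port 65000.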
To raise the maximum number of open file descriptors for the whole system:
echo 200000 > /proc/sys/fs/file-max
And for each user process:
# vi /etc/sysctl.conf
fs.nr_open=210000
# sysctl -p
# vi /etc/security/limits.conf
* soft nofile 200000
* hard nofile 200000
Note: the hard limit cannot exceed nr_open, so raise nr_open first, preferably in sysctl.conf so that it persists across reboots. A hard nofile value above nr_open can cause problems at startup or login.
The core socket data structure is struct sock_common defined in include/net/sock.h:
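struct sock_common {
    union {
        __addrpair skc_addrpair;    // TCP IP pair
        struct {
            __be32 skc_daddr;       // remote (foreign) IPv4 address
            __be32 skc_rcv_saddr;   // bound local IPv4 address
        };
    };
    union {
        __portpair skc_portpair;    // TCP port pair
        struct {
            __be16 skc_dport;       // remote port, network byte order
            __u16 skc_num;          // local port, host byte order
        };
    };
    ...
};
Here skc_addrpair stores the IP address pair and skc_portpair stores the port pair of a TCP connection. Each union lets the kernel read the two halves separately or compare both at once as a single value.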
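The union layout can be imitated in userspace to see why it is convenient. This is only a sketch under simplifying assumptions: the names port_key, make_port_key and same_ports are invented for the example, and it ignores the byte-order difference between the kernel's skc_dport and skc_num fields.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Userspace imitation of the skc_portpair idea: two 16-bit ports share
 * storage with one 32-bit word, so both can be compared in one step.
 * All names here are hypothetical and exist only for illustration.
 */
union port_key {
    uint32_t pair;          /* plays the role of skc_portpair */
    struct {
        uint16_t dport;     /* remote port, like skc_dport */
        uint16_t num;       /* local port, like skc_num */
    };
};

static_assert(sizeof(union port_key) == sizeof(uint32_t),
              "both ports must fill the 32-bit word exactly");

/* Build a key from a remote port and a local port. */
union port_key make_port_key(uint16_t dport, uint16_t num)
{
    union port_key k;

    k.dport = dport;
    k.num = num;
    return k;
}

/* A single 32-bit comparison checks both ports. */
int same_ports(union port_key a, union port_key b)
{
    return a.pair == b.pair;
}
```

Two keys built from the same remote and local ports compare equal, while changing either port, or swapping them, makes the 32-bit values differ.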
The entry point for processing incoming TCP packets is tcp_v4_rcv (in net/ipv4/tcp_ipv4.c):
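int tcp_v4_rcv(struct sk_buff *skb) {
    ...
    th = tcp_hdr(skb);   // get TCP header
    iph = ip_hdr(skb);   // get IP header
    // find the socket that owns this packet, keyed by its source and destination ports
    sk = __inet_lookup_skb(&tcp_hashinfo, skb, th->source, th->dest);
    ...
}
__inet_lookup_skb hands the work to __inet_lookup (in include/net/inet_hashtables.h), which first tries to find an established socket and falls back to a listening socket if none matches: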
static inline struct sock *__inet_lookup(struct net *net,
                                         struct inet_hashinfo *hashinfo,
                                         const __be32 saddr, const __be16 sport,
                                         const __be32 daddr, const __be16 dport,
                                         const int dif)
{
    u16 hnum = ntohs(dport);   // destination (local) port in host byte order
    struct sock *sk = __inet_lookup_established(net, hashinfo,
                                                saddr, sport, daddr, hnum, dif);
    // no established match: fall back to a listening socket
    return sk ? : __inet_lookup_listener(net, hashinfo, saddr, sport,
                                         daddr, hnum, dif);
}
The established-socket lookup (__inet_lookup_established) packs the source and destination ports into one 32-bit value for comparison, hashes the addresses and ports with inet_ehashfn, and walks the selected bucket's chain to find a matching socket:
struct sock *__inet_lookup_established(...){
    const __portpair ports = INET_COMBINED_PORTS(sport, hnum);
    unsigned int hash = inet_ehashfn(net, daddr, hnum, saddr, sport);
    unsigned int slot = hash & hashinfo->ehash_mask;   // pick the bucket
    struct inet_ehash_bucket *head = &hashinfo->ehash[slot];
    sk_nulls_for_each_rcu(sk, node, &head->chain) {
        if (sk->sk_hash != hash)   // cheap reject before the full compare
            continue;
        if (likely(INET_MATCH(sk, net, acookie, saddr, daddr, ports, dif))) {
            // take a reference; this fails if the refcount already dropped to zero
            if (unlikely(!atomic_inc_not_zero(&sk->sk_refcnt)))
                goto begintw;
            // re-check after taking the reference, since the walk holds no lock
            if (unlikely(!INET_MATCH(sk, net, acookie, saddr, daddr, ports, dif))) {
                sock_put(sk);
                goto begin;
            }
            goto out;
        }
    }
    ...
}
The macro INET_MATCH compares the packet's source and destination IPs and ports with the socket's stored values, and also checks the device binding and the network namespace:
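#define INET_MATCH(__sk, __net, __cookie, __saddr, __daddr, __ports, __dif) \
    ((inet_sk(__sk)->inet_portpair == (__ports)) && \
     (inet_sk(__sk)->inet_daddr == (__saddr)) && \
     (inet_sk(__sk)->inet_rcv_saddr == (__daddr)) && \
     (!(__sk)->sk_bound_dev_if || (__sk)->sk_bound_dev_if == (__dif)) && \
     net_eq(sock_net(__sk), (__net)))
This comparison matches the four-tuple (source IP, source port, destination IP, destination port). The protocol, the fifth element of the classic five-tuple, is implied because tcp_hashinfo holds only TCP sockets. Note that the packet's source address is compared with the socket's remote address (inet_daddr), and the packet's destination with the socket's local address (inet_rcv_saddr).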
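The hash-then-compare pattern can be sketched in userspace with a small chained hash table. This is a toy model, not the kernel's implementation: toy_sock, toy_hashfn, toy_insert and toy_lookup are hypothetical names, the hash is not inet_ehashfn, and RCU, reference counting, device binding and network namespaces are all left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_BUCKETS 16  /* power of two, so "hash & mask" selects a slot */

/* One toy connection entry, seen from the host that receives the packet. */
struct toy_sock {
    uint32_t daddr;      /* remote IP */
    uint32_t rcv_saddr;  /* local IP */
    uint16_t dport;      /* remote port */
    uint16_t num;        /* local port */
    struct toy_sock *next;
};

static struct toy_sock *toy_ehash[TOY_BUCKETS];

/* Toy hash over the four-tuple; the kernel's inet_ehashfn is different. */
static unsigned int toy_hashfn(uint32_t laddr, uint16_t lport,
                               uint32_t raddr, uint16_t rport)
{
    uint32_t h = laddr ^ (raddr * 2654435761u);

    h ^= ((uint32_t)lport << 16) | rport;
    h ^= h >> 15;
    return h;
}

/* Add an entry at the head of its bucket chain. */
void toy_insert(struct toy_sock *sk)
{
    unsigned int slot;

    assert(sk != NULL);
    slot = toy_hashfn(sk->rcv_saddr, sk->num, sk->daddr, sk->dport)
           & (TOY_BUCKETS - 1);
    sk->next = toy_ehash[slot];
    toy_ehash[slot] = sk;
}

/*
 * Find the entry for a packet sent from saddr:sport to daddr:dport.
 * As in INET_MATCH, the packet's source is the socket's remote side.
 */
struct toy_sock *toy_lookup(uint32_t saddr, uint16_t sport,
                            uint32_t daddr, uint16_t dport)
{
    unsigned int slot = toy_hashfn(daddr, dport, saddr, sport)
                        & (TOY_BUCKETS - 1);
    struct toy_sock *sk;

    for (sk = toy_ehash[slot]; sk != NULL; sk = sk->next) {
        if (sk->daddr == saddr && sk->dport == sport &&
            sk->rcv_saddr == daddr && sk->num == dport)
            return sk;
    }
    return NULL;
}
```

If two entries are inserted that share local address 192.168.1.101 and local port 5000 but point at different remote addresses, a lookup for a packet from each remote address returns its own entry. That is the property that allows one client port to serve several connections.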
The following commands show the test environment and the number of established connections it held:
# cat /etc/redhat-release
Red Hat Enterprise Linux Server release 6.2 (Santiago)
# ss -ant | grep ESTAB | wc -l
1000013
# cat /proc/meminfo
MemTotal: 3925408 kB
MemFree: 97748 kB
Buffers: 35412 kB
Cached: 119600 kB
...
Slab: 3241528 kB
Conclusion
Each outbound TCP connection occupies a local port on the client. A client may appear to struggle once it holds roughly 30,000 to 50,000 connections, but the kernel can handle far more once it is tuned.
TCP connections are identified by a four‑tuple (client IP, client port, server IP, server port). Even if several connections share the same client port, a different server IP or server port keeps them distinct.
To increase client‑side concurrency you can either assign multiple IP addresses to the client or connect to many different servers; mixing both approaches is discouraged because binding to a specific IP changes the kernel’s port selection strategy.
Experiments show that, with proper tuning, a client can sustain over one million concurrent TCP connections, dispelling the myth that the 65535 port limit is an absolute barrier.
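The four-tuple argument can be turned into simple arithmetic. The sketch below computes only a theoretical upper bound on distinct four-tuples; the function name max_four_tuples is hypothetical, and memory, file descriptor limits and the kernel's port selection will usually cap the real figure lower.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Upper bound on the distinct four-tuples a client could open:
 * local ports x client IPs x server (IP, port) endpoints.
 * max_four_tuples is a hypothetical helper for illustration only.
 */
uint64_t max_four_tuples(uint64_t local_ports, uint64_t client_ips,
                         uint64_t server_endpoints)
{
    assert(local_ports <= 65535); /* a port number has only 16 bits */
    return local_ports * client_ips * server_endpoints;
}
```

With the 5000 to 65000 range (60,001 ports), one client IP and one server endpoint give at most 60,001 connections. Twenty client IPs raise the bound to 1,200,020, and a single client IP talking to 17 server endpoints raises it to 1,020,017.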
Efficient Ops
