Why Unix Domain Sockets Outperform 127.0.0.1 Loopback: Deep Dive & Benchmarks
This article explains the internal mechanics of Unix Domain Sockets, shows how they differ from traditional 127.0.0.1 TCP loopback, provides sample code for server and client, and presents benchmark results demonstrating significantly lower latency and higher throughput for local IPC.
1. Using Unix Domain Sockets
Unix Domain Sockets (UDS) use the address family AF_UNIX and identify the server by a filesystem path (e.g., /dev/shm/fpm-cgi.sock) instead of an IP/port pair.
Many applications can be configured to use a UDS. For example, Nginx forwards FastCGI requests over a Unix socket with a single directive:

fastcgi_pass unix:/dev/shm/fpm-cgi.sock;

A minimal illustrative server implementation in C:
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

int main() {
    // create Unix domain socket
    int fd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (fd == -1) { perror("socket"); return 1; }
    // bind to a filesystem path and listen
    const char *socket_path = "./server.sock";
    struct sockaddr_un addr;
    memset(&addr, 0, sizeof(addr));
    addr.sun_family = AF_UNIX;
    strncpy(addr.sun_path, socket_path, sizeof(addr.sun_path) - 1);
    unlink(socket_path); // remove a stale socket file from a previous run
    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) == -1) { perror("bind"); return 1; }
    listen(fd, 128);
    char buf[256];
    while (1) {
        // accept new connection
        int conn = accept(fd, NULL, NULL);
        if (conn == -1)
            continue;
        // read data and echo it back
        ssize_t n = read(conn, buf, sizeof(buf));
        if (n > 0)
            write(conn, buf, n);
        close(conn);
    }
    return 0;
}

A client simply creates a socket and calls connect:
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

int main() {
    int sockfd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (sockfd == -1) { perror("socket"); return 1; }
    struct sockaddr_un server_addr;
    memset(&server_addr, 0, sizeof(server_addr));
    server_addr.sun_family = AF_UNIX;
    strncpy(server_addr.sun_path, "./server.sock", sizeof(server_addr.sun_path) - 1);
    if (connect(sockfd, (struct sockaddr *)&server_addr, sizeof(server_addr)) == -1) { perror("connect"); return 1; }
    // use sockfd for read/write
    close(sockfd);
    return 0;
}

2. Connection Process
When a client calls connect, the kernel allocates a new socket for the server side, links the two sockets, and places the new socket into the server's accept queue. The core logic resides in unix_stream_connect (file net/unix/af_unix.c).
// Simplified steps inside unix_stream_connect
// 1. Allocate a new socket for the server side
newsk = unix_create1(sock_net(sk), NULL);
// 2. Allocate an skb and associate it with the new socket
skb = sock_wmalloc(newsk, 1, 0, GFP_KERNEL);
// 3. Link the two socket objects
unix_peer(newsk) = sk;
newsk->sk_state = TCP_ESTABLISHED;
newsk->sk_type = sk->sk_type;
sk->sk_state = TCP_ESTABLISHED;
unix_peer(sk) = newsk;
// 4. Queue the skb (which carries a reference to newsk) onto the listening
//    socket's receive queue, which doubles as its accept queue
__skb_queue_tail(&other->sk_receive_queue, skb);

Unlike TCP, there is no three‑way handshake, no half‑open queue, and no retransmission timers. The two sockets simply point to each other via unix_peer, and the kernel places the pending socket into the server's accept queue.
3. Data Transfer Process
Sending data over a UDS invokes unix_stream_sendmsg in the kernel. The function allocates an skb, copies user data, finds the peer socket, and enqueues the skb directly into the peer's receive queue.
static int unix_stream_sendmsg(struct kiocb *kiocb, struct socket *sock,
                               struct msghdr *msg, size_t len) {
    // 1. Allocate an skb
    struct sk_buff *skb = sock_alloc_send_skb(sk, size,
                                              msg->msg_flags & MSG_DONTWAIT, &err);
    if (!skb)
        return -ENOMEM;
    // 2. Copy user data into the skb
    err = memcpy_fromiovec(skb_put(skb, size), msg->msg_iov, size);
    if (err) {
        kfree_skb(skb); // free the skb so it is not leaked on copy failure
        return err;
    }
    // 3. Find the peer socket
    struct sock *other = unix_peer(sk);
    // 4. Enqueue the skb into the peer's receive queue
    skb_queue_tail(&other->sk_receive_queue, skb);
    // 5. Notify the peer that data is ready
    other->sk_data_ready(other, size);
    return size;
}

The receive side uses unix_stream_recvmsg, which simply reads from its own receive queue. Because data is placed directly into the peer's queue, the overhead is minimal compared with TCP.
4. Performance Comparison
Benchmarks were run with the ipc-bench tool (https://github.com/rigtorp/ipc-bench) on a 4‑core, 8 GB KVM VM. Two metrics were measured: latency and throughput.
Latency (small 100‑byte messages)
UDS average latency: 2,707 ns
TCP (127.0.0.1) average latency: 5,690 ns
For larger 100 KB messages, UDS latency ≈ 24 µs vs. TCP ≈ 32 µs.
Throughput (small messages)
UDS peak throughput: 854 MiB/s
TCP peak throughput: 386 MiB/s
5. Summary
The kernel implementation of Unix Domain Sockets bypasses most of the TCP/IP stack. Connection establishment consists of a simple socket allocation and pointer linking, and data transfer is performed by copying into an skb and enqueuing it directly on the peer's receive queue. This results in roughly half the latency and more than double the bandwidth of loopback TCP for local inter‑process communication, making UDS the preferred choice for performance‑critical local I/O.
Refining Core Development Skills
Fei has over 10 years of development experience at Tencent and Sogou. Through this account, he shares his deep insights on performance.
