What Exactly Is a Socket? From Plug Analogy to Kernel Implementation
This article explains the concept of sockets using a plug analogy, details their usage in TCP/UDP communication, explores the kernel's sock structures and inheritance tricks, and describes how sockets expose network functionality to user‑space programs through file descriptors and queues.
What Is a Socket?
Imagine a plug inserted into a socket; similarly, a network socket connects two machines, allowing data exchange. The term "socket" in hardware and programming shares this metaphor of establishing a connection.
Typical Socket Use Cases
To send data from process A on one computer to process B on another, you choose a transport protocol: reliable TCP or unreliable UDP. Beginners usually start with TCP.
Creating a TCP socket looks like:
sock_fd = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
The returned sock_fd is an integer file descriptor that identifies the socket.
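A minimal sketch of that call with error handling, for readers who want to try it. The helper name open_and_close_tcp_socket is hypothetical; socket() returns -1 on failure, so the result should be checked before it is used.

```c
#include <netinet/in.h>
#include <stdio.h>
#include <sys/socket.h>
#include <unistd.h>

/* Create a TCP/IPv4 socket, print its descriptor, and close it again.
 * Returns 0 on success, -1 on failure. (Hypothetical helper name.) */
int open_and_close_tcp_socket(void)
{
    int sock_fd = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
    if (sock_fd < 0) {
        perror("socket");
        return -1;
    }
    printf("socket() returned descriptor %d\n", sock_fd);
    return close(sock_fd);
}
```

The exact descriptor value depends on which descriptors the process already has open.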
On the server side you typically call:
bind()
listen()
accept()
On the client side you call connect(), which triggers the TCP three‑way handshake.
After the connection is established, send() and recv() exchange data between client and server.
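The whole sequence can be sketched in a single process talking to itself over the loopback address. This is an illustration under assumptions, not production code: the function name loopback_roundtrip is hypothetical, port 0 asks the kernel to pick a free port, and a real server and client would live in separate processes.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Runs the server and client ends in one process over 127.0.0.1.
 * Returns 0 if the client's message arrives intact, -1 otherwise. */
int loopback_roundtrip(void)
{
    int rc = -1;
    int listen_fd = -1, client_fd = -1, conn_fd = -1;
    struct sockaddr_in addr;
    socklen_t len = sizeof(addr);
    const char msg[] = "hello";
    char buf[sizeof(msg)] = {0};

    /* Server side: socket -> bind -> listen. */
    listen_fd = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
    if (listen_fd < 0) goto out;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(0);               /* 0 = let the kernel choose */
    if (bind(listen_fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) goto out;
    if (listen(listen_fd, 1) < 0) goto out;
    /* Ask which port the kernel assigned. */
    if (getsockname(listen_fd, (struct sockaddr *)&addr, &len) < 0) goto out;

    /* Client side: connect() triggers the three-way handshake. It returns
     * once the kernel has completed the handshake; the new connection then
     * waits in the listener's queue until accept() collects it. */
    client_fd = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
    if (client_fd < 0) goto out;
    if (connect(client_fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) goto out;

    /* Server side: accept() returns a new descriptor for this connection. */
    conn_fd = accept(listen_fd, NULL, NULL);
    if (conn_fd < 0) goto out;

    /* Exchange data: client sends, server receives. */
    if (send(client_fd, msg, sizeof(msg), 0) != (ssize_t)sizeof(msg)) goto out;
    if (recv(conn_fd, buf, sizeof(buf), MSG_WAITALL) != (ssize_t)sizeof(msg)) goto out;
    rc = (memcmp(buf, msg, sizeof(msg)) == 0) ? 0 : -1;

out:
    if (conn_fd >= 0) close(conn_fd);
    if (client_fd >= 0) close(client_fd);
    if (listen_fd >= 0) close(listen_fd);
    return rc;
}
```

Note that accept() hands back a second descriptor: the listening socket keeps listening, while conn_fd carries the data for this one client.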
Socket Design Inside the Kernel
The kernel treats network transmission as a sock structure. To distinguish endpoints, sock includes IP and port fields (via inet_sock), and different protocols are represented by specialized structs:
inet_sock: basic network socket with address fields.
inet_connection_sock: adds connection‑oriented fields (e.g., accept queue, retransmission counters).
tcp_sock: extends inet_connection_sock with TCP‑specific features such as sliding windows and congestion control.
udp_sock: UDP‑specific structure.
Unix domain socket (unix_sock): communication between processes on the same host, typically addressed by a file system path rather than an IP address and port, so the data never passes through the network stack.
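The layering of these structs can be modelled in user space. The sketch below uses heavily simplified stand-ins: the struct names and the embedding members (sk, icsk_inet, inet_conn, inet) follow the kernel, but every other field, and the helper functions, are illustrative assumptions rather than the kernel's real definitions.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins for the kernel structs. */
struct sock { int protocol; };

struct inet_sock {
    struct sock sk;                     /* generic part comes first */
    uint32_t saddr, daddr;              /* illustrative address fields */
    uint16_t sport, dport;
};

struct inet_connection_sock {
    struct inet_sock icsk_inet;         /* address part comes first */
    unsigned int accept_queue_len;      /* illustrative */
};

struct tcp_sock {
    struct inet_connection_sock inet_conn;
    uint32_t snd_cwnd;                  /* illustrative congestion window */
};

struct udp_sock {
    struct inet_sock inet;              /* UDP skips the connection layer */
    int pending;                        /* illustrative */
};

/* Each specialized struct starts with its more general parent. */
_Static_assert(offsetof(struct tcp_sock, inet_conn) == 0, "parent must be first");
_Static_assert(offsetof(struct udp_sock, inet) == 0, "parent must be first");

/* Code that only cares about addresses can work on the shared inet_sock. */
static uint16_t local_port(const struct inet_sock *inet)
{
    return inet->sport;
}

uint16_t tcp_local_port(const struct tcp_sock *tp)
{
    return local_port(&tp->inet_conn.icsk_inet);
}

uint16_t udp_local_port(const struct udp_sock *up)
{
    return local_port(&up->inet);
}
```

Both protocols reuse the same address-handling code, while only tcp_sock carries the connection-oriented layer.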
Exposing the Socket Layer to User Space
The kernel abstracts these complex structures behind simple file‑like interfaces. When a socket is created, a corresponding file descriptor is allocated; operations on this descriptor (bind, listen, connect, send, recv) are routed to the underlying sock implementation.
The file descriptor returned by socket(), the sock_fd seen earlier, is an index into the process's file descriptor table; the kernel follows that entry to reach the sock object behind it.
How Sockets Implement Network Communication
Connection Establishment
When a client calls connect(), the kernel uses the sock_fd to locate the associated sock and initiates the TCP three‑way handshake. The server maintains two queues, both set up by listen(): a half‑connection queue (hash table) for connections whose handshake has started with a SYN but not yet finished, and a full‑connection queue (linked list) for connections whose handshake has completed. accept() removes a connection from the full‑connection queue and returns it to the caller.
Data Transmission
Each sock contains a send buffer and a receive buffer, implemented as linked lists of data fragments. send() places data into the send buffer; the kernel later transmits it. recv() reads from the receive buffer; if empty, the calling process is placed on a waiting queue and sleeps until data arrives.
When data arrives, the kernel wakes the sleeping process, which then copies the data out.
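The behaviour of the receive buffer can be observed without a second machine. The sketch below is an assumption-laden stand-in: it uses a Unix domain socketpair() instead of a TCP connection for convenience, the function name recv_buffer_demo is hypothetical, and MSG_DONTWAIT is passed so that recv() reports an empty buffer with EAGAIN/EWOULDBLOCK instead of putting the process to sleep.

```c
#include <errno.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Returns 0 if recv() reports "would block" on an empty buffer and then
 * delivers the bytes once the peer has sent them; -1 otherwise. */
int recv_buffer_demo(void)
{
    int sv[2];
    char buf[8] = {0};
    int rc = -1;
    ssize_t n;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -1;

    /* The receive buffer is empty. A blocking recv() would sleep here;
     * MSG_DONTWAIT makes the kernel return an error instead. */
    n = recv(sv[0], buf, sizeof(buf), MSG_DONTWAIT);
    if (!(n == -1 && (errno == EAGAIN || errno == EWOULDBLOCK)))
        goto out;

    /* send() copies the bytes into kernel buffers and returns. */
    if (send(sv[1], "ping", 4, 0) != 4)
        goto out;

    /* The data is now waiting in the receive buffer of sv[0]. */
    n = recv(sv[0], buf, sizeof(buf), MSG_DONTWAIT);
    if (n == 4 && memcmp(buf, "ping", 4) == 0)
        rc = 0;

out:
    close(sv[0]);
    close(sv[1]);
    return rc;
}
```

Dropping MSG_DONTWAIT from the first recv() would give the sleeping behaviour described above: the call would not return until the peer sent something.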
Handling Multiple Clients
The server distinguishes clients using a four‑tuple (source IP, source port, destination IP, destination port). This tuple is hashed to a key stored in a hash table, allowing quick lookup of the correct sock for each incoming packet.
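The lookup can be illustrated with a toy hash table keyed by the four‑tuple. Everything here is a simplified assumption for illustration: the struct names, the 64‑bucket table, and the mixing function are invented and are not the kernel's actual hash or data layout.

```c
#include <stddef.h>
#include <stdint.h>

#define TABLE_SIZE 64                   /* illustrative bucket count */

struct four_tuple {
    uint32_t saddr, daddr;              /* source / destination IP */
    uint16_t sport, dport;              /* source / destination port */
};

struct conn {
    struct four_tuple key;
    int id;                             /* stands in for the kernel's sock */
    struct conn *next;                  /* chain for hash collisions */
};

struct conn_table {
    struct conn *buckets[TABLE_SIZE];
};

/* Toy mixing function, NOT the kernel's hash. */
static unsigned int tuple_hash(const struct four_tuple *t)
{
    uint32_t h = t->saddr ^ (t->daddr * 2654435761u);
    h ^= ((uint32_t)t->sport << 16) | t->dport;
    h ^= h >> 15;
    return h % TABLE_SIZE;
}

static int tuple_equal(const struct four_tuple *a, const struct four_tuple *b)
{
    return a->saddr == b->saddr && a->daddr == b->daddr &&
           a->sport == b->sport && a->dport == b->dport;
}

/* Register a connection under its four-tuple. */
void conn_insert(struct conn_table *tbl, struct conn *c)
{
    unsigned int b = tuple_hash(&c->key);
    c->next = tbl->buckets[b];
    tbl->buckets[b] = c;
}

/* Find the connection for an incoming packet's four-tuple, or NULL. */
struct conn *conn_lookup(const struct conn_table *tbl, const struct four_tuple *key)
{
    struct conn *c = tbl->buckets[tuple_hash(key)];
    while (c != NULL && !tuple_equal(&c->key, key))
        c = c->next;
    return c;
}
```

Two clients on the same machine differ only in source port, which is enough to give them different keys; a collision in the bucket index is resolved by comparing the full tuple along the chain.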
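The Thundering Herd Problem
Before Linux 2.6, every process blocked in accept() on the same listening socket was woken when a connection arrived, even though only one of them could take it; the rest went back to sleep, and the wake‑ups were wasted. Since 2.6, the kernel wakes only one of the processes waiting in accept(), which removes the thundering herd in that case.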
How C Simulates Inheritance for Sockets
Linux kernel code places the "parent" struct as the first member of a derived struct, allowing a pointer cast to treat a generic sock as a specific tcp_sock or udp_sock. Example:
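struct tcp_sock {
    /* inet_connection_sock has to be the first member */
    struct inet_connection_sock inet_conn;
    /* other fields */
};

static inline struct tcp_sock *tcp_sk(const struct sock *sk)
{
    /* Valid because the sock sits at offset 0 of tcp_sock */
    return (struct tcp_sock *)sk;
}
Summary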
In the kernel, network transmission is implemented by a hierarchy of sock structures (sock → inet_sock → inet_connection_sock → tcp_sock/udp_sock).
The socket file descriptor (sock_fd) is a user‑space handle that maps to these kernel objects, exposing network functionality via familiar file operations.
Connection establishment uses half‑ and full‑connection queues; data transfer relies on send/receive buffers and waiting queues.
Servers differentiate clients using the four‑tuple (source/destination IP and port) hashed into a lookup table.
Since kernel 2.6, Linux has mitigated the thundering herd problem for processes blocked in accept() by waking only one of them.
C achieves “inheritance” by embedding the parent struct as the first member of the child struct, enabling safe pointer casts.
Liangxu Linux
Liangxu, a self‑taught IT professional now working as a Linux development engineer at a Fortune 500 multinational, shares extensive Linux knowledge—fundamentals, applications, tools, plus Git, databases, Raspberry Pi, etc. (Reply “Linux” to receive essential resources.)
