Why Does TCP Send‑Q Exceed SO_SNDBUF? Deep Dive into Linux Kernel Buffer Mechanics

A Linux client that sets SO_SNDBUF to 4096 bytes and writes 1 KB at a time to a server that never reads sees its TCP Send‑Q grow to 14480 bytes. The cause lies in how the kernel doubles the requested buffer size, accounts memory in sk_wmem_queued, and fills each sk_buff up to a GSO‑driven size goal.

Liangxu Linux

Experiment Setup and Expected Phases

A client creates a TCP socket, sets SO_SNDBUF to 4096 bytes, and writes 1024 bytes every second to a server that never calls recv(). The expected behavior is divided into three phases:

Phase 1: The server’s receive buffer is not full, so the client receives ACKs despite the server not reading data.

Phase 2: The server’s receive buffer becomes full, the kernel advertises a zero‑window, and the client’s data starts queuing in its send buffer.

Phase 3: The client’s send buffer fills and the send() call blocks.
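The client side of this experiment can be sketched as follows. This is a minimal sketch, not the author's test program: the helper name run_client and its parameters are hypothetical, and setting SO_SNDBUF before connect() is an assumption made here.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the original article): connect to ip:port
 * with SO_SNDBUF set to `sndbuf`, then write 1024 bytes `writes` times,
 * sleeping `interval_s` seconds after each write.
 * Returns the number of completed writes, or -1 if setup fails.
 */
int run_client(const char *ip, int port, int sndbuf, int writes,
               unsigned int interval_s)
{
    char payload[1024];
    struct sockaddr_in addr;
    int done = 0;

    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    /* Assumption: the buffer size is requested before connecting. */
    if (setsockopt(fd, SOL_SOCKET, SO_SNDBUF, &sndbuf, sizeof(sndbuf)) < 0)
        goto fail;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_port = htons((unsigned short)port);
    if (inet_pton(AF_INET, ip, &addr.sin_addr) != 1)
        goto fail;
    if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0)
        goto fail;

    memset(payload, 'x', sizeof(payload));
    while (done < writes) {
        /* In Phase 3 this call blocks once the send buffer is full. */
        if (send(fd, payload, sizeof(payload), MSG_NOSIGNAL) < 0)
            break;
        done++;
        sleep(interval_s);
    }
    close(fd);
    return done;

fail:
    close(fd);
    return -1;
}
```

Calling run_client("192.0.2.10", 9000, 4096, 100, 1) against a listener that accepts but never reads would walk through the three phases; the address and port here are placeholders.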

Observed Anomaly

Monitoring the connection with ss -nt shows the Send‑Q value rising from 0 to 14480, far exceeding the configured SO_SNDBUF of 4096 bytes.
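The same counter can also be sampled from inside the client. On Linux, tcp(7) documents the SIOCOUTQ ioctl for querying the socket send queue; the sketch below assumes it tracks the quantity that ss reports as Send‑Q for an established connection. The function name send_queue_bytes is hypothetical.

```c
#include <linux/sockios.h>  /* SIOCOUTQ */
#include <sys/ioctl.h>
#include <sys/socket.h>

/*
 * Hypothetical helper: return the number of bytes currently sitting in
 * the socket's send queue, or -1 if the ioctl fails.
 */
int send_queue_bytes(int fd)
{
    int pending = 0;

    if (ioctl(fd, SIOCOUTQ, &pending) < 0)
        return -1;
    return pending;
}
```

Logging this value after each send() call gives a per-write view of the queue without polling ss.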

Why the Kernel Doubles SO_SNDBUF

When a user sets SO_SNDBUF, the kernel first caps the requested value at net.core.wmem_max and then stores sk->sk_sndbuf = max(val * 2, SOCK_MIN_SNDBUF). Thus a 4096‑byte request is recorded as 8192 bytes, and getsockopt() returns the doubled value. The kernel doubles the value to reserve space for internal overhead such as struct sk_buff, struct skb_shared_info, and L2‑L4 headers.
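The doubling is easy to observe from user space by reading the option back. A minimal sketch, with the hypothetical helper name effective_sndbuf; on a Linux host whose wmem_max is at least 4096, a request for 4096 is expected to read back as 8192, though the exact value depends on the kernel and its sysctl settings.

```c
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper: request `requested` bytes of send buffer on a
 * fresh TCP socket and return what getsockopt() reports afterwards,
 * or -1 on any failure.
 */
int effective_sndbuf(int requested)
{
    int actual = 0;
    socklen_t len = sizeof(actual);

    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    if (setsockopt(fd, SOL_SOCKET, SO_SNDBUF, &requested, sizeof(requested)) < 0 ||
        getsockopt(fd, SOL_SOCKET, SO_SNDBUF, &actual, &len) < 0)
        actual = -1;

    close(fd);
    return actual;
}
```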

Understanding sk_wmem_queued

The kernel tracks the actual memory used by the socket's write queue in sk->sk_wmem_queued, which includes both user data and the overhead of each sk_buff. Send‑Q, by contrast, counts only payload bytes, so sk_wmem_queued is larger than the visible Send‑Q value whenever data is queued.

How tcp_sendmsg Determines the Write Size

The function first checks whether the write queue already contains a partially filled sk_buff. If not, it allocates a new sk_buff; otherwise it attempts to append data to the last sk_buff. The amount that can be appended is calculated as copy = size_goal - skb->len, where size_goal is derived from the current MSS and the GSO (Generic Segmentation Offload) setting.

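/* Abridged from tcp_sendmsg(); declarations, locking, and error paths are omitted. */
int tcp_sendmsg(struct sock *sk, struct msghdr *msg, size_t size) {
    /* size_goal: how many bytes one sk_buff should hold (the MSS, or a multiple of it with GSO) */
    mss_now = tcp_send_mss(sk, &size_goal, flags);
    while (msg_data_left(msg)) {
        int copy = 0;
        int max = size_goal;
        skb = tcp_write_queue_tail(sk);
        /* unsent data exists: compute the room left in the tail sk_buff */
        if (tcp_send_head(sk)) {
            copy = max - skb->len;
        }
        /* case 1: the tail sk_buff is full or absent, so a new one is needed */
        if (copy <= 0) {
            /* this branch is where sk_wmem_queued is compared against sk_sndbuf */
            if (!sk_stream_memory_free(sk))
                goto wait_for_sndbuf;
            skb = sk_stream_alloc_skb(sk, select_size(sk, sg), sk->sk_allocation, skb_queue_empty(&sk->sk_write_queue));
            copy = size_goal;
        }
        /* case 2: copy msg to last skb, never more than the caller supplied */
        if (copy > msg_data_left(msg))
            copy = msg_data_left(msg);
        if (!sk_wmem_schedule(sk, copy))
            goto wait_for_memory;
        err = skb_copy_to_page_nocache(sk, &msg->msg_iter, skb, pfrag->page, pfrag->offset, copy);
    }
}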

Size Goal Calculation

The kernel computes size_goal in tcp_xmit_size_goal. When GSO is enabled, size_goal = tp->gso_segs * mss_now; otherwise it equals mss_now. In the author's test environment, mss_now is 1448 bytes, and with GSO enabled the kernel reports a size_goal of 14480 bytes, exactly ten times the MSS.
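The rule can be restated as a small user-space function. This is a simplified model of the description above, not the kernel's tcp_xmit_size_goal: the name size_goal_sketch is hypothetical, and the way the kernel derives gso_segs is left out and passed in as a parameter.

```c
/*
 * Simplified model: with GSO the goal is gso_segs full-sized segments,
 * without GSO it is a single MSS. The result is never below one MSS.
 */
unsigned int size_goal_sketch(unsigned int mss_now, unsigned int gso_segs,
                              int gso_enabled)
{
    unsigned int goal;

    if (!gso_enabled)
        return mss_now;

    goal = gso_segs * mss_now;
    return goal > mss_now ? goal : mss_now;
}
```

With the article's numbers, size_goal_sketch(1448, 10, 1) yields 14480 and size_goal_sketch(1448, 10, 0) yields 1448.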

Impact on the Send Buffer

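During Phase 2, the tail sk_buff already holds unsent data, so tcp_sendmsg computes the remaining room as copy = size_goal - skb->len; after the first queued 1024‑byte write that is 14480 - 1024 = 13456 bytes. Because copy is positive, the loop skips the branch that calls sk_stream_memory_free, which is where sk_wmem_queued is compared against sk_sndbuf, and appends the next write to the existing sk_buff. The only check left on this path is sk_wmem_schedule, which charges the bytes to the socket's memory accounting and does not enforce sk_sndbuf. Writes therefore keep landing in the same sk_buff until it reaches size_goal, which allows sk_wmem_queued, and consequently the observed Send‑Q, to climb to 14480 bytes even though sk_sndbuf is only 8192.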
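The effect can be reproduced with a toy user-space model of the loop. It is a sketch under loud assumptions: nothing is ever acknowledged (as in Phase 2), memory is counted as payload plus a caller-supplied per-sk_buff overhead rather than the kernel's real truesize accounting, and the function name simulate_send_q is hypothetical.

```c
/*
 * Toy model of the tcp_sendmsg append loop while the peer's window is zero.
 * Returns the payload bytes queued at the moment a write would block,
 * or -1 for invalid parameters.
 */
long simulate_send_q(long sndbuf, long size_goal, long write_size,
                     long skb_overhead)
{
    long wmem_queued = 0;  /* stands in for sk_wmem_queued */
    long send_q = 0;       /* payload bytes queued so far */
    long tail_len = 0;     /* bytes in the tail sk_buff */
    int have_tail = 0;

    if (sndbuf <= 0 || size_goal <= 0 || write_size <= 0 || skb_overhead < 0)
        return -1;

    for (;;) {                          /* one iteration per send() call */
        long left = write_size;

        while (left > 0) {
            long copy = have_tail ? size_goal - tail_len : 0;

            if (copy <= 0) {
                /* The limit is checked only when a new sk_buff is needed. */
                if (wmem_queued >= sndbuf)
                    return send_q;      /* the writer would block here */
                have_tail = 1;
                tail_len = 0;
                wmem_queued += skb_overhead;
                copy = size_goal;
            }
            if (copy > left)
                copy = left;

            /* Appending to the tail never consults sndbuf. */
            tail_len += copy;
            wmem_queued += copy;
            send_q += copy;
            left -= copy;
        }
    }
}
```

With the article's numbers, simulate_send_q(8192, 14480, 1024, 0) returns 14480: the first sk_buff is admitted while the queue is empty, and every later write is appended without a limit check until the sk_buff reaches size_goal.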

Possible Mitigations

Disable the network interface’s GSO feature, which reduces size_goal to the MSS and prevents the large copy size.

Patch the kernel to move the sk_stream_memory_free check to the beginning of the while loop in tcp_sendmsg, ensuring the buffer limit is enforced before any allocation.

These changes would make the SO_SNDBUF setting effective again, preventing the kernel from silently expanding the send buffer beyond the configured limit.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.

Written by

Liangxu Linux

Liangxu, a self‑taught IT professional now working as a Linux development engineer at a Fortune 500 multinational, shares extensive Linux knowledge: fundamentals, applications, tools, plus Git, databases, Raspberry Pi, and more.
