Why Linux veth Devices Miss TCP Checksums in Containers and How to Fix It

A Linux kernel bug caused virtual Ethernet (veth) devices in container environments to skip TCP checksum validation, so corrupted data could reach applications. This article covers the incident, the investigation, a reproducible test, the root cause in the veth driver, and the patch that resolves the issue.

ITPUB

Background

A bug in the Linux kernel causes TCP checksum verification to be skipped for packets delivered to containers over veth devices (e.g., Docker on IPv6, Kubernetes, Google Container Engine, Mesos). Under certain hardware conditions, corrupted packets are therefore delivered to applications instead of being dropped.
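
To make concrete what is being skipped, here is a minimal sketch of the 16-bit ones' complement checksum arithmetic described in RFC 1071. It is an illustration only: a real TCP checksum also covers the TCP header and a pseudo-header, and the helper names `internet_checksum` and `flip_bit` are invented for this example.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit ones' complement checksum in the style of RFC 1071."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]  # big-endian 16-bit word
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold carries back in
    return ~total & 0xFFFF


def flip_bit(data: bytes, bit_index: int) -> bytes:
    """Return a copy of data with one bit inverted, simulating corruption."""
    corrupted = bytearray(data)
    corrupted[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(corrupted)
```

A receiver that recomputes this value notices a single flipped bit, because the checksum of the corrupted bytes no longer matches the one the sender computed. If verification is skipped, nothing downstream catches the change.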

Incident at Twitter

During a weekend on‑call shift, Twitter engineers observed "impossible" errors in several services, with malformed strings and missing fields. After extensive application‑level debugging, the issue was traced to a subset of racks where TCP checksum errors spiked before the failures, indicating a network‑layer problem rather than application overload.

Investigation and Reproduction

The team reproduced the bug with a simple client-server test: a client sends a long message every second, the server listens with nc, and the tc tool corrupts the client's packets before transmission. When the server runs on bare metal, no corrupted data reaches it, because the kernel drops packets that fail checksum verification; when the server runs inside a container behind a veth interface, corrupted data is delivered to the application.
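
The sending side of such a test could look roughly like the sketch below. The payload, the addresses, and the function names are assumptions made for illustration, not the code the team used.

```python
import socket
import time
from typing import List

# Hypothetical payload: long and repetitive, so a flipped byte is easy to spot.
MESSAGE = b"a" * 1000 + b"\n"


def send_messages(sock: socket.socket, count: int, interval: float = 1.0) -> None:
    """Send MESSAGE `count` times over a connected socket, pausing in between."""
    for _ in range(count):
        sock.sendall(MESSAGE)
        time.sleep(interval)


def corrupted_offsets(received: bytes) -> List[int]:
    """Return the offsets where `received` differs from the repeated MESSAGE."""
    return [
        i for i, byte in enumerate(received)
        if byte != MESSAGE[i % len(MESSAGE)]
    ]


# Assumed usage (addresses, port, and device name are placeholders):
#   receiver:  nc -l 8080          (some nc variants need: nc -l -p 8080)
#   sender:    tc qdisc add dev eth0 root netem corrupt 1%
#   sender:    conn = socket.create_connection(("10.0.0.2", 8080))
#              send_messages(conn, count=60)
```

On a correctly behaving receiver, `corrupted_offsets` applied to the captured stream should return an empty list, since corrupted segments are dropped and retransmitted. Any non-empty result means corrupted bytes reached the application.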

Linux Networking and Containers

In Linux, a veth pair is a pair of connected virtual Ethernet devices: a packet sent into one end is received on the other. It is commonly used to connect a container's network namespace to the host. The typical setup involves creating a network namespace, creating a veth pair, moving one end into the namespace, assigning IP addresses, and configuring routing.
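
The setup steps can be sketched as a list of iproute2 commands. The namespace name, device names, and addresses below are placeholders chosen for this example, and the function only builds the command lines; it does not run them.

```python
from typing import List


def veth_setup_commands(
    netns: str = "demo",             # hypothetical namespace name
    host_if: str = "veth0",          # hypothetical host-side device
    peer_if: str = "veth1",          # hypothetical container-side device
    host_cidr: str = "10.0.0.1/24",  # placeholder addresses
    peer_cidr: str = "10.0.0.2/24",
) -> List[List[str]]:
    """Build the iproute2 commands that wire a namespace to the host via veth."""
    in_ns = ["ip", "netns", "exec", netns]
    gateway = host_cidr.split("/")[0]
    return [
        # 1. Create the network namespace.
        ["ip", "netns", "add", netns],
        # 2. Create the veth pair.
        ["ip", "link", "add", host_if, "type", "veth", "peer", "name", peer_if],
        # 3. Move one end into the namespace.
        ["ip", "link", "set", peer_if, "netns", netns],
        # 4. Assign addresses and bring both ends up.
        ["ip", "addr", "add", host_cidr, "dev", host_if],
        ["ip", "link", "set", host_if, "up"],
        in_ns + ["ip", "addr", "add", peer_cidr, "dev", peer_if],
        in_ns + ["ip", "link", "set", peer_if, "up"],
        # 5. Route the namespace's traffic through the host end.
        in_ns + ["ip", "route", "add", "default", "via", gateway],
    ]
```

Each entry can be passed to `subprocess.run(cmd, check=True)` as root. Container runtimes perform equivalent steps themselves, so this is only useful for experiments on a test machine.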

Root Cause

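Inspection of the veth driver revealed that veth_xmit rewrote a packet's checksum status from CHECKSUM_NONE to CHECKSUM_UNNECESSARY whenever the peer device advertised NETIF_F_RXCSUM. CHECKSUM_NONE means that nothing has verified the checksum yet, so the stack must check it in software; CHECKSUM_UNNECESSARY tells the stack that the checksum has already been validated. Packets that arrived from a physical NIC without hardware verification and were then forwarded across the veth pair were therefore treated as valid, which effectively disabled software checksum verification for them.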
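
The effect of that rewrite can be illustrated with a deliberately simplified model. This is not kernel code: the function names are invented, only two of the checksum states are modeled, and the receive path is reduced to a single decision.

```python
from enum import Enum


class IpSummed(Enum):
    CHECKSUM_NONE = 0         # checksum not verified yet; software must check it
    CHECKSUM_UNNECESSARY = 1  # checksum already validated; skip the check


def veth_xmit_before_patch(ip_summed: IpSummed, peer_has_rxcsum: bool) -> IpSummed:
    """Model of the old behavior: unverified packets are marked as verified."""
    if ip_summed is IpSummed.CHECKSUM_NONE and peer_has_rxcsum:
        return IpSummed.CHECKSUM_UNNECESSARY
    return ip_summed


def veth_xmit_after_patch(ip_summed: IpSummed, peer_has_rxcsum: bool) -> IpSummed:
    """Model of the patched behavior: the checksum status is left untouched."""
    return ip_summed


def stack_delivers(ip_summed: IpSummed, checksum_is_valid: bool) -> bool:
    """Simplified receive path: verify in software only when required."""
    if ip_summed is IpSummed.CHECKSUM_UNNECESSARY:
        return True  # trusted as-is, even if the data is corrupt
    return checksum_is_valid  # software verification drops bad packets
```

In this model, a corrupted packet that enters with `CHECKSUM_NONE` is delivered when it passes through `veth_xmit_before_patch` with `peer_has_rxcsum=True`, and dropped when it passes through `veth_xmit_after_patch`.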

Patch

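The fix removes the erroneous assignment, so forwarded packets keep their CHECKSUM_NONE status and the receiving stack verifies them in software. Packets with bad checksums are dropped, and the sender's TCP retransmits them. The relevant diff is shown below: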

diff --git a/drivers/net/veth.c b/drivers/net/veth.c
index 0ef4a5a..ba21d07 100644
--- a/drivers/net/veth.c
+++ b/drivers/net/veth.c
@@ -117,12 +117,6 @@ static netdev_tx_t veth_xmit(struct sk_buff *skb, struct net_device *dev)
     kfree_skb(skb);
     goto drop;
 }
-/* don't change ip_summed == CHECKSUM_PARTIAL, as that
- * will cause bad checksum on forwarded packets
- */
-if (skb->ip_summed == CHECKSUM_NONE &&
-    rcv->features & NETIF_F_RXCSUM)
-    skb->ip_summed = CHECKSUM_UNNECESSARY;
 
 if (likely(dev_forward_skb(rcv, skb) == NET_RX_SUCCESS)) {
     struct pcpu_vstats *stats = this_cpu_ptr(dev->vstats);
*** End of diff ***

Resolution

Alongside the kernel patch, Twitter's Mesos team rolled out a change to all production containers that disables RX checksum offloading on each veth device. The fix was also back-ported to the 3.14 and later stable kernels and picked up by distributors (e.g., SUSE, Canonical). Docker's default NAT network is unaffected, but Docker's IPv6 mode exhibits the same problem.
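
On a kernel without the fix, RX checksum offloading can be switched off per device with ethtool. The sketch below builds that command and parses the feature listing printed by `ethtool -k`; the function names are invented for this example, and `veth0` in the usage note is a placeholder.

```python
from typing import List, Optional


def disable_rx_checksum_command(device: str) -> List[str]:
    """Build the ethtool command that turns RX checksum offloading off."""
    return ["ethtool", "-K", device, "rx", "off"]


def rx_checksumming_enabled(ethtool_k_output: str) -> Optional[bool]:
    """Read the rx-checksumming state from `ethtool -k <device>` output.

    Returns None when the feature line is not present.
    """
    for line in ethtool_k_output.splitlines():
        name, _, value = line.partition(":")
        if name.strip() == "rx-checksumming":
            return value.strip().startswith("on")
    return None
```

For example, running `disable_rx_checksum_command("veth0")` through `subprocess.run` as root and then feeding the output of `ethtool -k veth0` to `rx_checksumming_enabled` should yield `False` if the change took effect.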

Conclusion

The rapid response from the Linux networking maintainers demonstrates the importance of thorough kernel testing in containerized environments. This long‑standing bug could cause silent data corruption and application crashes, highlighting the need for proper checksum handling in virtual network interfaces.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.
