
TCP Out‑of‑Memory on AWS EC2: Diagnosis and Kernel Parameter Tuning

An AWS EC2 instance behind an Elastic Load Balancer became unresponsive due to repeated TCP out‑of‑memory errors, which were resolved by examining kernel messages, adjusting tcp_mem‑related kernel parameters, and rebooting the server after a long uptime.

Cognitive Technology Team

In a production environment, several EC2 instances run a Java 8/Tomcat 8 application behind an Elastic Load Balancer. One instance suddenly stopped responding while the others continued to handle traffic; requests routed to it returned a proxy error page reporting "Proxy Error – The proxy server received an invalid response from an upstream server – Reason: Error reading from remote server".

APM monitoring showed normal CPU and memory usage but no traffic reaching the problematic instance, and standard diagnostics (vmstat, iostat, netstat, top, df) revealed nothing abnormal. Restarting Tomcat also failed to restore responsiveness.

Running dmesg on the instance displayed many lines like:

[4486500.513856] TCP: out of memory -- consider tuning tcp_mem

indicating that the shortage originated in the TCP layer rather than in the application code.
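When the usual application-level tools show nothing, the kernel's own counters are the place to look. A minimal check along these lines, assuming a Linux host with a readable /proc (the paths and field meanings are standard Linux, not specific to this incident):

```shell
#!/bin/sh
# Check the kernel ring buffer for TCP memory-pressure messages.
dmesg 2>/dev/null | grep -i "tcp: out of memory" \
    || echo "no TCP OOM messages in ring buffer"

# Socket memory usage; the TCP "mem" field is counted in pages (usually 4 KiB).
cat /proc/net/sockstat

# tcp_mem holds three page-count thresholds: low, pressure, high.
# The "TCP: out of memory" message fires once usage crosses the high mark.
cat /proc/sys/net/ipv4/tcp_mem
```

Comparing the "mem" value in /proc/net/sockstat against the third tcp_mem threshold shows how close the instance is to the condition that triggered the dmesg error.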

Web searches for the exact error message yielded almost no useful results. The issue was temporarily fixed by rebooting the EC2 instance, which had been up for over 70 days, suggesting that long-running operation had gradually exhausted TCP memory resources.

A colleague recommended checking several kernel parameters: net.core.netdev_max_backlog, net.core.rmem_max, net.core.wmem_max, net.ipv4.tcp_max_syn_backlog, net.ipv4.tcp_rmem, and net.ipv4.tcp_wmem. The current values were:

net.core.netdev_max_backlog = 1000
net.core.rmem_max = 212992
net.core.wmem_max = 212992
net.ipv4.tcp_max_syn_backlog = 256
net.ipv4.tcp_rmem = 4096 87380 6291456
net.ipv4.tcp_wmem = 4096 20480 4194304
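These values can be read without root. A sketch that goes through /proc/sys directly (equivalent to `sysctl <name>`; dots in a parameter name map to slashes in the path):

```shell
#!/bin/sh
# Print each parameter in "name = value" form, reading from /proc/sys.
for p in net.core.netdev_max_backlog net.core.rmem_max net.core.wmem_max \
         net.ipv4.tcp_max_syn_backlog net.ipv4.tcp_rmem net.ipv4.tcp_wmem; do
    printf '%s = %s\n' "$p" "$(cat /proc/sys/$(echo "$p" | tr . /))"
done
```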

Increasing these limits to the following values resolved the problem:

net.core.netdev_max_backlog = 30000
net.core.rmem_max = 134217728
net.core.wmem_max = 134217728
net.ipv4.tcp_max_syn_backlog = 8192
net.ipv4.tcp_rmem = 4096 87380 67108864
net.ipv4.tcp_wmem = 4096 87380 67108864
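To survive the next reboot, the increased limits belong in a sysctl configuration file rather than being set only at runtime. A sketch, with a hypothetical file name under /etc/sysctl.d/:

```
# /etc/sysctl.d/99-tcp-tuning.conf (hypothetical name)
net.core.netdev_max_backlog = 30000
net.core.rmem_max = 134217728
net.core.wmem_max = 134217728
net.ipv4.tcp_max_syn_backlog = 8192
net.ipv4.tcp_rmem = 4096 87380 67108864
net.ipv4.tcp_wmem = 4096 87380 67108864
```

The file can be applied immediately with `sudo sysctl -p /etc/sysctl.d/99-tcp-tuning.conf` (or `sudo sysctl --system` to reload all configuration files).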

Key takeaways:

Modern APM tools have limits: they may not surface low‑level kernel memory issues.

dmesg is valuable: kernel logs can reveal the root cause when applications appear hung.

Memory problems can occur outside the application layer: TCP and kernel buffers may be exhausted, requiring tuning of sysctl parameters.

Tags: operations, TCP, linux, AWS, kernel-tuning, EC2, out of memory