How a Hidden Uint Overflow Triggered Massive Traffic Spikes and the Memory‑Leak Mystery I Solved

This article recounts a developer's journey from a fresh graduate to a senior backend engineer, detailing two real‑world incidents—a pseudo‑memory‑leak in a C++ service and a uint overflow that caused traffic bursts—showing the analysis steps, code fixes, and lessons learned for reliable backend development.

Programmer DD

From Fresh Graduate to Independent Engineer

The author joined a startup game company as a server developer straight after graduating, learned quickly under a mentor, and was handling tasks independently within three months.

The author emphasizes the importance of a good mentor for new graduates and urges senior engineers to treat junior colleagues kindly.

Case Study 1: Pseudo‑Memory‑Leak in a C++ Service

Background

The service exhibited a steady memory increase of 4–8 KB every few seconds after deployment, suggesting a leak.

Analysis Process

Running Valgrind repeatedly showed no explicit leaks but many unreleased allocations. Manual module inspection and demo programs also failed to locate the leak.

After two weeks of intensive testing, a queue initialized with a capacity of 10 million elements was identified as the culprit. Because the queue was so large, the process's memory kept growing as items passed through it, even though nothing was actually being leaked.

Root Cause

Each newly allocated object added to the queue caused more virtual memory to be backed by physical memory. When objects were dequeued and freed, that physical memory was not immediately reclaimed, so usage crept upward until the queue reached its limit.

Reducing the queue length to 20 000 elements balanced memory usage, eliminating the growth.
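The root cause above rests on the gap between reserved and resident memory. The following is a minimal sketch of that effect, not the service's code: it assumes Linux (it reads /proc/self/statm), and the names resident_bytes, MemorySample, and measure_lazy_mapping are hypothetical helpers introduced here for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstdlib>
#include <unistd.h>

// Resident set size of this process in bytes, read from /proc/self/statm.
// Linux-only; returns -1 where that file is unavailable.
long resident_bytes() {
    std::FILE* f = std::fopen("/proc/self/statm", "r");
    if (f == nullptr) return -1;
    long total_pages = 0, resident_pages = 0;
    const int fields = std::fscanf(f, "%ld %ld", &total_pages, &resident_pages);
    std::fclose(f);
    if (fields != 2) return -1;
    return resident_pages * sysconf(_SC_PAGESIZE);
}

struct MemorySample {
    long before;         // resident bytes before reserving the buffer
    long after_reserve;  // resident bytes after malloc, before any write
    long after_touch;    // resident bytes after writing one byte per page
};

// Hypothetical helper: reserve `bytes`, then touch every page once.
MemorySample measure_lazy_mapping(std::size_t bytes) {
    assert(bytes > 0);
    MemorySample s{};
    s.before = resident_bytes();
    void* buf = std::malloc(bytes);
    assert(buf != nullptr);
    s.after_reserve = resident_bytes();
    const std::size_t page = static_cast<std::size_t>(sysconf(_SC_PAGESIZE));
    // volatile writes so the compiler cannot drop the page touches
    volatile char* p = static_cast<volatile char*>(buf);
    for (std::size_t off = 0; off < bytes; off += page) p[off] = 1;
    s.after_touch = resident_bytes();
    std::free(buf);
    return s;
}
```

On a typical Linux system, after_reserve stays close to before, while after_touch grows by roughly the buffer size. A very large queue whose slots are first written gradually would show the same slow climb in resident memory, which is easy to mistake for a leak.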

Conclusion

The issue was a pseudo-leak caused by an oversized queue rather than an actual memory leak, so the service could be released safely.

Case Study 2: Traffic Spikes Caused by Uint Overflow

Background

After six years in the game industry, the author switched to internet backend development and ran into recurring high-traffic incidents. Network traffic peaked at about 3 Tbps, which led ISPs to ban IPs and caused significant financial loss.

Analysis Process

Monitoring revealed that incidents occurred roughly every 50 days. The author hypothesized a link to 32-bit unsigned integer overflow: a uint32 millisecond counter wraps after 2^32 ≈ 4.29 × 10⁹ ms, which is about 49.7 days.

The arithmetic supported the hypothesis: 50 days ≈ 4.32 × 10⁹ ms, close to the uint32 maximum of 4,294,967,295.
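The period arithmetic can be checked at compile time. This is a small sketch; kMsPerDay and wrap_period_days are names introduced here for illustration and do not come from the original service.

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uint64_t kMsPerDay = 24ull * 60 * 60 * 1000;  // 86,400,000 ms

// Days until a millisecond counter of the given bit width wraps back to zero.
constexpr double wrap_period_days(unsigned bits) {
    return static_cast<double>(std::uint64_t{1} << bits) /
           static_cast<double>(kMsPerDay);
}

// 2^32 ms is 4,294,967,296 ms, slightly less than the 4.32e9 ms in 50 days.
static_assert((std::uint64_t{1} << 32) == 4294967296ull, "2^32 ms");
static_assert(50 * kMsPerDay == 4320000000ull, "50 days in ms");
static_assert(wrap_period_days(32) > 49.7 && wrap_period_days(32) < 49.72,
              "a uint32 millisecond counter wraps about every 49.7 days");
```

The wrap period of about 49.7 days is consistent with incidents observed "roughly every 50 days".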

Investigation

The function used to obtain the current timestamp was:

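    #include <stddef.h>
    #include <stdint.h>
    #include <sys/time.h>

    uint64_t now_ms() {
        struct timeval t;
        gettimeofday(&t, NULL);
        /* Bug: where tv_sec is 32 bits wide, this product is computed in 32 bits. */
        return t.tv_sec * 1000 + t.tv_usec / 1000;
    }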

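On 32-bit systems, where tv_sec is typically a 32-bit integer, t.tv_sec * 1000 was evaluated in 32 bits and overflowed before the result was widened to uint64_t. Each time the product wrapped, the returned value jumped abruptly instead of increasing steadily.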
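The wrap can be reproduced on any platform by modeling the narrow multiplication explicitly. In this sketch the functions now_ms_narrow and now_ms_wide are hypothetical, and uint32_t stands in for the 32-bit tv_sec so that the wraparound is well defined (overflow of a signed 32-bit value would be undefined behavior in C and C++).

```cpp
#include <cassert>
#include <cstdint>

// Models the buggy build: the sum is formed in 32 bits and wraps modulo 2^32
// before it is widened to 64 bits.
std::uint64_t now_ms_narrow(std::uint32_t tv_sec, std::uint32_t tv_usec) {
    const std::uint32_t ms = tv_sec * 1000u + tv_usec / 1000u;
    return ms;
}

// Widening tv_sec first keeps the whole computation in 64 bits.
std::uint64_t now_ms_wide(std::uint32_t tv_sec, std::uint32_t tv_usec) {
    return static_cast<std::uint64_t>(tv_sec) * 1000u + tv_usec / 1000u;
}
```

With tv_sec = 4294967 the narrow version returns 4294967000, but one second later (tv_sec = 4294968) it returns 704, because 4294968000 exceeds 2^32 by 704. The wide version returns 4294968000 as expected.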

The client calculated the next ping time as now_ms() + 30000. When the overflow occurred, next_ping wrapped around to a small value while the current time was still large, so the client treated a ping as overdue on every check. It sent ping packets continuously for about 26 seconds, generating massive traffic.
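The effect of a wrapped next_ping can be simulated with a small scheduler loop. This is a sketch under stated assumptions: count_pings, the tick loop, and the 100 ms tick used in the usage note are hypothetical stand-ins for the real client, and the sketch does not try to reproduce the 26-second figure, which depends on client details not given here.

```cpp
#include <cassert>
#include <cstdint>

// Counts pings sent by a naive "now >= next_ping" scheduler whose clock has
// type Clock. With a 32-bit Clock, now + 30000 can wrap to a small value.
template <typename Clock>
int count_pings(Clock start_ms, int ticks, Clock tick_ms) {
    const Clock interval_ms = 30000;
    Clock next_ping = static_cast<Clock>(start_ms + interval_ms);
    int pings = 0;
    for (int i = 0; i < ticks; ++i) {
        // Simulated clock reading for this tick (wraps if Clock is 32 bits).
        const Clock now =
            static_cast<Clock>(start_ms + static_cast<Clock>(i) * tick_ms);
        if (now >= next_ping) {
            ++pings;
            next_ping = static_cast<Clock>(now + interval_ms);
        }
    }
    return pings;
}
```

Starting 60 seconds before the wrap (start_ms = 4294907296) and running 900 ticks of 100 ms, count_pings<std::uint32_t> reports 301 pings, while count_pings<std::uint64_t> reports 2 for the same 90 seconds. In the 32-bit run, every tick in the window before the wrap sends a ping, which is the storm pattern described above.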

Verification

By setting a client’s local clock to cross the overflow point, the log showed tens of thousands of ping packets sent within ten seconds, confirming the root cause.

Fix

The timestamp function was corrected to cast t.tv_sec to uint64_t before multiplication:

    uint64_t now_ms() {
        struct timeval t;
        gettimeofday(&t, NULL);
        /* Widen tv_sec first so the multiplication is done in 64 bits. */
        return (uint64_t)t.tv_sec * 1000 + t.tv_usec / 1000;
    }

After the fix, the overflow disappeared and traffic spikes ceased.
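As an alternative to the cast, and not the fix the team applied, a C++11 codebase could take interval timestamps from std::chrono::steady_clock, whose millisecond representation is a wide signed integer. The function names steady_now_ms and next_ping_deadline below are hypothetical.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

// Milliseconds from a monotonic clock. The epoch is unspecified, so the value
// is only meaningful for measuring intervals within one process.
std::int64_t steady_now_ms() {
    using namespace std::chrono;
    return duration_cast<milliseconds>(
               steady_clock::now().time_since_epoch())
        .count();
}

// The deadline is kept in 64 bits, so adding the 30-second ping interval
// does not wrap for any realistic uptime.
std::int64_t next_ping_deadline(std::int64_t now_ms) {
    return now_ms + 30000;
}
```

A monotonic clock also ignores changes to the wall clock, which matters for timers. It would, however, have made the clock-adjustment experiment from the Verification step impossible, so this is a trade-off rather than a strict improvement.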

Prediction and Prevention

Using the 50‑day pattern, the team predicted future incident times, which later proved accurate, enabling proactive mitigation.
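A prediction of this kind can be sketched in a few lines. The sketch assumes that wraps fall on exact multiples of 2^32 ms since the Unix epoch, which follows from the wrapped product but is an assumption about the real system. The names next_wrap_ms and upcoming_wraps are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

constexpr std::uint64_t kWrapPeriodMs = std::uint64_t{1} << 32;

// First instant strictly after now_ms at which a 32-bit millisecond value
// wraps, expressed as true (64-bit) milliseconds since the Unix epoch.
std::uint64_t next_wrap_ms(std::uint64_t now_ms) {
    return (now_ms / kWrapPeriodMs + 1) * kWrapPeriodMs;
}

// The next `count` wrap instants after now_ms, in ascending order.
std::vector<std::uint64_t> upcoming_wraps(std::uint64_t now_ms, int count) {
    std::vector<std::uint64_t> out;
    std::uint64_t t = next_wrap_ms(now_ms);
    for (int i = 0; i < count; ++i, t += kWrapPeriodMs) out.push_back(t);
    return out;
}
```

Passing the current wall-clock time in milliseconds to upcoming_wraps gives candidate incident times that can be converted to calendar dates and compared against the monitoring history.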

Lessons Learned

Large data structures (e.g., oversized queues) can cause apparent memory leaks; size them appropriately.

Beware of integer overflows on 32‑bit platforms, especially when handling timestamps.

Systematic monitoring and pattern analysis can reveal hidden periodic issues.

Accurate root‑cause analysis and targeted code fixes prevent recurring incidents.

These experiences highlight the importance of deep backend knowledge, careful code review, and proactive operations to maintain reliable services.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.

Written by

Programmer DD

A tinkering programmer and author of "Spring Cloud Microservices in Action"
