
How a Midnight Migration Saved Millions: Lessons in Problem‑Solving for Developers

A senior engineer recounts a high-pressure overnight data migration from an overloaded legacy platform to a new microservice system, covering the technical challenges, the rapid troubleshooting, the multithreading workarounds, and the broader lessons about what truly makes a programmer great.

ITPUB

1. The Core Skill: Solving Problems

When a reader asked what makes a programmer truly strong, the author answered simply: the ability to solve problems. He illustrates this with a story about two developers debating how to check network connectivity between servers, ultimately writing a Java‑based ping tool.
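The article does not show the tool itself. As a minimal sketch of what a Java connectivity check between servers could look like, the example below attempts a TCP connection with a timeout. The class name `PortCheck`, the method `canConnect`, and the default host, port, and timeout are all illustrative assumptions, not details from the original story.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper; the original tool is not shown in the article.
class PortCheck {

    /** Returns true if a TCP connection to host:port succeeds within timeoutMs. */
    static boolean canConnect(String host, int port, int timeoutMs) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;
        } catch (IOException e) {
            // Refused, timed out, or unresolvable host: all count as "not reachable".
            return false;
        }
    }

    public static void main(String[] args) {
        // Defaults are placeholders; pass a real host and port on the command line.
        String host = args.length > 0 ? args[0] : "localhost";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 80;
        System.out.println(host + ":" + port + " reachable: " + canConnect(host, port, 2000));
    }
}
```

A TCP connect is used here rather than an ICMP ping because it also confirms that the target port is open, which is usually what matters between application servers.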

2. A Late‑Night Technical Story

Old platform vs. new platform – The company ran an aging Oracle‑based system that was designed for 1‑2 billion daily transactions but now handled 40 billion. After years of incremental optimizations, the architecture could no longer scale, prompting the development of a new micro‑service platform built on MySQL HA and hundreds of services.

The migration had to be seamless, which the author likens to changing a car’s wheels while driving at highway speed. A sudden policy change then forced the team to accelerate the migration schedule.

3. Midnight Migration Execution

The team prepared a migration tool that could move merchants from the old to the new platform in batches. On New Year’s Eve, they planned to migrate the remaining millions of merchants in a single, uninterrupted window.

After extensive pre-testing, the migration began at 1 am. The first batches of agents migrated successfully, but throughput then dropped sharply to only 100k merchants per half hour, a pace that threatened to turn the migration into a multi-day operation.

Realizing the urgency, the team analyzed the logs and found that the migration program processed agents one at a time. The work inside each agent ran on multiple threads, but the main loop over agents had no parallelism.
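To make the bottleneck concrete, here is a minimal sketch contrasting a sequential loop over agents with a parallel one built on a fixed thread pool. This is not the team's code, and the team did not rewrite the loop that night. The names `AgentLoop`, `migrateSequentially`, `migrateInParallel`, and the `migrateAgent` callback are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Consumer;

// Hypothetical sketch; migrateAgent stands in for the real per-agent work.
class AgentLoop {

    /** The shape of the problem: each agent waits for the previous one to finish. */
    static void migrateSequentially(List<String> agentIds, Consumer<String> migrateAgent) {
        for (String id : agentIds) {
            migrateAgent.accept(id);
        }
    }

    /** One possible alternative: run up to `workers` agents at the same time. */
    static void migrateInParallel(List<String> agentIds, Consumer<String> migrateAgent, int workers)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<Future<?>> pending = new ArrayList<>();
            for (String id : agentIds) {
                pending.add(pool.submit(() -> migrateAgent.accept(id)));
            }
            for (Future<?> f : pending) {
                f.get(); // wait for completion and surface any failure
            }
        } finally {
            pool.shutdown();
        }
    }
}
```

The parallel version is only safe if `migrateAgent` does not touch shared mutable state, which is exactly the issue the team ran into later.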

4. Rapid Remedy: Manual Multithreading

Rather than rewrite the migration loop, the engineers opened multiple browser windows, each invoking the migration servlet with a different agent ID. Because the servlet container handles each HTTP request on its own thread, agents were migrated concurrently without any change to the loop.

Testing with a few agents succeeded, but scaling to dozens produced occasional errors. The root cause was shared mutable state: the servlet relied on global variables that were not thread-safe.

The fix was to wrap the shared data in ThreadLocal, giving each thread its own isolated copy.
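The team did this by hand in browser windows. Purely as an illustration of the same idea, the sketch below fires one HTTP request per agent concurrently using the JDK's `java.net.http.HttpClient` (Java 11 or later). The endpoint URL, the `agentId` query parameter, and the class name `ConcurrentTrigger` are assumptions; the article does not describe the servlet's actual interface.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;

// Hypothetical sketch; the URL shape and parameter name are assumed.
class ConcurrentTrigger {

    /** Sends one GET per agent ID without waiting in between; returns the status codes in input order. */
    static List<Integer> triggerAll(String baseUrl, List<String> agentIds) {
        HttpClient client = HttpClient.newHttpClient();
        List<CompletableFuture<Integer>> pending = new ArrayList<>();
        for (String id : agentIds) {
            // Assumes agent IDs are URL-safe; encode them otherwise.
            HttpRequest request = HttpRequest.newBuilder(URI.create(baseUrl + "?agentId=" + id))
                    .GET()
                    .build();
            pending.add(client.sendAsync(request, HttpResponse.BodyHandlers.discarding())
                    .thenApply(HttpResponse::statusCode));
        }
        List<Integer> statuses = new ArrayList<>();
        for (CompletableFuture<Integer> f : pending) {
            statuses.add(f.join()); // blocks until that request has completed
        }
        return statuses;
    }
}
```

Each request is handled on its own server-side thread, which is the same effect the browser windows achieved.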
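As a minimal sketch of this kind of fix, assuming the shared data was something like a "current agent" field (the article does not say what it actually was), the example below moves that state into a `ThreadLocal`. The class `MigrationHandler` and its members are hypothetical.

```java
// Hypothetical sketch; the real servlet's shared data is not shown in the article.
class MigrationHandler {

    // Unsafe shape, for contrast: one field shared by every request thread.
    // static String currentAgentId;

    // Each thread sees only the value it set itself.
    private static final ThreadLocal<String> CURRENT_AGENT = new ThreadLocal<>();

    /** Simulates one request: store the agent ID, do some work, read it back. */
    static String handle(String agentId) {
        CURRENT_AGENT.set(agentId);
        try {
            Thread.yield(); // give other threads a chance to interleave
            return "migrated:" + CURRENT_AGENT.get();
        } finally {
            // Container threads are pooled and reused, so clear the value when done.
            CURRENT_AGENT.remove();
        }
    }
}
```

With the commented-out static field, two concurrent calls could overwrite each other's agent ID between the write and the read. With `ThreadLocal`, each call reads back exactly what its own thread stored.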

5. Scaling Across Servers

When more than six concurrent servlet threads strained a single Tomcat instance, the team deployed the migration UI on ten separate servers. Each server handled a subset of agents, and the engineers manually entered agent groups into the web pages, effectively distributing the load.
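The engineers split the agents across servers by hand. As a sketch of how such a split could be computed, the example below deals agent IDs out round-robin into one group per server. The class `AgentPartitioner` and the round-robin choice are assumptions; the article does not say how the groups were chosen.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; the actual grouping used by the team is not described.
class AgentPartitioner {

    /** Distributes agent IDs round-robin into serverCount groups of near-equal size. */
    static List<List<String>> partition(List<String> agentIds, int serverCount) {
        if (serverCount <= 0) {
            throw new IllegalArgumentException("serverCount must be positive");
        }
        List<List<String>> groups = new ArrayList<>();
        for (int i = 0; i < serverCount; i++) {
            groups.add(new ArrayList<>());
        }
        for (int i = 0; i < agentIds.size(); i++) {
            groups.get(i % serverCount).add(agentIds.get(i));
        }
        return groups;
    }
}
```

Round-robin balances the number of agents per server, not the number of merchants. If agents differ widely in size, grouping by merchant count would spread the load more evenly.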

Within two hours the migration of all agents completed, and by 6 am the new platform was fully operational. Subsequent monitoring showed only minor, non‑critical issues.

6. Reflections on What Makes a Great Programmer

The author emphasizes that technical knowledge alone is insufficient; the ability to stay calm under pressure, analyze logs, devise quick workarounds, and execute reliably distinguishes top engineers. Practicing problem‑solving in real incidents, documenting lessons, and continuously refining one’s approach are essential for growth.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.
