From Two‑Month Crawl to Four‑Hour Sprint: Optimizing a 20M‑Record Data Migration

This article chronicles a step‑by‑step performance overhaul of a 20‑million‑record migration project, detailing four architectural revisions—from a single‑threaded procedural script to a fully decoupled, multithreaded, interface‑driven solution—that reduced processing time from two months to just four hours while ensuring data consistency, recoverability, and scalability.

Programmer DD

Jun 16, 2019

I did not really understand what performance optimization involves until I took on a small but complete data migration project: moving 20 million user records from database A to database B, generating a GUID for each user, and maintaining an association table.

Project requirements:

Insert users into database B via an SDK registration API (no direct JDBC inserts).

Data must be recoverable: successful records are skipped on re‑run, failed records are persisted for later retry.

Consistency: each user in B must correspond one‑to‑one with an entry in the association table of A; errors must not break overall consistency.

Speed: process the entire 20 million rows within a day while preserving correctness.
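
The recoverability requirement can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the class name RecoverableRun, the in-memory sets (a real run would persist them to a table or file), and the migrateOne callback that stands in for the SDK call plus the association insert.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Consumer;

/** Hypothetical sketch: skip records that already succeeded, collect failures for retry. */
public class RecoverableRun {
    // In a real run these would be persisted; they are kept in memory here for brevity.
    private final Set<Long> succeeded = new HashSet<>();
    private final List<Long> failed = new ArrayList<>();

    /** Migrates each id at most once; 'migrateOne' stands in for the real per-user work. */
    public void run(List<Long> userIds, Consumer<Long> migrateOne) {
        failed.clear(); // each run starts with a fresh failure list
        for (Long id : userIds) {
            if (succeeded.contains(id)) {
                continue; // already migrated in an earlier run
            }
            try {
                migrateOne.accept(id);
                succeeded.add(id);
            } catch (RuntimeException e) {
                failed.add(id); // keep for a later retry instead of aborting the run
            }
        }
    }

    public Set<Long> succeeded() { return succeeded; }
    public List<Long> failed() { return failed; }

    public static void main(String[] args) {
        RecoverableRun run = new RecoverableRun();
        List<Long> ids = List.of(1L, 2L, 3L);
        // First pass: id 2 fails.
        run.run(ids, id -> {
            if (id == 2L) throw new IllegalStateException("simulated failure");
        });
        System.out.println("failed after first pass: " + run.failed());  // [2]
        // Second pass: ids 1 and 3 are skipped, only id 2 is attempted again.
        run.run(ids, id -> { });
        System.out.println("failed after second pass: " + run.failed()); // []
    }
}
```

Re-running with the same input is safe because the success set is consulted before any work is attempted.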

Version 1 – Procedural (2 months)

Characteristics: single‑threaded, tightly coupled, processing each record sequentially, no error handling, no counters. The entire flow—read from A, generate GUID, call SDK to insert into B, insert association into A—was implemented in one massive main method. This design made the slowest link (often the SDK HTTP call or a blocked insert into the association table) dominate the overall throughput, leading to an estimated two‑month runtime.
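
To put the two-month estimate next to the one-day target, here is a back-of-the-envelope calculation. It assumes "two months" means 60 days; the class and method names are illustrative only.

```java
import java.time.Duration;

/** Back-of-the-envelope throughput figures; "two months" is assumed to mean 60 days. */
public class ThroughputEstimate {
    static final long TOTAL_RECORDS = 20_000_000L;

    /** Average records per second needed to finish 'records' within 'window'. */
    public static double recordsPerSecond(long records, Duration window) {
        return (double) records / window.getSeconds();
    }

    public static void main(String[] args) {
        double versionOne = recordsPerSecond(TOTAL_RECORDS, Duration.ofDays(60));
        double target = recordsPerSecond(TOTAL_RECORDS, Duration.ofDays(1));
        System.out.printf("Version 1: %.1f records/s (%.0f ms per record)%n",
                versionOne, 1000 / versionOne);
        System.out.printf("One-day target: %.1f records/s%n", target);
    }
}
```

Under that assumption Version 1 averages roughly 3.9 records per second, or about 259 ms per record, while the one-day target needs about 231 records per second, a gap of roughly 60x.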

Version 2 – Object‑Oriented (21 days)

Characteristics: object‑oriented, still single‑threaded, slightly decoupled, batch inserts, data recoverable.

The architecture introduced a BatchStrategy configuration holder and three stage objects: Reader, Processor, and Writer. Each stage object handled one step (reading, processing, or writing), so each could be changed in isolation. Batching reduced the number of SDK HTTP calls, and the association table was written with JDBC batch inserts, which cut the runtime to 21 days. The stages still ran one after another on a single thread, so the slowest stage continued to limit the whole pipeline.
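
A minimal sketch of that shape follows. Only the four class names come from the article. The method names, the single batchSize field, and the in-memory list standing in for database A are assumptions, and the writer only collects rows where a real one would call the SDK and run a JDBC batch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.UUID;

/** Hypothetical sketch of the Version 2 shape: one thread, three stages, batch-sized steps. */
public class BatchPipeline {
    /** Configuration holder; the single 'batchSize' field is an assumption. */
    static class BatchStrategy {
        final int batchSize;
        BatchStrategy(int batchSize) { this.batchSize = batchSize; }
    }

    /** Reads source user ids in batches (an in-memory list stands in for database A). */
    static class Reader {
        private final List<Long> source;
        private int position = 0;
        Reader(List<Long> source) { this.source = source; }

        List<Long> readBatch(int batchSize) {
            int end = Math.min(position + batchSize, source.size());
            List<Long> batch = new ArrayList<>(source.subList(position, end));
            position = end;
            return batch;
        }
    }

    /** Pairs each user id with a freshly generated GUID. */
    static class Processor {
        List<String[]> process(List<Long> userIds) {
            List<String[]> rows = new ArrayList<>();
            for (Long id : userIds) {
                rows.add(new String[] { String.valueOf(id), UUID.randomUUID().toString() });
            }
            return rows;
        }
    }

    /** Collects rows; a real writer would call the SDK and batch-insert the associations. */
    static class Writer {
        final List<String[]> written = new ArrayList<>();
        int batches = 0;

        void writeBatch(List<String[]> rows) {
            written.addAll(rows);
            batches++;
        }
    }

    static Writer run(List<Long> source, BatchStrategy strategy) {
        Reader reader = new Reader(source);
        Processor processor = new Processor();
        Writer writer = new Writer();
        List<Long> batch;
        // The stages run one after another, so the slowest one sets the pace.
        while (!(batch = reader.readBatch(strategy.batchSize)).isEmpty()) {
            writer.writeBatch(processor.process(batch));
        }
        return writer;
    }

    public static void main(String[] args) {
        Writer writer = run(List.of(1L, 2L, 3L, 4L, 5L), new BatchStrategy(2));
        // prints "5 rows in 3 batches"
        System.out.println(writer.written.size() + " rows in " + writer.batches + " batches");
    }
}
```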

Version 3 – Fully Decoupled (Queue + Multithreading, 3 days)

Characteristics: multithreaded, fully decoupled, batch insertion, data recoverable.

A thread-safe ConcurrentLinkedQueue was introduced between stages. Reader enqueues raw rows; Processor dequeues them, transforms them, and enqueues the results for Writer. Because no stage has to wait for the next one to finish a record before continuing, the stages run in parallel, which reduced total time to three days. Data recoverability was achieved by persisting successful users and failed association rows for later retry.
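
The hand-off between stages might look like the sketch below. It keeps the article's ConcurrentLinkedQueue, but the sentinel values, the polling helper take, and the one-thread-per-stage layout are assumptions, since the source does not say how completion is signalled or how many threads each stage uses.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.UUID;
import java.util.concurrent.ConcurrentLinkedQueue;

/** Hypothetical sketch of three stages linked by two ConcurrentLinkedQueue instances. */
public class QueuePipeline {
    // Sentinels that tell the downstream stage there is no more input (an assumption).
    private static final Long END_OF_INPUT = Long.MIN_VALUE;
    private static final String END_OF_RESULTS = "__END__";

    /** ConcurrentLinkedQueue.poll() never blocks, so wait briefly while the queue is empty. */
    private static <T> T take(ConcurrentLinkedQueue<T> queue) throws InterruptedException {
        T item;
        while ((item = queue.poll()) == null) {
            Thread.sleep(1);
        }
        return item;
    }

    public static List<String> migrate(List<Long> userIds) {
        ConcurrentLinkedQueue<Long> rawRows = new ConcurrentLinkedQueue<>();
        ConcurrentLinkedQueue<String> results = new ConcurrentLinkedQueue<>();
        List<String> written = new ArrayList<>(); // touched only by the writer thread

        Thread reader = new Thread(() -> {
            userIds.forEach(rawRows::add);
            rawRows.add(END_OF_INPUT);
        }, "reader");

        Thread processor = new Thread(() -> {
            try {
                Long id;
                while (!(id = take(rawRows)).equals(END_OF_INPUT)) {
                    results.add(id + ":" + UUID.randomUUID()); // attach a GUID to each user
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                results.add(END_OF_RESULTS); // always release the writer
            }
        }, "processor");

        Thread writer = new Thread(() -> {
            try {
                String row;
                while (!(row = take(results)).equals(END_OF_RESULTS)) {
                    written.add(row);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "writer");

        List<Thread> stages = List.of(reader, processor, writer);
        stages.forEach(Thread::start);
        try {
            for (Thread stage : stages) {
                stage.join(); // join also makes 'written' visible to the caller
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for stages", e);
        }
        return written;
    }

    public static void main(String[] args) {
        System.out.println(migrate(List.of(1L, 2L, 3L)).size() + " rows written"); // 3 rows written
    }
}
```

With more than one processor or writer thread, each consumer would need its own sentinel, and the output order would no longer match the input order.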

Version 4 – Highly Abstract (One‑Click, 4 hours)

Characteristics: interface‑driven, multithreaded, extensible, batch or single‑row insertion, optimized LIMIT queries.
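
The article does not say how the LIMIT queries were optimized. One common technique, shown here purely as an assumption, is keyset pagination: rather than asking the database to skip a growing offset on every page, each page starts after the last id already read. In the sketch a TreeMap stands in for an indexed table, and the table and column names in the SQL comment are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

/** Hypothetical sketch of keyset pagination; a TreeMap stands in for an indexed table. */
public class KeysetPaging {
    // Equivalent SQL (table and column names are assumptions):
    //   SELECT id, name FROM users WHERE id > ? ORDER BY id LIMIT ?
    public static List<Long> nextPage(TreeMap<Long, String> table, long lastSeenId, int pageSize) {
        List<Long> page = new ArrayList<>();
        // tailMap(key, false) is the in-memory analogue of "WHERE id > lastSeenId".
        for (Long id : table.tailMap(lastSeenId, false).keySet()) {
            if (page.size() == pageSize) {
                break;
            }
            page.add(id);
        }
        return page;
    }

    public static void main(String[] args) {
        TreeMap<Long, String> table = new TreeMap<>();
        for (long id = 1; id <= 7; id++) {
            table.put(id, "user" + id);
        }

        long lastSeenId = 0; // assumes ids are positive
        List<Long> page;
        while (!(page = nextPage(table, lastSeenId, 3)).isEmpty()) {
            System.out.println(page); // [1, 2, 3] then [4, 5, 6] then [7]
            lastSeenId = page.get(page.size() - 1);
        }
    }
}
```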

The final design abstracts each stage as a Job interface with lifecycle methods (init, start, stop, finish). An Interactive<T> interface defines openInteract, receive, and closeInteract. The main method wires the jobs together and launches each in its own thread:

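public static void main(String[] args) {
    // Each stage is a singleton exposed through the common Job interface.
    Job reader = Reader.getInstance();
    Job processor = Processor.getInstance();
    Job writer = Writer.getInstance();

    // Initialize every stage before any of them starts running.
    reader.init();
    processor.init();
    writer.init();

    start(reader, processor, writer);
}

// Launches each job on its own thread so the stages run concurrently.
private static void start(Job... jobs) {
    for (Job job : jobs) {
        new Thread(job::start).start();
    }
}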
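
The article names the methods of Job and Interactive<T> but does not show their signatures. The sketch below is one plausible shape: the parameter and return types, the comments describing each method, and the CollectingWriter class are all assumptions.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical shapes for the two interfaces; only the method names come from the article. */
public class JobSketch {
    interface Job {
        void init();   // acquire resources such as connections and queues
        void start();  // run the stage's main loop
        void stop();   // request an early shutdown
        void finish(); // release resources once the loop has ended
    }

    interface Interactive<T> {
        void openInteract();  // begin accepting items from the upstream stage
        void receive(T item); // hand one item to this stage
        void closeInteract(); // upstream has no more items
    }

    /** A minimal stage that records what it receives, to show how the interfaces combine. */
    static class CollectingWriter implements Job, Interactive<String> {
        final List<String> received = new ArrayList<>();
        private boolean open;

        @Override public void init() { received.clear(); }
        @Override public void start() { /* a real writer would drain its input here */ }
        @Override public void stop() { open = false; }
        @Override public void finish() {
            System.out.println("wrote " + received.size() + " rows");
        }

        @Override public void openInteract() { open = true; }
        @Override public void receive(String item) {
            if (!open) {
                throw new IllegalStateException("interaction is closed");
            }
            received.add(item);
        }
        @Override public void closeInteract() { open = false; }
    }

    public static void main(String[] args) {
        CollectingWriter writer = new CollectingWriter();
        writer.init();
        writer.openInteract();
        writer.receive("user-1");
        writer.receive("user-2");
        writer.closeInteract();
        writer.finish(); // prints "wrote 2 rows"
    }
}
```

Keeping the lifecycle (Job) separate from the data hand-off (Interactive<T>) means a stage can be swapped for another implementation without touching the code that wires the stages together.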

Two optimizations remain open: multithreading the reader, and making logging asynchronous, which would substantially cut the overhead of logging 20 million operations.

Further Optimization Thoughts

Parallelizing the reader is non‑trivial because it reads directly from the database, but it remains a viable improvement.

Logging accounts for a large portion of runtime; making it asynchronous would further boost performance.
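
As a rough illustration of the asynchronous logging idea, worker threads can enqueue log lines and return immediately while a single background thread writes them out. The class below is a hypothetical sketch, with an in-memory list standing in for the log file. In practice a logging framework's own asynchronous appender (for example in Logback or Log4j 2) would normally be preferred over a hand-rolled one.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

/** Hypothetical sketch: workers enqueue log lines, one background thread writes them. */
public class AsyncLog implements AutoCloseable {
    // A distinct String instance, compared by identity, marks the end of the stream.
    private static final String SHUTDOWN = new String("shutdown");

    private final BlockingQueue<String> pending = new LinkedBlockingQueue<>();
    private final List<String> sink = new ArrayList<>(); // stands in for a file or appender
    private final Thread drainer;

    public AsyncLog() {
        drainer = new Thread(() -> {
            try {
                String line;
                // Identity comparison, so a real message "shutdown" is still logged.
                while ((line = pending.take()) != SHUTDOWN) {
                    sink.add(line);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "async-log");
        drainer.start();
    }

    /** Returns immediately; the caller never waits on the slow sink. */
    public void log(String line) {
        pending.add(line);
    }

    /** Flushes everything queued so far, then stops the background thread. */
    @Override
    public void close() {
        pending.add(SHUTDOWN);
        try {
            drainer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public List<String> written() {
        return sink;
    }

    public static void main(String[] args) {
        AsyncLog log = new AsyncLog();
        for (int i = 0; i < 3; i++) {
            log.log("migrated user " + i);
        }
        log.close(); // waits until every queued line has been written
        System.out.println(log.written().size() + " lines written"); // 3 lines written
    }
}
```

The queue here is unbounded, so a burst of log lines costs memory rather than time. A bounded queue would instead apply backpressure to the workers.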

Overall, the migration evolved from a monolithic, inefficient script to a clean, modular, and highly parallel system that meets the original constraints of speed, consistency, and recoverability.

Written by

Programmer DD

A tinkering programmer and author of "Spring Cloud Microservices in Action"
