
Migrate 700TB Over 2Mbps: Scripts, Sneakernet & Practical Steps

When a manager demands a script to move a 700‑terabyte database under a 2 Mbps bandwidth cap, the realistic solution combines physical Sneakernet transfer with a carefully staged export‑transform‑load script that handles field mapping, compression, rate‑limited transport, and fault‑tolerant import.

ITPUB

When the boss says, “Just write a script to migrate a 700 TB database,” the first reaction is disbelief, especially when the transfer is capped at a hard 2 Mbps so that it does not affect the backbone network.

The two data centers use different domestic databases (A and B) whose field types are not fully compatible, similar to trying to import an Excel file into Word without reformatting.

Why a physical transfer?

At 2 Mbps the transfer rate is only 0.25 MB/s, roughly 21.6 GB per day, so moving 700 TB would take about 89 years, which is clearly impractical. Hence the community suggests “Sneakernet”: physically shipping hard drives, which is often faster than network transfer for massive data.
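The arithmetic behind this comparison can be checked with a few lines of Java. This is a minimal sketch that assumes decimal units (1 TB = 10^12 bytes) and ignores protocol overhead; the class name and the seven-day shipping time are illustrative assumptions, not figures from the original discussion.

```java
final class TransferEstimate {
    static final double SECONDS_PER_DAY = 86_400.0;

    /** Days needed to move totalBytes over a link of bitsPerSecond (no protocol overhead). */
    static double daysOverLink(double totalBytes, double bitsPerSecond) {
        double bytesPerSecond = bitsPerSecond / 8.0;
        return totalBytes / bytesPerSecond / SECONDS_PER_DAY;
    }

    /** Effective bit rate achieved by physically moving totalBytes in the given number of days. */
    static double effectiveBitsPerSecond(double totalBytes, double days) {
        return totalBytes * 8.0 / (days * SECONDS_PER_DAY);
    }

    public static void main(String[] args) {
        double totalBytes = 700e12; // 700 TB, decimal units
        double linkBps = 2e6;       // the 2 Mbps cap
        double days = daysOverLink(totalBytes, linkBps);
        System.out.printf("Network: %.0f days (%.1f years)%n", days, days / 365.0);

        double shippingDays = 7.0;  // hypothetical door-to-door time, including copy time
        System.out.printf("Sneakernet in %.0f days: %.2f Gbps effective%n",
                shippingDays, effectiveBitsPerSecond(totalBytes, shippingDays) / 1e9);
    }
}
```

Under the assumed seven days, the shipment works out to roughly 9 Gbps of effective throughput, several thousand times the 2 Mbps link.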

Physical migration is the mainstream approach for such volumes, though it requires handling permissions, security, live databases, and hot‑cold data segregation.

If you must write a script, follow this workflow

Data source: domestic database A

Target: domestic database B (field types not fully compatible)

Bandwidth limit: 2 Mbps

Total data: 700 TB

The workflow consists of four main steps:

Data export (ETL)

Field mapping and conversion

Compression and rate‑limited transfer

Target database import

Step 1 – Export data

Export data from database A in batches rather than all at once. A Java JDBC script can paginate through tables:

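// Requires java.sql.*; sourceUrl, user, and password are supplied by the caller.
try (Connection conn = DriverManager.getConnection(sourceUrl, user, password);
     Statement stmt = conn.createStatement();
     // Fetch one batch of 10,000 rows; advance the offset on each iteration.
     ResultSet rs = stmt.executeQuery("SELECT * FROM huge_table LIMIT 0, 10000")) {
    // Write results to local files (JSON, CSV, Parquet, etc.)
}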
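With offset-based paging, many engines have to read and discard every skipped row, so deep pages tend to get slower as the export advances. One common alternative is keyset pagination, sketched below. It assumes the table has a monotonically increasing numeric key (the column name passed in, such as `id`, is hypothetical) and that database A accepts `LIMIT ?` as a bind parameter, which should be verified against its SQL dialect.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

final class KeysetExporter {
    /** Builds the page query; table and key column names come from the caller. */
    static String pageSql(String table, String keyColumn) {
        return "SELECT * FROM " + table
             + " WHERE " + keyColumn + " > ?"
             + " ORDER BY " + keyColumn
             + " LIMIT ?";
    }

    /**
     * Reads one page of rows whose key is greater than lastKey.
     * Returns the largest key seen, or lastKey unchanged if the page was empty.
     */
    static long exportPage(Connection conn, String table, String keyColumn,
                           long lastKey, int pageSize) throws SQLException {
        try (PreparedStatement ps = conn.prepareStatement(pageSql(table, keyColumn))) {
            ps.setLong(1, lastKey);
            ps.setInt(2, pageSize);
            long maxKey = lastKey;
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    maxKey = rs.getLong(keyColumn);
                    // Write the row to a local file here (CSV, JSON, Parquet, ...).
                }
            }
            return maxKey;
        }
    }
}
```

The caller loops on `exportPage` until the returned key stops advancing. Because each page starts from the last key rather than a row count, the same value can double as the resume point after an interruption.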

Implement a checkpoint mechanism to resume after interruptions, storing progress per table.
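One simple way to keep per-table progress is a small properties file that is rewritten after each batch. The sketch below is an assumption about how such a checkpoint could look, not a prescribed design: the class name, the file layout, and the use of a numeric "last exported key" are all illustrative.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Properties;

final class CheckpointStore {
    private final Path file;
    private final Properties progress = new Properties();

    /** Loads existing progress if the checkpoint file is present. */
    CheckpointStore(Path file) throws IOException {
        this.file = file;
        if (Files.exists(file)) {
            try (InputStream in = Files.newInputStream(file)) {
                progress.load(in);
            }
        }
    }

    /** Last exported key for the table, or 0 if the table has not been started. */
    long lastKey(String table) {
        return Long.parseLong(progress.getProperty(table, "0"));
    }

    /**
     * Records progress for one table and persists all progress.
     * Writing to a temp file and renaming it reduces the chance of
     * leaving a half-written checkpoint behind after a crash.
     */
    void save(String table, long lastKey) throws IOException {
        progress.setProperty(table, Long.toString(lastKey));
        Path tmp = file.resolveSibling(file.getFileName() + ".tmp");
        try (OutputStream out = Files.newOutputStream(tmp)) {
            progress.store(out, "export progress per table");
        }
        Files.move(tmp, file, StandardCopyOption.REPLACE_EXISTING);
    }
}
```

Saving only after a batch has been fully written to disk means a restart may repeat the last batch but should not skip one, so the export files or the import step need to tolerate duplicates.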

Step 2 – Field conversion

Map incompatible field types, e.g., VARCHAR(255) → STRING, CLOB → TEXT, BLOB → base64‑encoded files. A simple Java converter can be used:

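public String convertType(String sourceType) {
    // Drop any length/precision suffix, e.g. "VARCHAR(255)" -> "VARCHAR".
    String base = sourceType.replaceAll("\\(.*\\)", "").trim().toUpperCase();
    switch (base) {
        case "VARCHAR": return "STRING";
        case "CLOB": return "TEXT";
        case "NUMBER": return "DECIMAL";
        // Unmapped types fall back to STRING; review these before importing.
        default: return "STRING";
    }
}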

Tools like Apache NiFi or DataX can also handle the mapping if configured manually.
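If the mapping stays in hand-written code, a table-driven variant makes it easier to see which source types were never mapped. The following is a sketch under assumptions: the mapping entries repeat the examples above, carrying precision over to DECIMAL presumes database B accepts `DECIMAL(p,s)`, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Base64;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

final class TypeMapper {
    // Illustrative mapping from database A base types to database B types.
    private static final Map<String, String> MAPPING = new LinkedHashMap<>();
    static {
        MAPPING.put("VARCHAR", "STRING");
        MAPPING.put("CLOB", "TEXT");
        MAPPING.put("NUMBER", "DECIMAL");
    }

    private final List<String> unmapped = new ArrayList<>();

    /** Maps a declared type such as "NUMBER(10,2)"; keeps the arguments only for DECIMAL. */
    String map(String sourceType) {
        String declared = sourceType.trim().toUpperCase(Locale.ROOT);
        int paren = declared.indexOf('(');
        String base = paren < 0 ? declared : declared.substring(0, paren).trim();
        String args = paren < 0 ? "" : declared.substring(paren);
        String target = MAPPING.get(base);
        if (target == null) {
            unmapped.add(sourceType); // remember it so a person can review the fallback
            return "STRING";
        }
        return target.equals("DECIMAL") ? target + args : target;
    }

    /** Source types that fell back to STRING because no mapping exists. */
    List<String> unmappedTypes() {
        return unmapped;
    }

    /** Encodes binary column content as base64 text for the export file. */
    static String encodeBlob(byte[] bytes) {
        return Base64.getEncoder().encodeToString(bytes);
    }
}
```

Running the mapper over the full schema before exporting any data gives a list of unmapped types to resolve up front, which is cheaper than discovering them during import.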

Step 3 – Compress & rate‑limited transfer

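Compress data before sending it: Zstd gives a good compression ratio, while LZ4 is faster but compresses less, so on a 2 Mbps link ratio matters more than speed. Then throttle the network to stay under 2 Mbps. In Java, Guava’s RateLimiter can control the flow: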

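// Guava: com.google.common.util.concurrent.RateLimiter
// Permits are bytes: 2 Mbps = 250,000 bytes per second.
RateLimiter limiter = RateLimiter.create(250_000);
byte[] buffer = new byte[8192];
int bytesRead;
while ((bytesRead = input.read(buffer)) != -1) {
    limiter.acquire(bytesRead); // blocks until bytesRead permits are available
    output.write(buffer, 0, bytesRead);
}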
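Where adding Guava is not an option, the same idea can be sketched with the standard library alone. The wrapper below paces writes so that the average rate since the stream was opened stays at or below a limit, and it is placed under the compressor so that the compressed bytes are what get throttled. Two substitutions are assumptions of this sketch: gzip from java.util.zip stands in for Zstd or LZ4, which need third-party libraries, and the class name is hypothetical.

```java
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.InterruptedIOException;
import java.io.OutputStream;
import java.util.zip.GZIPOutputStream;

final class ThrottledOutputStream extends FilterOutputStream {
    private final double bytesPerSecond;
    private final long startNanos = System.nanoTime();
    private long totalBytes = 0;

    ThrottledOutputStream(OutputStream out, double bytesPerSecond) {
        super(out);
        this.bytesPerSecond = bytesPerSecond;
    }

    @Override
    public void write(int b) throws IOException {
        out.write(b);
        pace(1);
    }

    @Override
    public void write(byte[] buf, int off, int len) throws IOException {
        out.write(buf, off, len);
        pace(len);
    }

    /** Sleeps until the bytes written so far fit within the average rate. */
    private void pace(int written) throws IOException {
        totalBytes += written;
        long dueNanos = (long) (totalBytes / bytesPerSecond * 1e9);
        long aheadNanos = dueNanos - (System.nanoTime() - startNanos);
        if (aheadNanos > 0) {
            try {
                Thread.sleep(aheadNanos / 1_000_000L, (int) (aheadNanos % 1_000_000L));
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new InterruptedIOException("throttled write interrupted");
            }
        }
    }

    /** Compresses with gzip on top of the throttle, so the limit applies to bytes on the wire. */
    static OutputStream compressedAndThrottled(OutputStream wire, double bytesPerSecond)
            throws IOException {
        return new GZIPOutputStream(new ThrottledOutputStream(wire, bytesPerSecond));
    }
}
```

This limits the long-run average rather than short bursts: a single large write goes out at full speed and is followed by a pause, and an idle stream builds up credit. Writing in small chunks, and setting the limit somewhat below 250,000 bytes per second to leave room for protocol overhead, keeps the link closer to the cap.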

Alternatively, compress first, copy to a portable drive, and upload later during off‑peak hours.

Step 4 – Import into target database

Use the target’s bulk‑load capability (e.g., LOAD DATA) or a batch INSERT script:

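conn.setAutoCommit(false); // commit explicitly, once per batch
PreparedStatement ps = conn.prepareStatement("INSERT INTO target_table (a, b) VALUES (?, ?)");
ps.setString(1, "xxx");
ps.setInt(2, 123);
ps.addBatch();
// Repeat set/addBatch per row; flush at a fixed interval so the batch does not grow unbounded and cause OOM
ps.executeBatch();
conn.commit();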

Control commit size and import speed to prevent the target from being overwhelmed.
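The commit-interval and pacing logic can be kept separate from the JDBC calls. The sketch below shows only that logic: rows are collected until a batch is full, handed to a sink, and followed by an optional pause. The class, the `Sink` interface, and the pause parameter are hypothetical names for illustration; in a real import the sink is where `addBatch`, `executeBatch`, and `commit` would run.

```java
import java.util.ArrayList;
import java.util.List;

final class BatchImporter<T> {
    /** Receives one batch of rows; a JDBC sink would insert and commit them here. */
    interface Sink<T> {
        void flush(List<T> batch) throws Exception;
    }

    private final int batchSize;
    private final long pauseMillis;
    private final Sink<T> sink;
    private final List<T> pending = new ArrayList<>();

    BatchImporter(int batchSize, long pauseMillis, Sink<T> sink) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        this.batchSize = batchSize;
        this.pauseMillis = pauseMillis;
        this.sink = sink;
    }

    /** Buffers one row and flushes when the batch is full. */
    void add(T row) throws Exception {
        pending.add(row);
        if (pending.size() >= batchSize) {
            flush();
        }
    }

    /** Sends any remaining rows; call once after the last add(). */
    void finish() throws Exception {
        if (!pending.isEmpty()) {
            flush();
        }
    }

    private void flush() throws Exception {
        sink.flush(new ArrayList<>(pending)); // hand over a copy, then reuse the buffer
        pending.clear();
        if (pauseMillis > 0) {
            Thread.sleep(pauseMillis); // crude pacing so the target is not saturated
        }
    }
}
```

Memory use is bounded by the batch size, and the pause gives a simple knob for import speed; suitable values depend on the target database and would have to be found by testing.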

Commercial sync services?

Some suggest using Alibaba Cloud DTS, but for 700 TB the cost and performance are prohibitive; it’s only viable for small incremental syncs.

Key script requirements

Export and import capabilities

Compression and bandwidth throttling

Fault tolerance with checkpoint/resume

Field‑type mapping for target compatibility

Batch execution rather than a single massive run

In practice, a hybrid approach works best: let the script handle logical export, conversion, and chunked transfer, while the bulk of the data moves via physical media and professional tools.
