Migrate 700TB Over 2Mbps: Scripts, Sneakernet & Practical Steps
When a manager demands a script to move a 700‑terabyte database under a 2 Mbps bandwidth cap, the realistic solution combines physical Sneakernet transfer with a carefully staged export‑transform‑load script that handles field mapping, compression, rate‑limited transport, and fault‑tolerant import.
When the boss says, “Just write a script to migrate a 700 TB database,” the first reaction is disbelief, especially when the job comes with a hard 2 Mbps cap so that the transfer cannot affect the backbone network.
The two data centers use different domestic databases (A and B) whose field types are not fully compatible, similar to trying to import an Excel file into Word without reformatting.
Why a physical transfer?
At 2 Mbps the transfer rate is only 0.25 MB/s, roughly 21.6 GB per day, so moving 700 TB would take close to 90 years, which is clearly impossible. Hence the community suggests “Sneakernet”: physically shipping hard drives, which is often faster than network transfer for massive data.
Physical migration is the mainstream approach for such volumes, though it requires handling permissions, security, live databases, and hot‑cold data segregation.
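The transfer-time figure above is easy to verify. The following is a minimal sketch of the arithmetic, assuming decimal units (1 TB = 10^12 bytes) and a link that is fully utilised around the clock; the class and method names are illustrative only.

```java
/** Back-of-the-envelope transfer-time estimate using decimal units (1 TB = 10^12 bytes). */
final class TransferEstimate {

    /** Converts a link speed in megabits per second to bytes per second. */
    static double bytesPerSecond(double megabitsPerSecond) {
        return megabitsPerSecond * 1_000_000 / 8;
    }

    /** Days needed to move totalBytes at the given link speed, assuming 100% utilisation. */
    static double daysToTransfer(double totalBytes, double megabitsPerSecond) {
        double seconds = totalBytes / bytesPerSecond(megabitsPerSecond);
        return seconds / 86_400;
    }

    public static void main(String[] args) {
        double totalBytes = 700e12; // 700 TB
        double days = daysToTransfer(totalBytes, 2.0);
        System.out.printf("%.0f days (about %.1f years)%n", days, days / 365.0);
    }
}
```

With these assumptions the result is a little over 32,000 days, or just under 89 years. Using binary units (TiB) pushes the figure higher, so the conclusion does not depend on the unit convention.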
If you must write a script, follow this workflow
Data source: A domestic database
Target: B domestic database (field types not fully compatible)
Bandwidth limit: 2 Mbps
Total data: 700 TB
The workflow consists of four main steps:
Data export (ETL)
Field mapping and conversion
Compression and rate‑limited transfer
Target database import
Step 1 – Export data
Export data from database A in batches rather than all at once. A Java JDBC script can paginate through tables:
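Connection conn = DriverManager.getConnection(sourceUrl, user, password);
Statement stmt = conn.createStatement();
// First page only: advance the offset (or, better, filter on the last-seen key) for each batch.
// "LIMIT offset, count" is MySQL-style syntax; database A may need a different paging clause.
ResultSet rs = stmt.executeQuery("SELECT * FROM huge_table LIMIT 0, 10000");
// Write results to local files (JSON, CSV, Parquet, etc.)
Implement a checkpoint mechanism to resume after interruptions, storing progress per table.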
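One way to store progress per table is a small properties file that is rewritten after every batch. The sketch below is only an illustration: `ExportCheckpoint` is a hypothetical helper, not part of any library, and it assumes progress can be expressed as a single numeric offset per table. In practice a last-seen primary key is often a more robust resume point than a row offset.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Properties;

/** Hypothetical helper: persists the last exported row offset for each table. */
final class ExportCheckpoint {
    private final Path file;
    private final Properties progress = new Properties();

    /** Loads existing progress from the file if it is already there. */
    ExportCheckpoint(Path file) throws IOException {
        this.file = file;
        if (Files.exists(file)) {
            try (InputStream in = Files.newInputStream(file)) {
                progress.load(in);
            }
        }
    }

    /** Offset to resume from; 0 if the table has not been started. */
    long lastOffset(String table) {
        return Long.parseLong(progress.getProperty(table, "0"));
    }

    /** Records progress and writes it to disk via a temp file plus rename. */
    void save(String table, long offset) throws IOException {
        progress.setProperty(table, Long.toString(offset));
        Path tmp = file.resolveSibling(file.getFileName() + ".tmp");
        try (OutputStream out = Files.newOutputStream(tmp)) {
            progress.store(out, "export progress");
        }
        Files.move(tmp, file, StandardCopyOption.REPLACE_EXISTING);
    }
}
```

The export loop would call `lastOffset("huge_table")` before the first query and `save(...)` only after a batch has been written out completely, so a crash re-exports at most one batch.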
Step 2 – Field conversion
Map incompatible field types, e.g., VARCHAR(255) → STRING, CLOB/BLOB → base64‑encoded files. A simple Java converter can be used:
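public String convertType(String sourceType) {
    // Strip any length or precision suffix, e.g. "VARCHAR(255)" -> "VARCHAR"
    String base = sourceType.toUpperCase().replaceAll("\\(.*\\)", "").trim();
    switch (base) {
        case "VARCHAR": return "STRING";
        case "CLOB": return "TEXT";
        case "NUMBER": return "DECIMAL";
        default: return "STRING"; // fallback; review unmapped types manually
    }
}
Tools like Apache NiFi or DataX can also handle the mapping if configured manually.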
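For the binary columns mentioned above, base64 encoding can be done with the JDK alone. This is a minimal sketch using `java.util.Base64`; `BlobCodec` is an illustrative name, and whether base64 is an acceptable representation for database B is an assumption to check against its documentation.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Base64;

/** Illustrative helper: turns binary column values into text and back. */
final class BlobCodec {

    /** Encodes raw bytes as a base64 string that is safe to write into CSV or JSON. */
    static String encode(byte[] blob) {
        return Base64.getEncoder().encodeToString(blob);
    }

    /** Restores the original bytes on the import side. */
    static byte[] decode(String text) {
        return Base64.getDecoder().decode(text);
    }

    public static void main(String[] args) {
        byte[] original = "hello".getBytes(StandardCharsets.UTF_8);
        String encoded = encode(original);
        System.out.println(encoded);
        System.out.println(Arrays.equals(original, decode(encoded)));
    }
}
```

Base64 emits four characters for every three input bytes, so encoded values are about a third larger before compression. On a 2 Mbps link that overhead is worth measuring on real data.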
Step 3 – Compress & rate‑limited transfer
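Compress the exported files before transfer. Zstd offers a high compression ratio, while LZ4 trades ratio for speed; at 2 Mbps the link is the bottleneck, so ratio matters more than CPU time. Then throttle the transfer to stay under 2 Mbps. In Java, Guava’s RateLimiter can control the flow: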
// Permits represent bytes: 250,000 bytes per second = 2 Mbps
RateLimiter limiter = RateLimiter.create(250_000);
byte[] buffer = new byte[8192];
int bytesRead;
while ((bytesRead = input.read(buffer)) != -1) {
    limiter.acquire(bytesRead); // blocks as needed to hold the average rate
    output.write(buffer, 0, bytesRead);
}
Alternatively, compress first, copy to a portable drive, and upload later during off‑peak hours.
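If adding Guava is not an option, similar pacing can be approximated with the standard library. The sketch below is a stand-in for a rate limiter, not Guava's algorithm: after each write it compares the bytes sent with the elapsed time and sleeps until the average rate falls back under the cap. `ThrottledCopy` and its methods are illustrative names.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

/** Standard-library-only throttled copy; a simple stand-in for a rate limiter. */
final class ThrottledCopy {

    /** Nanoseconds still to wait so that the average rate stays at or below bytesPerSecond. */
    static long nanosToWait(long bytesSent, long elapsedNanos, long bytesPerSecond) {
        // Floating point avoids long overflow once many gigabytes have been sent.
        long earliestNanos = (long) (bytesSent / (double) bytesPerSecond * 1e9);
        return Math.max(0L, earliestNanos - elapsedNanos);
    }

    /** Copies the stream, sleeping after each chunk as needed; returns bytes copied. */
    static long copy(InputStream in, OutputStream out, long bytesPerSecond)
            throws IOException, InterruptedException {
        byte[] buffer = new byte[8192];
        long start = System.nanoTime();
        long sent = 0;
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
            sent += n;
            long wait = nanosToWait(sent, System.nanoTime() - start, bytesPerSecond);
            if (wait > 0) {
                Thread.sleep(wait / 1_000_000, (int) (wait % 1_000_000));
            }
        }
        return sent;
    }
}
```

For the 2 Mbps cap, `bytesPerSecond` would be 250,000. This paces what the application writes; protocol overhead and retransmissions sit on top of that, so leaving some headroom below the cap is a reasonable precaution.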
Step 4 – Import into target database
Use the target’s bulk‑load capability (e.g., LOAD DATA) or a batch INSERT script:
PreparedStatement ps = conn.prepareStatement("INSERT INTO target_table (a, b) VALUES (?, ?)");
ps.setString(1, "xxx");
ps.setInt(2, 123);
ps.addBatch();
// With auto-commit off, call ps.executeBatch() and conn.commit() every few thousand rows to avoid OOM
Control commit size and import speed to prevent the target from being overwhelmed.
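The commit-interval logic can be kept separate from the JDBC calls. The sketch below shows only that logic: `BatchWriter` is a hypothetical helper that buffers rows and hands them to a flush action in fixed-size groups. In a real import the flush action would add each row to the PreparedStatement, call executeBatch(), and commit.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical helper: buffers rows and flushes them in fixed-size batches. */
final class BatchWriter<T> {
    private final int batchSize;
    private final Consumer<List<T>> flushAction;
    private final List<T> buffer = new ArrayList<>();

    BatchWriter(int batchSize, Consumer<List<T>> flushAction) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        this.batchSize = batchSize;
        this.flushAction = flushAction;
    }

    /** Adds one row and flushes automatically once the batch is full. */
    void add(T row) {
        buffer.add(row);
        if (buffer.size() >= batchSize) {
            flush();
        }
    }

    /** Hands a copy of the buffered rows to the flush action and clears the buffer. */
    void flush() {
        if (buffer.isEmpty()) {
            return;
        }
        flushAction.accept(new ArrayList<>(buffer));
        buffer.clear();
    }
}
```

Calling `flush()` once more after the last row covers the final partial batch. Because memory use is bounded by `batchSize`, that value is the knob to tune against the target's load.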
Commercial sync services?
Some suggest using Alibaba Cloud DTS, but for 700 TB the cost and performance are prohibitive; it’s only viable for small incremental syncs.
Key script requirements
Export and import capabilities
Compression and bandwidth throttling
Fault tolerance with checkpoint/resume
Field‑type mapping for target compatibility
Batch execution rather than a single massive run
In practice, a hybrid approach works best: let the script handle logical export, conversion, and chunked transfer, while the bulk of the data moves via physical media and professional tools.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact us and we will review it promptly.
ITPUB
Official ITPUB account sharing technical insights, community news, and exciting events.
