Databases 8 min read

How to Shrink a 130GB Logistics JSON Column in MySQL

This article examines a MySQL table storing large JSON logistics data, analyzes why its size grows to 130 GB, and presents practical optimizations such as data archiving, JSON field reduction, and schema redesign to keep the table manageable.

ITPUB

Apr 20, 2016

How to Shrink a 130GB Logistics JSON Column in MySQL

Background

The e‑commerce system records a logistics JSON string for each order in the goods_order_express table. Over time the table grew to 130 GB because every API call appends a new JSON payload to the data TEXT column, and no rows are ever deleted.

Original Table Structure

CREATE TABLE `goods_order_express` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `express_id` int(10) unsigned NOT NULL,
  `message` varchar(200) NOT NULL,
  `status` varchar(20) NOT NULL,
  `state` tinyint(3) unsigned NOT NULL,
  `data` text NOT NULL,
  `created_time` int(10) unsigned NOT NULL,
  PRIMARY KEY (`id`),
  KEY `idx_expid` (`express_id`)
) ENGINE=InnoDB AUTO_INCREMENT=0 DEFAULT CHARSET=utf8mb4;

Business Analysis

Each logistics event generates a JSON array like:

[{"time":"2016-03-16 11:16:20","ftime":"2016-03-16 11:16:20","context":"...","areaCode":"","areaName":"","status":"在途"}, ...]

The raw string is 411 bytes. The system periodically polls the API and inserts or updates this JSON into the data column. Because rows are never removed, the table continuously expands.

Optimization 1 – Archive Historical Data

Logistics records have a natural lifecycle: after delivery they are rarely needed for display. Archive rows older than 90 days to HBase (or another cold‑storage system) and delete them from MySQL, keeping the active table size bounded.

Optimization 2 – Reduce JSON Size

Remove redundant fields : ftime duplicates time, and areaCode / areaName are not used in the UI. Dropping them cuts the example from 411 bytes to 237 bytes (≈47 % reduction).

Trim node count : Typical shipments have 15‑20 nodes. Estimating 100 bytes per node, a maximum of 20 nodes fits comfortably in VARCHAR(2048). For shipments with >25‑30 nodes, keep only key milestones (first, last, and any status changes) and discard intermediate nodes.

Optimization 3 – Schema Change to VARCHAR

After trimming, the JSON fits within 2 KB. Changing the column type from TEXT to VARCHAR(2048) reduces storage overhead and improves indexability.

Optimization 4 – Sharding (Optional)

If the table still grows rapidly, consider horizontal sharding (e.g., by month or by express_id) using middleware such as Cobar, Mycat, or ProxySQL. However, this adds operational complexity and should be a last resort.

Result

After applying the above steps—archiving old rows, simplifying the JSON, and converting the column to VARCHAR(2048) —the table size stabilized and the system runs smoothly without requiring massive disk expansion.

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.

json MySQL logistics Data Archiving Schema Optimization

Written by

ITPUB

Official ITPUB account sharing technical insights, community news, and exciting events.

0 followers

Reader feedback

How this landed with the community

Rate this article

Was this worth your time?

Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.