Dynamic Extension of Fields in Billion‑Row Core Tables: Architecture and Implementation
To avoid table locks, page splits, and index degradation when adding fields to billion‑row core tables, this article presents a practical solution that uses dynamic field extension via configurable management, JSON‑based extension fields, and an extension‑table architecture integrated with Elasticsearch for storage and retrieval.
When a core table with billions of rows requires a new field, a simple ALTER TABLE can cause table locks, page splits, and index performance decay, potentially leading to production incidents. This article explores how to dynamically extend fields without impacting business operations.
Background
In software projects, new features often need additional columns. Adding columns to a massive table introduces lock risks, page fragmentation, and index degradation, especially when using older MySQL versions that lack native JSON indexing.
Extension Field Approach
The simplest solution is to add an extend column of type JSON to store extra attributes. The table structure looks like:
order_id | extend
111 | {"uid":1,"name":"张三"}
222 | {"uid":2,"name":"李四"}
...
Java code manages this column by defining an inner class for the JSON structure:
// Assumption: JSON is fastjson's com.alibaba.fastjson.JSON, whose
// toJSONString / parseObject methods match the calls below.
import com.alibaba.fastjson.JSON;
import java.io.Serializable;
import lombok.Data;

public class Order {
    private Long orderId;
    private String extend;

    /** Inner class describing the JSON structure of the extend field. */
    @Data
    public static class ExtendObj implements Serializable {
        private Long uid;
        private String name;
        // ... other fields
    }

    /** Serializes the object into the extend field. Simplified: null checks omitted. */
    public void setExtendObj(ExtendObj extendObj) {
        this.extend = JSON.toJSONString(extendObj);
    }

    /** Parses the extend field back into an object. Simplified: null checks omitted. */
    public ExtendObj getExtendObj() {
        return JSON.parseObject(extend, ExtendObj.class);
    }
}
This approach has three problems: attributes inside the JSON cannot be indexed, concurrent updates can overwrite each other because every write replaces the whole JSON value, and the inner class must be edited by hand for each new attribute.
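The concurrency overwrite can be reproduced without a database. The sketch below is illustrative only: a static map stands in for the stored extend value of one order row, a plain Map stands in for the parsed JSON, and the class and method names are hypothetical. Two writers each read the whole value, change one attribute, and write the whole value back.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative only: replays two read-modify-write cycles on one extend value. */
class ExtendLostUpdateDemo {

    /** Stands in for the stored extend column of a single order row. */
    static Map<String, Object> storedExtend = new HashMap<>();

    /** Reads the whole value (a copy, as if parsed from the JSON string). */
    static Map<String, Object> read() {
        return new HashMap<>(storedExtend);
    }

    /** Writes the whole value back, replacing whatever was stored. */
    static void write(Map<String, Object> value) {
        storedExtend = new HashMap<>(value);
    }

    /** Both writers read before either one writes. */
    static Map<String, Object> simulate() {
        storedExtend = new HashMap<>();
        storedExtend.put("uid", 1);

        Map<String, Object> writerA = read();
        Map<String, Object> writerB = read();

        writerA.put("name", "Zhang San");
        write(writerA);

        writerB.put("age", 30);
        write(writerB); // replaces A's value, so "name" is silently lost

        return read();
    }

    public static void main(String[] args) {
        System.out.println(simulate());
    }
}
```

After the second write the stored value holds uid and age but no name, even though neither writer touched the other's attribute.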
Extension Table Approach
To overcome indexing and concurrency issues, the solution stores each extra attribute as a separate row in an extension table:
order_id | key | value
111 | uid | 1
111 | name | 张三
222 | uid | 2
222 | name | 李四
...
When new attributes like age are added, additional rows are inserted, enabling indexing and reducing contention. However, this multiplies row counts, requiring sharding for the extension table as well.
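A minimal in-memory sketch of the row-per-attribute model may make the trade-off concrete. It is not the article's implementation: the class and method names are hypothetical, and a nested map stands in for the real (order_id, key, value) table. Each write touches one attribute row, and the row counter shows how quickly rows multiply.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

/** Illustrative in-memory model of the extension table: one row per (orderId, key). */
class OrderExtensionTable {

    // Outer key: order_id. Inner map: key -> value, mirroring the table rows.
    private final Map<Long, Map<String, String>> rows = new TreeMap<>();

    /** Upserts one attribute row; other attributes of the same order are untouched. */
    void put(long orderId, String key, String value) {
        rows.computeIfAbsent(orderId, id -> new LinkedHashMap<>()).put(key, value);
    }

    /** Pivots all rows of one order back into a key -> value view. */
    Map<String, String> attributesOf(long orderId) {
        return new LinkedHashMap<>(rows.getOrDefault(orderId, Collections.emptyMap()));
    }

    /** Number of physical rows across all orders. */
    int rowCount() {
        return rows.values().stream().mapToInt(Map::size).sum();
    }

    public static void main(String[] args) {
        OrderExtensionTable table = new OrderExtensionTable();
        table.put(111L, "uid", "1");
        table.put(111L, "name", "Zhang San");
        table.put(222L, "uid", "2");
        table.put(222L, "name", "Li Si");
        table.put(111L, "age", "30"); // new attribute: one more row, no schema change
        System.out.println(table.attributesOf(111L));
        System.out.println(table.rowCount());
    }
}
```

Two orders with two or three attributes each already occupy five rows, which is why the extension table grows faster than the main table.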
Current Integrated Solution
The final architecture splits the system into three parts: data management, data storage, and data retrieval. Data management handles dynamic field registration, scope, and lifecycle; data storage uses the extension‑table pattern; data retrieval leverages an Elasticsearch cluster with a custom ES management system (ECP) to combine main and extension data for queries.
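The article does not show how main and extension data are combined for retrieval. One plausible reading, sketched here purely as an assumption, is that each order is flattened into a single document before it is indexed. All names are hypothetical, including the ext_ prefix; no Elasticsearch client is called, and the method only builds the map that would be sent for indexing.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative only: flattens a main-table row plus its extension rows into one document. */
class SearchDocumentBuilder {

    /** One (key, value) row from the extension table for a given order. */
    static final class ExtensionRow {
        final String key;
        final String value;

        ExtensionRow(String key, String value) {
            this.key = key;
            this.value = value;
        }
    }

    /**
     * Copies the core columns first, then adds each extension attribute under an
     * "ext_" prefix (a hypothetical convention) so it cannot shadow a core column.
     * The input map is not modified.
     */
    static Map<String, Object> build(Map<String, Object> mainRow, List<ExtensionRow> extensionRows) {
        Map<String, Object> doc = new LinkedHashMap<>(mainRow);
        for (ExtensionRow row : extensionRows) {
            doc.put("ext_" + row.key, row.value);
        }
        return doc;
    }

    public static void main(String[] args) {
        Map<String, Object> mainRow = new LinkedHashMap<>();
        mainRow.put("order_id", 111L);
        List<ExtensionRow> rows = Arrays.asList(
                new ExtensionRow("uid", "1"),
                new ExtensionRow("name", "Zhang San"));
        System.out.println(build(mainRow, rows));
    }
}
```

Prefixing extension attributes is a design choice of this sketch, not something the article specifies; it keeps a newly registered field from colliding with an existing core column name.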
With this setup, adding a new field only requires a configuration change in the management console, after which the field is available for storage, transmission, and search without any code changes.
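The configuration-only workflow can be pictured as a runtime registry that the write path consults. The sketch below is an assumption about how such a registry might behave, not the article's management console: the class, the two value types, and the validation rule are all hypothetical.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Illustrative only: a runtime registry of extension fields. */
class ExtensionFieldRegistry {

    /** Value types this sketch can validate; the real system's type set is not specified. */
    enum FieldType { STRING, LONG }

    private final Map<String, FieldType> fields = new ConcurrentHashMap<>();

    /** Registers a field at runtime; no code change or redeploy is involved. */
    void register(String name, FieldType type) {
        fields.put(name, type);
    }

    /** Accepts a write only for a registered field whose value parses as the declared type. */
    boolean accepts(String name, String rawValue) {
        if (name == null || rawValue == null) {
            return false;
        }
        FieldType type = fields.get(name);
        if (type == null) {
            return false; // unknown field: must be registered first
        }
        if (type == FieldType.LONG) {
            try {
                Long.parseLong(rawValue);
            } catch (NumberFormatException e) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        ExtensionFieldRegistry registry = new ExtensionFieldRegistry();
        System.out.println(registry.accepts("age", "30"));     // false: not registered yet
        registry.register("age", FieldType.LONG);
        System.out.println(registry.accepts("age", "30"));     // true
        System.out.println(registry.accepts("age", "thirty")); // false: not a LONG
    }
}
```

Because the set of valid fields lives in data and not in a Java class, adding age is a registry entry, unlike the inner-class approach where each attribute needs a code edit.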
Conclusion
Dynamic field extension for massive tables balances flexibility and stability. By separating core and extension data, decoupling management, storage, and retrieval, and using Elasticsearch for unified search, the solution enables rapid business iteration while mitigating technical risks.