Designing MySQL Partition Strategies for Billion‑Row Tables: A Complete Guide
This article explains MySQL partition types (RANGE, LIST, HASH, KEY), demonstrates partition pruning, details maintenance commands, presents real‑world use cases such as time‑series and massive tables, and offers best‑practice recommendations for key selection, partition count, automation, and query optimization.
1. Partition Types
RANGE Partition
Applicable scenario: partitioning by time or numeric range. Example:
-- Partition by year example
CREATE TABLE sales (
id INT NOT NULL,
sale_date DATE NOT NULL,
amount DECIMAL(10,2)
)
PARTITION BY RANGE (YEAR(sale_date)) (
PARTITION p2020 VALUES LESS THAN (2021),
PARTITION p2021 VALUES LESS THAN (2022),
PARTITION p2022 VALUES LESS THAN (2023),
PARTITION p2023 VALUES LESS THAN (2024),
PARTITION p_future VALUES LESS THAN MAXVALUE
);
LIST Partition
Applicable scenario: discrete values such as region or type. Example:
-- Partition by region example
CREATE TABLE users (
id INT NOT NULL,
username VARCHAR(50),
region_id INT NOT NULL
)
PARTITION BY LIST(region_id) (
PARTITION p_east VALUES IN (1,2,3), -- East region
PARTITION p_west VALUES IN (4,5,6), -- West region
PARTITION p_north VALUES IN (7,8,9), -- North region
PARTITION p_south VALUES IN (10,11,12) -- South region
);
HASH Partition
Applicable scenario: uniformly distribute data without a specific query pattern. Example:
-- Partition by user ID hash example
CREATE TABLE orders (
order_id INT NOT NULL,
user_id INT NOT NULL,
order_date DATETIME,
total_amount DECIMAL(10,2)
)
PARTITION BY HASH(user_id)
PARTITIONS 8;
KEY Partition
Similar to HASH but uses MySQL's built‑in hash function and supports all data types except TEXT/BLOB as partition keys. Example:
-- Key partition example
CREATE TABLE logs (
id INT NOT NULL PRIMARY KEY,
log_time DATETIME NOT NULL,
message TEXT
)
PARTITION BY KEY(id)
PARTITIONS 10;
2. Partition Pruning
When a query's WHERE clause contains the partition key, MySQL's optimizer scans only the relevant partitions, skipping the rest.
-- Example: partition pruning
EXPLAIN SELECT * FROM sales
WHERE sale_date BETWEEN '2022-01-01' AND '2022-12-31';
-- The partitions column of the EXPLAIN output should list only p2022; the other partitions are skipped
3. Partition Maintenance Operations
1. Partition Management
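-- Add a new RANGE partition at the upper end. This works only when the table has no
-- MAXVALUE catch-all partition; sales as defined above has p_future, so for that table
-- use the REORGANIZE PARTITION statement further down instead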
ALTER TABLE sales ADD PARTITION (
PARTITION p2024 VALUES LESS THAN (2025)
);
-- Drop a partition (data is removed)
ALTER TABLE sales DROP PARTITION p2020;
-- Reorganize a partition
ALTER TABLE sales REORGANIZE PARTITION p_future INTO (
PARTITION p2024 VALUES LESS THAN (2025),
PARTITION p_future VALUES LESS THAN MAXVALUE
);
-- Merge partitions
ALTER TABLE users REORGANIZE PARTITION p_east, p_west INTO (
PARTITION p_east_west VALUES IN (1,2,3,4,5,6)
);
2. Data Maintenance
-- Rebuild partitions (optimize storage)
ALTER TABLE sales REBUILD PARTITION p2022, p2023;
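-- Optimize partition (reclaim space). InnoDB does not support per-partition optimization and
-- rebuilds the whole table instead, so on InnoDB prefer REBUILD PARTITION plus ANALYZE PARTITION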
ALTER TABLE sales OPTIMIZE PARTITION p2022;
-- Analyze partition (update statistics)
ALTER TABLE sales ANALYZE PARTITION p2022;
-- Check partition integrity
ALTER TABLE sales CHECK PARTITION p2022;
-- Repair partition if needed
ALTER TABLE sales REPAIR PARTITION p2022;
3. Partition Information Queries
-- Show partition definition
SHOW CREATE TABLE sales;
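-- View partition metadata (filter on the schema too, in case another database has a table of the same name)
SELECT * FROM INFORMATION_SCHEMA.PARTITIONS WHERE TABLE_SCHEMA = DATABASE() AND TABLE_NAME = 'sales';
-- View row count per partition (TABLE_ROWS is an estimate for InnoDB tables)
SELECT PARTITION_NAME, TABLE_ROWS FROM INFORMATION_SCHEMA.PARTITIONS WHERE TABLE_SCHEMA = DATABASE() AND TABLE_NAME = 'sales';
4. Typical Use Cases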
Scenario 1: Time‑Series Data Management
-- Large table partitioned by day/month
CREATE TABLE sensor_data (
id BIGINT NOT NULL AUTO_INCREMENT,
sensor_id INT NOT NULL,
collect_time DATETIME NOT NULL,
value FLOAT NOT NULL,
PRIMARY KEY (id, collect_time)
)
PARTITION BY RANGE COLUMNS(collect_time) (
PARTITION p202301 VALUES LESS THAN ('2023-02-01'),
PARTITION p202302 VALUES LESS THAN ('2023-03-01'),
PARTITION p_future VALUES LESS THAN MAXVALUE
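);
Advantages: DROP PARTITION discards a whole partition at once instead of deleting row by row, so on large partitions it can be up to 1000× faster than DELETE; time‑range queries touch fewer partitions; and archiving becomes straightforward.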
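As a rough sketch of the archiving point above, a month of data can be swapped out of sensor_data into a standalone table before the partition is dropped. The table name sensor_data_archive_202301 is an assumption made for illustration; EXCHANGE PARTITION requires the target table to have the same structure as the partitioned table and to be non‑partitioned itself.

```sql
-- Hypothetical archive table with the same columns and indexes as sensor_data
CREATE TABLE sensor_data_archive_202301 LIKE sensor_data;

-- CREATE TABLE ... LIKE copies the partitioning as well, so remove it from the copy
ALTER TABLE sensor_data_archive_202301 REMOVE PARTITIONING;

-- Swap the January 2023 rows into the archive table; p202301 is left empty
ALTER TABLE sensor_data EXCHANGE PARTITION p202301 WITH TABLE sensor_data_archive_202301;

-- Remove the now-empty partition from the live table
ALTER TABLE sensor_data DROP PARTITION p202301;
```

The archive table can then be dumped, moved to cheaper storage, or dropped on its own schedule without touching the live table again.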
Scenario 2: Massive Table Management
-- Partition design for a billion‑row user table
CREATE TABLE big_users (
user_id BIGINT NOT NULL,
created_at DATETIME NOT NULL,
data JSON,
PRIMARY KEY (user_id, created_at)
)
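PARTITION BY HASH(user_id DIV 1000000) PARTITIONS 100;
Advantages: the data is physically split into smaller files, lookups by user_id are pruned to a single partition, and backup/restore can be handled more flexibly. MySQL does not scan the partitions of a table in parallel for a single query, so the gain comes from pruning rather than from parallel execution.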
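To make the mapping concrete, here is a small sketch of how a row lands in a partition under this design. HASH partitioning stores a row in partition number MOD(expr, number_of_partitions), and partitions created with PARTITIONS n are named p0, p1, and so on by default. The user_id value 1234567890 is a made‑up example.

```sql
-- For the hypothetical user_id 1234567890:
-- 1234567890 DIV 1000000 = 1234, and MOD(1234, 100) = 34
SELECT MOD(1234567890 DIV 1000000, 100) AS partition_number;  -- 34, i.e. partition p34

-- The partitions column of EXPLAIN shows which partitions an equality lookup touches
EXPLAIN SELECT * FROM big_users WHERE user_id = 1234567890;

-- Approximate row counts per partition, useful for spotting uneven distribution
SELECT PARTITION_NAME, TABLE_ROWS
FROM INFORMATION_SCHEMA.PARTITIONS
WHERE TABLE_SCHEMA = DATABASE() AND TABLE_NAME = 'big_users'
ORDER BY PARTITION_ORDINAL_POSITION;
```

One consequence of dividing by 1000000 first is that each block of one million consecutive user IDs shares a partition, so if IDs are assigned sequentially, new users concentrate in one partition at a time. Whether that is acceptable depends on the workload.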
Scenario 3: Hot‑Data Isolation
-- Separate active and historical orders
CREATE TABLE orders (
order_id BIGINT NOT NULL,
user_id INT NOT NULL,
status TINYINT NOT NULL,
created_at DATETIME NOT NULL,
-- The primary key must include the partitioning column (status), otherwise MySQL rejects the table
PRIMARY KEY (order_id, status)
)
PARTITION BY LIST(status) (
PARTITION p_active VALUES IN (1,2,3), -- active orders
PARTITION p_completed VALUES IN (4,5), -- completed orders
PARTITION p_cancelled VALUES IN (6,7) -- cancelled orders
);
5. Best Practices and Performance Tips
5.1 Partition Key Selection
Choose columns that appear frequently in WHERE clauses.
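Every unique key on the table, including the primary key, must contain all columns used in the partitioning expression.
Partitioning expressions must be deterministic; MySQL rejects nondeterministic functions such as NOW() or RAND() in a partitioning expression.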
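The primary‑key requirement above can be sketched with a small table. The table events_by_year and its columns are hypothetical names used only for this illustration; the rejected variant is left commented out so the block runs as written.

```sql
-- Rejected with ERROR 1503: the primary key does not contain created_on,
-- which the partitioning expression uses
-- CREATE TABLE events_by_year (
--   id BIGINT NOT NULL,
--   created_on DATE NOT NULL,
--   PRIMARY KEY (id)
-- ) PARTITION BY RANGE (YEAR(created_on)) (
--   PARTITION p2023 VALUES LESS THAN (2024),
--   PARTITION p_future VALUES LESS THAN MAXVALUE
-- );

-- Accepted: created_on is part of the primary key
CREATE TABLE events_by_year (
  id BIGINT NOT NULL,
  created_on DATE NOT NULL,
  PRIMARY KEY (id, created_on)
) PARTITION BY RANGE (YEAR(created_on)) (
  PARTITION p2023 VALUES LESS THAN (2024),
  PARTITION p_future VALUES LESS THAN MAXVALUE
);
```

The trade‑off is that the primary key no longer guarantees uniqueness of id on its own, only of the (id, created_on) pair.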
5.2 Control Number of Partitions
As a rule of thumb, keep the number of partitions per table to roughly 100 or fewer. Excessive partitions increase open file descriptors, memory usage, and optimizer overhead.
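One way to keep an eye on this is to count partitions per table in the current schema. This is a minimal sketch that assumes the connection has a default database selected.

```sql
-- Number of partitions per partitioned table in the current schema
SELECT TABLE_NAME, COUNT(*) AS partition_count
FROM INFORMATION_SCHEMA.PARTITIONS
WHERE TABLE_SCHEMA = DATABASE()
  AND PARTITION_NAME IS NOT NULL   -- non-partitioned tables have a NULL partition name
GROUP BY TABLE_NAME
ORDER BY partition_count DESC;
```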
5.3 Automated Maintenance
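DELIMITER $$
CREATE PROCEDURE maintain_partitions()
BEGIN
  -- Do the work only on the 1st of each month; the event below fires daily
  IF DAY(CURDATE()) = 1 THEN
    -- Create next month's partition by splitting p_future, unless it already exists
    SET @next_month = DATE_FORMAT(DATE_ADD(CURDATE(), INTERVAL 1 MONTH), '%Y%m');
    IF NOT EXISTS (
      SELECT 1 FROM INFORMATION_SCHEMA.PARTITIONS
      WHERE TABLE_SCHEMA = DATABASE() AND TABLE_NAME = 'sensor_data'
        AND PARTITION_NAME = CONCAT('p', @next_month)
    ) THEN
      SET @sql = CONCAT(
        'ALTER TABLE sensor_data REORGANIZE PARTITION p_future INTO (',
        'PARTITION p', @next_month, ' VALUES LESS THAN (''', DATE_FORMAT(DATE_ADD(CURDATE(), INTERVAL 2 MONTH), '%Y-%m-01'), ''')',
        ', PARTITION p_future VALUES LESS THAN MAXVALUE)'
      );
      PREPARE stmt FROM @sql;
      EXECUTE stmt;
      DEALLOCATE PREPARE stmt;
    END IF;
    -- Drop the partition from 13 months ago, if it exists, so about one year of data is kept
    SET @old_partition = DATE_FORMAT(DATE_SUB(CURDATE(), INTERVAL 13 MONTH), '%Y%m');
    IF EXISTS (
      SELECT 1 FROM INFORMATION_SCHEMA.PARTITIONS
      WHERE TABLE_SCHEMA = DATABASE() AND TABLE_NAME = 'sensor_data'
        AND PARTITION_NAME = CONCAT('p', @old_partition)
    ) THEN
      SET @sql = CONCAT('ALTER TABLE sensor_data DROP PARTITION p', @old_partition);
      PREPARE stmt FROM @sql;
      EXECUTE stmt;
      DEALLOCATE PREPARE stmt;
    END IF;
  END IF;
END$$
DELIMITER ;
-- The event only fires when the event scheduler is enabled (SET GLOBAL event_scheduler = ON)
CREATE EVENT partition_maintenance ON SCHEDULE EVERY 1 DAY DO CALL maintain_partitions();
5.4 Query Optimization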
-- Good: query includes partition key, enabling pruning
SELECT * FROM sales WHERE sale_date >= '2023-01-01';
-- Bad: query cannot use partition pruning
SELECT * FROM sales WHERE amount > 1000;
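To check which of the two statements above is actually pruned, the partitions column of EXPLAIN can be inspected (it is part of the default EXPLAIN output in MySQL 5.7 and later). The sketch below assumes the sales table from section 1; the exact partition list depends on which partitions exist at the time.

```sql
-- Should list only the partitions that can hold 2023 or later dates (p2023 and anything after it)
EXPLAIN SELECT * FROM sales WHERE sale_date >= '2023-01-01';

-- Should list every partition, because amount is not the partition key
EXPLAIN SELECT * FROM sales WHERE amount > 1000;

-- Explicit partition selection: read only from the named partition,
-- even though the WHERE clause does not mention the partition key
SELECT * FROM sales PARTITION (p2023) WHERE amount > 1000;
```

Explicit partition selection is a fallback for cases where the caller already knows the target partition; pruning through the WHERE clause is usually preferable because it keeps working when partitions are renamed or reorganized.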
-- Index on the partition key remains important
CREATE INDEX idx_sale_date ON sales(sale_date);
Senior Xiao Ying
Dedicated to sharing Java backend technical experience and original tutorials, offering career transition advice and resume editing. Recognized as a rising star in CSDN's Java backend community and ranked Top 3 in the 2022 New Star Program for Java backend.
