
How to Speed Up Massive MySQL User‑Log Tables: Partitioning, Indexing, and Migration Strategies

This article examines performance problems with a 20‑million‑row MySQL user‑log table on Alibaba Cloud RDS, outlines three solution paths—optimizing the existing database, migrating to a MySQL‑compatible high‑performance service, and adopting a big‑data engine—and provides detailed guidance on schema design, indexing, partitioning, and practical SQL tweaks.

ITPUB

Sep 9, 2020


Problem Overview

An Alibaba Cloud RDS for MySQL 5.6 instance stores a user‑access log table that holds ~20 million rows after six months and ~40 million rows after a year. Queries against this table are extremely slow and cause the system to hang daily. Both the original schema and the SQL are poorly designed.

Solution Options

Optimize the existing MySQL database (no code changes, low cost, limited scalability).

Migrate to a MySQL‑compatible high‑performance service (minimal code changes, higher cost).

Adopt a big‑data platform (high scalability, requires code changes).

Approach 1 – Optimizing the Existing MySQL Database

Table Design Recommendations

Avoid nullable columns; declare them NOT NULL and supply a default value (for example 0 for numeric columns).

Prefer INT over BIGINT; use UNSIGNED for non‑negative values.

Replace string columns with ENUM or integer codes.

Prefer TIMESTAMP to DATETIME.

Keep column count below 20.

Store IP addresses as integers.
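The recommendations above can be sketched as a table definition. This is only an illustration under assumptions: the table name log_design_demo and every column except id and accesstime are hypothetical, since the article does not show the real schema.

```sql
-- Hypothetical schema: only `id` and `accesstime` are named in the article.
CREATE TABLE log_design_demo (
  id         INT UNSIGNED NOT NULL AUTO_INCREMENT,            -- INT UNSIGNED rather than BIGINT
  user_id    INT UNSIGNED NOT NULL DEFAULT 0,                 -- NOT NULL with a numeric default
  device     ENUM('web','ios','android') NOT NULL DEFAULT 'web', -- ENUM instead of a free-form string
  ip         INT UNSIGNED NOT NULL DEFAULT 0,                 -- IPv4 address stored as an integer
  accesstime TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,    -- TIMESTAMP instead of DATETIME
  PRIMARY KEY (id)
) ENGINE=InnoDB;

-- INET_ATON converts dotted IPv4 text to an integer on the way in.
INSERT INTO log_design_demo (user_id, device, ip)
VALUES (42, 'web', INET_ATON('192.168.0.1'));

-- INET_NTOA converts it back for display.
SELECT id, user_id, device, INET_NTOA(ip) AS ip_text, accesstime
FROM log_design_demo
WHERE user_id = 42;
```

INET_ATON and INET_NTOA cover IPv4 only; IPv6 addresses would need a different column type.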

Indexing Guidelines

Create indexes only on columns used in WHERE or ORDER BY clauses; verify with EXPLAIN.

Do not index columns that are frequently NULL.

Avoid indexing low‑cardinality columns (e.g., gender).

Use prefix indexes for long VARCHAR columns.

Avoid primary keys on large VARCHAR columns.

Enforce foreign‑key logic in application code.

Minimize UNIQUE constraints unless required.

When using composite indexes, match the column order to query predicates.
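As a rough sketch of these guidelines, the statements below define a composite index whose column order matches the query predicate, a prefix index on a long VARCHAR, and an EXPLAIN check. The table log_index_demo and the user_id and url columns are assumptions made for illustration.

```sql
-- Hypothetical table; the column names are illustrative only.
CREATE TABLE log_index_demo (
  id         INT UNSIGNED NOT NULL AUTO_INCREMENT,
  user_id    INT UNSIGNED NOT NULL DEFAULT 0,
  url        VARCHAR(255) NOT NULL DEFAULT '',
  accesstime DATETIME NOT NULL,
  PRIMARY KEY (id),
  KEY idx_user_time (user_id, accesstime),  -- equality column first, range column second
  KEY idx_url_prefix (url(32))              -- prefix index: only the first 32 characters
) ENGINE=InnoDB;

-- Check the plan: the `key` column of the output should show which index is chosen.
EXPLAIN
SELECT id, accesstime
FROM log_index_demo
WHERE user_id = 42
  AND accesstime >= '2020-01-01 00:00:00'
ORDER BY accesstime DESC
LIMIT 20;

-- One way to pick a prefix length: compare prefix selectivity with full-column selectivity.
SELECT COUNT(DISTINCT LEFT(url, 32)) / COUNT(*) AS prefix_selectivity,
       COUNT(DISTINCT url) / COUNT(*)           AS full_selectivity
FROM log_index_demo;
```

If the two selectivity values are close, the shorter prefix is likely to be sufficient; the length 32 here is an arbitrary starting point.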

SQL Optimization Tips

Limit result sets with LIMIT.

Avoid SELECT *; list needed columns.

Prefer JOIN over sub‑queries.

Break large DELETE / INSERT statements into smaller batches.

Enable slow‑query logging to identify bottlenecks.

Keep indexed columns bare in predicates: move calculations to the constant side of the comparison so the index can still be used.

Keep each statement simple to reduce lock time.

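Replace chains of OR on the same column with IN: MySQL sorts a constant IN list and searches it in logarithmic time, whereas OR conditions are checked linearly.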

Handle complex logic in application code, not in triggers or functions.

Avoid leading wildcards in LIKE patterns.

Minimize the number of joins.

Compare values of the same type.

Avoid != or <> in WHERE clauses.

Prefer BETWEEN for continuous ranges.

Paginate large result sets with reasonable page sizes.
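A few of these tips can be sketched in SQL as follows. The table log_sql_demo, its columns other than id and accesstime, and all literal values are hypothetical; the range query assumes the filter targets November of one specific year.

```sql
-- Hypothetical table used only for this sketch.
CREATE TABLE log_sql_demo (
  id         INT UNSIGNED NOT NULL AUTO_INCREMENT,
  user_id    INT UNSIGNED NOT NULL DEFAULT 0,
  status     TINYINT UNSIGNED NOT NULL DEFAULT 0,
  accesstime DATETIME NOT NULL,
  PRIMARY KEY (id),
  KEY idx_accesstime (accesstime),
  KEY idx_status (status)
) ENGINE=InnoDB;

-- Bare column on the left: a range on accesstime can use idx_accesstime,
-- while WHERE MONTH(accesstime) = 11 must evaluate the function for every row.
SELECT id, user_id, accesstime
FROM log_sql_demo
WHERE accesstime >= '2019-11-01 00:00:00'
  AND accesstime <  '2019-12-01 00:00:00'
LIMIT 10;

-- IN instead of status = 1 OR status = 2 OR status = 3.
SELECT id, user_id
FROM log_sql_demo
WHERE status IN (1, 2, 3)
LIMIT 100;

-- Batched delete: rerun this statement until ROW_COUNT() reports 0.
DELETE FROM log_sql_demo
WHERE accesstime < '2019-01-01 00:00:00'
ORDER BY id
LIMIT 5000;
SELECT ROW_COUNT() AS rows_deleted;

-- Pagination by key: continue from the last id of the previous page
-- (100000 is a placeholder) instead of using a large OFFSET.
SELECT id, user_id, accesstime
FROM log_sql_demo
WHERE id > 100000
ORDER BY id
LIMIT 50;
```

The batch size of 5000 and the page size of 50 are arbitrary; suitable values depend on row size and lock tolerance.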

Partitioning

MySQL 5.1+ supports horizontal partitioning. An initial RANGE partitioning by month (12 partitions) gave a ~6× speedup. Switching to HASH partitioning on id with 64 partitions yielded dramatic performance gains:

PARTITION BY HASH (id) PARTITIONS 64;

Example query after partitioning:

SELECT * FROM readroom_website WHERE MONTH(accesstime)=11 LIMIT 10;

Execution time dropped from several seconds to under one second.
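The two partitioning schemes described above might look like the following sketch. The table log_part_demo, the user_id column, and the partition boundaries are assumptions; the article does not show its DDL. The month filter is written as a range on accesstime because partition pruning works from predicates on the bare column.

```sql
-- Hypothetical table, RANGE-partitioned by month on a DATETIME column.
-- accesstime is part of the primary key because the partitioning expression uses it.
CREATE TABLE log_part_demo (
  id         INT UNSIGNED NOT NULL AUTO_INCREMENT,
  user_id    INT UNSIGNED NOT NULL DEFAULT 0,
  accesstime DATETIME NOT NULL,
  PRIMARY KEY (id, accesstime)
) ENGINE=InnoDB
PARTITION BY RANGE (TO_DAYS(accesstime)) (
  PARTITION p202010 VALUES LESS THAN (TO_DAYS('2020-11-01')),
  PARTITION p202011 VALUES LESS THAN (TO_DAYS('2020-12-01')),
  PARTITION p202012 VALUES LESS THAN (TO_DAYS('2021-01-01')),
  PARTITION pmax    VALUES LESS THAN MAXVALUE
);

-- MySQL 5.6 syntax: the `partitions` column lists the partitions the query will touch.
EXPLAIN PARTITIONS
SELECT id, user_id, accesstime
FROM log_part_demo
WHERE accesstime >= '2020-11-01 00:00:00'
  AND accesstime <  '2020-12-01 00:00:00'
LIMIT 10;

-- The HASH variant from the article. This rebuilds the whole table,
-- so on a large table it belongs in a maintenance window.
ALTER TABLE log_part_demo PARTITION BY HASH (id) PARTITIONS 64;
```

With HASH (id), only predicates on id can be pruned to a single partition; a filter on accesstime alone still visits all 64 partitions.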

Partition Limits

Maximum 1024 partitions per table before MySQL 5.6.7; 8192 from 5.6.7 onward.

Every column in the partitioning expression must be part of every primary key and unique key on the table.

Partitioned tables cannot have foreign keys.

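Avoid NULL in partitioning columns: MySQL treats NULL as lower than any non‑NULL value, so such rows collect in the lowest RANGE partition (or partition 0 for HASH and KEY), which undermines partition pruning.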

All partitions must use the same storage engine.

Supported types: RANGE, LIST, HASH, KEY.
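The key restriction is the limit most likely to surface in practice. A minimal sketch, using hypothetical tables log_limit_bad and log_limit_ok, shows a definition MySQL rejects (left commented out so the script runs) next to one it accepts, followed by a query for inspecting the resulting partitions.

```sql
-- Rejected by MySQL: the primary key omits accesstime, which the
-- partitioning expression uses. Kept as a comment on purpose.
-- CREATE TABLE log_limit_bad (
--   id         INT UNSIGNED NOT NULL,
--   accesstime DATETIME NOT NULL,
--   PRIMARY KEY (id)
-- ) ENGINE=InnoDB
-- PARTITION BY RANGE (TO_DAYS(accesstime)) (
--   PARTITION pmax VALUES LESS THAN MAXVALUE
-- );

-- Accepted: the primary key includes the partitioning column.
CREATE TABLE log_limit_ok (
  id         INT UNSIGNED NOT NULL,
  accesstime DATETIME NOT NULL,
  PRIMARY KEY (id, accesstime)
) ENGINE=InnoDB
PARTITION BY RANGE (TO_DAYS(accesstime)) (
  PARTITION p2020 VALUES LESS THAN (TO_DAYS('2021-01-01')),
  PARTITION pmax  VALUES LESS THAN MAXVALUE
);

-- List the partitions and their approximate row counts.
SELECT PARTITION_NAME, PARTITION_METHOD, TABLE_ROWS
FROM INFORMATION_SCHEMA.PARTITIONS
WHERE TABLE_SCHEMA = DATABASE()
  AND TABLE_NAME = 'log_limit_ok';
```

Widening the primary key this way means id alone is no longer guaranteed unique by the database, which is a trade-off to weigh before partitioning on a non-key column.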

Sharding and Database Splitting

Horizontal sharding splits a large table into many smaller tables (e.g., tableName_id%100) and requires code changes. Vertical sharding separates columns into different tables, also requiring development effort. Database‑level read/write separation adds operational complexity and is not recommended for this case.
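Shard routing normally lives in application code, but the modulo rule itself can be sketched in SQL. Everything here is hypothetical: the shard_log_NN naming, the choice of user_id as the routing key (the article's expression uses id), and the sample value. Only one of the 100 shard tables is created to keep the sketch short.

```sql
-- Hypothetical layout: shard_log_00 .. shard_log_99, routed by user_id % 100.
CREATE TABLE shard_log_07 (
  id         INT UNSIGNED NOT NULL AUTO_INCREMENT,
  user_id    INT UNSIGNED NOT NULL DEFAULT 0,
  accesstime DATETIME NOT NULL,
  PRIMARY KEY (id),
  KEY idx_user (user_id)
) ENGINE=InnoDB;

-- Work out the shard table for one user: 1234507 % 100 = 7, padded to '07'.
SET @uid = 1234507;
SET @tbl = CONCAT('shard_log_', LPAD(MOD(@uid, 100), 2, '0'));

-- Table names cannot be bound as parameters, so the statement text is
-- assembled first and then prepared; the user id is still bound with ?.
SET @sql = CONCAT('SELECT id, accesstime FROM ', @tbl,
                  ' WHERE user_id = ? ORDER BY id LIMIT 10');
PREPARE stmt FROM @sql;
EXECUTE stmt USING @uid;
DEALLOCATE PREPARE stmt;
```

Queries that do not carry the routing key have to fan out across all shards, which is part of the development cost mentioned above.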

Approach 2 – Migrating to a MySQL‑Compatible High‑Performance Service

Open‑source options: TiDB (https://github.com/pingcap/tidb) and CUBRID. Cloud services evaluated:

Alibaba Cloud POLARDB – 100% MySQL compatible, up to 100 TB storage, up to 6× MySQL performance, cost‑effective.

Alibaba Cloud OceanBase – MySQL‑compatible HTAP engine, higher cost, suited for mixed OLTP/OLAP workloads.

Tencent Cloud DCDB – MySQL‑compatible distributed database with automatic sharding.

In the author's tests, POLARDB delivered a ~10× performance improvement with minimal migration effort.

Approach 3 – Switching to a Big‑Data Engine

When data exceeds hundreds of millions of rows, consider:

Open‑source Hadoop ecosystem (HBase, Hive) – high operational cost.

Alibaba Cloud MaxCompute + DataWorks – serverless, pay‑as‑you‑go, suitable for batch processing. The author implemented the solution in ~300 lines of SQL and solved the problem for under ¥100.

MaxCompute provides SQL, MapReduce, Python, and shell interfaces; DataWorks offers workflow orchestration.

Conclusion

For workloads below the hundred‑million‑row threshold, start with MySQL schema and query optimization, then evaluate POLARDB if further performance is needed. Only migrate to a big‑data solution when relational databases can no longer handle the data volume.
