Deep Dive into MySQL Partitioning, GTID Bugs, and TokuDB Optimize Table Anomalies
This article examines why MySQL 5.5/5.6 fails to retrieve non‑null partitioned rows after a physical upgrade, how the ALTER TABLE … PARTITION BY KEY ALGORITHM=1 fix works, the cause of MySQL client crashes when dump output mixes warnings, GTID_PURGED replication bugs, replicate‑filter side effects, and the behavior of TokuDB’s OPTIMIZE TABLE.
TokuDB Feature Analysis – Optimize Table
The following demonstrates a data‑loss scenario caused by a physical upgrade that changes the hash algorithm used for non‑null key values in MySQL 5.1 versus 5.5/5.6.
Phenomenon
drop table t1;
create table t1 (c1 int , c2 int)
PARTITION BY KEY (c2) partitions 5;
insert into t1 values(1,1785089517),(2,null);
MySQL> select * from t1 where c2=1785089517;
+------+------------+
| c1 | c2 |
+------+------------+
| 1 | 1785089517 |
+------+------------+
1 row in set (0.00 sec)
MySQL> select * from t1 where c2 is null;
+------+------+
| c1 | c2 |
+------+------+
| 2 | NULL |
+------+------+
1 row in set (0.00 sec)Running the same instance under MySQL 5.5/5.6 shows the non‑null row cannot be found:
alter table t1 PARTITION BY KEY ALGORITHM = 1 (c2) partitions 5;
MySQL> select * from t1 where c2 is null;
+------+------+
| c1 | c2 |
+------+------+
| 2 | NULL |
+------+------+
1 row in set (0.00 sec)
MySQL> select * from t1 where c2=1785089517;
Empty set (0.00 sec)Root Cause Analysis
MySQL 5.1 uses a different hash algorithm for non‑null values than 5.5/5.6. The 5.1 algorithm ( my_hash_sort_bin) places the value (1,1785089517) in partition 3, while the newer algorithm ( my_hash_sort_simple) puts it in partition 5, causing the query miss.
void my_hash_sort_bin(const CHARSET_INFO *cs __attribute__((unused)),
const uchar *key, size_t len,ulong *nr1, ulong *nr2)
{ ... } void my_hash_sort_simple(const CHARSET_INFO *cs,
const uchar *key, size_t len,
ulong *nr1, ulong *nr2)
{ ... }Both versions share the same handling for NULL values, so NULL rows are still found correctly.
The change stems from a charset‑strategy adjustment in MySQL 5.5/5.6 (see WL#2649).
Compatibility Fix
MySQL 5.5.31+ provides the syntax ALTER TABLE … PARTITION BY KEY ALGORITHM=1 to retain the old hash behavior. Applying it restores correct query results:
alter table t1 PARTITION BY KEY ALGORITHM = 1 (c2) partitions 5;
Query OK, 2 rows affected (0.02 sec)
select * from t1 where c2=1785089517;
+------+------------+
| c1 | c2 |
+------+------------+
| 1 | 1785089517 |
+------+------------+
1 row in set (0.00 sec)The CHECK TABLE … FOR UPGRADE command also detects the old algorithm and suggests the same ALTER statement.
MySQL Client Crash During Large Dump Import
Background
A client crash occurs when importing a ~50 GB dump that mixes warning messages with SQL statements because the user redirected both stdout and stderr to the dump file ( mysqldump … >test.dmp 2>&1).
The embedded warning line contains an unmatched single quote, causing MySQL.cc::add_line to keep reading until memory is exhausted and the process segfaults.
Additional observations:
Maximum allowed packet limits the size of CLOB columns (1 GB max in MySQL 5.5).
mysqldump splits large INSERT statements based on opt_net_buffer_length, forcing a new statement when the buffer size is exceeded.
The client allocates up to 1 GB for a batch buffer ( #define MAX_BATCH_BUFFER_SIZE (1024L * 1024L * 1024L)).
In practice, the limits of max_allowed_packet and MAX_BATCH_BUFFER_SIZE align, so normal imports succeed.
GTID_PURGED Breaks AUTO_POSITION Replication
Bug Description
In MySQL 5.6.22, setting SET GLOBAL GTID_PURGED='…' on the master with GTIDs that the slave has not yet executed does not raise an error; the master silently sends those GTIDs to the slave, leading to data loss.
The master rotates to a new binlog after the GTID_PURGED operation, recording the new GTID set in Previous_gtids_log_event. Because the old binlog contains no GTID events, the slave’s scan stops at the first binlog without error.
Fix
The fix adds a pre‑check: if gtid_purged is larger than the slave’s executed set, the server aborts with error 1236 (same message as condition 2).
Replication Filter Combined with GTID Causes Gaps
Problem
When using replicate-do-db to filter databases, the slave’s Executed_Gtid_Set becomes fragmented because only events for the selected databases are applied, leaving gaps in the GTID sequence.
These gaps can cause errors if the master later purges old binlogs or if a new replica needs the missing GTIDs.
Solution
A bugfix (revno 5860) makes the SQL thread record an empty transaction for filtered events, preserving GTID continuity. For row‑based binlog format, the existing check_table_map already creates empty transactions.
TokuDB Optimize Table Behavior
A user observed that repeated OPTIMIZE TABLE on a TokuDB table caused the main index file size to grow (47 MB → 63 MB → 79 MB) instead of shrinking.
Explanation: TokuDB writes dirty pages to the end of the file, leaving the original blocks as fragments. During checkpoint, these fragments are reclaimed for future writes, so the apparent growth does not indicate wasted space.
Consequences:
During OPTIMIZE TABLE, TokuDB flushes internal nodes to leaf nodes via toku_ft_flush_some_child.
Because TokuDB is a “no‑fragmentation” engine, OPTIMIZE TABLE provides little benefit and is generally unnecessary.
Note: The original article was reproduced from MySQL.taobao.org with credit to the 21CTO community.
Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
21CTO
21CTO (21CTO.com) offers developers community, training, and services, making it your go‑to learning and service platform.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
