
Inside MySQL 5.6 Parallel Replication: Code Walkthrough and Design

This article explains how MySQL 5.6 introduced parallel replication to overcome the bottleneck of a single SQL thread, detailing the underlying binlog events, configuration parameters, key data structures, worker coordination, checkpoint mechanisms, and potential limitations, all from a source‑code perspective.

21CTO

Feb 25, 2016


MySQL 5.6 replication enables higher performance, scalability, and availability; many large sites rely on it to surpass single‑instance limits and support billions of users. This article analyzes the implementation from a code‑level perspective.

Master‑slave synchronization uses binlog replay on the replica: the I/O thread fetches the master binlog into the relay log, and the SQL thread replays events. A single SQL thread becomes a bottleneck under heavy master load, causing inevitable replica lag.

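The lag comes from an n-to-1 mismatch: the master executes transactions on many concurrent threads, while the replica replays them on a single thread. To address this, MySQL 5.6 introduced parallel replication, which lets multiple threads apply events concurrently.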

Design details can be found in worklogs WL#4648, WL#5563, WL#5569, WL#5754, WL#5599 and earlier monthly reports.

Prerequisite Knowledge

binlog

Binlog records database changes as a sequence of events, e.g.:

Query_log_event
Table_map_log_event
Write_rows_log_event / Update_rows_log_event / Delete_rows_log_event
Xid_log_event

Refer to the official documentation for the meaning of each event.

Configuration

Parallel replication can be tuned via several parameters:

slave_parallel_workers – number of worker threads

slave_checkpoint_group – how many transactions trigger a checkpoint

slave_checkpoint_period – time interval between checkpoints, in milliseconds

slave_pending_jobs_size_max – maximum total size, in bytes, of events queued for workers but not yet applied
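As a rough illustration of how these settings might be applied at runtime, the sketch below renders the SQL statements from a dictionary. It only builds text and does not connect to a server. `render_mts_settings` is a hypothetical helper, and the values are placeholders, not recommendations. The SQL thread is stopped and restarted around the changes because a new slave_parallel_workers value only takes effect when the SQL thread starts.

```python
# Hypothetical helper: renders SQL text only; it does not talk to a server.
MTS_VARIABLES = (
    "slave_parallel_workers",
    "slave_checkpoint_group",
    "slave_checkpoint_period",
    "slave_pending_jobs_size_max",
)

def render_mts_settings(settings):
    """Return the SQL statements that would apply the given MTS settings."""
    unknown = set(settings) - set(MTS_VARIABLES)
    if unknown:
        raise ValueError(f"not an MTS variable: {sorted(unknown)}")
    statements = ["STOP SLAVE SQL_THREAD;"]
    for name in MTS_VARIABLES:  # fixed order keeps the output stable
        if name in settings:
            statements.append(f"SET GLOBAL {name} = {int(settings[name])};")
    statements.append("START SLAVE SQL_THREAD;")
    return statements

# Placeholder values, for illustration only.
example = render_mts_settings({
    "slave_parallel_workers": 4,
    "slave_checkpoint_group": 512,
    "slave_checkpoint_period": 300,
})
print("\n".join(example))
```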

Concept Terminology

MTS – Multi‑Threaded Slave (parallel replication)

group – a set of events belonging to one transaction in the binlog

worker – execution thread introduced by MTS

Coordinator – the former SQL thread that now distributes work

checkpoint – point where the Coordinator collects completed work and advances the execution position

B‑event – transaction start (BEGIN or GTID)

G‑event – events containing database information (Table_map, Query)

T‑event – transaction end (COMMIT/ROLLBACK or XID)

Related Source Files

sql/rpl_rli_pdb.h

sql/rpl_rli_pdb.cc

sql/rpl_slave.cc

sql/log_event.cc

sql/rpl_rli.h

Parallel Execution Principles

The model follows a producer‑consumer pattern: the Coordinator (C) inserts events into each worker’s (W) task queue, and workers pull events for execution.

All events of the same group are sent to the same worker to preserve transaction consistency.

Dispatching is based on the database information in G‑events; other events follow the last assigned worker.
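The producer-consumer model described above can be sketched in a few lines. This is a simplified illustration, not the MySQL implementation: the database-to-worker mapping is a fixed dictionary passed in by the caller (the real Coordinator builds it dynamically), and "applying" an event just records it. All names here are hypothetical.

```python
import queue
import threading

STOP = object()  # sentinel telling a worker to exit

def worker_loop(jobs, applied):
    """Consumer: pull events from this worker's queue until STOP arrives."""
    while True:
        event = jobs.get()
        if event is STOP:
            break
        applied.append(event)  # stand-in for actually applying the event

def run(groups, db_to_worker, n_workers=2):
    """Producer: send every event of a group to the worker chosen for its db."""
    queues = [queue.Queue() for _ in range(n_workers)]
    applied = [[] for _ in range(n_workers)]
    threads = [
        threading.Thread(target=worker_loop, args=(queues[i], applied[i]))
        for i in range(n_workers)
    ]
    for t in threads:
        t.start()
    for db, events in groups:
        target = queues[db_to_worker[db]]
        for event in events:  # the whole group follows one worker
            target.put((db, event))
    for q in queues:
        q.put(STOP)
    for t in threads:
        t.join()
    return applied

groups = [
    ("db1", ["BEGIN", "Table_map", "Write_rows", "Xid"]),
    ("db2", ["BEGIN", "Table_map", "Update_rows", "Xid"]),
    ("db1", ["BEGIN", "Table_map", "Delete_rows", "Xid"]),
]
applied = run(groups, {"db1": 0, "db2": 1})
```

Because each queue has a single consumer, events of one database are applied in binlog order, while the two databases proceed independently.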

Important Data Structures

db_worker_hash_entry – maps a database name to a worker; stored in the Coordinator’s hash table, the Assigned Partition Hash (APH).

slave_job_item – an item in a worker’s job queue, containing a binlog event.

circular_buffer_queue – a ring buffer built on a dynamic array; the base class for several queues.

Slave_job_group – tracks a transaction’s metadata (log positions, worker ID, checkpoint info, completion flag, etc.).

Slave_committed_queue – a subclass of circular_buffer_queue that holds Slave_job_group objects; the Group Assigned Queue (GAQ) is of this type.

Slave_jobs_queue – each worker’s task queue, also a subclass of circular_buffer_queue.

Slave_worker – represents a worker thread; contains its job queue, a pointer to the Coordinator, and execution state.

Relay_log_info – the Coordinator’s structure (formerly that of the SQL thread), extended to hold the mapping tables, the worker array, pending‑job counters, the GAQ, and checkpoint configuration.
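To make the ring-buffer idea behind circular_buffer_queue concrete, here is a minimal sketch of a fixed-capacity queue over a preallocated array. It is not a port of the MySQL class: the field and method names are assumptions chosen for readability.

```python
class RingBuffer:
    """Fixed-capacity FIFO over a preallocated list (illustrative only)."""

    def __init__(self, capacity):
        self.slots = [None] * capacity
        self.capacity = capacity
        self.head = 0    # index of the oldest item
        self.length = 0  # number of stored items

    def enqueue(self, item):
        """Store item and return its slot index, or -1 when the buffer is full."""
        if self.length == self.capacity:
            return -1
        index = (self.head + self.length) % self.capacity
        self.slots[index] = item
        self.length += 1
        return index

    def dequeue(self):
        """Remove and return the oldest item, or None when the buffer is empty."""
        if self.length == 0:
            return None
        item = self.slots[self.head]
        self.slots[self.head] = None
        self.head = (self.head + 1) % self.capacity
        self.length -= 1
        return item

rb = RingBuffer(3)
first_indexes = [rb.enqueue(x) for x in ("a", "b", "c")]
overflow = rb.enqueue("d")   # buffer is full at this point
oldest = rb.dequeue()        # frees slot 0
wrapped = rb.enqueue("d")    # reuses the freed slot
```

Returning the slot index from `enqueue` mirrors the way the article describes workers tracking a "GAQ index": a stable index lets another thread find the same entry later without searching.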

Other Methods

map_db_to_worker() – maps a database to a worker.

get_least_occupied_worker() – selects the least loaded worker.

wait_for_workers_to_finish() – synchronizes workers before switching to serial execution.

append_item_to_jobs() – enqueues an event into a worker’s job queue.

mts_move_temp_table_to_entry() and mts_move_temp_tables_to_thd() – handle temporary table transfer.

Initialization

Compared with the single-threaded SQL thread, MTS initializes additional variables and starts worker threads through slave_start_workers(). That function sets up the Coordinator’s hash tables, the GAQ, and the worker structures, then calls slave_start_single_worker() once per worker. Each worker runs handle_slave_worker(), which repeatedly invokes slave_worker_exec_job() to process the events assigned to it.

Coordinator Dispatch

The Coordinator repeatedly calls exec_relay_log_event(), which reads the next event (next_event()) and applies it (apply_event_and_update_pos()). If MTS is enabled, get_slave_worker() determines the target worker.

Event classification:

B‑event – BEGIN/GTID (transaction start)

G‑event – contains database info (Table_map, Query)

P‑event – pre‑G events (int_var, rand, user_var, etc.)

R‑event – row events following G‑event

T‑event – COMMIT/ROLLBACK or XID (transaction end)
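The classification above can be expressed as a small lookup. The sketch below uses simplified string names for event types (an assumption made for readability; MySQL uses numeric type codes and C++ classes) and treats a Query event as B, T, or G depending on its statement text.

```python
P_EVENTS = {"Intvar", "Rand", "User_var"}
R_EVENTS = {"Write_rows", "Update_rows", "Delete_rows"}

def classify(event_type, query=""):
    """Return 'B', 'G', 'P', 'R' or 'T' for a simplified event description."""
    if event_type == "Gtid":
        return "B"
    if event_type == "Xid":
        return "T"
    if event_type == "Query":
        statement = query.strip().upper()
        if statement == "BEGIN":
            return "B"
        if statement in ("COMMIT", "ROLLBACK"):
            return "T"
        return "G"  # any other statement carries database information
    if event_type == "Table_map":
        return "G"
    if event_type in P_EVENTS:
        return "P"
    if event_type in R_EVENTS:
        return "R"
    raise ValueError(f"unclassified event type: {event_type}")

group = [("Query", "BEGIN"), ("Table_map", ""), ("Write_rows", ""), ("Xid", "")]
classes = [classify(t, q) for t, q in group]
```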

Dispatch logic:

B‑event: increment mts_groups_assigned, enqueue a new group in GAQ, store the event in curr_group_da (no DB info yet).

G‑event: use map_db_to_worker() to find or create a mapping; if the mapping already exists and is free, reuse it; otherwise resolve conflicts or create a new entry, possibly evicting unused mappings when the hash exceeds its soft limit.

Other events: use the last assigned worker.
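The G-event branch of the dispatch logic can be sketched as follows. This is a simplified model under stated assumptions, not the body of map_db_to_worker(): each mapping carries a usage counter, a conflict is reported to the caller by returning None instead of blocking, and eviction at the hash soft limit is not modelled. The `Dispatcher` class and its method names are hypothetical.

```python
class Dispatcher:
    """Toy model of the database-to-worker mapping kept by the Coordinator."""

    def __init__(self, n_workers):
        self.db_map = {}             # db name -> {"worker": id, "usage": count}
        self.load = [0] * n_workers  # active database references per worker

    def least_occupied(self):
        """Pick the worker with the fewest active database references."""
        return min(range(len(self.load)), key=lambda w: self.load[w])

    def map_db_to_worker(self, db, last_worker=None):
        """Return the worker id for db, or None if the caller must wait."""
        entry = self.db_map.get(db)
        if entry is None or entry["usage"] == 0:
            # New or free mapping: stay with the group's worker if it has one.
            worker = last_worker if last_worker is not None else self.least_occupied()
            self.db_map[db] = {"worker": worker, "usage": 1}
        elif last_worker is None or entry["worker"] == last_worker:
            entry["usage"] += 1      # same worker already owns this database
            worker = entry["worker"]
        else:
            return None              # database is busy on a different worker
        self.load[worker] += 1
        return worker

    def release(self, db):
        """Called when a group that used db has finished."""
        entry = self.db_map[db]
        entry["usage"] -= 1
        self.load[entry["worker"]] -= 1

d = Dispatcher(2)
w1 = d.map_db_to_worker("db1")                       # new mapping
w2 = d.map_db_to_worker("db2")                       # goes to the idle worker
conflict = d.map_db_to_worker("db1", last_worker=1)  # db1 is busy on worker 0
d.release("db1")
retry = d.map_db_to_worker("db1", last_worker=1)     # free now, so it moves
```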

When to Switch to Serial Execution

If a G‑event references more than MAX_DBS_IN_EVENT_MTS (16) databases or involves tables with foreign‑key dependencies, the group is executed serially on worker 0 after all other workers finish.
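The decision described here reduces to a simple predicate. In the sketch below, the constant comes from the text; the function name and the foreign-key flag are assumptions, since the article does not say how the dependency is detected.

```python
MAX_DBS_IN_EVENT_MTS = 16  # limit quoted in the text above

def choose_execution_mode(databases, has_foreign_key_dependency=False):
    """Return 'serial' when a group cannot be dispatched by database, else 'parallel'."""
    too_many = len(set(databases)) > MAX_DBS_IN_EVENT_MTS
    return "serial" if too_many or has_foreign_key_dependency else "parallel"

few = choose_execution_mode(["db1", "db2"])
many = choose_execution_mode([f"db{i}" for i in range(17)])
with_fk = choose_execution_mode(["db1"], has_foreign_key_dependency=True)
```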

Worker Execution

Workers process jobs via slave_worker_exec_job():

Dequeue an event.

Update worker‑local state (group parts, relay log positions, GAQ index).

Execute the event with do_apply_event_worker(), which ultimately calls each event’s do_apply_event().

If the event is a T‑event, call slave_worker_ends_group() to commit positions, update the corresponding Slave_job_group, and clear the worker’s group parts.

Update Coordinator statistics (pending jobs, memory usage).

Adjust overrun/underrun status.
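The first five steps above can be sketched as a single function. This is an illustrative model only: the `Worker` class, the tuple layout of a job, and the `stats` dictionary are assumptions, "executing" an event just records it, and the overrun/underrun adjustment is left out.

```python
from collections import deque

class Worker:
    def __init__(self):
        self.jobs = deque()    # queue of (gaq_index, event, is_t_event)
        self.group_parts = []  # events seen so far in the current group
        self.applied = []      # stand-in for the effect of do_apply_event()

def exec_one_job(worker, gaq, stats):
    """Process a single queued event; return False when the queue is empty."""
    if not worker.jobs:
        return False
    gaq_index, event, is_t_event = worker.jobs.popleft()  # 1. dequeue
    worker.group_parts.append(event)                       # 2. worker-local state
    worker.applied.append(event)                           # 3. "execute" the event
    if is_t_event:                                         # 4. group ends here
        gaq[gaq_index]["done"] = True
        worker.group_parts.clear()
    stats["pending_jobs"] -= 1                             # 5. Coordinator statistics
    return True

gaq = [{"done": False}]
stats = {"pending_jobs": 3}
w = Worker()
w.jobs.extend([(0, "Table_map", False), (0, "Write_rows", False), (0, "Xid", True)])
while exec_one_job(w, gaq, stats):
    pass
```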

Checkpoint Process

The Coordinator periodically runs mts_checkpoint_routine(), triggered either by elapsed time (slave_checkpoint_period) or by the number of dispatched groups (slave_checkpoint_group). It advances the low‑water mark (lwm), the position up to which every group has been applied with no gaps, by scanning the GAQ from its head and removing completed groups via Slave_committed_queue::move_queue_head(). The original article illustrates this flow with a diagram.
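Advancing the low-water mark can be sketched as popping finished groups from the head of a queue until the first unfinished one. The dictionary fields (`end_pos`, `done`) and the function name are assumptions for illustration; the real Slave_job_group carries more state.

```python
from collections import deque

def advance_low_water_mark(gaq, lwm):
    """Pop completed groups from the head of gaq; return the new low-water mark."""
    while gaq and gaq[0]["done"]:
        lwm = gaq.popleft()["end_pos"]
    return lwm

gaq = deque([
    {"end_pos": 100, "done": True},
    {"end_pos": 200, "done": True},
    {"end_pos": 300, "done": False},  # still running: blocks the mark
    {"end_pos": 400, "done": True},   # finished, but behind a gap
])
lwm = advance_low_water_mark(gaq, lwm=0)
```

The group ending at 400 stays in the queue even though it is complete, because the mark may only move across a contiguous run of finished groups.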

Stopping the Slave

Executing STOP SLAVE terminates both Coordinator and workers. The Coordinator first calls slave_stop_workers(), which signals each worker to stop, waits for them to finish, performs a final checkpoint, and releases resources (hash tables, GAQ, etc.). Workers stop after completing their current group.

Abnormal Termination

If a worker encounters an error, it signals the Coordinator, clears its job queue, and sets its status to NOT_RUNNING. The Coordinator then stops remaining workers without performing a final checkpoint. If the Coordinator itself is killed, it follows a similar procedure.

Recovery

After a normal or abnormal shutdown, the slave restarts by using the Coordinator and each worker’s recorded state to restore a consistent position before resuming parallel execution.
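One way to picture recovery is as gap filling: groups after the last checkpoint that no worker committed must be replayed before parallel execution resumes. The sketch below is a loose model under that assumption and does not reproduce the server's recovery code; the function name and the use of end positions as group identifiers are hypothetical.

```python
def groups_to_replay(group_positions, lwm, worker_committed):
    """Return groups after lwm, up to the highest committed one, that nobody committed."""
    if not worker_committed:
        return []
    high = max(worker_committed)
    return [
        pos for pos in group_positions
        if lwm < pos <= high and pos not in worker_committed
    ]

# Checkpoint recorded position 200; one worker later committed the group ending at 400.
gaps = groups_to_replay([100, 200, 300, 400, 500], lwm=200, worker_committed={400})
```

Here only the group ending at 300 is a gap; once it is replayed, the position is consistent up to 400 and normal dispatch can continue from there.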

Open Issues

MySQL 5.6 MTS dispatches at the database level, which can limit concurrency when only one database is used. A simple improvement is to dispatch by dbname + tablename. For hot‑spot tables where most events target a single table, a transaction‑level dispatch strategy could further increase parallelism, though it would require more extensive code changes.
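The effect of the dispatch key on concurrency can be shown by counting distinct keys in a workload. This is a toy measurement, not a proposed implementation; the workload and the function name are made up for illustration.

```python
def parallel_lanes(groups, key):
    """Count the distinct dispatch keys, an upper bound on useful workers."""
    return len({key(group) for group in groups})

# Each group is described by the (database, table) it modifies.
groups = [("shop", "orders"), ("shop", "items"), ("shop", "users"), ("shop", "orders")]

by_database = parallel_lanes(groups, key=lambda g: g[0])
by_table = parallel_lanes(groups, key=lambda g: (g[0], g[1]))
```

With a single database, database-level dispatch leaves one lane no matter how many workers are configured, while a dbname + tablename key yields three. If every group touched the same table, both keys would collapse to one lane, which is the hot-spot case the text mentions.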

Source: Database Kernel Monthly Report. Original: http://mysql.taobao.org/monthly/2015/08/09/
