How Spider Engine Enables Transparent Sharding and XA Transactions in MySQL/MariaDB
Spider is a MariaDB storage engine that adds built‑in sharding, table linking, XA‑compatible distributed transactions, and a pluggable architecture, allowing MySQL queries to be transparently routed to multiple backend nodes while preserving compatibility and offering performance‑optimised DirectSQL execution.
Spider is a special storage engine for MySQL/MariaDB that provides built‑in sharding. Integrated since MariaDB 10.0, it acts as a proxy, distributing tables across multiple backend nodes while remaining fully compatible with MySQL syntax.
1. Table Link
Spider implements the SQL/MED standard (ISO/IEC 9075‑9:2008). Creating a Spider table uses COMMENT or CONNECTION clauses to specify remote servers. Example:
CREATE TABLE s(id INT NOT NULL AUTO_INCREMENT, code VARCHAR(10), PRIMARY KEY(id));
CREATE TABLE s(id INT NOT NULL AUTO_INCREMENT, code VARCHAR(10), PRIMARY KEY(id))
ENGINE=SPIDER COMMENT 'host "192.168.0.1", user "user1", password "pwd1", port "3307"';When the Spider node accesses the table for the first time, it pulls metadata from the remote node and caches it locally. The system table spider_tables stores shard locations, enabling cross‑node joins.
2. Transactions
Spider supports both local and XA distributed transactions. It implements the XA interface, acting as the Transaction Manager (TM) while backend nodes act as Resource Managers (RM). The XA command sequence is:
XA START 'trx-id';
-- actual work
XA END 'trx-id';
XA PREPARE 'trx-id';
XA COMMIT 'trx-id';Transaction states are recorded in system tables spider_xa and spider_xa_members. The state machine moves through NOT YET → PREPARED → COMMITTED, rolling back if any participant fails.
3. Pluggable Engine
Spider is a MySQL pluggable storage engine that implements the handler interface. It does not store data itself; instead it forwards queries to backend nodes. Two access paths exist: the traditional MySQL query‑processing pipeline and a DirectSQL path that pushes the query directly to the shards, reducing double parsing and improving performance (≈50 % gain in tests).
4. Read/Write Flow
Spider follows MySQL’s iterator (volcano) execution model. SELECT, INSERT, DELETE, and UPDATE operations are translated into SQL statements sent to the appropriate shards.
SELECT
The server generates a SELECT that pulls rows from each shard, stores them in spider_db_result, and merges them via spider_db_fetch. Index scans use a priority queue to interleave rows from different partitions.
INSERT
Spider’s write_row builds an INSERT statement per shard. For bulk inserts, it groups rows by partition key and sends batched statements.
DELETE
DELETE is performed by first issuing a SELECT to obtain partition keys, then constructing DELETE statements for the relevant shards. Batch deletion is supported to reduce network overhead.
UPDATE
UPDATE follows the same pattern as DELETE, possibly issuing an extra DELETE/INSERT when the partition key changes.
In summary, Spider provides a transparent sharding middleware as a MySQL plug‑in, offering table linking, XA‑compatible distributed transactions, a pluggable engine architecture, and optimized read/write paths that avoid double parsing and leverage parallel execution on backend nodes.
Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
dbaplus Community
Enterprise-level professional community for Database, BigData, and AIOps. Daily original articles, weekly online tech talks, monthly offline salons, and quarterly XCOPS&DAMS conferences—delivered by industry experts.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
