Why MySQL Excels at Transactions but Falters in Analytics – DuckDB to the Rescue
The article examines why MySQL’s transaction layer remains robust while its analytical capabilities lag, outlines enterprise demands for simultaneous OLTP and OLAP workloads, introduces DuckDB’s lightweight yet powerful engine, describes AliSQL’s integration of DuckDB for full MySQL compatibility, and presents TPCH benchmark results showing a dramatic performance boost for analytical queries.
MySQL: Strong Transactions, Weak Analytics
MySQL has been one of the most successful database products for over two decades, thanks to its clear three‑layer architecture (connection, execution engine, storage engine) and thread‑based execution model, which give it high stability and performance for high‑concurrency, small‑transaction workloads typical of internet applications.
Transaction stability
Since the InnoDB engine matured, MySQL provides strong ACID guarantees, row‑level locking, and efficient concurrency control, making it the de‑facto standard for online transaction processing (OLTP) in the cloud era.
Analytical pain points
MySQL’s analytical processing (AP) capability is limited. Its storage engine (row‑store, B‑Tree) favors selective reads and does not handle large sequential scans well. The execution engine relies on scalar, waterfall processing, which cannot fully exploit modern SIMD‑friendly CPUs. The optimizer is tuned for OLTP workloads, so complex joins, inaccurate statistics, and huge tables often produce sub‑optimal plans.
Large scans pollute the buffer pool and evict hot OLTP data.
Long‑running queries hold resources, amplifying lock and scheduling contention.
CPU and I/O are consumed by analytical tasks, causing noticeable transaction latency jitter.
In production, these analytical queries can cripple transaction performance. Meanwhile, modern enterprises increasingly require a single database that can serve both OLTP and OLAP workloads.
User demand: mixed workloads
According to DB‑Engines, there are now more than 400 database products worldwide, and users expect three things from a database: support for mixed OLTP + OLAP workloads, resource isolation between the two, and full compatibility with existing MySQL applications.
Mixed OLTP & OLAP
Product data must be used for order processing and also for analytical reporting.
Order tables drive payment and fulfillment while also feeding business intelligence dashboards.
User‑behavior logs need to support real‑time decisions and offline analysis.
MySQL’s buffer pool caches both data and indexes; heavy OLAP scans consume large portions of memory, reducing OLTP hit rates. The optimizer’s join and aggregation strategies are not tuned for big‑table analytics, and joins such as hash join, nested loop join or sort‑merge join often degrade to inefficient plans.
Typical solutions copy transactional data to a separate analytical system via ETL, which introduces latency, complexity, and higher operational cost.
DuckDB: Small footprint, high compute power
DuckDB is an in‑process, embedded OLAP database often called “SQLite for analytics”. It is released under the MIT license.
https://github.com/duckdb/duckdbSmall size
The binary is only a few dozen megabytes. In a local test the DuckDB binary is 55 MiB, whereas MySQL 8.0’s binary is about 1.3 GiB – a 25× difference.
55M duckdb
1.3G mysql/binDuckDB has no separate server process, no cluster manager, and can be linked directly into an application, giving it a very low deployment and operational overhead.
High compute power
Columnar storage
Vectorized execution engine
Batch operator processing
CPU‑friendly memory layout
On typical analytical operators (scan, filter, aggregation, join) DuckDB fully utilizes CPU caches and SIMD instructions. Its design focuses on read‑intensive, compute‑heavy workloads rather than high‑concurrency writes, making it a natural complement to MySQL.
AliSQL + DuckDB: Meeting all three demands
AliSQL, the open‑source brand of Alibaba Cloud RDS MySQL, has been revived and now integrates DuckDB as an embedded analytical engine. The architecture provides:
One database instance with two engines (InnoDB for OLTP, DuckDB for OLAP) and physical isolation.
100 % MySQL protocol and syntax compatibility – applications need no code changes.
The following diagram illustrates the AliSQL‑DuckDB architecture:
Performance test with TPCH
The TPCH benchmark (v3.0.1, 100 GB) was run on an Alibaba Cloud RDS MySQL instance (8 Cores, 16 GB RAM, high‑performance SSD). DuckDB was used as a read‑only analytical engine.
Test method
TPCH version 3.0.1, scale factor 100 (≈100 GB).
22 analytical queries, no data updates.
Data generation
./dbgen -s 100 -f -vTable creation (MySQL)
CREATE TABLE region (
r_regionkey INT NOT NULL,
r_name CHAR(25),
r_comment VARCHAR(152),
PRIMARY KEY (r_regionkey)
) ENGINE=InnoDB;
-- (similar CREATE statements for nation, supplier, part, partsupp, customer, orders, lineitem) ...Data import
load data local infile '/data/tpch-v3/dbgen/region.tbl' into table region fields terminated by '|';
load data local infile '/data/tpch-v3/dbgen/nation.tbl' into table nation fields terminated by '|';
... (remaining tables) ...Sample query (Q1)
-- Q1
SELECT
l_returnflag,
l_linestatus,
sum(l_quantity) AS sum_qty,
sum(l_extendedprice) AS sum_base_price,
sum(l_extendedprice * (1 - l_discount)) AS sum_disc_price,
sum(l_extendedprice * (1 - l_discount) * (1 + l_tax)) AS sum_charge,
avg(l_quantity) AS avg_qty,
avg(l_extendedprice) AS avg_price,
avg(l_discount) AS avg_disc,
count(*) AS count_order
FROM lineitem
WHERE l_shipdate <= CAST('1998-09-02' AS date)
GROUP BY l_returnflag, l_linestatus
ORDER BY l_returnflag, l_linestatus;All 22 TPCH queries were executed on both engines. Results:
InnoDB (MySQL) took ~4 000 seconds for the 22 queries.
DuckDB completed the same 22 queries in 48.2 seconds, roughly two orders of magnitude faster.
Because the workloads and execution environments differ, the numbers are not a strict apples‑to‑apples comparison, but they clearly demonstrate DuckDB’s analytical advantage when co‑located with MySQL.
Conclusion
MySQL’s architecture makes it exceptionally reliable for transactional workloads, yet its design hampers analytical performance. Enterprises now demand a single system that can handle both OLTP and OLAP while preserving MySQL compatibility and operational stability. By embedding DuckDB, AliSQL delivers exactly that: resource‑isolated analytical processing, full MySQL protocol compatibility, and a dramatic speedup for complex queries without changing application code.
References
https://github.com/duckdb/duckdb
https://duckdb.org/pdf/SIGMOD2019-demo-duckdb.pdf
https://duckdb.org/2025/06/06/advanced-sorting-for-fast-selective-queries
https://github.com/alibaba/AliSQL/wiki
http://www.tpc.org/tpch
ITPUB
Official ITPUB account sharing technical insights, community news, and exciting events.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
