Databases 23 min read

Why MySQL Excels at Transactions but Falters in Analytics – DuckDB to the Rescue

The article examines why MySQL’s transaction layer remains robust while its analytical capabilities lag, outlines enterprise demands for simultaneous OLTP and OLAP workloads, introduces DuckDB’s lightweight yet powerful engine, describes AliSQL’s integration of DuckDB for full MySQL compatibility, and presents TPCH benchmark results showing a dramatic performance boost for analytical queries.

ITPUB
ITPUB
ITPUB
Why MySQL Excels at Transactions but Falters in Analytics – DuckDB to the Rescue

MySQL: Strong Transactions, Weak Analytics

MySQL has been one of the most successful database products for over two decades, thanks to its clear three‑layer architecture (connection, execution engine, storage engine) and thread‑based execution model, which give it high stability and performance for high‑concurrency, small‑transaction workloads typical of internet applications.

Transaction stability

Since the InnoDB engine matured, MySQL provides strong ACID guarantees, row‑level locking, and efficient concurrency control, making it the de‑facto standard for online transaction processing (OLTP) in the cloud era.

Analytical pain points

MySQL’s analytical processing (AP) capability is limited. Its storage engine (row‑store, B‑Tree) favors selective reads and does not handle large sequential scans well. The execution engine relies on scalar, waterfall processing, which cannot fully exploit modern SIMD‑friendly CPUs. The optimizer is tuned for OLTP workloads, so complex joins, inaccurate statistics, and huge tables often produce sub‑optimal plans.

Large scans pollute the buffer pool and evict hot OLTP data.

Long‑running queries hold resources, amplifying lock and scheduling contention.

CPU and I/O are consumed by analytical tasks, causing noticeable transaction latency jitter.

In production, these analytical queries can cripple transaction performance. Meanwhile, modern enterprises increasingly require a single database that can serve both OLTP and OLAP workloads.

User demand: mixed workloads

According to DB‑Engines, there are now more than 400 database products worldwide, and users expect three things from a database: support for mixed OLTP + OLAP workloads, resource isolation between the two, and full compatibility with existing MySQL applications.

Mixed OLTP & OLAP

Product data must be used for order processing and also for analytical reporting.

Order tables drive payment and fulfillment while also feeding business intelligence dashboards.

User‑behavior logs need to support real‑time decisions and offline analysis.

MySQL’s buffer pool caches both data and indexes; heavy OLAP scans consume large portions of memory, reducing OLTP hit rates. The optimizer’s join and aggregation strategies are not tuned for big‑table analytics, and joins such as hash join, nested loop join or sort‑merge join often degrade to inefficient plans.

Typical solutions copy transactional data to a separate analytical system via ETL, which introduces latency, complexity, and higher operational cost.

DuckDB: Small footprint, high compute power

DuckDB is an in‑process, embedded OLAP database often called “SQLite for analytics”. It is released under the MIT license.

https://github.com/duckdb/duckdb

Small size

The binary is only a few dozen megabytes. In a local test the DuckDB binary is 55 MiB, whereas MySQL 8.0’s binary is about 1.3 GiB – a 25× difference.

55M      duckdb
1.3G      mysql/bin

DuckDB has no separate server process, no cluster manager, and can be linked directly into an application, giving it a very low deployment and operational overhead.

High compute power

Columnar storage

Vectorized execution engine

Batch operator processing

CPU‑friendly memory layout

On typical analytical operators (scan, filter, aggregation, join) DuckDB fully utilizes CPU caches and SIMD instructions. Its design focuses on read‑intensive, compute‑heavy workloads rather than high‑concurrency writes, making it a natural complement to MySQL.

AliSQL + DuckDB: Meeting all three demands

AliSQL, the open‑source brand of Alibaba Cloud RDS MySQL, has been revived and now integrates DuckDB as an embedded analytical engine. The architecture provides:

One database instance with two engines (InnoDB for OLTP, DuckDB for OLAP) and physical isolation.

100 % MySQL protocol and syntax compatibility – applications need no code changes.

The following diagram illustrates the AliSQL‑DuckDB architecture:

AliSQL DuckDB architecture diagram
AliSQL DuckDB architecture diagram

Performance test with TPCH

The TPCH benchmark (v3.0.1, 100 GB) was run on an Alibaba Cloud RDS MySQL instance (8 Cores, 16 GB RAM, high‑performance SSD). DuckDB was used as a read‑only analytical engine.

Test method

TPCH version 3.0.1, scale factor 100 (≈100 GB).

22 analytical queries, no data updates.

Data generation

./dbgen -s 100 -f -v

Table creation (MySQL)

CREATE TABLE region (
  r_regionkey INT NOT NULL,
  r_name CHAR(25),
  r_comment VARCHAR(152),
  PRIMARY KEY (r_regionkey)
) ENGINE=InnoDB;

-- (similar CREATE statements for nation, supplier, part, partsupp, customer, orders, lineitem) ...

Data import

load data local infile '/data/tpch-v3/dbgen/region.tbl' into table region fields terminated by '|';
load data local infile '/data/tpch-v3/dbgen/nation.tbl' into table nation fields terminated by '|';
... (remaining tables) ...

Sample query (Q1)

-- Q1
SELECT
    l_returnflag,
    l_linestatus,
    sum(l_quantity) AS sum_qty,
    sum(l_extendedprice) AS sum_base_price,
    sum(l_extendedprice * (1 - l_discount)) AS sum_disc_price,
    sum(l_extendedprice * (1 - l_discount) * (1 + l_tax)) AS sum_charge,
    avg(l_quantity) AS avg_qty,
    avg(l_extendedprice) AS avg_price,
    avg(l_discount) AS avg_disc,
    count(*) AS count_order
FROM lineitem
WHERE l_shipdate <= CAST('1998-09-02' AS date)
GROUP BY l_returnflag, l_linestatus
ORDER BY l_returnflag, l_linestatus;

All 22 TPCH queries were executed on both engines. Results:

InnoDB (MySQL) took ~4 000 seconds for the 22 queries.

DuckDB completed the same 22 queries in 48.2 seconds, roughly two orders of magnitude faster.

Because the workloads and execution environments differ, the numbers are not a strict apples‑to‑apples comparison, but they clearly demonstrate DuckDB’s analytical advantage when co‑located with MySQL.

Conclusion

MySQL’s architecture makes it exceptionally reliable for transactional workloads, yet its design hampers analytical performance. Enterprises now demand a single system that can handle both OLTP and OLAP while preserving MySQL compatibility and operational stability. By embedding DuckDB, AliSQL delivers exactly that: resource‑isolated analytical processing, full MySQL protocol compatibility, and a dramatic speedup for complex queries without changing application code.

References

https://github.com/duckdb/duckdb

https://duckdb.org/pdf/SIGMOD2019-demo-duckdb.pdf

https://duckdb.org/2025/06/06/advanced-sorting-for-fast-selective-queries

https://github.com/alibaba/AliSQL/wiki

http://www.tpc.org/tpch

MySQLOLAPOLTPDuckDBTPCHAliSQL
ITPUB
Written by

ITPUB

Official ITPUB account sharing technical insights, community news, and exciting events.

0 followers
Reader feedback

How this landed with the community

Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.