
How Ctrip Built a Fast MySQL-Based Time‑Series Storage Engine (CFL)

This article details Ctrip's motivation, design, implementation, performance evaluation, and future plans for a custom MySQL storage engine called CFL that efficiently stores time‑series data by leveraging MySQL's replication and a sequential write‑optimized file format.

ITPUB

Introduction

This article is based on a talk originally presented at DTCC 2016. It describes Ctrip's experience building a dedicated time‑series storage engine (CFL) on top of MySQL to overcome the limitations of existing solutions.

Time‑Series Database Background

Time‑series data are records that each carry a timestamp. Typical use cases include monitoring data from power plants and industrial equipment, and server metrics such as CPU and disk usage. Existing systems such as Graphite, OpenTSDB, and InfluxDB already address this space, but each exposes its own interface and adds operational overhead.

Motivation for a Custom Engine

Ctrip needed a solution that could reuse existing MySQL operational expertise, avoid rewriting application code when switching databases, and provide high‑performance sequential writes without the complexity of external time‑series systems.

CFL Engine Architecture

The engine is implemented as a MySQL storage engine that plugs into the MySQL server. Applications continue to read/write to master or slave databases as usual; the CFL engine intercepts writes, stores data in its own file format, and relies on MySQL's native replication to propagate changes to slaves.

Key components:

Buffer queue that batches incoming rows before flushing to disk.

Separate index and data files; the index stores timestamps and control headers, while the data file holds the raw time‑series values.

Sequential write pattern ensures high insert throughput.
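
The three components above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not CFL's actual code: the class name BufferedAppender, the file names series.dat and series.idx, the max_rows threshold, and the index entry layout (timestamp, data offset, payload length) are all hypothetical.

```python
import os
import struct
import tempfile


class BufferedAppender:
    """Hypothetical sketch: queue rows in memory, then append them in one batch."""

    # Assumed index entry layout: timestamp, data offset, payload length.
    INDEX_ENTRY = struct.Struct("<qqI")

    def __init__(self, directory, max_rows=4):
        self.data_path = os.path.join(directory, "series.dat")
        self.index_path = os.path.join(directory, "series.idx")
        self.max_rows = max_rows
        self.queue = []      # pending (timestamp, payload) rows
        self.flushes = 0     # number of batches written so far

    def write_row(self, timestamp, payload):
        """Buffer one row; flush once the queue reaches max_rows."""
        self.queue.append((timestamp, payload))
        if len(self.queue) >= self.max_rows:
            self.flush()

    def flush(self):
        """Append all queued rows to the data file and the index file."""
        if not self.queue:
            return
        # Both files are opened in append mode, so every write is sequential.
        with open(self.data_path, "ab") as data, open(self.index_path, "ab") as index:
            offset = data.seek(0, os.SEEK_END)
            for timestamp, payload in self.queue:
                data.write(payload)
                index.write(self.INDEX_ENTRY.pack(timestamp, offset, len(payload)))
                offset += len(payload)
        self.queue.clear()
        self.flushes += 1

    def read_index(self):
        """Return every index entry as a (timestamp, offset, length) tuple."""
        if not os.path.exists(self.index_path):
            return []
        with open(self.index_path, "rb") as index:
            raw = index.read()
        size = self.INDEX_ENTRY.size
        return [self.INDEX_ENTRY.unpack_from(raw, i) for i in range(0, len(raw), size)]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as directory:
        appender = BufferedAppender(directory, max_rows=2)
        appender.write_row(1, b"cpu=10")
        appender.write_row(2, b"cpu=12")   # second row triggers a flush
        print(appender.read_index())
```

A larger max_rows means fewer, bigger sequential writes, at the cost of more rows held only in memory until the next flush.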

Design Details

Design goals focus on fast insertion and query, index‑data separation, sequential writes, and minimal transaction support (no ACID guarantees). The engine does not support updates; data are immutable once written. Recovery simply reads the latest complete file.

File layout: each entry is written in a fixed order, with the data block first, then the timestamp, and the control header last. Because the control header is updated only after the data and timestamp for a new entry are in place, the file stays consistent even after a crash.
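
One way to picture this write order is the sketch below. It is an interpretation, not CFL's on-disk format: the header layout (committed entry count, last timestamp), the entry layout, and the function names create, append, and recover are assumptions made for illustration. The crash_before_header flag simulates a crash between the timestamp write and the header update.

```python
import os
import struct
import tempfile

HEADER = struct.Struct("<qq")   # assumed: committed entry count, last timestamp
ENTRY = struct.Struct("<qqI")   # assumed: timestamp, data offset, payload length


def create(index_path, data_path):
    """Create an empty data file and an index file holding only the header."""
    with open(index_path, "wb") as index:
        index.write(HEADER.pack(0, -1))
    open(data_path, "wb").close()


def committed_end(index, count):
    """Byte offset in the data file just past the last committed payload."""
    if count == 0:
        return 0
    index.seek(HEADER.size + (count - 1) * ENTRY.size)
    _, offset, length = ENTRY.unpack(index.read(ENTRY.size))
    return offset + length


def append(index_path, data_path, timestamp, payload, crash_before_header=False):
    with open(index_path, "r+b") as index, open(data_path, "r+b") as data:
        count, _ = HEADER.unpack(index.read(HEADER.size))
        offset = committed_end(index, count)
        # Step 1: the data block.
        data.seek(offset)
        data.write(payload)
        data.flush()
        os.fsync(data.fileno())
        # Step 2: the timestamp entry.
        index.seek(HEADER.size + count * ENTRY.size)
        index.write(ENTRY.pack(timestamp, offset, len(payload)))
        index.flush()
        os.fsync(index.fileno())
        if crash_before_header:
            return  # simulated crash: the header still describes the old state
        # Step 3: the control header, which makes the new entry visible.
        index.seek(0)
        index.write(HEADER.pack(count + 1, timestamp))
        index.flush()
        os.fsync(index.fileno())


def recover(index_path, data_path):
    """Read back only the entries that the control header says are committed."""
    rows = []
    with open(index_path, "rb") as index, open(data_path, "rb") as data:
        count, _ = HEADER.unpack(index.read(HEADER.size))
        for _ in range(count):
            timestamp, offset, length = ENTRY.unpack(index.read(ENTRY.size))
            data.seek(offset)
            rows.append((timestamp, data.read(length)))
    return rows


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as directory:
        idx = os.path.join(directory, "t.idx")
        dat = os.path.join(directory, "t.dat")
        create(idx, dat)
        append(idx, dat, 100, b"cpu=1")
        append(idx, dat, 101, b"cpu=2", crash_before_header=True)
        print(recover(idx, dat))
```

In this sketch a reader trusts only what the header has committed, so bytes left behind by an interrupted write are ignored and later overwritten by the next successful append.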

Performance Evaluation

Benchmarks on a dual‑core virtual machine with SSD storage compare CFL with MyISAM and InnoDB under single‑, three‑, and six‑thread workloads. CFL shows higher insert throughput than both, especially when the buffer queue is sufficiently large.

Limitations & Constraints

No support for row updates.

Delete performance is lower than InnoDB.

Cannot ingest out‑of‑order timestamps (late‑arriving data is not handled).

Insertion uses a table‑level lock, limiting concurrency.

Requires a dedicated key_timestamp column for indexing.
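
Several of these constraints can be modeled together in a small in-memory sketch. It is not CFL's implementation: the class AppendOnlyTable and its methods are hypothetical, and the choice to accept a timestamp equal to the previous one is an assumption, since the source only says that out‑of‑order timestamps are rejected.

```python
import threading


class AppendOnlyTable:
    """Hypothetical model of an insert-only table keyed by key_timestamp."""

    def __init__(self):
        self._lock = threading.Lock()   # one lock for the whole table
        self._rows = []
        self._last_ts = None

    def insert(self, key_timestamp, value):
        # Table-level lock: concurrent inserts are serialized here.
        with self._lock:
            # Assumption: equal timestamps are allowed, older ones are not.
            if self._last_ts is not None and key_timestamp < self._last_ts:
                raise ValueError(
                    f"out-of-order timestamp {key_timestamp} < {self._last_ts}"
                )
            self._rows.append((key_timestamp, value))
            self._last_ts = key_timestamp

    def update(self, key_timestamp, value):
        # Rows are immutable once written.
        raise NotImplementedError("updates are not supported")

    def rows(self):
        with self._lock:
            return list(self._rows)
```

Because every insert takes the same lock, adding writer threads does not add write parallelism in this model, which matches the concurrency limit described in the list.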

Future Work

Add multi‑column (tag) indexes similar to InfluxDB.

Implement batch insert optimizations.

Improve delete efficiency.

Introduce multi‑threaded disk writes.

MySQL Handler Integration

The engine implements the MySQL Handler interface, providing essential DDL and DML functions:

DDL: ha_open, ha_close, create, drop_table, truncate.

DML: write_row, delete_row, scan (full‑table scan).

Index scan interfaces: index_read, index_next, enabling range queries.
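
The way index_read and index_next combine into a range query can be sketched in Python. The method names mirror the Handler functions listed above, but the signatures, the TimestampCursor class, and the range_query helper are simplified assumptions, not the MySQL C++ API.

```python
import bisect


class TimestampCursor:
    """Hypothetical cursor over rows sorted by ascending timestamp."""

    def __init__(self, timestamps, values):
        self.timestamps = timestamps
        self.values = values
        self.pos = len(timestamps)   # start positioned past the end

    def index_read(self, key):
        """Position on the first row with timestamp >= key and return it."""
        self.pos = bisect.bisect_left(self.timestamps, key)
        return self._current()

    def index_next(self):
        """Advance to the next row in index order and return it."""
        self.pos += 1
        return self._current()

    def _current(self):
        if self.pos < len(self.timestamps):
            return self.timestamps[self.pos], self.values[self.pos]
        return None   # end of index


def range_query(cursor, start, end):
    """Collect rows with start <= timestamp <= end."""
    rows = []
    row = cursor.index_read(start)
    while row is not None and row[0] <= end:
        rows.append(row)
        row = cursor.index_next()
    return rows
```

Since rows arrive in timestamp order and are never updated, the index stays sorted by construction, which is what makes a binary search for the starting position followed by a forward scan sufficient here.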

Handler objects are pooled per table; each session obtains its own Handler, allowing concurrent access while reusing handlers when idle.
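
A minimal sketch of this pooling behavior follows. The Handler and HandlerPool classes and the acquire and release method names are hypothetical and only model the reuse pattern described above, not MySQL's internal table cache.

```python
import collections
import threading


class Handler:
    """Hypothetical stand-in for a per-session storage engine handler."""

    def __init__(self, table):
        self.table = table


class HandlerPool:
    """Keeps idle handlers per table and hands them out to sessions."""

    def __init__(self):
        self._lock = threading.Lock()
        self._idle = collections.defaultdict(list)
        self.created = 0   # total handlers ever constructed

    def acquire(self, table):
        """Return an idle handler for the table, or create a new one."""
        with self._lock:
            idle = self._idle[table]
            if idle:
                return idle.pop()
            self.created += 1
            return Handler(table)

    def release(self, handler):
        """Mark a handler as idle so a later session can reuse it."""
        with self._lock:
            self._idle[handler.table].append(handler)
```

Two sessions that use the same table at the same time get distinct handlers, so the number of live handlers grows with concurrency and, for partitioned tables, with the number of partitions.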

Case Study

In one real‑world scenario, a table with 8K partitions caused memory pressure because MySQL 5.6 allocates a Handler per partition. Scaling back the partitioning reduced memory usage from an estimated 400 GB to within the available 128 GB.

Conclusion

CFL demonstrates that a MySQL‑based storage engine can efficiently handle time‑series workloads by exploiting sequential writes and existing MySQL replication, though it currently lacks update support and advanced indexing features.
