
Optimizing Time-Series Storage: Files, LSM Trees, and B‑Tree Strategies

This article examines the evolution and challenges of time‑series storage, compares file‑based, LSM‑tree, and B‑tree approaches, and proposes a vector‑based method to efficiently handle writes, reads, query semantics, dimensions, and aggregation for modern big‑data applications.

21CTO

Time-series models and graphs existed before computers, but the tooling around them only began to mature in the mid-1990s with the advent of MRTG. Their growth has been driven by three trends: "big data" that is about more than sheer volume, the growing number of distributed nodes created by virtualization and containers, and cloud services challenging Moore's law.

Time‑Series Storage Layer

People often ask themselves:

Is the store optimized for writes?

Is the store optimized for reads?

What query semantics must a database support to deliver the desired results?

If the answers to the first two questions are "yes," the storage layer must optimize both writes and reads simultaneously.

Consider writing multiple series to an indexed database, for example a SQL table with a primary key, a timestamp, and a value. The read path must then process a massive amount of data. Writing each series separately can lead to the worst-case pattern: high-frequency writes interleaved across series.

Developers commonly take one of these approaches:

Use files (e.g., RRD, Whisper)

Use stores backed by LSM trees (e.g., LevelDB, RocksDB, Cassandra)

Use B-tree-ordered key/value stores (e.g., BoltDB, LMDB)

Each of these methods has pros and cons. An alternative treats each time series as a vector of points, which reduces the design to four operations:

Create a new vector

Look up an existing vector

Append to the vector (O(1))

Read data from the vector
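The four operations above can be sketched as a small in-memory store. This is only an illustration of the interface, assuming a point is a (timestamp, value) pair; the class name `VectorStore` and its method names are hypothetical and do not come from any particular database.

```python
from typing import Dict, List, Optional, Tuple

Point = Tuple[int, float]  # assumed shape: (unix timestamp, value)


class VectorStore:
    """Toy store: each series name maps to an append-only list of points."""

    def __init__(self) -> None:
        self._vectors: Dict[str, List[Point]] = {}

    def create(self, name: str) -> None:
        # Create a new, empty vector; refuse to overwrite an existing one.
        if name in self._vectors:
            raise KeyError(f"series already exists: {name}")
        self._vectors[name] = []

    def find(self, name: str) -> List[Point]:
        # Look up an existing vector by name (a hash lookup here).
        return self._vectors[name]

    def append(self, name: str, ts: int, value: float) -> None:
        # Amortized O(1): Python lists over-allocate as they grow.
        self.find(name).append((ts, value))

    def read(self, name: str, offset: int = 0,
             count: Optional[int] = None) -> List[Point]:
        # Read a contiguous run of points, starting at a position.
        vec = self.find(name)
        end = len(vec) if count is None else offset + count
        return vec[offset:end]
```

A persistent implementation would have to replace the dictionary and lists with on-disk structures, which is where the file-based and tree-based trade-offs come in.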

Using Files

All databases are ultimately file-based, but here "file" means a series-as-a-file approach: one file per series. This yields efficient buffered appends and linear reads. Modern file systems (ZFS, ext4, XFS) index directory entries by hashed name, so locating a series file stays close to O(1) even in large directories.

Modern file systems have become complex, and handling many small files well requires specialized tuning. Without such tuning, queries on sparse data may traverse many empty pages, which also slows backup and batch jobs.
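A minimal sketch of the series-as-a-file idea follows. It assumes fixed-width 16-byte records (a 64-bit timestamp plus a 64-bit float); the `.tsd` extension and the function names are invented for illustration. RRD and Whisper use their own, more elaborate formats.

```python
import os
import struct
import tempfile
from typing import Iterable, Iterator, Tuple

# Assumed record layout: little-endian int64 timestamp + float64 value.
RECORD = struct.Struct("<qd")


def series_path(root: str, name: str) -> str:
    # One file per series; the ".tsd" extension is made up for this sketch.
    return os.path.join(root, name + ".tsd")


def append_points(root: str, name: str,
                  points: Iterable[Tuple[int, float]]) -> None:
    # Append mode plus Python's default buffering turns many small
    # writes into a few larger ones; the buffer is flushed on close.
    with open(series_path(root, name), "ab") as f:
        for ts, value in points:
            f.write(RECORD.pack(ts, value))


def read_points(root: str, name: str) -> Iterator[Tuple[int, float]]:
    # Linear read: walk the file front to back, one record at a time.
    with open(series_path(root, name), "rb") as f:
        while True:
            chunk = f.read(RECORD.size)
            if len(chunk) < RECORD.size:
                break
            yield RECORD.unpack(chunk)
```

Because records are fixed-width, the n-th point sits at byte offset `n * RECORD.size`, so a reader could also seek straight to a position instead of scanning from the start.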

Using Tree‑Based Storage

Tree-based key/value stores can address a single series, a single point, or a time slice. Regardless of key design, lookups are O(log n), and the resulting tables tend to contain many gaps.

For large‑scale storage, excessive points strain CPU caches; representing data as {series, timestamp} → {point} pairs becomes inefficient. For small datasets, open‑source time‑series stores try similar methods, but file‑based solutions are often simpler and faster unless massive reads are required. O(log n) lookups can be costly for read‑heavy workloads.
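To make the {series, timestamp} → {point} layout concrete, the sketch below uses a sorted Python list as a stand-in for a B-tree or LSM index. The class `SortedKV` is hypothetical; it only shows how a composite key keeps one series contiguous, so that a time slice costs one O(log n) seek followed by a sequential scan.

```python
import bisect
from typing import List, Tuple

Key = Tuple[str, int]  # (series, timestamp)


class SortedKV:
    """Sorted arrays standing in for an ordered key/value index."""

    def __init__(self) -> None:
        self._keys: List[Key] = []
        self._values: List[float] = []

    def put(self, series: str, ts: int, value: float) -> None:
        key = (series, ts)
        i = bisect.bisect_left(self._keys, key)  # O(log n) search
        if i < len(self._keys) and self._keys[i] == key:
            self._values[i] = value  # overwrite an existing point
        else:
            # list.insert is O(n); a real tree would insert in O(log n).
            self._keys.insert(i, key)
            self._values.insert(i, value)

    def scan(self, series: str, start: int,
             end: int) -> List[Tuple[int, float]]:
        # Time slice [start, end): seek to the first key, then walk.
        lo = bisect.bisect_left(self._keys, (series, start))
        hi = bisect.bisect_left(self._keys, (series, end))
        return [(k[1], v)
                for k, v in zip(self._keys[lo:hi], self._values[lo:hi])]
```

Note that every stored point repeats the series name in its key, which is one reason this representation becomes costly at scale.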

Write performance remains a challenge. Copy-on-write B-tree designs struggle with high-frequency random writes. Frequent small writes through mmap cause heavy page reads and drive fsync/msync I/O to high load. LSM trees may alleviate this, but merging and compaction are expensive operations.

Overall, trees offer a reasonable balance between write and read performance, which makes them a good complement to file-based storage.
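The LSM trade-off can be illustrated with a toy model: writes land in an in-memory table, full tables are flushed as immutable sorted runs, and compaction later merges the runs. `TinyLSM` is a hypothetical teaching sketch, not how LevelDB, RocksDB, or Cassandra are implemented; it omits the write-ahead log, levels, and tombstones.

```python
import bisect
import heapq
from typing import Dict, List, Optional, Tuple

Key = Tuple[str, int]       # (series, timestamp)
Entry = Tuple[Key, float]   # (key, value)


class TinyLSM:
    def __init__(self, memtable_limit: int = 4) -> None:
        self.memtable: Dict[Key, float] = {}
        self.runs: List[List[Entry]] = []  # sorted runs, newest last
        self.memtable_limit = memtable_limit

    def put(self, series: str, ts: int, value: float) -> None:
        # Writes only touch memory, so random keys cost no disk seeks.
        self.memtable[(series, ts)] = value
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self) -> None:
        # Freeze the memtable into one sorted, immutable run.
        if self.memtable:
            self.runs.append(sorted(self.memtable.items()))
            self.memtable = {}

    def get(self, series: str, ts: int) -> Optional[float]:
        key = (series, ts)
        if key in self.memtable:
            return self.memtable[key]
        for run in reversed(self.runs):  # newest run wins
            i = bisect.bisect_left(run, (key,))
            if i < len(run) and run[i][0] == key:
                return run[i][1]
        return None

    def compact(self) -> None:
        # Merge every run into one. Tagging entries with -age makes the
        # newest version of a duplicate key sort first, so it is kept.
        tagged = [[(key, -age, value) for key, value in run]
                  for age, run in enumerate(self.runs)]
        merged: List[Entry] = []
        for key, _, value in heapq.merge(*tagged):
            if not merged or merged[-1][0] != key:
                merged.append((key, value))
        self.runs = [merged] if merged else []
```

Reads get slower as runs accumulate, because `get` may probe each one, while `compact` rewrites all stored data to bring the run count back down. That rewrite is the intensive merge work the paragraph above refers to.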

Query Semantics

The essential semantics for time‑series storage are bulk and parallel reads; additional features like compressed reads, external implementations, and storage constraints are nice‑to‑have.

Dimensions

Dimensions are critical; a product that lacks them risks becoming obsolete. Packing all metadata into metric keys and relying on simple key/value lookups is suboptimal. At minimum, logical AND/OR operations are needed for cross-dimensional queries, e.g., "az=us-east-1 AND role=db-master AND (version=2.1 OR version=2.3)". Richer query languages further improve usability.
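One common way to support AND/OR across dimensions is an inverted index from each tag pair to the set of series that carry it. The sketch below assumes that approach; `TagIndex` and `example_query` are hypothetical names, and the query grouping, with the two version clauses ORed together and then ANDed with the rest, is an assumed reading of the example above.

```python
from collections import defaultdict
from typing import Dict, Set, Tuple


class TagIndex:
    """Inverted index: (dimension, value) -> set of series ids."""

    def __init__(self) -> None:
        self._postings: Dict[Tuple[str, str], Set[str]] = defaultdict(set)

    def add(self, series_id: str, tags: Dict[str, str]) -> None:
        for pair in tags.items():
            self._postings[pair].add(series_id)

    def match(self, key: str, value: str) -> Set[str]:
        # Return a copy so callers cannot mutate the index.
        return set(self._postings.get((key, value), set()))


def example_query(idx: TagIndex) -> Set[str]:
    # AND is set intersection (&); OR is set union (|).
    versions = idx.match("version", "2.1") | idx.match("version", "2.3")
    return (idx.match("az", "us-east-1")
            & idx.match("role", "db-master")
            & versions)
```

The result is a set of series identifiers, which the storage layer would then use to fetch the matching vectors.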

For the storage layer, it is desirable to support whatever additional semantics these queries need so that they can be optimized. Indexes either point to large, unrelated tuples or embed the data themselves, and both options slow writes. SSDs help, but a query may still have to read many pages, so data placement remains an issue.

Aggregation, Expiration, etc.

Time-based aggregation and data expiration are vital for data collected on a polling schedule, as seen in open-source tools like rrdtool and Graphite.

Open challenges include handling sparse series, compression/aggregation overhead on writes, and implementing aggregation and expiration as asynchronous strategies in a separate policy layer.
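As a rough sketch of what a separate policy layer might run in the background, the two functions below downsample raw points into fixed-width average buckets and drop points past a retention window. The function names and the choice of averaging are assumptions for illustration; rrdtool and Graphite each define their own consolidation and retention rules.

```python
from typing import Dict, List, Tuple

Point = Tuple[int, float]  # (unix timestamp, value)


def rollup_avg(points: List[Point], bucket_seconds: int) -> List[Point]:
    # Group points by bucket start time and average each bucket.
    # Buckets with no data are simply absent, which suits sparse series.
    acc: Dict[int, List[float]] = {}
    for ts, value in points:
        bucket = ts - ts % bucket_seconds
        total_and_count = acc.setdefault(bucket, [0.0, 0])
        total_and_count[0] += value
        total_and_count[1] += 1
    return [(b, total / n) for b, (total, n) in sorted(acc.items())]


def expire(points: List[Point], now: int,
           max_age_seconds: int) -> List[Point]:
    # Keep only points that are still inside the retention window.
    cutoff = now - max_age_seconds
    return [(ts, v) for ts, v in points if ts >= cutoff]
```

Running both steps outside the write path keeps appends cheap, at the cost of raw data lingering until the next policy run.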

