Can Distributed Logging Revamp Single-Node Database Performance?

This article examines the evolution of write-ahead logging in relational databases, compares MySQL and PostgreSQL implementations, highlights the scalability limits of centralized WAL designs, and explores how distributed logging with global timestamps, dependency tracking, and hole‑avoidance techniques could dramatically improve single‑node database throughput.

dbaplus Community

1. Write-Ahead Logging (WAL)

Write‑Ahead Logging is a common transaction‑log technique: a modification may reach the data files only after the corresponding log records have been persisted. A transaction can therefore commit quickly without forcing its dirty pages to disk, and after a crash the database recovers by replaying the log (REDO).
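The ordering rule can be illustrated with a small in-memory model. This is only a sketch: the `Wal` and `Page` types and every function name are hypothetical, and `wal_flush` merely stands in for a real write plus fsync of the log file.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical in-memory model; not taken from any real engine. */
typedef struct {
    uint64_t appended_lsn; /* end of the last record appended to the log buffer */
    uint64_t durable_lsn;  /* end of the last record known to be on stable storage */
} Wal;

typedef struct {
    uint64_t page_lsn; /* LSN of the latest log record that modified this page */
    bool dirty;
} Page;

/* Append a redo record of `len` bytes and stamp the page with its end LSN. */
static uint64_t wal_log_update(Wal *wal, Page *page, uint64_t len) {
    wal->appended_lsn += len;
    page->page_lsn = wal->appended_lsn;
    page->dirty = true;
    return wal->appended_lsn;
}

/* Stand-in for writing and syncing the log up to `lsn`. */
static void wal_flush(Wal *wal, uint64_t lsn) {
    assert(lsn <= wal->appended_lsn);
    if (lsn > wal->durable_lsn) {
        wal->durable_lsn = lsn;
    }
}

/* Commit only needs the log to be durable; the dirty page may stay in memory. */
static void txn_commit(Wal *wal, uint64_t commit_lsn) {
    wal_flush(wal, commit_lsn);
}

/* WAL rule: a page may be written to the data file only once its log is durable. */
static bool page_can_be_written(const Wal *wal, const Page *page) {
    return page->page_lsn <= wal->durable_lsn;
}
```

In this model a commit returns as soon as `txn_commit` has flushed the log, while the page is still dirty in memory. A background writer would check `page_can_be_written` before touching the data file.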

2. WAL in MySQL

MySQL uses a Mini‑Transaction (MTR) system in which log records are first written to a private buffer and later copied into a shared log buffer. The commit process involves two main steps: (1) acquiring log_sys->mutex and calculating the redo record length (implemented by prepare_write), and (2) copying the log records into the shared buffer (log_sys->buf). In high‑write workloads this mutex becomes a severe bottleneck.

/* Simplified MySQL MTR commit steps */
// 1. Compute redo length and acquire log_sys mutex
prepare_write(...);
// 2. Copy log record to shared log buffer
log_sys->buf = ...;

Shrinking this critical section to a few lines of code, so that only the bookkeeping runs under the lock, can improve throughput by up to 30%, as demonstrated in PostgreSQL.
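To make the cost of the long critical section concrete, here is a minimal sketch of the private-buffer-then-copy pattern described in this section. It is not InnoDB code: `SharedLog`, `MiniTxn`, and all function names are hypothetical, the buffer sizes are arbitrary, and a C11 `atomic_flag` spin lock stands in for log_sys->mutex.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <string.h>

#define SHARED_LOG_CAPACITY 4096
#define MTR_CAPACITY 256

/* Hypothetical shared log buffer; the flag stands in for log_sys->mutex. */
typedef struct {
    atomic_flag lock;
    size_t used;
    unsigned char buf[SHARED_LOG_CAPACITY];
} SharedLog;

/* Private per-mini-transaction buffer, filled without any locking. */
typedef struct {
    size_t len;
    unsigned char buf[MTR_CAPACITY];
} MiniTxn;

static void shared_log_init(SharedLog *log) {
    atomic_flag_clear(&log->lock);
    log->used = 0;
}

/* Buffer a redo record privately. Returns 0 on success, -1 if it does not fit. */
static int mtr_add_record(MiniTxn *mtr, const void *rec, size_t len) {
    if (len > MTR_CAPACITY - mtr->len) {
        return -1;
    }
    memcpy(mtr->buf + mtr->len, rec, len);
    mtr->len += len;
    return 0;
}

/* Take the lock, then copy the whole private buffer into the shared one.
 * Returns the start offset in the shared buffer, or -1 if it does not fit. */
static long mtr_commit(SharedLog *log, MiniTxn *mtr) {
    long start = -1;
    while (atomic_flag_test_and_set_explicit(&log->lock, memory_order_acquire)) {
        /* spin: every committing thread serializes here */
    }
    assert(log->used <= SHARED_LOG_CAPACITY);
    if (mtr->len <= SHARED_LOG_CAPACITY - log->used) {
        start = (long)log->used;
        memcpy(log->buf + log->used, mtr->buf, mtr->len); /* copy runs under the lock */
        log->used += mtr->len;
    }
    atomic_flag_clear_explicit(&log->lock, memory_order_release);
    if (start >= 0) {
        mtr->len = 0; /* private buffer can be reused */
    }
    return start;
}
```

Because the `memcpy` runs while the lock is held, the time each thread spends inside the lock grows with the size of its log records, which is why this design serializes heavily under write-intensive load.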

3. WAL in PostgreSQL

PostgreSQL reserves space for log records using a spin‑lock to minimize lock hold time. The algorithm updates byte positions and then releases the lock, allowing parallel log insertion.

SpinLockAcquire(&Insert->insertpos_lck);
startbytepos = Insert->CurrBytePos;
endbytepos = startbytepos + size;
prevbytepos = Insert->PrevBytePos;
Insert->CurrBytePos = endbytepos;
Insert->PrevBytePos = startbytepos;
SpinLockRelease(&Insert->insertpos_lck);

Parallel log insertion reduces transaction wait time, as shown in the accompanying diagrams (Figures 2a and 2b).
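The reason reservation enables parallelism is that the record copy happens after the lock is released, into a region no other inserter owns. The following sketch shows that split under simplifying assumptions: the `LogInsert` type and function names are hypothetical, a C11 `atomic_flag` replaces PostgreSQL's spin lock, and page headers, buffer wraparound, and flushing are ignored.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>
#include <string.h>

#define LOG_CAPACITY 4096

/* Hypothetical insert state, loosely modeled on the fields in the snippet above. */
typedef struct {
    atomic_flag insertpos_lck;
    uint64_t curr_byte_pos;
    uint64_t prev_byte_pos;
    unsigned char buf[LOG_CAPACITY];
} LogInsert;

static void log_insert_init(LogInsert *ins) {
    atomic_flag_clear(&ins->insertpos_lck);
    ins->curr_byte_pos = 0;
    ins->prev_byte_pos = 0;
}

/* Reserve `size` bytes: only position bookkeeping happens under the lock.
 * Returns 1 on success, 0 if the buffer is full. */
static int reserve_space(LogInsert *ins, uint64_t size,
                         uint64_t *start, uint64_t *prev) {
    int ok = 0;
    while (atomic_flag_test_and_set_explicit(&ins->insertpos_lck,
                                             memory_order_acquire)) {
        /* spin briefly; the holder only updates two integers */
    }
    assert(ins->curr_byte_pos <= LOG_CAPACITY);
    if (size <= LOG_CAPACITY - ins->curr_byte_pos) {
        *start = ins->curr_byte_pos;
        *prev = ins->prev_byte_pos;
        ins->curr_byte_pos = *start + size;
        ins->prev_byte_pos = *start;
        ok = 1;
    }
    atomic_flag_clear_explicit(&ins->insertpos_lck, memory_order_release);
    return ok;
}

/* Insert a record: the copy runs outside the lock, so inserters overlap. */
static int log_insert(LogInsert *ins, const void *rec, uint64_t size,
                      uint64_t *start) {
    uint64_t prev; /* a real record header would store this back-pointer */
    if (!reserve_space(ins, size, start, &prev)) {
        return 0;
    }
    memcpy(ins->buf + *start, rec, (size_t)size);
    return 1;
}
```

Compared with copying under the lock, the time spent holding the lock no longer depends on record size, only on a handful of integer updates.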

4. Problems of Centralized WAL Design

Centralized log managers become hot spots on multi‑core systems, limiting scalability. Even with PostgreSQL’s optimizations, high concurrency still leads to contention and uneven performance, illustrated in Figure 3.

5. Distributed Logging Concept

To alleviate the bottleneck, the log subsystem can be split into N independent log managers, enabling parallel log appends. This design raises three key challenges:

Global ordering: LSNs are no longer comparable across managers, requiring a system‑wide sequence number (SSN) that increments on each insert.

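Dependency tracking: With pre‑commit, a transaction releases its locks before its commit record is durable, so any transaction that read its data must not be acknowledged first, or dirty data would be exposed. Enforcing this across managers demands explicit dependency tracking.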

Log holes: Different managers flush at different rates, risking missing log records during recovery.
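The global-ordering and log-hole challenges can be seen together in a small model. This is a single-threaded sketch under stated assumptions: all names are hypothetical, SSNs are dense and start at 1, each manager keeps its records in ascending SSN order, and "durable" is just a counter rather than a real flush. A concurrent version would also need per-manager synchronization so that the order of SSN assignment matches the order of appends.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_MANAGERS 2
#define MAX_RECORDS 16

/* Hypothetical per-manager log: only the SSN of each record is modeled. */
typedef struct {
    uint64_t ssn[MAX_RECORDS]; /* SSNs appended to this manager, ascending */
    size_t appended;           /* number of records appended */
    size_t durable;            /* records [0, durable) have been flushed */
} LogManager;

typedef struct {
    atomic_uint_fast64_t next_ssn; /* system-wide sequence number source */
    LogManager mgr[NUM_MANAGERS];
} DistributedLog;

static void dlog_init(DistributedLog *d) {
    atomic_init(&d->next_ssn, 1);
    for (size_t m = 0; m < NUM_MANAGERS; m++) {
        d->mgr[m].appended = 0;
        d->mgr[m].durable = 0;
    }
}

/* Append one record to manager `m`, stamping it with a fresh global SSN. */
static uint64_t dlog_append(DistributedLog *d, size_t m) {
    assert(m < NUM_MANAGERS);
    LogManager *lm = &d->mgr[m];
    assert(lm->appended < MAX_RECORDS);
    uint64_t ssn = atomic_fetch_add(&d->next_ssn, 1);
    lm->ssn[lm->appended++] = ssn;
    return ssn;
}

/* Managers flush independently, which is what creates holes. */
static void dlog_flush(DistributedLog *d, size_t m) {
    assert(m < NUM_MANAGERS);
    d->mgr[m].durable = d->mgr[m].appended;
}

/* Recovery: merge the durable parts of all logs in SSN order and stop at the
 * first missing SSN. Returns the highest SSN with no hole before it. */
static uint64_t dlog_recoverable_ssn(const DistributedLog *d) {
    size_t cursor[NUM_MANAGERS] = {0};
    uint64_t expected = 1;
    for (;;) {
        int found = 0;
        for (size_t m = 0; m < NUM_MANAGERS; m++) {
            const LogManager *lm = &d->mgr[m];
            if (cursor[m] < lm->durable && lm->ssn[cursor[m]] == expected) {
                cursor[m]++;
                expected++;
                found = 1;
                break;
            }
        }
        if (!found) {
            return expected - 1;
        }
    }
}
```

If manager 0 holds SSNs 1 and 3 and has flushed, while manager 1 holds SSN 2 and has not, `dlog_recoverable_ssn` returns 1: record 3 is on disk but cannot be replayed safely because record 2 is missing.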

6. Potential Solutions

Replacing LSN with SSN provides a monotonic global timestamp. Checkpoint synchronization can merge logs from all managers, while careful scheduling (e.g., hashing transaction IDs to managers) balances load. To avoid log holes, the system can flush all buffers at transaction commit or checkpoint creation.
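Two of these ideas, routing by hashed transaction ID and flushing every buffer at commit, are simple enough to sketch. Everything here is an assumption made for illustration: the names are hypothetical, the multiplicative hash constant is just one common choice, and each manager is reduced to a pair of counters.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_MANAGERS 4

/* Hypothetical log manager reduced to record counters. */
typedef struct {
    uint64_t appended; /* records buffered so far */
    uint64_t durable;  /* records flushed so far */
} LogManager;

/* Route a transaction to a manager by hashing its ID, so that sequential
 * IDs spread across managers instead of clustering on one. */
static size_t manager_for_txn(uint64_t txn_id) {
    uint64_t h = txn_id * UINT64_C(0x9E3779B97F4A7C15);
    return (size_t)((h >> 32) % NUM_MANAGERS);
}

/* Append one record for `txn_id`; returns the manager that received it. */
static size_t log_append(LogManager mgrs[NUM_MANAGERS], uint64_t txn_id) {
    size_t m = manager_for_txn(txn_id);
    assert(m < NUM_MANAGERS);
    mgrs[m].appended++;
    return m;
}

/* At commit, flush every manager so that no earlier record from any log
 * is left behind in a buffer. */
static void commit_flush_all(LogManager mgrs[NUM_MANAGERS]) {
    for (size_t m = 0; m < NUM_MANAGERS; m++) {
        mgrs[m].durable = mgrs[m].appended;
    }
}

/* Returns 1 if any manager still has buffered, unflushed records. */
static int has_unflushed(const LogManager mgrs[NUM_MANAGERS]) {
    for (size_t m = 0; m < NUM_MANAGERS; m++) {
        if (mgrs[m].durable < mgrs[m].appended) {
            return 1;
        }
    }
    return 0;
}
```

Flushing all managers on every commit is the simplest way to rule out holes, but it makes each commit wait for N flushes instead of one, so a real design would likely batch or relax it.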

7. Conclusion

Hardware advances such as multi‑core CPUs and Non‑Volatile Memory have outpaced the original single‑processor, disk‑I/O‑centric DBMS designs. Distributed logging offers a promising path to exploit modern hardware, but requires robust global ordering, dependency management, and hole‑avoidance mechanisms before it can be widely adopted in production systems.
