Why Oracle Log File Sync Bottlenecks Appear and How to Eliminate Them
During high-concurrency flash-sale events, Oracle's log file sync wait became a performance bottleneck. This article analyzes the storage, OS, and Oracle Disk Manager factors involved, presents AWR metrics, walks through the tuning steps (including disabling adaptive log file sync and enabling ODM), and shows the measured latency reductions.
Introduction
Relational databases rely on ACID transactions, and Oracle writes redo information to the log before writing the changed data. Log writes are sequential I/O, while data writes are random I/O, which is far slower on mechanical disks, so a commit only has to wait for the redo write. That wait is the log file sync event: each committing session waits until LGWR has flushed its redo to disk. Under high-concurrency workloads such as flash-sale events, this wait can become a performance bottleneck.
Problem Observation
At midnight a flash‑sale started with tens of thousands of concurrent users. The connection pool was exhausted, CPU reached 100%, and many wait events appeared. A 15‑minute AWR snapshot showed:
Redo size 11.8 MB/s
~1 612 transactions per second
~4.3 × 10⁴ executions per second
Log file sync contributed 12.1 % of DB time
The AWR report also showed the average wait times: log file sync 44 ms, log file parallel write 9 ms.
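The same two averages can be read directly from the instance when no AWR report is at hand. This is a minimal sketch, assuming a session with SELECT privilege on the V$ views. The figures in v$system_event are cumulative since instance startup, so they only match an AWR interval if two samples are taken and differenced.

```sql
-- Average wait per event in milliseconds, cumulative since instance startup.
SELECT event,
       total_waits,
       ROUND(time_waited_micro / NULLIF(total_waits, 0) / 1000, 2) AS avg_wait_ms
FROM   v$system_event
WHERE  event IN ('log file sync', 'log file parallel write')
ORDER  BY event;
```

A log file sync average far above the log file parallel write average, as in the 44 ms versus 9 ms observed here, suggests that much of the commit wait is spent outside the physical redo write itself.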
Analysis and Measures
1. Storage Layer
LGWR writes to the online redo log. The storage SLA promised 0.5 ms write latency, but observed latency often reached 1-3 ms, especially for large redo writes. Switching to a faster storage system brought no noticeable improvement, because the underlying file system (VXFS) added lock contention (vx_rwsleep_rec_lock) that limited throughput.
2. Operating System Layer
Tracing LGWR with truss showed that it blocked in KAIO() and pwrite(). Normal pwrite latency is 0.0017 ms, but under load it could exceed 1.5 s. VXFS's vx_rwsleep_rec_lock() also caused blocking. DTrace was then used to capture the LGWR call stack and confirm the contention points.
3. Oracle Disk Manager (ODM) Background
ODM bypasses the file‑system cache and locks, allowing Oracle to perform direct I/O to raw volumes. This can deliver performance comparable to raw devices while still using manageable file‑system storage.
4. Enabling ODM
Enabling ODM improves write latency but removes the benefit of OS cache for physical reads. After enabling ODM, db file sequential read latency increased from ~2 ms to ~6 ms, though overall transaction throughput improved.
5. Effect of Enabling ODM
Log file parallel write latency dropped from ~1 ms to 0.3 ms, and log file sync latency fell from >1.5 ms to <1 ms. The increase in db file sequential read wait time did not noticeably affect application response.
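Averages can hide a long tail, so a latency histogram is a useful companion to the before/after figures. The following is a sketch only, assuming access to v$event_histogram; it is not taken from the original investigation. Run it before and after the change and compare the bucket counts.

```sql
-- Wait-time distribution for the three events discussed above.
-- wait_time_milli is the upper bound of each bucket, in milliseconds;
-- counts are cumulative since instance startup.
SELECT event,
       wait_time_milli AS bucket_upper_ms,
       wait_count
FROM   v$event_histogram
WHERE  event IN ('log file parallel write',
                 'log file sync',
                 'db file sequential read')
ORDER  BY event, wait_time_milli;
```

After a change such as enabling ODM, the redo-related events should shift toward the smaller buckets, while db file sequential read may shift the other way because reads no longer hit the OS cache.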
6. Other Influencing Factors
Additional considerations include process priority (e.g., setting _high_priority_processes for LGWR), log file switch frequency (adjusting log_buffer and online redo log size), and the impact of archive log size mismatches.
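Log switch frequency can be estimated from the log history. This is a minimal sketch, assuming the default retention of v$log_history covers the last 24 hours; the one-day window is an illustrative choice, not a value from the source.

```sql
-- Number of online redo log switches per hour and per redo thread
-- over the last 24 hours.
SELECT thread#,
       TRUNC(first_time, 'HH24') AS switch_hour,
       COUNT(*)                  AS log_switches
FROM   v$log_history
WHERE  first_time >= SYSDATE - 1
GROUP  BY thread#, TRUNC(first_time, 'HH24')
ORDER  BY thread#, switch_hour;
```

Hours with many switches during peak load are the ones where larger online redo logs are most likely to help.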
7. Recommended Configuration
16 MB ≤ log_buffer ≤ min(128 MB, max(AUTO_SIZE, 16 MB))
300 MB ≤ online redo log file size ≤ 1024 MB
AUTO_SIZE = (cpu_count/16) × (cpu_count × 128)
These ranges balance buffer size, strand count, and I/O characteristics for typical Oracle deployments.
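To compare an instance against these ranges, the inputs can be pulled from the data dictionary. This is a sketch under one loud assumption: the source gives no unit for the constant 128 in the AUTO_SIZE formula, and the query below treats it as kilobytes. If that assumption is wrong, the computed figure is wrong too.

```sql
-- AUTO_SIZE per the formula above, ASSUMING the constant 128 is in KB.
SELECT TO_NUMBER(value) AS cpu_count,
       (TO_NUMBER(value) / 16) * (TO_NUMBER(value) * 128) / 1024 AS auto_size_mb
FROM   v$parameter
WHERE  name = 'cpu_count';

-- Current log buffer size in MB (the parameter value is stored in bytes).
SELECT ROUND(TO_NUMBER(value) / 1024 / 1024, 1) AS log_buffer_mb
FROM   v$parameter
WHERE  name = 'log_buffer';

-- Current online redo log sizes, for comparison with the 300-1024 MB range.
SELECT group#, thread#, bytes / 1024 / 1024 AS size_mb, members, status
FROM   v$log
ORDER  BY thread#, group#;
```

Under the KB assumption, cpu_count = 64 gives (64 / 16) × (64 × 128 KB) = 32 MB, so the upper bound becomes min(128 MB, max(32 MB, 16 MB)) = 32 MB. Note that log_buffer is a static parameter: a new value has to be set with scope=spfile and takes effect only after an instance restart.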
Disabling Adaptive Log File Sync
When log file sync latency is high (7‑8 ms) while log file parallel write remains low (1‑3 ms), the adaptive mechanism may be the cause. To disable it, execute:
alter system set "_use_adaptive_log_file_sync"=false scope=both;
After disabling, log file sync average wait time typically drops from ~7 ms to ~3 ms, and the related counters in AWR fall to zero (the cumulative v$sysstat values simply stop increasing).
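Whether the setting is in effect can be checked with two queries. This is a hedged sketch: the first query reads undocumented X$ fixed tables and normally has to be run as SYS, and the exact set of redo synch statistics depends on the Oracle version, which is why the second query uses a pattern instead of fixed names.

```sql
-- Current value of the hidden parameter (run as SYS; X$ tables are undocumented).
SELECT i.ksppinm  AS parameter_name,
       v.ksppstvl AS current_value
FROM   x$ksppi  i
JOIN   x$ksppcv v ON v.indx = i.indx
WHERE  i.ksppinm = '_use_adaptive_log_file_sync';

-- Redo synch statistics, cumulative since startup. Sample twice and compare
-- the deltas to see whether polling-related counters are still moving.
SELECT name, value
FROM   v$sysstat
WHERE  name LIKE 'redo synch%'
ORDER  BY name;
```

Hidden parameters should be changed only after testing, and preferably with guidance from Oracle Support.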
Final Results
With ODM enabled, adaptive log file sync disabled, and storage/OS parameters tuned, the average log file sync wait time under peak load fell to ~2 ms, while other metrics (read IOPS, write IOPS, read response time) improved significantly. The overall database performance became stable enough to handle >70 k SQL executions per second and 3 k transactions per second during subsequent flash‑sale events.
dbaplus Community
