Understanding Hard Disk Architecture and Performance Optimization
The article explains the physical structure of hard disks and how seek time, rotational latency, and transfer rate determine IOPS and throughput. It then describes Linux I/O stack optimizations (caching and request scheduling) and design patterns such as append‑only writes and LSM‑trees that mitigate random‑access bottlenecks and improve overall storage performance.
Background
Over the past decade, CPU frequencies have exceeded 3 GHz and DDR4 memory is widespread, but traditional hard‑disk I/O speed remains a bottleneck for overall system performance. This article examines the physical structure of hard disks, key performance factors, and OS‑level optimizations.
Physical Structure of Hard Disks
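Hard disks consist of platters, a spindle motor, read/write heads, and an actuator arm. Data is stored on the platter surfaces in concentric tracks, and each track is divided into sectors (traditionally 512 B each). The tracks at the same radius across all platter surfaces form a cylinder. Modern disks use Zoned Bit Recording (ZBR), which places more sectors on the longer outer tracks, and Logical Block Addressing (LBA), which presents the disk as a linear sequence of blocks so that software does not need to know the variable sector count per track.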
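To make the relationship between cylinder/head/sector addresses and LBA concrete, the following is a minimal sketch of the classical conversion for a fixed geometry. The geometry values (16 heads, 63 sectors per track) and the function names are assumptions chosen for illustration. On a ZBR disk the sector count varies by zone, so the real mapping is done inside the drive and is not this simple.

```python
def chs_to_lba(cylinder, head, sector, heads_per_cylinder, sectors_per_track):
    """Classical fixed-geometry mapping; sectors are numbered from 1."""
    if not 1 <= sector <= sectors_per_track:
        raise ValueError("sector must be in 1..sectors_per_track")
    if not 0 <= head < heads_per_cylinder:
        raise ValueError("head must be in 0..heads_per_cylinder-1")
    return (cylinder * heads_per_cylinder + head) * sectors_per_track + (sector - 1)


def lba_to_chs(lba, heads_per_cylinder, sectors_per_track):
    """Inverse of chs_to_lba for the same assumed geometry."""
    cylinder, remainder = divmod(lba, heads_per_cylinder * sectors_per_track)
    head, sector_index = divmod(remainder, sectors_per_track)
    return cylinder, head, sector_index + 1


if __name__ == "__main__":
    HEADS, SECTORS = 16, 63  # assumed legacy geometry, not from the article
    lba = chs_to_lba(2, 3, 4, HEADS, SECTORS)
    print(lba, lba_to_chs(lba, HEADS, SECTORS))
```

Consecutive LBAs fill one track, then move to the next head in the same cylinder, and only then to the next cylinder, which is why sequential LBA access needs few arm movements.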
Factors Affecting Disk Performance
Disk service time consists of seek time, rotational latency, and data‑transfer time. Typical seek times are 3‑15 ms. Average rotational latency is the time for half a rotation and depends on RPM (e.g., 4.17 ms for 7200 RPM). For small requests, transfer time is usually negligible compared with the other two.
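The three components can be added up in a few lines. This is a rough sketch: the 9 ms seek time, the 4 KiB request size, and the 100 MB/s transfer rate are illustrative assumptions rather than figures from the article, and the function names are hypothetical.

```python
def avg_rotational_latency_ms(rpm):
    """Average wait is half a rotation: (60,000 ms per minute / rpm) / 2."""
    return 60_000.0 / rpm / 2


def transfer_time_ms(io_bytes, transfer_rate_mb_s):
    """Time to move io_bytes at a sustained rate given in MB/s (10^6 bytes/s)."""
    return io_bytes / (transfer_rate_mb_s * 1_000_000) * 1000


def service_time_ms(seek_ms, rpm, io_bytes, transfer_rate_mb_s):
    """Seek + average rotational latency + transfer."""
    return (seek_ms
            + avg_rotational_latency_ms(rpm)
            + transfer_time_ms(io_bytes, transfer_rate_mb_s))


if __name__ == "__main__":
    # Assumed example: 9 ms seek, 7200 RPM, 4 KiB request, 100 MB/s.
    print(f"rotation: {avg_rotational_latency_ms(7200):.2f} ms")
    print(f"transfer: {transfer_time_ms(4096, 100):.3f} ms")
    print(f"total:    {service_time_ms(9.0, 7200, 4096, 100):.2f} ms")
```

Under these assumptions the transfer term is a few hundredths of a millisecond, which illustrates why it is dominated by seek and rotation for small requests.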
Performance Metrics
Key metrics are IOPS (I/O operations per second) and throughput (data transferred per second). Random I/O is limited by seek and rotation, while sequential I/O is limited mainly by transfer bandwidth. Example random IOPS: 7200 RPM ≈ 76 IOPS, 15000 RPM ≈ 166 IOPS.
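The example IOPS figures can be approximated as 1000 ms divided by the per-request service time, ignoring transfer time. The sketch below assumes average seek times of 9 ms for the 7200 RPM disk and 4 ms for the 15000 RPM disk. Those seek times are assumptions that happen to reproduce figures close to the ones quoted, not values stated in the article.

```python
def random_iops(avg_seek_ms, rpm):
    """Approximate random IOPS, ignoring transfer time."""
    avg_rotational_latency_ms = 60_000.0 / rpm / 2
    return 1000.0 / (avg_seek_ms + avg_rotational_latency_ms)


def random_throughput_mb_s(iops, io_bytes):
    """Throughput achieved when every request pays the full random-access cost."""
    return iops * io_bytes / 1_000_000


if __name__ == "__main__":
    for rpm, seek_ms in ((7200, 9.0), (15000, 4.0)):  # assumed seek times
        iops = random_iops(seek_ms, rpm)
        mb_s = random_throughput_mb_s(iops, 4096)
        print(f"{rpm} RPM: {iops:.1f} IOPS, {mb_s:.2f} MB/s at 4 KiB per request")
```

With 4 KiB requests, a disk limited to tens or hundreds of IOPS delivers well under 1 MB/s of random throughput, far below what the same disk sustains sequentially. That gap is what the optimizations in the rest of the article target.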
Operating‑System Layer Optimizations
Linux processes a read request through several layers: VFS → specific filesystem (e.g., Ext2) → Page Cache → Generic Block Layer → I/O Scheduler → Block Device Driver → Physical device.
The VFS provides a uniform interface, while the page cache reduces I/O by caching frequently accessed data. The generic block layer abstracts hardware details, and the I/O scheduler (e.g., CFQ, Deadline, Noop) merges and reorders requests to minimize seek time.
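The merging and reordering step can be illustrated with a toy model. This is not how CFQ, Deadline, or Noop are implemented in the kernel; it is a simplified sketch that sorts pending requests by starting sector and coalesces adjacent or overlapping ones. The request representation and function names are hypothetical.

```python
from typing import List, Tuple

Request = Tuple[int, int]  # (start_sector, sector_count)


def merge_and_sort(requests: List[Request]) -> List[Request]:
    """Sort by start sector, then coalesce requests that touch or overlap."""
    merged: List[Request] = []
    for start, count in sorted(requests):
        if merged and start <= merged[-1][0] + merged[-1][1]:
            prev_start, prev_count = merged[-1]
            end = max(prev_start + prev_count, start + count)
            merged[-1] = (prev_start, end - prev_start)
        else:
            merged.append((start, count))
    return merged


def head_travel(start_sectors: List[int], head: int = 0) -> int:
    """Total distance (in sectors) the head moves when visiting starts in order."""
    travel = 0
    for sector in start_sectors:
        travel += abs(sector - head)
        head = sector
    return travel


if __name__ == "__main__":
    pending = [(100, 8), (0, 8), (108, 8), (8, 8)]
    dispatched = merge_and_sort(pending)
    print(dispatched)
    print(head_travel([s for s, _ in pending]),
          head_travel([s for s, _ in dispatched]))
```

In this toy case four requests collapse into two, and the simulated head travel drops from 408 sectors to 100. Real schedulers also have to bound how long any single request can wait, which this sketch ignores.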
Design Techniques Based on Disk I/O Characteristics
Because random access is slow, many systems use append‑only writes to turn random writes into sequential writes (e.g., HDFS, Kafka). For write‑heavy workloads, techniques such as log‑structured merge trees (LSM‑tree) buffer updates in memory and flush them to disk as sorted, sequentially written files that are merged in the background, which improves write throughput while still providing acceptable read performance.
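The LSM‑tree idea can be sketched in a few dozen lines. The class below is a deliberately simplified illustration and not the design of any particular system: in‑memory sorted lists stand in for the immutable files that a real implementation would write to disk sequentially, and there is no write‑ahead log, deletion marker, or leveled compaction. The name `TinyLSM` and its parameters are hypothetical.

```python
import bisect


class TinyLSM:
    def __init__(self, memtable_limit=4):
        self.memtable_limit = memtable_limit
        self.memtable = {}   # recent writes, mutable
        self.runs = []       # immutable sorted runs, oldest first

    def put(self, key, value):
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self._flush()

    def _flush(self):
        # A real store would write this sorted run to disk in one sequential pass.
        self.runs.append(sorted(self.memtable.items()))
        self.memtable = {}

    def get(self, key):
        if key in self.memtable:
            return self.memtable[key]
        for run in reversed(self.runs):  # newest run wins
            i = bisect.bisect_left(run, (key,))
            if i < len(run) and run[i][0] == key:
                return run[i][1]
        return None

    def compact(self):
        # Merge all runs into one, keeping the newest value for each key.
        merged = {}
        for run in self.runs:
            merged.update(run)
        self.runs = [sorted(merged.items())] if merged else []


if __name__ == "__main__":
    db = TinyLSM(memtable_limit=2)
    for k, v in (("a", 1), ("b", 2), ("a", 3), ("c", 4)):
        db.put(k, v)
    print(len(db.runs), db.get("a"))
    db.compact()
    print(len(db.runs), db.get("a"))
```

Writes never modify an existing run, so every flush is a sequential write. The cost moves to reads, which may need to check several runs until compaction merges them.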
Small‑file handling is optimized by merging many tiny files into large containers and by simplifying metadata (e.g., Taobao’s TFS stores many small files in a 64 MiB block, reducing inode usage).
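The container approach can be sketched as a large block plus a compact index from file ID to offset and length. This is an illustrative model only and does not reflect TFS's actual on‑disk format: the `BlockContainer` name is hypothetical, and a `bytearray` stands in for a block file on disk. The 64 MiB capacity mirrors the block size mentioned above.

```python
class BlockContainer:
    def __init__(self, capacity=64 * 1024 * 1024):
        self.capacity = capacity
        self.data = bytearray()  # stands in for one large block file
        self.index = {}          # file_id -> (offset, length)

    def append(self, file_id, payload):
        """Append a small file; return False if the block has no room left."""
        if len(self.data) + len(payload) > self.capacity:
            return False
        self.index[file_id] = (len(self.data), len(payload))
        self.data += payload     # always written at the end: sequential I/O
        return True

    def read(self, file_id):
        """One index lookup, then one contiguous read inside the block."""
        offset, length = self.index[file_id]
        return bytes(self.data[offset:offset + length])


if __name__ == "__main__":
    block = BlockContainer(capacity=1024)  # tiny capacity for the demo
    block.append(1, b"hello")
    block.append(2, b"small file two")
    print(block.read(2), block.index)
```

The filesystem sees a single large file per block, so it needs one inode instead of one per small file, and the per‑file metadata shrinks to an offset and a length that can be kept in memory.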
Conclusion
The article surveys hard‑disk physical characteristics, OS‑level optimizations, and practical design patterns used in open‑source systems to mitigate I/O bottlenecks, offering guidance for storage‑system architects.
Meituan Technology Team
Over 10,000 engineers powering China’s leading lifestyle services e‑commerce platform. Supporting hundreds of millions of consumers, millions of merchants across 2,000+ industries. This is the public channel for the tech teams behind Meituan, Dianping, Meituan Waimai, Meituan Select, and related services.