Why Do Seagate SAS Disks Slow Down on Exadata When Temperature Drops Below 20°C?
This case study examines an Exadata X3‑2 deployment where CALIBRATE tests revealed intermittent disk performance drops linked to low temperatures on Seagate SAS drives. It explains why moving disks between nodes can restore performance and why slightly lower MB/s readings in slots 0 and 1 are normal.
Case Overview
A customer with a temporary Exadata X3‑2 deployment hit an error at step 6 of the onecommand initialization, where the CALIBRATE command failed to assess storage‑node disk performance. The Oracle hardware engineer originally assigned could not provide support, so the customer turned to the author, an Oracle Exadata specialist, for analysis.
Problem Statement
Why do disks that previously performed poorly regain normal performance after being swapped to another storage node?
Is it normal for disks in slot 0 or slot 1 to occasionally show MB/s values a few tens lower than other disks?
Data Collected
The customer supplied onecommand error logs, sundiag logs showing a Medium Error on /dev/sdq, and screenshots in which MBPS was reported as 0 for certain disks.
Analysis of Question 1
The problematic disks were Seagate ST360057SSUN600G SAS drives. Log analysis showed that when the storage‑node temperature fell below 20 °C, these disks exhibited a noticeable performance dip. After moving the same disks to a node where the temperature stayed above 20 °C (or after rebooting the OS), performance returned to normal.
Two conditions were identified, either of which restores normal performance:
The disk temperature must remain at or above 24 °C continuously for at least 30 minutes.
The temperature must be above 20 °C and the operating system must be rebooted.
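The two conditions above can be expressed as a small check. This is a minimal sketch, not a tool from the original case: the `TempSample` type, the function names, and the idea of sampling temperature once per minute are all assumptions made for illustration. Only the thresholds (24 °C for 30 minutes, or above 20 °C plus a reboot) come from the analysis.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TempSample:
    """Hypothetical temperature reading; not a format from the source logs."""
    minute: int      # minutes since monitoring started
    celsius: float   # disk temperature at that minute


def recovered_by_warmup(samples: List[TempSample],
                        threshold: float = 24.0,
                        hold_minutes: int = 30) -> bool:
    """Condition 1: temperature stays at or above `threshold`
    continuously for at least `hold_minutes`."""
    run_start = None
    for s in sorted(samples, key=lambda x: x.minute):
        if s.celsius >= threshold:
            if run_start is None:
                run_start = s.minute          # a warm run begins here
            if s.minute - run_start >= hold_minutes:
                return True
        else:
            run_start = None                  # any dip resets the run
    return False


def recovered_by_reboot(current_celsius: float, rebooted: bool,
                        threshold: float = 20.0) -> bool:
    """Condition 2: temperature above `threshold` and the OS was rebooted."""
    return rebooted and current_celsius > threshold


def expect_normal_performance(samples: List[TempSample], rebooted: bool) -> bool:
    """Either condition is treated as sufficient."""
    if not samples:
        return False
    latest = max(samples, key=lambda x: x.minute)
    return (recovered_by_warmup(samples)
            or recovered_by_reboot(latest.celsius, rebooted))
```

Separating the two predicates keeps the thresholds easy to adjust if further observation refines them.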
Analysis of Question 2
Batch CALIBRATE runs across all storage nodes showed occasional slight MB/s reductions for slot 0 and slot 1, but all values stayed within Oracle’s accepted range.
Separate CALIBRATE runs targeting only slot 0 and slot 1 disks did not reveal any degradation, indicating the disks themselves were healthy.
Further testing with the hdparm utility on Cell02’s slot 0 and slot 1 produced average read speeds of 177.456 MB/s and 178.146 MB/s respectively, roughly 6% lower than other disks (≈189.6 MB/s). The difference is attributed to the fact that slots 0 and 1 host the operating system, whose own I/O consumes part of the disks’ bandwidth.
Sample hdparm results:
Slot 0 average: (177.06+178.58+175.69+178.64+177.31)/5 = 177.456 MB/s
Slot 1 average: (189.09+176.97+174.36+173.54+176.77)/5 = 178.146 MB/s
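The averages above, and their gap to the ≈189.6 MB/s seen on other disks, can be reproduced with a few lines of Python. This is a sketch under stated assumptions: the readings are the ones listed above, while the helper names and the regular expression are hypothetical. The pattern assumes the usual `hdparm -t` result line ending in `= <number> MB/sec`; adjust it if your hdparm build prints something different.

```python
import re
from statistics import mean

# Assumed shape of an hdparm -t result line: "... = 177.06 MB/sec"
HDPARM_RATE = re.compile(r"=\s*([\d.]+)\s*MB/sec")


def parse_rate(line):
    """Extract the MB/sec figure from one hdparm output line, or None."""
    match = HDPARM_RATE.search(line)
    return float(match.group(1)) if match else None


# Readings taken from the sample hdparm results in this article.
SLOT0 = [177.06, 178.58, 175.69, 178.64, 177.31]
SLOT1 = [189.09, 176.97, 174.36, 173.54, 176.77]
BASELINE = 189.6  # approximate read speed of the other disks, in MB/s


def summarize(readings, baseline=BASELINE):
    """Return (average MB/s, percentage below baseline)."""
    avg = mean(readings)
    gap_pct = (baseline - avg) / baseline * 100
    return round(avg, 3), round(gap_pct, 1)


if __name__ == "__main__":
    for name, readings in (("slot 0", SLOT0), ("slot 1", SLOT1)):
        avg, gap = summarize(readings)
        print(f"{name}: {avg} MB/s, {gap}% below baseline")
```

Both slots come out about 6% below the baseline, which is the gap the article attributes to operating-system I/O on those two disks.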
Conclusion
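1. The intermittent performance loss of the Seagate SAS disks is temperature‑related: it appears when the storage‑node temperature falls below 20 °C. Performance returns once the disks stay at or above 24 °C for at least 30 minutes, or once the OS is rebooted with the temperature above 20 °C.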
2. The occasional MB/s drop observed in slot 0 and slot 1 is normal, caused by OS I/O overhead, and remains within Oracle’s acceptable limits.
Further Thoughts
Three years earlier, the customer’s X3‑2 was fully refreshed with Hitachi drives after Oracle identified similar temperature‑induced issues with Seagate disks. Oracle and Seagate are collaborating on firmware updates to mitigate the temperature‑sensitivity of future SAS drives.
dbaplus Community
Enterprise-level professional community for Database, BigData, and AIOps. Daily original articles, weekly online tech talks, monthly offline salons, and quarterly XCOPS&DAMS conferences—delivered by industry experts.
