How to Diagnose and Fix Oracle RAC/ADG Performance Issues on CentOS 7
This article walks through a real‑world case of an Oracle RAC/ADG deployment on CentOS 7 that suffered severe performance degradation, detailing the root‑cause analysis, OS and listener tuning, patch installation, cache‑reclamation settings, and compatibility fixes to restore stability.
Background
A three‑node Oracle RAC had been running for three years when business growth caused the database to slow down. To offload peak‑time queries, a single‑node Active Data Guard (ADG) was added, but the new read‑only workload quickly triggered multiple failures and even caused the database to crash.
Problem 1 – Listener connection overload
After migrating the read‑only service, the listener received a flood of short connections, producing TNS errors. The root causes were:
OS limits too low (nproc);
Kernel pid_max too small;
Listener queue size insufficient for the high concurrency.
Fixes applied:
sed -i 's/4096/131072/g' /etc/security/limits.d/20-nproc.conf
Edited /etc/sysctl.conf to increase kernel.pid_max = 131072.
Adjusted the listener’s QUEUESIZE parameter (default 128) to 512 after confirming with strace -fo /tmp/queue.log lsnrctl start listener.
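A read-only sketch for verifying these settings after the change. The helper names (at_least, report, read_proc) are hypothetical and introduced only for illustration; the targets are the values quoted above. The net.core.somaxconn check is an addition that the original write-up does not mention: Linux silently truncates a listen() backlog to that value, so a QUEUESIZE of 512 can only take full effect when somaxconn is at least 512.

```shell
#!/bin/sh
# Read-only check of the limits discussed above. Nothing is modified.
NPROC_TARGET=131072     # nproc value from the sed command above
PID_MAX_TARGET=131072   # kernel.pid_max value from the article
QUEUESIZE=512           # listener QUEUESIZE from the article

# at_least CURRENT MINIMUM: succeeds when CURRENT >= MINIMUM.
# "unlimited" (as printed by ulimit) always passes; non-numeric input fails.
at_least() {
    if [ "$1" = "unlimited" ]; then
        return 0
    fi
    [ "$1" -ge "$2" ] 2>/dev/null
}

# report NAME CURRENT MINIMUM: print a one-line verdict.
report() {
    if at_least "$2" "$3"; then
        echo "OK   $1=$2"
    else
        echo "LOW  $1=$2 (want >= $3)"
    fi
}

# read_proc FILE: print the file's contents, or 0 if it cannot be read.
read_proc() {
    cat "$1" 2>/dev/null || echo 0
}

# ulimit -u is not available in every sh; fall back to 0 if it fails.
report "nproc" "$(ulimit -u 2>/dev/null || echo 0)" "$NPROC_TARGET"
report "kernel.pid_max" "$(read_proc /proc/sys/kernel/pid_max)" "$PID_MAX_TARGET"
report "net.core.somaxconn" "$(read_proc /proc/sys/net/core/somaxconn)" "$QUEUESIZE"
```

The nproc line reflects the limits of whichever user runs the script, so it is most meaningful when run as the account that owns the listener and database processes.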
Problem 2 – Library cache lock and ORA‑00600 errors
The ADG instance exhibited frequent library cache locks and ORA‑00600 errors (e.g., kgllkde‑bad‑lock, kss_get_type: bad control). The team installed specific Oracle PSU patches (e.g., 24385983, 18515268, 19180394, 17608518) and consulted the “library cache lock” reference note (Doc ID 34578.1) to select the necessary patches.
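One way to confirm that the listed patches are actually present in an Oracle home is to compare them against the OPatch inventory. The sketch below assumes that the output of opatch lsinventory contains each applied patch number as a separate word; missing_patches is a hypothetical helper name, and the patch numbers are simply the ones quoted above.

```shell
#!/bin/sh
# Patch numbers quoted in the article.
PATCHES="24385983 18515268 19180394 17608518"

# missing_patches: read inventory text on stdin and print every wanted
# patch number that does not appear in it as a whole word.
missing_patches() {
    inventory=$(cat)
    for id in $PATCHES; do
        echo "$inventory" | grep -qw "$id" || echo "$id"
    done
}

# Typical use on the database host (requires ORACLE_HOME to be set):
#   "$ORACLE_HOME/OPatch/opatch" lsinventory | missing_patches
```

An empty result means every number was found. A patch delivered inside a larger bundle may not be listed under its own number, so a reported gap is a prompt to look closer, not proof that the fix is absent.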
Problem 3 – CentOS 7 cache reclamation
High‑concurrency queries caused the OS to trigger aggressive filesystem cache reclamation, which drove up CPU usage and produced a flood of trace files. The default parameters were:
vm.dirty_ratio = 80
vm.dirty_background_ratio = 5
To reduce I/O stalls, the ratios were tuned down (e.g., vm.dirty_ratio = 10) and vm.min_free_kbytes = 10485760 (10 GiB) was set. A cron job was added to sync and drop the page cache, dentries, and inodes every 15 minutes:
*/15 * * * * sync && echo 3 > /proc/sys/vm/drop_caches
Problem 4 – Compatibility between CentOS 7 and Oracle 11gR2
Even after the above adjustments, nightly crashes persisted, generating hundreds of gigabytes of trace files. Investigation revealed incompatibilities between Oracle 11gR2 and CentOS 7. Migrating the ADG to a Red Hat 6 host with Oracle 11gR2 eliminated the crashes and stabilized resource usage.
Conclusion
The issues stemmed from a combination of OS‑level limits, listener configuration, missing Oracle patches, and an unsuitable OS version. By systematically adjusting kernel parameters, expanding listener queues, applying the correct patches, and moving to a compatible OS, the environment achieved stable performance and eliminated the observed failures.
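As a sketch of how the virtual-memory settings from Problem 3 might be audited on a host, the script below compares running kernel values with a wanted list written in sysctl.conf syntax. The helper names (proc_path, audit) are hypothetical, and the numbers are the ones this site chose, not general recommendations.

```shell
#!/bin/sh
# Wanted values in sysctl.conf syntax, taken from Problem 3.
WANTED='vm.dirty_ratio = 10
vm.min_free_kbytes = 10485760'

# proc_path KEY: print the /proc/sys path for a dotted sysctl key.
proc_path() {
    echo "/proc/sys/$(echo "$1" | tr '.' '/')"
}

# audit: print one line per wanted key, showing the running value and
# whether it matches. Unreadable keys are reported, not treated as fatal.
audit() {
    echo "$WANTED" | while IFS=' =' read -r key want; do
        have=$(cat "$(proc_path "$key")" 2>/dev/null || echo "unreadable")
        if [ "$have" = "$want" ]; then
            echo "match    $key = $have"
        else
            echo "differs  $key = $have (wanted $want)"
        fi
    done
}

audit
```

The script only reads from /proc/sys, so it can run as an unprivileged user and is safe to schedule alongside the existing cron entry.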
dbaplus Community