Mastering Oracle Data Guard: Design, Tips, and Common Pitfalls
This article provides a comprehensive guide to Oracle Data Guard, covering its disaster‑recovery concepts, RTO/RPO standards, hardware‑to‑application level design considerations, standby design options, sensitive data‑file operations, switchover/failover advice, SQL auditing with Snapshot Standby, and practical tips for building or rebuilding a Data Guard environment.
Oracle Data Guard Overview
Data Guard provides a physical standby database that mirrors the primary database, enabling planned SWITCHOVER and unplanned FAILOVER. It is a core component of Oracle’s Maximum Availability Architecture (MAA) alongside RAC.
Design Considerations for the Standby
Hardware
Standby hardware is often weaker; insufficient CPU, memory, or disk space can cause performance bottlenecks during switchover.
Hardware failures on the standby prevent timely failover.
System Layer
Operating‑system version mismatches (e.g., Red Hat 6 vs 4) may cause compatibility issues.
Kernel parameters that affect redo apply must be tuned on the standby.
Firewall rules should be identical on primary and standby.
Network Layer
Stable bandwidth is required; large redo files can be lost on poor networks.
Keep tnsnames.ora and listener.ora consistent between primary and standby, preferably using hostnames so only /etc/hosts needs updating.
Unified listener ports simplify failover procedures.
Remote management (iLO) should be reachable on the standby.
Database Layer
Ensure identical Oracle software versions and installation options.
Align the compatible parameter on both sites.
Provide a temporary tablespace on the standby; missing temp tablespaces cause query failures.
Validate DB links and SGA/memory parameters for consistency.
In Oracle 11g, special handling is required for the standby during the MOUNT phase.
Architecture Layer
Avoid running Data Guard and logical backups on the same host to prevent resource contention.
Multiple standbys on a single host increase failover complexity.
DG Broker automates configuration but cannot replace thorough validation; deleted archived logs on the primary may create gaps that the broker cannot detect.
Standalone standbys may need to serve additional roles (e.g., reporting).
Application Layer
Heavy query workloads on the standby can raise CPU load above the primary.
Typical Standby Design Options
One‑primary‑one‑standby in the same data center.
One‑primary‑two‑standbys (local and remote) for added resilience.
Active Data Guard allows read‑only access on the standby; Snapshot Standby enables read‑write operations for testing and SQL auditing.
Sensitive Data‑File Operations on the Standby
Creating a tablespace on the primary may stop the standby MRP if standby_file_management is set to MANUAL.
CREATE TABLESPACE testdata DATAFILE '/DATA/app/oracle/oradata/test04/testdata01.dbf' SIZE 100M;Set standby_file_management=AUTO to avoid this.
Setting tablespace ONLINE/OFFLINE on the primary is not propagated to the standby, leading to mismatched DBA_DATA_FILES entries.
Trailing spaces in data‑file names can cause silent corruption.
Dropping a datafile in Oracle 10.2.0.4 may raise ORA‑00600 (see Bug 5623467). Rebuilding the standby is the recommended fix.
Switchover and Failover Recommendations
Use hostnames in listener.ora and tnsnames.ora so only /etc/hosts needs updating during a role change.
Avoid online hostname changes on the primary.
Ensure identical firewall rules on both primary and standby.
SQL Auditing with Snapshot Standby
Snapshot Standby provides a read‑write standby that can be used to execute and validate SQL statements, exposing logical errors and performance issues that traditional error‑based auditing misses.
Practical Tips for Building or Rebuilding Data Guard
Prefer DG Broker for automated configuration; it handles most log_archive_dest settings.
If the primary loses archived redo, use an RMAN incremental backup to catch up instead of rebuilding the standby.
In multi‑standby environments, consider a Cascade Standby to reduce WAN traffic—only one standby receives redo directly from the primary.
Since Oracle 11g, limited cross‑platform Data Guard configurations are supported (see Oracle Doc ID 1602437.1).
Key Diagnostic Queries
Check DG Broker configuration: DGMGRL> SHOW CONFIGURATION; Query the Data Guard status view for errors:
SELECT TIMESTAMP, FACILITY, SEVERITY, MESSAGE
FROM V$DATAGUARD_STATUS
WHERE SEVERITY='ERROR'
AND TIMESTAMP > SYSDATE - INTERVAL '1' DAY;dbaplus Community
Enterprise-level professional community for Database, BigData, and AIOps. Daily original articles, weekly online tech talks, monthly offline salons, and quarterly XCOPS&DAMS conferences—delivered by industry experts.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
