
How to Rescue a Failed Exadata Database Using AMDU and DUL: Step‑by‑Step Guide

When an Exadata diskgroup fails to mount, this article walks through using AMDU to extract control, data, and log files, handling OMF and custom formats, advancing SCN, and finally opening the database, while also recommending preventive tools and regular maintenance.

dbaplus Community

Incident Overview

At 01:30, a critical failure on an Exadata X2 prevented the ASM diskgroup from mounting. The cell disk status was reported as “unknown”, the ASM disk headers were invalid, and several physical disks were damaged, affecting ~10 TB of data.

Pre‑Recovery Recommendations

Ensure recent backups and a Data Guard configuration before attempting recovery, especially on aging Exadata hardware.

Recovery Strategy

The recovery followed three phases: (1) extract the control, data and log files from the damaged diskgroup using AMDU, (2) attempt to mount and open the database, and (3) if the open fails, use DUL or related utilities for deeper recovery.

Extracting Files with AMDU

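AMDU (ASM Metadata Dump Utility) is bundled with Oracle 11g and later, so no separate build or installation is needed. It reads the disks directly and therefore works even when the diskgroup cannot be mounted.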

Locate the startup parameters (including control_files) in alert.log, where the instance records its non‑default parameters at each startup, and create a temporary pfile (e.g., /tmp/pfile).
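The parameter block can be copied out of the alert log by hand. As a rough sketch, it can also be pulled out with awk. The function name alert_to_pfile is made up for this example, and the sketch assumes the usual alert log layout: a header line reading "System parameters with non-default values:" followed by indented name = value lines. Review the resulting file before starting an instance with it.

```shell
# Hypothetical helper: print the most recent non-default parameter block
# of an alert log as pfile-style lines (assumes the usual alert log layout).
alert_to_pfile() {
  awk '
    # Each startup writes a new block; reset so only the last one is kept.
    /^System parameters with non-default values:/ { n = 0; grab = 1; next }
    # Indented "name = value" lines belong to the current block.
    grab && /^[ \t]+[A-Za-z_][A-Za-z0-9_]*[ \t]*=/ { line[++n] = $0; next }
    # Any other line ends the block.
    grab { grab = 0 }
    END {
      for (i = 1; i <= n; i++) {
        sub(/^[ \t]+/, "", line[i])
        print line[i]
      }
    }
  ' "$1"
}

# Usage: alert_to_pfile /path/to/alert_<SID>.log > /tmp/pfile
```

Keeping only the last block matters because an alert log usually spans many restarts, and the latest startup reflects the configuration that was in use when the failure occurred.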

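Identify the control file path from the pfile. In an ASM file name of the form name.number.incarnation, the first numeric field is the ASM file number (266 in this case). Then run:

amdu -diskstring 'o/*/*' -extract data.266

This creates DATA_266.f (the control file) and report.txt.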

Mount the database using the extracted control file.

Query the mounted instance to list data and redo log files:

select name from v$datafile;
select member from v$logfile;

Data files appear in two naming conventions:

OMF format – automatically generated numeric suffixes, e.g. +DATA/exdb/datafile/system.256.278946847955.

Custom format – user‑defined names, e.g. +DATA/exdb/datafile/users_2013084.dbf.
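The two naming conventions can be told apart mechanically, which helps when v$datafile lists hundreds of files. The following is a minimal sketch, not a tested recovery tool. The function names asm_file_number and amdu_commands are invented for this example, and the discovery string 'o/*/*' is an assumption that should be replaced by the one in use at your site. The commands are only printed, never executed.

```shell
# Hypothetical helper: print the ASM file number of an OMF-style path,
# or return non-zero for a custom (alias) name.
asm_file_number() {
  base=${1##*/}            # e.g. system.256.278946847955
  inc=${base##*.}          # last field: incarnation
  rest=${base%.*}          # e.g. system.256
  num=${rest##*.}          # second-to-last field: file number
  # Require two dots and all-digit fields; otherwise treat it as a custom name.
  case $rest in *.*) ;; *) return 1 ;; esac
  case $num$inc in ''|*[!0-9]*) return 1 ;; esac
  [ -n "$num" ] && [ -n "$inc" ] || return 1
  printf '%s\n' "$num"
}

# Hypothetical helper: read file paths on stdin and print (not run)
# one amdu command per OMF file. $1 is the disk discovery string.
amdu_commands() {
  diskstring=$1
  while IFS= read -r path; do
    group=${path#+}        # +DATA/exdb/... -> DATA/exdb/...
    group=${group%%/*}     # -> DATA
    if num=$(asm_file_number "$path"); then
      printf "amdu -diskstring '%s' -extract %s.%s\n" "$diskstring" "$group" "$num"
    else
      printf '# custom name, file number must come from the alias directory: %s\n' "$path"
    fi
  done
}

printf '%s\n' \
  '+DATA/exdb/datafile/system.256.278946847955' \
  '+DATA/exdb/datafile/users_2013084.dbf' | amdu_commands 'o/*/*'
```

Printing the commands first makes it possible to review the list and spread the extractions across several sessions before any long-running job is started.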

Extract OMF files with a command similar to:

amdu -diskstring 'o/*/*' -extract data.256

For custom‑named files the file number is not part of the name, so first extract the ASM alias directory (file 6 of the diskgroup, i.e. DATA.6), which maps alias names to file numbers:

amdu -extract DATA.6 -diskstring 'o/*/DATA'

After obtaining DATA_6.f, read its blocks with kfed to map file numbers to names, for example:

# Dump blocks 1-14 of the alias directory, keeping only the name and file-number fields
for i in $(seq 1 14); do
  kfed read DATA_6.f blknum=$i | grep -E 'name|fnum' >> aa.out
done
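The collected aa.out still has to be read as name and file-number pairs. As an illustration, the pairing can be done with awk. The function name map_aliases is invented here, and the sample lines, including the file number 270, are hypothetical. The sketch assumes kfed prints alias entries as kfade[N].name and kfade[N].fnum fields, with 4294967295 (0xffffffff) marking directory entries that have no file number.

```shell
# Hypothetical helper: turn kfed name/fnum lines into "fnum name" pairs.
map_aliases() {
  awk '
    # Remember the most recent alias name (an empty name leaves ";" in $2).
    /\.name:/ { name = ($2 == ";") ? "" : $2 }
    /\.fnum:/ {
      # Skip unnamed entries and directory entries (fnum 0xffffffff).
      if (name != "" && $2 != "4294967295") print $2, name
      name = ""
    }
  ' "$@"
}

# Illustrative input only; real values come from aa.out.
map_aliases <<'EOF'
kfade[0].name:                   DATAFILE ; 0x034: length=8
kfade[0].fnum:               4294967295 ; 0x064: 0xffffffff
kfade[1].name:          users_2013084.dbf ; 0x034: length=17
kfade[1].fnum:                      270 ; 0x064: 0x0000010e
EOF
# prints: 270 users_2013084.dbf
```

Against the real data the call would be map_aliases aa.out, and each resulting number can then be passed to amdu -extract in the same way as an OMF file number.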

Repeat the extraction for all discovered data files. A 3 TB bigfile required ~24 hours to extract over a slow NFS mount.

Validate the extracted files with dbv to detect physical bad blocks.
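With many extracted files, the dbv runs are easier to manage when generated from a list. The sketch below only prints the commands and offers a simple check of a finished log. The function names are invented for this example, the 8192-byte block size is an assumption that must match db_block_size, DATA_270.f is a hypothetical file name, and the log check assumes the usual dbv summary lines ("Total Pages Failing" and "Total Pages Marked Corrupt").

```shell
# Hypothetical helper: print (not run) one dbv command per extracted file.
# $1 is the database block size; remaining arguments are file names.
dbv_commands() {
  blocksize=$1
  shift
  for f in "$@"; do
    printf 'dbv file=%s blocksize=%s logfile=%s.dbv.log\n' "$f" "$blocksize" "$f"
  done
}

# Hypothetical helper: succeed only if a dbv log reports no failing or
# corrupt pages (assumes the usual "Total Pages ... : N" summary lines).
dbv_log_clean() {
  awk -F: '
    /Total Pages (Failing|Marked Corrupt)/ { bad += $2 }
    END { exit (bad > 0 ? 1 : 0) }
  ' "$@"
}

dbv_commands 8192 DATA_256.f DATA_270.f
```

After the runs finish, a loop such as dbv_log_clean DATA_256.f.dbv.log || echo "check DATA_256.f" flags the files that need closer inspection.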

Opening the Database

Attempt to open the mounted database. Common errors include ORA‑01555 (snapshot too old) and ORA‑00704 (bootstrap process failure). Required actions:

Gather initialization parameters from the pfile or alert log.

Recover missing rollback segments. Options: (a) search the system file with strings, or (b) use DUL/AUL/ODU/GDUL to dump SYS_UNDO.dmp, import it into a temporary user, and remove entries with status 1 or 2.

If ORA‑01555 persists, advance the SCN. Locate the SCN address with oradebug, add an offset (e.g., 1,000,000 decimal, converted to hex), write the new value back with oradebug poke, then repeatedly execute:

alter database open upgrade;

Optional low‑level fixes: modify corrupted blocks with bbed and delete stale dictionary records.
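The decimal-to-hex step is simple arithmetic but easy to get wrong by hand under pressure. A minimal sketch of the conversion follows. The function name advance_hex and the starting value 0x1a2b3c are hypothetical. The sketch does not handle overflow of the value, and how the result is laid out in memory depends on the Oracle version and platform.

```shell
# Hypothetical helper: add a decimal offset to a hex value, print the sum in hex.
advance_hex() {
  printf '0x%x\n' "$(( $1 + $2 ))"
}

# The example offset from the text, 1,000,000 decimal, expressed in hex.
printf '%x\n' 1000000        # prints f4240

# Adding that offset to a made-up current value read via oradebug.
advance_hex 0x1a2b3c 1000000
```

Doing the sum in one place avoids transcription errors between a calculator and the oradebug session, where a mistyped digit is written straight into instance memory.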

After applying these steps the database opened and data was recovered.

Additional Recovery Utilities

Exadata maintenance tools that can be run periodically (e.g., bi‑weekly) to detect hardware or firmware issues include:

sundiag

ExaWatcher

Diskinfo, IBCardInfo, Iostat, Netstat, Ps, Top, Vmstat

Exachk

CheckHWnFWProfile
