Recovering Deleted MySQL Data Files in CDH: A Step‑by‑Step Guide
This guide walks through diagnosing a MySQL metadata loss in a CDH cluster, locking the database, locating deleted file handles, restoring shared and independent tablespace files, fixing table structures, verifying consistency, and explains the underlying Linux and MySQL mechanisms.
Background
MySQL serves as the metadata store for CDH, holding Hive table definitions, partitions, data locations, and permissions. When the MySQL 5.5.6 replica failed and the primary data directory was mistakenly deleted, the entire big‑data platform became inoperable.
1. Lock the Database
Prevent further writes while recovering:
SET GLOBAL read_only = 1;</code>
<code>FLUSH TABLES WITH READ LOCK;2. Locate Deleted File Handles
Identify the data directory and list open file descriptors that still reference the deleted files:
SHOW VARIABLES LIKE 'datadir';</code>
<code>lsof | grep deleted | grep "/usr/local/mysql-5.5.56-linux-glibc2.5-x86_64/data" | awk '{print $2, $5, $10}' > /tmp/deleted_files.txt3. Recover Data Files
3.1 Restore shared tablespace files
# Recover ibdata1 (use the actual fd number)
cat /proc/$pid/fd/64 > /var/lib/data/ibdata1
chown mysql:mysql /var/lib/data/ibdata1
# Recover log file (e.g., ib_logfile0)
cat /proc/$pid/fd/65 > /var/lib/data/ib_logfile0
chown mysql:mysql /var/lib/data/ib_logfile03.2 Restore independent tablespace files
# Example: recover mydb.tab1.ibd
cat /proc/$pid/fd/66 > /var/lib/data/mydb/tab1.ibd
chown mysql:mysql /var/lib/data/mydb/tab1.ibd3.3 Restore table‑structure (.frm) files
# Copy .frm from backup (must match the original table definition)
cp /path/to/backup/mydb/tab1.frm /var/lib/data/mydb/
chown mysql:mysql /var/lib/data/mydb/tab1.frm4. Verify .frm and .ibd Consistency
For each restored table:
# 1. Discard the current tablespace (table must exist)
ALTER TABLE table_name DISCARD TABLESPACE;
# 2. Import the recovered tablespace; MySQL will automatically check that the .frm matches the .ibd
ALTER TABLE table_name IMPORT TABLESPACE;5. Technical Principles
5.1 Linux file‑deletion mechanism – rm only removes the directory entry; the file’s data remains on disk as long as a process holds an open file descriptor, which can be accessed via /proc/<pid>/fd.
5.2 MySQL’s reliance on file handles – InnoDB keeps .ibd files open while the server runs; after deletion the files appear as “deleted” but their contents stay until the handle is closed.
5.3 Version differences
MySQL 5.6+ enables innodb_file_per_table by default, using independent .ibd files.
MySQL 8.0+ removes .frm files; table definitions are stored in the system tablespace ( mysql.ibd).
6. Summary and Recommendations
The incident exposed the fragility of metadata management and gaps in operational procedures. Because CDH heavily depends on MySQL, loss of metadata cripples the cluster. Recommended safeguards include:
Treat MySQL as a “core production component” and apply strict access controls (e.g., sudo audit).
Implement multi‑level backups (local, cloud, binary logs).
Automate health checks such as tablespace integrity verification.
Upgrade legacy versions (MySQL 5.5) to avoid tool‑chain incompatibilities.
Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
dbaplus Community
Enterprise-level professional community for Database, BigData, and AIOps. Daily original articles, weekly online tech talks, monthly offline salons, and quarterly XCOPS&DAMS conferences—delivered by industry experts.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
