How I Rescued a Production MySQL Database After a Fatal rm -rf Disaster
After a mistaken rm -rf command wiped an entire production server, including its MySQL data, the author chronicles a step-by-step recovery using ext3grep, custom scripts, and binlog restoration, and sets out the lessons learned and best practices for handling future incidents.
Accident Background
A junior staff member was tasked with reinstalling Oracle on a production server. While attempting to uninstall, she executed rm -rf $ORACLE_BASE/* without setting the ORACLE_BASE variable, which effectively became rm -rf /*, deleting the entire filesystem, including Tomcat, MySQL databases, and other critical services.
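The unset-variable expansion described above can be guarded against in the shell itself. The following is a minimal sketch, not something from the original incident: the function name clean_oracle_base is hypothetical, and the checks shown are one reasonable choice rather than a complete safety net.

```shell
# Hypothetical guard, not from the original incident: refuse to delete
# unless ORACLE_BASE is set, non-empty, and an existing directory other than /.
clean_oracle_base() {
    if [ -z "${ORACLE_BASE:-}" ]; then
        echo "ORACLE_BASE is unset or empty; refusing to delete" >&2
        return 1
    fi
    if [ "$ORACLE_BASE" = "/" ] || [ ! -d "$ORACLE_BASE" ]; then
        echo "ORACLE_BASE='$ORACLE_BASE' is not a safe directory" >&2
        return 1
    fi
    # Quote the variable so a path containing spaces is not split into words.
    rm -rf -- "$ORACLE_BASE"/*
}
```

For a one-off command, the same idea can be written inline as rm -rf "${ORACLE_BASE:?}"/* so that the shell stops with an error, instead of expanding to /*, when the variable is unset or empty.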
The team discovered that offline backups were outdated (last good backup from December 2013) and only a 1 KB dump file existed, leaving the production system in a dire state.
Rescue with ext3grep
Searching online turned up ext3grep, a tool that can recover deleted files from ext3 filesystems. After unmounting the affected volume to prevent further writes, the team ran:
ext3grep /dev/vgdata/LogVol00 --dump-names
This listed all deleted files and paths, giving hope that the data could be restored.
Since ext3grep cannot restore by directory, they first attempted a full restore:
ext3grep /dev/vgdata/LogVol00 --restore-all
The disk ran out of space, so they fell back to restoring individual files, for example:
ext3grep /dev/vgdata/LogVol00 --restore-file var/lib/mysql/aqsh/tb_b_attench.MYD
To automate restoration of MySQL tables, they dumped the file list:
ext3grep /dev/vgdata/LogVol00 --dump-names > /usr/allnames.txt
They then filtered the MySQL-related entries into mysqltbname.txt and ran a shell script over that list:
while read -r LINE; do
    echo "begin to restore file $LINE"
    ext3grep /dev/vgdata/LogVol00 --restore-file "$LINE"
    if [ $? != 0 ]; then
        echo "restore failed, exit"
    fi
done < ./mysqltbname.txt
The script recovered about 40 files, but many MySQL table files were still missing.
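The filtering step is not shown in the original, so the following is only a sketch of how it might be done. It assumes the dumped name list holds one path per line and that the tables are MyISAM, as the .MYD file above suggests. The function names filter_mysql_files and restore_listed_files and the failed-list file are hypothetical. The second function is a variant of the loop above that records failed paths for a later retry.

```shell
# Hypothetical helper: from a list of paths (one per line), keep only the
# MyISAM table files (.frm, .MYD, .MYI) under the given data directory.
filter_mysql_files() {
    names_file=$1   # e.g. /usr/allnames.txt
    data_dir=$2     # e.g. var/lib/mysql/aqsh
    grep -F "$data_dir/" "$names_file" | grep -E '\.(frm|MYD|MYI)$'
}

# Hypothetical helper: restore every path in a list and write the paths
# that ext3grep could not restore to a separate file.
restore_listed_files() {
    device=$1       # e.g. /dev/vgdata/LogVol00
    list=$2         # e.g. ./mysqltbname.txt
    failed=$3       # file that collects paths whose restore failed
    : > "$failed"
    while IFS= read -r path; do
        [ -n "$path" ] || continue
        # </dev/null keeps the tool from consuming the list on stdin.
        ext3grep "$device" --restore-file "$path" </dev/null \
            || echo "$path" >> "$failed"
    done < "$list"
}
```

A possible invocation, using the paths from the article: filter_mysql_files /usr/allnames.txt var/lib/mysql/aqsh > mysqltbname.txt, followed by restore_listed_files /dev/vgdata/LogVol00 ./mysqltbname.txt ./failed.txt.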
Attempt with extundelete
They also tried extundelete:
extundelete /dev/vgdata/LogVol00 --restore-directory var/lib/mysql/aqsh
Unfortunately, the files it produced were corrupted and could not be used.
Binlog Recovery
Remembering that MySQL binlogs were enabled, the team located three binlog files (mysql-bin.000001, mysql-bin.000009, mysql-bin.000010). After a failed attempt with the first file, the last binlog (mysql-bin.000010) was applied successfully:
mysqlbinlog /usr/mysql-bin.000010 | mysql -uroot -p
This restored the most recent transactions, bringing the application back online.
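When more than one binlog has to be replayed, the order matters, and mysqlbinlog accepts several files in a single invocation. The sketch below assumes the default zero-padded mysql-bin.NNNNNN naming. The function names list_binlogs and replay_binlogs are hypothetical. It cannot fill gaps in the sequence, such as the missing files between 000001 and 000009 in this incident.

```shell
# Hypothetical helper: print the binlog files in a directory in sequence
# order. With zero-padded numbers, a lexical sort matches the write order.
list_binlogs() {
    dir=$1
    for f in "$dir"/mysql-bin.[0-9]*; do
        if [ -f "$f" ]; then
            printf '%s\n' "$f"
        fi
    done | sort
}

# Hypothetical helper: feed all binlogs to one mysql session, in order.
# Assumes the directory path contains no spaces (the list is word-split).
replay_binlogs() {
    dir=$1
    # shellcheck disable=SC2046
    mysqlbinlog $(list_binlogs "$dir") | mysql -uroot -p
}
```

Piping all files through one mysqlbinlog process into one mysql connection, instead of running one pipeline per file, keeps session state such as temporary tables intact across file boundaries.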
Post‑mortem and Lessons Learned
Never let an untrained person perform production maintenance without clear instructions and supervision.
Ensure automated backups are verified regularly; a 1 KB dump is insufficient.
Implement monitoring and alerting to detect anomalies early.
Never operate critical services as the root user; use principle‑of‑least‑privilege accounts.
The incident highlighted the importance of proper change management, reliable backups, and rapid incident response.
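The backup lesson above can be turned into a routine check. The sketch below is an illustration, not the team's actual procedure: the function name verify_dump and the size threshold are hypothetical. It relies on mysqldump writing a "-- Dump completed" trailer at the end of a finished dump, which is only the case when comments are not suppressed (for example by --skip-comments or --compact).

```shell
# Hypothetical check: a dump passes only if it exists, is at least
# min_bytes in size, and ends with mysqldump's completion trailer.
verify_dump() {
    dump=$1
    min_bytes=$2
    if [ ! -f "$dump" ]; then
        echo "missing dump: $dump" >&2
        return 1
    fi
    # $(( ... )) strips any padding that wc adds on some platforms.
    size=$(($(wc -c < "$dump")))
    if [ "$size" -lt "$min_bytes" ]; then
        echo "dump too small: $dump ($size bytes)" >&2
        return 1
    fi
    if ! tail -n 5 "$dump" | grep -q '^-- Dump completed'; then
        echo "dump looks truncated: $dump" >&2
        return 1
    fi
    return 0
}
```

A check like this would have flagged the 1 KB dump file immediately. It does not replace a periodic test restore into a scratch database, which is the only way to confirm that a backup is actually usable.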
21CTO
21CTO (21CTO.com) offers developers community, training, and services, making it your go‑to learning and service platform.
