Why Do Files Disappear After a Cloud VM Restart? Common Pitfalls and Fixes
This article outlines five typical cloud‑host file‑loss scenarios—including tmpfs data loss after reboot, accidental rm commands, and disk damage from fio or dd—explains their causes, and provides practical prevention and recovery recommendations for operations engineers.
In daily cloud‑host operations, engineers often face user‑reported file‑loss issues that can cause significant inconvenience. The following typical scenarios are summarized to help avoid these risks and prevent repeated mistakes.
Scenario 1: File loss after cloud VM restart
Phenomenon: Users report that data stored on the VM disappears after a reboot.
Analysis: The missing files were stored under /dev (which includes /dev/shm). On most modern Linux distributions /dev is a devtmpfs mount and /dev/shm is a tmpfs mount; both keep their contents in RAM rather than on disk, so a reboot clears everything written there.
Explanation:
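What is tmpfs? It is a temporary file system that keeps its contents in memory (RAM, and swap if available), which makes it fast but volatile.
Common RAM-backed mounts include /dev, /dev/shm, /sys/fs/cgroup, /run/user/0, etc. A tmpfs mount is limited to half of total RAM by default unless a size= option is given, and distributions often set smaller limits for individual mounts. Use df -hT (or df -t tmpfs) to identify tmpfs volumes.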
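As a rough illustration of the check described above, the following Python sketch decides whether a path sits on a RAM-backed mount by reading text in the /proc/mounts format. The function names and the set of volatile types (tmpfs, devtmpfs, ramfs) are assumptions made for this example, and it ignores the octal escapes /proc/mounts uses for spaces in mount points.

```python
import os

# Assumed set of file-system types whose contents live in RAM.
VOLATILE_TYPES = {"tmpfs", "devtmpfs", "ramfs"}


def parse_mounts(mounts_text):
    """Return (mount_point, fs_type) pairs from /proc/mounts-style text."""
    entries = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 3:
            entries.append((fields[1], fields[2]))
    return entries


def volatile_mount_for(path, mounts_text):
    """Return the RAM-backed mount point that contains path, or None."""
    path = os.path.normpath(path)
    best = None  # (mount_point, fs_type) of the longest matching mount
    for mount_point, fs_type in parse_mounts(mounts_text):
        prefix = mount_point.rstrip("/") + "/"
        if path == mount_point or path.startswith(prefix):
            # Later entries with the same mount point shadow earlier ones.
            if best is None or len(mount_point) >= len(best[0]):
                best = (mount_point, fs_type)
    if best is not None and best[1] in VOLATILE_TYPES:
        return best[0]
    return None
```

On a live host the second argument would typically be the contents of /proc/mounts; a non-None result means data written to that path will not survive a reboot.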
Suggestion:
Use tmpfs only for program or application caches.
Avoid storing persistent data on tmpfs unless data loss is acceptable.
Scenario 2: Accidental rm command execution
Phenomenon: After a reboot, the VM does not respond to ping, critical services fail to start, and essential system files are missing.
Analysis:
Console shows the OS stuck at boot with many "command not found" errors.
Inspecting /root/.bash_history reveals destructive commands such as rm -rf /, rm -rf ./*, etc., which delete large portions of the system.
Running rpm -Va | grep missing lists packaged files that no longer exist on disk; files that were not installed from an RPM package cannot be checked this way.
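To gauge how much of the system was deleted, the verification output can be summarized instead of read line by line. The sketch below assumes the usual rpm -Va layout, in which a missing file is reported as a line starting with the word missing, optionally followed by an attribute marker such as c, and then the path. The function names are made up for this example.

```python
from collections import Counter


def missing_files(rpm_va_output):
    """Return paths that rpm -Va reports as missing."""
    paths = []
    for line in rpm_va_output.splitlines():
        # Only lines whose status field is "missing" count; a changed file
        # whose path merely contains "miss" is ignored.
        if line.startswith("missing"):
            slash = line.find("/")
            if slash != -1:
                paths.append(line[slash:].rstrip())
    return paths


def count_by_top_dir(paths):
    """Count missing files per top-level directory, e.g. /usr or /etc."""
    return Counter("/" + p.split("/")[1] for p in paths)
```

Matching on the status field rather than on any occurrence of "miss" avoids false hits on file names, and the per-directory count gives a quick view of which parts of the system need to be restored.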
Suggestion:
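Use rm cautiously. The -i flag asks for confirmation before each removal, but a -f that appears later on the command line overrides it, so rm -i -rf does not prompt. For recursive deletes, consider -I (a single prompt) without -f, and check the target path and current directory before running the command.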
If accidental deletion occurs, stop any further writes to the disk immediately to preserve recoverable data.
Restore missing system files by copying them from a healthy machine or reinstalling packages with yum reinstall <package> or yum update <package>.
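One way to make the "use rm cautiously" advice concrete is a small pre-check in cleanup scripts that refuses obviously dangerous targets before any recursive delete runs. The following is a minimal sketch; the protected-path list and the function name are assumptions for illustration, not a complete safeguard.

```python
import os

# Assumed list of paths that a cleanup script should never delete.
PROTECTED = {
    "/", "/bin", "/boot", "/dev", "/etc", "/lib", "/lib64",
    "/proc", "/root", "/sbin", "/sys", "/usr", "/var",
}


def refuse_reason(target, cwd):
    """Return why a recursive delete of target looks unsafe, or None.

    cwd is the absolute directory the delete would run from.
    """
    if any(ch in target for ch in "*?["):
        return "unexpanded wildcard in target: " + target
    # Resolve relative targets and collapse "..", "." and repeated slashes.
    resolved = "/" + os.path.normpath(os.path.join(cwd, target)).lstrip("/")
    if resolved in PROTECTED:
        return "protected system path: " + resolved
    return None
```

A script would call refuse_reason for every target and abort when it returns a message. The check resolves relative paths against the working directory because a harmless-looking target such as ".." or "./bin" becomes destructive when the command is run from the wrong place.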
Scenario 3: File‑system damage caused by fio
Phenomenon: Users report MySQL database anomalies on the cloud host.
Analysis:
High iowait was observed at first, which suggested that the disk was hitting its I/O limits.
Even after lifting disk limits, MySQL continues crashing; logs show corrupted data files.
Investigation reveals that the fio tool was used directly on the block device /dev/vdb, bypassing the file system and overwriting real data, leading to severe corruption.
The affected /data volume on both primary and standby MySQL nodes became unreliable and required reformatting and data restoration from backups.
Suggestion:
Use fio in production only after thorough testing elsewhere.
When running fio write tests, do not point them at a raw block device that holds data. Instead, target a regular file on a mounted file system with the --filename option and set --size; fio creates the file if it does not exist.
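The recommendation can be enforced in tooling by building the fio command line in one place and rejecting block devices there. The sketch below only constructs the argument list and does not run fio; the job name, the default size and runtime, and the injectable is_blockdev parameter are assumptions chosen for the example.

```python
import os
import stat


def is_block_device(path):
    """True if path exists and is a block device (symlinks are followed)."""
    try:
        return stat.S_ISBLK(os.stat(path).st_mode)
    except FileNotFoundError:
        return False


def fio_file_test_argv(test_file, size="1G", runtime_s=60,
                       is_blockdev=is_block_device):
    """Build a fio command that benchmarks a regular file, never a device."""
    if is_blockdev(test_file):
        raise ValueError("refusing to run fio against block device " + test_file)
    return [
        "fio",
        "--name=filetest",
        "--filename=" + test_file,
        "--size=" + size,          # fio lays out a file of this size
        "--rw=randrw",
        "--bs=4k",
        "--ioengine=libaio",
        "--direct=1",
        "--runtime=" + str(runtime_s),
        "--time_based",
        "--group_reporting",
    ]
```

For example, fio_file_test_argv("/data/fio-test/testfile") yields a command that writes only inside that file, while passing a path such as /dev/vdb on a host where it is a block device raises ValueError before anything is executed.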
Scenario 4: File‑system damage caused by dd
Phenomenon: Users report abnormal file access on the cloud host.
Analysis:
Files on the data volume become inaccessible, and ls reports "Structure needs cleaning" errors.
System logs (dmesg, /var/log/messages) show numerous XFS errors.
Root's command history reveals the use of dd to write zeros directly to /dev/vdb, damaging the underlying block device.
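Because the root cause in Scenarios 2 through 4 was found in the shell history, a quick scan of that history can speed up triage. The sketch below flags the three command shapes described in this article: recursive rm on / or a wildcard, dd writing to a device, and fio targeting a device. The rules and names are assumptions for illustration and will miss variants such as commands hidden inside scripts.

```python
import shlex

DANGEROUS_RM_TARGETS = {"/", "/*", "./*", "*"}


def classify(command):
    """Return a short reason if the command looks destructive, else None."""
    try:
        tokens = shlex.split(command)
    except ValueError:  # unbalanced quotes and similar
        return None
    if tokens and tokens[0] == "sudo":
        tokens = tokens[1:]
    if not tokens:
        return None
    prog, args = tokens[0], tokens[1:]
    if prog == "dd":
        for arg in args:
            if arg.startswith("of=/dev/") and arg != "of=/dev/null":
                return "dd writes to device " + arg[3:]
    elif prog == "fio":
        for arg in args:
            if arg.startswith("--filename=/dev/"):
                return "fio targets device " + arg[len("--filename="):]
    elif prog == "rm":
        recursive = "--recursive" in args or any(
            a.startswith("-") and not a.startswith("--") and ("r" in a or "R" in a)
            for a in args
        )
        targets = [a for a in args if not a.startswith("-")]
        hits = [t for t in targets if t in DANGEROUS_RM_TARGETS]
        if recursive and hits:
            return "recursive rm on " + hits[0]
    return None


def scan_history(history_text):
    """Return (line_number, reason) for each suspicious history line."""
    findings = []
    for number, line in enumerate(history_text.splitlines(), start=1):
        reason = classify(line)
        if reason:
            findings.append((number, reason))
    return findings
```

Feeding it the contents of /root/.bash_history returns the line numbers worth reading first; the history has no timestamps by default, so the order of the lines is the main clue about what ran before the failure.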
Suggestion:
Use dd cautiously in production; test beforehand.
Do not pass a raw block device that holds data as the of= target; write to a regular file instead, and double-check the if= and of= arguments before running the command.
Scenario 5: Accidental deletion of a data disk
Phenomenon: Users report that a cloud disk was mistakenly deleted.
Analysis:
Check the deletion timestamp in the cloud portal; resources are often retained for a short period and can be recovered.
Verify the disk state in the virtualization console; if the state is "Ready", it may still be recoverable.
If the disk has not yet been physically deleted, re-attach it to the host and synchronize the disk information via the portal.
Suggestion:
Handle disk deletion operations with extreme caution; once deleted, data may be unrecoverable.
Before deletion, confirm the disk is not in use (e.g., using lsof -n, df -h, lsblk, etc.), then unmount it and remove its entry from /etc/fstab so the next boot does not wait for a missing device.
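The /etc/fstab part of this checklist is easy to overlook, so it helps to check it mechanically. The following sketch lists fstab lines that still reference a given device path or mount point; the function name is an assumption for this example, and entries written as UUID= or LABEL= are only caught through their mount point.

```python
def fstab_references(fstab_text, device, mount_point):
    """Return (line_number, line) for fstab entries using the disk."""
    hits = []
    for number, line in enumerate(fstab_text.splitlines(), start=1):
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # skip blank lines and comments
        fields = stripped.split()
        if len(fields) < 2:
            continue
        # Field 0 is the device spec, field 1 is the mount point.
        if fields[0] == device or fields[1] == mount_point:
            hits.append((number, stripped))
    return hits
```

An empty result for the disk, for example fstab_references(text, "/dev/vdb", "/data"), means no entry will try to mount it at the next boot; any returned line should be removed or commented out before the disk is detached and deleted.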
If accidental deletion occurs, contact recovery support immediately to prevent physical cleanup.
Efficient Ops
This public account is maintained by Xiaotianguo and friends, regularly publishing widely-read original technical articles. We focus on operations transformation and accompany you throughout your operations career, growing together happily.
