Recovering Lost MySQL Data with Binlog, binlog2sql & MyFlash: A Real‑World Case Study
This article recounts a February 2020 MySQL data‑loss incident, detailing how the author identified the offending binlog, used open‑source tools like binlog2sql and MyFlash to reconstruct and restore the missing records, and reflects on operational lessons learned.
Background and Approach
On February 25, 2020, a major system failure affected a large‑scale WeChat service, leaving core production data unavailable for 36 hours. Prompted by that event, the author, an architect at a major asset‑management firm, revisited a similar incident from two weeks earlier, in which a developer mistakenly deleted production data, and documented the troubleshooting process for operations teams.
The incident timeline began at 23:00 on February 13, when a request to restore data arrived. The environment consisted of RHEL 7.5, the open‑source Activiti workflow platform, and a MySQL 5.7 Community Edition cluster (one primary, two replicas).
Data Recovery Process and Technical Analysis
The recovery plan followed a structured eight‑step methodology:
Identify who performed which operation and when.
Assess the impact of the operation on the system and other services.
Determine the time range of affected binlog entries.
Reproduce the failure in a simulation environment.
Design a technical recovery solution and validate it in the simulation.
Verify application functionality after simulated recovery.
Apply the validated recovery steps to production after backing up current data.
Perform a final green‑light test before declaring the restoration complete.
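The backup called for in step 7 could be taken with a logical dump of the affected schema. The following shell helper is a minimal sketch under stated assumptions, not the author's documented procedure: the database name workflow_db, the directory /backup, the DB_USER variable, and the choice of mysqldump are all illustrative. The helper only prints the command so that it can be reviewed before anyone runs it against production.

```shell
#!/bin/sh
# Sketch only: database name, backup directory and DB_USER are assumptions.
# Prints a mysqldump command for a logical backup of one schema, to be taken
# before any recovery SQL or flashback binlog is applied to production.
# --single-transaction gives a consistent snapshot for InnoDB tables only.
pre_recovery_backup_cmd() {
    db="${1:-workflow_db}"
    dir="${2:-/backup}"
    stamp="$(date +%Y%m%d_%H%M%S)"
    echo "mysqldump -u${DB_USER:-admin} -p" \
         "--single-transaction --routines --triggers --events" \
         "--databases ${db} > ${dir}/${db}_before_recovery_${stamp}.sql"
}
```

Calling pre_recovery_backup_cmd with no arguments prints a command that uses the assumed defaults; the first argument overrides the database name and the second the target directory.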
Key technical steps included:
Interviewing the developer, who reported making a REST call around 20:20 that deleted a workflow template, which caused all associated process instances to disappear.
Locating the relevant binlog file (mysql‑bin.000011) and extracting it for analysis.
Parsing the binlog with the command:
mysqlbinlog -v --base64-output=decode-rows --skip-gtids=true --start-datetime='2020-02-13 20:10:00' --stop-datetime='2020-02-13 21:30:00' -d {$DBNAME} mysql-bin.000011 >> aa.log
Confirming that the initial 20:20 segment contained no massive delete statements, prompting deeper investigation.
Observing a surge of DELETE and UPDATE statements starting at 20:30, indicating the problematic window.
Using the open‑source tool binlog2sql (https://github.com/danfengcao/binlog2sql) to translate binlog events into SQL and reverse SQL statements.
Running the reverse SQL in the simulation environment, which failed because the values of a longblob column could not be inserted.
Generating a reverse binary binlog with the MyFlash project (https://github.com/Meituan-Dianping/MyFlash) and applying it to the database, which successfully restored the missing data.
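The tool-driven steps above can be sketched as a few shell helpers. This is a minimal sketch rather than the author's actual commands: the connection variables (DB_HOST, DB_PORT, DB_USER, DB_NAME), the database name workflow_db, the 20:30 to 21:30 window, the output base name rollback, and the path binlog2sql/binlog2sql.py are illustrative assumptions. The flags follow the binlog2sql and MyFlash READMEs and should be verified against the installed versions. The two helpers ending in _cmd only print commands for review; they do not execute anything.

```shell
#!/bin/sh
# Sketch only: names, paths and time windows below are assumptions.

# Count decoded row changes per minute in `mysqlbinlog -v` output.
# Event headers look like "#200213 20:30:01 server id ..."; decoded row
# changes start with "### DELETE FROM" or "### UPDATE".
count_dml_per_minute() {
    awk '
        /^#[0-9][0-9][0-9][0-9][0-9][0-9] +[0-9]+:[0-9][0-9]:[0-9][0-9] / {
            split($2, t, ":")                      # hour may be space-padded
            minute = sprintf("%s %02d:%s", substr($1, 2), t[1], t[2])
            next
        }
        /^### (DELETE FROM|UPDATE) / { count[minute " " $2]++ }
        END { for (k in count) print k, count[k] }
    ' "$1" | sort
}

# Print a binlog2sql command that emits rollback SQL (-B means flashback).
# binlog2sql connects to a running MySQL server; -p without a value prompts.
binlog2sql_flashback_cmd() {
    echo "python binlog2sql/binlog2sql.py" \
         "-h${DB_HOST:-127.0.0.1} -P${DB_PORT:-3306} -u${DB_USER:-admin} -p" \
         "-d${DB_NAME:-workflow_db} --start-file=mysql-bin.000011" \
         "--start-datetime='2020-02-13 20:30:00'" \
         "--stop-datetime='2020-02-13 21:30:00' -B"
}

# Print the MyFlash step (writes rollback.flashback from the binlog file)
# followed by the command that replays that file into MySQL.
myflash_cmd() {
    echo "./flashback --databaseNames=${DB_NAME:-workflow_db}" \
         "--binlogFileNames=mysql-bin.000011" \
         "--start-datetime='2020-02-13 20:30:00'" \
         "--stop-datetime='2020-02-13 21:30:00'" \
         "--sqlTypes='UPDATE,DELETE' --outBinlogFileNameBase=rollback"
    echo "mysqlbinlog --skip-gtids rollback.flashback |" \
         "mysql -h${DB_HOST:-127.0.0.1} -u${DB_USER:-admin} -p"
}
```

Running count_dml_per_minute aa.log on the decoded output makes a surge like the one at 20:30 visible as a jump in the per-minute counts. Both tools expect row-based binlogs with full row images. Because MyFlash produces a reversed binlog rather than SQL text, binary column values do not have to be rendered as SQL literals, which may explain why it succeeded where the generated reverse SQL failed on the longblob column.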
Reflection
The author addresses why a traditional backup‑restore was not used. Daily full backups existed, but a point‑in‑time restore would have overwritten data belonging to other workflow engines, causing collateral loss. Table‑level restoration was also rejected because the workflow platform’s tightly coupled data model meant that restoring individual tables would break referential integrity.
Root‑cause analysis highlighted two main issues:
Developers bypassed the staging environment and deployed directly to production without rigorous validation.
Developers were insufficiently familiar with the Activiti platform, which led to improper template deletions.
Follow‑up Recommendations
Automate the release pipeline to minimize manual intervention.
Standardize deployment scripts and enforce validation before production rollout.
dbaplus Community
Enterprise-level professional community for Database, BigData, and AIOps. Daily original articles, weekly online tech talks, monthly offline salons, and quarterly XCOPS&DAMS conferences—delivered by industry experts.
