
Why a Missing mysql-bin.index Crashed MySQL HA and How to Recover It

A detailed post‑mortem of a MySQL dual‑master cluster failure caused by a missing mysql-bin.index file, showing step‑by‑step diagnostics, container logs, replication position fixes, and preventive measures for high‑availability deployments.

ITPUB

Mar 26, 2023

Incident Overview

The author describes an incident in a test environment in which a MySQL dual-master cluster (two MySQL nodes with Keepalived for HA) stopped serving traffic. Neither MySQL container appeared in the docker ps output, which prompted a systematic investigation.

Root Cause Investigation

Checked Keepalived status; it was running and repeatedly restarting MySQL.

Used docker ps -a to see that MySQL containers exited after a restart.

Examined container logs with docker logs <container_id>, which reported that mysql-bin.index was missing.

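The index file lists the binary logs that the server manages. Without it, mysqld could not start, which explains the exited containers, and the slave could no longer locate the correct binary log, which caused replication errors.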
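
The log check in the steps above can be sketched as a small shell helper. This is a minimal illustration rather than the author's actual commands: the function name find_binlog_index_errors and the container name mysql are assumptions.

```shell
# Sketch: surface binlog-index errors from a MySQL container's logs.
# Reads log text on stdin and prints only the lines that mention the index file.
find_binlog_index_errors() {
    # -F treats the pattern as a fixed string, so the dot is not a wildcard
    grep -F 'mysql-bin.index'
}

# Hypothetical usage (the container name "mysql" is an assumption):
#   docker ps -a --filter status=exited --format '{{.Names}}'
#   docker logs mysql 2>&1 | find_binlog_index_errors
```

The function exits non-zero when no matching line is found, so it can also be used as a condition in a script.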

Fixing the Missing mysql-bin.index

Recreated the missing log directory under the MySQL data directory and set its permissions:

mkdir -p log
chmod -R 777 log

After the directory existed, Keepalived could restart MySQL successfully.
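
A guard along these lines could run before MySQL is started, so that a missing binlog directory is recreated instead of crashing the server. It is only a sketch: ensure_binlog_dir is a hypothetical name, and the writability test applies to the user running the script, which may differ from the user that mysqld runs as inside the container.

```shell
# Sketch: make sure the binlog directory exists and is writable.
# $1 is the directory path, for example /var/lib/mysql/log.
ensure_binlog_dir() {
    # Create the directory (and any missing parents); fail if that is impossible
    mkdir -p "$1" || return 1
    # Fail if the current user cannot write to it
    [ -w "$1" ] || return 1
    echo "binlog dir ready: $1"
}

# Hypothetical usage:
#   ensure_binlog_dir /var/lib/mysql/log
```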

Adjusting Replication Position

On the master (node55) the current binlog file and position were obtained:

FLUSH TABLES WITH READ LOCK;
SHOW MASTER STATUS;
UNLOCK TABLES;
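
To avoid copying the file name and position by hand, the output of SHOW MASTER STATUS can be parsed in a script. The sketch below assumes the mysql client is run with -N (no column names) and -B (tab-separated batch output) and that credentials come from a client option file; parse_master_status is a hypothetical helper name.

```shell
# Sketch: extract the binlog file and position from SHOW MASTER STATUS.
# Reads tab-separated rows on stdin and prints "<file> <position>" for the first row.
parse_master_status() {
    awk -F '\t' 'NR == 1 { print $1, $2 }'
}

# Hypothetical usage; all three statements run in one session so the read lock
# is still held when the position is read:
#   mysql -N -B -e 'FLUSH TABLES WITH READ LOCK; SHOW MASTER STATUS; UNLOCK TABLES;' \
#       | parse_master_status
```

With the values used in this incident, the helper would print the file mysql-bin.000001 followed by the position 117748.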

On the slave (node56) the replication was re‑configured to the correct file and offset:

# Stop slave
STOP SLAVE;
# Set correct master info
CHANGE MASTER TO MASTER_HOST='10.2.1.55',
MASTER_PORT=3306,
MASTER_USER='vagrant',
MASTER_PASSWORD='vagrant',
MASTER_LOG_FILE='mysql-bin.000001',
MASTER_LOG_POS=117748;
# Start slave
START SLAVE;

After applying these commands the I/O thread resumed, and replication became healthy.
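
Replication health can be confirmed from the slave status rather than by eye. The following is a minimal sketch, assuming the vertical output of SHOW SLAVE STATUS\G is piped in; replication_healthy is a hypothetical name, and it checks only the two thread flags, not replication lag.

```shell
# Sketch: succeed only when both replication threads report "Yes".
# Reads the output of SHOW SLAVE STATUS\G on stdin.
replication_healthy() {
    awk '
        $1 == "Slave_IO_Running:"  { io  = $2 }
        $1 == "Slave_SQL_Running:" { sql = $2 }
        END { exit !(io == "Yes" && sql == "Yes") }
    '
}

# Hypothetical usage on the slave (node56):
#   mysql -e 'SHOW SLAVE STATUS\G' | replication_healthy && echo "replication OK"
```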

Lessons and Improvements

The root cause was the accidental deletion of the /var/lib/mysql/log directory during a previous migration. Because the directory sat inside the data directory, it appeared as a database named "log", and deleting it also removed the mysql-bin.index file. To avoid similar issues, the author suggests:

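Separate the binary log directory from the data directory. A log_bin path such as /var/lib/mysql/log still sits inside the default data directory, so the target should be a path outside /var/lib/mysql.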

Synchronize individual databases during a migration instead of overwriting the whole instance, so that unrelated directories are not deleted by accident.

Implement proactive alerting for MySQL failures, such as Keepalived email notifications or log‑based monitoring.
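
As one possible shape for the alerting suggestion, a periodic check could verify that the MySQL container is still running and raise an alert otherwise. This is a sketch under assumptions: container_running is a hypothetical helper, the container name mysql is assumed, and the echo stands in for whatever notification channel is actually used.

```shell
# Sketch: succeed only if the named container is in the list of running containers.
# Reads container names (one per line) on stdin; $1 is the name to look for.
container_running() {
    # -x matches whole lines only, -F disables regex, -q suppresses output
    grep -qxF -- "$1"
}

# Hypothetical cron-style usage ("mysql" and the alert text are assumptions):
#   docker ps --format '{{.Names}}' | container_running mysql \
#       || echo "ALERT: mysql container is not running on $(hostname)"
```

Because docker ps without -a lists only running containers, a container that has exited, as in this incident, would trigger the alert.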

After the directory and replication settings were fixed, both nodes operated normally, and the recovery was confirmed by connecting with Navicat.
