MongoDB Cluster Point-in-Time Recovery (PITR) Procedure Using Shard and Config Server Restoration
This article demonstrates a step‑by‑step point‑in‑time recovery of a MongoDB sharded cluster by restoring shard instances, replaying oplog entries to a specific timestamp, updating metadata, and finally rebuilding the config server and mongos to achieve a consistent snapshot with max(id)=15.
1. Background
Most online examples cover PITR for a single‑node MongoDB instance, and the official documentation provides recovery steps for a MongoDB cluster but lacks a concrete PITR workflow. This article presents an experimental setup that simulates an online environment and performs a PITR on a MongoDB sharded cluster.
Original cluster topology
172.16.129.170 shard1 27017 shard2 27018 config 37017 mongos 47017
172.16.129.171 shard1 27017 shard2 27018 config 37017 mongos 47017
172.16.129.172 shard1 27017 shard2 27018 config 37017 mongos 47017
For the demo we restore each shard as a single instance (sufficient for developer queries). The restored topology becomes:
172.16.129.173 shard1 27017 shard2 27018 config 37017 mongos 47017
172.16.129.174 config 37017
172.16.129.175 config 37017
The MongoDB version is Percona Server for MongoDB 4.2.13. Each shard and the config server run scheduled hot-backup scripts and oplog-backup scripts. Test data is created via mongos by creating a hashed-sharded collection and inserting 10 documents:
use admin
db.runCommand({"enablesharding":"renkun"})
sh.shardCollection("renkun.user", { id: "hashed" } )
use renkun
var tmp = [];
for (var i = 0; i < 10; i++) {
    tmp.push({ "id": i, "name": "Kun " + i });
}
db.user.insertMany(tmp);
After taking physical hot-backups of shard1, shard2, and the config server, another 10 documents are inserted, bringing the total to 20 documents (id 0-19). The goal is to restore the cluster to the point where max(id)=15.
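The physical hot-backups mentioned above can be taken with Percona Server for MongoDB's built-in hot backup command. A minimal mongo-shell sketch, run against each shard and the config server (the backupDir path is an assumption that mirrors the backup paths used later in this article, and the directory must already exist and be writable by the mongod process):

```javascript
// Hot backup on Percona Server for MongoDB: copies a consistent
// snapshot of the data files into backupDir while the server runs.
db.adminCommand({ createBackup: 1, backupDir: "/data/backup/202106111849_27017" })
```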
2. Restoring Shard Instances
2.1 Identify the target snapshot
For each shard we dump the oplog BSON file to JSON and locate the entry that inserted id=15:
bsondump oplog.rs.bson > oplog.rs.json
more oplog.rs.json | egrep "\"op\":\"i\",\"ns\":\"renkun\.user\"" | grep "\"Kun 15\""
The matching entry on shard2 shows the timestamp {"t":1623408268,"i":6}. Because mongorestore --oplogLimit is an exclusive upper bound (only entries strictly before the given timestamp are replayed), we add one to the increment, giving 1623408268:7 as the effective limit.
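The grep pipeline above can also be expressed as a small script. A sketch in Python, assuming bsondump's extended-JSON output with one document per line (find_oplog_limit and the two sample entries are illustrative, not part of the original procedure):

```python
import json

def find_oplog_limit(lines, ns, name):
    """Scan bsondump JSON lines for the insert of the target document and
    return the --oplogLimit string: timestamp increment + 1, because the
    limit is an exclusive upper bound."""
    for line in lines:
        entry = json.loads(line)
        if entry.get("op") == "i" and entry.get("ns") == ns \
                and entry.get("o", {}).get("name") == name:
            ts = entry["ts"]["$timestamp"]
            return f'{ts["t"]}:{ts["i"] + 1}'
    return None

# Sample entries mimicking bsondump's output for the oplog.rs collection.
sample = [
    '{"ts":{"$timestamp":{"t":1623408268,"i":5}},"op":"i","ns":"renkun.user","o":{"id":14,"name":"Kun 14"}}',
    '{"ts":{"$timestamp":{"t":1623408268,"i":6}},"op":"i","ns":"renkun.user","o":{"id":15,"name":"Kun 15"}}',
]
print(find_oplog_limit(sample, "renkun.user", "Kun 15"))  # 1623408268:7
```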
2.2 Create a temporary restoration user
Log in as root on each shard and create a user named internal_restore with the __system role. This user is required for replaying the oplog, modifying admin.system.version, and dropping the local database.
use admin
db.createUser({ user: "internal_restore", pwd: "internal_restore", roles: ["__system"] })
Log in as internal_restore and drop the local database:
use local
db.dropDatabase()
2.3 Replay oplog up to the target timestamp
mongorestore -h 127.0.0.1 -u internal_restore -p "internal_restore" --port 27017 \
--oplogReplay --oplogLimit "1623408268:7" --authenticationDatabase admin \
/data/backup/202106111849_27017/local/oplog.rs.bson
mongorestore -h 127.0.0.1 -u internal_restore -p "internal_restore" --port 27018 \
--oplogReplay --oplogLimit "1623408268:7" --authenticationDatabase admin \
/data/backup/202106111850_27018/local/oplog.rs.bson
Verify the restored data on each shard:
--shard1 27017
> db.user.find().sort({"id":1})
{ "_id" : ObjectId("..."), "id" : 3, "name" : "Kun 3" }
... (records up to id 12)
--shard2 27018
> db.user.find().sort({"id":1})
{ "_id" : ObjectId("..."), "id" : 0, "name" : "Kun 0" }
... (records up to id 15)
After replay, the two shards together contain 16 documents, with the maximum id equal to 15.
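A consistent PITR snapshot means the two shards together hold exactly ids 0-15, with no gaps and no duplicates. A quick consistency check, sketched in Python (the per-shard id lists are hypothetical; the real split is determined by the hashed shard key, which distributes ids non-contiguously):

```python
# Hypothetical per-shard id lists, as returned by db.user.find()
# on each shard after the oplog replay.
shard1_ids = [3, 5, 6, 7, 10, 11, 12]
shard2_ids = [0, 1, 2, 4, 8, 9, 13, 14, 15]

all_ids = sorted(shard1_ids + shard2_ids)
assert all_ids == list(range(16))             # no gaps, no duplicates
assert not set(shard1_ids) & set(shard2_ids)  # shards hold disjoint subsets
print("snapshot consistent: max(id) =", max(all_ids))
```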
2.4 Update admin.system.version
Because the shards have changed from three‑node replica sets to single instances, the metadata must be corrected. Using internal_restore on each shard:
use admin
db.system.version.deleteOne({ _id: "minOpTimeRecovery" })
db.system.version.find({"_id" : "shardIdentity" })
db.system.version.updateOne(
{ "_id" : "shardIdentity" },
{ $set : { "configsvrConnectionString" : "configdb/172.16.129.173:37017,172.16.129.174:37017,172.16.129.175:37017" } }
)
Before these changes, the shardIdentity document still pointed to the original config servers (172.16.129.170-172.16.129.172).
3. Restoring the Config Server
Transfer the physical backup of the config server, extract it to the data directory, and start it as a single instance. Create the same internal_restore user with the __system role.
3.1 Replay oplog
mongorestore -h 127.0.0.1 -u internal_restore -p "internal_restore" --port 37017 \
--oplogReplay --oplogLimit "1623408268:7" --authenticationDatabase admin \
/data/backup/202106111850_37017/local/oplog.rs.bson
3.2 Modify metadata
Log in as internal_restore, drop the local database, then update the shard entries in the config.shards collection to point to the restored shard hosts:
use local
db.dropDatabase()
use config
db.shards.find()
db.shards.updateOne({ "_id" : "repl" }, { $set : { "host" : "172.16.129.173:27017" } })
db.shards.updateOne({ "_id" : "repl2" }, { $set : { "host" : "172.16.129.173:27018" } })
db.shards.find()
3.3 Start the config server replica set
Stop the single‑instance config server, then start it in replica‑set mode with the following configuration snippet:
sharding:
  clusterRole: configsvr
replication:
  oplogSizeMB: 10240
  replSetName: configdb
Initialize the replica set and add the two remaining members:
rs.initiate()
rs.add("172.16.129.174:37017")
rs.add("172.16.129.175:37017")
The config server is now a three-node replica set.
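For reference, a fuller config-server configuration file might look like the following sketch. Only the sharding and replication stanzas and the port come from the procedure above; dbPath and bindIp are assumptions and will differ per deployment:

```yaml
storage:
  dbPath: /data/mongodb/config        # assumption: actual data directory may differ
net:
  port: 37017
  bindIp: 127.0.0.1,172.16.129.173    # assumption: host's own address
sharding:
  clusterRole: configsvr
replication:
  oplogSizeMB: 10240
  replSetName: configdb
```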
4. Configuring mongos
Copy the original mongos configuration file and adjust the sharding and net.bindIp parameters to reflect the new config server addresses:
sharding:
  configDB: "configdb/172.16.129.173:37017,172.16.129.174:37017,172.16.129.175:37017"
net:
  port: 47017
  bindIp: 127.0.0.1,172.16.129.173
After starting mongos, query the renkun.user collection; it returns 16 documents with max(id)=15, confirming a successful PITR.
mongos> use renkun
switched to db renkun
mongos> db.user.find().sort({"id":1})
{ "_id" : ObjectId("..."), "id" : 0, "name" : "Kun 0" }
... (up to id 15)
5. Summary
MongoDB 4.0 introduced multi-document transactions, and version 4.2 extended them across shards. Restoring data involved in such transactions requires special handling not covered in this guide; the PITR procedure presented here therefore applies to non-transactional workloads.
Aikesheng Open Source Community
The Aikesheng Open Source Community provides stable, enterprise‑grade MySQL open‑source tools and services, releases a premium open‑source component each year (1024), and continuously operates and maintains them.