
Mastering MongoDB Clusters: Setup, Monitoring, Migration, and Optimization

This comprehensive guide explains MongoDB cluster architecture, component roles, common use cases, monitoring commands, essential maintenance operations, data migration steps, troubleshooting of typical production issues, and practical optimization recommendations for high‑performance deployments.

Efficient Ops

MongoDB Cluster Overview

MongoDB is a document database built on distributed storage and designed for scalable, high‑performance web applications. This article introduces the most common three‑node cluster architecture and points to the official documentation.

Cluster Components

mongos (router) – entry point for client requests, routes queries to appropriate shards and merges results.

config server – stores metadata about sharding and collection structure; usually deployed with multiple instances for redundancy.

shard – stores a subset of data; balancer automatically migrates chunks to keep distribution even.

replica set – provides high availability; consists of a primary and secondary nodes, optionally an arbiter.

arbiter – participates in elections without holding data.

Typical Use Cases

Website data with real‑time inserts, updates and queries.

Cache layer for high‑performance read/write.

Large low‑value datasets where relational databases would be costly.

Highly scalable scenarios with dozens or hundreds of servers.

Storage of JSON‑like documents using BSON.

Why Choose MongoDB

MongoDB stores documents in BSON, scales horizontally with little operational effort, and performs well on very large data volumes.

Cluster Monitoring

1. Storage statistics

Enter a mongos or shard container and run:

docker exec -it mongos bash;
mongo --port 20001;
use admin;
db.auth("root","XXX");

Then execute:

db.stats();
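
The raw db.stats() document reports sizes in bytes, which is hard to read at a glance. The following is a minimal sketch of a helper that condenses it; summarizeDbStats and the sample values are hypothetical, while the field names (collections, objects, dataSize, storageSize, indexSize) are the ones db.stats() reports.

```javascript
// Sketch: condense a db.stats() result into a readable summary.
// summarizeDbStats is a hypothetical helper, not part of the mongo shell.
function summarizeDbStats(stats) {
  const toMB = (bytes) => Math.round((bytes / (1024 * 1024)) * 100) / 100;
  return {
    db: stats.db,
    collections: stats.collections,
    documents: stats.objects,
    dataMB: toMB(stats.dataSize),
    storageMB: toMB(stats.storageSize),
    indexMB: toMB(stats.indexSize),
    // A ratio below 1 means the on-disk size is smaller than the logical data size.
    storageToDataRatio:
      stats.dataSize > 0
        ? Math.round((stats.storageSize / stats.dataSize) * 100) / 100
        : null,
  };
}

// Made-up sample shaped like db.stats() output.
const sampleStats = {
  db: "admin",
  collections: 4,
  objects: 1000,
  dataSize: 10485760,
  storageSize: 5242880,
  indexSize: 1048576,
};
console.log(summarizeDbStats(sampleStats));
```

In the shell, the same helper could be called as summarizeDbStats(db.stats()) after pasting the function definition.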

2. Server status

db.runCommand({ serverStatus: 1 })
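
The serverStatus document is large, so it can help to pull out only a few health indicators. This sketch assumes the connections, opcounters, and wiredTiger.cache sections of the output; summarizeServerStatus and the sample values are hypothetical. A mongos has no wiredTiger section, in which case the cache figure is reported as null.

```javascript
// Sketch: extract a few health indicators from a serverStatus result.
// summarizeServerStatus is a hypothetical helper.
function summarizeServerStatus(status) {
  const cache = (status.wiredTiger && status.wiredTiger.cache) || {};
  const used = cache["bytes currently in the cache"];
  const max = cache["maximum bytes configured"];
  return {
    connectionsCurrent: status.connections.current,
    connectionsAvailable: status.connections.available,
    opcounters: status.opcounters,
    // Percentage of the configured WiredTiger cache currently in use.
    cacheFillPercent:
      used !== undefined && max > 0
        ? Math.round((used / max) * 1000) / 10
        : null,
  };
}

// Made-up sample shaped like serverStatus output on a mongod.
const sampleServerStatus = {
  connections: { current: 120, available: 51080 },
  opcounters: { insert: 10, query: 200, update: 30, delete: 5 },
  wiredTiger: {
    cache: {
      "bytes currently in the cache": 3221225472,
      "maximum bytes configured": 4294967296,
    },
  },
};
console.log(summarizeServerStatus(sampleServerStatus));
```

In the shell this could be used as summarizeServerStatus(db.runCommand({ serverStatus: 1 })).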

3. Replica set status

rs.status();
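
A common reason to read rs.status() is to check member health and replication lag. The sketch below assumes the members array with its name, stateStr, health, and optimeDate fields; summarizeReplicaSet and the sample hosts are hypothetical, and lag is estimated simply as the difference between the primary's and each secondary's optimeDate.

```javascript
// Sketch: list member health and approximate replication lag from rs.status().
// summarizeReplicaSet is a hypothetical helper.
function summarizeReplicaSet(status) {
  const primary = status.members.find((m) => m.stateStr === "PRIMARY");
  return status.members.map((m) => ({
    name: m.name,
    state: m.stateStr,
    healthy: m.health === 1,
    // Lag is only meaningful for secondaries when a primary is known.
    lagSeconds:
      primary && m.optimeDate && m.stateStr === "SECONDARY"
        ? (primary.optimeDate - m.optimeDate) / 1000
        : null,
  }));
}

// Made-up sample shaped like rs.status() output.
const sampleRsStatus = {
  set: "rs1",
  members: [
    { name: "ip-1:20001", stateStr: "PRIMARY", health: 1, optimeDate: new Date("2024-01-01T00:00:10Z") },
    { name: "ip-2:20001", stateStr: "SECONDARY", health: 1, optimeDate: new Date("2024-01-01T00:00:05Z") },
    { name: "ip-3:20001", stateStr: "ARBITER", health: 1 },
  ],
};
console.log(summarizeReplicaSet(sampleRsStatus));
```

In the shell this could be called as summarizeReplicaSet(rs.status()).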

Basic Operations

1. Set and view slow queries

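// set the slow-query threshold: profile operations slower than 200 ms
db.setProfilingLevel(1, 200);
// view the current profiling level
db.getProfilingLevel();
// view the 10 most recent slow operations for one collection
db.system.profile.find({ ns: 'dbName.collectionName' }).sort({ ts: -1 }).limit(10).pretty();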
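
Once profiling has collected some entries, grouping them by namespace shows which collections hurt most. This is a rough sketch that assumes the ns and millis fields of system.profile documents; slowestNamespaces and the sample data are hypothetical.

```javascript
// Sketch: group profiler entries by namespace and rank by worst latency.
// slowestNamespaces is a hypothetical helper.
function slowestNamespaces(profileDocs) {
  const byNs = new Map();
  for (const doc of profileDocs) {
    const entry = byNs.get(doc.ns) || { ns: doc.ns, count: 0, totalMillis: 0, maxMillis: 0 };
    entry.count += 1;
    entry.totalMillis += doc.millis;
    entry.maxMillis = Math.max(entry.maxMillis, doc.millis);
    byNs.set(doc.ns, entry);
  }
  return [...byNs.values()]
    .map((e) => ({
      ns: e.ns,
      count: e.count,
      avgMillis: e.totalMillis / e.count,
      maxMillis: e.maxMillis,
    }))
    .sort((a, b) => b.maxMillis - a.maxMillis);
}

// Made-up sample shaped like system.profile documents.
const sampleProfile = [
  { ns: "shop.orders", op: "query", millis: 250 },
  { ns: "shop.users", op: "query", millis: 300 },
  { ns: "shop.orders", op: "update", millis: 750 },
];
console.log(slowestNamespaces(sampleProfile));
```

In the shell the input could come from db.system.profile.find().toArray().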

2. Find long‑running operations

db.currentOp({"active":true,"secs_running":{"$gt":2000}});
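
The same filtering can be done on the client side when the full db.currentOp() result is already at hand. The sketch below assumes the inprog array with its opid, active, secs_running, op, and ns fields; longRunningOps and the sample operations are hypothetical.

```javascript
// Sketch: pick out active operations that have run longer than a threshold.
// longRunningOps is a hypothetical helper.
function longRunningOps(currentOpResult, minSeconds) {
  return currentOpResult.inprog
    .filter((op) => op.active && op.secs_running > minSeconds)
    .map((op) => ({ opid: op.opid, op: op.op, ns: op.ns, secsRunning: op.secs_running }))
    .sort((a, b) => b.secsRunning - a.secsRunning);
}

// Made-up sample shaped like db.currentOp() output.
const sampleCurrentOp = {
  inprog: [
    { opid: "rs1:101", active: true, secs_running: 5000, op: "command", ns: "shop.orders" },
    { opid: "rs1:102", active: true, secs_running: 10, op: "query", ns: "shop.users" },
    { opid: "rs1:103", active: false, op: "none", ns: "" },
    { opid: "rs2:104", active: true, secs_running: 9000, op: "update", ns: "shop.items" },
  ],
};
console.log(longRunningOps(sampleCurrentOp, 2000));
```

Each returned opid can then be reviewed and, if appropriate, passed to db.killOp(). Reviewing first matters because killing internal or replication operations can cause harm.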

3. Adjust log level and cache size

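// view the current log level
db.adminCommand({ "getParameter": 1, "logLevel": 1 });
// set the log level (0 is the default; 1 to 5 increase verbosity)
db.adminCommand({ "setParameter": 1, "logLevel": 1 });
// resize the WiredTiger cache at runtime
db.adminCommand({ "setParameter": 1, "wiredTigerEngineRuntimeConfig": "cache_size=4G" });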

4. Add / remove replica set members

// list members
rs.status().members;
// add a member
rs.add('127.0.0.1:20001');
// remove a member
rs.remove('127.0.0.1:20001');

5. Enable sharding for a database and collection

// enable sharding on a database
sh.enableSharding("dbName");
// shard a collection on the chosen shard key
sh.shardCollection("dbName.collectionName", { fieldName: 1 });

6. Add / remove shards

// view shard status
sh.status();
// add a shard (replica set or single instance); runs against the admin database
db.adminCommand({ addShard: "rs1/ip-1:20001,ip-2:20001,ip-3:20001" });
// remove a shard
db.adminCommand({ removeShard: "shardName" });
// refresh the router's cached metadata
db.adminCommand("flushRouterConfig");

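Note: removeShard does not finish in one call. The first execution only starts draining the shard's chunks to the other shards, and the shard then shows {"draining":true} in sh.status(). Run the command again to check progress; the removal is done once the command reports that it has completed.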
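
The repeated execution can be wrapped in a small polling loop. This is a sketch, assuming that the removeShard response carries a state field that moves from "started" through "ongoing" to "completed"; drainShard, its parameters, and the fake responses are hypothetical, and the command runner and wait function are injected so the loop can be tried without a cluster.

```javascript
// Sketch: re-run removeShard until the shard has finished draining.
// drainShard is a hypothetical helper; runRemoveShard and wait are injected.
function drainShard(runRemoveShard, wait, maxPolls = 100) {
  for (let poll = 1; poll <= maxPolls; poll++) {
    const res = runRemoveShard();
    if (res.ok !== 1) {
      throw new Error("removeShard failed: " + JSON.stringify(res));
    }
    if (res.state === "completed") {
      return poll; // number of executions that were needed
    }
    wait(); // pause before checking again
  }
  throw new Error("shard still draining after " + maxPolls + " polls");
}

// Made-up responses standing in for a real cluster.
const fakeResponses = [
  { ok: 1, state: "started" },
  { ok: 1, state: "ongoing", remaining: { chunks: 12, dbs: 0 } },
  { ok: 1, state: "completed" },
];
let fakeCalls = 0;
const pollsNeeded = drainShard(() => fakeResponses[fakeCalls++], () => {});
console.log("removeShard executions needed:", pollsNeeded);
```

Against a real cluster the two callbacks might be () => db.adminCommand({ removeShard: "shardName" }) and () => sleep(60000), with the interval chosen to suit how much data the shard holds.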

7. Data import / export

# export
mongoexport -h 127.0.0.1 --port 20001 -u xxx -p xxx -d xxx -c mobileIndex -o XXX.txt
# import
mongoimport -h 127.0.0.1 --port 20001 -u xxx -p xxx -d xxx -c mobileIndex --file XXX.txt

Data Migration

1. Migrate a replica‑set member

Shut down the mongod instance.

Copy the dbPath to the new host.

Start mongod on the new host with the copied data directory.

Connect to the current primary and, if the member's hostname or address changed, run rs.reconfig() to update the member address.
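
The reconfiguration step amounts to editing one host entry in the replica set configuration and submitting the result. A minimal sketch, assuming the config document shape returned by rs.conf() (a members array whose entries have _id and host); updateMemberHost and the host names are hypothetical.

```javascript
// Sketch: return a copy of a replica set config with one member's host changed.
// updateMemberHost is a hypothetical helper; the input config is not modified.
function updateMemberHost(config, oldHost, newHost) {
  const index = config.members.findIndex((m) => m.host === oldHost);
  if (index === -1) {
    throw new Error("no member with host " + oldHost);
  }
  const members = config.members.map((m, i) =>
    i === index ? { ...m, host: newHost } : m
  );
  return { ...config, members };
}

// Made-up sample shaped like rs.conf() output.
const sampleConfig = {
  _id: "rs1",
  version: 3,
  members: [
    { _id: 0, host: "ip-1:20001" },
    { _id: 1, host: "ip-2:20001" },
    { _id: 2, host: "ip-3:20001" },
  ],
};
const newConfig = updateMemberHost(sampleConfig, "ip-2:20001", "ip-4:20001");
console.log(newConfig.members.map((m) => m.host));
```

On the primary, the result could be applied with rs.reconfig(updateMemberHost(rs.conf(), "ip-2:20001", "ip-4:20001")). Keeping the member's _id unchanged is what lets the set treat it as the same member at a new address.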

2. Migrate the primary node

Step down the primary with rs.stepDown() (or the replSetStepDown command). Once it is running as a secondary, follow the member‑migration steps above.

3. Recover data from another node

Stop the node you are copying from, copy its dbPath to the node being recovered, start that node with the copied files, add it to the replica set with rs.add(), and optionally remove the old member.

Common Production Issues and Solutions

1. Index build locks the database

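Identify long‑running operations with db.currentOp(), kill them with db.killOp(), and rebuild the index in the background ({background:true}). From MongoDB 4.2 onward the background option is ignored, because all index builds use a process that holds the exclusive lock only at the start and end of the build.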

2. Uncontrolled memory usage

The WiredTiger internal cache defaults to the larger of 50 % of (RAM − 1 GB) and 256 MB. Adjust it at runtime with db.adminCommand({ setParameter:1, wiredTigerEngineRuntimeConfig:"cache_size=xxG"}).
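
The default works out as follows for a few memory sizes. This is a small worked example of the rule above; defaultWiredTigerCacheGB is a hypothetical helper name.

```javascript
// Sketch: default WiredTiger cache size in GB for a host with ramGB of memory,
// using the rule "larger of 50% of (RAM - 1 GB) and 256 MB".
function defaultWiredTigerCacheGB(ramGB) {
  return Math.max(0.5 * (ramGB - 1), 0.25);
}

console.log(defaultWiredTigerCacheGB(16)); // 7.5
console.log(defaultWiredTigerCacheGB(4)); // 1.5
console.log(defaultWiredTigerCacheGB(1)); // 0.25 (the 256 MB floor applies)
```

On a host that also runs other processes, or several mongod instances, the default can claim more memory than is comfortable, which is when an explicit cache_size is worth setting.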

3. Deleted data does not free disk space

Use db.collection.runCommand("compact") or perform a full resync of a secondary after removing data.

4. High server load

Temporarily remove the secondary from the replica set to free I/O, then consider adding memory, SSDs, or sharding.

5. Poor shard key choice causing hot spots

Adjust the balancer window or redesign the shard key (hash or random) to distribute writes evenly.
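
A toy simulation shows why a monotonically increasing key creates a hot spot under range partitioning while a hashed key does not. Everything here is a simplified model and an assumption for illustration: four shards, fixed range boundaries, an FNV-1a hash with a modulo in place of MongoDB's own hashed index and chunk ranges, and the helper names themselves.

```javascript
// Toy model: where do 1000 new, ever-increasing keys land on 4 shards?

// Range partitioning: boundaries[i] is the exclusive upper bound of shard i;
// the last shard is open-ended and receives every key beyond the last bound.
function rangeShard(key, boundaries) {
  const i = boundaries.findIndex((upper) => key < upper);
  return i === -1 ? boundaries.length : i;
}

// FNV-1a, used only as a stand-in hash for this illustration.
function fnv1a(text) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash;
}

// Count how many keys each shard receives under a given routing function.
function distribute(keys, pickShard, shardCount) {
  const counts = new Array(shardCount).fill(0);
  for (const key of keys) counts[pickShard(key)] += 1;
  return counts;
}

// Existing data covered keys 0..999; new inserts keep increasing from 1000.
const newKeys = Array.from({ length: 1000 }, (_, i) => 1000 + i);
const rangeCounts = distribute(newKeys, (k) => rangeShard(k, [250, 500, 750]), 4);
const hashedCounts = distribute(newKeys, (k) => fnv1a(String(k)) % 4, 4);

console.log("range-partitioned:", rangeCounts); // [0, 0, 0, 1000]
console.log("hashed:", hashedCounts);
```

With range partitioning every new write lands on the last shard until the balancer splits and moves chunks, whereas the hashed variant spreads the same writes across all four shards. The trade-off is that hashed keys give up efficient range queries on that field.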

Optimization Recommendations

Application level

Ensure queries use indexes; verify with explain().
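
Checking a plan by eye gets tedious, so the test can be automated by walking the winning plan and looking for a COLLSCAN stage. This sketch assumes the classic explain shape (stages nested through inputStage or inputStages, and per-shard plans under shards on a sharded cluster); usesCollectionScan and the sample plans are hypothetical, and newer server versions may wrap the plan differently.

```javascript
// Sketch: report whether an explain() winning plan contains a collection scan.
// usesCollectionScan is a hypothetical helper.
function usesCollectionScan(stage) {
  if (!stage) return false;
  if (stage.stage === "COLLSCAN") return true;
  const children = [
    stage.inputStage,
    ...(stage.inputStages || []),
    // On a sharded cluster each shard reports its own winning plan.
    ...(stage.shards || []).map((shard) => shard.winningPlan),
  ];
  return children.some((child) => usesCollectionScan(child));
}

// Made-up sample plans shaped like queryPlanner.winningPlan.
const indexedPlan = {
  stage: "FETCH",
  inputStage: { stage: "IXSCAN", indexName: "fieldName_1" },
};
const scanPlan = { stage: "COLLSCAN", direction: "forward" };
const shardedPlan = {
  stage: "SHARD_MERGE",
  shards: [
    { shardName: "rs1", winningPlan: indexedPlan },
    { shardName: "rs2", winningPlan: scanPlan },
  ],
};

console.log(usesCollectionScan(indexedPlan)); // false
console.log(usesCollectionScan(scanPlan)); // true
console.log(usesCollectionScan(shardedPlan)); // true
```

In the shell the input could be db.collectionName.find({ fieldName: 1 }).explain().queryPlanner.winningPlan.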

Design appropriate shard keys (incremental, random, or compound) to avoid hot partitions.

Enable profiling (db.setProfilingLevel()) to capture slow operations.

Hardware level

Keep hot data and indexes within RAM.

Prefer modern filesystems (ext4, xfs) over ext3.

Architecture level

Separate primary and secondary nodes onto different machines to reduce I/O contention.

Conclusion

MongoDB offers high performance and easy scalability, but careful attention to shard key selection, memory sizing, and disk I/O is essential for optimal operation.

Original link: https://www.jianshu.com/p/f05f65d3a1dc