Master MongoDB Sharding: From Crash Troubleshooting to Cluster Setup
This guide walks through diagnosing a MongoDB outage, explains sharding concepts, shows when to shard, and provides step‑by‑step instructions for configuring config servers, mongos routers, converting replica sets, sharding collections, tracking chunks, and managing the balancer.
Incident troubleshooting
When MongoDB refused connections, check processes with ps -aef|grep mongo. If no mongod processes are running, the disk is likely full. Verify with df -TH. Free space by removing log files ( rm -rf * in the log directory) and restart MongoDB.
If startup hangs with the message "about to fork child process, waiting until server is ready for connection", kill lingering processes:
ps -aef|grep mongo | grep -v grep | awk '{print $2}' | xargs kill -9Delete mongod.lock and diagnostic.data in the data directory, then start with the script mongos_start.sh (which runs mongod --config data/mongodb.conf).
What is MongoDB sharding?
Sharding partitions data across multiple machines. MongoDB supports manual sharding (application manages connections to independent servers) and automatic sharding, where mongos routing processes present the cluster as a single server, maintain a catalog mapping shard‑key ranges to shards, forward client requests, and merge results.
Creating a test sharded cluster
Using the mongo shell with --nodb --norc, instantiate ShardingTest:
st = ShardingTest({
name: "one-min-shards",
chunkSize: 1,
shards: 2,
rs: { nodes: 3, oplogSize: 10 },
other: { enableBalancer: true }
});name : identifier for the cluster
shards : number of shards (2)
rs : each shard is a replica set of 3 nodes
enableBalancer : start the balancer after deployment
The test creates two replica‑set shards, a config‑server replica set, and a mongos (default port 20009). Clients can connect to any process.
When to use sharding
Insufficient RAM on a single node
Insufficient disk space
High CPU or I/O load
Throughput exceeds the capacity of a single MongoDB instance
Production sharded cluster setup
1. Config server
Start config servers before any mongos: mongod -f config.conf Use writeConcern: "majority" for writes and readConcern: "majority" for reads to guarantee metadata consistency.
2. mongos router
Start each router with a config file that specifies the config replica set:
mongod -f mongos.conf # mongos.conf contains configdb=configReplSet/host1,host2,host3Deploy multiple mongos instances close to the shards for lower latency.
3. Convert replica set to shard
From MongoDB 3.4 onward, each shard member must be started with the --shardsvr (or shardsvr=true in the config file) option.
4. Shard a collection
Example: shard the worker collection in the test database on the name field.
Enable sharding for the database: sh.enableSharding("test") Shard the collection:
sh.shardCollection("test.worker", { "name": 1 })If the collection already exists, ensure an index on the shard key; otherwise MongoDB creates one automatically.
Chunk management
Chunks
MongoDB groups documents into chunks based on the shard‑key range. Each chunk resides on a single shard and chunks never overlap.
Chunk splitting
When a chunk exceeds the configured size (default 64 MB), the primary mongod of the shard requests the global chunk size from the config server, splits the chunk, and updates the metadata. Excessive split attempts can cause a “split storm” and degrade performance.
Balancer
The balancer runs on the primary of the config‑server replica set (MongoDB 3.4+). It periodically checks the distribution of chunks across shards and migrates chunks from overloaded shards to under‑utilized ones. The balancer activates only when a shard’s chunk count exceeds a migration threshold.
Following these steps—troubleshooting service failures, understanding sharding concepts, configuring config servers and routers, converting replica sets, sharding collections, and managing chunks and the balancer—enables building and maintaining a robust MongoDB sharded cluster.
Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
ITPUB
Official ITPUB account sharing technical insights, community news, and exciting events.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
