MongoDB Replication Set and Sharding Configuration Guide
This article is a step‑by‑step guide to building a production‑grade MongoDB deployment: replica sets for data redundancy and automatic failover, and sharded clusters for horizontal scaling. It covers the architecture, member roles, configuration files, initialization commands, and day‑to‑day operational procedures.
1. Replica Set Overview
A replica set is a group of mongod processes that maintain the same data set, providing data redundancy and automatic failover.
1.1 Purpose
Replica sets guarantee that data remains available even if a single node fails, and they improve read capacity by allowing secondary nodes to serve read requests.
1.2 Architecture
Typical deployments use three members: one primary, one or two secondaries, and optionally an arbiter that participates only in elections.
1.3 Member Types
Primary: receives all writes and replicates them to secondaries.
Secondary: replicates from the primary and can serve reads.
Arbiter: votes in elections but holds no data.
Priority 0: a secondary that never becomes primary.
Hidden: not visible to drivers; used for backup or reporting.
Delayed: a hidden member with a configurable replication lag.
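These roles correspond to fields in the replica set configuration document. A sketch of turning one member into a hidden, delayed backup node (the member index and delay value are illustrative; slaveDelay is the 3.2‑era field name):

```javascript
// Hypothetical reconfiguration in the mongo shell. Hidden and delayed
// members must also have priority 0.
cfg = rs.conf()
cfg.members[2].priority = 0       // never eligible to become primary
cfg.members[2].hidden = true      // invisible to drivers
cfg.members[2].slaveDelay = 3600  // apply writes one hour behind the primary
rs.reconfig(cfg)
```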
1.4 Configuring a Replica Set
Prepare the environment (CentOS 6.9 with the firewall and SELinux disabled) and install MongoDB 3.2.8.

```shell
# create the mongod user
useradd -u 800 mongod
passwd mongod

# install MongoDB
mkdir -p /mongodb/bin
cd /mongodb
wget http://downloads.mongodb.org/linux/mongodb-linux-x86_64-rhel62-3.2.8.tgz
tar xf mongodb-linux-x86_64-rhel62-3.2.8.tgz
cp mongodb-linux-x86_64-rhel62-3.2.8/bin/* /mongodb/bin
chown -R mongod:mongod /mongodb
su - mongod
```

Create data, log, and config directories for each instance (ports 28017–28020):
```shell
for i in 28017 28018 28019 28020; do
  mkdir -p /mongodb/$i/conf
  mkdir -p /mongodb/$i/data
  mkdir -p /mongodb/$i/log
done
```

Write a mongod.conf for the first instance (port 28017, with systemLog, storage, replication, and net settings) and copy it to the other instances, adjusting the port number with sed:
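A minimal mongod.conf along these lines would work for the first instance (the oplog size and storage engine values are illustrative; the replica set name must match the rs.initiate() configuration used below):

```yaml
# /mongodb/28017/conf/mongod.conf
systemLog:
  destination: file
  path: /mongodb/28017/log/mongodb.log
  logAppend: true
net:
  bindIp: 10.0.0.152
  port: 28017
storage:
  dbPath: /mongodb/28017/data
  engine: wiredTiger
replication:
  oplogSizeMB: 500
  replSetName: my_repl
processManagement:
  fork: true
```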
```shell
for i in 28018 28019 28020; do
  cp /mongodb/28017/conf/mongod.conf /mongodb/$i/conf/
  sed -i "s#28017#$i#g" /mongodb/$i/conf/mongod.conf
done
```

Start all instances:
```shell
for i in 28017 28018 28019 28020; do
  mongod -f /mongodb/$i/conf/mongod.conf
done
```

Initialize the replica set from the mongo shell of the first instance:
```javascript
config = { _id: 'my_repl', members: [
  { _id: 0, host: '10.0.0.152:28017' },
  { _id: 1, host: '10.0.0.152:28018' },
  { _id: 2, host: '10.0.0.152:28019' }
]}
rs.initiate(config)
```

Test replication by inserting documents on the primary and reading them from a secondary (after running rs.slaveOk() on the secondary).
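A quick check along these lines (database and collection names are illustrative):

```javascript
// In the primary's shell: insert a test document
use test
db.repl_check.insert({ msg: "hello", at: new Date() })

// In a secondary's shell (e.g. mongo --port 28018):
rs.slaveOk()           // permit reads on this secondary (3.2-era shell helper)
db.repl_check.find()   // the inserted document should appear
```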
2. Sharding Overview
Sharding distributes large collections across multiple servers to increase storage capacity and throughput. MongoDB uses a routing process (mongos) and config servers to keep cluster metadata.
2.1 Architecture
Config Server: stores cluster metadata; typically three nodes.
Mongos: a query router that forwards client operations to the appropriate shards.
Shard (mongod): stores the actual data; each shard is itself a replica set.
2.2 Chunk Management
Data within a sharded collection is divided into chunks. When a chunk grows beyond the configured chunk size (64 MB by default), it is split. The balancer then migrates chunks between shards to keep them evenly distributed.
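The threshold itself lives in the config database and can be changed through a mongos (the 32 MB value here is just an example):

```javascript
// Run via mongos; the value is in MB and applies cluster-wide.
use config
db.settings.save({ _id: "chunksize", value: 32 })
```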
2.3 Shard Key Selection
Choose a shard key that is indexed, immutable, and no larger than 512 bytes. Options include:
Increasing key: simple, but creates write hotspots.
Random key: distributes writes evenly.
Hashed key: provides uniform distribution at the cost of range query efficiency.
2.4 Deploying a Sharded Cluster
1. **Config Server Replica Set** – create mongod.conf files with sharding.clusterRole: "configsvr" and start them on ports 28018–28020.
2. **Shard Replica Sets** – create two replica sets (e.g., sh1 on ports 28021–28023 and sh2 on ports 28024–28026) with sharding.clusterRole: "shardsvr".
3. **Mongos** – configure mongos.conf to point to the config server replica set and start it on port 28017.
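For instance, the first member of sh1 might use a mongod.conf like this (a sketch, not a complete file; paths mirror the replica set example above):

```yaml
# e.g. /mongodb/28021/conf/mongod.conf — first member of shard replica set sh1
systemLog:
  destination: file
  path: /mongodb/28021/log/mongodb.log
  logAppend: true
net:
  bindIp: 10.0.0.152
  port: 28021
storage:
  dbPath: /mongodb/28021/data
replication:
  replSetName: sh1
sharding:
  clusterRole: shardsvr
processManagement:
  fork: true
```

Config server members use clusterRole: "configsvr" and replSetName: configReplSet instead.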
```shell
cat > /mongodb/28017/conf/mongos.conf <<'EOF'
systemLog:
  destination: file
  path: /mongodb/28017/log/mongos.log
  logAppend: true
net:
  bindIp: 10.0.0.152
  port: 28017
sharding:
  configDB: configReplSet/10.0.0.152:28018,10.0.0.152:28019,10.0.0.152:28020
processManagement:
  fork: true
EOF
mongos -f /mongodb/28017/conf/mongos.conf
```

Connect to mongos and add the shard replica sets:
```javascript
db.runCommand({ addshard: "sh1/10.0.0.152:28021,10.0.0.152:28022,10.0.0.152:28023", name: "shard1" })
db.runCommand({ addshard: "sh2/10.0.0.152:28024,10.0.0.152:28025,10.0.0.152:28026", name: "shard2" })
```

Enable sharding for a database, create the supporting index, then shard the collection:
```javascript
db.runCommand({ enablesharding: "test" })
use test
db.vast.ensureIndex({ id: 1 })
db.runCommand({ shardCollection: "test.vast", key: { id: 1 } })
```

2.5 Balancer Control
Check balancer state:
```javascript
sh.getBalancerState()   // true / false
```

Enable or disable the balancer:
```javascript
sh.setBalancerState(true)    // start
sh.setBalancerState(false)   // stop
```

Configure a balancer active window (e.g., 00:00–05:00) in the config.settings collection:
```javascript
use config
db.settings.update(
  { _id: "balancer" },
  { $set: { activeWindow: { start: "00:00", stop: "05:00" } } },
  { upsert: true }
)
```

Disable balancing for a specific collection:
```javascript
sh.disableBalancing("students.grades")
```

Re‑enable it with sh.enableBalancing("students.grades") and verify with:
```javascript
db.getSiblingDB("config").collections.findOne({ _id: "students.grades" }).noBalance
```

These commands let administrators fine‑tune data distribution, avoid performance degradation during peak hours, and maintain a stable, highly available MongoDB deployment.