Master‑Slave, Replica Set, and Sharding: How MongoDB Achieves High Availability
This article explains MongoDB's evolution from Master‑Slave to Replica Set and Sharding architectures, detailing how each model provides high availability, data reliability, and scalability, and offers practical configuration tips to ensure strong consistency and minimal downtime in production deployments.
MongoDB Overview
MongoDB is a distributed document database widely used for high‑throughput workloads and ranks among the top‑5 databases in the DB‑Engines popularity list.
High‑Availability Concepts
High Availability (HA) aims to minimize service downtime caused by maintenance or failures. Service Level Agreements (SLA) quantify availability; for example, a 99.9 % SLA permits 8.76 hours of downtime per year, 99.99 % permits 52.6 minutes, and 99.999 % permits 5.26 minutes.
MongoDB High‑Availability Modes
Master‑Slave (Deprecated)
The original Master‑Slave architecture uses a single primary for writes and replicates the oplog to read‑only secondaries. Failover is manual, requiring an operator to promote a secondary, which leads to a window of unavailability and possible data inconsistency during replication lag. This mode has been deprecated since MongoDB 3.6.
Example startup commands:
mongod --master --dbpath /data/masterdb/ mongod --slave --source <masterhostname>:<port> --dbpath /data/slavedb/Replica Set
A replica set is a group of mongod instances with three possible roles:
Primary : accepts writes and replicates data to other members.
Secondary : maintains a copy of the data and can become primary through automatic election.
Arbiter : participates in elections without storing data, ensuring an odd number of voting members.
Read operations default to the primary; clients can explicitly read from secondaries using readPreference. Automatic elections eliminate manual failover windows, improving availability. The number of voting members must be odd to avoid split‑brain scenarios.
Typical write concern for reliability:
writeConcern: { w: "majority" }Sharding
Sharding provides horizontal scalability by partitioning data across multiple shard clusters, each of which is itself a replica set. The architecture consists of three layers:
Routing layer (mongos) : a stateless router that receives client requests, determines the target shard based on the sharding key, and forwards the operation.
Config servers : a replica set that stores the cluster topology and the mapping of chunks to shards.
Shard clusters : each shard is a replica set that stores a subset of the data.
Data is distributed using a sharding key . The key is processed by a sharding strategy to produce a value range that is split into chunks . Each chunk is assigned to a specific shard; the mapping resides in the config servers. This indirection allows chunks to be moved between shards without changing the application‑level key.
Supported sharding strategies:
Hashed Sharding : the key is hashed to a numeric value; chunks are evenly sized, giving good load balance but poor range‑query performance.
Range Sharding : the raw key value defines the range; enables efficient range queries and sorting but can create hotspot shards if many keys share a common prefix.
Recommended Usage Patterns
Ensuring HA : Prefer the sharding architecture (mongos) which hides replica‑set failover from the client, or implement client‑side periodic refresh of the replica‑set topology.
Ensuring Data Reliability : Configure writeConcern: { w: "majority" } so that a write is considered successful only after a majority of replica‑set members acknowledge it.
Ensuring Strong Consistency : Use the same majority write concern together with readPreference: "primary" (or the "strong" mode) so that reads are served exclusively by the primary.
Conclusion
The article covered three MongoDB high‑availability architectures: Master‑Slave (now deprecated), Replica Set, and Sharding.
Replica Set provides automatic failover and higher reliability through multiple data copies.
Sharding adds virtually unlimited storage capacity and further improves availability by distributing data across many replica sets.
Proper client configuration—majority write concern and primary‑only reads—ensures both data reliability and strong consistency.
Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
dbaplus Community
Enterprise-level professional community for Database, BigData, and AIOps. Daily original articles, weekly online tech talks, monthly offline salons, and quarterly XCOPS&DAMS conferences—delivered by industry experts.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
