Analysis and Resolution of MongoDB Sharding Balancer Chunk Migration Failures in Version 3.4.x
A client running MongoDB 3.4.9 reported severe chunk imbalance and nightly balancer migration failures in a sharded cluster. The failures were traced to a known bug that makes the balancer schedule conflicting migrations. Disabling balancing for the affected collection works around the problem, and upgrading the cluster to version 3.4.11 or later fixes it.
Background : A client observed a severely uneven chunk distribution across the shards of a MongoDB sharded cluster, and the balancer's nightly migration attempts kept failing, which prompted this investigation.
Impact : Data was heavily skewed across the shards, and because the automatic migrations kept failing, the skew was never corrected.
Environment : MongoDB 3.4.9 with three mongos, three config servers, and three shard replica sets named shard1, shard2, and shard3.
Diagnosis Process : The sh.status() output revealed that collections db01_xxx.col01_xxxx_info_2019 and db01_xxx.col01_xxxx_info had highly uneven chunk counts, with the balancer attempting to move chunks from the most loaded shard to the least loaded shard.
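The per-shard chunk counts that sh.status() prints can also be tallied directly from chunk metadata. The following is a minimal sketch, not the method used in the original investigation: the helper names countChunksByShard and chunkSpread are hypothetical, and the sketch assumes documents shaped like those in config.chunks, where the shard field holds the owning shard. Shards that own no chunks of the collection do not appear in the result.

```javascript
// Tally chunks per shard from documents shaped like those in config.chunks.
// Only the `shard` field is read. `var` and plain functions are used so the
// same code also loads in older mongo shells.
function countChunksByShard(chunkDocs) {
  var counts = {};
  chunkDocs.forEach(function (doc) {
    counts[doc.shard] = (counts[doc.shard] || 0) + 1;
  });
  return counts;
}

// Difference between the most loaded and the least loaded shard.
function chunkSpread(counts) {
  var values = Object.keys(counts).map(function (name) {
    return counts[name];
  });
  if (values.length === 0) {
    return 0;
  }
  return Math.max.apply(null, values) - Math.min.apply(null, values);
}

// Possible use in the mongo shell (3.4 keeps the namespace in the `ns` field):
//   var docs = db.getSiblingDB("config").chunks
//     .find({ ns: "db01_xxx.col01_xxxx_info" }, { shard: 1 })
//     .toArray();
//   printjson(countChunksByShard(docs));
```

A large spread for one collection next to a small spread for another is a hint that migrations for the first collection are not completing.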
Log analysis showed errors such as:
2019-05-27T00:04:06.140+0800 I SHARDING [Balancer] Balancer move db01_xxx.col01_xxxx_info_2019: [{ col01_column_1: "3177000047924787", sharedDate: new Date(1546561546000) }, { billingContractNo: "3177000049293528", sharedDate: new Date(1548383450000) }], from shard2, to shard1 failed :: caused by :: ConflictingOperationInProgress: Unable to start new migration because this shard is currently donating chunk [{ col01_column_1: "3177000525560227", sharedDate: new Date(1527215797000) }, { col01_column_1: "3177000525560227", sharedDate: new Date(1527217436000) }) for namespace db01_xxx.col01_xxxx_info to shard3
The failures were caused by conflicting automatic migrations: the balancer tried to move a chunk of db01_xxx.col01_xxxx_info_2019 off shard2 while that shard was still donating a chunk of db01_xxx.col01_xxxx_info to shard3, so the new migration was rejected with ConflictingOperationInProgress.
This behavior matches a known MongoDB bug (SERVER-29423), in which the balancer schedules migrations for different collections that share the same source or destination shard, even though a shard can take part in only one migration at a time.
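When many such errors accumulate in the mongos or config server log, it helps to pull out which collection failed and which collection was blocking it. The sketch below assumes the exact message layout of the log line quoted above; the function name parseMigrationConflict and the field names of its result are hypothetical, and other MongoDB versions may word the message differently.

```javascript
// Matches a balancer failure caused by ConflictingOperationInProgress, in the
// layout of the log line quoted in this article. Chunk bounds are skipped.
var CONFLICT_RE = /Balancer move (\S+): .*?, from (\S+), to (\S+) failed :: caused by :: ConflictingOperationInProgress: .* for namespace (\S+) to (\S+)/;

// Returns the namespaces and shards involved, or null if the line does not
// describe this kind of conflict.
function parseMigrationConflict(logLine) {
  var m = CONFLICT_RE.exec(logLine);
  if (m === null) {
    return null;
  }
  return {
    failedNamespace: m[1],   // collection whose migration was rejected
    fromShard: m[2],         // intended source shard
    toShard: m[3],           // intended destination shard
    blockingNamespace: m[4], // collection whose chunk was already in flight
    blockingToShard: m[5]    // destination of the in-flight migration
  };
}
```

Applied to the line above, the result names db01_xxx.col01_xxxx_info_2019 as the failed namespace and db01_xxx.col01_xxxx_info as the blocking one, which is the pair of collections that sh.status() showed as unbalanced.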
Fix Verification : The issue can be temporarily mitigated by disabling the balancer for the problematic collection:
// Disable automatic balancing
sh.disableBalancing("db01_xxx.col01_xxxx_info");
// Enable automatic balancing
sh.enableBalancing("db01_xxx.col01_xxxx_info");
Conclusion : The balancer initiates migrations when the number of shards changes (e.g., after removeShard) or when the chunk-count difference between the most and least loaded shards reaches the migration threshold (fewer than 20 chunks: threshold 2; 20 to 79 chunks: threshold 4; 80 or more chunks: threshold 8). The observed bug is fixed in MongoDB 3.4.11 and later (including 3.6), so upgrading the cluster resolves the issue permanently.
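The threshold rule can be written out as a small check. This is a sketch of the documented thresholds quoted above, not the balancer's actual implementation; migrationThreshold and needsBalancing are hypothetical names, and the input is assumed to be a plain object mapping shard names to chunk counts for a single collection.

```javascript
// Migration threshold for a collection, by its total number of chunks:
// fewer than 20 -> 2, 20 to 79 -> 4, 80 or more -> 8.
function migrationThreshold(totalChunks) {
  if (totalChunks < 20) {
    return 2;
  }
  if (totalChunks < 80) {
    return 4;
  }
  return 8;
}

// True when the gap between the most and least loaded shard has reached the
// threshold, which is when a balancing round would be expected to start.
function needsBalancing(chunksPerShard) {
  var counts = Object.keys(chunksPerShard).map(function (name) {
    return chunksPerShard[name];
  });
  if (counts.length < 2) {
    return false; // nothing to balance against
  }
  var total = counts.reduce(function (sum, n) { return sum + n; }, 0);
  var spread = Math.max.apply(null, counts) - Math.min.apply(null, counts);
  return spread >= migrationThreshold(total);
}
```

For example, needsBalancing({ shard1: 10, shard2: 90, shard3: 5 }) is true, because the collection has 105 chunks (threshold 8) and the spread is 85. In the incident described here the threshold was clearly exceeded, so the persistent imbalance pointed to failing migrations and not to a balancer that had nothing to do.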
Related Links :
SERVER-29423 – Sharding balancer schedules multiple migrations with the same conflicting source or destination
MongoDB Sharding Balancer Administration – Migration Thresholds
Aikesheng Open Source Community
The Aikesheng Open Source Community provides stable, enterprise‑grade MySQL open‑source tools and services, releases a premium open‑source component each year (1024), and continuously operates and maintains them.
