
Designing a Scalable Comment Service: Vivo’s MongoDB Architecture Deep Dive

This article details Vivo's journey in building a company‑wide comment platform, explaining why MongoDB was chosen over MySQL, how its sharded cluster is structured, the challenges of shard‑key selection, and the practical steps taken to scale and maintain the system.

ITPUB

Database Selection

The comment service requires a dynamic schema, massive horizontal scalability, and high availability, and it does not need strong transactional guarantees. For these reasons a MongoDB sharded cluster was chosen over MySQL.

MongoDB Cluster Architecture

Components

mongos : routing server that forwards client reads/writes to appropriate shards.

config : replica set storing metadata about sharded collections.

shard : replica set (mongod) that holds the actual chunk data.

Shard Key

Collections are split by a shard key into chunks distributed across shards. Two shard‑key types are supported:

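Hash Sharding : hashes the value of a single field so that documents are distributed evenly, at the cost of efficient range queries on that field. Compound shard keys that include a hashed field are only supported from MongoDB 4.4 onward.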

Range Sharding : distributes data based on key ranges, suitable for range queries.
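The difference between the two strategies can be sketched in a few lines of JavaScript. This is a simplified model, not MongoDB's implementation: the FNV-1a hash, the modulo mapping, and the chunk boundaries are all assumptions chosen for illustration (MongoDB uses its own 64-bit hash and range-partitions the hashed values).

```javascript
// Illustrative stand-in hash (32-bit FNV-1a); MongoDB's hashed index uses a different function.
function fnv1a(str) {
  let h = 0x811c9dc5;
  for (let i = 0; i < str.length; i++) {
    h ^= str.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return h;
}

// Range sharding: a key belongs to the first chunk whose upper bound is greater than the key.
function rangeChunk(key, upperBounds) {
  const i = upperBounds.findIndex((bound) => key < bound);
  return i === -1 ? upperBounds.length : i;
}

// Hash sharding (simplified): the hashed key, not the key itself, decides the chunk.
function hashChunk(key, chunkCount) {
  return fnv1a(String(key)) % chunkCount;
}

const newKeys = [1000, 1001, 1002, 1003]; // hypothetical monotonically increasing keys
const rangePlacement = newKeys.map((k) => rangeChunk(k, [250, 500, 750]));
const hashPlacement = newKeys.map((k) => hashChunk(k, 4));

console.log(rangePlacement); // every key falls into the last chunk
console.log(hashPlacement);  // the keys are spread over different chunks
```

With a monotonically increasing key, range sharding sends all new writes to the same chunk, while hashing spreads them out. The price is that a range query on a hashed key has to be sent to every shard.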

Practical Experience in the Comment Platform

Cluster Expansion

Each business client is assigned a logical cluster; a physical cluster can host many logical clusters. A routing layer built on Spring MongoTemplate with connection‑pool management enables dynamic selection of the target MongoDB cluster.

Introduce logical and physical cluster concepts.

Add a routing layer so applications can switch between MongoDB clusters.

Separate shard clusters provide physical isolation and tailored tuning per business.
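The logical-to-physical mapping can be illustrated with a small registry. The original routing layer is built on Spring MongoTemplate; the JavaScript below is only a language-neutral sketch of the idea, and the class name, business names, URIs, and the connect() factory are all hypothetical.

```javascript
// Hypothetical sketch of a routing layer; not the platform's actual implementation.
class ClusterRouter {
  constructor(connect) {
    this.connect = connect;    // factory: URI -> client (for example, a pooled driver client)
    this.physical = new Map(); // physical cluster name -> connection URI
    this.logical = new Map();  // business (logical cluster) -> physical cluster name
    this.clients = new Map();  // physical cluster name -> cached client
  }

  registerPhysical(name, uri) {
    this.physical.set(name, uri);
  }

  // Point a business at a physical cluster; calling it again moves the business.
  assign(business, physicalName) {
    if (!this.physical.has(physicalName)) {
      throw new Error(`unknown physical cluster: ${physicalName}`);
    }
    this.logical.set(business, physicalName);
  }

  // Resolve the client for a business, creating at most one client per physical cluster.
  clientFor(business) {
    const physicalName = this.logical.get(business);
    if (physicalName === undefined) {
      throw new Error(`no cluster assigned for: ${business}`);
    }
    if (!this.clients.has(physicalName)) {
      this.clients.set(physicalName, this.connect(this.physical.get(physicalName)));
    }
    return this.clients.get(physicalName);
  }
}

// Usage with a stub client; a real system would create a driver client here.
const router = new ClusterRouter((uri) => ({ uri }));
router.registerPhysical("shared-1", "mongodb://mongos-shared.example:27017");
router.registerPhysical("dedicated-1", "mongodb://mongos-dedicated.example:27017");
router.assign("forum", "shared-1");
router.assign("appstore", "shared-1");
router.assign("video", "dedicated-1");
```

Because callers only ever name the business, moving a business to a dedicated physical cluster is a change to the mapping rather than to application code.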

Shard Key Selection

Initially the comment collection used a single‑field shard key topicId. Testing revealed two problems:

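Jumbo chunk : all comments on a topic share the same shard-key value, so they must stay in the same chunk. A hot topic can generate more than 1 GB of data in a chunk that can be neither split nor migrated, which leaves the shards unbalanced.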

Unique key issue : in a sharded collection, a unique index must contain the full shard key as a prefix. When _id is not the shard key, MongoDB enforces _id uniqueness only within each shard, not across the cluster.

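The collection was recreated with a compound shard key {topicId, _id}, which resolves both issues: the chunks of a hot topic can now be split on _id, and a unique index prefixed by the shard key can be enforced cluster-wide.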
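Why the compound key removes the jumbo-chunk problem can be shown with a toy model of chunk splitting. This is a sketch under simplifying assumptions, not MongoDB's split algorithm: a chunk is modeled as a list of documents, numeric _id values stand in for ObjectIds, and a split is only possible at a key that is strictly greater than the smallest key in the chunk.

```javascript
// Compare two shard-key tuples field by field.
function compareKeys(a, b) {
  for (let i = 0; i < a.length; i++) {
    if (a[i] < b[i]) return -1;
    if (a[i] > b[i]) return 1;
  }
  return 0;
}

// Return a key that divides the chunk into two non-empty parts, or null if none exists.
function findSplitPoint(docs, keyOf) {
  const keys = docs.map(keyOf).sort(compareKeys);
  const min = keys[0];
  // Start at the median and take the first key strictly greater than the minimum.
  for (let i = Math.floor(keys.length / 2); i < keys.length; i++) {
    if (compareKeys(keys[i], min) > 0) return keys[i];
  }
  return null;
}

// Hypothetical hot topic: every comment carries the same topicId.
const hotTopic = [1, 2, 3, 4, 5, 6].map((id) => ({ topicId: "t1", _id: id }));

const byTopic = findSplitPoint(hotTopic, (d) => [d.topicId]);
const byTopicAndId = findSplitPoint(hotTopic, (d) => [d.topicId, d._id]);

console.log(byTopic);      // null: no split point exists, the chunk can only grow
console.log(byTopicAndId); // [ 't1', 4 ]: the chunk can be split in the middle
```

In the mongo shell, a compound key of this shape would typically be declared with sh.shardCollection(namespace, { topicId: 1, _id: 1 }); the namespace used in production is not given in the source.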

Migration and Scaling

When a chunk exceeds the configured chunk size (64 MB by default in MongoDB 4.0), MongoDB automatically splits it. The balancer then migrates chunks between shards to keep the distribution even. Because migrations add load, balancer activity can be limited to a low-traffic window:

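// Run against a mongos; balancer settings are stored in the config database.
use config
db.settings.update(
  { _id: "balancer" },
  // Replace HH:MM with the start and stop times of the low-traffic window.
  { $set: { activeWindow: { start: "HH:MM", stop: "HH:MM" } } },
  { upsert: true }
)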
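When choosing the window, it helps to be clear about how start and stop times in "HH:MM" form behave when the window crosses midnight. The helper below is a minimal sketch for reasoning about that, not part of MongoDB. The function names and the example times are hypothetical, and treating the window as including the start and excluding the stop is an assumption.

```javascript
// Convert "HH:MM" to minutes since midnight.
function toMinutes(hhmm) {
  const [h, m] = hhmm.split(":").map(Number);
  return h * 60 + m;
}

// True if `now` lies inside [start, stop); handles windows that cross midnight.
function inActiveWindow(now, start, stop) {
  const t = toMinutes(now);
  const s = toMinutes(start);
  const e = toMinutes(stop);
  return s <= e ? t >= s && t < e : t >= s || t < e;
}

console.log(inActiveWindow("02:30", "01:00", "05:00")); // true
console.log(inActiveWindow("12:00", "01:00", "05:00")); // false
console.log(inActiveWindow("23:30", "23:00", "04:00")); // true (window crosses midnight)
```

A window such as 23:00 to 04:00 is therefore a single five-hour window, not two separate ones.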

Adding a new shard is performed with:

sh.addShard("replicaSetName/hostname:port")

Chunk migration may temporarily affect availability; schedule such operations during off‑peak periods.
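To judge how long an expansion needs, a rough estimate of the migration volume is useful. The function below is a back-of-the-envelope sketch: it assumes chunks are evenly distributed before and after the expansion, whereas the real balancer works with thresholds and may stop short of a perfectly even split. The function name and the chunk counts are hypothetical.

```javascript
// Estimate how many chunks must move onto newly added shards to reach an even distribution.
function chunksToMigrate(totalChunks, currentShards, addedShards = 1) {
  // Chunks each shard should hold once the cluster is balanced again.
  const targetPerShard = Math.floor(totalChunks / (currentShards + addedShards));
  // Every new shard starts empty and has to receive its full share.
  return targetPerShard * addedShards;
}

console.log(chunksToMigrate(1200, 3));    // 300
console.log(chunksToMigrate(1200, 3, 2)); // 480
```

Multiplying the result by the chunk size gives an approximate data volume, which can then be compared with the length of the balancer window.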

Conclusion

The MongoDB cluster (MongoDB 4.0.9) has been in production for over a year, handling >100 million comments from ten business clients with stable performance. While sharded clusters provide horizontal scalability and flexible schema, they impose index and sharding constraints; for many workloads a replica set may be sufficient.
