Evolution of a High‑Scale Push Notification System at Tongcheng Travel

This article chronicles the multi‑year architectural evolution of Tongcheng Travel's push notification platform, detailing early batch‑job designs, successive redesigns using Redis, MongoDB, Kafka, Go, and .NET, and the performance, scalability, and operational improvements achieved through each major version.

Tongcheng Travel Technology Center

1. Initial System Shape

The original push service relied on two scheduled JOBs that worked in turn. The first initialized messages: it extracted iOS devices from a device table that was cleaned daily and inserted formatted rows into a push‑info table. The second ran every minute, pulled pending tasks, and called a third‑party API for delivery. This simple architecture handled 200,000‑300,000 pushes per day, with peaks of around 700,000.

2. Major Refactor of the Legacy Architecture

To eliminate duplicate pushes caused by a hidden bug, Android delivery switched from tag‑based pushes to the per‑device model already used for iOS, and a Redis queue consumed with lpop/rpop was introduced so that each queued task is handed to only one consumer. A high‑performance thread‑pool framework replaced the once‑per‑minute JOB and scaled the number of consumer units dynamically with queue depth. The result was a ten‑fold increase in capacity. The design has run stably since August 2014, with only four machines handling millions of pushes per day.
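The queue‑plus‑scaling idea can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the original implementation: an in‑memory deque guarded by a lock stands in for the Redis list, and the names `TaskQueue`, `target_consumers`, `drain`, and the `per_unit` ratio are hypothetical.

```python
import threading
from collections import deque


class TaskQueue:
    """In-memory stand-in for a Redis list (hypothetical; not the real client)."""

    def __init__(self):
        self._items = deque()
        self._lock = threading.Lock()

    def rpush(self, task):
        with self._lock:
            self._items.append(task)

    def lpop(self):
        # Like Redis LPOP, removal is atomic: a task is returned to one caller only.
        with self._lock:
            return self._items.popleft() if self._items else None

    def depth(self):
        with self._lock:
            return len(self._items)


def target_consumers(depth, per_unit=100, min_units=1, max_units=1200):
    """Pick a consumer count proportional to queue depth, clamped to a range."""
    wanted = -(-depth // per_unit)  # ceiling division
    return max(min_units, min(max_units, wanted))


def drain(queue, workers):
    """Run `workers` threads until the queue is empty; return what each one popped."""
    results = [[] for _ in range(workers)]

    def run(slot):
        while (task := queue.lpop()) is not None:
            results[slot].append(task)

    threads = [threading.Thread(target=run, args=(i,)) for i in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

An atomic pop keeps two consumers from taking the same task, but it does not by itself cover a consumer that fails after popping; a real system would need retry or acknowledgement logic on top of it.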

3. Full‑Link Push Support Refactor

The third version added a unified push API so that existing SMS‑based reminder services could migrate with minimal code changes; incoming requests are transformed and enqueued in Redis. After the November 2014 launch, reminder traffic grew quickly, and by Christmas 2014 the system reached a record 64 million pushes in a single day, which strained the database as its data passed the billion‑row mark.
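A unified entry point of this kind might look like the sketch below. It is only an illustration: the `SmsReminder` and `PushTask` field names, the `device_lookup` mapping, and the JSON task format are all assumptions, and a plain deque stands in for the Redis queue.

```python
import json
from collections import deque
from dataclasses import dataclass, asdict


@dataclass
class SmsReminder:
    """Shape of a legacy SMS reminder request (hypothetical fields)."""
    member_id: str
    text: str


@dataclass
class PushTask:
    """Shape of a queued push task (hypothetical fields)."""
    device_token: str
    platform: str
    body: str


def to_push_task(reminder, device_lookup):
    """Convert an SMS-style request into a push task, or None if no device is known."""
    device = device_lookup.get(reminder.member_id)
    if device is None:
        return None
    token, platform = device
    return PushTask(device_token=token, platform=platform, body=reminder.text)


def submit(reminder, device_lookup, queue):
    """Unified entry point: transform the request, then enqueue it as JSON."""
    task = to_push_task(reminder, device_lookup)
    if task is None:
        return False
    queue.append(json.dumps(asdict(task)))
    return True
```

Because callers keep sending the request shape they already use, the conversion lives in one place and the consumers behind the queue only ever see one task format.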

4. Storage Upgrade under Explosive Business Growth

In early 2015, a storage overhaul introduced a pipeline in which each device's latest messages are cached in Redis, queued, and batch‑written to MongoDB every 15 seconds or every 5,000 records. Reads merge MongoDB data with the Redis cache and return in under 10 ms, which allowed the fourth version to sustain more than ten million pushes per day without triggering alerts.
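The batching and merge steps can be illustrated with a small sketch. It assumes that a batch is flushed as soon as either threshold is reached, that messages carry hypothetical `id` and `ts` fields, and that plain dictionaries stand in for the Redis cache and the MongoDB collection. The clock is passed in explicitly so the behavior is easy to follow.

```python
class BatchBuffer:
    """Collects records and releases them when a size or age threshold is hit."""

    def __init__(self, max_records=5000, max_age_seconds=15.0):
        self.max_records = max_records
        self.max_age_seconds = max_age_seconds
        self._records = []
        self._first_at = None  # time the oldest buffered record arrived

    def add(self, record, now):
        """Buffer one record; return a batch to write if a threshold is reached."""
        if not self._records:
            self._first_at = now
        self._records.append(record)
        return self.poll(now)

    def poll(self, now):
        """Return the pending batch if it is due, else None."""
        if not self._records:
            return None
        full = len(self._records) >= self.max_records
        stale = now - self._first_at >= self.max_age_seconds
        if full or stale:
            batch, self._records = self._records, []
            self._first_at = None
            return batch
        return None


def read_messages(device_id, cache, store):
    """Merge recent cached messages with stored ones, newest first, without duplicates."""
    merged = {m["id"]: m for m in store.get(device_id, [])}
    merged.update({m["id"]: m for m in cache.get(device_id, [])})
    return sorted(merged.values(), key=lambda m: m["ts"], reverse=True)
```

In a running service, `poll` would be called on a timer with a monotonic clock so that a quiet period still flushes a partly filled batch.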

5. System Slimming and Efficiency Gains

Each push node runs 400‑1,200 active consumer units that pull from Redis and send via an Agent to the push API. By consolidating Agent instances, introducing an internal queue, and redesigning the iOS Agent with a connection‑pool‑like model, latency dropped from ~3 seconds to milliseconds. Additionally, the MongoDB write buffer was replaced with an in‑process pool, saving ~30 GB of Redis memory.
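The source does not describe the iOS Agent's internals, so the following is only a generic sketch of what a connection‑pool‑like model usually means: a fixed set of long‑lived connections is opened once and lent to senders, instead of each send opening its own. `ConnectionPool` and `FakeConnection` are hypothetical names.

```python
import queue
from contextlib import contextmanager


class ConnectionPool:
    """Keeps a fixed set of open connections and lends them out one at a time."""

    def __init__(self, factory, size):
        self._idle = queue.Queue()
        self.created = 0
        for _ in range(size):
            self._idle.put(factory())
            self.created += 1

    @contextmanager
    def borrow(self, timeout=None):
        conn = self._idle.get(timeout=timeout)  # blocks while all connections are busy
        try:
            yield conn
        finally:
            self._idle.put(conn)  # hand the connection back for reuse


class FakeConnection:
    """Stand-in for a long-lived connection to a push gateway (hypothetical)."""

    def __init__(self):
        self.sent = []

    def send(self, payload):
        self.sent.append(payload)
```

Reusing connections removes per‑message setup cost, which is one plausible source of the latency improvement described above. The sketch omits replacing a connection that has failed.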

6. Lighter and Faster: New Challenges

The sixth version, rewritten in Go, introduced an Adapter layer for data‑format conversion, which makes it easy to onboard new apps. It was deployed alongside the .NET version in September 2016 and showed lower resource usage and higher throughput. After marketing batch pushes were migrated in early 2017, the Go service became the sole platform; current peak consumption exceeds 3 million pushes per minute.
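An adapter layer of this kind can be sketched as a registry of per‑app conversion functions. The example is written in Python for brevity, whereas the production service is described as Go; the app names, payload fields, and the common `token`/`title`/`body` format are all assumptions made for illustration.

```python
ADAPTERS = {}


def adapter(app_name):
    """Register a function that converts one app's payload to the common format."""
    def register(func):
        ADAPTERS[app_name] = func
        return func
    return register


@adapter("app_a")
def from_app_a(payload):
    return {"token": payload["deviceToken"], "title": payload["title"], "body": payload["msg"]}


@adapter("app_b")
def from_app_b(payload):
    return {"token": payload["tk"], "title": payload.get("subject", ""), "body": payload["content"]}


def normalize(app_name, payload):
    """Look up the app's adapter and convert the payload; unknown apps are rejected."""
    try:
        convert = ADAPTERS[app_name]
    except KeyError:
        raise ValueError(f"no adapter registered for {app_name!r}") from None
    return convert(payload)
```

With this shape, onboarding a new app means adding one registered function, and the delivery path downstream stays unchanged.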

7. Continuous Momentum

By March 2017 the system decoupled message storage from the push service, using Kafka to pre‑write marketing messages and off‑load MongoDB write/read peaks, further boosting interface performance. Future plans aim to fully separate reminder‑type storage, leaving the push service to handle only delivery, completing the next round of architectural refinement.

Overall, three years of iterative development transformed a modest batch‑job push system into a highly scalable, low‑latency, multi‑language platform capable of handling tens of millions of daily notifications with minimal operational overhead.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.
