Iterative Development and Scaling of Zhuanzhuan's Push Notification System
This article details the end-to-end evolution of Zhuanzhuan's push notification platform, covering terminology, architecture, large-scale holiday pushes, real-time data handling, performance optimizations, multi-channel integration, AB testing, and monitoring, all aimed at high throughput and reliability.
Introduction : The article walks through the iterative development of Zhuanzhuan's push notification system. It defines key terms such as push scope, target devices, channels, tokens, delivery rate, and click-through rate, and describes the existing architecture, which supports backend, business, and personalized pushes.
Existing Architecture : The current design integrates multiple vendor channels (Xiaomi, Huawei, Meizu, APNs) and stores token information for target devices. Backend pushes handle high-volume, short-duration campaigns; business pushes are event-triggered and high-priority; personalized pushes rely on user profiles and more complex strategies.
Project Iteration Process :
2.1 Origin – PM wants to push operational activities : The original workflow was to extract a user set from the big-data platform and then call the push APIs. Requests from multiple PMs, sudden spikes in demand, and recurring weekly activities strained this manual process, so a backend service was built that supports ID-based uploads, time-based scheduling, and cross-platform pushes.
2.2 Large-scale holiday push : The requirement was to push to 100 million users within an hour, covering recently active users. Analysis of data volume (≈5 GB for 50 million devices) and performance (a target of ≈27 k QPS, which is roughly 100 million pushes spread over 3,600 seconds) led to a design based on real-time Kafka ingestion, Redis Zset for token storage, and multi-threaded sending.
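The sizing figures quoted above can be checked with simple arithmetic. The short Python sketch below only restates the numbers from the text; treating 1 GB as 10^9 bytes is an assumption made for the estimate.

```python
# Back-of-the-envelope sizing using the figures quoted in the text.
USERS_TO_PUSH = 100_000_000   # holiday push audience
WINDOW_SECONDS = 60 * 60      # "within an hour"

# Average send rate needed to finish inside the window.
required_qps = USERS_TO_PUSH / WINDOW_SECONDS

DATA_BYTES = 5 * 10**9        # ~5 GB, assuming 1 GB = 10^9 bytes
DEVICES = 50_000_000

# Average storage footprint per device record.
bytes_per_device = DATA_BYTES / DEVICES

print(f"required QPS ~ {required_qps:,.0f}")
print(f"bytes per device ~ {bytes_per_device:.0f}")
```

The result, a little under 28 k pushes per second, is an average; a real system would need headroom above it to absorb retries and uneven vendor latency.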
2.3 AB testing : Introduces an AB testing layer to select the best copy based on early click‑through results before full rollout.
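As a rough illustration of picking a winning copy from early results, the sketch below compares click-through rates across variants. The class name, the `min_delivered` threshold, and the sample numbers are all hypothetical; the article does not describe how its AB layer actually decides.

```python
from dataclasses import dataclass


@dataclass
class VariantStats:
    """Early results for one copy variant (hypothetical structure)."""
    name: str
    delivered: int = 0
    clicked: int = 0

    @property
    def ctr(self) -> float:
        # Click-through rate; zero when nothing has been delivered yet.
        return self.clicked / self.delivered if self.delivered else 0.0


def pick_winner(variants, min_delivered=1000):
    """Return the variant with the highest CTR among those with enough data.

    Returns None if no variant has reached min_delivered yet.
    """
    eligible = [v for v in variants if v.delivered >= min_delivered]
    if not eligible:
        return None
    return max(eligible, key=lambda v: v.ctr)


# Made-up early results for two copies sent to small sample groups.
copy_a = VariantStats("copy_a", delivered=5000, clicked=150)
copy_b = VariantStats("copy_b", delivered=5000, clicked=210)
winner = pick_winner([copy_a, copy_b])
print(winner.name if winner else "not enough data")
```

A production system would likely also check that the difference is statistically meaningful before committing the full audience to one copy.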
2.4 Multi‑channel integration : Adds support for major vendor channels, determines the optimal channel per device, and stores token‑channel mappings using a token service with sharding and caching, improving delivery rates by about 10%.
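The sketch below shows one way channel selection and a sharded token-to-channel mapping could look. It is an assumption-laden illustration: the fallback channel name, the shard count, the CRC32-based sharding, and the in-memory dictionaries standing in for the token service are all hypothetical, and the caching layer is omitted.

```python
import zlib

# Vendor channels named in the article; keys are lowercase manufacturer names.
VENDOR_CHANNELS = {"xiaomi": "xiaomi", "huawei": "huawei", "meizu": "meizu"}
FALLBACK_CHANNEL = "default"  # hypothetical channel for other Android devices
NUM_SHARDS = 16               # hypothetical shard count


def choose_channel(os_name: str, manufacturer: str) -> str:
    """Pick a push channel for a device based on OS and manufacturer."""
    if os_name.lower() == "ios":
        return "apns"
    return VENDOR_CHANNELS.get(manufacturer.lower(), FALLBACK_CHANNEL)


def shard_for(token: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a token to a shard with a stable hash (CRC32 here)."""
    return zlib.crc32(token.encode("utf-8")) % num_shards


# In-memory stand-in for the sharded token-channel store.
shards = [dict() for _ in range(NUM_SHARDS)]


def save_mapping(token: str, os_name: str, manufacturer: str) -> str:
    channel = choose_channel(os_name, manufacturer)
    shards[shard_for(token)][token] = channel
    return channel


def lookup_channel(token: str):
    return shards[shard_for(token)].get(token)


save_mapping("tok-1", "Android", "Xiaomi")
save_mapping("tok-2", "iOS", "Apple")
save_mapping("tok-3", "Android", "OtherBrand")
print(lookup_channel("tok-1"), lookup_channel("tok-2"), lookup_channel("tok-3"))
```

A stable hash matters here because the same token must always land on the same shard, whichever process performs the lookup.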
2.5 Real‑time monitoring : Monitors dimensions such as product line, client OS, metrics (send, delivery, click rates), templates, periods, and channels, with visual dashboards (images omitted).
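To make the monitored metrics concrete, here is a minimal sketch of counting push events per dimension and deriving rates from them. The class, the choice of three dimensions, and the rate definitions (delivery rate as delivered over sent, click rate as clicked over delivered) are assumptions for illustration; the article does not give its exact formulas.

```python
from collections import defaultdict

EVENTS = ("sent", "delivered", "clicked")


class PushMetrics:
    """Counts push events keyed by (product_line, client_os, channel)."""

    def __init__(self):
        self._counts = defaultdict(lambda: dict.fromkeys(EVENTS, 0))

    def record(self, event, product_line, client_os, channel, n=1):
        if event not in EVENTS:
            raise ValueError(f"unknown event: {event}")
        self._counts[(product_line, client_os, channel)][event] += n

    def rates(self, product_line, client_os, channel):
        counts = self._counts.get((product_line, client_os, channel))
        if not counts or counts["sent"] == 0:
            return {"delivery_rate": 0.0, "click_rate": 0.0}
        delivered = counts["delivered"]
        return {
            # Assumed definitions, see the lead-in above.
            "delivery_rate": delivered / counts["sent"],
            "click_rate": counts["clicked"] / delivered if delivered else 0.0,
        }


metrics = PushMetrics()
key = ("main_app", "android", "xiaomi")  # hypothetical dimension values
metrics.record("sent", *key, n=1000)
metrics.record("delivered", *key, n=800)
metrics.record("clicked", *key, n=40)
print(metrics.rates(*key))
```

Templates and time periods could be added as further key components in the same way.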
Performance Improvements : Identifies bottlenecks like single‑threaded sending and vendor channel latency. Solutions include separating Android and iOS pipelines, pushing data to a message queue, and scaling consumers, achieving push QPS >30 k.
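The pattern described above, per-platform pipelines feeding a queue that a pool of consumers drains, can be sketched with the Python standard library. This is a single-process stand-in: `queue.Queue` plays the role of the message queue, threads play the role of consumer instances, and `run_pipeline` and its parameters are hypothetical names.

```python
import queue
import threading


def run_pipeline(tokens, send, num_consumers=4):
    """Push tokens through a bounded queue drained by several consumers.

    Returns the number of tokens for which send() completed without error.
    """
    q = queue.Queue(maxsize=1000)
    sent_count = 0
    lock = threading.Lock()

    def consumer():
        nonlocal sent_count
        while True:
            token = q.get()
            if token is None:      # sentinel: no more work
                break
            try:
                send(token)
            except Exception:
                continue           # a real system would retry or log here
            with lock:
                sent_count += 1

    workers = [threading.Thread(target=consumer) for _ in range(num_consumers)]
    for w in workers:
        w.start()
    for token in tokens:
        q.put(token)
    for _ in workers:
        q.put(None)                # one sentinel per consumer
    for w in workers:
        w.join()
    return sent_count


# Separate pipelines per platform, as the text describes.
by_os = {
    "android": [f"a-{i}" for i in range(50)],
    "ios": [f"i-{i}" for i in range(30)],
}
delivered = []
results = {
    os_name: run_pipeline(tokens, delivered.append)
    for os_name, tokens in by_os.items()
}
print(results)
```

In this sketch the two pipelines run one after the other for simplicity; in a deployed system they would run concurrently, and throughput would be scaled by adding consumer instances rather than threads in one process.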
Operational Challenges : After scaling, instantaneous traffic spikes cause timeouts and page failures. Mitigations involve adding servers, multithreading, rate limiting, caching core app data, and simplifying activity flows.
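Rate limiting is one of the mitigations listed above. The article does not say which algorithm was used, so the sketch below uses a token bucket, a common choice, purely as an illustration; the class name and parameters are hypothetical.

```python
import time


class TokenBucket:
    """Allows bursts up to `capacity` and a sustained `rate` per second."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self._clock = clock
        self._tokens = float(capacity)   # start with a full bucket
        self._last = clock()

    def allow(self, cost=1.0):
        """Return True and consume `cost` tokens if enough are available."""
        now = self._clock()
        # Refill in proportion to elapsed time, never above capacity.
        self._tokens = min(
            self.capacity, self._tokens + (now - self._last) * self.rate
        )
        self._last = now
        if self._tokens >= cost:
            self._tokens -= cost
            return True
        return False


# Usage with a fake clock so the behaviour is deterministic.
fake_now = [0.0]
bucket = TokenBucket(rate=10, capacity=5, clock=lambda: fake_now[0])

burst = [bucket.allow() for _ in range(6)]   # 5 pass, the 6th is rejected
fake_now[0] += 0.5                           # half a second refills 5 tokens
after_wait = bucket.allow()
print(burst, after_wait)
```

Placing such a limiter in front of the sender smooths the instantaneous click-through traffic that a large push generates, at the cost of stretching the send window.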
Data Storage Comparison :
| Scenario | Description | Redis Zset | MySQL |
| --- | --- | --- | --- |
| Data Initialization | Bulk write of full data | Direct import | Direct import |
| Add | Real-time new user token | Direct write | InsertOrUpdate |
| Update | Update user info | Overwrite | InsertOrUpdate |
| Delete | Remove tokens older than 90 days | Simple | Complex |
| Batch read performance | Fetch user set (e.g., 20 M active monthly) | ~20 s | ~60 s |
| Single-key read performance | Get user info by token | >30 k QPS | ~20 k QPS |
| Scalability | Ability to expand quickly | Yes | No |
The final choice was to use Redis Zset for storage due to its superior performance and scalability.
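The operations compared in the table map naturally onto sorted-set commands. The sketch below uses a small pure-Python stand-in whose methods mirror the Redis commands ZADD, ZRANGEBYSCORE, and ZREMRANGEBYSCORE, so it runs without a server. Using the last-active timestamp as the score is an assumption; the article does not state what the score holds.

```python
class TinyZset:
    """In-memory stand-in mirroring a few Redis sorted-set commands."""

    def __init__(self):
        self._scores = {}  # member -> score

    def zadd(self, mapping):
        # Adding an existing member overwrites its score (Add / Update rows).
        self._scores.update(mapping)

    def zrangebyscore(self, lo, hi):
        # Members with lo <= score <= hi, ordered by score then member.
        hits = [(s, m) for m, s in self._scores.items() if lo <= s <= hi]
        return [m for _, m in sorted(hits)]

    def zremrangebyscore(self, lo, hi):
        # Remove members with lo <= score <= hi; return how many were removed.
        doomed = [m for m, s in self._scores.items() if lo <= s <= hi]
        for m in doomed:
            del self._scores[m]
        return len(doomed)


DAY = 86_400
now = 1_700_000_000  # fixed timestamp so the example is deterministic

tokens = TinyZset()
tokens.zadd({"tok-a": now - 40 * DAY, "tok-b": now - 100 * DAY})
tokens.zadd({"tok-c": now - 2 * DAY})   # Add: real-time new token
tokens.zadd({"tok-a": now - 1 * DAY})   # Update: overwrite the score

# Delete: drop tokens not seen for more than 90 days, in one range call.
removed = tokens.zremrangebyscore(float("-inf"), now - 90 * DAY)

# Batch read: tokens active within the last 30 days.
active = tokens.zrangebyscore(now - 30 * DAY, now)
print(removed, active)
```

This shows why deletion is listed as "Simple" for the Zset: expiry by age becomes a single range removal over the score.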
Final Summary : The push system now achieves near‑real‑time delivery for millions of users, supports pause/preview, AB testing, offline notifications, anti‑disturbance, priority channels, and leverages pre‑loading, caching, multithreading, efficient data structures, batch processing, and special tracking points. Trade‑offs include asynchronous uploads, rate limiting, degradation handling, layered decoupling, and compensatory notifications.
Author : Wang Jikuan, Business Lead of the Zhuanzhuan platform.
Zhuanzhuan Tech
A platform for Zhuanzhuan R&D and industry peers to learn and exchange technology, regularly sharing frontline experience and cutting‑edge topics. We welcome practical discussions and sharing; contact waterystone with any questions.