Precise Push Notification Architecture and Algorithm Optimization at 58.com
This article describes the evolution of 58.com's user‑set service architecture, the transition from MongoDB to RoaringBitmap storage, and the machine‑learning‑driven algorithm optimizations that enable real‑time, multi‑dimensional, and localized push notifications for millions of users.
In the era of big data and algorithms, content distribution has shifted from simple channel pushes to precise identification of user preferences, enabling highly targeted recommendations that capture user attention.
Precise push can be divided into two categories: on‑site (ads, feed recommendations, product suggestions) and off‑site (advertising, SMS, push). This article focuses on off‑site push, particularly push notifications.
The precise user‑set push service originates from the 58.com user‑profile system, generating target user groups based on profile data and managing the full lifecycle of messages and ads through an internal push platform, launched at the end of 2016.
User‑Set Service Architecture Evolution
Basic Functions
The service provides: (1) logical filtering (AND, OR, NOT) of user sets based on profile dimensions, with filtering tasks completing within 15 minutes; (2) real‑time estimation of the filtered audience size; (3) statistical and visual analysis of the full or filtered audience.
Figure: real‑time audience size preview.
Figure: statistical visualization of user dimensions.
Architecture Prototype
The service uses user‑profile data as its source, supporting billions of users and over 2,000 profile tags. It consists of three services: a synchronous user‑set filtering service, an asynchronous analysis service, and a real‑time estimation service that uses sampled user sets for fast response.
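The sampling idea behind the real-time estimation service can be sketched as follows. This is a minimal illustration, not the production implementation: the function name `estimate_audience`, the 1% sample rate, and the stand-in filter are all assumptions made for the example.

```python
import random

def estimate_audience(sample_ids, total_users, predicate):
    """Scale the match rate seen in a uniform sample up to the full user base."""
    if not sample_ids:
        return 0
    hits = sum(1 for uid in sample_ids if predicate(uid))
    return round(hits / len(sample_ids) * total_users)

# Hypothetical setup: 1,000,000 users, a 1% uniform sample, and a stand-in
# filter that matches every fifth user id (true audience size: 200,000).
TOTAL_USERS = 1_000_000
rng = random.Random(42)
sample = rng.sample(range(TOTAL_USERS), 10_000)
matches_filter = lambda uid: uid % 5 == 0

estimate = estimate_audience(sample, TOTAL_USERS, matches_filter)
print(estimate)  # an approximation of the true audience size
```

Evaluating the filter on the sample only keeps the response time independent of the full population size, at the cost of a sampling error that shrinks as the sample grows.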
The backend initially stored profile data in MongoDB and later switched to an in‑memory Bitmap implementation written in C/C++ for faster computation. Each (user, tag) pair becomes a bit in a massive sparse matrix, enabling fast logical operations (AND/OR/NOT) over the entire user base.
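A small sketch of bitmap-based set filtering is shown below. It uses Python integers as arbitrary-length bitsets purely for illustration; the production system is described as a C/C++ implementation, and the tag names and user ids here are hypothetical.

```python
# Each tag owns one bitmap; bit i is set when user i carries the tag.
NUM_USERS = 8  # assumed tiny user base for illustration

def make_bitmap(user_ids):
    """Build a bitmap with one bit set per user id."""
    bitmap = 0
    for uid in user_ids:
        bitmap |= 1 << uid
    return bitmap

def bitmap_not(bitmap, num_users=NUM_USERS):
    """Complement a bitmap, restricted to the known user range."""
    full = (1 << num_users) - 1
    return full & ~bitmap

def to_user_ids(bitmap):
    """Decode a bitmap back into a sorted list of user ids."""
    return [i for i in range(bitmap.bit_length()) if (bitmap >> i) & 1]

# Hypothetical tags
city_beijing = make_bitmap([0, 1, 2, 5])
job_seeker = make_bitmap([1, 2, 3, 6])
pushed_today = make_bitmap([2, 6])

# Beijing AND job seeker AND NOT already pushed today
target = city_beijing & job_seeker & bitmap_not(pushed_today)
print(to_user_ids(target))     # [1]
print(bin(target).count("1"))  # audience size: 1
```

Because every logical operator maps to a single bitwise operation over whole machine words, filtering cost depends on the bitmap length rather than on the number of matching users.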
To improve availability, the storage layer is horizontally scaled with multiple instances behind a load balancer, and updates are handled via an active‑standby dual‑replica scheme.
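The article does not detail how the active-standby update works. One common reading, sketched here as an assumption, is that each instance keeps two copies of the data: readers always hit the active copy, a refresh is loaded into the standby copy, and the roles are then swapped. The class name `DualReplicaStore` and its methods are hypothetical.

```python
import threading

class DualReplicaStore:
    """Two in-memory copies: one serves reads while the other is rebuilt."""

    def __init__(self, initial):
        self._replicas = [dict(initial), dict(initial)]
        self._active = 0
        self._update_lock = threading.Lock()  # serializes writers only

    def read(self, tag):
        """Reads never block; they always see a fully built replica."""
        return self._replicas[self._active].get(tag)

    def update(self, new_data):
        """Load new data into the standby replica, then switch roles."""
        with self._update_lock:
            standby = 1 - self._active
            self._replicas[standby] = dict(new_data)
            self._active = standby  # single assignment flips the roles

store = DualReplicaStore({"job_seeker": 0b0110})
store.update({"job_seeker": 0b1110})
print(store.read("job_seeker"))  # 14
```

The previous active copy stays intact after the swap, so it can also serve as a rollback target if the new data turns out to be bad.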
Architecture Evolution
As the user count and tag dimensions grew, the memory footprint of the original uncompressed Bitmap storage grew linearly with them. To compress memory while retaining speed, the team evaluated several solutions and adopted the open‑source RoaringBitmap, which uses a hybrid of Array and Bitmap containers. For sparse data, the Array container reduces memory usage dramatically without sacrificing operation efficiency.
Benchmarks show that when a container holds fewer than 4,096 values (4K), the Array container occupies less space than a Bitmap container, whose size is fixed at 8 KB. This allows substantial memory compression in the company's sparse business scenario.
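The hybrid container idea can be illustrated with a simplified sketch. It follows the published RoaringBitmap design, in which 32-bit ids are grouped by their high 16 bits and each group is stored either as a sorted array of 16-bit values (up to 4,096 entries) or as a fixed 8 KB bitmap. The function names below are hypothetical, and details of the real library such as run containers are left out.

```python
from array import array

ARRAY_MAX = 4096               # above this, a bitmap container is smaller
BITMAP_BYTES = (1 << 16) // 8  # 65,536 bits = 8,192 bytes, fixed

def build(values):
    """Group ids by their high 16 bits and pick a container type per group."""
    groups = {}
    for v in values:
        groups.setdefault(v >> 16, set()).add(v & 0xFFFF)
    containers = {}
    for key, lows in groups.items():
        if len(lows) <= ARRAY_MAX:
            # Sparse group: sorted array of 16-bit values
            containers[key] = ("array", array("H", sorted(lows)))
        else:
            # Dense group: one bit per possible low value
            bits = bytearray(BITMAP_BYTES)
            for low in lows:
                bits[low >> 3] |= 1 << (low & 7)
            containers[key] = ("bitmap", bits)
    return containers

def payload_bytes(containers):
    """Approximate payload size, counting 2 bytes per array entry."""
    total = 0
    for kind, data in containers.values():
        if kind == "array":
            total += 2 * len(data)
        else:
            total += len(data)
    return total

sparse = build(range(0, 1_000_000, 1000))  # 1,000 ids spread thinly
dense = build(range(65536))                # one completely full group
print(payload_bytes(sparse))  # 2000
print(payload_bytes(dense))   # 8192
```

An uncompressed bitmap covering the same one million id range would need about 125 KB regardless of how many bits are set, which is why sparse tags benefit most from the array representation.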
Precise Push Algorithm Optimization
Application Background
By combining user behavior features from multiple business lines, various machine‑learning models are used to deliver real‑time, personalized push messages.
Mechanism Design
The system provides customized push services for different business scenarios, such as full‑time recruitment, used‑car browsing, and new‑car news, by mapping business‑specific signals (e.g., employer interactions, increased communication volume, correlated browsing patterns) to user‑profile attributes.
Features considered include personal attributes (age, gender, income, interests), behavioral traits (browsing preferences, activity level, call frequency, chat usage), and network properties (friend circles, location).
Data from thousands of dimensions across multiple subsidiaries are fused; after initial high‑information feature selection and labeling, statistical methods (correlation, significance testing) and feature‑engineering techniques (derived variables, discretization, text‑behavior fusion) are applied before feeding the data into classifiers.
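Two of the steps named above, correlation-based feature screening and discretization, can be sketched in plain Python. The feature names, bucket edges, and the 0.3 threshold are illustrative assumptions, not values from the 58.com pipeline.

```python
import bisect
import math

def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 for a constant column."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        return 0.0
    return cov / (sx * sy)

def select_features(columns, labels, threshold=0.3):
    """Keep features whose absolute correlation with the label is high enough."""
    return [name for name, values in columns.items()
            if abs(pearson(values, labels)) >= threshold]

def discretize(value, edges):
    """Map a continuous value to the index of its bucket."""
    return bisect.bisect_right(edges, value)

# Hypothetical training rows: 1 = user converted after a push
labels = [1, 0, 1, 0, 1, 0]
columns = {
    "daily_visits": [9, 2, 8, 1, 7, 3],  # tracks the label closely
    "signup_batch": [1, 1, 2, 2, 3, 3],  # unrelated to the label
}
print(select_features(columns, labels))  # ['daily_visits']
print(discretize(25, [18, 30, 45]))      # 1, i.e. the 18-29 age bucket
```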
The model is continuously retrained using online conversion feedback to improve prediction accuracy.
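The article does not say which model or update rule is used for retraining. As one possible illustration, the sketch below applies a stochastic gradient step to a logistic-regression model each time a conversion outcome arrives; the class name, learning rate, and two-feature layout are assumptions.

```python
import math

class OnlineLogisticModel:
    """Logistic regression updated one feedback event at a time."""

    def __init__(self, num_features, learning_rate=0.1):
        self.weights = [0.0] * num_features
        self.bias = 0.0
        self.learning_rate = learning_rate

    def predict(self, features):
        """Estimated probability that the user converts."""
        z = self.bias + sum(w * x for w, x in zip(self.weights, features))
        return 1.0 / (1.0 + math.exp(-z))

    def update(self, features, converted):
        """One gradient step on the log loss for a single outcome (0 or 1)."""
        error = self.predict(features) - converted
        self.bias -= self.learning_rate * error
        self.weights = [w - self.learning_rate * error * x
                        for w, x in zip(self.weights, features)]

# Hypothetical feedback stream: feature 0 goes with conversions,
# feature 1 goes with ignored pushes.
model = OnlineLogisticModel(num_features=2)
for _ in range(200):
    model.update([1.0, 0.0], converted=1)
    model.update([0.0, 1.0], converted=0)

print(model.predict([1.0, 0.0]) > 0.5)  # True
print(model.predict([0.0, 1.0]) < 0.5)  # True
```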
Multi‑Dimensional Fusion
Precise push emphasizes three characteristics: scenario‑based, localized, and real‑time.
Scenario‑based: Users interact via web, mobile web, and app across various sub‑platforms (e.g., city‑specific services). The IDs a user carries on each channel are mapped to one another so that behavior can be aggregated per user, enabling the system to push at the optimal time and place.
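One way to map IDs from different channels onto a single user is a union-find structure, sketched below. The article does not describe the actual ID-mapping method, so the class `IdGraph` and the identifier formats are hypothetical.

```python
class IdGraph:
    """Union-find over identifiers observed on different channels."""

    def __init__(self):
        self._parent = {}

    def _find(self, x):
        """Return the representative id of x's group (with path halving)."""
        self._parent.setdefault(x, x)
        while self._parent[x] != x:
            self._parent[x] = self._parent[self._parent[x]]
            x = self._parent[x]
        return x

    def link(self, a, b):
        """Record that two identifiers were seen for the same user."""
        root_a, root_b = self._find(a), self._find(b)
        if root_a != root_b:
            self._parent[root_b] = root_a

    def same_user(self, a, b):
        return self._find(a) == self._find(b)

ids = IdGraph()
ids.link("cookie:abc", "account:42")  # web login
ids.link("device:xyz", "account:42")  # app login
print(ids.same_user("cookie:abc", "device:xyz"))  # True
print(ids.same_user("cookie:abc", "cookie:zzz"))  # False
```

Once identifiers resolve to one representative, behavior events from web, mobile web, and app can be aggregated under that key.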
Localized: As a local‑life service platform, 58.com ensures that push content matches the user's geographic context, which is critical for housing and recruitment scenarios.
Real‑time: Using device location, the system infers work address, desired rent range, travel time, etc., and instantly delivers the most suitable recommendation when a need arises.
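As a rough illustration of location-based inference, the sketch below guesses a user's work area as the most frequently visited coarse grid cell during working hours. The heuristic, the hour range, the rounding precision, and the sample coordinates are all assumptions; the article does not describe the actual method.

```python
from collections import Counter

def infer_work_cell(pings, work_hours=range(10, 18), precision=2):
    """Return the (lat, lon) grid cell seen most often during work hours.

    pings: iterable of (hour_of_day, latitude, longitude) tuples.
    precision: decimal places kept; 2 places is roughly a 1 km cell.
    """
    cells = Counter(
        (round(lat, precision), round(lon, precision))
        for hour, lat, lon in pings
        if hour in work_hours
    )
    if not cells:
        return None
    return cells.most_common(1)[0][0]

# Hypothetical location pings for one device
pings = [
    (11, 39.9841, 116.3072),
    (14, 39.9843, 116.3069),
    (22, 40.0512, 116.4101),  # evening ping, ignored
    (15, 39.9838, 116.3074),
]
print(infer_work_cell(pings))  # (39.98, 116.31)
```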
Future Outlook
The current system combines traditional machine‑learning models, time‑series analysis, and deep‑learning architectures (CNN, RNN, CRNN, capsule networks, BiLSTM‑CRF). Ideas emerging from advertising, such as Audience Selection Models, bidding strategies, and reinforcement‑learning approaches (DQN, OpenAI‑style adversarial training), are being explored to further enhance real‑time push capabilities.
Conclusion
Precise push now serves daily operations and campaign activities across all 58.com business lines, fulfilling the need for individualized content delivery and contributing to user growth through advanced profiling and algorithmic techniques.
58 Tech
Official tech channel of 58, a platform for tech innovation, sharing, and communication.