How SuNing Financial Scaled Its Red Envelope System for Billion-User Peaks
This article details SuNing Financial’s red‑envelope platform architecture and evolution, covering high‑concurrency challenges, traffic shaping, asynchronous processing, multi‑level caching, Redis‑based distributed locks, task scheduling, payment‑chain isolation, high‑availability deployment, multi‑layer flow control, monitoring, and future scalability directions.
Technical Challenges of the Red Envelope System
A red‑envelope system is essentially an upgraded flash‑sale system and keeps the characteristics of one. Massive user participation brings high‑concurrency, data‑consistency, and security challenges: envelopes must not be over‑issued, lost, or issued twice, and funds must stay safe.
Performance Optimization (High Performance & Data Consistency)
Traffic Shaping (Peak‑Shaving and Valley‑Filling)
Before traffic spikes, the system pre‑warms data during low‑load periods to disperse the burst load when the activity starts. Measures include pre‑opening accounts for active members and opening accounts when users enter the activity aggregation page.
Asynchrony
Red‑envelope recharge and order persistence are handled asynchronously: recharge tasks are queued and processed by multiple machines, reducing pressure on upstream systems and preventing bottlenecks.
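The queue-and-workers pattern described above can be sketched as follows. This is a minimal, single-process illustration using Python threads rather than the multiple machines the article mentions; `run_recharge_workers`, `recharge`, and `worker_count` are hypothetical names, not part of SuNing's system.

```python
import queue
import threading


def run_recharge_workers(tasks, recharge, worker_count=4):
    """Drain recharge tasks with a fixed pool of workers and return the results.

    `recharge` is a hypothetical callable that performs one recharge.
    In production the queue would be a message broker and the workers
    separate machines; this sketch keeps everything in one process.
    """
    pending = queue.Queue()
    for task in tasks:
        pending.put(task)  # the request path only enqueues, then returns

    results = []
    results_lock = threading.Lock()

    def worker():
        while True:
            try:
                task = pending.get_nowait()
            except queue.Empty:
                return  # queue drained, worker exits
            outcome = recharge(task)
            with results_lock:
                results.append(outcome)

    threads = [threading.Thread(target=worker) for _ in range(worker_count)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results
```

Because the workers pull tasks at their own pace, the upstream recharge service sees a bounded number of concurrent calls (at most `worker_count`) regardless of how quickly requests arrive.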
Multi‑Level Cache (Hot Data Global Caching)
Local Ehcache is used for read‑heavy, write‑light data to reduce accesses to the database and the distributed cache. At the database layer, the cache allocated to hot tables is enlarged. A global Redis cluster stores hotspot data such as orders and grab records; sharding spreads the load, and because each Redis instance executes commands on a single thread, individual commands are atomic while throughput stays high.
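The lookup order across cache levels can be sketched as below. This is an illustrative read-through cache, not the article's implementation: a plain dictionary stands in for Ehcache, any dict-like object stands in for Redis, and `TwoLevelCache`, `load_from_db`, and `local_ttl` are assumed names.

```python
import time


class TwoLevelCache:
    """Read-through cache: local memory first, then a shared cache, then the database."""

    def __init__(self, remote, load_from_db, local_ttl=5.0, clock=time.monotonic):
        self._remote = remote        # stand-in for Redis (dict-like: get / item assignment)
        self._load = load_from_db    # hypothetical loader, called only on a full miss
        self._ttl = local_ttl        # seconds a value may be served from local memory
        self._clock = clock
        self._local = {}             # key -> (expires_at, value)

    def get(self, key):
        now = self._clock()
        entry = self._local.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]          # level 1 hit: no network access at all

        value = self._remote.get(key)
        if value is None:
            value = self._load(key)  # level 3: fall back to the database
            if value is not None:
                self._remote[key] = value
        if value is not None:
            self._local[key] = (now + self._ttl, value)
        return value
```

A short local TTL keeps the local copy from drifting far from the shared cache, which is why this layer suits read-heavy, write-light data.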
Distributed Lock Component
To prevent a user from opening the same red envelope twice when clicking rapidly, a distributed lock is applied on the combination of envelope ID and user ID. Three common implementations are database locks, Zookeeper locks, and Redis locks. Redis is chosen because its SET command with the NX and EX/PX options acquires the lock and sets its expiry in one atomic step, and a Lua script releases the lock safely.
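A minimal sketch of such a lock is shown below, assuming a redis-py style client whose `set(name, value, nx=..., px=...)` and `eval(script, numkeys, *keys_and_args)` methods are used. The key format, the 3-second default expiry, and the function names are assumptions made for illustration.

```python
import uuid

# Lua script: delete the key only if it still holds this caller's token,
# so one request can never release a lock that another request now owns.
RELEASE_SCRIPT = """
if redis.call('get', KEYS[1]) == ARGV[1] then
    return redis.call('del', KEYS[1])
else
    return 0
end
"""


def lock_key(envelope_id, user_id):
    # Hypothetical key layout: one lock per (envelope, user) pair.
    return f"lock:envelope:{envelope_id}:user:{user_id}"


def acquire(client, envelope_id, user_id, ttl_ms=3000):
    """Try to take the lock; return a token on success, None if already held."""
    token = uuid.uuid4().hex
    # SET key token NX PX ttl: create only if absent, with an expiry, atomically.
    if client.set(lock_key(envelope_id, user_id), token, nx=True, px=ttl_ms):
        return token
    return None


def release(client, envelope_id, user_id, token):
    """Release the lock if this token still owns it; return True if released."""
    return client.eval(RELEASE_SCRIPT, 1, lock_key(envelope_id, user_id), token) == 1
```

The expiry guards against a crashed holder blocking the user forever, and the token comparison in the script guards against releasing a lock that has already expired and been taken by a later request.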
High‑Availability Architecture Practice
Distributed Task Scheduling
Tasks such as timeout refunds and exception compensation are handled by a unified scheduling platform. The system scans eligible orders, performs refunds for overdue envelopes, and compensates for failed deliveries.
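The timeout-refund scan could look roughly like the sketch below. The `Envelope` fields, the `refund` callback, and amounts in cents are assumptions for illustration; the article does not describe the real order schema or the scheduling platform's API.

```python
from dataclasses import dataclass


@dataclass
class Envelope:
    envelope_id: str
    remaining_cents: int   # amount not yet grabbed
    expires_at: float      # timestamp after which the envelope is overdue
    refunded: bool = False


def refund_overdue(envelopes, now, refund):
    """Refund every expired envelope that still has a balance; return the IDs refunded.

    `refund` is a hypothetical callable (envelope_id, amount_cents) that
    returns the money to the sender.
    """
    refunded_ids = []
    for env in envelopes:
        if env.refunded or env.remaining_cents <= 0 or env.expires_at > now:
            continue  # already handled, nothing left, or not yet overdue
        refund(env.envelope_id, env.remaining_cents)
        env.refunded = True  # mark so a repeated scan does not refund twice
        refunded_ids.append(env.envelope_id)
    return refunded_ids
```

Marking each envelope after the refund makes the job safe to rerun, which matters because a scheduler may trigger the same scan again after a failure.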
Payment Chain & Account Separation
A dedicated minimal‑checkout for red envelopes separates envelope accounts from the normal payment chain, preventing pressure on core payment, membership, and accounting services during large promotions.
High‑Availability Deployment Architecture
The system uses a single‑cluster deployment typical of SuNing’s distributed, high‑availability architecture. Its main layers are:
Front‑end traffic is load‑balanced by HLB and filtered by WAF.
HTTP requests are served by Nginx, which proxies them to the backend application servers.
The RPC framework handles service discovery and calls between services.
Kafka provides inter‑service messaging.
Mycat provides access to the distributed databases; sharded Redis provides caching.
Large‑Scale Promotion Guarantees
System Monitoring
Business monitoring is performed by SuNing’s self‑developed platform, tracking per‑second call counts, success rates, and latency, with alerting via SMS or email. Middleware monitoring uses Zabbix for server metrics, Prometheus for Redis, and a custom database platform for slow‑SQL detection. Distributed log platforms provide real‑time exception visibility.
Multi‑Level Flow Control
Flow control is applied at several layers: firewall for HTTP traffic, global RPC flow control, per‑node RPC token‑bucket control, and user‑level control using Redis to limit request frequency per user.
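The per-node token-bucket control mentioned above can be illustrated with the following sketch. It is a generic token bucket, not SuNing's RPC framework code; `rate`, `capacity`, and the injectable `clock` are assumed parameters.

```python
import threading
import time


class TokenBucket:
    """Allow up to `capacity` requests in a burst, refilled at `rate` tokens per second."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self._rate = rate
        self._capacity = capacity
        self._clock = clock
        self._tokens = float(capacity)  # start full so an initial burst is allowed
        self._last = clock()
        self._lock = threading.Lock()

    def allow(self):
        """Return True and consume a token if the request may proceed."""
        with self._lock:
            now = self._clock()
            # Refill in proportion to elapsed time, never above capacity.
            self._tokens = min(self._capacity,
                               self._tokens + (now - self._last) * self._rate)
            self._last = now
            if self._tokens >= 1:
                self._tokens -= 1
                return True
            return False
```

A node would call `allow()` before handling each RPC and reject the call when it returns False. The user-level control could follow the same idea with a per-user counter held in Redis.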
System Degradation
Using a Zookeeper‑based distributed configuration platform, non‑core features can be toggled off during overload, protecting the core red‑envelope flow. Switches can be changed within half a minute to quickly safeguard or restore functionality.
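Switch-based degradation can be sketched as below. The sketch assumes the configuration platform pushes changes through a callback; `FeatureSwitches`, `on_config_change`, and the "ranking" feature are hypothetical names chosen for illustration.

```python
class FeatureSwitches:
    """In-process view of switches pushed by a configuration platform."""

    def __init__(self, initial):
        self._flags = dict(initial)

    def on_config_change(self, name, enabled):
        # Hypothetical hook: called when the platform reports a changed switch.
        self._flags[name] = enabled

    def is_enabled(self, name):
        # Unknown switches default to off, so a missing entry degrades safely.
        return self._flags.get(name, False)


def render_envelope_page(switches, envelope, load_ranking):
    """Build the page; the non-core ranking block is skipped when switched off."""
    page = {"envelope": envelope}      # core flow is always served
    if switches.is_enabled("ranking"):
        page["ranking"] = load_ranking()
    return page
```

Checking the switch at request time means a change takes effect on the next request without a restart.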
Future Challenges and Directions
Since its launch in 2016, the red‑envelope system has faced evolving challenges: supporting new envelope gameplay formats, enhancing Redis persistence and cache compensation, and achieving cross‑IDC multi‑active deployment for true disaster tolerance. Ongoing work will focus on higher concurrency, availability, and consistency.
21CTO
