How Suning Scaled Its Membership System for Double‑11: Architecture Evolution and Multi‑Active Deployment
The article details Suning’s decade‑long evolution of its membership platform for the Double‑11 shopping festival, covering early offline CS architecture, the transition to a WebSphere‑based e‑commerce system, vertical module splitting, data migration with Spark and Hive, and the implementation of same‑city multi‑active deployment.
Architecture Evolution of Suning Membership Platform
The membership platform has been refactored through four major phases to support rapid O2O growth and large‑scale events such as Double‑11.
1. Early Offline‑Centric Client‑Server (C/S) Architecture
Each physical store ran a Windows client that communicated over TCP with an IIS‑based server, which accessed a Sybase database via ODBC. This monolithic client‑server model could not scale as store count and transaction volume grew.
2. Partnership‑Based E‑Commerce System (Yigou)
To overcome the scalability ceiling, Suning adopted a commercial stack:
Application server: WebSphere Application Server (WAS) clustered behind an F5 load balancer.
Database: DB2 on a high‑reliability minicomputer.
Integration: Enterprise Service Bus (ESB) for synchronous services and MQ for asynchronous messaging.
Functional decomposition into five modules: communication adapters, transaction processing, broadcast services, batch processing, and administration.
Although this architecture supported the initial e‑commerce surge, it introduced several constraints:
Limited flexibility due to proprietary components.
Single‑data‑source design prevented horizontal scaling.
Read/write splitting and Redis caching were insufficient for peak loads.
3. Self‑Developed Vertical Splitting (Suning Framework)
Suning built its own stack to replace the commercial suite:
Suning Framework (SNF) – core application framework.
Data Access Layer (DAL) – unified DB access.
Remote Service Framework (RSF) – real‑time inter‑service calls.
Message middleware: ActiveMQ and Kafka.
Configuration management: Suning Configuration Management (SCM).
Authentication: User Account and Authentication (UAA).
Task scheduling: Unification Task Server (UTS).
Functionality was split into independent sub‑systems:
Pre‑combination (cross‑module orchestration).
Level management.
Account profile.
Account.
Reporting.
Each sub‑system runs on its own WildFly cluster with MySQL (one master, two slaves). This vertical segmentation isolates workloads, improves performance, and enables independent scaling.
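The one-master, two-slave layout per sub-system can be illustrated with a small routing sketch. It assumes that reads are spread across the slaves and writes go to the master; the article does not say how the DAL actually distributes statements, and the class name, host names, and the SELECT-only rule below are all hypothetical.

```python
import itertools


class SubsystemCluster:
    """One vertically split sub-system with its own master and slaves (hypothetical model)."""

    def __init__(self, name, master, slaves):
        self.name = name
        self.master = master
        # Round-robin over the slaves for read traffic (an assumption, not Suning's DAL).
        self._slaves = itertools.cycle(slaves)

    def route(self, sql):
        """Return the host that should run the statement."""
        words = sql.split()
        if not words:
            raise ValueError("empty SQL statement")
        if words[0].upper() == "SELECT":
            return next(self._slaves)
        # Everything that is not a plain SELECT is treated as a write.
        return self.master


# Each sub-system gets its own isolated cluster; host names are placeholders.
CLUSTERS = {
    name: SubsystemCluster(name, f"{name}-master", [f"{name}-slave-1", f"{name}-slave-2"])
    for name in ("level", "account_profile", "account")
}
```

Because every sub-system owns its cluster object, load on one of them (for example, level queries) never competes for connections with another, which is the isolation property the vertical split is after.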
4. Data Migration Strategy
Migration from DB2 to MySQL leveraged Spark and Hive in a multi‑step pipeline:
Extract all partitioned DB2 tables into Hive “A‑class” tables using Spark.
Classify extracted rows:
B‑class: invalid data.
C‑class: clean data after filtering.
D‑class: data that failed validation.
Insert B‑class rows into historical tables of the new system.
For C‑class rows:
Simple transformation – generate INSERT SQL directly and load into MySQL.
Complex transformation – write intermediate results to Hive “E‑class” tables, then query to generate INSERT statements.
When fixing specific member records, delete the corresponding rows in MySQL, re‑run the transformation, and re‑insert.
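The classification and simple-transformation steps can be sketched in plain Python. This is a stand-in for the Spark and Hive jobs, not the actual pipeline: the column names (`member_id`, `mobile`, `status`), the validation rules, and the target table `member` are assumptions made for illustration only.

```python
def classify(row):
    """Assign an extracted A-class row to class B, C, or D (rules are hypothetical)."""
    if row.get("status") == "invalid":
        return "B"  # invalid data, destined for the historical tables
    if not row.get("member_id") or not str(row.get("mobile", "")).isdigit():
        return "D"  # failed validation, set aside for investigation
    return "C"      # clean data, ready for transformation


def split_rows(rows):
    """Group rows into the three classes described above."""
    buckets = {"B": [], "C": [], "D": []}
    for row in rows:
        buckets[classify(row)].append(row)
    return buckets


def to_insert(row, table="member"):
    """Simple transformation: build a parameterized INSERT for one C-class row."""
    columns = ("member_id", "mobile")
    placeholders = ", ".join(["%s"] * len(columns))
    sql = f"INSERT INTO {table} ({', '.join(columns)}) VALUES ({placeholders})"
    return sql, tuple(row[c] for c in columns)
```

Returning the statement and its parameters separately, rather than a fully interpolated string, lets the MySQL driver handle quoting when the rows are loaded.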
5. Route Switching Procedure (Zero‑Downtime Migration)
The migration was performed in two phases to avoid service interruption:
Phase 1 – Query Interfaces with Low Real‑Time Requirements
Old system continues to handle writes; each create/update writes a record to a pending_tasks table.
Historical data is bulk‑loaded into the new MySQL clusters.
After data verification, a scheduled job processes pending_tasks, invoking the new system’s APIs.
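The pending_tasks mechanism can be sketched as follows. An in-memory SQLite database stands in for the old system's database, and the table columns, the function names, and the `apply_to_new_system` callback (representing a call to the new system's API) are assumptions; the article only states that writes are logged and later replayed by a scheduled job.

```python
import json
import sqlite3


def open_db():
    """Create the stand-in database with a pending_tasks table (schema is hypothetical)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE pending_tasks ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "member_id INTEGER NOT NULL, "
        "op TEXT NOT NULL, "
        "payload TEXT NOT NULL, "
        "done INTEGER NOT NULL DEFAULT 0)"
    )
    return conn


def record_pending(conn, member_id, op, payload):
    """Called by the old system alongside every create or update."""
    conn.execute(
        "INSERT INTO pending_tasks (member_id, op, payload) VALUES (?, ?, ?)",
        (member_id, op, json.dumps(payload)),
    )


def replay_pending(conn, apply_to_new_system):
    """Scheduled job: replay unprocessed tasks in insertion order."""
    rows = conn.execute(
        "SELECT id, member_id, op, payload FROM pending_tasks "
        "WHERE done = 0 ORDER BY id"
    ).fetchall()
    replayed = 0
    for task_id, member_id, op, payload in rows:
        try:
            apply_to_new_system(member_id, op, json.loads(payload))
        except Exception:
            break  # stop here so later tasks are not applied out of order
        conn.execute("UPDATE pending_tasks SET done = 1 WHERE id = ?", (task_id,))
        replayed += 1
    conn.commit()
    return replayed
```

Replaying strictly in id order and stopping at the first failure keeps a later update from being applied before the create it depends on; the failed task is simply retried on the next run.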
Phase 2 – Real‑Time Queries and Writes
New member IDs are allocated from a distinct numeric range to avoid collisions.
Extensive monitoring and bug‑fix cycles are run while gradually routing remaining interfaces to the new system.
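Allocating new member IDs from a separate range can be sketched with a small in-process allocator. The boundary values and the class name are hypothetical, and a production system would more likely back this with a database sequence or a dedicated ID service; the point is only that IDs issued by the new system can never overlap those issued by the old one.

```python
import threading

# Hypothetical boundaries: the old system is assumed never to issue IDs above this value.
OLD_SYSTEM_MAX_ID = 5_999_999_999
NEW_SYSTEM_FIRST_ID = 7_000_000_000
NEW_SYSTEM_LAST_ID = 7_999_999_999


class RangeIdAllocator:
    """Hands out sequential IDs from a closed range reserved for the new system."""

    def __init__(self, first_id, last_id):
        if first_id > last_id:
            raise ValueError("first_id must not exceed last_id")
        self._next = first_id
        self._last = last_id
        self._lock = threading.Lock()

    def next_id(self):
        # The lock keeps two threads from receiving the same ID.
        with self._lock:
            if self._next > self._last:
                raise RuntimeError("ID range exhausted")
            value = self._next
            self._next += 1
            return value
```

With disjoint ranges, records created in the new system during the switch can coexist with migrated records without any lookup to check for collisions.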
6. Same‑City Multi‑Active Deployment
To eliminate the single‑datacenter bottleneck, a secondary data center (sub‑room B) was built. The vertically split sub‑systems (account profile, level, account) are deployed in both locations. Routing between sites uses RSF; most business logic stays within a single site, except for cross‑site operations such as member registration. Database consistency is maintained via MySQL binlog replication.
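The site-selection rule can be sketched as below. The article does not describe how RSF decides where to send a call, so this rests on two assumptions: that the MySQL write master lives in one site (called `site-A` here) with the other site fed by binlog replication, and that operations flagged as cross-site are forwarded to the master's site. The site names, the operation names, and the function itself are hypothetical.

```python
SITES = ("site-A", "site-B")

# Assumption: the write master is in site-A and replicates to site-B via binlog.
MASTER_SITE = "site-A"

# Operations that cannot be served entirely inside the local site.
CROSS_SITE_OPERATIONS = {"register"}


def target_site(operation, local_site):
    """Pick the site that should serve an operation arriving at local_site."""
    if local_site not in SITES:
        raise ValueError(f"unknown site: {local_site}")
    if operation in CROSS_SITE_OPERATIONS:
        return MASTER_SITE  # forward to the site holding the write master
    return local_site       # default: stay inside the site that received the call
```

Keeping the default path local means cross-site latency is paid only by the few operations, such as registration, that genuinely need it.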
7. Future Direction – Cross‑City Multi‑Active Architecture
The next evolution targets geographically distributed multi‑active deployment:
Full member data synchronization across all sites.
Each site can act as primary or backup, enabling automatic failover.
Continued platformization to integrate additional member services and expose a rentable “membership‑as‑a‑service” offering.
ITPUB
Official ITPUB account sharing technical insights, community news, and exciting events.
