Ctrip’s Journey: Transforming the Account System from Monolith to Multi‑Region Platform
This article examines how Ctrip evolved its account system from a monolithic service into a domain‑driven, middle‑platform architecture with multi‑region deployment. It covers the motivations, domain restructuring, read/write comparison process, configuration‑driven capabilities, and routing strategies that enable scalable, reliable user management.
Introduction
Early Internet account systems combined user profile management, authentication, and auxiliary data (avatars, points, levels) in a single monolith. As business grew, these responsibilities were split into independent domain services, a common industry practice.
Domain‑Driven Refactoring
Rapid micro‑service expansion fragmented the account system into many specialized applications (third‑party login, real‑name information, platform‑specific access). This caused:
Excessive RPC hops, degrading latency and stability.
Non‑atomic operations, leading to dirty data.
Proliferation of services, increasing development, testing and operational overhead.
The refactoring goals are to:
Partition domains so that related logic stays cohesive.
Make the migration transparent to downstream business services.
Full Domain Re‑partition
Account functionality is grouped into three logical domains:
Core functions: account query, registration, binding/unbinding of phone, email, third‑party credentials, OpenID generation, password management, etc.
Login‑state functions: generation, verification, renewal, eviction and expiration handling of login tokens.
Logging & monitoring: business event logging and health monitoring.
Supporting services include token issuance/validation and captcha generation. Access layers comprise BFF composition, front‑end UI/SDK, and internal service integration.
Read‑Write Comparison for Transparent Migration
Because the account system is a critical backbone, the migrated services must match the existing behavior 100 %, and the verification itself must run in isolation from the live cluster.
Read comparison: Live traffic is mirrored to an offline “Old” cluster, which forwards each request to a parallel “New” cluster. Responses from both clusters are compared for consistency.
Write comparison: Production traffic is recorded (including request payload, response, and emitted messages). Two offline clusters (Old and New) are deployed with identical database snapshots. The recorded traffic is replayed against both clusters; outputs, storage changes, and messages are compared to ensure exact parity.
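The comparison step can be sketched as a structural diff over the two clusters' responses. This is a minimal illustration, not Ctrip's tooling: the function names, the response shape, and the list of volatile fields (`timestamp`, `trace_id`) are assumptions made for the example.

```python
from typing import Any

# Hypothetical fields that legitimately differ between clusters and are ignored.
VOLATILE_FIELDS = {"timestamp", "trace_id"}


def normalize(value: Any) -> Any:
    """Recursively drop volatile fields so only business data is compared."""
    if isinstance(value, dict):
        return {k: normalize(v) for k, v in value.items() if k not in VOLATILE_FIELDS}
    if isinstance(value, list):
        return [normalize(v) for v in value]
    return value


def _diff(old: Any, new: Any, path: str) -> list:
    """Collect the paths at which two already-normalized values disagree."""
    if isinstance(old, dict) and isinstance(new, dict):
        mismatches = []
        for key in sorted(old.keys() | new.keys()):
            if key not in old:
                mismatches.append(f"{path}.{key}: missing in old")
            elif key not in new:
                mismatches.append(f"{path}.{key}: missing in new")
            else:
                mismatches.extend(_diff(old[key], new[key], f"{path}.{key}"))
        return mismatches
    if isinstance(old, list) and isinstance(new, list) and len(old) == len(new):
        mismatches = []
        for index, (o, n) in enumerate(zip(old, new)):
            mismatches.extend(_diff(o, n, f"{path}[{index}]"))
        return mismatches
    return [] if old == new else [f"{path}: {old!r} != {new!r}"]


def compare_responses(old: Any, new: Any) -> list:
    """Return a list of mismatch descriptions; an empty list means parity."""
    return _diff(normalize(old), normalize(new), "$")
```

The same function could be applied to replayed write traffic by feeding it the recorded outputs, storage changes, and emitted messages of each cluster as plain dictionaries.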
Middle‑Platform Consolidation
As the organization grew, multiple brands and independent account systems emerged. A middle‑platform abstracts common capabilities, reduces architectural complexity, and accelerates onboarding.
UID‑Based Routing and Simplified Mapping
Account IDs encode a “system ID” that isolates business domains, governs policies, and determines storage routing. A globally unique identifier (UID) embeds the system ID, eliminating large mapping tables and simplifying cross‑region routing. UID uniqueness also speeds up troubleshooting and technical support.
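As a rough sketch of why an embedded system ID removes the need for a mapping table, the example below packs the system ID into a fixed-width UID prefix and derives the storage target from it. The UID layout (3-digit prefix plus 12-digit sequence) and the storage names are hypothetical; the article does not specify Ctrip's actual format.

```python
# Assumed layout: 3-digit system ID followed by a zero-padded 12-digit sequence.
SYSTEM_ID_WIDTH = 3
SEQUENCE_WIDTH = 12

# Hypothetical storage targets keyed by system ID.
STORAGE_BY_SYSTEM = {
    1: "account_db_main",
    2: "account_db_brand_b",
}


def make_uid(system_id: int, sequence: int) -> str:
    """Build a UID whose prefix carries the system ID."""
    if not 0 <= system_id < 10 ** SYSTEM_ID_WIDTH:
        raise ValueError("system_id out of range")
    if not 0 <= sequence < 10 ** SEQUENCE_WIDTH:
        raise ValueError("sequence out of range")
    return f"{system_id:0{SYSTEM_ID_WIDTH}d}{sequence:0{SEQUENCE_WIDTH}d}"


def system_id_of(uid: str) -> int:
    """Recover the system ID from the UID itself, with no lookup table."""
    return int(uid[:SYSTEM_ID_WIDTH])


def storage_for(uid: str) -> str:
    """Pick the storage target from the system ID embedded in the UID."""
    return STORAGE_BY_SYSTEM[system_id_of(uid)]
```

Because the routing key travels inside the identifier, any service holding a UID can decide where the account lives without an extra query.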
Configuration‑Driven Capabilities
The middle platform provides reusable capabilities:
Account lifecycle management (registration, verification, deregistration, binding/unbinding, password handling, OAuth data).
Multiple login methods (password, SMS code, one‑click, QR code, social logins such as WeChat, Alipay, QQ, Weibo, Huawei).
Login‑state management (token generation, validation, renewal, eviction).
Security & monitoring (event logging, real‑time risk control, slide‑captcha integration).
New requirements should be expressed as configurable features rather than custom code.
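One way to picture "configuration rather than custom code" is a per-system configuration record that the shared login flow consults at runtime. The field names, defaults, and the two sample systems below are illustrative assumptions, not the platform's real schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AccountSystemConfig:
    """Hypothetical per-system settings read by the shared platform code."""
    system_id: int
    login_methods: frozenset
    token_ttl_seconds: int = 7 * 24 * 3600
    captcha_required: bool = True


# Onboarding a new account system means adding an entry here, not new code.
CONFIGS = {
    1: AccountSystemConfig(1, frozenset({"password", "sms_code", "wechat"})),
    2: AccountSystemConfig(
        2, frozenset({"sms_code"}), token_ttl_seconds=3600, captcha_required=False
    ),
}


def is_login_allowed(system_id: int, method: str) -> bool:
    """Return True if the given login method is enabled for that system."""
    config = CONFIGS.get(system_id)
    return config is not None and method in config.login_methods
```

With this shape, enabling QR-code login for a brand would be a one-line change to its `login_methods` set.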
Diverse Integration Options
UI integration: Unified front‑end provided by the platform, invoked by business teams as needed.
Front‑end SDK: Lightweight SDK for minor customizations (logo, protocol tweaks) that wraps all platform flows.
Back‑end integration: Teams can build a BFF that composes platform services together with auxiliary systems (captcha, token).
Once the middle platform is in place, onboarding a new account system consists of selecting the required capabilities and adjusting configuration, which typically takes hours.
Multi‑Region Deployment
A two‑region, three‑data‑center architecture improves fault tolerance and reduces latency for region‑specific users. The platform must route requests to the correct region while keeping deployments homogeneous.
User Identification & Routing Strategies
Two approaches were evaluated:
Gateway performs a user‑to‑region lookup on each request, adding latency and a single point of failure.
During login, the region tag is attached to the client token; the gateway then routes based on this tag, eliminating per‑request lookups.
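The second approach can be sketched as follows: the login service signs a token that carries the region tag, and the gateway verifies the signature and routes from the tag alone. The token format, the HMAC scheme, the secret, the region codes, and the upstream URLs are all assumptions for illustration; the article does not describe the real token structure.

```python
import base64
import hashlib
import hmac
import json

SECRET = b"demo-secret"  # placeholder; a real deployment would use managed keys

# Hypothetical region codes and upstream addresses.
REGION_UPSTREAMS = {
    "sha": "https://account.sha.internal.example",
    "sin": "https://account.sin.internal.example",
}


def _sign(payload: str) -> str:
    return hmac.new(SECRET, payload.encode(), hashlib.sha256).hexdigest()


def issue_token(uid: str, region: str) -> str:
    """Called once at login: embed the user's region in a signed token."""
    claims = json.dumps({"uid": uid, "region": region}, sort_keys=True)
    payload = base64.urlsafe_b64encode(claims.encode()).decode()
    return f"{payload}.{_sign(payload)}"


def route(token: str) -> str:
    """Called per request at the gateway: no user-to-region lookup needed."""
    payload, _, signature = token.rpartition(".")
    if not hmac.compare_digest(signature, _sign(payload)):
        raise ValueError("invalid token signature")
    claims = json.loads(base64.urlsafe_b64decode(payload))
    return REGION_UPSTREAMS[claims["region"]]
```

The gateway only does local work (one HMAC check and a dictionary lookup) per request, which is what removes the per-request lookup of the first approach.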
Region‑Aware Architecture
Components are deployed per region:
Gateway layer: Routes requests using the login‑issued region tag.
Internal services: Operate within a single region, forming a closed loop to avoid cross‑region latency.
Data layer: Identical DB schemas replicated via DRC with filtered synchronization; Redis caches are region‑local and cleared via sync messages when data changes.
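The cache behavior in the data layer can be illustrated with a small in-memory stand-in for a region-local Redis cache that evicts an entry when a sync message arrives. The class, the message shape (`{"uid": ...}`), and the loader callback are hypothetical; they are not DRC's or Redis's API.

```python
from typing import Callable, Dict


class RegionCache:
    """In-memory stand-in for a region-local cache in front of the regional DB."""

    def __init__(self, region: str, load_from_db: Callable[[str], dict]):
        self.region = region
        self._load_from_db = load_from_db
        self._entries: Dict[str, dict] = {}

    def get(self, uid: str) -> dict:
        """Serve from cache, falling back to the regional database on a miss."""
        if uid not in self._entries:
            self._entries[uid] = self._load_from_db(uid)
        return self._entries[uid]

    def on_sync_message(self, message: dict) -> None:
        """Evict the entry named by a data-change message (assumed shape)."""
        self._entries.pop(message["uid"], None)


# Usage: the dict below plays the role of the replicated regional database.
database = {"U1": {"phone": "111"}}
cache = RegionCache("sha", lambda uid: dict(database[uid]))
cache.get("U1")                       # populates the cache
database["U1"]["phone"] = "222"       # change arrives through replication
cache.on_sync_message({"uid": "U1"})  # sync message clears the stale entry
```

Evicting rather than updating keeps the message small and lets the next read repopulate the entry from the local replica.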
Conclusion
The account system evolved from a monolithic application to a domain‑driven, middle‑platform service with multi‑region deployment. The transformation demonstrates how systematic domain partitioning, configuration‑driven capabilities, and region‑aware routing can achieve high performance, data consistency, and rapid onboarding for large‑scale internet platforms.
