Designing a Billion‑User User Center: Architecture, Interface Design, Token Degradation, Data Security, and Monitoring
This article presents a comprehensive engineering guide for building a high‑availability, high‑performance, and secure user‑center system that can serve hundreds of millions of users, covering service architecture, API design, sharding, token fallback, data protection, asynchronous processing, and observability.
1. Service Architecture
The user center is a core internet subsystem handling login, registration, profile management, token generation, and validation. To meet billion‑scale demands, it is split into three independent microservices: a gateway service (aggregates business logic and external calls), a core service (handles simple logic and data storage, directly accesses Redis or DB), and an asynchronous consumer service (processes message queues).
With this design, new features can be deployed by updating only the gateway service, keeping core and consumer services stable, though the call chain becomes longer and compatibility testing is required.
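The gateway/core split can be sketched as two small classes; the names here (CoreClient, GatewayService, the `risk_check` hook) are illustrative assumptions, not the article's actual API:

```python
class CoreClient:
    """Core service: simple logic and direct Redis/DB access."""
    def __init__(self, store):
        self._store = store  # stands in for Redis or the user DB

    def load_profile(self, user_id):
        return self._store.get(user_id)


class GatewayService:
    """Gateway service: aggregates core calls plus external services,
    so new features only require redeploying this layer."""
    def __init__(self, core, risk_check=None):
        self._core = core
        self._risk_check = risk_check  # optional external dependency

    def get_profile(self, user_id):
        profile = self._core.load_profile(user_id)
        if self._risk_check:  # external calls live in the gateway layer
            profile["risk"] = self._risk_check(user_id)
        return profile
```

Keeping external calls in the gateway is what lets the core and consumer services stay untouched when business logic changes.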
2. Interface Design
Interfaces are divided into Web and App APIs, each with distinct cross‑origin, encryption, signature, and token verification mechanisms. Critical APIs such as login receive special treatment: data models are split into a lightweight core user table (userId, username, phone, password, salt) and a separate profile table for auxiliary fields. The login path is shortened to rely primarily on read‑only DB access, with automatic degradation to fallback strategies (e.g., password‑only verification) when upstream services like anti‑fraud or SMS fail.
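The degraded login path might look like the following sketch, where `anti_fraud_check` and the salted-hash scheme are assumptions used for illustration:

```python
import hashlib


def verify_password(stored_hash, salt, password):
    """Read-only check against the core user table (password, salt)."""
    candidate = hashlib.sha256((salt + password).encode()).hexdigest()
    return candidate == stored_hash


def login(user, password, anti_fraud_check=None):
    """Full path runs the upstream anti-fraud check; if that service
    fails, degrade to password-only verification rather than fail login."""
    try:
        if anti_fraud_check is not None:
            anti_fraud_check(user["userId"])
    except Exception:
        pass  # degradation: an upstream outage must not block login
    return verify_password(user["password"], user["salt"], password)
```

The key property is that the fallback never raises on upstream failure; availability of the core credential check is preserved.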
App APIs also implement replay‑attack protection and request signing, leveraging big‑data‑driven user behavior profiles for additional validation (phone verification, real‑name, facial checks, etc.).
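A minimal sketch of request signing with replay protection, assuming an HMAC scheme and an in-memory nonce store (production would keep nonces in Redis with a TTL):

```python
import hashlib
import hmac
import time

SEEN_NONCES = set()  # stand-in for a Redis set with expiry


def sign(params, secret, ts, nonce):
    """Canonicalize params, then HMAC over payload + timestamp + nonce."""
    payload = "&".join(f"{k}={params[k]}" for k in sorted(params))
    msg = f"{payload}&ts={ts}&nonce={nonce}"
    return hmac.new(secret.encode(), msg.encode(), hashlib.sha256).hexdigest()


def verify(params, secret, ts, nonce, signature, max_skew=300):
    if abs(time.time() - ts) > max_skew:  # stale timestamp: replay window
        return False
    if nonce in SEEN_NONCES:              # nonce reuse: replayed request
        return False
    SEEN_NONCES.add(nonce)
    return hmac.compare_digest(sign(params, secret, ts, nonce), signature)
```

Rejecting both stale timestamps and reused nonces is what makes captured requests useless to an attacker.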
3. Sharding and Partitioning
When user data exceeds 100 million records, vertical and horizontal partitioning are applied. Core user fields remain in a vertically split table, while event logs (login, password changes, etc.) are moved to separate databases. For high‑frequency queries, the front‑end uses indexed lookups; back‑office analytics may employ Elasticsearch for batch queries, balancing consistency and performance.
The article also describes two horizontal sharding strategies: an index‑table method mapping mobile/username to UID, and a “gene” method that embeds a generated N‑bit mobile hash into the UID, then uses modulo arithmetic to route records to specific shards.
1. Generate an N‑bit gene from the mobile number: mobile_gen = f(mobile).
2. Create a globally unique M‑bit ID.
3. Combine the M and N bits to form the UID.
4. Use the N‑bit portion to determine the target database via modulo.
5. During lookup, extract the N‑bit suffix of the UID to locate the correct shard.
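The steps above can be sketched as follows; the bit widths, the MD5-based gene function, and the sequence counter are illustrative assumptions:

```python
import hashlib
import itertools

N_BITS = 4                    # gene width: up to 2**4 = 16 shards
_seq = itertools.count(1)     # stand-in for a global M-bit ID generator


def mobile_gene(mobile):
    """Step 1: derive an N-bit gene from the mobile number."""
    digest = hashlib.md5(mobile.encode()).digest()
    return int.from_bytes(digest[:8], "big") & ((1 << N_BITS) - 1)


def make_uid(mobile):
    """Steps 2-3: globally unique ID shifted left, gene appended."""
    return (next(_seq) << N_BITS) | mobile_gene(mobile)


def shard_of(uid, num_shards=1 << N_BITS):
    """Steps 4-5: extract the N-bit suffix and route by modulo."""
    return (uid & ((1 << N_BITS) - 1)) % num_shards
```

Because the gene is embedded in the UID itself, a lookup by UID never needs the index table: the shard is recoverable from the ID alone.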
4. Token Flexible Degradation
Two token types are used: Web tokens (often stored in cookies) and App tokens (generated after credential verification). Tokens are built from userId, phone, random code, and expiry, then encrypted and cached in Redis. If Redis is unavailable, the server creates a special‑format token; validation falls back to decrypting the token, extracting the embedded data, and verifying against the database, with rate‑limiting to protect DB performance.
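A sketch of the token layout and the Redis-outage fallback path; the XOR "cipher" is a stand-in for a real symmetric cipher, and the field names are assumptions:

```python
import json
import secrets
import time


def _xor(data, key):
    """Toy cipher for illustration only; use AES-GCM or similar in practice."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


def issue_token(user_id, phone, key, ttl=3600):
    """Token built from userId, phone, random code, and expiry, then encrypted."""
    body = json.dumps({"uid": user_id, "phone": phone,
                       "rand": secrets.token_hex(4),
                       "exp": int(time.time()) + ttl})
    return _xor(body.encode(), key).hex()


def validate_fallback(token, key, lookup_user):
    """Redis-down path: decrypt, check expiry, then confirm against the DB
    via lookup_user (which should sit behind a rate limiter)."""
    data = json.loads(_xor(bytes.fromhex(token), key).decode())
    if data["exp"] < time.time():
        return False
    user = lookup_user(data["uid"])  # rate-limited DB read
    return user is not None and user["phone"] == data["phone"]
```

Because the token is self-describing, validation can proceed without the cache; the rate limiter is what keeps the DB from absorbing the full validation load.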
5. Data Security
Sensitive data is stored separately and protected in layers: passwords are salted and hashed with adaptive algorithms (bcrypt or scrypt), and password strength is enforced via blacklist checks. Bcrypt provides per‑hash random salts; scrypt adds memory‑hard computation. Both resist rainbow‑table attacks at the cost of higher CPU and memory usage.
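With Python's standard library, the scrypt variant can be sketched directly; the cost parameters below (n=2**14, r=8, p=1) are common defaults, not values from the article:

```python
import hashlib
import hmac
import os


def hash_password(password, salt=None):
    """scrypt: per-hash random salt plus memory-hard key derivation."""
    salt = salt or os.urandom(16)
    digest = hashlib.scrypt(password.encode(), salt=salt,
                            n=2**14, r=8, p=1)  # CPU/memory cost parameters
    return salt, digest


def check_password(password, salt, digest):
    candidate = hashlib.scrypt(password.encode(), salt=salt,
                               n=2**14, r=8, p=1)
    return hmac.compare_digest(candidate, digest)  # constant-time compare
```

The random per-hash salt is what defeats precomputed rainbow tables: identical passwords produce different digests.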
6. Asynchronous Consumption Design
After login or registration, user actions are logged and published to a message queue. Downstream services (e.g., points, coupons) consume these events asynchronously, decoupling the user center from heavy business logic and enabling compensation mechanisms when the queue is unavailable.
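The decoupling and compensation pattern can be sketched with an in-process queue standing in for Kafka/RocketMQ; topic, handler, and fallback names are assumptions:

```python
import queue

events = queue.Queue()
fallback_log = []  # compensation store used when the queue is unavailable


def publish_login_event(user_id, queue_available=True):
    event = {"type": "login", "uid": user_id}
    if queue_available:
        events.put(event)
    else:
        fallback_log.append(event)  # replayed later by a compensation job


def points_consumer(award_points):
    """Downstream service (e.g. points): drains events asynchronously,
    so the user center never waits on points/coupon logic."""
    while not events.empty():
        event = events.get()
        if event["type"] == "login":
            award_points(event["uid"], 10)
```

The login path only publishes; everything heavy happens in the consumer, and the fallback log guarantees events survive a queue outage.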
7. Flexible Monitoring
Critical metrics such as QPS, memory usage, GC time, service latency, database binlog rates, and Zipkin‑based end‑to‑end traces are monitored. Alerts trigger on abnormal drops or spikes, allowing rapid response to protect login/registration availability.
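A drop/spike alert on a metric window might be as simple as the following sketch; the ratios are illustrative thresholds, not values from the article:

```python
def qps_alert(window, baseline, drop_ratio=0.5, spike_ratio=2.0):
    """Alert when average QPS over the window falls below or rises above
    the baseline by the configured ratios; returns None when healthy."""
    avg = sum(window) / len(window)
    if avg < baseline * drop_ratio:
        return "drop"   # e.g. login QPS collapsed: possible outage
    if avg > baseline * spike_ratio:
        return "spike"  # e.g. registration QPS surged: possible attack
    return None
```

Both directions matter for a user center: a drop suggests a broken login path, while a spike can indicate credential-stuffing or bot registration.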
8. Summary
The article outlines a holistic design for a billion‑user user center, covering service decomposition, API strategies, token fallback, data protection, sharding, async processing, and observability. It also notes remaining challenges like auth‑service separation, monitoring granularity, and continuous improvement of security, availability, and performance.