
Designing Scalable Login Schemas for Billions of Users

This article explains how to design a flexible, extensible database schema and login flow for a system with a billion users, covering multi‑credential handling, sharding strategies, hash‑based routing, and practical considerations such as password updates and caching.

21CTO

Feb 27, 2022


In an interview scenario, a candidate is asked to design a table structure and login process for a system with a billion users who can log in via phone number, email, nickname, or third‑party accounts.

The question assesses the candidate's experience with similar business logic, database design skills, and understanding of sharding across databases and tables under high concurrency.

Table Structure Design

When multiple login methods exist, a user may have several associated accounts. A naïve design adds a column for each credential (e.g., phone, email, nick_name), resulting in a table like:

id | name | phone | email | nick_name | desc

The main drawback of this approach is that adding a new login method requires altering the table schema and updating code, which increases the risk of bugs.

A more flexible solution is to store login credentials in a separate authorization table, where each credential is a row rather than a column. The user table then holds only non‑login related information.

In the example below, the authorization table holds one row per credential, and the user table stores basic profile data.

Sample data for the authorization table (Author):

id   user_name          type  passwd  user_id
10001 siyuanwai          01    xxxxx   1
10002 [email protected] 02    xxxxx   1

Sample data for the user table (User):

id  nick_name  logo_url      user_number  user_names
1   四猿外    /pic/xyz.png  xxxxx        siyuanwai,[email protected]

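Note: Because each credential row carries its own copy of the password, changing a password means updating every authorization row that belongs to the user (both rows in the sample above). The rows should be updated together so the credentials cannot drift apart.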
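The two tables and the password update can be sketched with Python's built-in sqlite3 module. This is a minimal illustration, not the article's actual schema: the table names users and user_auth, the column types, the index, and the change_password helper are assumptions modelled on the sample data, and the email address is a placeholder because the original value is redacted in the source.

```python
import sqlite3

# Hypothetical schema modelled on the sample data; names and types are assumptions.
SCHEMA = """
CREATE TABLE users (
    id          INTEGER PRIMARY KEY,
    nick_name   TEXT NOT NULL,
    logo_url    TEXT,
    user_number TEXT,
    user_names  TEXT
);
CREATE TABLE user_auth (
    id        INTEGER PRIMARY KEY,
    user_name TEXT NOT NULL UNIQUE,  -- the credential itself: nickname, email, phone
    type      TEXT NOT NULL,         -- credential type code, e.g. '01', '02'
    passwd    TEXT NOT NULL,         -- password hash, never the plain text
    user_id   INTEGER NOT NULL REFERENCES users(id)
);
CREATE INDEX idx_user_auth_user_id ON user_auth(user_id);
"""


def change_password(conn, user_id, new_passwd_hash):
    """Update every credential row of one user inside a single transaction."""
    with conn:  # commits on success, rolls back if an exception is raised
        cur = conn.execute(
            "UPDATE user_auth SET passwd = ? WHERE user_id = ?",
            (new_passwd_hash, user_id),
        )
    return cur.rowcount


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.execute(
    "INSERT INTO users VALUES (?, ?, ?, ?, ?)",
    (1, "四猿外", "/pic/xyz.png", "xxxxx", "siyuanwai,siyuanwai@example.com"),
)
conn.executemany(
    "INSERT INTO user_auth VALUES (?, ?, ?, ?, ?)",
    [
        (10001, "siyuanwai", "01", "old-hash", 1),
        (10002, "siyuanwai@example.com", "02", "old-hash", 1),  # placeholder email
    ],
)
conn.commit()

updated = change_password(conn, 1, "new-hash")
print(updated)  # 2: both credential rows were updated together
```

A single-database transaction like this only covers the unsharded case. Once the authorization rows of one user live on different shards, the update needs some cross-shard coordination, which this sketch does not attempt.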

Login Process for a Billion‑User System

With a billion users, high concurrency and large data volume demand sharding. Assuming a MySQL instance can reliably handle ~1,000 TPS and a single table performs well up to ~10 million rows, we aim to keep each table at or below ten million rows. One billion rows divided by ten million rows per table gives roughly 100 tables.

Distribute these tables across 10 databases (10 tables per DB) to stay well within capacity.

Sharding strategy:

Use user_id modulo operations to determine database and table. For example, user_id % 10 selects the database, and user_id % 100 selects the table.

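If instead every database holds the full set of table indexes (0 to 99) and both indexes are derived from the same value, choose co‑prime database and table counts (e.g., 11 databases and 100 tables) to avoid uneven distribution. With 10 and 100, user_id % 10 is fully determined by user_id % 100, so each database would only ever fill 10 of its 100 tables.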
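The effect of co-prime counts can be checked with a few lines of Python. This is a small sketch under one assumption: every database contains tables numbered 0 to 99, and both indexes are taken from the same user_id. The helper name used_slots is hypothetical.

```python
def used_slots(db_count, table_count, user_ids):
    """Return the set of (db_index, table_index) pairs that receive at least one row."""
    return {(uid % db_count, uid % table_count) for uid in user_ids}


ids = range(1, 100_001)

slots_10 = used_slots(10, 100, ids)  # 10 databases x 100 tables each
slots_11 = used_slots(11, 100, ids)  # 11 databases x 100 tables each

print(len(slots_10), "of", 10 * 100)  # 100 of 1000
print(len(slots_11), "of", 11 * 100)  # 1100 of 1100
```

With 10 databases, only a tenth of the (database, table) slots ever receive data. With 11 databases, every slot is used, because 11 and 100 share no common factor.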

For the authorization table, which stores login credentials, sharding is based on a hash of user_name:

db_index = hash(user_name) % 11
table_index = hash(user_name) % 100
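The article does not say which hash function to use. The sketch below assumes SHA-256 truncated to 64 bits, chosen only because the result must be identical on every server and across restarts (Python's built-in hash() is randomized per process for strings, so it would not work here). The names stable_hash and route are hypothetical.

```python
import hashlib

DB_COUNT = 11      # database count from the example above
TABLE_COUNT = 100  # tables per database from the example above


def stable_hash(user_name: str) -> int:
    """Map a credential to a 64-bit integer that is the same in every process."""
    digest = hashlib.sha256(user_name.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def route(user_name: str) -> tuple[int, int]:
    """Return (db_index, table_index) for the authorization row of a credential."""
    h = stable_hash(user_name)
    return h % DB_COUNT, h % TABLE_COUNT


db_index, table_index = route("siyuanwai")
print(db_index, table_index)
```

Whatever function is chosen, the credential should be normalized the same way at registration and at login (for example, consistent handling of letter case in email addresses), otherwise the same user could be routed to two different shards.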

The login flow then proceeds as follows:

The user enters a credential (e.g., email) and password.

System determines the credential type.

Information is sent to the server.

A routing layer computes hash(credential) % 11 and % 100 to locate the correct database and table.

The server queries the authorization table to verify the password.
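The flow above can be sketched end to end in Python, with an in-memory dictionary standing in for the sharded authorization tables. Everything here is an assumption made for illustration: the type-detection rules, the salted PBKDF2 password hashing, and the helper names register and login are not taken from the article, and third-party accounts are left out.

```python
import hashlib
import hmac
import os
import re

DB_COUNT, TABLE_COUNT = 11, 100
shards = {}  # stand-in for the real shards: (db_index, table_index) -> {user_name: row}


def route(credential):
    """Locate the shard of a credential with a process-independent hash."""
    digest = hashlib.sha256(credential.encode("utf-8")).digest()
    h = int.from_bytes(digest[:8], "big")
    return h % DB_COUNT, h % TABLE_COUNT


def credential_type(credential):
    """Rough, hypothetical classification of what the user typed."""
    if "@" in credential:
        return "email"
    if re.fullmatch(r"\+?\d{7,15}", credential):
        return "phone"
    return "nick_name"


def hash_password(password, salt):
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)


def register(credential, password, user_id):
    """Store one authorization row in the shard selected by the credential."""
    salt = os.urandom(16)
    row = {
        "type": credential_type(credential),
        "salt": salt,
        "passwd": hash_password(password, salt),
        "user_id": user_id,
    }
    shards.setdefault(route(credential), {})[credential] = row


def login(credential, password):
    """Return the user_id on success, or None if the credential or password is wrong."""
    row = shards.get(route(credential), {}).get(credential)
    if row is None:
        return None
    candidate = hash_password(password, row["salt"])
    # Constant-time comparison avoids leaking how many bytes matched.
    if hmac.compare_digest(candidate, row["passwd"]):
        return row["user_id"]
    return None


register("siyuanwai", "s3cret", user_id=1)
register("siyuanwai@example.com", "s3cret", user_id=1)  # placeholder email

print(login("siyuanwai@example.com", "s3cret"))  # 1
print(login("siyuanwai", "wrong-password"))      # None
```

Because routing depends only on the credential, a login touches exactly one shard. The returned user_id can then be used to fetch the profile from the user table, which is sharded separately by user_id.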

In real‑world scenarios, additional layers such as external caches (Redis) and search indexes (Elasticsearch) are often added to store frequently accessed user metadata and routing information, reducing load on the sharded databases.
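One common way to put a cache in front of the shards is the cache-aside pattern: read from the cache first, fall back to the database on a miss, then populate the cache. The sketch below uses a small in-process class as a stand-in for Redis. TTLCache, load_profile_from_shard, get_profile, and the 300-second expiry are all hypothetical choices for illustration, not details from the article.

```python
import time


class TTLCache:
    """Tiny in-process cache with per-entry expiry (a stand-in for Redis)."""

    def __init__(self, ttl_seconds):
        self.ttl = ttl_seconds
        self._data = {}

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() >= expires_at:
            del self._data[key]  # expired: treat as a miss
            return None
        return value

    def set(self, key, value):
        self._data[key] = (value, time.monotonic() + self.ttl)


cache = TTLCache(ttl_seconds=300)
stats = {"db_reads": 0}


def load_profile_from_shard(user_id):
    """Placeholder for a query against the sharded user table."""
    stats["db_reads"] += 1
    return {"id": user_id, "nick_name": "四猿外", "logo_url": "/pic/xyz.png"}


def get_profile(user_id):
    """Cache-aside read: try the cache, fall back to the shard, then fill the cache."""
    profile = cache.get(user_id)
    if profile is None:
        profile = load_profile_from_shard(user_id)
        cache.set(user_id, profile)
    return profile


first = get_profile(1)
second = get_profile(1)
print(stats["db_reads"])  # 1: the second read was served from the cache
```

Cached entries also need to be invalidated or refreshed when the underlying row changes, for example after a profile edit or a password update.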
