Databases 23 min read

Comprehensive Database Development Standards for Consistency, Performance, and Maintainability

This guide presents a complete database development specification—covering naming, design, data types, index, and SQL rules—with mandatory, recommended, and reference levels to ensure consistent, high‑performance, and maintainable database schemas.

Architect's Journey
Architect's Journey
Architect's Journey
Comprehensive Database Development Standards for Consistency, Performance, and Maintainability

Overview

This document defines a database development specification based on Alibaba’s standards, organized into five core dimensions—naming, design, type, index, and SQL. Each rule is classified as mandatory, recommended, or reference to guarantee schema consistency, high performance, and long‑term maintainability.

1. Naming Conventions

Boolean fields : Use the is_xxx pattern with unsigned tinyint (1 = yes, 0 = no). Example: is_deleted. Avoid delete_flag or isDelete.

Table/column names : Lower‑case letters, digits, and underscores only; no leading digit; no _digit_ pattern; max 32 characters; words separated by underscores. Example: health_user. Reject HealthUser, rdcConfig, level_3_name.

Singular table names : Use singular nouns to match DO class names. Example: user, order. Plural forms ( users, orders) are prohibited.

Reserved words : MySQL reserved keywords (e.g., desc, range, match) must not be used for database, table, or column names.

Index naming : Primary key pk_<column>, unique index uk_<column>, ordinary index idx_<column>. Example: pk_id, uk_user_phone, idx_create_time.

Database name : ≤30 characters, letters, digits, underscores, start with a letter, all lower‑case; format <system>_<subsystem>. Example: health_user, trade_payment.

Essential columns : Every table must contain id (bigint unsigned, auto‑increment), is_delete, gmt_create, and gmt_modified. id step size ≥ number of data centers; timestamps are datetime.

Full‑name requirement : Use full English words unless a default abbreviation exists; abbreviate only when length exceeds 30 characters (e.g., description → desc).

2. Design Guidelines

Character set (mandatory): Create databases with explicit charset, only utf8 or utf8mb4. Example: CREATE DATABASE crm_fund DEFAULT CHARACTER SET 'utf8'; Principle of least privilege (recommended): Application accounts should have single‑database access, no DROP privilege, and avoid cross‑database operations.

Temporary and backup databases : Prefix tmp_ or bak_ followed by date. Example: tmp_user_20251128, bak_order_20251127.

3. Table & Field Design

Table creation (mandatory): Specify charset ( utf8/utf8mb4) and storage engine (default InnoDB).

Field redundancy (mandatory): Allow redundancy to improve query speed if the field is rarely updated, not a long varchar/text, and redundancy ≤ 20 % of total fields. Example: duplicate category name in product table.

Sharding threshold (recommended): Split when rows > 5 million or size > 2 GB; otherwise defer.

Primary key (recommended): id must be monotonic; use MySQL auto‑increment for small systems, global ID or unique identifier for large or sharded systems.

Id type (mandatory): Use bigint unsigned; never use int even for small data volumes.

NOT NULL & defaults (recommended): Set NOT NULL with sensible defaults (empty string for text, 0 for numbers, 0 for booleans).

Comments (recommended): Every table and column must have clear comments describing purpose and special rules.

Time format (recommended): Store timestamps as yyyy-MM-dd HH:mm:ss in datetime.

Update gmt_modified (mandatory): On any row update, set gmt_modified to the current time.

Field count limits (mandatory): ≤ 50 fields recommended, hard limit 100.

Sensitive data (mandatory): Never store plain‑text passwords; encrypt them. Large binary data (images, files) should reside on file servers, with only the path stored in the DB.

Required time fields (suggested): Include gmt_create and gmt_modified in every table.

Consistency of related fields (suggested): Columns storing the same data must have identical names and types across tables (e.g., user_id bigint unsigned in both user and order tables).

Reserved fields (suggested): Allocate 1‑3 placeholder columns for future extensions.

Concise naming (mandatory): Use clear English words or common abbreviations (e.g., corp_id instead of corporation_id).

4. Type Specifications

Status fields (mandatory): Use TINYINT UNSIGNED for values 0‑255; avoid ENUM. Document each status in the comment.

Boolean mapping (mandatory): Represent booleans with TINYINT(1) (MySQL has no native BOOLEAN).

Unsigned for non‑negative numbers (mandatory): Add UNSIGNED to any numeric column that cannot be negative.

Date & time (mandatory): Use DATE, YEAR, or DATETIME instead of strings.

String types (mandatory): Define VARCHAR(N) with the smallest feasible N (e.g., code VARCHAR(32)); avoid BLOB / TEXT unless absolutely necessary, and isolate them in separate tables if used.

Precise numeric (mandatory): Use DECIMAL for monetary values; never use FLOAT / DOUBLE.

Type selection principle (mandatory): Do not use BIGINT unless the value range truly requires it; always provide column descriptions and enumerate possible values for enums.

Storage efficiency (recommended): Choose the smallest type that satisfies business needs (e.g., TINYINT for 0‑100).

Character length adaptation (reference): Adjust VARCHAR length based on actual business scenarios to save space and improve index selectivity.

5. Index Guidelines

Unique index (mandatory): Fields with business‑level uniqueness must have a unique index, including composite keys.

Join limit (mandatory): Prohibit joins involving more than three tables; join columns must share the same type and be indexed.

Varchar index length (mandatory): Specify a prefix length that achieves ≥ 90 % distinctness (commonly 20 characters); calculate distinctness via COUNT(DISTINCT LEFT(col, len))/COUNT(*).

Fuzzy query restriction (mandatory): Disallow leading‑wildcard searches ( %value)—use a search engine instead.

Index rebuild (mandatory): Create a new index before dropping the old one, especially for unique indexes.

Index count (reference): Limit to ≤ 6 indexes per table to avoid update overhead.

Order‑by alignment (recommended): Place the ORDER BY column as the last column of a composite index to avoid filesort. Example: WHERE a=? AND b=? ORDER BY c → index a_b_c.

Covering index (recommended): Prefer indexes that contain all columns needed by the query; verify with EXPLAIN showing Using index.

Pagination optimization (recommended): Use delayed association or sub‑query to fetch IDs first, then join to retrieve full rows.

Performance level target (recommended): Aim for CONST (primary/unique match) → REF (ordinary index) → RANGE (range scan) as the minimum.

Composite index order (recommended): Order columns by high selectivity → short length → frequent use; equality columns first.

Index invalidation avoidance (recommended): Prevent implicit type conversion (e.g., comparing string to numeric) that disables index usage.

Index column selection (recommended): Prioritize columns appearing in WHERE, ORDER BY, GROUP BY, DISTINCT, and join conditions; favor a single composite index over many single‑column indexes.

6. SQL Guidelines

Basic query (mandatory): Use COUNT(*) for row counts; avoid COUNT(col) which skips NULL values.

Distinct & sum : COUNT(DISTINCT col) counts non‑NULL distinct rows; COUNT(DISTINCT col1, col2) returns 0 if any column is all NULL. Use IF(ISNULL(SUM(g)),0,SUM(g)) to avoid NULL sum.

NULL checks : Use ISNULL() instead of direct comparison.

Pagination : If COUNT returns 0, skip further pagination; add LIMIT 1 when only one row is expected.

Foreign keys & cascade (mandatory): Do not use foreign keys or cascade actions; implement referential logic in the application layer.

Stored procedures (mandatory): Prohibited due to debugging difficulty and lack of portability.

Data audit : Query primary key before delete/update and use it in the DML to prevent accidental data loss.

Large‑table limits : Tables > 5 million rows must not be joined; tables > 500 k rows must not use aggregation or GROUP BY. Large‑table queries must use high‑selectivity indexes and return ≤ 100 rows.

Batch operation limits : Split bulk UPDATE/DELETE/INSERT into batches with LIMIT; INSERT ... VALUES(...) cannot contain more than 5 000 rows.

Index column expression : Do not apply functions to indexed columns in WHERE (e.g., WHERE DATE(create_time)='2025-11-28').

Multi‑table update (mandatory): Disallow a single statement that updates multiple tables.

Transaction control : Keep transactions small (< 5 statements), avoid long‑running locks, use read‑only queries without a transaction, and base updates on primary/unique keys.

Subquery optimization : Prefer JOIN over subqueries that generate temporary tables.

DML condition : Every UPDATE/DELETE must have a WHERE clause using an index; unconditional modifications are forbidden.

IN clause limit : Limit the number of elements to ≤ 1 000; MyBatis IN must check for empty collections to avoid full scans.

7. Recommended SQL Practices

Batch INSERT using multi‑row syntax to improve throughput.

Avoid implicit type conversion (e.g., comparing a bigint column to a quoted string).

Always specify column names in INSERT statements.

Prefer UNION ALL over UNION; limit UNION clauses to ≤ 5.

Minimize ORDER BY; when necessary, ensure an appropriate index exists.

Avoid INSERT ... ON DUPLICATE KEY UPDATE in high‑concurrency scenarios.

Optimize multi‑table joins by placing high‑selectivity columns first and leveraging covering indexes.

8. SQL Performance Summary

① Put highly selective WHERE conditions early; ② Avoid GROUP BY, DISTINCT, joins, and subqueries when possible; ③ Use index‑friendly operators ( =, >, <=, BETWEEN, LIKE 'abc%'); ④ Avoid SELECT * and fetch only needed columns; ⑤ Replace large IN lists with BETWEEN for continuous ranges; ⑥ Keep LIMIT small to prevent inefficient pagination.

9. Example DDL

CREATE TABLE health_package (
  id bigint unsigned NOT NULL AUTO_INCREMENT COMMENT '序号(主键)',
  package_id int unsigned NOT NULL COMMENT '套系id',
  module_id int unsigned NOT NULL COMMENT '模块id',
  package_name varchar(64) NOT NULL DEFAULT '' COMMENT '套系名称',
  price decimal(10,2) NOT NULL DEFAULT 0.00 COMMENT '套系价格',
  status tinyint unsigned NOT NULL DEFAULT 0 COMMENT '状态:0-未上架,1-已上架,2-已下架',
  is_delete tinyint unsigned NOT NULL DEFAULT 0 COMMENT '是否删除:0-未删除,1-删除',
  gmt_create datetime NOT NULL DEFAULT CURRENT_TIMESTAMP COMMENT '创建时间',
  gmt_modified datetime NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP COMMENT '更新时间',
  PRIMARY KEY (id),
  UNIQUE KEY uk_package_id (package_id),
  KEY idx_module_id_status (module_id, status),
  KEY idx_gmt_create (gmt_create)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COMMENT='健康业务套系与模块关联表';
PerformanceMySQLdatabase designNaming ConventionsSQL Best PracticesIndex GuidelinesSchema Standards
Architect's Journey
Written by

Architect's Journey

E‑commerce, SaaS, AI architect; DDD enthusiast; SKILL enthusiast

0 followers
Reader feedback

How this landed with the community

Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.