Why Auto‑Increment IDs, Redundant Fields, and Index Limits Matter in Database Design
This article explores key database design decisions—including the benefits and drawbacks of auto‑increment primary keys, the trade‑offs of redundant columns, optimal index counts, and when to choose VARCHAR versus CHAR—offering practical guidelines to improve performance, storage efficiency, and maintainability.
Database is the foundation of a system; its design directly impacts performance and scalability. This article examines Suning's supply‑chain procurement platform over five years, summarizing database design issues and extracting four design guidelines.
01 Auto‑Increment ID as Primary Key
Analysis of InnoDB table and index structures shows B+‑tree leaf nodes are ordered linked lists. Using an auto‑increment integer as the primary key brings several advantages:
Faster inserts – new rows are always appended to the leaf, avoiding page splits caused by random keys such as UUID.
Space efficiency – a 4‑byte INT occupies far less space than VARCHAR, allowing more rows per 16 KB page and reducing I/O.
Note: the length specification (e.g., INT(1) vs INT(11)) does not limit the numeric range.
Critical view : Not all tables should use an auto‑increment key. When queries rely on secondary indexes, a “back‑track” to the primary key adds overhead, and during recovery using binlog the auto‑increment value may clash.
02 Redundant Fields to Speed Retrieval
As system functionality grows, many queries need data from multiple tables. Adding redundant columns reduces join complexity and improves response time, at the cost of extra storage.
Resource waste – extra columns consume disk space, which can be significant at billions of rows.
Increased maintenance – schema changes must be applied to every table containing the redundant field.
Potential performance impact – longer rows reduce the number of rows per page, increasing I/O.
03 Limit on Number of Indexes per Table
Adding indexes improves query speed but also consumes disk space and slows data modification because each index must be maintained.
Disk usage – each index stores a copy of the indexed columns and the primary key.
Write performance degradation – insert, update, delete operations must update all indexes.
Guideline: keep the total number of indexes on a single table to five, unless a specific workload (e.g., reporting) justifies more.
04 Choosing Between VARCHAR and CHAR
VARCHAR stores variable‑length strings, allocating only the actual length plus 1‑2 bytes for length metadata, while CHAR is fixed‑length.
Space allocation – VARCHAR saves space for short values; CHAR can be more efficient for truly fixed‑length data such as codes.
Maximum length – VARCHAR up to 65 535 bytes, CHAR up to 255 bytes.
Performance – CHAR avoids length calculations, offering slightly faster reads/writes for short, fixed‑size fields.
In practice, select CHAR for short, invariant fields (e.g., fixed‑length codes) and VARCHAR for variable‑length text. Apply the guidelines flexibly according to business needs and data volume.
Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
Suning Technology
Official Suning Technology account. Explains cutting-edge retail technology and shares Suning's tech practices.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
