Databases 15 min read

Relational Database Design with MySQL: ER Modeling, Normalization, Storage Engine, Charset, Data Types, and Index Design

This article explains relational database design using MySQL, covering ER modeling, BCNF and 3NF normalization, storage engine and charset choices, data type selection, and index design, illustrated with examples and practical guidelines for building a student course selection system.

Architect's Guide
Architect's Guide
Architect's Guide
Relational Database Design with MySQL: ER Modeling, Normalization, Storage Engine, Charset, Data Types, and Index Design

Abstract

This article discusses relational database design topics such as the relational model and table structure design, using a student course selection scenario to illustrate the design process while simplifying the content for readers with a basic understanding of relational databases.

Entity-Relationship Model (E‑R)

First, understand what an E‑R data model is and its purpose.

The E‑R model maps real‑world facts and relationships to a conceptual schema, using three main concepts: entity set, relationship set, and attribute.

实体 (Entity): An object or event distinguishable from others, e.g., each student or each course.

属性 (Attribute): Descriptive properties of an entity, e.g., a student's name or ID.

实体集 (Entity Set): A collection of entities of the same type, such as all students.

关系 (Relationship): Associations between entities, e.g., a student enrolling in a course.

关系集 (Relationship Set): A collection of similar relationships, e.g., all student‑course enrollments.

With this knowledge, we can draw the E‑R diagram for the student course selection scenario.

The diagram shows student attributes (ID, name, age, gender), course attributes (course ID, name, credits), and enrollment attributes (grade). The student‑course relationship is many‑to‑many.

Relationship Table Design

From the E‑R diagram we derive relational schemas, aiming to avoid redundant data while enabling easy information retrieval. Normalization helps achieve this goal.

Boyce‑Codd Normal Form (BCNF)

BCNF requires that for every non‑trivial functional dependency α→β, either α is a superkey or the dependency is trivial.

α→β is a trivial functional dependency (β ⊂ α).

α is a superkey of the relation.

Example schemas:

Student = (student_id, name, age, gender) with student_id → name, age, gender.

Course = (course_id, name, credits) with course_id → name, credits.

Enrollment = (student_id, course_id, grade) with (student_id, course_id) → grade.

All these schemas satisfy BCNF because the left‑hand side of each non‑trivial dependency is a candidate key.

However, not every schema can be decomposed into BCNF while preserving all dependencies, as shown by the Banker‑schema example.

Third Normal Form (3NF)

If BCNF cannot be achieved without losing dependency preservation or lossless join, we can accept 3NF, which guarantees lossless join and dependency preservation.

α→β is trivial.

α is a superkey.

Each attribute in β‑α is part of a candidate key.

The Banker‑schema is not in BCNF but does satisfy 3NF.

Every BCNF relation is also in 3NF because BCNF imposes stricter constraints.

Storage Engine Selection

After fixing the relational schema, the next step is to choose a storage engine. MySQL provides several engines tailored for different use cases.

The two most common engines are MyISAM (default before MySQL 5.5.5) and InnoDB (default since MySQL 5.5.5).

How to Choose?

Choose based on application characteristics; mixing engines is possible but adds complexity, especially for backup and recovery.

InnoDB is preferred for transactional workloads.

MyISAM suits most non‑transactional scenarios.

Memory engine offers in‑memory speed (often replaced by Redis or other NoSQL solutions).

Charset Selection

Choosing the right character set is crucial; a poor choice early on can lead to costly migrations later.

Guidelines

Prefer smaller character sets that meet current and near‑future requirements to save space and improve performance. Use database‑level or table‑level charset settings for flexibility and consistency.

Data Type Selection

Selection Principles

Use the most precise type that fits the data to minimize storage and I/O.

Fixed‑Length vs. Variable‑Length

CHAR vs. VARCHAR

Example comparison:

mysql> CREATE TABLE vc (v VARCHAR(4), c CHAR(4));
Query OK, 0 rows affected (0.01 sec)

mysql> INSERT INTO vc VALUES ('ab  ', 'ab  ');
Query OK, 1 row affected (0.00 sec)

mysql> SELECT CONCAT('(', v, ')'), CONCAT('(', c, ')') FROM vc;
+---------------------+---------------------+
| CONCAT('(', v, ')') | CONCAT('(', c, ')') |
+---------------------+---------------------+
| (ab  )              | (ab)                |
+---------------------+---------------------+
1 row in set (0.06 sec)

In InnoDB, fixed‑length CHAR does not guarantee better performance than VARCHAR; overall row size matters more.

TEXT and BLOB

After heavy deletes/updates, run OPTIMIZE TABLE to defragment.

Avoid querying large TEXT/BLOB columns to reduce I/O.

Consider separating them into dedicated tables.

Floating‑Point vs. Fixed‑Point

MySQL uses FLOAT/DOUBLE for floating‑point and DECIMAL for fixed‑point. Floating‑point offers a larger range but suffers from precision issues, especially for monetary values.

Use fixed‑point (DECIMAL) for financial calculations.

Integer Types

MySQL supports standard INTEGER/SMALLINT and extensions TINYINT, MEDIUMINT, BIGINT. The following table shows storage requirements and ranges.

Index Design

Design Principles

Index columns used in WHERE clauses or join conditions, not just SELECT lists.

Prefer unique indexes; non‑unique indexes are less effective.

Use short indexes or prefix lengths for long string columns.

Leverage leftmost prefix of composite indexes.

Avoid over‑indexing.

Do not index columns used only in functions.

Example

Based on the student‑course E‑R diagram, the following tables are designed:

Student table (Student)

Course table (Course)

Enrollment table (SC)

Student name length set to 40 to accommodate foreign names.

Gender stored as an ENUM for clarity.

Birthdate stored instead of age to keep data current.

Grades stored as DOUBLE (precision not critical).

(student_id, course_id) used as a composite primary key.

References

(美) Abraham Silberschatz et al., *Database System Concepts*, Beijing: Mechanical Industry Press, 2012.

MySQL 5.7 Reference Manual.

eimhe.com – NetEase Tech Department MySQL Chinese Materials.

IndexingStorage EngineMySQLDatabase DesignData TypesnormalizationER Model
Architect's Guide
Written by

Architect's Guide

Dedicated to sharing programmer-architect skills—Java backend, system, microservice, and distributed architectures—to help you become a senior architect.

0 followers
Reader feedback

How this landed with the community

login Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.