Databases 11 min read

Why Leading Companies Avoid NULL Values in MySQL

The article explains why major tech companies discourage using NULL in MySQL, covering its meaning as an unknown state, three-valued logic pitfalls such as NOT IN subqueries, index inefficiencies, aggregate function quirks, storage overhead, Java handling issues, and offers practical alternatives like NOT NULL constraints with sensible defaults.

Su San Talks Tech
Su San Talks Tech
Su San Talks Tech
Why Leading Companies Avoid NULL Values in MySQL

Introduction

When preparing for big‑company interviews, you often see the question “Why should database design avoid NULL?” Many answer “because NULL queries are slow,” but the pitfalls are deeper.

1. Meaning of NULL

NULL is not an empty string nor zero. In MySQL it represents “unknown value” – a state, not a concrete value. It differs from '' and 0, and occupies storage via an extra flag bit. NULL – unknown / non‑existent, stored with a flag. '' – empty string, known to be empty. 0 – numeric zero, known to be zero.

The biggest difference: any operation with NULL yields NULL, and comparisons return NULL (neither TRUE nor FALSE).

SELECT NULL = 0;   -- result NULL
SELECT NULL = '';  -- result NULL
SELECT NULL != NULL; -- result NULL

This leads to MySQL’s three‑valued logic.

2. Three‑valued logic

Standard Boolean logic has only TRUE and FALSE. MySQL adds a third result, UNKNOWN, when NULL participates.

In a WHERE clause only rows that evaluate to TRUE are kept; FALSE and UNKNOWN are filtered out.

Therefore a NOT IN subquery that returns NULL makes the whole condition evaluate to UNKNOWN and the query returns no rows.

Example:

CREATE TABLE users (id INT, name VARCHAR(20));
INSERT INTO users VALUES (1,'Alice'), (2,'Bob'), (3,NULL);
SELECT * FROM users WHERE id NOT IN (SELECT id FROM users WHERE name = 'Alice');
-- Expected: Bob, but actual result is empty because the subquery returns NULL.

Analysis: the subquery returns 1, but the presence of NULL makes NOT IN compare each row with NULL, turning the condition into UNKNOWN, which filters everything.

Solution: filter NULLs in the subquery ( WHERE column IS NOT NULL) or rewrite using NOT EXISTS.

SELECT * FROM users u
WHERE NOT EXISTS (
    SELECT 1 FROM users
    WHERE id = u.id AND name = 'Alice'
);

3. NULL and index performance

3.1 Index does not store NULL values

In InnoDB B‑Tree secondary indexes, NULL values are not stored (except in unique indexes). Consequently a query WHERE column IS NULL cannot use the secondary index and forces a full‑table scan.

Test: a table with 1 000 000 rows, column status 90 % NULL, 10 % non‑NULL. WHERE status = 'ACTIVE' uses the index (many non‑NULL values). WHERE status IS NULL triggers a full scan and performs poorly.

3.2 Composite index and NULL

In a composite index (a, b), if a is NULL the row is omitted from the index. A condition like WHERE a = 1 AND b IS NULL can only use the leading part of the index; the NULL part cannot be filtered via the index.

4. Hidden traps in aggregate functions

-- Table data: amount column has (100, 200, NULL, 400)
SELECT AVG(amount) FROM orders;   -- (100+200+400)/3 = 233.33, NULL ignored
SELECT COUNT(amount) FROM orders; -- 3, NULL ignored
SELECT SUM(amount) FROM orders;   -- 700, NULL ignored

If you expect COUNT(*) to count all rows but use COUNT(column), NULL rows are omitted, leading to incorrect reports. Using such results in division can produce wrong averages.

5. Storage and performance overhead

In InnoDB each row has a NULL‑flag bit (1 bit per column, 8 columns per byte). The extra space is small, but with many NULL columns and large tables the waste becomes noticeable.

More importantly, NULL prevents certain storage‑engine optimizations such as column compression and page‑level pre‑processing. For frequently updated tables, NULL can cause row overflow and lower cache hit rates.

6. Application‑layer “null‑pointer nightmare”

In Java, retrieving a NULL column requires checking ResultSet.wasNull(). If omitted, getLong() returns 0, which may be a valid business value, causing logic errors. Mapping NULL to a POJO field of type Long without a null check can raise NullPointerException.

Long amount = rs.getLong("amount");
if (rs.wasNull()) {
    // special handling
}

7. Recommended solutions

In most scenarios replace NULL with NOT NULL constraints and sensible default values. age INT NOT NULL DEFAULT 0 (0 means unknown by convention)

name VARCHAR(20) NOT NULL DEFAULT ''
status TINYINT NOT NULL DEFAULT 0

(0 as default state)

price DECIMAL(10,2) NOT NULL DEFAULT 0.00
Note: Default values should match business semantics; e.g., using –1 for unknown age may be clearer than 0.

8. When NULL is appropriate

Optional foreign keys, e.g., coupon_id in an order table when no coupon is used.

Missing information, e.g., middle name not provided during registration.

Data loss in ETL pipelines where the field truly does not exist; NULL conveys absence better than a magic number, but remember to index IS NULL checks if needed.

Conclusion

Big‑tech companies discourage NULL for five main reasons:

Three‑valued logic traps (especially NOT IN subqueries).

Index performance degradation – IS NULL forces full scans.

Aggregate functions ignore NULL, leading to statistical errors.

Additional storage overhead and loss of compression optimizations.

Application‑layer complexity and potential NullPointerExceptions.

Most business scenarios can be handled with NOT NULL plus appropriate defaults. Only use NULL when the “unknown” semantics are explicitly required.

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Sign in to view source
Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactadmin@besthub.devand we will review it promptly.

SQLmysqlDatabase designIndex OptimizationNULLThree-valued logic
Su San Talks Tech
Written by

Su San Talks Tech

Su San, former staff at several leading tech companies, is a top creator on Juejin and a premium creator on CSDN, and runs the free coding practice site www.susan.net.cn.

0 followers
Reader feedback

How this landed with the community

Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.