Understanding MySQL Internal vs External Temporary Tables and Their Storage Engines
This article explains the two kinds of MySQL temporary tables—external and internal—covers the scenarios that trigger their creation, details how MySQL chooses MEMORY, MyISAM or InnoDB storage engines based on system variables and data size, and describes indexing, hash‑field fallbacks, and optimization hints such as SQL_SMALL_RESULT and SQL_BIG_RESULT.
Overview of MySQL Temporary Tables
MySQL distinguishes two kinds of temporary tables: external temporary tables created explicitly with CREATE TEMPORARY TABLE, and internal temporary tables that MySQL creates automatically during query execution. When the Extra column of an EXPLAIN output shows Using temporary, the query is using an internal temporary table.
For simplicity, the article treats "temporary table" and "internal temporary table" as synonymous, meaning a table MySQL creates automatically during execution.
1. Preparation
Two example tables with identical structures are used: t_recbuf and t_internal_tmp_table. The t_recbuf definition includes various column types (int, varchar, enum, set, bit, blob, decimal, etc.).
CREATE TABLE `t_recbuf` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`i1` int(10) unsigned DEFAULT '0',
`str1` varchar(32) DEFAULT '',
`str2` varchar(255) DEFAULT '',
`c1` char(11) DEFAULT '',
`e1` enum('北京','上海','广州','深圳','天津','杭州','成都','重庆','苏州','南京','洽尔滨','沈阳','长春','厦门','福州','南昌','泉州','德清','长沙','武汉') DEFAULT '北京',
`s1` set('吃','喝','玩','乐','衣','食','住','行','前后','左右','上下','里外','远近','长短','黑白','水星','金星','地球','火星','木星','土星','天王星','海王星','冥王星') DEFAULT '',
`bit1` bit(8) DEFAULT b'0',
`bit2` bit(17) DEFAULT b'0',
`blob1` blob,
`d1` decimal(10,2) DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=2001 DEFAULT CHARSET=utf8;2. Scenarios That Use Temporary Tables
ORDER BY fields differ from GROUP BY fields.
JOIN statements where ORDER BY or GROUP BY fields are not the first table in the execution plan.
Aggregations with DISTINCT (e.g., COUNT(DISTINCT i1)).
Queries using UNION or UNION DISTINCT.
Derived tables (EXPLAIN shows select_type = DERIVED).
Subquery materialization (both semi‑join and full materialization).
INSERT…SELECT where source and target are the same table.
The list is based on the official MySQL documentation with minor adjustments.
To verify whether a query uses a temporary table, check the Extra column of its EXPLAIN output for Using temporary.
3. Storage Engine Choices for Temporary Tables
MySQL can use three storage engines for temporary tables: MEMORY, MyISAM, and InnoDB. By default, MySQL creates temporary tables with the MEMORY engine.
Two conditions force MySQL to switch to a disk‑based engine:
Any column written to the temporary table is a large object (BLOB).
The system variable big_tables is set to ON.
When big_tables=ON, the engine used is determined by the variable internal_tmp_disk_storage_engine, which can be MyISAM or InnoDB (default InnoDB).
For details on which column types are considered large objects, see the article "MySQL 大对象(BLOB)和字符串的分身术".
To explicitly request a memory engine, add the hint SQL_SMALL_RESULT to the query: SELECT SQL_SMALL_RESULT * FROM t_recbuf; If a disk engine is desired, the hint SQL_BIG_RESULT forces MySQL to use the engine indicated by internal_tmp_disk_storage_engine.
SELECT SQL_BIG_RESULT e1, MIN(i1) FROM t_internal_tmp_table GROUP BY e1;4. When Memory Temporary Tables Convert to Disk
MEMORY tables store fixed‑length records and cannot hold BLOB columns. Variable‑length columns (VARCHAR, VARBINARY) are stored as if they were fixed‑length, using the defined maximum length.
If the total memory used by a temporary table exceeds the smaller of tmp_table_size and max_heap_table_size, MySQL automatically converts the table to a disk‑based engine.
Default values: tmp_table_size = 16 MiB (minimum 1 KiB, maximum huge), max_heap_table_size = 16 MiB (minimum 16 KiB).
The conversion process consists of:
Create a new MyISAM or InnoDB temporary table (engine chosen by internal_tmp_disk_storage_engine).
Copy all rows from the MEMORY table to the new disk table.
Insert the pending row that triggered the conversion.
Drop the original MEMORY table.
5. Which Columns Are Written to a Temporary Table?
Temporary tables can be classified into two categories based on the fields they store:
Tables that serve the whole SQL statement.
Tables that serve a single aggregate function.
For the first category, MySQL writes every column returned by the storage engine to the server layer, which may be raw column values or results of functions. Example:
SELECT e1, COUNT(i1) FROM t_internal_tmp_table GROUP BY e1;In this query, the temporary table receives the e1 values and the computed COUNT(i1) for each group.
When ROLLUP is present, the aggregate result is not stored; instead, the columns used as arguments are written.
SELECT a.e1, b.c1, COUNT(a.i1) AS t
FROM t_internal_tmp_table AS a
INNER JOIN t_recbuf AS b ON a.id = b.id
GROUP BY a.e1, b.c1 WITH ROLLUP;For the second category (single aggregate functions), only the columns participating in the aggregate are stored. Example:
SELECT e1, COUNT(DISTINCT i1) AS t FROM t_internal_tmp_table GROUP BY e1;Here the temporary table contains only the i1 values, and MySQL creates a unique index on i1 to enforce distinctness.
6. Index Creation for Temporary Tables
MySQL creates a unique index for the fields used in GROUP BY or DISTINCT to guarantee one record per group or distinct value. Only one index exists per temporary table—either for GROUP BY or for DISTINCT.
6.1 GROUP BY Index
For a query like:
SELECT e1, COUNT(i1) FROM t_internal_tmp_table GROUP BY e1;MySQL creates a unique index named <group_key> on the e1 column. The processing steps are:
Read a row from t_internal_tmp_table and look up e1 in the temporary table.
If the key exists, update the group count.
If the key does not exist, insert a new record with the initial count.
6.2 DISTINCT Index
For a query like:
SELECT e1, COUNT(DISTINCT i1) AS t FROM t_internal_tmp_table GROUP BY e1;MySQL creates a unique index named <auto_key> on the i1 column. The insertion process relies on the uniqueness constraint: a duplicate i1 value causes the insert to fail, which is interpreted as the value already existing.
6.3 Hash Field Fallback
If the number of indexed columns, their lengths, or the total index record length exceed engine limits, MySQL cannot create the unique index. In that case it adds a hidden <hash_field> column and creates a non‑unique index on it.
The hash field is computed by hashing each GROUP BY or DISTINCT column value and then combining the hashes. The lookup process:
Compute the hash for the incoming row.
Search the temporary table using the hash as the key.
If no record is found, insert a new one.
If a record is found, compare the original column values one by one; only if all match is the row considered a duplicate.
7. Statistics of Internal Temporary Table Usage
MySQL increments the status variable Created_tmp_tables each time a temporary table is created, and Created_tmp_disk_tables when a temporary table is created on disk. The ratio Created_tmp_disk_tables / Created_tmp_tables indicates the proportion of disk‑based temporary tables.
Two optimization approaches can reduce the conversion of memory tables to disk:
Increase tmp_table_size and max_heap_table_size to allow larger memory tables.
Set big_tables=ON to force all internal temporary tables to use the disk engine specified by internal_tmp_disk_storage_engine, which is useful for workloads that inevitably produce large temporary tables.
8. Conclusion
The article covered the definition of MySQL temporary tables, common scenarios that trigger their creation, the decision process for choosing MEMORY, MyISAM, or InnoDB storage engines, the threshold‑based conversion from memory to disk, how fields are written, index strategies including the hash‑field fallback, and metrics for monitoring temporary table usage. Understanding these mechanisms helps developers predict when a query will involve temporary tables and tune system variables or use hints ( SQL_SMALL_RESULT, SQL_BIG_RESULT) to control performance.
Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
Liangxu Linux
Liangxu, a self‑taught IT professional now working as a Linux development engineer at a Fortune 500 multinational, shares extensive Linux knowledge—fundamentals, applications, tools, plus Git, databases, Raspberry Pi, etc. (Reply “Linux” to receive essential resources.)
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
