Understanding MySQL Character Set Variables
This article explains the purpose and usage of MySQL character set system variables, illustrates how they affect client‑server communication and file handling, and provides practical examples and diagrams to help developers correctly configure character sets for optimal performance.
For MySQL users, handling character sets can be confusing due to the many related concepts and settings; this article aims to clarify the origins and secrets of MySQL character sets.
Characters, whether Chinese ideograms or English letters, are symbols, and computers represent each one with a numeric code (e.g., the letter "a" is 97). A collection of characters together with the numeric codes assigned to them forms a character set.
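As a minimal sketch of this idea, the Python snippet below (Python is used only for illustration and is not part of MySQL) shows that a character is a number, and that the same character becomes different bytes under different character sets. The choice of the ideogram and of the GBK and Latin-1 codecs is an assumption made for the example.

```python
# Every character has a numeric code.
code_a = ord("a")          # the letter "a" from the text above
code_zhong = ord("中")     # a Chinese ideogram

# The same character is encoded as different bytes under different charsets.
utf8_bytes = "中".encode("utf-8")
gbk_bytes = "中".encode("gbk")

# Latin-1 (ISO-8859-1) has no code for this character at all.
try:
    "中".encode("latin-1")
    latin1_ok = True
except UnicodeEncodeError:
    latin1_ok = False

print(code_a, hex(code_zhong), utf8_bytes.hex(), gbk_bytes.hex(), latin1_ok)
```

The different byte sequences are the reason data must be converted whenever two sides of a connection disagree about the character set.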
Historically, different cultures used separate character sets; on a CentOS system you can view supported locales with locale -av. The following image shows part of the locale output, such as the UTF‑8 charset for Hong Kong:
Another example is the ISO‑8859‑1 charset for Belgium:
Part 2 – Introduction
The following image lists the MySQL character set variables:
character_set_client
This variable indicates the character set used by the current client. For example, on CentOS, when the LANG environment variable is en_US.UTF-8, character_set_client is utf8; when LANG is C, it is latin1.
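The server interprets statements arriving from the client in this charset, so a MySQL command‑line client started with LANG set to C has its statements read as latin1. The charset of the data sent back for display is governed by character_set_results.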
character_set_connection
This variable defines the charset the server uses to process a statement once it has been received: the server converts the statement from character_set_client to this charset for processing, and converts the results to character_set_results before sending them back to the client.
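The conversion steps can be imitated outside MySQL. The following Python sketch is only a model of the idea, not the server's implementation: the convert helper, the CODECS mapping from MySQL charset names to Python codec names, the session values, and the utf8 column are all assumptions made for illustration.

```python
# Hypothetical mapping from MySQL charset names to Python codec names.
CODECS = {"utf8": "utf-8", "gbk": "gbk"}

def convert(data, src, dst):
    """Re-encode bytes from charset src to charset dst (no-op when they match)."""
    if src == dst:
        return data
    return data.decode(CODECS[src]).encode(CODECS[dst])

# Assumed session settings: a GBK client talking over a utf8 connection.
session = {
    "character_set_client": "gbk",
    "character_set_connection": "utf8",
    "character_set_results": "gbk",
}

# Request path: bytes arrive in the client charset and are converted
# to the connection charset before the statement is processed.
sent = "SELECT '名字'".encode("gbk")
processed = convert(sent,
                    session["character_set_client"],
                    session["character_set_connection"])

# Response path: a value stored in a (hypothetical) utf8 column is
# converted to the results charset before it is returned.
stored_value = "名字".encode("utf-8")
returned = convert(stored_value, "utf8", session["character_set_results"])

print(processed, returned)
```

In this model each direction costs one conversion; if all the charsets involved matched, convert would return its input unchanged.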
character_set_database
After executing USE dbname, this variable matches the default charset of the current database. When running ALTER TABLE … CONVERT TO CHARACTER SET DEFAULT, the table is converted to the charset specified by this variable, not the database’s original default.
character_set_filesystem
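This variable names the charset used to interpret file names that appear as string literals in SQL, for example in SELECT … INTO OUTFILE, LOAD DATA, and LOAD_FILE(). Before opening the file, MySQL converts the file name from character_set_client to character_set_filesystem; the default value, binary, means that no conversion takes place.
For example, to open /tmp/x, MySQL converts the name /tmp/x, not the file's contents, from the SQL charset to the filesystem charset.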
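A rough Python sketch of this translation follows. It assumes, per the MySQL manual's description of character_set_filesystem, that the conversion applies to the file name, runs from character_set_client to character_set_filesystem, and is skipped when the value is binary. The filename_for_os helper, the codec mapping, and the sample path are hypothetical.

```python
# Hypothetical mapping from MySQL charset names to Python codec names.
CODECS = {"utf8": "utf-8", "gbk": "gbk"}

def filename_for_os(name_bytes, charset_client, charset_filesystem):
    """Convert a file name from the client charset to the filesystem charset."""
    if charset_filesystem == "binary":
        # "binary" means the bytes are passed through untouched.
        return name_bytes
    text = name_bytes.decode(CODECS[charset_client])
    return text.encode(CODECS[charset_filesystem])

ascii_name = "/tmp/x".encode("utf-8")
cjk_name = "/tmp/数据.txt".encode("utf-8")   # illustrative path only

# A pure ASCII name looks the same in both charsets.
same = filename_for_os(ascii_name, "utf8", "gbk")

# A non-ASCII name changes its bytes when the charsets differ ...
as_gbk = filename_for_os(cjk_name, "utf8", "gbk")

# ... and is left alone when the filesystem charset is binary.
untouched = filename_for_os(cjk_name, "utf8", "binary")

print(same, as_gbk != cjk_name, untouched == cjk_name)
```

The sketch suggests why the setting only matters for file names that contain non-ASCII characters.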
character_set_results
This variable determines the charset of result sets generated by queries; each row in the result set uses this charset.
character_set_server
This server‑level variable defines the default charset for newly created databases when no charset is explicitly specified.
character_set_system
A read‑only global variable whose value is always utf8; it is the charset the server uses to store identifiers such as column and table names.
character_sets_dir
This variable points to the directory containing MySQL’s XML charset definition files; it is used internally by MySQL, can be set only at server startup, and cannot be changed at runtime.
At this point, all major MySQL charset variables have been introduced.
Part 3 – Conclusion
During execution, the client‑to‑server request involves charset conversion, as illustrated in the following diagram:
When multiple charset conversions occur in a single operation, performance can suffer. Using SET NAMES to align character_set_client, character_set_connection, and character_set_results reduces conversion overhead, but the best practice is to keep stored data and the connection charset consistent.
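The effect of aligning the three variables can be illustrated with a simplified model. The Python sketch below assumes that a conversion happens on each hop where the source and target charsets differ (client to connection, connection to column, column to results). The set_names and count_conversions helpers are hypothetical and only mimic the three variables that SET NAMES assigns.

```python
def set_names(charset):
    """Model of SET NAMES: assign one charset to the three session variables."""
    return {
        "character_set_client": charset,
        "character_set_connection": charset,
        "character_set_results": charset,
    }

def count_conversions(session, column_charset):
    """Count hops whose source and target charsets differ (simplified model)."""
    hops = [
        (session["character_set_client"], session["character_set_connection"]),
        (session["character_set_connection"], column_charset),
        (column_charset, session["character_set_results"]),
    ]
    return sum(1 for src, dst in hops if src != dst)

# A session whose three variables disagree with each other.
mixed = {
    "character_set_client": "latin1",
    "character_set_connection": "utf8",
    "character_set_results": "gbk",
}

mixed_cost = count_conversions(mixed, "utf8")
aligned_wrong = count_conversions(set_names("latin1"), "utf8")
aligned_right = count_conversions(set_names("utf8"), "utf8")

print(mixed_cost, aligned_wrong, aligned_right)
```

In this model, aligning the session variables only removes every conversion when the chosen charset also matches the charset of the stored data, which is the point of the best practice above.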
Agent T concludes that MySQL character sets can be dizzying, but the presented examples aim to make the abstract concepts concrete; feedback is welcome.
Tencent Database Technology
Tencent's Database R&D team supports internal services such as WeChat Pay, WeChat Red Packets, Tencent Advertising, and Tencent Music, and provides external support on Tencent Cloud for TencentDB products like CynosDB, CDB, and TDSQL. This public account aims to promote and share professional database knowledge, growing together with database enthusiasts.