Why Does Chinese Text Turn into Gibberish on Windows? A Complete Fix for Qt/Cocos2d‑x
This article explains why Chinese characters become garbled on Windows when using Qt or Cocos2d‑x, explores ASCII, GBK, Unicode, UTF‑8 and ANSI encodings, shows how Visual Studio misdetects UTF‑8 files, and provides step‑by‑step solutions to eliminate the issue permanently.
During desktop development with Qt, many developers encounter garbled Chinese text on Windows, a problem that also appears in Cocos2d‑x projects. Four symptoms are typical: output that displays correctly, output that is garbled outright, compilation errors (C4819, C2001, C2143), and intermittent behavior that depends on the number of characters in the literal.
1. Character Encodings
Understanding the issue starts with character encodings. ASCII uses one byte for English characters. Chinese needs far more code values than one byte can hold, which led to the multi‑byte encoding GB2312 (two bytes per Chinese character) and later to its extensions GBK and GB18030. This article refers to the whole family simply as GBK.
These region‑specific encodings are not interchangeable, causing garbled text when a file encoded in GBK is opened as Unicode or vice‑versa.
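Unicode was introduced to unify all scripts: it assigns every character a single code point, so the same text no longer depends on a regional code page. How many bytes a character occupies depends on the encoding form chosen (two or four bytes in UTF‑16, one to four bytes in UTF‑8).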
GBK and Unicode differ fundamentally; this mismatch is the root cause of garbled characters.
2. File Encodings
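UTF‑8 is a Unicode encoding that stores each character as a variable‑length sequence of one to four bytes. ASCII characters still take one byte, but most Chinese characters take three bytes instead of the two they need in GBK, so pure Chinese text grows by about half. Because a UTF‑8 file cannot be told apart from other 8‑bit text without inspecting its content, the UTF‑8‑BOM variant (BOM stands for Byte Order Mark) adds a signature that identifies the encoding.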
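The byte counts above follow directly from how UTF‑8 packs a code point. The sketch below is a minimal illustration, not production code: encodeUtf8 is a hypothetical helper name, and it assumes the input is a valid code point (it does not reject surrogates or values above U+10FFFF).

```cpp
#include <string>

// Hypothetical helper: encode one Unicode code point as UTF-8.
// Assumes cp is a valid scalar value; no validation is done to keep the sketch short.
std::string encodeUtf8(char32_t cp) {
    std::string out;
    if (cp < 0x80) {                       // 1 byte: ASCII range
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {               // 2 bytes
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {             // 3 bytes: most Chinese characters land here
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {                               // 4 bytes
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return out;
}
```

For example, encodeUtf8(0x4E2D), the code point of 中, yields the three bytes E4 B8 AD, while encodeUtf8('A') yields a single byte.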
UTF‑8‑BOM adds the three bytes EF BB BF at the file start, helping editors recognize the file as UTF‑8.
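Checking for that signature takes only a few lines. This is a small sketch under the assumption that the file content has already been read into a std::string; hasUtf8Bom is a name chosen here for illustration.

```cpp
#include <string>

// Hypothetical helper: true if the buffer starts with the UTF-8 BOM (EF BB BF).
bool hasUtf8Bom(const std::string& bytes) {
    return bytes.size() >= 3 &&
           static_cast<unsigned char>(bytes[0]) == 0xEF &&
           static_cast<unsigned char>(bytes[1]) == 0xBB &&
           static_cast<unsigned char>(bytes[2]) == 0xBF;
}
```

The casts to unsigned char matter because plain char may be signed, in which case a direct comparison with 0xEF would never match.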
Windows Notepad’s "ANSI" option actually saves using the system’s code page (GBK on a Chinese‑locale system). When a UTF‑8 file without a BOM is opened, the compiler may treat it as ANSI (MBCS) and misinterpret its bytes.
2.1 UTF‑8 vs. ANSI
When a source file has no BOM, Visual Studio’s compiler reads ahead to detect Unicode; if it does not find UTF‑16 signatures, it falls back to MBCS (ANSI), that is, the system code page. On a Chinese‑locale system, a UTF‑8 file without a BOM is therefore parsed as GBK, producing garbled characters and even compilation errors.
The compiler when faced with a source file that does not have a BOM reads ahead to see if it can detect any Unicode characters – it specifically looks for UTF‑16 and UTF‑16BE – if it doesn't find either then it assumes MBCS.
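The fallback described above can be modelled roughly as follows. This is only a sketch of the decision order: the compiler's real look‑ahead heuristic is not documented in this article, so the zero‑byte check, the 64‑byte window, and the names SourceEncoding and guessSourceEncoding are all assumptions made for illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

enum class SourceEncoding { Utf8Bom, Utf16LE, Utf16BE, AssumedMbcs };

// Hypothetical model of the detection order: BOM first, then a crude UTF-16
// look-ahead, then the MBCS fallback. Not the compiler's actual algorithm.
SourceEncoding guessSourceEncoding(const std::string& bytes) {
    auto b = [&](std::size_t i) { return static_cast<unsigned char>(bytes[i]); };

    if (bytes.size() >= 3 && b(0) == 0xEF && b(1) == 0xBB && b(2) == 0xBF)
        return SourceEncoding::Utf8Bom;
    if (bytes.size() >= 2 && b(0) == 0xFF && b(1) == 0xFE)
        return SourceEncoding::Utf16LE;
    if (bytes.size() >= 2 && b(0) == 0xFE && b(1) == 0xFF)
        return SourceEncoding::Utf16BE;

    // Assumed look-ahead: ASCII text stored as UTF-16 has a zero in every other byte.
    const std::size_t window =
        std::min<std::size_t>(bytes.size(), 64) & ~static_cast<std::size_t>(1);
    std::size_t zeroEven = 0, zeroOdd = 0;
    for (std::size_t i = 0; i < window; ++i) {
        if (b(i) == 0) (i % 2 == 0 ? zeroEven : zeroOdd)++;
    }
    if (window >= 2 && zeroOdd == window / 2 && zeroEven == 0)
        return SourceEncoding::Utf16LE;
    if (window >= 2 && zeroEven == window / 2 && zeroOdd == 0)
        return SourceEncoding::Utf16BE;

    // Nothing recognised: a UTF-8 file without a BOM ends up here.
    return SourceEncoding::AssumedMbcs;
}
```

The point of the model is the last line: UTF‑8 bytes without a BOM contain no zero bytes and no UTF‑16 signature, so they fall through to the MBCS branch.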
3. QString Handling in Qt
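In Qt 5 and later, constructing a QString from a char* literal decodes the bytes as UTF‑8. QString::fromLocal8Bit() instead decodes them using the system’s local 8‑bit encoding, which is GBK on a Chinese‑locale Windows system. Whichever call is used must match the bytes the compiler actually placed in the executable; if it does not, the text is garbled.
QString str1("中文");                            // literal bytes decoded as UTF-8
QString str2 = QString::fromLocal8Bit("中文");   // literal bytes decoded in the system code page
When a UTF‑8 file is misread as GBK, the compiler pairs bytes incorrectly: it may swallow the character that follows the Chinese text (which surfaces as an unterminated string or a missing semicolon) or replace byte pairs it cannot map with ?, resulting in further errors.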
3.1 Example of Compilation Failure
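Consider the code char * str = "中文中"; saved in UTF‑8 without a BOM. The three characters occupy nine bytes. If the compiler treats the file as GBK, it reads those bytes in two‑byte units, so the ninth byte has no partner and is paired with the byte that follows it, the closing quotation mark. The string literal is then never terminated, the rest of the line including the semicolon is consumed, and compilation fails.
Adding a trailing space inside the literal brings the byte count to ten, so the bytes pair up evenly and the closing quotation mark survives. The final pair, however, is not a valid GBK character, so the compiler substitutes 3F ("?") for the bytes it cannot map, and the resulting string is visibly garbled. This is why the behavior appears to depend on the number of characters.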
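The odd/even effect can be reproduced with a toy splitter. The sketch below is a simplification, not the compiler's real lexer: splitAsDoubleByte is a hypothetical function, and it assumes that any byte in the range 0x81 to 0xFE is a lead byte that takes the next byte with it, whatever that byte is.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical model of a GBK-style reader: a lead byte (0x81..0xFE) consumes
// the following byte, even if that byte is a quote or a space.
std::vector<std::string> splitAsDoubleByte(const std::string& bytes) {
    std::vector<std::string> units;
    for (std::size_t i = 0; i < bytes.size();) {
        const unsigned char c = static_cast<unsigned char>(bytes[i]);
        const bool lead = c >= 0x81 && c <= 0xFE && i + 1 < bytes.size();
        const std::size_t len = lead ? 2 : 1;
        units.push_back(bytes.substr(i, len));
        i += len;
    }
    return units;
}
```

Feeding it the bytes of "中文中"; (quote, E4 B8 AD E6 96 87 E4 B8 AD, quote, semicolon) gives the units (E4 B8) (AD E6) (96 87) (E4 B8) (AD 22): the closing quote 0x22 disappears into the last unit. With a space before the closing quote, the last unit becomes (AD 20) and the quote is left standing on its own.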
4. Permanent Solutions
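Configure Qt Creator to save all source files in UTF‑8‑BOM format (its text editor behavior settings include a UTF‑8 BOM option).
When handling Chinese literals, wrap them in QString::fromLocal8Bit(). With a BOM present, MSVC reads the source as UTF‑8 but by default converts narrow string literals to the system code page (GBK on a Chinese‑locale system) when compiling, so the bytes in the executable are GBK and must be decoded as such.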
These steps ensure the compiler correctly recognises the file encoding and QString receives the intended byte sequence, eliminating the garbled output.
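For files that already exist without a BOM, the first step can also be applied with a small script. The following is a minimal sketch, assuming the file is already valid UTF‑8 (it does not transcode GBK files) and is small enough to read into memory; ensureUtf8Bom is a hypothetical helper name.

```cpp
#include <fstream>
#include <iterator>
#include <string>

// Hypothetical helper: prepend EF BB BF to a file that lacks it.
// Returns true if a BOM was written, false if one was already present
// or the file could not be opened.
bool ensureUtf8Bom(const std::string& path) {
    std::ifstream in(path, std::ios::binary);
    if (!in) return false;
    std::string data((std::istreambuf_iterator<char>(in)),
                     std::istreambuf_iterator<char>());
    in.close();

    const std::string bom = "\xEF\xBB\xBF";
    if (data.compare(0, bom.size(), bom) == 0) return false;  // already marked

    std::ofstream out(path, std::ios::binary | std::ios::trunc);
    if (!out) return false;
    out << bom << data;
    return static_cast<bool>(out);
}
```

Binary mode is used on both streams so that Windows line‑ending translation does not alter the file content. Back up the sources before running something like this across a project.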
5. Summary
The article covered character encodings (ASCII, GBK, Unicode), file encodings (UTF‑8, UTF‑8‑BOM, ANSI), Visual Studio’s mis‑detection of UTF‑8 files, and Qt’s QString handling. The root cause of garbled Chinese in Qt and Cocos2d‑x projects is the editor saving source files as UTF‑8 without a BOM while the Visual Studio compiler interprets them as ANSI. Applying the two‑step fix restores correct display and compilation.
Liangxu Linux
Liangxu, a self‑taught IT professional now working as a Linux development engineer at a Fortune 500 multinational, shares extensive Linux knowledge—fundamentals, applications, tools, plus Git, databases, Raspberry Pi, etc. (Reply “Linux” to receive essential resources.)
