How to Fix Chinese Character Encoding Issues in Python Crawlers
This article walks through a real‑world Python encoding problem where Unicode \u sequences cause garbled Chinese text, demonstrates the faulty code, and provides a corrected solution using json.dumps with ensure_ascii=False and GBK encoding to produce proper Base64 output.
Introduction
The author received a question in a Python community about decoding JSON data that contains Chinese characters, which were appearing as escaped \u sequences and resulting in unreadable output.
Problem
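The original script serialized a dictionary with Chinese keys, removed spaces, and then Base64-encoded the resulting string as UTF-8. Because json.dumps escapes non-ASCII characters by default (ensure_ascii=True), the encoded payload carried \uXXXX escape sequences rather than the Chinese characters themselves, and the asker saw garbled text after decoding it.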
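To make the escaping behavior concrete, here is a minimal sketch of what the default json.dumps settings do to a Chinese key. The one-entry dictionary `sample` is illustrative only and is not the asker's real payload.

```python
import base64
import json

sample = {"小明": 55}  # illustrative data, not the asker's real payload

# With the default ensure_ascii=True, non-ASCII characters become \uXXXX escapes.
escaped = json.dumps(sample)
print(escaped)  # {"\u5c0f\u660e": 55}

# Base64 therefore wraps the escape sequences, not the Chinese characters.
token = base64.b64encode(escaped.encode("utf-8"))
decoded_text = base64.b64decode(token).decode("utf-8")
print(decoded_text == escaped)  # True

# A JSON parser turns the escapes back into the original characters.
restored = json.loads(decoded_text)
print(restored == sample)  # True
```

The escaped form is still valid JSON, so a receiver that parses the decoded text with a JSON library gets the Chinese keys back. A receiver that simply displays the decoded text sees the escape sequences.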
Initial Attempt
import json
import base64
d = {"小明":55, "小爱":111, "嘎嘎":True}
s = json.dumps(d).replace(' ', '')
print(s)
print(base64.b64encode(s.encode('utf-8')))
This code produced the expected Base64 string on the author's machine but showed garbled output for the asker.
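One side note on the space removal: replace(' ', '') deletes every space in the serialized text, including spaces inside keys or values. The dictionary in this article contains none, so it is unaffected. For data that might, a possible alternative is the separators parameter of json.dumps, sketched below with a hypothetical `record`.

```python
import json

record = {"name": "Xiao Ming", "score": 55}  # hypothetical record whose value contains a space

# replace(' ', '') deletes every space, including the one inside the value.
stripped = json.dumps(record).replace(" ", "")

# separators=(",", ":") only drops the whitespace json.dumps adds between tokens.
compact = json.dumps(record, separators=(",", ":"))

print(stripped)  # {"name":"XiaoMing","score":55}
print(compact)   # {"name":"Xiao Ming","score":55}
```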
Solution
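Two changes resolve the problem: passing ensure_ascii=False so that json.dumps keeps the Chinese characters instead of escaping them, and encoding the string as GBK before the Base64 step. The GBK choice assumes the receiving side decodes the bytes as GBK; the charset has to match on both ends.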
import json
import base64
d = {"小明": 55, "小爱": 111, "嘎嘎": True}
s = json.dumps(d, ensure_ascii=False).replace(' ', '')
print(s)
print(base64.b64encode(s.encode('gbk')))
The corrected script outputs a readable Base64 string that decodes back to the original Chinese characters.
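The receiving side is not shown in the original question, so the following is only a sketch of how a round trip might look, assuming the receiver Base64-decodes the payload and then decodes the bytes as GBK. The variable names `token`, `raw`, `text`, and `wrong` are illustrative.

```python
import base64
import json

d = {"小明": 55, "小爱": 111, "嘎嘎": True}
s = json.dumps(d, ensure_ascii=False).replace(" ", "")
token = base64.b64encode(s.encode("gbk"))

# Receiver side (assumed): Base64-decode, then decode the bytes with the same charset.
raw = base64.b64decode(token)
text = raw.decode("gbk")
print(text == s)              # True
print(json.loads(text) == d)  # True

# Decoding the same GBK bytes as UTF-8 does not reproduce the original text.
wrong = raw.decode("utf-8", errors="replace")
print(wrong == s)  # False
```

The last step illustrates why the charset matters: the Base64 layer is lossless, but the bytes underneath only turn back into the right characters when both sides agree on the encoding.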
Conclusion
The article demonstrates how to handle Chinese character encoding in Python by passing ensure_ascii=False to json.dumps and encoding with the charset the receiver expects (GBK in this case) before Base64 encoding, so that the asker can decode and display the data correctly.
Python Crawling & Data Mining