
Replace Chinese Commas in Python Strings Using Regex

This article shows how to use Python's regular expressions to replace Chinese comma characters in a long string, explaining the necessary pattern details and providing a concise code example for effective text preprocessing.

Python Crawling & Data Mining

Introduction

The author shares a question from a Python group about handling a long Chinese string containing many terms separated by Chinese commas.

Original string example:

str1 = "校企联合,省料学,市科学,区科学,社会料学,料学院,料学家,科学实验室,国家料学,国防料学,能源料学,生命科学,计算机料学,国家料学,研究所长,研究所副所长,研究所合作"

Implementation

The solution calls re.sub with a pattern that contains the fullwidth (Chinese) comma, U+FF0C, and replaces every match with the ASCII comma, U+002C.

import re

# Match the fullwidth (Chinese) comma U+FF0C, not the ASCII comma.
pattern = r','
result = re.sub(pattern, ',', str1)
print(result)

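Key point: choose the right pattern. It must contain the fullwidth Chinese comma (",", U+FF0C) rather than the ASCII comma (",", U+002C). The two look alike but are different characters, so a pattern written with the ASCII comma matches nothing in the Chinese text.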
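To make the difference between the two commas concrete, here is a minimal sketch that writes both characters by code point and checks which one actually occurs in the text. The two-term sample string and the helper name normalize_commas are assumptions for illustration, not part of the original question.

```python
import re

FULLWIDTH_COMMA = "\uFF0C"  # the comma used in Chinese text
ASCII_COMMA = ","           # U+002C

def normalize_commas(text: str) -> str:
    """Replace every fullwidth comma with an ASCII comma."""
    return re.sub(FULLWIDTH_COMMA, ASCII_COMMA, text)

# Hypothetical two-term sample joined by a fullwidth comma.
sample = "生命科学\uFF0C研究所长"

# A pattern built from the ASCII comma finds nothing in this string;
# only the fullwidth comma matches.
ascii_hits = re.findall(ASCII_COMMA, sample)
fullwidth_hits = re.findall(FULLWIDTH_COMMA, sample)

normalized = normalize_commas(sample)
print(ascii_hits, fullwidth_hits, normalized)
```

For a single literal character, sample.replace(FULLWIDTH_COMMA, ASCII_COMMA) gives the same result without a regex. A regex becomes more useful when several punctuation marks need to be matched with one character class.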

Conclusion

The method successfully converts the Chinese commas to standard commas, demonstrating a simple way to preprocess Chinese text for further analysis with tools like pandas.
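As a rough sketch of a possible next step, the normalized string can be split into individual terms before further analysis. The shortened sample and the variable names below are assumptions for illustration; the resulting list could, for example, be passed to pandas.Series if pandas is installed.

```python
import re

# Shortened, hypothetical sample using fullwidth commas (U+FF0C).
raw = "校企联合\uFF0C生命科学\uFF0C研究所长"

# Normalize the separators first, as in the article's approach.
normalized = re.sub("\uFF0C", ",", raw)

# Split on the ASCII comma and drop empty or whitespace-only pieces.
terms = [t.strip() for t in normalized.split(",") if t.strip()]
print(terms)
```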

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.

Written by

Python Crawling & Data Mining

