
Chinese Province, City, and Area Mapping Python Module (cpca) – Installation, Usage, and Features

This article introduces the cpca Python module that parses free‑form Chinese address strings into province, city, and district fields, explains how to install it, demonstrates basic and advanced usage with code examples, and describes optional full‑text and map‑drawing features.

Python Programming Learning Circle

During a modeling competition, the author needed to split free‑form Chinese address strings into province, city, and district fields, and so wrote cpca, a reusable Python module that performs the conversion in a single call.

Installation: the module supports Python 3 and can be installed with pip install cpca; more details are in the GitHub README.

Basic usage: the core function cpca.transform accepts any iterable of address strings (e.g., a list or a pandas Series) and returns a DataFrame with the columns “省” (province), “市” (city), “区” (district), and “地址” (address).
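A minimal sketch of the call described above, assuming cpca (and the pandas it relies on) is installed. The helper name parse_addresses and the sample addresses are illustrative and do not come from the article; the import is deferred so the definition loads even where cpca is missing.

```python
# Illustrative sample input; any iterable of address strings would do.
SAMPLE_ADDRESSES = [
    "徐汇区虹漕路461号58号楼5楼",
    "泉州市洛江区万安塘西工业区",
    "朝阳区北苑华贸城",
]


def parse_addresses(addresses):
    """Return cpca's DataFrame of province/city/district fields."""
    import cpca  # deferred import: requires `pip install cpca`

    # transform takes the iterable directly and returns one row per address.
    return cpca.transform(addresses)
```

Calling parse_addresses(SAMPLE_ADDRESSES) should give one row per input string; the exact set of columns may vary between cpca releases, so it is worth printing df.columns once after installing.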

Position‑sensitive mode: setting pos_sensitive=True adds the columns 省_pos, 市_pos, and 区_pos, which give the character index at which each component was found in the input string; a value of -1 means the component was inferred rather than matched in the text.
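To make the meaning of the position columns concrete, here is a toy sketch in plain Python. It is not cpca's implementation: TOY_DISTRICTS is a hypothetical three-entry lookup table and toy_parse is an invented helper. It only shows how a matched component gets a real index while an inferred one gets -1.

```python
# Hypothetical mini gazetteer: district -> (province, city).
TOY_DISTRICTS = {
    "徐汇区": ("上海市", "上海市"),
    "洛江区": ("福建省", "泉州市"),
    "朝阳区": ("北京市", "北京市"),
}


def toy_parse(address):
    """Find a known district, then report where each field appears."""
    for district, (province, city) in TOY_DISTRICTS.items():
        district_pos = address.find(district)
        if district_pos == -1:
            continue  # this district is not in the address
        return {
            "省": province,
            "市": city,
            "区": district,
            # str.find returns -1 when the name is absent from the text,
            # which here stands for "inferred from the district".
            "省_pos": address.find(province),
            "市_pos": address.find(city),
            "区_pos": district_pos,
        }
    return None  # no known district found


row = toy_parse("泉州市洛江区万安塘西工业区")
print(row["省"], row["省_pos"])  # province is inferred, so its position is -1
print(row["市"], row["市_pos"])  # city appears at the start of the string
```

In this toy, the province 福建省 never appears in the input, so 省_pos is -1, while 泉州市 and 洛江区 are matched at indices 0 and 3.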

Full‑text mode: by default, cut=True runs jieba word segmentation first, which is fast but can split a place name incorrectly; passing cut=False switches to matching against the full string, which is more accurate but slower. The original article gives examples in which full‑text mode recovers components that segmentation missed.
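The trade-off can be illustrated with a toy comparison in plain Python. This is a sketch under stated assumptions, not cpca's code: KNOWN_PLACES is a hypothetical two-entry gazetteer, and BAD_TOKENS is a hand-written mis-segmentation chosen for the example, not real jieba output.

```python
# Hypothetical mini gazetteer of place names.
KNOWN_PLACES = {"淮安市", "清江浦区"}

ADDRESS = "淮安市清江浦区人民路"
# Hand-written example of a segmentation that cuts across place names.
BAD_TOKENS = ["淮安", "市清江", "浦区", "人民路"]


def match_tokens(tokens):
    """Token mode: a place is found only if it equals a whole token."""
    return [t for t in tokens if t in KNOWN_PLACES]


def match_full_text(text):
    """Full-text mode: scan the raw string for every known place name."""
    hits = [p for p in KNOWN_PLACES if p in text]
    return sorted(hits, key=text.find)  # order by position in the text


print(match_tokens(BAD_TOKENS))   # nothing matches the broken tokens
print(match_full_text(ADDRESS))   # both place names are recovered
```

Token lookup is a single set membership test per token, while the full-text scan checks every known name against the string, which is where the extra cost comes from as the gazetteer grows.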

The module also provides simple map‑drawing utilities based on folium. After installing folium, one can call drawer.draw_locations(df, "df.html") to generate an HTML heat‑map of the parsed locations.
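Put together, the drawing step might look like the sketch below. It assumes both cpca and folium are installed; the wrapper name draw_address_heatmap is invented for illustration, and the imports are deferred so the definition itself has no dependencies.

```python
def draw_address_heatmap(addresses, out_path="df.html"):
    """Parse addresses with cpca and write an HTML heat-map to out_path."""
    # Deferred imports: cpca and folium must both be installed first.
    import cpca
    from cpca import drawer

    df = cpca.transform(addresses)
    # Call shape taken from the text above; some cpca releases may expect
    # a different first argument, so check the README for your version.
    drawer.draw_locations(df, out_path)
    return out_path
```

The resulting HTML file can be opened in any browser.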

Additional parameters such as lookahead can be tuned for long place names, and further examples are available in the GitHub README under “示例与测试用例”.
